AWS Open Source Blog

Category: Intermediate (200)

Virtual GPU device plugin for inference workloads in Kubernetes

Machine learning (ML) has become a centerpiece for enterprise transformation. AWS provides a broad and deep set of ML capabilities for builders with all levels of expertise. Developers with no prior ML experience can seamlessly build sophisticated AI-driven applications using AWS AI services. Developers and data scientists can use Amazon SageMaker, a managed machine learning […]


Getting started with the open source data science tool Metaflow on AWS

Data science is hard. Customers face business challenges today at a scale larger and more complex than ever before, and data scientists bring unique skills to the table to help solve some of those problems. The concept is simple: Data scientists use large amounts of data to break a problem down into pieces that machines […]


Using multiple queues and instance types in AWS ParallelCluster 2.9

Since its release as an officially supported AWS tool and open source project in November 2018, AWS ParallelCluster has made it simple for high performance computing (HPC) customers to set up easy-to-use environments with compute, storage, job scheduling, and networking in the cloud in one cohesive package. These clusters can cater to a wide variety […]


Getting started with Travis-CI.com on AWS Graviton2

AWS Graviton2 processors deliver a major leap in performance and capabilities over first-generation AWS Graviton processors. They power Amazon Elastic Compute Cloud (Amazon EC2) M6g, C6g, and R6g instances, and their variants with local disk storage. Graviton2-based EC2 instances provide up to 40% better price/performance over comparable current generation x86-based instances for a wide variety […]


Dgraph on AWS: Setting up a horizontally scalable graph database

This article is a guest post from Joaquin Menchaca, an SRE at Dgraph. Dgraph is an open source, distributed graph database, built for production environments, and written entirely in Go. Dgraph is fast, transactional, sharded, and distributed (joins, filters, sorts), consistently replicated with Raft, and provides fault tolerance with synchronous replication and horizontal scalability. The […]


Managing AWS ParallelCluster SSH users with OpenLDAP

A common request from AWS ParallelCluster users is to have the ability to deploy multiple POSIX user accounts. The wiki on the project GitHub page documents a simple mechanism for achieving this, and a previous blog post, “AWS ParallelCluster with AWS Directory Services Authentication,” documents how to integrate AWS ParallelCluster with AWS Directory Service. However, […]


Running TorchServe on Amazon Elastic Kubernetes Service

This article was contributed by Josiah Davis, Charles Frenzel, and Chen Wu. TorchServe is a model serving library that makes it easy to deploy and manage PyTorch models at scale in production environments. TorchServe removes the heavy lifting of deploying and serving PyTorch models with Kubernetes. TorchServe is built and maintained by AWS in collaboration […]


Enterprise-ready Kubeflow: Securing and scaling AI and machine learning pipelines with AWS

Many AWS customers are building AI and machine learning pipelines on top of Amazon Elastic Kubernetes Service (Amazon EKS) using Kubeflow across many use cases, including computer vision, natural language understanding, speech translation, and financial modeling. In this post, we will describe AWS contributions to the Kubeflow project, which provide enterprise readiness for Kubeflow deployments. […]


Monitor AWS services used by Kubernetes with Prometheus and PromCat

AWS offers Amazon CloudWatch to provide observability into the operational health of your AWS resources and applications through logs, metrics, and events. CloudWatch is a great way to monitor and visualize metrics and logs from AWS resources. Recently I’ve found that some customers are adopting Prometheus as their monitoring standard because it offers the ability to […]


Deploy, track, and roll back RDS database code changes using open source tools Liquibase and Jenkins

Customers across industries and verticals deal with relational database code deployment. In most cases, developers rely on database administrators (DBAs) to perform the database code deployment. This works well when the number of databases and the amount of database code changes are low. As organizations scale, however, they deal with different database engines—including Oracle, SQL […]
