AWS Open Source Blog
Category: Technical How-to
Building a multi-tenant Kubeflow environment on Amazon EKS using Amazon Cognito and ADFS
NOTE: Since this blog post was written, much about Kubeflow has changed. While we are leaving it up for historical reference, more accurate information about Kubeflow on AWS can be found here. The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable, and scalable. The project’s goal is […]
Amazon MWAA with AWS CodeArtifact for Python dependencies
This post was written by Dzenan Softic and Sam Dengler. Many organizations rely on Apache Airflow, an open source project, to orchestrate their data pipelines. In 2020, Amazon Web Services (AWS) released Amazon Managed Workflows for Apache Airflow (Amazon MWAA), which lets engineers focus on business solutions rather than on running and maintaining infrastructure for […]
Auto-instrumenting a Python application with an AWS Distro for OpenTelemetry Lambda layer
Customers want better insight into the behavior of their systems, but not all customers can afford to make significant code changes in their existing pipelines to add more observability. In this walkthrough, we explain how to get telemetry data from AWS Lambda Python functions, without having to change a line of code. Find the […]
Querying AWS at scale across APIs, Regions, and accounts
This post was contributed by David Boeke, Bob Tordella, Jon Udell, and Nathan Wallace. Steampipe is an open source tool for querying cloud APIs in a universal way and reasoning about the data in SQL. To enable you to select * from cloud, the tool embeds Postgres, and it maps cloud APIs to database tables […]
Scaling Cortex with parallel compaction
In this post, Albert Choi, an intern on the Amazon Managed Service for Prometheus team, shares his experience of designing and implementing parallel compactors inside of the Cortex open source project. The addition to the compactors enables Cortex to handle large volumes of active metrics per tenant. This blog post details the work done as […]
Performing canary deployments and metrics-driven rollback with Amazon Managed Service for Prometheus and Flagger
This post was written by Kevin Bell and Stefan Prodan. Canary deployments are a popular tool to reduce risk when deploying software, by exposing a new version to a small subset of traffic before rolling it out more broadly. Creating the machinery to do this kind of controlled rollout, and monitoring for possible problems and […]
Delta Sharing on AWS
This post was written by Frank Munz, Staff Developer Advocate at Databricks. An introduction to Delta Sharing: During the past decade, much thought went into system and application architectures using domain-driven design and microservices, but we are still on the verge of building distributed data meshes. Such data meshes are based on two fundamental principles: […]
Implementing a hub and spoke dashboard for multi-account data science projects
Modern data science environments often involve many independent projects, each spanning multiple accounts. To maintain a global overview of the activities within the projects, a mechanism that collects data from the different accounts into a central one is crucial. In this post, we show how to leverage existing services: Amazon DynamoDB, AWS Lambda, Amazon […]
Enhancing Spinnaker deployment for dynamic AWS account registration
This post was written by Manabu McCloskey, Gaurav Dhamija, Nima Kaviani, Siddhi Shah, Kevin Kidd, Brandon Leach, and Shrirang Moghe. Multi-account Amazon Web Services (AWS) environments are a recommended best practice through which AWS customers can have clear separation of concerns across teams and applications where rapid innovation, flexible security controls, and varied adoption of […]
Setting up Amazon Managed Grafana cross-account data source using customer managed IAM roles
Amazon Managed Grafana is a fully managed and secure data visualization service for open source Grafana that enables customers to instantly query, correlate, and visualize operational metrics, logs, and traces for their applications from multiple data sources. Amazon Managed Grafana integrates with multiple Amazon Web Services (AWS) security services, and supports AWS Single Sign-On (AWS […]