AWS Open Source Blog

Category: Technical How-to

Integrating EC2 macOS workers with EKS and GitLab

At our annual re:Invent conference in December 2020, we announced an all-new macOS-based Amazon Elastic Compute Cloud (Amazon EC2) instance. This new instance allows developers to build, test, and package their applications for all Apple platforms, such as macOS, iOS, iPadOS, tvOS, and watchOS. Customers have been asking us for ways to integrate their […]

Building a Prometheus Remote Write Exporter for the OpenTelemetry Python SDK

In this post, AWS intern engineers Azfaar Qureshi and Shovnik Bhattacharya talk about their experience building the OpenTelemetry Prometheus Remote Write Exporter for Python. They describe the challenges they faced while building this tool, which sends metrics to Prometheus protocol-based service endpoints. As software deployments become increasingly complex, […]

Using Amazon Managed Service for Prometheus to monitor EC2 environments

April 16, 2021: This article has been updated to reflect changes introduced by AWS Signature Version 4 support in the Prometheus server. We recently announced Amazon Managed Service for Prometheus (AMP), which allows you to create a fully managed, secure, Prometheus-compatible environment to ingest, query, and store Prometheus metrics. In a previous blog post from the […]

AWS ParallelCluster post-install: EnginFrame and DCV Session Manager Broker

With the newest tools and services provided by AWS, such as AWS ParallelCluster, you can set up a fully functional high-performance computing (HPC) cluster in minutes. ParallelCluster not only simplifies the process of setting up and running technical and scientific applications, it also takes advantage of the power, scale, and flexibility of the cloud and […]

Configuring Grafana Cloud Agent for Amazon Managed Service for Prometheus

This post was written by Robert Fratto, Imaya Kumar Jagannathan, and Alolita Sharma. The Grafana Cloud Agent is a lightweight alternative to running a full Prometheus server. It keeps the necessary parts for discovering and scraping Prometheus exporters and sending metrics to the backend, which in this case is the Amazon Managed Service for Prometheus […]

Continuous deployment of Cloud Custodian to AWS Control Tower

Cloud Custodian is an open source cloud security, governance, and management tool that enables users to keep their Amazon Web Services (AWS) environment secure and well managed by defining policies in a YAML domain-specific language (DSL). Cloud Custodian works by running the policies defined in a YAML file against AWS accounts. […]

Setting up Grafana on EC2 to query metrics from Amazon Managed Service for Prometheus

The recently launched Amazon Managed Service for Prometheus (AMP) provides a highly available and secure environment to ingest, query, and store Prometheus metrics. We can query the metrics from the AMP environment using Amazon Managed Grafana, a self-hosted Grafana server, or the HTTP APIs. In this article, we will look at how to […]

How the Bottlerocket build system works

Bottlerocket, launched in 2020, is an open source, special-purpose operating system designed for hosting Linux containers. As I delved into the Bottlerocket build system for a deeper understanding, I found it helpful to describe the system in detail (a form of rubber-duck debugging). This article is the result of my exploration and will […]

Leverage deep learning in Scala with GPU on Spark 3.0

This post was contributed by Qing Lan, Carol McDonald, and Kong Zhao. With the growing interest in deep learning (DL), more users are running DL in their production environments. Because DL requires intensive computational power, developers are leveraging GPUs to run their training and inference jobs. As part of a major Apache Spark initiative to […]

How Netflix uses Deep Java Library (DJL) for distributed deep learning inference in real-time

This post was written by Stanislav Kirdey, Lan Qing, Lai Wei, and Lu Huang. Netflix is one of the world’s largest entertainment services with over 260 million members in more than 190 countries. One of the ways Netflix is able to sustain a high-quality customer experience is by employing deep learning models in the observability […]