AWS Open Source Blog

Category: Advanced (300)

Improving HA and long-term storage for Prometheus using Thanos on EKS with S3

Prometheus is an open source systems monitoring and alerting toolkit that is widely adopted as a standard monitoring tool with self-managed and provider-managed Kubernetes. Prometheus provides many useful features, such as dynamic service discovery, powerful queries, and seamless alert notification integration. Beyond a certain scale, however, problems arise when basic Prometheus capabilities do not meet requirements […]

Dgraph on AWS: Setting up a horizontally scalable graph database

This article is a guest post from Joaquin Menchaca, an SRE at Dgraph. Dgraph is an open source, distributed graph database, built for production environments, and written entirely in Go. Dgraph is fast, transactional, sharded, and distributed (joins, filters, sorts), consistently replicated with Raft, and provides fault tolerance with synchronous replication and horizontal scalability. The […]

Realize policy as code with AWS Cloud Development Kit through Open Policy Agent

AWS Cloud Development Kit (AWS CDK) is an open source software framework that allows users to define and provision AWS infrastructure using familiar programming languages. Using CDK, you can version control infrastructure, and the Infrastructure-as-Code concept opens up new opportunities to manage AWS infrastructure more efficiently and reliably. But when planning to deploy new AWS […]

Diagram: User uploads data in BIDS format to S3 and starts the Lambda function → Lambda parses the uploaded data and launches a cluster of EC2 instances → EC2 instances run fMRIprep, which preprocesses the data → preprocessed data are saved to S3.

fMRI data preprocessing on AWS using fMRIprep

A typical fMRI study produces terabytes or more of imaging data. Storing and preprocessing this data can be challenging on a single computer because it often has neither enough disk space to store the data nor enough computing power to preprocess it. Traditionally, researchers use a combination of cloud-based storage and on-premises high-performance clusters […]

Splitting an application’s logs into multiple streams: a Fluent tutorial

Not all logs are of equal importance. Some require real-time analytics; others simply need to be stored long term so that they can be analyzed if needed. In this tutorial, I will show three different methods by which you can “fork” a single application’s stream of logs into multiple streams which can be parsed, filtered, […]