AWS Big Data Blog

Automating Index State Management for Amazon ES

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. With time-series data, recent data, such as the last 4 hours or 1 day, is accessed more often than older data. Often, application teams are tasked with maintaining multiple indexes for diverse data workloads, which brings […]

Migrating IBM Netezza to Amazon Redshift using the AWS Schema Conversion Tool

The post How to migrate a large data warehouse from IBM Netezza to Amazon Redshift with no downtime described a high-level strategy to move from an on-premises Netezza data warehouse to Amazon Redshift. In this post, we explain how a large European Enterprise customer implemented a Netezza migration strategy spanning multiple environments, using the AWS […]

Federating Amazon Redshift access from OneLogin

December 2022: This post was reviewed and updated for accuracy. You can use federation to access AWS accounts with credentials from a corporate directory, using open standards such as SAML to exchange identity and security information between an identity provider (IdP) and an application. With this integration, you manage user access to AWS resources centrally […]

Amazon QuickSight adds support for on-sheet filter controls

Amazon QuickSight now supports easy and intuitive filter controls that you can place beside visuals on dashboards, allowing readers to quickly slice and dice data in the context of its visual representation. You can create these filter controls from existing or new filters with a single click, and configure them to support different operations, such […]

Building high-quality benchmark tests for Redshift using open-source tools: Best practices

Amazon Redshift is the most popular and fastest cloud data warehouse, offering seamless integration with your data lake, up to three times faster performance, and up to 75% lower cost than any other cloud data warehouse. When you use Amazon Redshift to scale compute and storage independently, a need […]

Automating deployment of Amazon Redshift ETL jobs with AWS CodeBuild, AWS Batch, and DBT

This post was last reviewed and updated in June 2022 to update the code and service used in the AWS CloudFormation template. Data has become an essential part of every business, and its volume, velocity, and variety continue to increase. This has resulted in more complex, interdependent ETL jobs. There is also […]

ICBiome uses Amazon QuickSight to empower hospitals in dealing with harmful pathogens

In response to the COVID-19 pandemic, hospitals and healthcare organizations are increasingly employing genetic sequencing to screen, track, and contain harmful pathogens. ICBiome is a startup that has been working on this problem for several years, creating innovative data analytics products using AWS to help hospitals and researchers address both community-associated and hospital-acquired infections. Building […]

Enabling multi-factor authentication for an Amazon Redshift cluster using Okta as an identity provider

December 2022: This post was reviewed and updated for accuracy. Many organizations have started using single sign-on (SSO) with multi-factor authentication (MFA) for enhanced security. This additional authentication factor, now the new normal, strengthens the security provided by the user name and password model. Using SSO reduces the effort needed to maintain and remember […]

Unified serverless streaming ETL architecture with Amazon Kinesis Data Analytics

Businesses across the world are seeing a massive influx of data at an enormous pace through multiple channels. With the advent of cloud computing, many companies are realizing the benefits of getting their data into the cloud to gain meaningful insights and save costs on data processing and storage. As businesses embark on their journey […]

Normalize data with Amazon Elasticsearch Service ingest pipelines

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. Amazon OpenSearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost-effectively at scale. Search and log analytics are the two most popular use cases for Amazon OpenSearch Service. In log analytics […]