AWS Big Data Blog

Category: Database

JOIN Amazon Redshift AND Amazon RDS PostgreSQL WITH dblink

Tony Gibbs is a Solutions Architect with AWS. (Update: This blog post has been translated into Japanese.) When it comes to choosing a SQL-based database in AWS, there are many options. Sometimes it can be difficult to know which one to choose. For example, when would you use Amazon Aurora instead of Amazon RDS PostgreSQL […]

Using Spark SQL for ETL

Ben Snively is a Solutions Architect with AWS. With big data, you deal with many different formats and large volumes of data. SQL-style queries have been around for nearly four decades. Many systems support SQL-style syntax on top of the data layers, and the Hadoop/Spark ecosystem is no exception. This allows companies to try new […]

Real-time in-memory OLTP and Analytics with Apache Ignite on AWS

Babu Elumalai is a Solutions Architect with AWS. Organizations are generating tremendous amounts of data, and they increasingly need tools and systems that help them use this data to make decisions. The data has both immediate value (for example, trying to understand how a new promotion is performing in real time) and historic value (trying […]

Encrypt Your Amazon Redshift Loads with Amazon S3 and AWS KMS

Russell Nash is a Solutions Architect with AWS. Have you been looking for a straightforward way to encrypt your Amazon Redshift data loads? Have you wondered how to safely manage the keys and where to perform the encryption? In this post, I will walk through a solution that meets these requirements by showing you how […]

Analyze Your Data on Amazon DynamoDB with Apache Spark

Manjeet Chayel is a Solutions Architect with AWS. Every day, tons of customer data is generated, such as website logs, gaming data, advertising data, and streaming videos. Many companies capture this information as it’s generated and process it in real time to understand their customers. Amazon DynamoDB is a fast and flexible NoSQL database service […]

Amazon Redshift UDF repository on AWSLabs

Christopher Crosbie is a Healthcare and Life Science Solutions Architect with Amazon Web Services. Zach Christopherson, an Amazon Redshift Database Engineer, contributed to this post. Did you ever have a need for complex string parsing in Amazon Redshift and wish you could simply add f_parse_url_query_string(url) to your SQL query? Have you ever tried to weigh which would be less […]

Agile Analytics with Amazon Redshift

Nick Corbett is a Big Data Consultant for AWS Professional Services. What makes outstanding business intelligence (BI)? It needs to be accurate and up-to-date, but this alone won’t differentiate a solution. Perhaps a better measure is to consider the reaction you get when your latest report or metric is released to the business. Good BI […]

Query Routing and Rewrite: Introducing pgbouncer-rr for Amazon Redshift and PostgreSQL

Bob Strahan is a senior consultant with AWS Professional Services. Have you ever wanted to split your database load across multiple servers or clusters without impacting the configuration or code of your client applications? Or perhaps you have wished for a way to intercept and modify application queries, so that you can make them use […]

Performance Tuning Your Titan Graph Database on AWS

At AWS re:Invent 2017, we announced the preview of Amazon Neptune, a fast and reliable graph database built for the cloud. Neptune is fully managed and highly available, and it includes read replicas, point-in-time recovery, and continuous backups to Amazon S3. If you are about to build an application yourself and need a graph database, […]

Top 10 Performance Tuning Techniques for Amazon Redshift

Ian Meyers is a Solutions Architecture Senior Manager with AWS. Zach Christopherson, an Amazon Redshift Database Engineer, contributed to this post. Amazon Redshift is a fully managed, petabyte-scale, massively parallel data warehouse that offers simple operations and high performance. Customers use Amazon Redshift for everything from accelerating existing database environments that are struggling to […]
