AWS Big Data Blog
Query, visualize, and forecast TruFactor web session intelligence with AWS Data Exchange
This post showcases TruFactor Intelligence-as-a-Service data on AWS Data Exchange. TruFactor’s anonymization platform and proprietary AI ingest, filter, and transform more than 85 billion high-quality raw signals daily from wireless carriers, OEMs, and mobile apps into a unified phygital consumer graph across physical and digital dimensions. TruFactor intelligence is application-ready for use within any AWS analytics or ML service to power your models and applications running on AWS, with no additional processing required.
Accelerate Amazon Redshift Federated Query adoption with AWS CloudFormation
Amazon Redshift Federated Query allows you to combine the data from one or more Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL databases with data already in Amazon Redshift. You can also combine such data with data in an Amazon S3 data lake.
Build a Simplified ETL and Live Data Query Solution using Redshift Federated Query
You may have heard the saying that the best ETL is no ETL. Amazon Redshift now makes this possible with Federated Query. In its initial release, this feature lets you query data in Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL using Amazon Redshift external schemas. Federated Query also exposes the metadata from these source databases through system views and driver APIs, which allows business intelligence tools like Tableau and Amazon QuickSight to connect to Amazon Redshift and query data in PostgreSQL without having to make local copies.
Build a cloud-native network performance analytics solution on AWS for wireless service providers
February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. This post demonstrates a serverless, cloud-based approach to building a network performance analytics solution using AWS services that can provide flexibility and performance while keeping costs under control with pay-per-use AWS services. […]
Integrating AWS Lake Formation with Amazon RDS for SQL Server
This post shows how to ingest data from Amazon RDS into a data lake on Amazon S3 using Lake Formation blueprints and how to have column-level access controls for running SQL queries on the extracted data from Amazon Athena.
A public data lake for analysis of COVID-19 data
April 2024: This post was reviewed for accuracy. As the COVID-19 pandemic continues to threaten and take lives around the world, we must work together across organizations and scientific disciplines to fight this disease. Innumerable healthcare workers, medical researchers, scientists, and public health officials are already on the front lines caring for patients, searching for […]
Simplify your Spark dependency management with Docker in EMR 6.0.0
Apache Spark is a powerful data processing engine that gives data analyst and engineering teams easy-to-use APIs and tools to analyze their data, but it can be challenging for teams to manage their Python and R library dependencies. Installing every dependency that a job may need before it runs and dealing with library […]
Improved speed and scalability in Amazon Redshift
Amazon Redshift delivers fast performance, at scale, for the most demanding workloads. Getting there was not easy, and it takes consistent investment across a variety of technical focus areas to make this happen. This post breaks down what it takes to build the world’s fastest cloud data warehouse.
Apache Hive is 2x faster with Hive LLAP on EMR 6.0.0
Customers use Apache Hive with Amazon EMR to provide SQL-based access to petabytes of data stored on Amazon S3. Amazon EMR 6.0.0 adds support for Hive LLAP, providing an average performance speedup of 2x over EMR 5.29, with up to 10x improvement on individual Hive TPC-DS queries. This post shows you how to enable Hive […]
Expertise validation in AWS data analytics with AWS Certification
AWS Training and Certification now has a new exam version for the AWS Certified Data Analytics – Specialty certification, which validates expertise in designing, building, and maintaining analytics solutions that are efficient, cost-effective, and secure.
The new exam version includes updated content across all domains: collection, storage and data management, processing, analysis and visualization, and security. Earning AWS Certified Data Analytics – Specialty shows that you meet the standard set by AWS data analytics experts.