AWS Big Data Blog
Amazon EMR introduces EMR runtime for Presto, providing a 2.6 times speedup
Presto is an open-source distributed SQL query engine for running interactive analytic queries against data sources of all sizes ranging from gigabytes to petabytes. Presto was designed and written from the ground up for interactive analytics, and approaches the speed of commercial data warehouses while scaling to the size of organizations like Facebook. Running Presto […]
Amazon Redshift announces general availability of support for JSON and semi-structured data processing
At AWS re:Invent 2020, we announced the preview of native support for JSON and semi-structured data in Amazon Redshift. This includes a new data type, SUPER, which allows you to store JSON and other semi-structured data in Amazon Redshift tables, and support for the PartiQL query language, which allows you to seamlessly query and process […]
Build a Lake House Architecture on AWS
Organizations can gain deeper and richer insights when they bring together all their relevant data of all structures and types and from all sources to analyze. In order to analyze these vast amounts of data, they are taking all their data from various silos and aggregating all of that data in one location, what many […]
How Goldman Sachs migrated from their on-premises Apache Kafka cluster to Amazon MSK
This is a guest post by Zachary Whitford, Associate, Richa Prajapati, Vice President and Aldo Piddiu, Vice President in the Global Investment Research engineering team at Goldman Sachs. The Global Investment Research (GIR) division at Goldman Sachs delivers client-focused research in the equity, fixed income, currency, and commodities markets. GIR analysts help the firm’s investor […]
Manage fine-grained access control using AWS Lake Formation
AWS Lake Formation is a fully managed service that helps you build, secure, and manage data lakes, and provide access control for data in the data lake. Customers across lines of business (LOBs) need a way to manage granular access permissions for different users at the table and column level. Lake Formation helps you manage […]
Set up and manage data ingestion easily with Amazon Redshift native console integration with partners
We’re excited to announce that Amazon Redshift console partner integration is now generally available. This new console integration provides rapid provisioning and seamless integration with AWS partners. You can onboard with data integration partner solutions in less than a minute directly on the Amazon Redshift console, and ingest data from multiple data sources using partners’ […]
How VNR AG built a serverless customer data platform to power BI reporting with Amazon QuickSight
This is a guest blog post by Marc Müller, David Amornvuttkul, and Amira Lotfy at VNR AG. German publishing house VNR AG has a simple mission: to make expert knowledge accessible to everyone. Founded in 1976, the company has published more than 300 volumes in law, investment, health, and workplace environments. It provides customers with […]
Amazon EMR announces general availability of EMR Studio
At AWS re:Invent 2020, we announced the preview of Amazon EMR Studio, an integrated development environment (IDE) that makes it easy for data scientists and data engineers to develop, visualize, and debug applications written in R, Python, Scala, and PySpark. Today, we’re excited to announce the general availability of EMR Studio and new features we’ve […]
Estimate Amazon EC2 Spot Instance cost savings with AWS Glue DataBrew, AWS Glue, and Amazon QuickSight
AWS provides many ways to optimize your workloads and save on costs. For example, services like AWS Cost Explorer and AWS Trusted Advisor provide cost savings recommendations to help you optimize your AWS environments. However, you may also want to estimate cost savings when comparing Amazon Elastic Compute Cloud (Amazon EC2) Spot to On-Demand Instances. […]
Bill.com uses Amazon QuickSight to enable users with secure and governed enterprise BI
Bill.com is a leading provider of cloud-based software that simplifies, digitizes, and automates back-office financial processes for small and mid-size businesses. Bill.com helps businesses streamline their financial workflow, generate and process invoices, stream approvals, send and receive payments, sync with their accounting systems, and manage their cash. It connects businesses from all industries, ranging from […]