AWS Big Data Blog
Top analytics announcements of AWS re:Invent 2022
Missed AWS re:Invent 2022? We’ve got you covered!
AWS offers the most scalable, highest-performing data services to keep up with the growing volume and velocity of data and help organizations be data-driven in real time. We help customers unify diverse data sources by investing in a zero-ETL future. We provide the industry’s most comprehensive set of capabilities for an end-to-end data strategy for all your workloads. And our services help you enable end-to-end data governance so your teams are free to move faster with data.
This post walks you through all of the new analytics service launches. You’ll find links to blog posts, announcements, session recordings, and press releases so you can dive deeper into those launches.
For more 2022 re:Invent recaps and ongoing coverage of all the important AWS launches, be sure to stay in touch with us here:
- Analytics at re:Invent: session recordings
- AWS Databases & Analytics online events: Analytics in 15, fireside chat, deep dive, and roadmap
- Top announcements of AWS re:Invent 2022: AWS VP and Chief Evangelist Jeff Barr’s picks for some of the most impactful launches
Press releases
Press release – AWS Announces Five New Database and Analytics Capabilities
This press release covers five new database and analytics capabilities that make it faster and easier for you to manage and analyze data at petabyte scale—Amazon DocumentDB Elastic Clusters, Amazon OpenSearch Serverless, Amazon Athena for Apache Spark, AWS Glue Data Quality, and Amazon Redshift Multi-AZ.
Press release – AWS Announced Five New Capabilities for Amazon QuickSight
This press release covers five new capabilities for Amazon QuickSight, the most popular serverless BI service built for the cloud. These new capabilities will help customers streamline business intelligence operations.
Press release – AWS Announced Two New Capabilities to Move Toward a Zero-ETL Future on AWS
This press release covers two new integrations that make it easier for customers to connect and analyze data across data stores without having to move data between services.
Press release – AWS Announced Amazon DataZone
Amazon DataZone makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on premises, and on third-party sources. To learn more about Amazon DataZone, please visit the product page or watch the re:Invent session recording.
Keynotes and leadership sessions
Adam Selipsky, Chief Executive Officer of Amazon Web Services, highlighted innovations in data, infrastructure, and more that are helping customers achieve their goals faster, take advantage of untapped potential, and create a better future with AWS. The new analytics launches Adam mentioned include Amazon OpenSearch Serverless, Amazon DataZone, and Amazon Aurora zero-ETL integration with Amazon Redshift.
Swami Sivasubramanian, Vice President of AWS Data and Machine Learning, revealed the latest AWS innovations that can help you transform your company’s data into meaningful insights and actions for your business. In this keynote, Swami launched Amazon Athena for Apache Spark, Amazon Redshift integration for Apache Spark, Amazon Redshift Multi-AZ, AWS Glue Data Quality, and other new AWS capabilities.
G2 Krishnamoorthy, VP of AWS Analytics, covered the latest service innovations around data and also highlighted customer successes with AWS analytics. New analytics capabilities he covered include AWS Glue for Ray, Amazon Redshift Streaming Ingestion, Amazon Aurora zero-ETL to Amazon Redshift, Amazon Redshift integration for Apache Spark, Amazon DataZone, Amazon Athena for Apache Spark, Amazon OpenSearch Serverless, Amazon QuickSight API, and more.
Mai-Lan Tomsen Bukovec, Vice President of AWS Foundational Data Services, and Andy Warfield, AWS Distinguished Engineer, shared the latest AWS storage innovations and an inside look at how customers drive modern business on data lakes and with high-performance data. They also dived deep into technical and organizational strategies that protect with resilience, respond with agility, and fuel innovations with data-driven insights on AWS storage.
Amazon DataZone
To gain value from your data, you need to make it accessible to the people and systems that need it for analytics. This session introduces you to Amazon DataZone, a new AWS business data catalog that allows you to unlock data across organizational boundaries with built-in governance.
Amazon Security Lake
- Blog – Preview: Amazon Security Lake – A Purpose-Built Customer-Owned Data Lake Service
This new service automatically centralizes your organization’s security data from cloud and on-premises sources into a purpose-built data lake stored in your account.
Amazon Redshift
- Blog – New for Amazon Redshift – Zero-ETL integration, simplified data ingestion techniques, security and reliability features.
This year at re:Invent, Amazon Redshift has announced a number of features to help you simplify data ingestion and get to insights easily and quickly within a secure, reliable environment.
- Blog – New for Amazon Redshift – General Availability of Streaming Ingestion for Kinesis Data Streams and Managed Streaming for Apache Kafka
Amazon Redshift Streaming Ingestion ingests hundreds of megabytes of data per second so you can query data in near real time. You can connect to multiple Kinesis Data Streams or Amazon Managed Streaming for Apache Kafka data streams and pull data directly to Amazon Redshift without staging data in Amazon S3.
- Blog – New – Amazon Redshift Integration with Apache Spark
This new release makes it easy to build and run Spark applications on Amazon Redshift and Redshift Serverless, enabling you to open up the data warehouse for a broader set of AWS analytics and machine learning (ML) solutions.
- Blog – Simplify data ingestion from Amazon S3 to Amazon Redshift using auto-copy
This feature simplifies data loading from Amazon S3 into Amazon Redshift. You can now set up continuous file ingestion rules to track your Amazon S3 paths and automatically load new files without needing additional tools or custom solutions.
- Blog – Centrally manage access and permissions for Amazon Redshift data sharing with AWS Lake Formation
This feature enables you to centrally manage access to your Amazon Redshift datashares using Lake Formation. It opens up new design patterns and broadens governance and security posture across data warehouses.
- Blog – Simplify data loading on the Amazon Redshift console with Informatica Data Loader
You need to bring data quickly and at scale from various data stores. You also need a simple, easy, and cloud-native solution to quickly onboard new data sources or to analyze recent data for actionable insights. With this feature, you can securely connect and load data to Amazon Redshift at scale via a simple and guided interface.
Watch the recording to learn about important new features of Amazon Redshift. Learn how Amazon Redshift reinvented data warehousing to help you analyze all your data across data lakes, data warehouses, and databases with the best price performance. In this session, Goldman Sachs shared their Amazon Redshift use case.
Amazon OpenSearch Service
- Blog – Run Search and Analytics Workloads without Managing Clusters
This new release provisions and scales resources to deliver fast data ingestion and query responses for even the most demanding and unpredictable workloads, eliminating the need to configure and optimize clusters. Learn more by watching the recording of the re:Invent breakout session Provision & scale OpenSearch resources with serverless. To get started with Amazon OpenSearch Serverless, please join this workshop.
Streaming services
- Announcement – Introducing Amazon Managed Streaming for Apache Kafka (MSK) Delivery Partners
The new Amazon MSK Service Delivery specialization for AWS partners helps customers migrate and build real-time streaming analytics solutions with fully managed Apache Kafka.
- Announcement – Amazon Kinesis Data Firehose adds support for data stream delivery to Amazon OpenSearch Serverless
With a few clicks, you can ingest, transform, and reliably deliver streaming data into Amazon OpenSearch Serverless without building and managing your own data ingestion and delivery infrastructure.
AWS Glue
- Blog – Join the Preview – AWS Glue Data Quality
AWS Glue Data Quality can analyze your tables and recommend a set of rules automatically based on what it finds.
- Announcement – Announcing AWS Glue for Ray (Preview)
Data engineers can use AWS Glue for Ray to process large datasets with Python and popular Python libraries.
- Blog – New AWS Glue 4.0 – New and Updated Engines, More Data Formats, and More
This version of AWS Glue includes Python 3.10 and Apache Spark 3.3.0, plus native support for the Cloud Shuffle Storage Plugin for Apache Spark. It also includes Pandas support and more.
- Announcement – Introducing AWS Glue Delivery
The new AWS Glue Delivery specialization validates AWS Partners with deep expertise and proven success in delivering AWS Glue for data integration, data pipeline, and data catalog use cases.
- Announcement – AWS Glue for Apache Spark Native support for Data Lake Frameworks
AWS Glue for Apache Spark now supports three open-source data lake storage frameworks: Apache Hudi, Apache Iceberg, and Linux Foundation Delta Lake.
- Announcement – AWS Glue introduces custom visual transforms
AWS Glue now offers custom visual transforms, which let customers define, reuse, and share business-specific ETL logic among their teams.
Amazon Athena
- Blog – New — Amazon Athena for Apache Spark
With this feature, you can run Apache Spark workloads, use Jupyter Notebook as the interface to perform data processing on Athena, and programmatically interact with Spark applications using Athena APIs.
Amazon QuickSight
- Blog – New analytical questions available in Amazon QuickSight Q: “Why” and “Forecast”
Amazon QuickSight announces support for two new question types that simplify and scale complex analytical tasks using natural language: “forecast” and “why.”
- Blog – Announcing Automated Data Preparation for Amazon QuickSight Q
Automated data preparation uses machine learning to infer semantic information about data and adds it to datasets as metadata about the columns (fields), making it faster for you to prepare data in order to support natural language questions.
- Blog – New Amazon QuickSight API Capabilities to Accelerate Your BI Transformation
New QuickSight API capabilities allow programmatic creation and management of dashboards, analyses, and templates.
- Blog – Create and Share Operational Reports at Scale with Amazon QuickSight Paginated Reports
This feature allows customers to create highly formatted, personalized reports containing business-critical data and share them with hundreds of thousands of end users, without any infrastructure setup or maintenance, up-front licensing, or long-term commitments.
Amazon AppFlow
- Blog – Announcing Additional Data Connectors for Amazon AppFlow
We’ve added 22 new data connectors for Amazon AppFlow, including connectors for marketing, customer service and engagement, and business operations.
Thanks for reading! re:Invent is certainly not just about new launches. The Analytics and Business Intelligence tracks dived deep into each one of the analytics services through sessions (86 in total!), covering a wide range of topics and use cases.
Check out the YouTube playlist for session recordings.
About the author
Gwen Chen is Senior Product Marketing Manager for Amazon Redshift and re:Invent Analytics Track Lead. She believes in the power of communication, and likes data, analytics, and AI/ML.