AWS for Industries
Travel and hospitality at AWS re:Invent 2022: Top announcements
AWS re:Invent 2022 was back in full swing this year with a robust audience and equally robust roster of new service and feature releases. Throughout the sessions, we heard travel and hospitality customers and Amazon Web Services (AWS) leaders discuss the importance of data, security, and sustainability. Here is a list of the top announcements most relevant for the travel and hospitality industry and a playlist of travel and hospitality on-demand breakout sessions you can watch for inspiration.
Unifying and governing data
In the 2022 Skift Digital Transformation Report, only 18 percent of respondents rated their current data collection and analytics capabilities as “excellent.” Respondents noted multiple challenges that kept them from using data more effectively, including siloed and fragmented data, poor data quality, IT infrastructure challenges, organizational structure barriers, and data security concerns.
The following service and feature announcements can help travel and hospitality companies overcome those challenges and turn their data into actionable insights.
- AWS Clean Rooms (Preview) helps customers and their partners to more easily and securely match, analyze, and collaborate on their combined datasets—without sharing or revealing underlying data.
- Amazon DataZone can be used to share, search, and discover data at scale across organizational boundaries. You can collaborate on data projects through a unified data analytics portal that gives you a 360-degree view of all your trusted business data regardless of where it is stored. Enforce your governance policies and access your data in accordance with your organization’s security and compliance regulations.
- Amazon Redshift announcements:
  - Amazon Aurora zero-ETL integration with Amazon Redshift: Amazon Aurora now supports zero-ETL integration with Amazon Redshift, to enable near real-time analytics and machine learning (ML) using Amazon Redshift on petabytes of transactional data from Aurora. Within seconds of transactional data being written into Aurora, the data is available in Amazon Redshift, so you don’t have to build and maintain complex data pipelines to perform extract, transform, and load (ETL) operations.
  - Amazon Redshift Integration for Apache Spark: Amazon Redshift Integration for Apache Spark simplifies and accelerates Apache Spark applications accessing Amazon Redshift data from AWS analytics services such as Amazon EMR, AWS Glue, and Amazon SageMaker. There is no manual setup or maintenance of uncertified connectors, so you can get started with Apache Spark jobs that use data in Amazon Redshift in seconds (see the PySpark sketch after this list).
  - Amazon Redshift extends SQL capabilities to simplify and speed up data warehouse migrations (Preview): Amazon Redshift now supports new SQL functionality to simplify building multi-dimensional analytics applications and incorporating fast-changing data in Amazon Redshift. In addition, Amazon Redshift now extends support for a larger semi-structured data size (up to 16 MB) when ingesting nested data from JSON and PARQUET source files. Together, these enhancements reduce the code conversion effort if you are migrating to Amazon Redshift from other data warehouse systems and help improve performance (see the example after this list).
- Amazon Athena now supports Apache Spark, a popular open-source distributed processing system that is optimized for fast analytics workloads against data of any size. Athena is an interactive query service that helps you query petabytes of data wherever it lives, such as in data lakes, databases, or other data stores. With Amazon Athena for Apache Spark, you get the streamlined, interactive, serverless experience of Athena with Spark, in addition to SQL (see the example after this list).
- AWS Glue announcements:
  - AWS Glue 4.0: AWS Glue version 4.0 upgrades the underlying engines to Apache Spark 3.3.0 and Python 3.10 so you can develop, run, and scale your data integration workloads and get insights faster. It upgrades connectors for native AWS Glue database sources such as Amazon RDS, MySQL, and SQL Server, which simplifies connections to common database sources. AWS Glue 4.0 also adds native support for the new Cloud Shuffle Storage Plugin for Apache Spark, which helps customers scale their disk usage during runtime (see the example after this list).
  - AWS Glue now offers custom visual transforms: Data engineers can now write reusable transforms for the AWS Glue visual job editor, which creates consistency between teams and helps keep jobs up to date by minimizing duplicate effort and code. You can define AWS Glue custom visual transforms using Apache Spark code as well as the user input form. You can also specify validations for the input form to help protect users from making mistakes. Once you save the files defining the transform to your AWS account, it automatically appears in the dropdown list of available transforms in the visual job editor.
  - AWS Glue for Apache Spark now supports three open-source data lake storage frameworks, Apache Hudi, Apache Iceberg, and Linux Foundation Delta Lake: This feature removes the need to install a separate connector and reduces the configuration steps required to use these frameworks in AWS Glue for Apache Spark jobs. These open-source data lake frameworks simplify incremental data processing in data lakes built on Amazon Simple Storage Service (Amazon S3). They enable capabilities including time travel queries, ACID (Atomicity, Consistency, Isolation, Durability) transactions, streaming ingestion, change data capture (CDC), upserts, and deletes (see the Iceberg example after this list).
  - AWS Glue for Ray (Preview) is a new engine option on AWS Glue: Data engineers can use AWS Glue for Ray to process large datasets with Python and popular Python libraries. AWS Glue for Ray combines the serverless data integration of AWS Glue with Ray (ray.io), a popular new open-source compute framework that helps you scale Python workloads. You can create and run Ray jobs anywhere that you run AWS Glue ETL jobs. When the Ray job is ready, you can run it on demand or on a schedule (see the sketch after this list).
  - AWS Glue Data Quality (Preview) for AWS Glue: This feature helps data engineers and analysts avoid creating manual data quality checks on data being written to a data lake or data warehouse. It can prevent “bad” data from entering your downstream applications (reporting, BI, ML). AWS Glue Data Quality uses the open-source Deequ library to evaluate rules (see the example after this list).
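To make the Amazon Redshift Integration for Apache Spark announcement more concrete, here is a minimal PySpark sketch that reads a Redshift table into a Spark DataFrame. It assumes an environment where the integration is preinstalled (for example, AWS Glue 4.0 or a recent Amazon EMR release); the data source name follows the community spark-redshift connector the integration builds on, and the JDBC URL, table, temporary S3 path, and IAM role are placeholders.

```python
# Minimal sketch: read an Amazon Redshift table into a Spark DataFrame.
# Assumes the Redshift integration for Apache Spark is available in the runtime
# (for example, AWS Glue 4.0 or a recent Amazon EMR release); all identifiers
# below are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("redshift-spark-sketch").getOrCreate()

bookings = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.xxxxxx.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("dbtable", "public.hotel_bookings")
    .option("tempdir", "s3://example-bucket/redshift-temp/")
    .option("aws_iam_role", "arn:aws:iam::123456789012:role/RedshiftSparkRole")
    .load()
)

# Work with the result as a regular Spark DataFrame.
bookings.groupBy("market").count().show()
```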
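For the extended Redshift SQL capabilities, a worked example helps show what multi-dimensional analytics can look like in practice. The sketch below submits a ROLLUP-style aggregation through the Amazon Redshift Data API from Python; the workgroup, database, table, and the choice of ROLLUP as the illustrative construct are assumptions rather than details taken from the announcement.

```python
# Hedged sketch: submit a multi-dimensional aggregation to Amazon Redshift
# Serverless with the Redshift Data API. The workgroup, database, table, and the
# ROLLUP construct used for illustration are placeholders/assumptions.
import boto3

redshift_data = boto3.client("redshift-data", region_name="us-east-1")

sql = """
SELECT region, property_type, SUM(revenue) AS revenue
FROM bookings
GROUP BY ROLLUP (region, property_type)
"""

response = redshift_data.execute_statement(
    WorkgroupName="travel-analytics",  # or ClusterIdentifier=... for a provisioned cluster
    Database="dev",
    Sql=sql,
)
print(response["Id"])  # poll describe_statement / get_statement_result with this Id
```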
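Amazon Athena for Apache Spark can also be driven programmatically. The hedged sketch below starts a Spark session in a Spark-enabled Athena workgroup and submits a small PySpark calculation with boto3; the workgroup name and S3 path are placeholders, and error handling is omitted.

```python
# Hedged sketch: start an Athena Spark session and submit a PySpark calculation.
# The workgroup must be a Spark-enabled Athena workgroup; names and paths are
# placeholders, and polling is simplified.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

session = athena.start_session(
    WorkGroup="spark-workgroup",
    EngineConfiguration={"MaxConcurrentDpus": 4},
)
session_id = session["SessionId"]

# Wait for the session to become idle before submitting code.
while athena.get_session_status(SessionId=session_id)["Status"]["State"] != "IDLE":
    time.sleep(5)

calculation = athena.start_calculation_execution(
    SessionId=session_id,
    CodeBlock="df = spark.read.parquet('s3://example-bucket/bookings/')\nprint(df.count())",
)
print(calculation["CalculationExecutionId"])
```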
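Opting into AWS Glue 4.0 is largely a matter of selecting the new runtime version when you create or update a job. Here is a minimal boto3 sketch; the job name, IAM role, and script location are placeholders.

```python
# Minimal sketch: create a Glue ETL job on the Glue 4.0 runtime
# (Apache Spark 3.3.0 / Python 3.10). Names, role, and script location are placeholders.
import boto3

glue = boto3.client("glue", region_name="us-east-1")

glue.create_job(
    Name="bookings-etl-glue4",
    Role="arn:aws:iam::123456789012:role/GlueJobRole",
    GlueVersion="4.0",  # selects the Spark 3.3.0 / Python 3.10 engines
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://example-bucket/scripts/bookings_etl.py",
        "PythonVersion": "3",
    },
    WorkerType="G.1X",
    NumberOfWorkers=10,
)
```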
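To illustrate the built-in data lake format support, here is a hedged sketch of a Glue for Apache Spark job script that writes an Apache Iceberg table registered in the AWS Glue Data Catalog. It assumes the job is created with the --datalake-formats job parameter set to iceberg; the Spark configuration follows Iceberg's documented Glue catalog setup, and the bucket, database, and table names are placeholders.

```python
# Hedged sketch: write an Apache Iceberg table from a Glue for Apache Spark job.
# Assumes the job parameter --datalake-formats=iceberg; bucket, database, and
# table names are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl",
            "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue_catalog.warehouse", "s3://example-bucket/iceberg-warehouse/")
    .getOrCreate()
)

raw = spark.read.parquet("s3://example-bucket/raw/bookings/")

# Create or replace an Iceberg table in the Glue Data Catalog.
raw.writeTo("glue_catalog.travel.bookings").using("iceberg").createOrReplace()

# Time travel queries then become possible, for example:
# spark.read.option("as-of-timestamp", "1669852800000").table("glue_catalog.travel.bookings")
```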
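An AWS Glue for Ray job is, at its core, a Python script that uses Ray. The sketch below is a plain Ray program that fans a function out across workers; the workload is illustrative, and the assumption is that in a Glue for Ray job ray.init() attaches to the cluster the service provisions (when run locally it simply starts a local Ray instance).

```python
# Hedged sketch: a Ray script of the kind you could run as an AWS Glue for Ray job.
# The per-item work is a placeholder; ray.init() attaches to whatever Ray cluster
# is available (the Glue-provisioned one in a Glue for Ray job, or a local one).
import ray

ray.init()

@ray.remote
def score_property(property_id: int):
    # Placeholder for real per-property work (feature engineering, scoring, etc.).
    return property_id, property_id * 0.1

futures = [score_property.remote(i) for i in range(1000)]
results = ray.get(futures)
print(f"Scored {len(results)} properties")
```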
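Finally, for AWS Glue Data Quality, rules are expressed in the Data Quality Definition Language (DQDL) and can be attached to Data Catalog tables. The boto3 sketch below creates a small ruleset and starts an evaluation run; the rules, table, database, and IAM role are placeholders, so check the DQDL reference for the exact rule syntax you need.

```python
# Hedged sketch: create a Glue Data Quality ruleset for a Data Catalog table and
# start an evaluation run. Rules, names, and the IAM role are placeholders.
import boto3

glue = boto3.client("glue", region_name="us-east-1")

glue.create_data_quality_ruleset(
    Name="bookings-quality",
    Ruleset='Rules = [ RowCount > 0, IsComplete "booking_id", IsUnique "booking_id" ]',
    TargetTable={"DatabaseName": "travel", "TableName": "bookings"},
)

run = glue.start_data_quality_ruleset_evaluation_run(
    DataSource={"GlueTable": {"DatabaseName": "travel", "TableName": "bookings"}},
    Role="arn:aws:iam::123456789012:role/GlueDataQualityRole",
    RulesetNames=["bookings-quality"],
)
print(run["RunId"])
```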
Enhancing security
When asked about software and technology investments for 2022 and 2023, 32 percent of Skift respondents said they plan to invest in improving cybersecurity/data security. Two ways in which travel and hospitality customers can do so:
- Amazon Security Lake (Preview): Amazon Security Lake automatically centralizes security data from cloud, on-premises, and custom sources into a purpose-built data lake stored in your account. With Security Lake, you can get a more complete understanding of your security data across your entire organization. You can also improve the protection of your workloads, applications, and data. Security Lake has adopted the Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service can normalize and combine security data from AWS and a broad range of enterprise security data sources.
- Amazon Macie introduces automated sensitive data discovery: With this new capability, Macie automatically and intelligently samples and analyzes objects across your Amazon S3 buckets, inspecting them for sensitive data such as personally identifiable information (PII), financial data, and AWS credentials. Macie then builds and continuously maintains an interactive data map of where your sensitive data resides in S3 across all accounts and Regions where you’ve enabled Macie, and provides a sensitivity score for each bucket. This helps you continuously identify and remediate data security risks without manual configuration and lowers the cost to monitor for and respond to data security risks (see the example after this list).
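Automated sensitive data discovery is an account-level Macie setting, so enabling it programmatically is straightforward. Here is a minimal boto3 sketch, assuming Macie is already enabled in the account and Region (for an organization you would typically run this from the Macie administrator account).

```python
# Minimal sketch: turn on Macie automated sensitive data discovery and read back
# its status. Assumes Macie is already enabled in this account and Region.
import boto3

macie = boto3.client("macie2", region_name="us-east-1")

# Enable automated discovery; Macie then samples and analyzes S3 objects over time.
macie.update_automated_discovery_configuration(status="ENABLED")

config = macie.get_automated_discovery_configuration()
print(config["status"])
```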
Other services, features, and announcements you might be interested in:
- AWS MGN new migration and modernization features: AWS Application Migration Service (AWS MGN) announced support for several new migration and modernization features, including application and wave management, custom modernization actions, and launch template configuration. Application Migration Service helps minimize time-intensive manual processes by automating the conversion of your source servers to run natively on AWS with optional modernization features.
- Account customization within AWS Control Tower: With this release, you can now use AWS Control Tower to define account blueprints that scale your multi-account provisioning without starting from scratch with every account. An account blueprint describes the specific resources and configurations that are used when an account is provisioned. You may also use pre-defined blueprints, built and managed by AWS partners, to customize accounts for specific use cases.
- AWS Marketplace for containers supports direct deployment to EKS clusters: Amazon EKS customers can now find and deploy third-party operational software to their EKS clusters through the EKS console or by using the AWS CLI, eksctl, AWS APIs, or infrastructure-as-code tools such as AWS CloudFormation and Terraform. This reduces the time required to find, subscribe to, and deploy third-party software, helping customers set up production-ready EKS clusters in minutes (see the example after this list).
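For the AWS Marketplace for containers announcement, third-party software you subscribe to can be installed on a cluster as an EKS add-on. The boto3 sketch below lists available versions and installs one; the add-on name is a hypothetical placeholder, since real names come from the Marketplace listing and from describe_addon_versions.

```python
# Hedged sketch: install third-party software from AWS Marketplace on an EKS
# cluster as an EKS add-on. The add-on name is a hypothetical placeholder; use
# the name returned by describe_addon_versions for the product you subscribed to.
import boto3

eks = boto3.client("eks", region_name="us-east-1")

# Discover add-on versions compatible with the cluster's Kubernetes version.
versions = eks.describe_addon_versions(
    kubernetesVersion="1.24",
    addonName="example-vendor_example-agent",  # hypothetical Marketplace add-on name
)
print([addon["addonName"] for addon in versions["addons"]])

# Install the add-on on an existing cluster.
eks.create_addon(
    clusterName="prod-cluster",
    addonName="example-vendor_example-agent",
    serviceAccountRoleArn="arn:aws:iam::123456789012:role/ExampleAddonRole",
)
```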