AWS Partner Network (APN) Blog

Tag: Apache Hive

Apache Iceberg: An Introduction from Rackspace on Running the New Open Table Format on AWS

Data-driven decision making is accelerating and defining the way organizations work. With this transformation, there has been a rapid adoption of data lakes across the industry. Hear from Rackspace, an AWS Premier Tier Services Partner, about the drawbacks of existing data lake architecture, what Apache Iceberg is, and how it overcomes the shortcomings of the current state of data lakes. Then dive deep on the design differences between Apache Hive and Iceberg.

How WANdisco LiveData Migrator Can Migrate Apache Hive Metastore to AWS Glue Data Catalog

Big datasets have traditionally been locked on-premises because of data gravity, making it difficult to leverage cloud-native, serverless, and cutting-edge technologies provided by AWS and its community of partners. Modernizing an on-premises analytics platform takes time, effort, and careful planning. Explore the challenges of migrating large, complex, actively used structured datasets to AWS and how the combination of WANdisco LiveData Migrator, Amazon S3, and AWS Glue Data Catalog overcomes those challenges.

Lower TCO and Increase Query Performance by Running Hive on Spark in Amazon EMR

Learn how Mactores helped Seagate Technology use Apache Hive on Apache Spark for queries larger than 10 TB, combined with transient Amazon EMR clusters leveraging Amazon EC2 Spot Instances. It was imperative for Seagate to have systems in place to ensure the cost of collecting, storing, and processing data did not exceed their ROI. Moving to Hive on Spark enabled Seagate to continue processing petabytes of data at scale with significantly lower TCO.

Optimizing Presto SQL on Amazon EMR to Deliver Faster Query Processing

Seagate asked Mactores Cognition to evaluate and deliver an alternative data platform to process petabytes of data with consistent performance. It needed to lower query processing time and total cost of ownership, and provide the scalability required to support about 2,000 daily users. Learn about the three migration options Mactores tested and the architecture of the solution Seagate selected. This effort improved the overall efficiency of Seagate’s Amazon EMR cluster and business operations.