AWS Partner Network (APN) Blog

Tag: Apache Hive


How WANdisco LiveData Migrator Can Migrate Apache Hive Metastore to AWS Glue Data Catalog

Big datasets have traditionally been locked on-premises because of data gravity, making it difficult to leverage cloud-native, serverless, and cutting-edge technologies provided by AWS and its community of partners. Modernizing an on-premises analytics platform takes time, effort, and careful planning. Explore the challenges of migrating large, complex, actively used structured datasets to AWS and how the combination of WANdisco LiveData Migrator, Amazon S3, and AWS Glue Data Catalog overcomes those challenges.


Lower TCO and Increase Query Performance by Running Hive on Spark in Amazon EMR

Learn how Mactores helped Seagate Technology use Apache Hive on Apache Spark for queries larger than 10TB, combined with transient Amazon EMR clusters leveraging Amazon EC2 Spot Instances. It was imperative for Seagate to have systems in place to ensure the cost of collecting, storing, and processing data did not exceed its ROI. Moving to Hive on Spark enabled Seagate to continue processing petabytes of data at scale with significantly lower TCO.


Optimizing Presto SQL on Amazon EMR to Deliver Faster Query Processing

Seagate asked Mactores Cognition to evaluate and deliver an alternative data platform to process petabytes of data with consistent performance. It needed to lower query processing time and total cost of ownership, and provide the scalability required to support about 2,000 daily users. Learn about the three migration options Mactores tested and the architecture of the solution Seagate selected. This effort improved the overall efficiency of Seagate’s Amazon EMR cluster and business operations.
