Tag: Presto on Amazon EMR
Learn how Mactores helped Seagate Technology use Apache Hive on Apache Spark for queries larger than 10 TB, combined with transient Amazon EMR clusters running on Amazon EC2 Spot Instances. It was imperative for Seagate to have systems in place to ensure the cost of collecting, storing, and processing data did not exceed their ROI. Moving to Hive on Spark enabled Seagate to continue processing petabytes of data at scale with significantly lower TCO.
Seagate asked Mactores Cognition to evaluate and deliver an alternative data platform to process petabytes of data with consistent performance. The platform needed to lower query processing time and total cost of ownership, and provide the scalability required to support about 2,000 daily users. Learn about the three migration options Mactores tested and the architecture of the solution Seagate selected. This effort improved the overall efficiency of Seagate’s Amazon EMR cluster and business operations.