Apache Hadoop, Apache Hive, Apache Spark, and Presto are commonly used to process, transform, and analyze data that is part of a larger data architecture. For data that needs to remain on-premises for governance, compliance, or other reasons, you can use EMR to deploy and run applications like Apache Hadoop and Apache Spark on-premises, close to your data. This reduces the need to move large amounts of on-premises data to the cloud, reducing the overall time needed to process that data.