Apache Hadoop and Apache Spark to Amazon EMR Migration Acceleration Program

Are you having challenges with your on-premises Apache Hadoop or Apache Spark deployments? Are you frustrated with over-provisioning resources to handle workload variability? Do you spend too much time keeping up with rapidly changing open-source software?

If so, you are not alone. Migrating your big data and machine learning workloads to AWS and Amazon EMR offers many advantages over on-premises deployments. These include separation of compute and storage, increased agility, resilient and persistent storage, and managed services that provide up-to-date, familiar environments for developing and operating big data applications.

Jumpstart your migration to Amazon EMR

AWS is here to help you migrate your big data and applications. Our Apache Hadoop and Apache Spark to Amazon EMR Migration Acceleration Program provides two ways to help you get there quickly and with confidence.

Self-service EMR migration guide

Follow step-by-step instructions, get guidance on key design decisions, and learn best practices.

Free EMR migration workshop

Create a migration plan for your organization in a free workshop with EMR specialists, with virtual or on-site delivery.

Migrating big data and analytics workloads from on-premises to the cloud involves careful decision making. AWS has helped many customers successfully migrate their big data from on-premises to Amazon EMR. Based on these successes, we put together a new detailed, step-by-step EMR Migration Guide. In the Guide, you will learn the best practices for:

  • Migrating data, applications, and catalogs
  • Using persistent and transient resources
  • Configuring security policies, access controls, and audit logs
  • Estimating and minimizing costs, while maximizing value
  • Leveraging the AWS Cloud for high availability (HA) and disaster recovery (DR)
  • Automating common administrative tasks

This new EMR Migration Workshop is a multi-day, customizable workshop that can jumpstart your migration to the cloud. It provides a small, interactive setting in which participants work directly with AWS experts, discuss strategies, and map out a way forward. The workshop explains the benefits of a cloud-native architecture and dives deep into Amazon EMR’s benefits and common usage patterns, architecting a data lake on AWS, migration methodologies, and security controls. It also includes guided hands-on labs that let participants try the most common architecture patterns in the cloud.

Q: How do I know if I qualify for the workshop?

Get in touch with us and we can help determine whether you qualify. If you have an on-premises Apache Hadoop or Apache Spark workload and want to migrate to the AWS Cloud, you are a good candidate.

Q: Who do I need from my team for the workshop to be effective?

We recommend that your Apache Hadoop/Spark Admins, Data Engineers, and Infrastructure Engineers be present. You can also invite Analysts, Data Scientists, and ML Engineers.

Cloudwick

Cloudwick has 10 years of Global 1000 Hadoop operations expertise and has migrated more than 30 Hadoop clusters to AWS.

See how Cloudwick helped Rakuten Rewards modernize

Mactores

Mactores offers rapid migration and transformation solutions that use automation and expertise to accelerate data platform migrations, including complex Hadoop workloads.

Provectus

Provectus delivers highly efficient cloud-native data analytics solutions to accelerate enterprise transformation and enable AI, helping businesses gain a competitive edge.

See how Provectus helped IMVU rearchitect

SoftServe

SoftServe covers the full EMR migration cycle: workshop, assessment, and implementation.

See how SoftServe helped a financial services company move to EMR

TEKsystems

Move from legacy platforms to Amazon EMR with TEKsystems, which provides structured assessments and migration services, plus proprietary accelerators to boost ROI and reduce risk.

Unravel Data

Unravel is a SaaS solution that automatically scans current clusters, identifies the apps best suited for the cloud, estimates cost, maps dependencies, and guarantees performance for migrated workloads. Accelerate your cloud migration with data and insights using Unravel.

Try Unravel for Amazon EMR >>

Wipro

Wipro helps its customers realize the value of data by modernizing their legacy data estate on AWS through a cost-effective, innovative approach and by operationalizing big data applications efficiently.

Customer success

Intuit: Migrating Apache Spark and Hive (49:28)

Intuit talks about how they migrated analytics, data processing (ETL), and data science workloads, including key motivations, benefits, and details of key architectural changes and best practices.

Using Amazon EMR to build a Spark ecosystem at Opendoor (35:24)

Opendoor describes its journey from in-house data processing solutions on Kubernetes to Amazon EMR, a move made to achieve a balance of cost and performance.

Hadoop/Spark to Amazon EMR: Architect it for security & governance (55:46)

Airbnb and Guardian Life discuss why and how they migrated their Apache Hadoop and Apache Spark workloads to Amazon EMR and the benefits they experienced.

Build data engineering platforms with Amazon EMR (55:21)

Salesforce.com and Vanguard discuss in detail how they use Amazon EMR to build a self-service, secure, and auditable data engineering platform.

Discover more Amazon EMR migration resources

Visit the EMR resources page for videos, blogs, and documentation

Ready to build?

Get started with Amazon EMR

Have more questions?

Contact us