Get Started with the Project

5 Steps  |  60 Minutes

Cost to complete the project: The estimated cost to complete this project is $1.05. This cost assumes that you are within the AWS Free Tier limits, you follow the recommended configurations, and that you terminate all resources used in the project within an hour of creating them. Your use case may require different configurations that can impact your bill. Use the Simple Monthly Calculator to estimate costs tailored for your needs.

Monthly Billing Estimate: The total cost of this project will vary depending on your usage and configuration settings. With the default configuration recommended in this guide, the project typically costs $769 per month. AWS pricing is based on your usage of each individual service, and the combined usage of all services makes up your monthly bill. The sections below explain what each service does and how it affects your bill.

  • Amazon EMR

    Product Description: Amazon EMR is a managed Hadoop service that allows you to run the latest versions of popular big data frameworks such as Apache Spark, Presto, HBase, Hive, and more, on fully customizable clusters. Amazon EMR gives you full control over the configuration of your clusters and the software you install on them.

    How Pricing Works: With Amazon EMR, you pay an hourly rate for every instance hour you use (for example, a 10-node cluster running for 10 hours costs the same as a 100-node cluster running for 1 hour). The hourly rate depends on the instance type used. Hourly prices range from $0.011/hour to $0.27/hour and are charged in addition to the EC2 costs. For more details, see Amazon EMR Pricing.

    Cost Estimate: Let's say that you follow this project guide and launch a 3-node EMR cluster, with each node running on an m3.xlarge EC2 instance in the US East Region. Your EMR cost will be $0.21/hour ($156.21/month, if the cluster runs continuously for 31 days).

  • Amazon EC2

    Product Description: Amazon EC2 provides the virtual servers, known as instances, that you will use as nodes in your EMR cluster. Amazon EC2 allows you to configure and scale your compute capacity easily to meet changing requirements and demand. It is integrated with Amazon’s proven computing environment, allowing you to leverage the AWS suite of services.

    How Pricing Works: Amazon EC2 pricing is based on four components: the instance type you choose (EC2 comes in 40+ types of instances with options optimized for compute, memory, storage and more), the region your instances are based in, the operating system you run, and the pricing model you select (on-demand instances, reserved capacity, spot, etc.). For more information, see Amazon EC2 Pricing.

    Cost Estimate: Let's say you follow this guide and launch an EMR cluster with three nodes, where each node corresponds to an m3.xlarge EC2 instance in the US East region. With on-demand pricing, your EC2 charges will amount to $0.81/hour ($593.71/month).

  • Amazon S3

    Product Description: Amazon S3 provides secure, durable, and highly scalable object storage for your data. Amazon S3 makes it easy to use object storage, with a simple web interface to store and retrieve data from anywhere on the web.

    How Pricing Works: S3 pricing is based on five components: the type of S3 storage you use, the region where you store your content (e.g. US East vs. Asia Pacific - Sydney), the amount you store, the number of requests you or your users make to store new content or retrieve existing content, and the amount of data that is transferred from S3 to you or your users. For more information, see Amazon S3 Pricing.

    Cost Estimate: If you complete this project you will incur $0.03 of S3 charges to transfer and store the sample log data used in the project.
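The arithmetic behind the one-hour estimate can be checked with a short Python sketch. It is illustrative only: the hourly figures are the ones quoted in the cost estimates above, the function and constant names are hypothetical, and the S3 charge is assumed to be a flat per-project amount rather than an hourly rate.

```python
from decimal import Decimal

# Figures quoted in the cost estimates above
# (3-node m3.xlarge cluster, US East, on-demand pricing).
EMR_CLUSTER_HOURLY = Decimal("0.21")  # EMR charge for the whole cluster, per hour
EC2_CLUSTER_HOURLY = Decimal("0.81")  # EC2 charge for the whole cluster, per hour
S3_PROJECT_CHARGE = Decimal("0.03")   # assumed flat charge for the sample log data


def instance_hours(nodes: int, hours: int) -> int:
    """EMR bills per instance hour: number of nodes times hours run."""
    return nodes * hours


def project_cost(hours: int = 1) -> Decimal:
    """Cluster charges (EMR plus EC2) for `hours`, plus the flat S3 charge."""
    return (EMR_CLUSTER_HOURLY + EC2_CLUSTER_HOURLY) * hours + S3_PROJECT_CHARGE


if __name__ == "__main__":
    # 10 nodes for 10 hours is billed like 100 nodes for 1 hour.
    print(instance_hours(10, 10) == instance_hours(100, 1))  # True
    # One hour of cluster time plus S3 matches the $1.05 project estimate.
    print(project_cost(1))  # 1.05
```

Decimal is used instead of float so that the cent amounts add up exactly. Actual charges depend on your region, instance type, and how long the resources stay running.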
