AWS Quick Starts — Customer Ready Solutions

Qubole on Data Lake Foundation

Qubole Data Service (QDS) for business insights through machine learning and AI

This Quick Start deployment guide provides step-by-step instructions for deploying and configuring a production-ready Qubole Data Service (QDS) environment that is built on a data lake foundation in the AWS Cloud. You can use this Qubole environment to process and analyze your own datasets and implement your own use cases. The Quick Start optionally deploys an environment with prepopulated data, notebooks, and queries that you can use to analyze structured and semi-structured data and gain key business insights into product sales performance for a fictional online retailer.

QDS is a cloud-native data activation platform that helps operationalize the data lake, reducing the cost and complexity of managing big data. Qubole is self-managing: it continuously analyzes and learns from platform usage through heuristics and machine learning, and it provides insights and recommendations to optimize reliability, performance, and cost. Qubole works in concert with AWS services such as Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2).

See also: If this architecture doesn't meet your specific requirements, see the other data lake deployments in the Quick Start catalog.

This Quick Start was developed by Qubole in collaboration with AWS. Qubole is an APN Partner.

  •  What you'll build
    This Quick Start adds the following components:

    • A standard VPC, which is extended to support communications between instances in the public subnet and Qubole SaaS, and to provide access to the metastore within Qubole SaaS.
    • Preconfigured Apache Spark and Hadoop clusters. Qubole manages these clusters and automatically starts and scales them to match your workloads.
    • Preconfigured data sources that provide access to Amazon Relational Database Service (Amazon RDS) and S3 buckets in the data lake.
    • Preconfigured Qubole metastore, notebooks, and queries to show business insights. 
    • A basic wizard that helps you with Qubole account configuration and dataset deployment.
    • Data analysis and visualization, using Qubole’s Analyze and Notebooks interfaces.
  •  How to deploy
    You can build the Qubole environment on AWS in about 12 minutes by following these steps:

    1. If you don't already have an AWS account, sign up at https://aws.amazon.com.
    2. Create a Qubole account.
    3. Get a Qubole API token, AWS account ID, and external ID.
    4. Launch the Quick Start. You can choose from two options.
    5. Finish the Qubole configuration.
    6. Test the deployment.

    The Quick Start includes parameters that you can customize. For example, you can configure your network or customize the settings for Qubole, AWS services, and the Qubole wizard.
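
As a rough illustration of how customized parameters reach a deployment, the sketch below converts a plain dictionary into the `Parameters` list that the AWS CloudFormation `create_stack` call accepts, and passes it along using boto3. The parameter names (`QuboleApiToken`, `QuboleExternalId`), the helper function names, and the template URL argument are hypothetical placeholders, not values taken from this Quick Start; check the actual template for the real parameter keys. The `CAPABILITY_IAM` setting is also an assumption about what the template creates.

```python
from typing import Dict, List, Mapping


def to_cfn_parameters(values: Mapping[str, object]) -> List[Dict[str, str]]:
    """Convert a plain mapping into the Parameters list CloudFormation expects."""
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(values.items())
    ]


def launch_quick_start(stack_name: str, template_url: str,
                       values: Mapping[str, object]) -> str:
    """Create the stack and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3

    cfn = boto3.client("cloudformation")
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,  # supplied by the caller; not given in this guide
        Parameters=to_cfn_parameters(values),
        Capabilities=["CAPABILITY_IAM"],  # assumption: the template creates IAM resources
    )
    return response["StackId"]


# Hypothetical parameter names, for illustration only.
example_values = {
    "QuboleApiToken": "<your-qubole-api-token>",
    "QuboleExternalId": "<your-external-id>",
}
parameters = to_cfn_parameters(example_values)
```

Only `to_cfn_parameters` runs when the snippet is loaded; `launch_quick_start` would be called once the real template URL and parameter keys are known.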

  •  Cost and licenses
    You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. The AWS CloudFormation templates for this Quick Start include configuration parameters that you can customize. Some of these settings, such as instance type, affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you will be using.

    The Quick Start deploys QDS Business Edition, which allows you to consume up to 10,000 Qubole Compute Usage Hours (QCUH) per month at no cost. However, you are responsible for the cost of AWS resources that Qubole manages on your behalf. To learn more about QDS Business Edition, see the Qubole FAQ.
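
To keep an eye on the monthly allowance, a small helper like the following could report how many no-cost QCUH remain. This is only a sketch: the function name is hypothetical, the usage figure would have to come from your own Qubole account reporting, and the function does not model AWS resource costs or what applies once the allowance is used up, since this guide does not describe that.

```python
FREE_QCUH_PER_MONTH = 10_000  # allowance stated for QDS Business Edition


def remaining_free_qcuh(used_qcuh: float) -> float:
    """Return the no-cost QCUH still available this month (never below zero)."""
    if used_qcuh < 0:
        raise ValueError("used_qcuh must not be negative")
    return max(0.0, FREE_QCUH_PER_MONTH - used_qcuh)
```

For example, after consuming 2,500 QCUH in a month, `remaining_free_qcuh(2500)` reports 7,500 hours left in the allowance.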

    After you deploy the Quick Start, you can upgrade to QDS Enterprise Edition and use Qubole Cloud Agents, which provide actionable Alerts, Insights, and Recommendations (AIR) to optimize reliability, performance, and costs. To upgrade your license to QDS Enterprise Edition, see the Enterprise Edition upgrade webpage on the Qubole website.

  •  Resources
    This Quick Start reference deployment is related to a solution featured in Solution Space that includes a solution brief, optional consulting offers crafted by AWS Competency Partners, and AWS co-investment in proof-of-concept (PoC) projects. To learn more about these resources, visit Solution Space.