
Overview

Product Overview:

Background

The complexity of taking a systematic approach to handling data and generating insights from it is a major roadblock to productionizing data in financial services. The goal of productionizing data is to minimize the time between data collection and the generation of business value. Legacy architectures lack the agility and flexibility to prepare data effectively for analytics. Digital Alpha brings its data platform built on Amazon EMR for running large-scale distributed data processing jobs, interactive SQL queries, and machine learning (ML) applications.

Features

  • Easy to use: Simplifies building and operating big data environments and applications, including easy provisioning, managed scaling, and cluster reconfiguration.
  • Elastic: Quickly and easily provision as much capacity as required, and add or remove capacity automatically or manually.
  • Low cost: Reduces the cost of processing large amounts of data with features like per-second pricing, Amazon EC2 Spot integration, Amazon EC2 Reserved Instance integration, elasticity, and Amazon S3 integration.
  • Flexible data stores: Leverage multiple data stores including Amazon S3, Hadoop Distributed File System (HDFS), and Amazon DynamoDB.
  • Distributed computing: Run high-performance computing simulations, such as Monte Carlo scenario analysis, to assess portfolio risks, predict market movements, and support investment decision-making (see the PySpark sketch after this list).
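
As an illustration of the distributed computing use case above, here is a minimal PySpark sketch of a Monte Carlo value-at-risk estimate run on an EMR cluster. The portfolio value, drift, volatility, and horizon are illustrative assumptions, not figures from the Digital Alpha offering.

```python
# Minimal PySpark sketch of a distributed Monte Carlo scenario analysis on EMR.
# The portfolio value, drift, volatility, and horizon below are illustrative
# assumptions, not part of the Digital Alpha offering.
import math
import random

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("monte-carlo-var").getOrCreate()
sc = spark.sparkContext

NUM_SCENARIOS = 1_000_000          # total simulated market scenarios
PORTFOLIO_VALUE = 10_000_000.0     # assumed portfolio value in USD
MU, SIGMA = 0.05, 0.20             # assumed annual drift and volatility
HORIZON_DAYS = 10                  # risk horizon in trading days


def simulate(_):
    """Simulate one horizon P&L with geometric Brownian motion."""
    dt = HORIZON_DAYS / 252.0
    z = random.gauss(0.0, 1.0)
    growth = math.exp((MU - 0.5 * SIGMA ** 2) * dt + SIGMA * math.sqrt(dt) * z)
    return PORTFOLIO_VALUE * (growth - 1.0)


# Spread the scenarios across the cluster, then pull the worst 1% of outcomes
# back to the driver to read off the 99% value-at-risk.
pnl = sc.parallelize(range(NUM_SCENARIOS), numSlices=200).map(simulate)
worst = pnl.takeOrdered(int(NUM_SCENARIOS * 0.01))
var_99 = -worst[-1]
print(f"99% VaR over {HORIZON_DAYS} trading days: {var_99:,.0f}")
```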

Benefits

  • Cost Savings: Pricing depends on the instance type, the number of instances deployed, and the Region where the cluster is launched; on-demand pricing helps optimize costs.
  • AWS Integration: Provides capabilities for cluster networking, storage, and security.
  • Deployment: Offers a variety of ways to configure the software on the cluster, including versatile frameworks such as Hadoop and applications such as Hive, Pig, or Spark.
  • Scalability and Flexibility: Provides the flexibility to scale a cluster up or down as computing needs change; resize the cluster to add instances for peak workloads and remove them to control costs once the peaks subside (see the scaling sketch after this list).
  • Reliability: Provides configuration options that control whether a cluster is terminated automatically or manually.
  • Security: Leverage other AWS services, such as IAM and Amazon VPC, and features such as Amazon EC2 key pairs, to help secure clusters and data.
  • Monitoring: Provides the ability to archive log files in Amazon S3, so logs remain available for troubleshooting even after a cluster terminates.
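
A minimal boto3 sketch of the scaling and termination controls referenced above; the cluster and instance-group IDs are placeholders, and log archiving is configured through the LogUri set when the cluster is launched.

```python
# Minimal boto3 sketch of the scaling and termination controls mentioned
# above; the cluster and instance-group IDs are placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# Scalability: grow capacity for a peak workload, then shrink it back later
# by calling the same API with a smaller instance count.
emr.modify_instance_groups(
    ClusterId="j-XXXXXXXXXXXXX",
    InstanceGroups=[
        {"InstanceGroupId": "ig-XXXXXXXXXXXXX", "InstanceCount": 8},
    ],
)

# Reliability: control whether the cluster may be terminated automatically.
emr.set_termination_protection(
    JobFlowIds=["j-XXXXXXXXXXXXX"],
    TerminationProtected=False,
)

# Monitoring: log archiving to S3 is enabled via the LogUri parameter when
# the cluster is launched, so the logs outlive the cluster itself.
```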

Common Use Cases

  • Trading analytics and risk calculations
  • Regulation of broker-dealers to ensure market integrity
  • Analyze the future viability of mortgages and insurance policies

Deliverables

  1. Identify the right analytics architecture pattern for your financial workloads
  2. A one-click solution in which the tech stack is fully automated with Infrastructure as Code
  3. A framework that makes it simple to create data processing logic using big data frameworks such as Apache Spark, Hive, and Presto, and to run and scale it on EMR (see the step-submission sketch after this list)
  4. CI/CD pipelines for infrastructure and ETL code using AWS CodePipeline
  5. Distribute the results via visualization or APIs
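
For deliverable 3, the sketch below shows one way such a framework could submit work to EMR: a Spark ETL step added to a running cluster through boto3 and command-runner.jar. The cluster ID, bucket, and script paths are hypothetical placeholders; Hive or Presto work is submitted the same way with different step arguments.

```python
# Hedged sketch: submitting a Spark ETL step to an existing EMR cluster.
# The cluster ID, bucket, and script paths are hypothetical placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.add_job_flow_steps(
    JobFlowId="j-XXXXXXXXXXXXX",          # ID of the target EMR cluster
    Steps=[
        {
            "Name": "spark-etl",
            "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "spark-submit",
                    "--deploy-mode", "cluster",
                    "s3://example-code-bucket/jobs/etl_job.py",
                    "s3://example-input-bucket/raw/",
                    "s3://example-target-bucket/curated/",
                ],
            },
        }
    ],
)
print("Submitted step IDs:", response["StepIds"])
```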

Add-ons

Managed Services

Two different options:

Batch ETL Pipeline with Amazon EMR and Apache Spark

A data pipeline that automatically picks up data files from the S3 input bucket, processes them with the required transformations, and makes the results available in the target S3 bucket for querying. To implement this pipeline, a transient EMR cluster with Spark as the distributed processing engine is used. This EMR cluster is not always active: it is created just before the job executes and terminated once the job completes.
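
A minimal PySpark sketch of the kind of transformation job such a pipeline could run; the column names, input format, and aggregation are assumptions for illustration only.

```python
# Illustrative PySpark job the transient cluster could run: read raw files
# from the input bucket, apply transformations, and write curated Parquet to
# the target bucket. The column names and aggregation are assumptions.
import sys

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-etl").getOrCreate()

# Input and output S3 prefixes are passed in by the step that submits the job.
input_path, output_path = sys.argv[1], sys.argv[2]

trades = spark.read.option("header", "true").csv(input_path)

# Example transformation: cast types and aggregate notional per desk per day.
daily = (
    trades
    .withColumn("trade_date", F.to_date("trade_date"))
    .withColumn("notional", F.col("notional").cast("double"))
    .groupBy("trade_date", "desk")
    .agg(F.sum("notional").alias("total_notional"))
)

# Partitioned Parquet keeps the target bucket efficient to query later.
daily.write.mode("overwrite").partitionBy("trade_date").parquet(output_path)
```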

  • Transient EMR cluster with Spark as the distributed processing engine

  • Amazon EMR costs optimized with idle checks and automatic resource termination using an AWS Lambda function

  • AWS Glue Data Catalog to create efficient data queries and transformations (see the launch sketch after this list)
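
A hedged boto3 sketch of launching the transient cluster described above: Spark installed, the AWS Glue Data Catalog configured as the metastore, a single ETL step, and automatic termination once the step finishes. Bucket names, roles, and instance types are placeholders.

```python
# Hedged sketch of launching the transient Spark cluster for this pipeline.
# Bucket names, roles, and instance types are placeholders for illustration.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="batch-etl-transient",
    ReleaseLabel="emr-6.9.0",
    Applications=[{"Name": "Spark"}],
    LogUri="s3://example-logs-bucket/emr/",   # archived logs survive the cluster
    Configurations=[
        {
            # Point Spark's Hive support at the AWS Glue Data Catalog.
            "Classification": "spark-hive-site",
            "Properties": {
                "hive.metastore.client.factory.class":
                    "com.amazonaws.glue.catalog.metastore."
                    "AWSGlueDataCatalogHiveClientFactory"
            },
        }
    ],
    Instances={
        "InstanceGroups": [
            {"Name": "Primary", "InstanceRole": "MASTER",
             "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"Name": "Core", "InstanceRole": "CORE", "Market": "SPOT",
             "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        # Transient cluster: shut down once the last step completes.
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[
        {
            "Name": "spark-etl",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "spark-submit", "--deploy-mode", "cluster",
                    "s3://example-code-bucket/jobs/batch_etl.py",
                    "s3://example-input-bucket/raw/",
                    "s3://example-target-bucket/curated/",
                ],
            },
        }
    ],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
    VisibleToAllUsers=True,
)
print("Launched transient cluster:", response["JobFlowId"])
```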

Orchestrating Amazon EMR Jobs with AWS Step Functions

The workflow creates a transient EMR cluster, submits a Spark job that performs the ETL transforms, and terminates the cluster once the job completes. The workflow is triggered as soon as a file arrives in S3, and its objective is to execute a Spark + Hudi job to process the input file (see the state machine sketch after the list below).

  • Transient EMR cluster with Spark as the distributed processing engine to perform ETL transformations
  • Orchestrate a data pipeline that can create EMR clusters, submit jobs, and terminate clusters as required with Step Functions
  • Quickly analyze data in S3 using Amazon Athena
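
A sketch of how this workflow could be expressed in Amazon States Language using the Step Functions EMR service integrations (createCluster.sync, addStep.sync, terminateCluster), built as a Python dictionary. Role ARNs, bucket paths, and the state machine name are placeholders; the vendor's actual workflow may differ in detail.

```python
# Hedged sketch of the Step Functions workflow described above.
# Role ARNs, bucket paths, and the state machine name are placeholders.
import json

import boto3

definition = {
    "Comment": "Transient EMR cluster: create, run Spark + Hudi ETL, terminate",
    "StartAt": "CreateCluster",
    "States": {
        "CreateCluster": {
            "Type": "Task",
            "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
            "Parameters": {
                "Name": "stepfn-transient-etl",
                "ReleaseLabel": "emr-6.9.0",
                "Applications": [{"Name": "Spark"}],
                "ServiceRole": "EMR_DefaultRole",
                "JobFlowRole": "EMR_EC2_DefaultRole",
                "Instances": {
                    "InstanceGroups": [
                        {"Name": "Primary", "InstanceRole": "MASTER",
                         "InstanceType": "m5.xlarge", "InstanceCount": 1},
                        {"Name": "Core", "InstanceRole": "CORE",
                         "InstanceType": "m5.xlarge", "InstanceCount": 2},
                    ],
                    "KeepJobFlowAliveWhenNoSteps": True,
                },
            },
            "ResultPath": "$.cluster",
            "Next": "RunSparkHudiJob",
        },
        "RunSparkHudiJob": {
            "Type": "Task",
            "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
            "Parameters": {
                "ClusterId.$": "$.cluster.ClusterId",
                "Step": {
                    "Name": "spark-hudi-etl",
                    "ActionOnFailure": "TERMINATE_CLUSTER",
                    "HadoopJarStep": {
                        "Jar": "command-runner.jar",
                        # The S3 object key from the triggering event could be
                        # passed in here; a static path keeps the sketch short.
                        "Args": ["spark-submit", "--deploy-mode", "cluster",
                                 "s3://example-code-bucket/jobs/hudi_upsert.py"],
                    },
                },
            },
            "ResultPath": None,   # keep the cluster ID in the state input
            "Next": "TerminateCluster",
        },
        "TerminateCluster": {
            "Type": "Task",
            "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster",
            "Parameters": {"ClusterId.$": "$.cluster.ClusterId"},
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions", region_name="us-east-1")
sfn.create_state_machine(
    name="emr-transient-etl",                                       # placeholder
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsEmrRole",  # placeholder
)
# An S3 event (for example via EventBridge) on the input bucket would start an
# execution of this state machine whenever a new file arrives.
```
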
Sold by Digital Alpha Platforms
Fulfillment method: Professional Services

Pricing Information

This service is priced based on the scope of your request. Please contact the seller for pricing details.

Support

If you have any questions about this service or Digital Alpha Platforms, please reach out and we will get you the information you need.

Phone: 609-759-1367
Email: support@digital-alpha.com
Contact Us: https://digital-alpha.com/