StreamAnalytix
Product Overview
A unified platform for end-to-end, Spark-based data processing
Data ingestion, quality, blending, enrichment, analytics, machine learning, action triggers, visualization
Click + Code
Use drag-and-drop Spark operators in an intuitive UI. Or, introduce custom logic using Java, Scala, and Python.
Ingest, blend and store both batch and streaming data
Use built-in connectors for Kafka, S3, Elasticsearch, HBase, Hive and more. Supported input/output data formats include JSON, CSV, and Parquet.
Strong data science enablement
Easily build, train, calibrate, deploy and monitor ML models on batch + real-time data. Use built-in operators like Spark MLlib, ML, PMML, H2O, TensorFlow.
Get full support for Spark 2.2 and Hadoop 2.7.3
Use the Extensions API to write your own functionality in Java, Scala, SQL, or Python
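As a rough illustration of the kind of custom logic mentioned above, the sketch below shows a plain Python function that validates and enriches a single JSON record. It uses only the Python standard library and does not call the StreamAnalytix Extensions API, whose interfaces are not described here. The function name `enrich_event` and the fields `store_id`, `amount`, and `region` are hypothetical, chosen only for the example.

```python
import json


def enrich_event(raw_line, region_by_store):
    """Parse one JSON event and attach a region from reference data.

    Hypothetical example: field names are assumptions, not part of any
    StreamAnalytix schema.
    """
    event = json.loads(raw_line)

    # Data-quality step: reject records that lack a required field.
    if "store_id" not in event or "amount" not in event:
        return None

    # Enrichment step: look up the region in a small in-memory table.
    event["region"] = region_by_store.get(event["store_id"], "unknown")

    # Normalize the amount to a float so downstream steps see one type.
    event["amount"] = float(event["amount"])
    return event


if __name__ == "__main__":
    regions = {"s1": "east", "s2": "west"}
    print(enrich_event('{"store_id": "s1", "amount": "12.50"}', regions))
```

A function of this shape keeps the record-level logic independent of the engine, so the same code could be exercised in a unit test before being wired into a pipeline.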
StreamAnalytix Enterprise AMI
The AMI lets you design and deploy data flows at scale by connecting to Amazon EMR and other cloud-native services. Rapidly build and operationalize analytics flows by scaling out to EMR (v5.11.1) seamlessly.
Data pipelines do not consume EMR resources at design time; a pipeline can be deployed on a configured EMR cluster. EMR connection settings are discovered during AMI launch.
Run up to 15 active pipelines on a cluster
Any number of pipelines can be designed using the AMI
For additional scaling options, please contact support.
Operating System
Linux/Unix, CentOS 7.x
Delivery Methods