Overview
The IData Pipeline is a no-code AWS data pipeline: you configure, ingest, and consume, with no coding required. Using a simple JSON configuration for each dataset (see the sketch below), you tell the Pipeline which tasks to perform on that dataset. The Pipeline can deduplicate data, run data quality checks, transform data with JavaScript, convert data into S3 Parquet in your data lake, load data into Redshift or Snowflake, and much more. When data lands in its final destination, automatic notifications are fired via SNS, so downstream systems and users are notified immediately. The Pipeline also has a user interface for real-time monitoring of data flows, with automatic notifications to support teams when warnings or errors occur.
Other features include:
- Lakehouse support using open-source Apache Iceberg technology. Apache Iceberg enables upsert and time-travel queries against your data lake.
- Snowflake and Redshift support. If you need to move datasets directly into Redshift or Snowflake, this is automated with a simple JSON configuration.
- Dataset validation using data quality configuration rules, plus data deduplication capabilities.
- Data transformation. Associate JavaScript with a dataset to perform your own powerful custom transformations.
- Infrastructure as code (IaC). The IData Pipeline can be spun up in an hour or less in your AWS account using our IaC code, written entirely in Terraform.
- Structured, semi-structured, and unstructured data support. The Pipeline can consume and process virtually any data type (CSV, delimited, JSON, XML, PDF, images, video, etc.).
- The Pipeline uses AWS Glue, which automatically enables a large number of AWS services to consume datasets downstream.
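To give a rough sense of what a per-dataset configuration might look like, here is a minimal JSON sketch. The field names used below (name, format, deduplicate, qualityRules, transformation, destination) are illustrative assumptions, not the Pipeline's actual schema; consult the product documentation for the real configuration format.

```json
{
  "name": "customer-orders",
  "format": "csv",
  "deduplicate": true,
  "qualityRules": [
    { "field": "order_id", "rule": "not_null" },
    { "field": "order_total", "rule": "numeric" }
  ],
  "transformation": { "script": "transform-orders.js" },
  "destination": {
    "type": "redshift",
    "schema": "analytics",
    "table": "orders"
  }
}
```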
Highlights
- No-code data pipeline: no coding, just simple JSON configuration
- Data quality rules
- Data transformation using JavaScript (see the sketch after this list)
- Automatic conversion into S3 using Parquet
- Lakehouse technology: the Pipeline uses open-source Apache Iceberg, which provides upsert capabilities and time-travel queries on top of the S3 object store
- Snowflake and Redshift integration, including automatic table creation
- Support for any data type: structured, semi-structured, and unstructured
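As an illustration of the kind of custom JavaScript transformation you might associate with a dataset, here is a minimal sketch. The assumption that the Pipeline invokes a function that receives one record and returns the transformed record is ours for illustration only; the actual hook name and signature are defined in the product documentation.

```javascript
// Hypothetical transformation hook: the function name and record shape
// are illustrative assumptions, not the Pipeline's actual contract.
function transform(record) {
    // Normalize an email field and derive a full name from two source fields.
    return {
        ...record,
        email: record.email ? record.email.trim().toLowerCase() : null,
        fullName: `${record.firstName} ${record.lastName}`.trim()
    };
}
```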
Details
Pricing
Free trial
| Dimension | Description     | Cost/unit/hour |
| --------- | --------------- | -------------- |
| Hours     | Container Hours | $5.00          |
Vendor refund policy
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
IData Pipeline Installation
- Amazon ECS
- Amazon EKS
- Amazon ECS Anywhere
- Amazon EKS Anywhere
Container image
Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.
Version release notes
Bug fix: the dataset status UI REST API was not reporting the correct status for some ingestion types.
Additional details
Usage instructions
The Git repository for installation can be found here: https://github.com/idata-corporation/pipeline-infra
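Since the Overview notes that the infrastructure is written entirely in Terraform, installation presumably follows the standard Terraform workflow. The commands below are a hedged sketch of that workflow, not the repository's documented procedure; the repository's README is the authoritative source, and any required input variables or AWS credentials setup are omitted here.

```
git clone https://github.com/idata-corporation/pipeline-infra.git
cd pipeline-infra
terraform init     # download providers and modules
terraform plan     # preview the AWS resources to be created
terraform apply    # provision the Pipeline in your AWS account
```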
Resources
Vendor resources
Support
Vendor support
For support, please contact support@idata.net and one of our engineers will be in touch, or check out the product documentation.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.