StreamSets Data Collector

StreamSets | 3.22.3

Linux/Unix, Amazon Linux Amazon Linux 2 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

5 AWS reviews

External reviews

98 reviews
from G2

External reviews are not included in the AWS star rating for the product.


4-star reviews

    Mohammad I.

Capable streaming data processing tool

  • September 06, 2023
  • Review provided by G2

What do you like best about the product?
Here are the things I liked most about StreamSets:

a. The presence of built-in connectors (on-premises version), which make it usable with almost every source/target system.
b. The GUI is user friendly, and it has certainly helped my platform team create streaming data pipelines faster (previously we were using PySpark).
c. Along with the tool, the StreamSets support team is also excellent.
d. The availability of StreamSets Academy, through which we can get our resources trained easily.
What do you dislike about the product?
Fewer connectors are available in the cloud version of StreamSets.
The inability to support "exactly once" delivery of data creates limitations in a few use cases. Although we have managed this through a workaround, having this ability in StreamSets would certainly help.
What problems is the product solving and how is that benefiting you?
1. It has allowed us to perform CDC on mainframe data and publish it to Kafka topics, which multiple platforms can consume as required.
2. It has helped us create event-based real-time pipelines, which are being used to generate marketing prompts for customers.
3. The development time (as compared to PySpark) has been reduced, as it is a low-code GUI tool.
4. It has also helped reduce our dependency on other ELT tools, e.g. Informatica DEI.


    Marketing and Advertising

StreamSets makes day-to-day work easier

  • August 11, 2023
  • Review verified by G2

What do you like best about the product?
How quickly a pipeline can be deployed and made to work. On the other hand, the large number of connectors that can be used allows you to connect with almost any data source.
What do you dislike about the product?
In my particular case, I would love for StreamSets to have more direct connections to Google Cloud Platform services, such as stages that can directly execute workflows or cloud functions.
What problems is the product solving and how is that benefiting you?
Migrating to the StreamSets platform from StreamSets Control Hub really helped us mitigate certain connectivity issues with Control Hub that our DataCollectors were having.


    Banking

A great tool to work with Streaming Data

  • August 04, 2023
  • Review provided by G2

What do you like best about the product?
1. It has multiple built-in components to connect with most sources/targets.
2. Its ability to handle and transform streaming data easily and effectively.
3. Topologies are quite good and provide visibility into how systems are connected and how data flows across the enterprise.
4. Orchestrating and scheduling jobs is quite easy.
What do you dislike about the product?
1. Debugging is a bit difficult; the error messages need slight improvement.
2. Latency should be reduced, as working with large datasets takes a bit of time.
What problems is the product solving and how is that benefiting you?
It helped us collect and transform real-time data so that it can be used to generate customized messages for customers.
It has also helped reduce our dependency on costly Informatica and Ab Initio tools.


    Telecommunications

StreamSets Data Collector & Transformer Review

  • September 08, 2022
  • Review provided by G2

What do you like best about the product?
Easy to learn and use for complex ETL processes.
What do you dislike about the product?
Few support resources are available online other than the official documentation.
What problems is the product solving and how is that benefiting you?
Below are the problems I have solved.
1. Data Collector: Collected data from on-premise sources to the cloud.
2. Applied transformations to prepare data for analytics.


    Information Technology and Services

Pleasantly surprised by its capabilities.

  • August 19, 2022
  • Review verified by G2

What do you like best about the product?
The UI canvas and the ability to choose different stages such as origins, processors, and destinations.
What do you dislike about the product?
The lineage/provenance feature needs work. I hate to compare it with Apache NiFi, but this is one feature where NiFi trumps StreamSets.
What problems is the product solving and how is that benefiting you?
We have a lot of different data formats, and transforming them using hand-coded ETL tools or other systems is cumbersome and frustrating. StreamSets does things elegantly and in the least time-consuming manner.


    Paula S.

StreamSets is easy to use and maintain, has transparent appearance.

  • August 11, 2022
  • Review verified by G2

What do you like best about the product?
Very easy to follow where data goes, catch up on nodes and prepare a preview.
What do you dislike about the product?
Sometimes it is not clear at first glance how to set up nodes as a new user. A site with an explanation of how each node works would be very helpful.
What problems is the product solving and how is that benefiting you?
Changing data formats without using a programming language.


    Abhishek K.

Streamsets : A Powerful Data Engineering + DataOps Tool

  • January 20, 2022
  • Review verified by G2

What do you like best about the product?
The easy-to-use canvas to create Data Engineering Pipelines with required Stages (Sources + Processors + Executors + Destinations).
Scheduling data pipelines was never this easy.
Fetching application Secrets from Key-Vault for enhanced Security.
What do you dislike about the product?
In-built Job Monitoring / Visualisation is not that user-friendly; Streamsets should include features to visualize things like "How many records were streamed from Source to Destination on a particular date, etc."
Better and Detailed logging/error information.
Fragment drill-down feature while monitoring data flow in a running Job.
What problems is the product solving and how is that benefiting you?
As part of a healthcare service provider account, our Data Engineering team uses StreamSets to design data pipelines that hydrate ADLS/GCS. These datasets in turn help data scientists and analysts generate patterns and insights for the healthcare benefit of customers.
Recommendations to others considering the product:
A product to consider for fast-paced Data Engineering pipeline development.


    jeetlove

Solutions Architect

  • April 17, 2021
  • Review verified by AWS Marketplace

StreamSets Data Collector makes it easy to deploy execution engines from Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, ADLS, and other core cloud platforms. Data Collector simplifies the design experience for Apache Kafka and runs on-premises or any cloud, wherever your data lives.


    Meghana V.

Very good data operation platform, Hassle-free filtration of data and numerous options for the same

  • March 24, 2021
  • Review verified by G2

What do you like best about the product?
Everything from ingestion and filtering through to debugging by looking at previews or snapshots.

Decent data processing speed, and a lightweight Data Collector for configuring pipelines, processing data, previewing data, and monitoring pipelines.
Friendly user interface for deleting or adding connections and for stopping and starting pipelines.
What do you dislike about the product?
The rate of consumption of real-time data could be improved to avoid lag and data loss.

Editing a single component should be more independent.
What problems is the product solving and how is that benefiting you?
Consuming real-time raw data from our site, then filtering and tagging it to get the number of transactions. This helped us monitor the system as well as build our workload model.

Anomaly detection based on traffic patterns.

Storing raw data increases cost; using StreamSets, we filtered out unnecessary data and kept only the data required for analysis.


    Information Technology and Services

User friendly interface

  • March 18, 2021
  • Review provided by G2

What do you like best about the product?
Very easy to use and understand, even the very first time.
What do you dislike about the product?
Nothing much, only a few minor things such as code suggestions when using scripting languages like Groovy, Jython, and JavaScript.
What problems is the product solving and how is that benefiting you?
Mostly I worked on data movement from different sources to different destinations involving many transformations. Worked on both batch processing and live streaming modes. Worked on triggering events for notifications based on certain conditions.
Recommendations to others considering the product:
I recommend it as one of the best ETL/ELT tools for data ingestion.