StreamSets Data Collector

StreamSets | 3.22.3

Linux/Unix, Amazon Linux Amazon Linux 2 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

5 AWS reviews

External reviews

98 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Paula S.

StreamSets is easy to use and maintain and has a transparent appearance.

  • August 11, 2022
  • Review verified by G2

What do you like best about the product?
Very easy to follow where data goes, catch up on nodes and prepare a preview.
What do you dislike about the product?
Sometimes it is not clear at first glance how a new user should set up nodes. A site explaining how each node works would be very helpful.
What problems is the product solving and how is that benefiting you?
Changing data formats without using a programming language.


    Eliana T.

It was good and user friendly, and the help desk is very good

  • June 09, 2022
  • Review verified by G2

What do you like best about the product?
It's really user friendly, and there's a lot of information in the portal.
What do you dislike about the product?
Sometimes there are a lot of tricks that can't be found in the portal, and you go around without knowing how to proceed.
What problems is the product solving and how is that benefiting you?
Sending data from one place to another in an easy way.


    nitin s.

Very Powerful and Easy Data Engineering Platform. Capable of handling multiple platforms and huge data volumes.

  • January 30, 2022
  • Review verified by G2

What do you like best about the product?
StreamSets is very light. Since it is a containerized app, it is easy to use with Docker if you are an individual developer. Organizations can use Kubernetes.
It has a very easy, user-friendly interface. It takes only a few days for new developers to start and deploy their first pipelines.
StreamSets provides easy and powerful stages (a kind of connector) to integrate with different platforms such as Kafka, Salesforce, Oracle DB, REST APIs, HTTPS connections, data lakes, and many more.
StreamSets uses regular expressions for data transformation operations, which is really easy.
Monitoring StreamSets pipelines is very easy: you can register your Data Collector with Control Hub (SCH) using provisioning agents. After registering, you can deploy pipelines to SCH and create jobs. All of this can be done using their Python SDK, which can easily be integrated with ADO release pipelines.
After creating and deploying pipelines, users can use SCH subscriptions to create alerts when pipelines or jobs change status.
For individual alerts, pipelines have a built-in capability.
After version 4.0.1, SDC is merged with their DataOps platform. This gives individual developers the feel of a Control Hub. It also removes platform dependency.
They have excellent security. Pipelines can be integrated with Azure Key Vault, which eliminates the need to share credentials with developers. The same goes for parameters and runtime parameters. Developers can easily replace any value in a pipeline with ADO library variables.
If you are an organization, they provide very extensive support and work immediately on any bug the organization finds. They also have a customer success team that will do anything to make sure your organization's experience with StreamSets is seamless.
What do you dislike about the product?
A few of the stages are a bit unstable, like the Oracle CDC Client. They work fine, but in some corner-case scenarios things become a bit tricky. The logging mechanism is excellent and extensive, but it could be simpler.
What problems is the product solving and how is that benefiting you?
I am in an organization where we are working on sharing data between multiple applications running on different platforms. So we needed a tool/platform that can easily integrate with a variety of technologies and adapt to this ever-changing era.
StreamSets allowed us to share real-time data between platforms, which also removed our dependency on heavier ETL tools like SSIS and Ab Initio.
Since it is easier, our talent development team can enable our developers to use StreamSets.


    Abhishek K.

StreamSets: A Powerful Data Engineering + DataOps Tool

  • January 20, 2022
  • Review verified by G2

What do you like best about the product?
The easy-to-use canvas to create Data Engineering Pipelines with required Stages (Sources + Processors + Executors + Destinations).
Scheduling data pipelines was never this easy.
Fetching application Secrets from Key-Vault for enhanced Security.
What do you dislike about the product?
The built-in job monitoring/visualisation is not that user-friendly; StreamSets should include features to visualize things like "How many records were streamed from source to destination on a particular date?"
Logging and error information should be better and more detailed.
A fragment drill-down feature while monitoring data flow in a running job would help.
What problems is the product solving and how is that benefiting you?
Being part of one of the health care service provider accounts, we as a data engineering team use StreamSets to design data pipelines to hydrate ADLS/GCS. These datasets further help data scientists and analysts generate patterns and insights for the healthcare benefits of customers.
Recommendations to others considering the product:
A product to consider for fast-paced Data Engineering pipeline development.


    Aird

Excellent and Useful Engine for Everything data

  • June 28, 2021
  • Review verified by AWS Marketplace

I have been using StreamSets for a while now, and I can say this is a very powerful design and execution engine. It makes it easy for me to create pipelines, with a seamless transition from S3 specifically to my Kafka and all. This is very good, and I highly recommend it.


    best app

StreamSets review

  • May 15, 2021
  • Review verified by AWS Marketplace

Best data streaming app in AWS Marketplace. I'm using it all the time, and my experience is very good, so I highly recommend it.


    jeetlove

Solutions Architect

  • April 17, 2021
  • Review verified by AWS Marketplace

StreamSets Data Collector makes it easy to deploy execution engines from Oracle, Salesforce, JDBC, Hive, and more to Snowflake, Databricks, ADLS, and other core cloud platforms. Data Collector simplifies the design experience for Apache Kafka and runs on-premises or any cloud, wherever your data lives.


    Nuzhat

StreamSets

  • April 09, 2021
  • Review verified by AWS Marketplace

It is one of the best services: a lightweight, powerful design and execution engine that streams data in real time. Data Collector provides a web-based user interface (UI) to configure pipelines, preview data, monitor pipelines, and review snapshots of data.

Makes Life Easy


    Meghana V.

Very good data operation platform, Hassle-free filtration of data and numerous options for the same

  • March 24, 2021
  • Review verified by G2

What do you like best about the product?
Everything from ingestion and filtering to debugging by looking into previews or snapshots.

Decent data processing speed, and a lightweight Data Collector to configure pipelines, process data, preview data, and monitor pipelines.
Friendly user interface for deleting or adding connections and for stopping or starting the pipeline.
What do you dislike about the product?
The rate of consumption of real-time data could be improved to avoid lag and data loss.

Editing a single component should be more independent.
What problems is the product solving and how is that benefiting you?
We use it to consume real-time raw data from our site and to filter and tag the data to get the number of transactions. This helped us monitor the system as well as build our workload model.

Anomaly detection based on the traffic pattern.

Storing raw data increases cost; using StreamSets, we filtered out unnecessary data and used only the required data for analysis.


    sai s.

StreamSets review

  • March 21, 2021
  • Review provided by G2

What do you like best about the product?
It was very useful when we used it for loading our tables into Hive databases, and it was easy to configure, as most of it was drag and drop with minimal customisation required. I found StreamSets much easier to use than NiFi.
What do you dislike about the product?
It has been a while since I actually worked with StreamSets, but when I worked on the platform we faced some issues while mapping the components in the data flow, as well as some performance issues with huge datasets.
What problems is the product solving and how is that benefiting you?
We used StreamSets to build the data lake in our insurance company, where we received files at multiple times of the day, and the pipes would trigger whenever files arrived at the landing zone. We performed various transformations and data quality checks within StreamSets and loaded the data into on-premises Hive tables.