StreamSets Data Collector

StreamSets | 3.22.3

Linux/Unix, Amazon Linux Amazon Linux 2 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

5 AWS reviews

External reviews

44 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Telecommunications

StreamSets Data Collector & Transformer Review

  • September 08, 2022
  • Review provided by G2

What do you like best about the product?
Easy to learn and use for complex ETL processes.
What do you dislike about the product?
Few support resources are available online beyond the official documentation.
What problems is the product solving and how is that benefiting you?
Below are the problems I have solved.
1. Data Collector: Collected data from on-premise sources to the cloud.
2. Applied transformations to prepare data for analytics.


    Mustafa K.

Best Data Pipeline Building Platform

  • August 30, 2022
  • Review provided by G2

What do you like best about the product?
StreamSets is one of the leading platforms for building data pipelines, and it is used by many tech giants. It is also partnered with AWS, Snowflake, Google Cloud, and Azure, which is very helpful for DevOps, DataOps, and data engineers because it provides a comprehensive solution on one platform.
What do you dislike about the product?
I don't think it has any real downsides, because the platform is so versatile. The only thing they could improve is adding more regional servers around the world to reduce latency.
What problems is the product solving and how is that benefiting you?
I wanted to connect my Apache Kafka and Apache NiFi with a data lake, so I found this platform and it really helped me. Because of this amazing platform, my work was completed in only a few clicks.


    Rishabh G.

StreamSets review

  • August 27, 2022
  • Review provided by G2

What do you like best about the product?
Build fast and efficient data pipelines. Setting up the environment is not complex and can be done within minutes.
What do you dislike about the product?
Lag in the StreamSets cloud UI, and not enough use cases and examples for many problems. Hybrid data sets cannot be joined in Data Collector, which is a significant limitation. The hands-on lab allows too little time to practice.
What problems is the product solving and how is that benefiting you?
Integrating API and speeding up build.


    Information Technology and Services

Pleasantly surprised by its capabilities.

  • August 19, 2022
  • Review verified by G2

What do you like best about the product?
The UI canvas and the ability to choose different stages such as origins, processors, and destinations.
What do you dislike about the product?
The lineage/provenance feature needs work. I hate to compare it with Apache NiFi, but this is one feature where NiFi trumps StreamSets.
What problems is the product solving and how is that benefiting you?
We have a lot of different data formats, and transforming them using hand-coded ETL tools or other systems is cumbersome and frustrating. StreamSets does things elegantly and in the least time-consuming manner.


    Insurance

StreamSets is a great product for DataOps.

  • August 19, 2022
  • Review verified by G2

What do you like best about the product?
The ability to create a pipeline with a visual representation of the executions.
What do you dislike about the product?
The training provided is very basic and could be more specific.
What problems is the product solving and how is that benefiting you?
data engineering


    Tribhuban G.

Review of Working in StreamSets platform

  • August 16, 2022
  • Review provided by G2

What do you like best about the product?
The intuitive canvas for designing StreamSets pipelines, coupled with the ease of configuring environment values in StreamSets, is very useful for a data architect.
What do you dislike about the product?
The Data Collector has to be designed properly. A few of the components require external JARs. For example, a simple database configuration such as MySQL requires the MySQL connector JAR to be installed in the Data Collector before that Data Collector can be used to read data.
What problems is the product solving and how is that benefiting you?
Problems related to Data Analytics and real-time predictions for various real-life business use cases. This has helped in generating new business ideas and predictions of solutions.


    Paula S.

StreamSets is easy to use and maintain, and has a transparent appearance.

  • August 11, 2022
  • Review verified by G2

What do you like best about the product?
Very easy to follow where data goes, catch up on nodes and prepare a preview.
What do you dislike about the product?
For a new user, it is sometimes not clear at first glance how to set up nodes. A site explaining how each node works would be very helpful.
What problems is the product solving and how is that benefiting you?
Changing data formats without using a programming language.


    Eliana T.

It was good and user friendly, and the help desk is very good

  • June 09, 2022
  • Review verified by G2

What do you like best about the product?
It's really user friendly, and there's a lot of information in the portal.
What do you dislike about the product?
Sometimes there are a lot of tricks that can't be found in the portal, and you go around without knowing how to proceed.
What problems is the product solving and how is that benefiting you?
Sending data from one place to another in an easy way.


    nitin s.

Very Powerful and Easy Data Engineering Platform. Capable of handling multiple platforms and huge amounts of data.

  • January 30, 2022
  • Review verified by G2

What do you like best about the product?
StreamSets is very lightweight. Since it is a containerized app, it is easy to use with Docker if you are an individual developer. Organizations can use Kubernetes.
They have a very easy and user-friendly interface. It takes only a few days for new developers to start and deploy their first pipelines.
StreamSets provides easy and powerful stages (a kind of connector) to integrate StreamSets with different platforms such as Kafka, Salesforce, Oracle DB, REST APIs, HTTPS connections, data lakes, and many more.
StreamSets uses regular expressions for data transformation operations, which is really easy.
Monitoring StreamSets pipelines is very easy: you can register your Data Collector with Control Hub (SCH) using provisioning agents. After registering, you can deploy pipelines to SCH and create jobs. All of this can be done using their Python SDK, which can easily be integrated with ADO release pipelines.
After creating or deploying pipelines, users can use SCH subscriptions to create alerts when pipelines or jobs change status.
For individual alerts, pipelines have a built-in capability to do so.
After version 4.0.1, SDC was merged with their DataOps platform. This gives individual developers the feel of a Control Hub. It also removes platform dependency.
They have excellent security. Pipelines can be integrated with Azure Key Vault, which eliminates the need to share credentials with developers. The same goes for parameters and runtime parameters. Developers can easily replace any value in a pipeline with ADO library variables.
If you are an organization, they provide very extensive support and work immediately on any bug an organization finds. They also have a customer success team that will do anything to make sure your organization's experience with StreamSets is seamless.
What do you dislike about the product?
A few of the stages are a bit unstable, such as the Oracle CDC Client. They work fine, but in some corner-case scenarios they become a bit tricky. The logging mechanism is excellent and extensive, but it could be simpler.
What problems is the product solving and how is that benefiting you?
I am in an organization where we are working on sharing data between multiple applications running on different platforms. We needed a tool or platform that can easily integrate with a variety of technologies and adapt to this ever-changing era.
StreamSets allowed us to share real-time data between platforms, which also removed our dependency on heavier ETL tools like SSIS and Ab Initio.
Since it is easier to use, our talent development team can enable our developers to use StreamSets.


    Abhishek K.

StreamSets: A Powerful Data Engineering + DataOps Tool

  • January 20, 2022
  • Review verified by G2

What do you like best about the product?
The easy-to-use canvas for creating data engineering pipelines with the required stages (Sources + Processors + Executors + Destinations).
Scheduling data pipelines was never this easy.
Fetching application Secrets from Key-Vault for enhanced Security.
What do you dislike about the product?
The built-in job monitoring and visualization is not that user-friendly; StreamSets should include features to visualize things like how many records were streamed from source to destination on a particular date.
Logging and error information could be better and more detailed.
A fragment drill-down feature while monitoring data flow in a running job would help.
What problems is the product solving and how is that benefiting you?
As part of one of the healthcare service provider accounts, our data engineering team uses StreamSets to design data pipelines that hydrate ADLS/GCS. These datasets in turn help data scientists and analysts generate patterns and insights for the healthcare benefit of customers.
Recommendations to others considering the product:
A product to consider for fast-paced Data Engineering pipeline development.