
StreamSets Data Collector

StreamSets | 3.22.3

Linux/Unix, Amazon Linux Amazon Linux 2 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

5 AWS reviews

External reviews

98 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Hospital & Health Care

Been using StreamSets for all of our use cases for on-prem to cloud transfers

  • October 25, 2020
  • Review provided by G2

What do you like best about the product?
Easy UX makes it easier to configure pipelines
What do you dislike about the product?
StreamSets Control Hub has a lot of issues when multiple DCs are attached
What problems is the product solving and how is that benefiting you?
On-prem to cloud data transfers


    Bishnu R.

Managing pipelines with StreamSets in a K8s environment

  • October 24, 2020
  • Review verified by G2

What do you like best about the product?
I had no difficulty integrating StreamSets Control Hub with Kubernetes with the help of the StreamSets Control Agent.
What do you dislike about the product?
Sometimes an updated Docker image of the StreamSets agent ships with vulnerabilities, which they should take care of before release.
What problems is the product solving and how is that benefiting you?
I am managing the StreamSets Control Agent in a K8s environment and have not experienced any issues so far.
Recommendations to others considering the product:
Yes, I will always recommend StreamSets to others.


    Information Technology and Services

Review on Streamsets

  • October 24, 2020
  • Review provided by G2

What do you like best about the product?
It works in real time whenever there are updates in the source database.
What do you dislike about the product?
Sometimes, data sits in Kafka and we need to add extra functionality to pull residual data.
What problems is the product solving and how is that benefiting you?
If we use Streamsets, we can access source data in less than 5 minutes without any further delay.


    Jered L.

StreamSets used to be a great open source tool, but has lost its niche

  • October 23, 2020
  • Review verified by G2

What do you like best about the product?
The simplicity of creating data pipelines visually, with no clunky installation and no databases / metadata to manage, like with other ETL tools. All pipeline info is stored on the file system itself.
What do you dislike about the product?
They have closed-sourced their software and locked it behind a SaaS model. This is a relatively recent change, which caused lots of headaches.
What problems is the product solving and how is that benefiting you?
We are regularly using StreamSets to pull 220 million records daily from Salesforce, eliminating the need to write complicated Python code. We are also using this tool for Oracle CDC, which has worked well at scale (20 million transactions / day). I have also used this tool to consume from over 5 different JDBC-based sources, with great performance and simple implementation.

The best benefit we have realized is the fact that you can dockerize the SDC service, deploy it in ECS or any container orchestration service, and run pipelines that scale horizontally, instead of having static servers hosting the service. If you can implement this properly, it makes writing ingestion pipelines EXTREMELY simple. I can add a new data source to our ETL jobs within a day, instead of weeks by doing this. And it scales to handle thousands of tables!
Recommendations to others considering the product:
There is no reason not to try the Data Collector. It is free to download the tarball and install. Try it using Docker on your local machine, then deploy it in a development environment for testing.


    Zorik Z.

Very powerful tool with lots of endpoints and high performance

  • October 21, 2020
  • Review provided by G2

What do you like best about the product?
Easy integration with different endpoints
What do you dislike about the product?
It would be better to have more tutorials and documentation.
What problems is the product solving and how is that benefiting you?
I was working on StreamSets integration with Greenplum using gRPC.
Recommendations to others considering the product:
Powerful tool with many endpoints and great support.


    Sai Paramahamsa P.

StreamSets is an amazing tool for data movement across environments seamlessly

  • October 21, 2020
  • Review provided by G2

What do you like best about the product?
Developer-friendly user interface and reasonable speed of data transfer across various databases
What do you dislike about the product?
Debugging is a pain in StreamSets because of the exhaustive Java logs
What problems is the product solving and how is that benefiting you?
We use StreamSets for both batch and near-real-time data ingestion and manipulation into our enterprise data warehouse
Recommendations to others considering the product:
The support team of Streamsets is very committed and will always help tweak the software for any specific but highly used components that are missing in the current version. So take advantage and get the best version of this tool for your enterprise needs.


    Marco M.

Friendly and powerful ETL framework, still evolving

  • October 21, 2020
  • Review verified by G2

What do you like best about the product?
Intuitive, very useful plugins, easy to deploy/maintain. It's really awesome how you can build pipelines for microservices, streaming and batch purposes in a single environment.
Very straightforward to install and get ready to use, even for production (at least for the essential parts; then obviously it should be reviewed with a security team).
What do you dislike about the product?
The bugs you may find are solved with workarounds; you have to wait a bit for a stable solution. Some plugins are missing (but they will be added soon, if not already).
There are many updates during the year. If you don't have a proper setup to move from test/pre-production to production, you may face some issues every time.
What problems is the product solving and how is that benefiting you?
Managing the entire preprocessing during ingestion. It's very handy, and it is easy to add new pipelines for new data sources or maintain the existing ones.
Recommendations to others considering the product:
Always create a box that can be easily updated with the latest release: a lot of issues might be solved in every minor release. Moreover, it can be easily updated using the logic of a git repo.
Try to always use the StreamSets logic as much as possible and avoid big Groovy/Jython blocks; you will benefit from it.
Prepare a CI if possible to keep StreamSets always up to date.
Share any debugging tips or solutions you may have found with the community; a lot of people may be looking for them.


    Rizwan S.

I'll recommend it to my friends because there is not a single line of code

  • October 12, 2020
  • Review provided by G2

What do you like best about the product?
Easy to use and understand, just drag and drop. We can graphically monitor the flow.
What do you dislike about the product?
Please increase the number of origins and destinations.
What problems is the product solving and how is that benefiting you?
Integrating cloud data with HDFS


    Denis Y.

Easy to set up fast data flows with a lot of features and flexibility.

  • October 07, 2020
  • Review verified by G2

What do you like best about the product?
Convenient access to Hadoop FS, access to web APIs and parsing a JSON response. You can easily combine a lot of different technologies in one flow (Hadoop, Python, Java, web API).
I started using it for learning purposes while studying Big Data, but now we use it in our company on an everyday basis.
What do you dislike about the product?
Everything is beyond expectations. The only limitation is convincing people to use it more, and it is not easy to find enough professionals on the market.
What problems is the product solving and how is that benefiting you?
Recalculating monthly accounting reports in different currencies


    Information Technology and Services

Solution Architect

  • October 01, 2020
  • Review provided by G2

What do you like best about the product?
As a Solution Architect, I found the StreamSets tool very useful in understanding dataflows in our company.
What do you dislike about the product?
To be honest, I'd like more training to be offered by the company.
What problems is the product solving and how is that benefiting you?
The StreamSets solution was very helpful in getting an enormous amount of data from IoT devices into the central database.