StreamSets Transformer (Large)

StreamSets | 3.18.00

Linux/Unix, Amazon Linux Amazon Linux 2 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

0 AWS reviews
  • 5 star: 0
  • 4 star: 0
  • 3 star: 0
  • 2 star: 0
  • 1 star: 0

External reviews

34 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Banking

Easy to use and very nice interface

  • October 27, 2020
  • Review provided by G2

What do you like best?
The tool had a lot of options for integrating with different protocols, languages, and origins. We used it to integrate with Kafka/AWS, send emails, and develop different types of data feeds. The user interface was quite nice and easy to use. Whether the task was simple or complex, we were always able to find a processor or executor to achieve our goal.
What do you dislike?
Since the tool was new, there was limited support on the internet. The Ask StreamSets page is helpful, but I expected a more developed ecosystem. Sometimes we faced issues using well-known libraries like moment.js. It's a pain to maintain these libraries on your server. We had to use a different language to implement certain modules because the JavaScript library for that task was not supported. So our pipelines looked like a bunch of processors, each using a different language/framework.
What problems are you solving with the product? What benefits have you realized?
We were trying to develop data feeds for different downstream systems, originating from a wide variety of sources. I really liked how StreamSets Control Hub had the option to schedule your pipelines. StreamSets Control Hub also had internal version control, which was an additional benefit.


    Harry Kim B.

It was powerful, but there were lots of job failures

  • October 27, 2020
  • Review provided by G2

What do you like best?
This tool can connect from the FTP or MFT server to our MSSQ
What do you dislike?
The jobs designed for our project usually fail, which leaves our team with a lot of monitoring work and manual data processing.
What problems are you solving with the product? What benefits have you realized?
It's about the scheduled extraction and storage of data from one server to another. This is very beneficial to our live dashboards, which need real-time updates for our clients.
Recommendations to others considering the product:
Maybe add more real-time support that caters to all time zones, and make the tool more user-friendly.


    Hospital & Health Care

Been using StreamSets for all of our use cases for on-prem to cloud transfers

  • October 25, 2020
  • Review provided by G2

What do you like best?
Easy UX makes it easier to configure pipelines
What do you dislike?
StreamSets Control Hub has a lot of issues when multiple DCs are attached
What problems are you solving with the product? What benefits have you realized?
On-prem to cloud data transfers


    Bishnu R.

Managing pipelines with StreamSets in a K8s environment

  • October 24, 2020
  • Review verified by G2

What do you like best?
I did not have any difficulty integrating StreamSets Control Hub with Kubernetes with the help of the StreamSets Controller Agent.
What do you dislike?
Sometimes the updated Docker image of the StreamSets agent comes with vulnerabilities, which they should take care of before release.
What problems are you solving with the product? What benefits have you realized?
I am managing the StreamSets control agent in a K8s environment and have not experienced any issues so far.
Recommendations to others considering the product:
Yes, I will always recommend StreamSets to others.


    Information Technology and Services

Review on Streamsets

  • October 24, 2020
  • Review provided by G2

What do you like best?
It works in real time whenever there are updates in the source database.
What do you dislike?
Sometimes, data sits in Kafka and we need to add extra functionality to pull residual data.
What problems are you solving with the product? What benefits have you realized?
If we use Streamsets, we can access source data in less than 5 minutes without any further delay.


    Jered L.

StreamSets is a great open source ETL tool that is horizontally scalable and simple to use

  • October 23, 2020
  • Review verified by G2

What do you like best?
The simplicity of creating data pipelines visually, with no clunky installation and no databases / metadata to manage, like with other ETL tools. All pipeline info is stored on the file system itself.
What do you dislike?
Sometimes there can be unhelpful error messages, although these are very rare (e.g., a Java NullPointerException). Occasionally we have noticed that CDC pipelines can stop receiving records, and we need to restart them to resume pulling data.
What problems are you solving with the product? What benefits have you realized?
We are regularly using StreamSets to pull 220 million records daily from Salesforce, eliminating the need to write complicated Python code. We are also using this tool for Oracle CDC, which has worked well at scale (20 million transactions / day). I have also used this tool to consume from over 5 different JDBC-based sources, with great performance and simple implementation.

The best benefit we have realized is the fact that you can dockerize the SDC service, deploy it in ECS or any container orchestration service, and run pipelines that scale horizontally, instead of having static servers hosting the service. If you can implement this properly, it makes writing ingestion pipelines EXTREMELY simple. I can add a new data source to our ETL jobs within a day, instead of weeks by doing this. And it scales to handle thousands of tables!
Recommendations to others considering the product:
There is no reason not to try the Data Collector. It is free to download the tarball and install. Try it using Docker on your local machine, then deploy it in a development environment for testing.


    Zorik Z.

Very powerful tool with a lot of endpoints and high performance

  • October 21, 2020
  • Review provided by G2

What do you like best?
Easy integration with different endpoints
What do you dislike?
It will be better to have more tutorials and documentation.
What problems are you solving with the product? What benefits have you realized?
I was working on StreamSets integration with Greenplum using gRPC.
Recommendations to others considering the product:
Powerful tool with many endpoints and great support.


    Sai Paramahamsa P.

StreamSets is an amazing tool for moving data across environments seamlessly

  • October 21, 2020
  • Review provided by G2

What do you like best?
Developer-friendly user interface and reasonable speed of data transfer across various databases
What do you dislike?
Debugging is a pain in StreamSets because of the exhaustive Java logs
What problems are you solving with the product? What benefits have you realized?
We use StreamSets for both batch and near-real-time data ingestion and manipulation into our enterprise data warehouse
Recommendations to others considering the product:
The StreamSets support team is very committed and will always help tweak the software for any specific but highly used components that are missing in the current version. So take advantage and get the best version of this tool for your enterprise needs.


    Marco M.

Friendly and powerful ETL framework, still evolving

  • October 21, 2020
  • Review verified by G2

What do you like best?
Intuitive, very useful plugins, easy to deploy/maintain. It's really awesome how you can build pipelines for microservices, streaming, and batch purposes in a single environment.
Very straightforward to install and ready to use, even for production (at least for the essential parts; it should then obviously be reviewed with a security team).
What do you dislike?
The bugs you may find are solved with workarounds; you have to wait a bit for a stable solution. Some plugins are missing (but they will be added soon, if not already).
There are many updates during the year. If you don't have a proper setup for moving from test/pre-production to production, you may face some issues every time.
What problems are you solving with the product? What benefits have you realized?
Managing the entire preprocessing during ingestion. It's very handy, and it is easy to add new pipelines for new data sources or maintain the existing ones.
Recommendations to others considering the product:
Always create a box that can be easily updated with the latest release: a lot of issues might be solved in every minor release. Moreover, it can be easily updated using the logic of a git repo.
Always try to use StreamSets logic as much as possible and avoid big Groovy/Jython blocks; you will benefit from it.
Prepare a CI, if possible, to keep StreamSets always up to date.
Share any debugging tips or solutions you may have found with the community; a lot of people may be looking for them.


    Rizwan S.

I'll recommend it to my friends because there is not a single line of code

  • October 12, 2020
  • Review provided by G2

What do you like best?
Easy to use and understand, just drag and drop. We can graphically monitor the flow.
What do you dislike?
Please increase the number of origins and destinations.
What problems are you solving with the product? What benefits have you realized?
Integrating cloud data with HDFS