
Trifacta Data Preparation for Amazon Redshift and S3

Trifacta | Trifacta Data Preparation 8.2.1 for Amazon Redshift and S3

Linux/Unix, CentOS 7.9.2009 - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

0 AWS reviews
  • 5 star
    0
  • 4 star
    0
  • 3 star
    0
  • 2 star
    0
  • 1 star
    0

External reviews

108 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Juri S.

Excellent ETL tool for non-programmers

  • July 13, 2020
  • Review provided by G2

What do you like best?
Dataprep has a great user interface that makes it easy to inspect the data and to transform it. Pipelines are easy to set up and to understand thanks to the Flows UI. Easy to use for everyone, no programming skills needed. I also like the fact that pipelines are reusable and can be invoked via an API.
What do you dislike?
Working with dates and international date formats in Dataprep is still a nightmare, e.g. if you want to convert date formats or check whether a date format is correct. I wish they used ISO date formats as a standard. Also, working with and switching between multiple accounts (on Google Cloud) is not well supported.
What problems are you solving with the product? What benefits have you realized?
Build and run reusable and scalable ETL pipelines within a very short time to ingest data into BigQuery. Make data prepping accessible to non-programmers.


    Hardik M.

A perfect tool for Data Scientists, ML engineers for data processing and getting data insights.

  • July 10, 2020
  • Review verified by G2

What do you like best?
Data processing at the click of a button. Features such as Join and other transformations are very useful. Compared with using Python libraries, this is the quickest way to process data. Moreover, its affiliation with Google Cloud Platform helps to create a model in a short period of time. Data processed with Trifacta Dataprep can be used directly for AutoML Tables. It provides insights such as the correlation between columns. The interface displays a histogram showing the spread of the data, along with the type of data, i.e. categorical or numerical. It also supports standard data types such as date, phone number, and address. It can also convert JSON data into structured data useful for machine learning.
What do you dislike?
The interface sometimes lags and could be smoother and more lightweight. There are not many resources available; video tutorials on YouTube would help a lot.
What problems are you solving with the product? What benefits have you realized?
I have solved the problem of processing the data required for training models. It handles problems such as invalid data types, missing values, and many more.
Recommendations to others considering the product:
Trifacta is great software for data processing and will save you time. It has most of the transformation features. It does not require any programming skills. It is a user-friendly application that can be handled by a non-tech-savvy person too. Moreover, GCP credits can be used to operate Dataprep. It is a simple application in which you can excel within a week. It eliminates the need to learn to code for data processing. There are tooltips that provide the details of each function.


    Meir K.

Trifacta is an incredible data tool that is a game changer for how we work with our data.

  • July 10, 2020
  • Review verified by G2

What do you like best?
The fact that it is cloud-based, simple to use, and comes with lots of free training. The level of support is also great. They are super focused on helping their users learn the platform, and they walk you through multiple sessions to ensure that you are making progress in learning and using it.
What do you dislike?
Some of the pages load slowly and formulas require a more technical background.
What problems are you solving with the product? What benefits have you realized?
Being able to profile our data super fast and also answer data questions that we could not answer previously. We are still new and I anticipate having much more to include on this topic after a few more months.
Recommendations to others considering the product:
Speak to users who implemented Trifacta six or more months ago and ask how the product is helping them work better.


    Computer Software

Great tool for building a data preparation pipeline

  • July 10, 2020
  • Review verified by G2

What do you like best?
You can see a preview of the data cleaning and processing rules you set up
You can schedule runs and output the data to different destinations, including BigQuery
Many pre-set rules to apply in one click
Able to work with huge volumes of data
What do you dislike?
There is a limit on how many rows you can see in the sample. The limit is quite large, but in some cases you cannot view all the special cases and can miss handling them.
What problems are you solving with the product? What benefits have you realized?
Data cleaning and preparation for ML projects
Data aggregations
Schedule runs and read/write between data sources and databases
Recommendations to others considering the product:
Great for: machine learning pipelines, data cleaning and preparation for ML projects,
data aggregations, scheduled runs, and reading/writing between data sources and databases


    Kyle C.

Google Cloud Dataprep

  • July 10, 2020
  • Review verified by G2

What do you like best?
When looking to clean or prep large files, Dataprep comes in very handy, especially with its direct integrations with Cloud Storage and BigQuery. The tool is also easy to use and suggests many common data transformation steps based on how your source data looks when you first upload it into the tool.

Scheduling ingestion jobs is also very simple to do in Dataprep when your source data lives in Google Cloud Storage and your destination is BigQuery. For anyone using the Google Cloud Platform for data warehousing purposes, Dataprep is a must-use tool.
What do you dislike?
The product can be a little intimidating to first-time users. There are so many tools available, and I find that the documentation is not the best. The tool also suffers from intermittent slowdowns.

More integrations would be great. I know that more are coming down the pipe and am looking forward to trying them out.

Notifications when a flow fails would be a great addition. If something changes in the source data without us knowing, it may cause a recipe to fail. It would be great to be notified when a failure happens.
What problems are you solving with the product? What benefits have you realized?
Using Dataprep to clean data extracts from multiple sources and upload the prepped data into Google BigQuery. We have daily scripts running that dump raw extracts into Google Cloud Storage. Using parameterized datasets in Dataprep, we pull the new files daily into a flow that cleans and preps the data and loads it into BigQuery.
Recommendations to others considering the product:
If you are using the Google Cloud Platform to spin up a data warehouse, or if you require data cleansing/prep work for other purposes, Trifacta Dataprep is a must-use tool.

It integrates seamlessly into pipelines built in the GCP ecosystem and they have new integrations coming down the pipe (Salesforce being one of them).


    Michiel M.

Easy Data preparations using Dataprep

  • July 10, 2020
  • Review verified by G2

What do you like best?
In general, I like the ease of preparing data.
Formatting, transforming, or calculating data has never been easier.
Without even knowing programming languages, you are able to set up data transformations.
What do you dislike?
I miss some connectors on Google Cloud Platform for Dataprep.
Seeing the new updates, I believe a lot of effort is being put into it at the moment.
What problems are you solving with the product? What benefits have you realized?
I mainly use Trifacta to automatically process data that ends up in automatically generated reports.
I build such reports for Operations, Finance, Sales and the Human Resources department.
Recommendations to others considering the product:
It's a nice and easy tool for quick data preparation.


    Stephen W.

Useful for wrangling data

  • July 10, 2020
  • Review verified by G2

What do you like best?
Trifacta (or at least Cloud Dataprep by Trifacta, which is the version I have used) is a useful tool for data preparation. The company I work at switched to Google Cloud Platform, having previously used a combination of SQL servers & Datameer, and while BigQuery gave us a step up with the SQL stuff, when it comes to dealing with unstructured data Trifacta helped solve a lot of the problems I previously solved with Datameer. The flow chart UI is intuitive, and there is a wide range of functions.
What do you dislike?
The sampling functionality, while good, isn't the fastest when relying on Dataflow (this may relate to a resource constraint within my organisation), making the process not totally responsive at times. Arguably there could be greater crossover with BigQuery in the way that functions are named, if GCP users are likely to use the two in tandem rather than as an either/or. It might be useful to be able to schedule based on more triggers outside Trifacta itself (i.e. file or table modifications); doing everything in Trifacta & chaining flows together sort of gets around this, but again it's not quite as integrated with the GCP ecosystem as it might be. I have had the odd issue with crashes, but typically with very large datasets & awkward transformations of these; nothing I wasn't able to resolve through giving Trifacta better input.
What problems are you solving with the product? What benefits have you realized?
I mainly use it to transform unstructured JSON into a usable form for projects in BigQuery, as well as for text processing/regex stuff where again I feel it outperforms the other GCP alternatives. It's been very helpful in realising some automation goals that rely on these processes, where otherwise I might have needed to move data around more and use more Python. It made the process faster & easier.
Recommendations to others considering the product:
I'd definitely advise any GCP user to make it part of their skillset. My approach was to just sort of dive into using it, but the other, similar tools I used before it might have made that more practical for me than for some. The documentation is good and the community/support pages are helpful, so if you do get stuck, while there's not a wealth of stuff on Stack Overflow like you might find for other products, you shouldn't struggle to find solutions to problems.


    Mario M.

Satisfying Trifacta

  • July 10, 2020
  • Review verified by G2

What do you like best?
The possibility of using code sentences.
What do you dislike?
The time spent to output the file from Trifacta once everything is finished.
What problems are you solving with the product? What benefits have you realized?
Data Cleaning and Data Transformations. With Trifacta, I have spent way less time on performing these tasks.
Recommendations to others considering the product:
Use the tutorial or any other tools to understand how Trifacta works.


    Dorota K.

Without DataPrep by Trifacta - I would be very frustrated!

  • July 10, 2020
  • Review provided by G2

What do you like best?
Simple to start, logical to use. Easy to follow the logic of someone else's flows. Easy to correct mistakes and go back to previous steps, and to add transformations or delete the wrong ones. Easy to automate the data preparation.
What do you dislike?
Performance, sometimes, though rarely. It looks like there were some performance challenges around one of the Dataprep releases on Google Cloud.
What problems are you solving with the product? What benefits have you realized?
Making preparation and updates of my data much easier. Once I create the logic in the flow, it takes minutes to refresh the data going forward.
I can easily hand over the logic to people who don't have the data mindset, and they can follow it easily.
It makes preparing the data for the BI solutions we use in the company much easier, and the fact that I can use this product as a part of Google Cloud Platform makes my job much easier.
Coming enhancements that Trifacta has in mind will probably make the process even better.
Recommendations to others considering the product:
It is really easy to use! It has a lot of functionality and functions, and gives you a lot of opportunity to do what you want with the data. It reads from databases but also uses CSV files. You can schedule the updates. You can make it part of your automated ETL process.


    Max R.

Greatest tool to discover your data

  • July 10, 2020
  • Review provided by G2

What do you like best?
This tool has limitless possibilities when it comes to complex data. We process some large files only with Trifacta, as no other tools can handle them.
What do you dislike?
Could be a bit more flexible with initial samples and their sizing.
What problems are you solving with the product? What benefits have you realized?
Processing of Big Data and data preparation rules
Recommendations to others considering the product:
Don't wait. Do it right away!