PDI, best data cleaning tool
What do you like best about the product?
Pentaho comes in two editions, enterprise and community. I have experience with the community edition, and here are the advantages I see:
1. It's under the Apache 2.0 license, so as long as you read and follow its terms, you can have this powerful tool for free
2. It has a very friendly user interface, so anybody, even without strong programming skills, can build transformations in just minutes
3. It supports a wide variety of input formats, from simple CSV or Excel files to databases, JSON, and even S3 storage
4. It has a lot of tools for transforming your data without coding
5. If PDI's built-in functions aren't enough for you, you can add scripting steps
What do you dislike about the product?
I see a strong opportunity to improve the documentation; sometimes it's hard to find examples for all the functionality that PDI offers
What problems is the product solving and how is that benefiting you?
I mainly use Pentaho for the transformation stage of the ETL cycle: I cleanse data from different sources and store it in a DWH