AWS Marketplace: Dataiku Trial Reviews

Financial Services

Initial stages

April 24, 2025
Review provided by G2

What do you like best about the product?

Its Models and Agents capabilities, all in one platform

What do you dislike about the product?

Data literacy and enablement need more focus

What problems is the product solving and how is that benefiting you?

Data Analytics

Satish M.

Very intuitive

April 24, 2025
Review provided by G2

What do you like best about the product?

Uniform platform for AI and engineering for the enterprise

What do you dislike about the product?

This kind of platform is very common in the industry; 400-plus platforms exist

What problems is the product solving and how is that benefiting you?

Simple self-service analysis

Maximus P.

Great Product With Tons of Potential

April 24, 2025
Review provided by G2

What do you like best about the product?

I like the ability to easily view data in numerous ways.

What do you dislike about the product?

Dataiku makes frequent updates that can sometimes cause issues with existing workflows.

What problems is the product solving and how is that benefiting you?

It helps to quickly and effectively transform data into useful reports and end uses.

Banking

KYC Risk model

April 24, 2025
Review provided by G2

What do you like best about the product?

Flexibility, configurability, and modular setups

What do you dislike about the product?

Cost and license constraints that prohibited more users on the platform

What problems is the product solving and how is that benefiting you?

Compliance efficiency and integration

Savita M.

Dataiku great data science platform

April 24, 2025
Review provided by G2

What do you like best about the product?

Dataiku is a very user-friendly low-code/no-code platform for AI/ML capabilities

What do you dislike about the product?

It needs more infrastructure compatibility and flexibility in integrations

What problems is the product solving and how is that benefiting you?

It has made it very easy and accessible for users to build quick low-code prototypes for testing.

Juliette M.

Dataiku review

April 24, 2025
Review provided by G2

What do you like best about the product?

I love the platform; it's intuitive and very useful. The LLM recipes are especially useful. Overall I think it's a great platform: it looks great, it makes sense, and it definitely allows me to do my work quicker.

What do you dislike about the product?

The actual support hasn't always been the best. I've often reached out for support and wasted a lot of time going back and forth without resolving a problem, only to be told that the person trying to help me doesn't know as much about the cloud version of Dataiku. The documentation is never cloud-specific either, so it's a little confusing. The process through which Dataiku has been working out a use case for us has also had some difficulties.

What problems is the product solving and how is that benefiting you?

We are still testing out Dataiku, seeing what it can do for us, but so far it's made simple data transformations a lot easier. We are also using some of the traditional data modelling and some traditional ML features. It's been most useful for using LLMs, allowing us to summarize and extract data from free text, giving us data that we've not been able to access until now.

melika v.

Review of Dataiku as a developer

April 24, 2025
Review provided by G2

What do you like best about the product?

Bringing everything into one place, from data to model development and deployment.

What do you dislike about the product?

Clusters shutting down for no reason, unstable connections, and the time it takes to add a library to a template cluster and rebuild it.

What problems is the product solving and how is that benefiting you?

It helps with prototyping an app and delivering it to clients

Satish K.

Dataiku is Awesome

April 24, 2025
Review provided by G2

What do you like best about the product?

🔄 Smart Data Preparation
Transform raw data into structured, ready-to-use assets using intuitive tools enhanced by AI-driven suggestions, auto-schema detection, and intelligent type recognition.

🧪 Continuous Development
Support agile analytics with a CI/CD-style environment where data flows, scripts, and models evolve continuously, promoting rapid iteration and improvement.

⚙️ Ease of Implementation
Minimize setup complexity with modular components, drag-and-drop interfaces, and seamless integration with existing data ecosystems (cloud, on-prem, hybrid).

✅ Robust Data Validation
Ensure data quality through built-in validation checks, profiling dashboards, and the flexibility to implement custom Python logic for complex or domain-specific rules.

🧠 Scenario Building
Model and simulate different business or analytical scenarios using parameterized workflows, branching logic, and reusable components to support what-if analyses.

🌀 Flow Zones
Organize and manage data processes in "Flow Zones" — clearly defined stages (e.g., Ingest → Transform → Validate → Output) that make pipeline orchestration transparent and scalable.

📚 Integrated WIKI Page
Empower collaboration and knowledge sharing with an embedded WIKI page. Document logic, share best practices, track changes, and onboard new users effortlessly.
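
To make the "custom Python logic" mentioned under Robust Data Validation above more concrete, here is a minimal sketch of rule-based row validation in plain Python. It does not use Dataiku's API; the rule names, field names, and sample rows are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Rule = Callable[[Row], bool]

# Hypothetical domain rules; names and thresholds are illustrative only.
RULES: Dict[str, Rule] = {
    "amount_is_positive": lambda row: (
        isinstance(row.get("amount"), (int, float)) and row["amount"] > 0
    ),
    "currency_is_three_letters": lambda row: (
        isinstance(row.get("currency"), str) and len(row["currency"]) == 3
    ),
}


def validate(rows: List[Row], rules: Dict[str, Rule] = RULES) -> Dict[str, List[int]]:
    """Return, for each rule name, the indices of the rows that fail it."""
    failures: Dict[str, List[int]] = {name: [] for name in rules}
    for index, row in enumerate(rows):
        for name, rule in rules.items():
            if not rule(row):
                failures[name].append(index)
    return failures


# Made-up sample data: the second row fails the amount rule,
# the third row fails the currency rule.
sample_rows = [
    {"amount": 120.5, "currency": "EUR"},
    {"amount": -3, "currency": "USD"},
    {"amount": 40, "currency": "dollars"},
]
report = validate(sample_rows)
```

Keeping each rule as a small named function makes it straightforward to report which rule a row failed, rather than only that it failed.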

What do you dislike about the product?

While DSS offers a powerful visual interface and flexibility, working with large datasets often introduces significant friction, particularly during scenario execution and debugging.

🚧 Key Pain Points:
Performance Bottlenecks:
Executing complex scenarios on large datasets directly in the DSS engine is slow and resource-intensive, often making it impractical for time-sensitive analytics.

Dependence on External Engines:
To achieve acceptable performance, teams must offload processing to SQL or Spark engines, requiring:

Additional infrastructure setup (clusters, permissions, connections)

Advanced SQL or PySpark expertise, which can be a barrier for data analysts or citizen data scientists.

Debugging Overhead:
Troubleshooting large workflows is cumbersome due to:

Limited transparency into underlying code execution

Multi-layered architecture (visual flow → Spark/SQL translation → execution engine)

Slower iteration cycles, especially with Spark
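
The trade-off described under Dependence on External Engines above can be sketched with Python's standard-library sqlite3 module standing in for an external SQL engine. This is an illustration under assumptions, not how DSS itself executes recipes; the table and column names are hypothetical.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE transactions (customer TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("a", 10.0), ("b", 5.0), ("a", 2.5), ("b", 1.0)],
)


def totals_in_python(conn: sqlite3.Connection) -> dict:
    """Pull every row into the client process, then aggregate there."""
    totals: dict = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM transactions"):
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


def totals_in_database(conn: sqlite3.Connection) -> dict:
    """Let the SQL engine aggregate; only one row per customer comes back."""
    query = "SELECT customer, SUM(amount) FROM transactions GROUP BY customer"
    return dict(conn.execute(query).fetchall())


in_python = totals_in_python(connection)
in_database = totals_in_database(connection)
```

Both functions return the same totals, but the second moves the work to where the data lives, which is the reason large workloads tend to be pushed down to SQL or Spark. The cost, as the review notes, is that someone has to write and debug the query.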

What problems is the product solving and how is that benefiting you?

✅ Automated Data Validation
Prebuilt validation rules with customizable logic (Python/SQL)

Auto-profiling and anomaly detection at ingest

Validation integrated directly into data pipelines and alerts

🧠 Smart Data Ingestion & Reading
Intelligent schema detection, auto-type inference, and data previews

Efficient sampling of large datasets without full-load requirements

Flexible connectors for cloud, on-prem, and APIs with minimal setup

📊 Quick Insights Through Data Visualization
One-click data summaries with charts, distributions, and KPIs

Drill-down capabilities for root-cause analysis

Seamless embedding of visuals into flows, dashboards, and WIKI pages

🔐 Built-in Data Governance
Centralized metadata catalog and lineage tracking

Role-based access controls and audit trails

Versioning, change tracking, and approval workflows

Integration with data privacy and compliance frameworks (GDPR, HIPAA, etc.)

Callan M.

Good user experience but limited capabilities

July 17, 2024
Review provided by G2

What do you like best about the product?

Robust tool with great capabilities; it helped me automate some reports that I didn't need to be spending time on every week.

What do you dislike about the product?

Can be difficult to figure out what all the tool does. User experience could be better.

What problems is the product solving and how is that benefiting you?

It makes working with large data sets faster. I can get more insights faster and as a result have more time to do the other elements of my job.

Sabrine Bendimerad

Saves a lot of time because I can quickly handle all the data preparation tasks and concentrate on building my machine learning algorithms

June 11, 2024
Review provided by PeerSpot

What is our primary use case?

We use the solution for data science and machine learning.

How has it helped my organization?

We were a team of six data scientists and one data engineer. We focused on fully leveraging Dataiku for all our data science-related tasks. This included data preparation, preprocessing, benchmarking machine learning algorithms, handling everything related to production, and making our algorithms available to stakeholders.

What is most valuable?

The advantage is that you can focus on machine learning while having access to what they call 'recipes.' These recipes allow me to preprocess and prepare data without writing any code. This saves a lot of time because I can quickly handle all the data preparation tasks and concentrate on building my machine learning algorithms.

What needs improvement?

One of the main challenges was collaboration. Developers typically use GitHub to push and manage code, but integrating GitHub with Dataiku was complicated. While it was theoretically possible to use GitHub with Dataiku, in practice, it was difficult to manage our code effectively and push it from Dataiku to GitHub.

Another limitation was its ability to handle different types of data. While Dataiku is powerful for working with structured data, like regular or geospatial data, it struggled with more complex data types such as text and images. Alongside the challenges with GitHub integration, this limited support for diverse data types was another gap at that time.

For how long have I used the solution?

I have been using Dataiku for over a year.

What do I think about the stability of the solution?

Since Dataiku relies on various open-source libraries and tools, updates or upgrades to these components can sometimes impact the stability of Dataiku's features. This can make it challenging to maintain consistent stability, as changes in the underlying open-source tools can affect how Dataiku functions.

I rate the stability as six out of ten.

What do I think about the scalability of the solution?

There are some scalability issues.

I rate the scalability as seven out of ten.

How are customer service and support?

Technical support was very good compared to other tools. We had access to chat and support.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup is very easy. It has many tutorials and many guidelines. After the initial deployment, it took about a week to manage all the setup and resolve various issues before we had a stable version of Dataiku that we could use consistently.

I rate it eight out of ten, where ten is easy.

What's my experience with pricing, setup cost, and licensing?

It is very expensive.

What other advice do I have?

I wouldn't recommend using Dataiku if only one data scientist is on the team. However, having a larger team—let's say more than five data scientists—can be very helpful. Dataiku offers features that are especially useful when multiple people are working on the same project, and it also has tools that make it easier to move from the proof of concept stage to production.

Overall, I rate the solution as seven out of ten.
