IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

IBM watsonx.data is an open, hybrid data lakehouse with built-in data fabric and multi-engine optimization to prepare structured and unstructured data for AI.

4.4

Overview

IBM watsonx.data as a Service is an open, hybrid-cloud data lakehouse on AWS that combines lakehouse storage with integrated data fabric capabilities for governance, lineage, and data quality. Using open formats such as Apache Iceberg and Parquet, and engines including Presto SQL and Apache Spark, the platform provides governed access to structured, semi-structured, and unstructured data across hybrid, multi-cloud, and on-premises environments.

watsonx.data is GenAI-ready, automating ingestion, preparation, and retrieval of unstructured data to fuel accurate generative AI. With vector search and multi-model capabilities through Cassandra (Astra DB) and Milvus, watsonx.data supports advanced RAG, similarity search, and real-time operational workloads. Internal testing shows improved accuracy over vector-only RAG by leveraging retrieval governance and integrated metadata.

watsonx.data offers enterprise-grade deployment flexibility and security, including VPC-based deployments, AWS PrivateLink, and support for FedRAMP (Medium) and HIPAA for AWS GovCloud. Native AWS integrations, such as AWS Lake Formation and the Common Policy Gateway (CPG) for unified access control, enable real-time policy synchronization and full auditability. With multi-engine optimization across Presto and Spark, organizations can reduce data warehouse costs while scaling analytics and AI across their AWS footprint.

Q: How does watsonx.data integrate with AWS-native services?

The platform integrates with AWS Lake Formation for access management and metadata alignment, supports AWS PrivateLink for secure connectivity, and uses the Common Policy Gateway (CPG) for unified access control with real-time policy synchronization and full audit tracking.

Q: What security and compliance capabilities are available?

watsonx.data offers VPC-based deployments and AWS PrivateLink connectivity, with compliance support for FedRAMP (Medium) and HIPAA for AWS GovCloud environments.

Q: What deployment options does watsonx.data support?

IBM watsonx.data supports SaaS on AWS, in-customer VPC deployments on AWS and Azure, multi-cloud architectures, and on-premises deployments on Red Hat OpenShift. On-premises deployments can take advantage of existing IBM Power and IBM Fusion HCI environments to deliver optimized performance, while maintaining flexibility for data residency, security, and compliance requirements.

Q: How does watsonx.data improve GenAI and RAG accuracy?

watsonx.data enhances generative AI results by combining governed retrieval with integrated vector databases such as Milvus and Cassandra (Astra DB), enabling fusion of unstructured, structured, and metadata-rich context. Internal testing shows higher answer correctness compared to vector-only RAG by applying data fabric governance and optimized retrieval strategies.
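
The idea of governed retrieval described above can be sketched in a few lines of Python. This is an illustrative toy only, not the watsonx.data, Milvus, or Astra DB API: the in-memory corpus, the role names, the `allowed_roles` field, and the `governed_search` function are all assumptions made up for the example. It shows one plausible ordering, in which access metadata filters the candidates first and vector similarity ranks what remains.

```python
import math
from typing import List, Sequence, Tuple

# Toy in-memory corpus. Vectors, ids, and role names are invented for illustration.
DOCS = [
    {"id": "policy-1", "vector": [0.9, 0.1, 0.0], "allowed_roles": {"analyst", "admin"}},
    {"id": "hr-7", "vector": [0.8, 0.2, 0.1], "allowed_roles": {"admin"}},
    {"id": "faq-3", "vector": [0.1, 0.9, 0.2], "allowed_roles": {"analyst", "admin"}},
]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def governed_search(query: Sequence[float], role: str, k: int = 2) -> List[Tuple[str, float]]:
    """Drop documents the role may not read, then rank the rest by similarity."""
    permitted = [d for d in DOCS if role in d["allowed_roles"]]
    scored = [(d["id"], cosine(query, d["vector"])) for d in permitted]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    # The analyst never sees "hr-7", even though it is close to the query vector.
    print([doc_id for doc_id, _ in governed_search([1.0, 0.0, 0.0], "analyst")])
    print([doc_id for doc_id, _ in governed_search([1.0, 0.0, 0.0], "admin")])
```

Filtering before ranking means a restricted document cannot reach the prompt regardless of how similar it is to the query, which is the property a vector-only pipeline lacks.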

Highlights

Unify hybrid-cloud analytics through a single entry point: Access all enterprise data across AWS, on-premises, and multi-cloud environments through a shared metadata layer that supports open table formats such as Apache Iceberg and Parquet, enabling consistent analytics and governance without ETL.
Deploy and connect to AWS data sources in minutes: Begin querying data quickly by connecting AWS storage (e.g. Amazon S3) and analytics environments - including Db2 Warehouse on AWS and Netezza on AWS - within minutes, supported by built-in governance, security automation, and multi-engine execution through Presto and Spark.
Reduce the cost of your data warehouse by up to 50% through workload optimization: Lower analytics spend by offloading and optimizing workloads across fit-for-purpose engines (Presto, Spark) and storage tiers, enabling measurable cost reductions of up to 50% when augmenting traditional warehouse workloads.

Details

Sold by

IBM Software

Features and programs

Buyer guide

Gain valuable insights from real users who purchased this product, powered by PeerSpot.

Financing for AWS Marketplace purchases

AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

Pricing

IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

Pricing is based on the duration and terms of your contract with the vendor, and additional usage. You pay upfront or in installments according to your contract terms with the vendor. This entitles you to a specified quantity of use for the contract duration. Usage-based pricing is in effect for overages or additional usage not covered in the contract. These charges are applied on top of the contract price. If you choose not to renew or replace your contract before the contract end date, access to your entitlements will expire.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

12-month contract (4)

Dimension	Description	Cost/12 months
Extra-small Watsonx.data installation	Watsonx.data Resource Units annual Contract "pack" of 2000 Resource Units	$2,000.00
Small Watsonx.data installation	Watsonx.data Resource Units annual Contract "pack" of 20000 Resource Units	$20,000.00
Medium Watsonx.data installation	Watsonx.data Resource Units annual Contract "pack" of 50000 Resource Units	$50,000.00
Large Watsonx.data installation	Watsonx.data Resource Units annual Contract "pack" of 100000 Resource Units	$100,000.00

Additional usage costs (1)

The following dimensions are not included in the contract terms and are charged based on your usage.

Dimension	Cost/unit
Overage charge for overconsumption of contracted resource units	$1.10
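
As a rough worked example of how the contract price and the overage charge combine, the sketch below uses the pack sizes and prices from the 12-month table and the $1.10 overage rate listed above. It assumes the overage "unit" is one resource unit and that a contract holds a single pack; the `annual_cost` function and the pack keys are hypothetical names for illustration, and AWS infrastructure costs are not included.

```python
# Contract packs from the 12-month table: key -> (included resource units, price in USD).
PACKS = {
    "extra-small": (2_000, 2_000.00),
    "small": (20_000, 20_000.00),
    "medium": (50_000, 50_000.00),
    "large": (100_000, 100_000.00),
}

# Assumption: the listed $1.10 applies per resource unit consumed beyond the pack.
OVERAGE_RATE = 1.10


def annual_cost(pack: str, units_used: int) -> float:
    """Return the contract price plus any overage for one 12-month term."""
    included_units, contract_price = PACKS[pack]
    overage_units = max(0, units_used - included_units)
    return round(contract_price + overage_units * OVERAGE_RATE, 2)


if __name__ == "__main__":
    print(annual_cost("small", 18_000))  # usage within the pack: contract price only
    print(annual_cost("small", 25_000))  # 5,000 units over the pack are billed as overage
```

Each pack works out to $1.00 per resource unit, so under these assumptions overage units cost 10% more than contracted units.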

Vendor refund policy

All orders are non-cancellable and all fees and other amounts that you pay are non-refundable.

Custom pricing options

Request a private offer to receive a custom quote.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

Software as a Service (SaaS)

SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

Resources

Vendor resources

IBM watsonx.data SaaS on AWS documentation

IBM watsonx.data Software documentation

watsonx.data community

Support

Vendor support

This product includes enterprise-grade support designed for fast deployment and low operational risk. Customers have access to comprehensive public documentation, step-by-step integration guides, and architecture references aligned with AWS best practices. Technical support is available through defined support channels with documented SLAs, and our team actively assists with onboarding, configuration, and troubleshooting.

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Product comparison

Updated weekly

IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

By IBM Software

Databricks Data Intelligence Platform

By Databricks, Inc.

Cloudera on AWS

By Cloudera

Accolades

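Product	Accolade
IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS	Top in Data Warehouses
Databricks Data Intelligence Platform	Top in Databases & Analytics Platforms, ML Solutions, Data Analytics
Cloudera on AWS	Top in Data Analysis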

Customer reviews

Sentiment is AI generated from actual customer reviews on AWS and G2, covering functionality, ease of use, customer service, and cost effectiveness.

Product	Reviews
IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS	167
Databricks Data Intelligence Platform	500
Cloudera on AWS	133

Overview

AI generated from product descriptions

IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

Open Table Format Support

Supports open table formats including Apache Iceberg and Parquet for consistent analytics and governance across hybrid-cloud environments without requiring ETL processes.

Multi-Engine Query Optimization

Provides multi-engine optimization across Presto SQL and Apache Spark to execute queries across structured, semi-structured, and unstructured data with workload-specific optimization.

Vector Database Integration

Integrates vector search and multi-model capabilities through Cassandra (Astra DB) and Milvus to support advanced retrieval-augmented generation (RAG), similarity search, and real-time operational workloads.

Enterprise Security and Compliance

Offers VPC-based deployments, AWS PrivateLink connectivity, and compliance support for FedRAMP (Medium) and HIPAA for AWS GovCloud environments.

Unified Access Control and Governance

Implements integrated data fabric with governance, lineage, and data quality capabilities, including AWS Lake Formation integration and Common Policy Gateway (CPG) for unified access control with real-time policy synchronization and audit tracking.

Databricks Data Intelligence Platform

Lakehouse Architecture

Unified data foundation built on a lakehouse architecture, providing an open foundation for data and governance with support for open standards and formats.

Data Intelligence Engine

Powered by the Data Intelligence Engine, which enables organization-wide access to data and insights across all users and roles.

Multi-Workload Unification

Consolidates data engineering, analytics, business intelligence, data science and machine learning workloads on a single common platform

Collaborative Development Environment

Native collaboration capabilities enabling data teams to collaborate across the entire data and AI workflow.

Open Source Foundation

Built on open source data projects and open standards to maximize flexibility and interoperability with existing data ecosystems

Cloudera on AWS

Workload Auto-scaling

Intelligently autoscales workloads up and down across hybrid and public cloud environments for optimized cloud infrastructure utilization.

Multi-function Analytics Platform

Provides integrated data warehouse, machine learning, and custom analytics capabilities with unified analytic functions to eliminate data silos.

Shared Data Experience (SDX)

Implements security and governance policies that are set once and applied consistently across all data and workloads, with portability across supported infrastructures.

Data Lifecycle Management

Manages complete data lifecycle functions including ingestion, transformation, querying, optimization, and predictive analytics across multiple cloud environments.

Unified Security and Governance

Ensures all workloads share common security, governance, and metadata with capabilities for data discovery, curation, and self-service access controls.

Contract

Standard contract

Customer reviews

Ratings and reviews

4.4

174 ratings

5 star

4 star

3 star

2 star

1 star

57%

39%

3 AWS reviews

171 external reviews

External reviews are from G2 and PeerSpot.

Nikita S.

Open Lakehouse Architecture with Seamless Integration and High-Performance Querying

Reviewed on Jul 26, 2026

Review provided by G2

What do you like best about the product?

I like its open lakehouse architecture, seamless integration with multiple data sources, high-performance querying, and scalability. Together, these strengths make data management and AI analytics more efficient.

What do you dislike about the product?

The setup can feel complex, and some of the more advanced features come with a steep learning curve. The interface and documentation could also be made more beginner-friendly, as they aren’t always easy to navigate when you’re just getting started.

What problems is the product solving and how is that benefiting you?

It helps break down data silos and makes it easier to access large datasets. As a result, I can analyze data more efficiently, with better performance and less time spent when working on AI and analytics projects.

Information Technology and Services

Robust Data Storage and Maintenance for Managing Complex Data Flows

Reviewed on Jul 24, 2026

Review provided by G2

What do you like best about the product?

IBM watsonx.data has robust data storage and maintenance capabilities. It’s a powerful tool that has helped me manage data flow for semantic platforms and for the tools built for business intelligence and reporting.

What do you dislike about the product?

The ecosystem and setup process feel somewhat complex. There’s a slow learning curve to get fully engaged, and the UI is less intuitive compared to other available tools that offer similar functionality.

What problems is the product solving and how is that benefiting you?

It helps organize and process large-scale TPA data by unifying it in a single platform, where later stages of ETL processes can run smoothly. It also serves as a single, governed data layer that is retrieved from many different sources.

Chirag S.

Flexible Open Lakehouse with Iceberg Support and Multi-Engine Choice

Reviewed on Jul 23, 2026

Review provided by G2

What do you like best about the product?

Its focus is on giving organizations flexibility without forcing them into a single storage format or query engine. A few aspects stand out as particularly compelling. The open data lakehouse architecture is designed to work with open table formats such as Apache Iceberg, which helps reduce vendor lock-in and makes data more portable across different tools and platforms. The separation of storage and compute also matters: you can scale compute resources independently of storage, which can improve cost efficiency for workloads that fluctuate over time. Finally, instead of relying on one query engine, it supports multiple engines optimized for different workloads, letting users choose the best fit for analytics, SQL, or AI use cases.

What do you dislike about the product?

IBM watsonx.data has several strengths, but it also comes with trade-offs that some users and organizations may find limiting. One is complexity: compared with fully managed cloud data warehouses, watsonx.data can require more upfront planning and ongoing operational expertise, particularly when you’re configuring multiple query engines, storage layers, and governance components. Another is the learning curve: teams that aren’t already familiar with lakehouse concepts, Apache Iceberg, or IBM’s data ecosystem may need additional time before they can become fully productive.

What problems is the product solving and how is that benefiting you?

IBM watsonx.data helps solve the problem of fragmented data and inefficient analytics by offering a unified, open lakehouse platform. For me, the main benefits are that it makes data easier to access, improves performance for AI and analytics workloads, helps lower infrastructure costs, and provides flexibility by supporting open data formats.

reviewer2715654

Collaborative analytics workspace has improved campaign insights and saves weekly manual effort

Reviewed on Jun 19, 2026

Review provided by PeerSpot

What is our primary use case?

IBM Watson Studio is our main platform for analytics workflows as a marketing agency. We use the platform's machine learning and data visualization capabilities, primarily for analytics and analyzing campaign performance.

A specific example of how I use IBM Watson Studio for campaign performance analytics is that because we use different channels and have different customers, we need one source where we can collect and view all the data. For this reason, we recently started using IBM Watson Studio.

I have nothing else to add about my main use case or how I integrate IBM Watson Studio with my other tools.

What is most valuable?

One of the best features IBM Watson Studio offers is the ability to collaborate across teams using a centralized workspace.

The centralized workspace helps my team collaborate because we did not need to spend excessive time on manual processes. This helped us collaborate across teams by selecting which data and which channels should be reflected in IBM Watson Studio. In this way, we saved time and could easily see campaign outcomes and make better data-driven marketing decisions.

IBM Watson Studio has positively impacted my organization by being time-efficient and enabling collaboration, as we can see everything in one screen. It helped improve our efficiency and provided deeper customer insights that enable better decision-making. It definitely helped our weekly time efficiency by saving manual workload because we have a lot of work going on. It really helped us in analyzing the data and analytics.

What needs improvement?

IBM Watson Studio can be improved because there is currently a learning curve. It would be better if it were not so difficult to learn for people without a data background or limited technical experience.

I do not have anything more to add about the needed improvements, including around documentation, support, or user interface.

For how long have I used the solution?

I have been using IBM Watson Studio for six months.

What do I think about the stability of the solution?

IBM Watson Studio is definitely stable.

What do I think about the scalability of the solution?

The scalability of IBM Watson Studio is good. We started using it during a period of fast growth and scaling, so it was the right time for a company in our position to implement it.

How are customer service and support?

The customer support was good in terms of helping answer any questions my team had.

Which solution did I use previously and why did I switch?

I did not previously use a different solution; IBM Watson Studio was our first solution in this area.

How was the initial setup?

My experience with pricing, setup cost, and licensing is that I think it is expensive.

Regarding pricing, because it is IBM, it is justified. However, it is an expensive cost. The goals and what we achieved through it justify the price.

What was our ROI?

I have seen a return on investment through time saved. With the time saved, my employees and I can put more time into other responsibilities.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing is that I think it is expensive.

Regarding pricing, because it is IBM, it is justified. However, it is an expensive cost. The goals and what we achieved through it justify the price.

Which other solutions did I evaluate?

We did not evaluate any other options before choosing IBM Watson Studio.

What other advice do I have?

I would rate IBM Watson Studio an eight out of ten.

I chose eight because I think it is great in terms of all the things I described, and the only two points I subtracted are due to the learning curve.

Regarding IBM Watson Studio's AI capabilities, IBM is a very trustworthy company. The AI capabilities were particularly valuable for our marketing analytics workflows. The platform's AutoAI features helped accelerate model development for my team by automating data preparation and model selection. This allowed my team to focus more on campaign strategy and insights, which was what we needed to do.

The accuracy and reliability of output for IBM Watson Studio is definitely reliable because from a governance perspective, IBM Watson Studio provides strong controls around model management and monitoring.

The advice I would give to others looking into using IBM Watson Studio is that they need to have a good team that can build the usage of this because it is not something you can start using immediately. You need to learn, as there is a learning curve.

Hennie Du Toit

AI-driven monitoring has reduced manual rule maintenance and now supports multi-tenant operations

Reviewed on Jun 18, 2026

Review provided by PeerSpot

What is our primary use case?

I have been in IT in this particular sphere for my whole career, basically spanning over 20 years. I remember approximately how much time deployment for IBM Watson Studio required, and it was a couple of days probably. The project span was complex given our environment, so we use it for multi-tenancy purposes. It was not that easy to do.

What is most valuable?

The best features in IBM Watson Studio for me personally are moving away from the alarm dictionary or moving away from the rule-based alarms to more the AI Ops portion where you have IBM Watson Studio with some of the machine learning to do the correlations and learning seasonality, et cetera. Having a smarter technology rather than strictly rule-based, fixed scenarios of reducing events has been beneficial.

The prebuilt model templates in IBM Watson Studio have helped us reduce time-to-value for our team by making it a lot easier for us to manage because previously, we had to build a lot of the rule-based correlations in a different tool, and now we have ported that into AI Ops IBM Watson Studio. It is looking a lot easier for us to manage that.

I do see some positive impact after implementing IBM Watson Studio. Otherwise, we would have moved on.

What needs improvement?

I face some difficulties and room for improvement in IBM Watson Studio. A lot of the functions they did bring in are what we asked for, and I think a lot of them are roadmap items, but perhaps tighter integrations to some of the products that they also own, such as Instana or Turbonomic, would be great. I think you still have to configure a lot of the webhooks, for example, where it would be nice if it was an out-of-the-box integration.

I assess the flexibility of IBM Watson Studio in integrating with open-source machine learning tools and frameworks, and I find that it is not always that easy, but with the PMRs, they normally help you quite quickly to solve it.

For how long have I used the solution?

I can tell you that I have been working with IBM Watson Studio for many years, and we still make use of the platform.

What do I think about the scalability of the solution?

I can confirm that IBM Watson Studio is a scalable product. We have a multi-tenanted environment operating across many markets, so for us, it is definitely one of the big benefits.

What about the implementation team?

For the deployment, a combined team with probably about five or six people was involved. This included engineers and administrators from our team and the IBM consulting team as well.

What was our ROI?

I do not track any return on investment or cost reductions after implementing the product because I am not personally involved in that. That is more on our finance side that they do that.

What's my experience with pricing, setup cost, and licensing?

My thoughts about licensing cost are that it is a bit of a tricky question to be honest, because it depends on what you compare it to. For the product suite, I think we have negotiated a good price. Obviously, all businesses want the price to go lower, so I think it is not that bad.

What other advice do I have?

Regarding deployment, we have not done a new deployment in the last couple of years since we upgraded to AI Ops, so I cannot really answer that for the recent. There were some challenges with the containerized solution, but we managed to sort it out.

My preference for the deployment model of IBM Watson Studio is that for IBM software or that portion, we still have on-prem, but obviously, if it makes sense, we deploy a SaaS service. For IBM, we still have on-prem at the moment.

I have not used the AutoAI feature in IBM Watson Studio closely at the moment with the tooling implementation, but I think it is something they were looking at. I am not sure if it was deployed.

We use some automated reports and things to evaluate the effectiveness of IBM Watson Studio's model development capabilities. We use BI reports to verify that it is effective, and we do some retrospective checks.

My understanding of integration with Instana, particularly, is that Instana and Turbonomic are part of their product suite because they also own them and bought them a couple of years ago.

I would rate this product an 8 out of 10.
