    IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

    Deployed on AWS
    IBM watsonx.data is an open, hybrid data lakehouse with built-in data fabric and multi-engine optimization to prepare structured and unstructured data for AI.
    4.4

    Overview

    IBM watsonx.data as a Service is an open, hybrid-cloud data lakehouse on AWS that combines lakehouse storage with integrated data fabric capabilities for governance, lineage, and data quality. Using open formats such as Apache Iceberg and Parquet, and engines including Presto SQL and Apache Spark, the platform provides governed access to structured, semi-structured, and unstructured data across hybrid, multi-cloud, and on-premises environments.

    watsonx.data is GenAI-ready, automating ingestion, preparation, and retrieval of unstructured data to fuel accurate generative AI. With vector search and multi-model capabilities through Cassandra (Astra DB) and Milvus, watsonx.data supports advanced RAG, similarity search, and real-time operational workloads. Internal testing shows improved accuracy over vector-only RAG by leveraging retrieval governance and integrated metadata.

    watsonx.data offers enterprise-grade deployment flexibility and security, including VPC-based deployments, AWS PrivateLink, and support for FedRAMP (Medium) and HIPAA for AWS GovCloud. Native AWS integrations, such as AWS Lake Formation and the Common Policy Gateway (CPG) for unified access control, enable real-time policy synchronization and full auditability. With multi-engine optimization across Presto and Spark, organizations can reduce data warehouse costs while scaling analytics and AI across their AWS footprint.

    Q: How does watsonx.data integrate with AWS-native services?

    The platform integrates with AWS Lake Formation for access management and metadata alignment, supports AWS PrivateLink for secure connectivity, and uses the Common Policy Gateway (CPG) for unified access control with real-time policy synchronization and full audit tracking.

    Q: What security and compliance capabilities are available?

    watsonx.data offers enterprise-grade deployment flexibility and security, including VPC-based deployments, AWS PrivateLink, and support for FedRAMP (Medium) and HIPAA for AWS GovCloud, to support regulated workloads.

    Q: What deployment options does watsonx.data support?

    IBM watsonx.data supports SaaS on AWS, in-customer VPC deployments on AWS and Azure, multi-cloud architectures, and on-premises deployments on Red Hat OpenShift. On-premises deployments can take advantage of existing IBM Power and IBM Fusion HCI environments to deliver optimized performance, while maintaining flexibility for data residency, security, and compliance requirements.

    Q: How does watsonx.data improve GenAI and RAG accuracy?

    watsonx.data enhances generative AI results by combining governed retrieval with integrated vector databases such as Milvus and Cassandra (Astra DB), enabling fusion of unstructured, structured, and metadata-rich context. Internal testing shows higher answer correctness compared to vector-only RAG by applying data fabric governance and optimized retrieval strategies.

    Highlights

    • Unify hybrid-cloud analytics through a single entry point: Access all enterprise data across AWS, on-premises, and multi-cloud environments through a shared metadata layer that supports open table formats such as Apache Iceberg and Parquet, enabling consistent analytics and governance without ETL.
    • Deploy and connect to AWS data sources in minutes: Begin querying data quickly by connecting AWS storage (e.g. Amazon S3) and analytics environments - including Db2 Warehouse on AWS and Netezza on AWS - within minutes, supported by built-in governance, security automation, and multi-engine execution through Presto and Spark.
    • Reduce the cost of your data warehouse by up to 50% through workload optimization: Lower analytics spend by offloading and optimizing workloads across fit-for-purpose engines (Presto, Spark) and storage tiers, enabling measurable cost reductions of up to 50% when augmenting traditional warehouse workloads.

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Buyer guide

    Gain valuable insights from real users who purchased this product, powered by PeerSpot.

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    IBM watsonx.data as a Service - GenAI Ready Data Lakehouse for AWS

    Pricing is based on the duration and terms of your contract with the vendor, and additional usage. You pay upfront or in installments according to your contract terms with the vendor. This entitles you to a specified quantity of use for the contract duration. Usage-based pricing is in effect for overages or additional usage not covered in the contract. These charges are applied on top of the contract price. If you choose not to renew or replace your contract before the contract end date, access to your entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (4)

    Dimension | Description | Cost/12 months
    Extra-small Watsonx.data installation | Watsonx.data Resource Units annual Contract "pack" of 2000 Resource Units | $2,000.00
    Small Watsonx.data installation | Watsonx.data Resource Units annual Contract "pack" of 20000 Resource Units | $20,000.00
    Medium Watsonx.data installation | Watsonx.data Resource Units annual Contract "pack" of 50000 Resource Units | $50,000.00
    Large Watsonx.data installation | Watsonx.data Resource Units annual Contract "pack" of 100000 Resource Units | $100,000.00

    Additional usage costs (1)

    The following dimensions are not included in the contract terms and are charged based on your usage.

    Dimension | Cost/unit
    Overage charge for overconsumption of contracted resource units | $1.10
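
    The contract packs and the overage charge listed above can be combined into a rough annual estimate. The sketch below is illustrative only: the function and dictionary names are hypothetical, and it assumes a single pack per contract, overage billed per resource unit beyond the pack at the listed rate, and no AWS infrastructure costs. Actual billing follows your contract terms with the vendor.

```python
# Annual contract packs from the pricing table:
# pack name -> (included resource units, contract price in USD per 12 months).
CONTRACT_PACKS = {
    "extra-small": (2_000, 2_000.00),
    "small": (20_000, 20_000.00),
    "medium": (50_000, 50_000.00),
    "large": (100_000, 100_000.00),
}

# Listed overage charge per resource unit consumed beyond the contracted pack.
OVERAGE_RATE_USD = 1.10


def annual_cost(pack: str, units_used: int) -> float:
    """Estimate 12-month cost: contract price plus overage on units beyond the pack.

    Assumption (not stated in the listing): overage applies per unit, linearly.
    """
    included_units, contract_price = CONTRACT_PACKS[pack]
    overage_units = max(0, units_used - included_units)
    return round(contract_price + overage_units * OVERAGE_RATE_USD, 2)


def cheapest_pack(units_used: int) -> str:
    """Return the pack name with the lowest estimated annual cost for the given usage."""
    return min(CONTRACT_PACKS, key=lambda pack: annual_cost(pack, units_used))


if __name__ == "__main__":
    # 20,000 included units plus 5,000 overage units at $1.10 each.
    print(annual_cost("small", 25_000))  # 25500.0
    print(cheapest_pack(48_000))
```

    Under these assumptions, each pack costs $1.00 per included resource unit while overage costs $1.10, so a larger pack is only cheaper once expected usage comes close to its size.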

    Vendor refund policy

    All orders are non-cancellable and all fees and other amounts that you pay are non-refundable.

    Custom pricing options

    Request a private offer to receive a custom quote.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Support

    Vendor support

    This product includes enterprise-grade support designed for fast deployment and low operational risk. Customers have access to comprehensive public documentation, step-by-step integration guides, and architecture references aligned with AWS best practices. Technical support is available through defined support channels with documented SLAs, and our team actively assists with onboarding, configuration, and troubleshooting.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 50 in Data Warehouses
    Top 10 in Databases & Analytics Platforms, ML Solutions, Data Analytics
    Top 10 in Data Analysis

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2, grouped into positive, mixed, and negative reviews across four categories: functionality, ease of use, customer service, and cost effectiveness.

    Overview

    AI generated from product descriptions
    Open Table Format Support
    Supports open table formats including Apache Iceberg and Parquet for consistent analytics and governance across hybrid-cloud environments without requiring ETL processes.
    Multi-Engine Query Optimization
    Provides multi-engine optimization across Presto SQL and Apache Spark to execute queries across structured, semi-structured, and unstructured data with workload-specific optimization.
    Vector Database Integration
    Integrates vector search and multi-model capabilities through Cassandra (Astra DB) and Milvus to support advanced retrieval-augmented generation (RAG), similarity search, and real-time operational workloads.
    Enterprise Security and Compliance
    Offers VPC-based deployments, AWS PrivateLink connectivity, and compliance support for FedRAMP (Medium) and HIPAA for AWS GovCloud environments.
    Unified Access Control and Governance
    Implements integrated data fabric with governance, lineage, and data quality capabilities, including AWS Lake Formation integration and Common Policy Gateway (CPG) for unified access control with real-time policy synchronization and audit tracking.
    Lakehouse Architecture
    Built on a lakehouse foundation providing unified data storage and governance across data engineering, analytics, BI, data science, and machine learning workloads.
    Open Source Integration
    Constructed on open source data projects and open standards to maximize flexibility and interoperability across the data ecosystem.
    Data Intelligence Engine
    Powered by a Data Intelligence Engine that enables organizational access to data and insights across diverse user roles and technical skill levels.
    Unified Data Platform
    Consolidates data, analytics, and AI workloads on a single common platform running on Amazon S3, eliminating traditional data silos.
    Collaborative Capabilities
    Provides native collaboration features enabling data teams to work together across the entire data and AI workflow.
    Workload Auto-scaling
    Intelligently autoscales workloads up and down across hybrid and public cloud environments for optimized cloud infrastructure utilization.
    Multi-function Analytics Platform
    Provides integrated data warehouse, machine learning, and custom analytics capabilities with unified analytic functions to eliminate data silos.
    Shared Data Experience (SDX)
    Implements security and governance policies that are set once and applied consistently across all data and workloads, with portability across supported infrastructures.
    Data Lifecycle Management
    Manages complete data lifecycle functions including ingestion, transformation, querying, optimization, and predictive analytics across multiple cloud environments.
    Unified Security and Governance
    Ensures all workloads share common security, governance, and metadata with capabilities for data discovery, curation, and self-service access controls.

    Contract

    Standard contract: No | No | No

    Customer reviews

    Ratings and reviews

    4.4 (165 ratings)
    5 star: 58%
    4 star: 38%
    3 star: 3%
    2 star: 1%
    1 star: 0%
    2 AWS reviews | 163 external reviews
    External reviews are from G2 and PeerSpot.
    Sunandan G.

    Complex Setup and Rising Costs at Scale Despite a Strong Lakehouse Foundation

    Reviewed on Apr 26, 2026
    Review provided by G2
    What do you like best about the product?
    I like its open lakehouse architecture, which lets you query data across multiple sources without moving it.
    It also delivers strong performance with built-in query optimization and integrates easily with existing data tools, making analytics faster and simpler.
    What do you dislike about the product?
    Setup and configuration can feel complex, especially for smaller teams without strong data engineering support.
    It can also become expensive at scale, particularly when handling large workloads or advanced features.
    What problems is the product solving and how is that benefiting you?
    It solves the problem of scattered data by letting you access and query data across different storage systems without moving it into one place.
    This benefits you by reducing data duplication, lowering costs, and enabling faster, more efficient analytics and decision-making.
    Rahul S.

    Scalable Platform with Robust Analytics, Needs Setup Improvement

    Reviewed on Apr 23, 2026
    Review provided by G2
    What do you like best about the product?
    I use IBM watsonx.data to centralize and manage both structured and unstructured data in a unified lakehouse for analytics and AI workloads. I like its ability to combine the flexibility of a data lake with the performance of a data warehouse in a single platform. It helps me access, process, and analyze data across hybrid environments to generate faster insights and support data-driven decisions. It also offers strong query optimization and supports open data formats, making it easy to scale analytics across hybrid environments. Additionally, it integrates well with BI tools for visualization, helping turn processed data into actionable insights. Transitioning to IBM watsonx.data helped me gain more flexibility and scalability, handle growing data volumes more efficiently while reducing costs, and support modern analytics and AI workloads.
    What do you dislike about the product?
    The setup and initial configuration can be a bit complex, especially for teams new to lakehouse architectures. Additionally, improving documentation, UI intuitiveness, and integration with some third-party tools would make the overall experience smoother. The initial setup was moderately complex and required some familiarity with data architecture and cloud environments. While the documentation helps, the process can be time-consuming, especially when configuring integrations and optimizing performance for specific workloads.
    What problems is the product solving and how is that benefiting you?
    I use IBM watsonx.data to centralize data in a unified lakehouse for analytics, solving the challenge of managing large data volumes by unifying lakes and warehouses. It improves query performance and reduces costs with efficient data access and workload optimization.
    Atul K.

    Flexible Lakehouse Platform with Good Performance and Scalability

    Reviewed on Apr 23, 2026
    Review provided by G2
    What do you like best about the product?
    What I like most about IBM watsonx.data is how it brings together a lakehouse approach without making things overly complicated. It feels flexible enough to handle both structured and unstructured data, and the performance with query engines is quite solid, especially when working with large datasets.
    What do you dislike about the product?
    Initial setup can feel a bit complex, especially for new users. Also, performance tuning and cost optimization sometimes require extra effort compared to more mature, plug-and-play platforms.
    What problems is the product solving and how is that benefiting you?
    It helps consolidate data from multiple sources into one platform, reducing silos and improving data accessibility. This makes analysis faster and more reliable, which ultimately supports better decision-making and reduces overall data management costs.
    Bhavya S.

    Flexible Integration, Complex Learning Curve

    Reviewed on Apr 22, 2026
    Review provided by G2
    What do you like best about the product?
    I like that IBM watsonx.data allows us to access data from multiple sources and can run on cloud and hybrid environments. I also appreciate its open and flexible architecture. It helps me connect data across sources and manage it effectively.
    What do you dislike about the product?
    The initial learning curve can be complex for beginners and could be simplified with step-by-step instructions. The AWS S3 connection needs fixing, and the connectors need to be more stable and plug-and-play. The setup was not instant; it was somewhat complex.
    What problems is the product solving and how is that benefiting you?
    I use IBM watsonx.data to search and organize data. It lets me connect data across sources and manage it effectively.
    Preeti Y.

    Scalable Data Management with IBM watsonx.data

    Reviewed on Apr 22, 2026
    Review provided by G2
    What do you like best about the product?
    I use IBM watsonx.data as a unified data platform to manage, access, and analyze large volumes of structured and unstructured data. I like its ability to unify data across multiple sources without requiring heavy data movement, which makes it easier to access and analyze data efficiently while maintaining performance. I also appreciate the scalability and flexibility it offers for handling large and diverse datasets. The platform supports both analytics and AI workloads in a structured way. Its data governance capabilities help ensure data reliability and security, enabling more efficient and data-driven decision-making. The initial setup was relatively smooth with proper planning and guidance, providing a structured setup process that made it easier to configure core components and connect data sources.
    What do you dislike about the product?
    IBM watsonx.data is a strong and scalable platform overall. Some advanced features may require initial familiarity to fully utilize, so a bit of onboarding or guidance can be helpful. Additionally, having more simplified out-of-the-box configurations for certain use cases could further enhance ease of use. Overall, these are minor areas, and the platform continues to evolve with improvements that enhance usability and performance.
    What problems is the product solving and how is that benefiting you?
    I use IBM watsonx.data to unify and manage large volumes of data across systems without needing to move it, reducing silos and improving efficiency. It supports data-driven decision-making and analytics, enabling AI applications with scalable, reliable data.