
Overview
For more information or customized pricing, please email us: cpd_on_aws@wwpdl.vnet.ibm.com
IBM Cloud Pak for Data is a unified data and AI platform that connects the right data, at the right time, to the right people anywhere. Available on AWS and running on Red Hat OpenShift, the platform simplifies data access, automates data discovery and curation, and safeguards sensitive information by automating policy enforcement for all users in your organization. Make better data-driven decisions and lay the foundation for AI with a data fabric that connects siloed data on premises or across multiple clouds without data movement. Discover actionable insights and apply trusted data to build, run, automate, and manage AI models.
Outcomes:
- Data access and availability: Eliminate data silos and simplify your data landscape to enable faster, cost-effective extraction of value from your data.
- Data quality and governance: Apply governance solutions and methodologies to deliver trusted business data.
- Data privacy and security: Fully understand and manage sensitive data with a pervasive privacy framework.
- Batch data integration: Design, develop and run jobs that move and transform data with powerful automated integration capabilities.
- 360 entity data: Enable agility and accelerated ROI for consolidated and governed views of critical enterprise data.
Product Version 4.7.x
Standard Min: 48 VPCs; Enterprise Min: 72 VPCs
Already have a CP4D License? Deploy from the BYOL Listing today!
Highlights
- Deliver data responsibly with a data fabric: Unify and access disparate data with AutoSQL, a universal query engine. Discover and classify data in real time with Watson Knowledge Catalog. Protect sensitive data with automated policy enforcement.
- Scale trustworthy AI: Synchronize application and model pipelines while reducing drift, bias, and risk with ModelOps on Watson Studio. Monitor and govern AI models to meet regulations, manage risk and enhance transparency.
- Recognized by analysts as a Leader in core data and AI segments: The Forrester Wave™: Machine Learning Data Catalogs, Q4 2020; 2021 Gartner Magic Quadrant for Data Science and Machine Learning; The Forrester Wave™: Multimodal Predictive Analytics and Machine Learning, Q3 2020.
Pricing
| Dimension | Description | Cost/month |
|---|---|---|
| Standard Option | Cloud Pak for Data Standard Option: 48 VPCs | $19,824.00 |
| Enterprise Option | Cloud Pak for Data Enterprise Option: 72 VPCs | $59,400.00 |
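To compare the two options on a like-for-like basis, the listed figures can be reduced to an implied monthly cost per VPC and an annualized total. The sketch below is illustrative only: the function names are hypothetical, the per-VPC figure is derived arithmetic rather than a published rate, and the annual figure simply assumes twelve months at the listed price with no discounts or custom pricing.

```python
from decimal import Decimal

# Monthly prices and VPC counts copied from the pricing table above.
OPTIONS = {
    "Standard": {"vpcs": 48, "monthly_cost": Decimal("19824.00")},
    "Enterprise": {"vpcs": 72, "monthly_cost": Decimal("59400.00")},
}


def cost_per_vpc(option: str) -> Decimal:
    """Implied monthly cost per VPC (derived, not a published rate)."""
    entry = OPTIONS[option]
    return (entry["monthly_cost"] / entry["vpcs"]).quantize(Decimal("0.01"))


def annual_cost(option: str) -> Decimal:
    """Twelve months at the listed monthly price, assuming no discounts."""
    return OPTIONS[option]["monthly_cost"] * 12


for name in OPTIONS:
    print(f"{name}: ${cost_per_vpc(name)} per VPC per month, "
          f"${annual_cost(name)} per year")
```

On the listed figures, this works out to $413.00 per VPC per month for Standard and $825.00 for Enterprise.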
Vendor refund policy
Please contact your rep for any questions.
Delivery details
Software as a Service (SaaS)
SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.
Support
Vendor support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.


Customer reviews
Longstanding reporting platform has supported reliable dashboards and regulatory compliance
What is our primary use case?
The main use case for IBM Cognos is for business intelligence and reporting.
What is most valuable?
IBM Cognos has been available for many years, and we use it for regular dashboarding, scheduled reports, and some mandatory regulatory reporting. All our departments use Cognos, and the actual Cognos reports are developed by those different teams.
IBM Cognos is very stable and has been around for many years, with many users familiar with it, making it a reliable solution for our institution. Because of our long association with Cognos, we have good pricing.
The benefits of choosing IBM Cognos, in addition to saving on cost, include having institutional knowledge about maintaining this infrastructure and enough people who have developed on Cognos in the past, which creates comfort in its use. Cognos is a reliable solution, and developer productivity is high because of the long history of development on it.
What needs improvement?
I do not know if Cognos has all the features that users are looking for since we provide it as our standard and do not maintain infrastructure for other tools.
For how long have I used the solution?
I am actually very new to the organization and have been here for less than a year.
How are customer service and support?
I would rate IBM's support at about a seven or eight out of ten, closer to an eight. We have good support coverage owing to our long association with IBM, and their support team is very helpful.
How would you rate customer service and support?
Positive
How was the initial setup?
DataStage is not difficult to set up, but we had a lot of challenges in setting up the IBM Cloud Pak for Data cluster on-premises. Our infrastructure team struggled because we first had to stand up an OpenShift cluster on-premises before deploying the IBM Cloud Pak for Data solution.
The setup for IBM Cloud Pak for Data is very complex, and our teams responsible for standing up the environment struggled a lot. This might also be due to the learning curve since we had not used containerized solutions in the past.
What's my experience with pricing, setup cost, and licensing?
The pricing and setup cost are handled by a different procurement team. Our IT procurement team is centralized, so licensing and the actual cost of the software are taken care of by a different team altogether.
Which other solutions did I evaluate?
I am not sure about the main differences between IBM Cognos and other business intelligence tools such as Tableau or Microsoft's offerings, although many members of the user community used those reporting tools before joining our college. Because of the variety of cloud offerings, users who have the budget can often subscribe to reporting tools directly without approaching IT.
What other advice do I have?
I do not utilize Dell PowerStore or Dremio because I work in a university setting with a very simple infrastructure, where we just use Cognos and IBM DataStage.
I do not know if my organization uses AWS as a main cloud provider. We are not on the cloud in a major way and are still on-premises for most of our solutions. In fact, even for IBM DataStage, we are using the IBM Cloud Pak for Data version, but it is installed on-premises, and we have not made much progress on migrating to the cloud yet.
I am not sure if we use AWS as a cloud provider since we do have some SaaS applications that we subscribe to, but I do not know where they are hosted. I just know we have access to the application for the user interface, and the data is pulled out using an API, but we do not know where it is hosted.
I do not utilize Cognos ad hoc reporting because I do not develop reports. We only host the Cognos infrastructure for our different user groups, and the report development is completed by them. Our infrastructure team provides the hardware, and our system engineering team provides the installation and application maintenance for Cognos.
I think some users are using the interactive dashboards feature, and there are also other tools such as Power BI and Tableau that some users automatically use. However, our IT organization only provides Cognos as an enterprise business intelligence and reporting tool. Other tools are subscribed to separately by different people.
I am not the right person to speak on the machine learning capabilities, as my responsibility is to work with different IT teams who maintain systems across the university. I connect to them using IBM DataStage to fetch their data, perform ETL activities, and load the data into an Oracle database. My team maintains the infrastructure for DataStage and Cognos, but actual development is done by other people.
I use IBM DataStage, which we call IBM Cloud Pak for Data, as we migrated from InfoSphere DataStage to IBM Cloud Pak for Data, and it is installed on-premises in our data center. The IBM Cloud Pak for Data version is more or less a modern OpenShift cluster-based platform.
The best features of IBM Cloud Pak for Data include a very modern approach to providing data capabilities under one umbrella, with various services such as artificial intelligence and machine learning capabilities, real-time integration, and data virtualization, though each has separate licenses associated with them. We are currently only using the DataStage license.
We have not evaluated data virtualization, but I recognize it as a good capability for exploring and experimenting with data, especially for those unfamiliar with data modeling. However, we are not using it due to cost considerations.
The developer productivity for DataStage on IBM Cloud Pak for Data is the same as on the old tool, InfoSphere. It does not change anything because the core capabilities remain consistent.
Overall, I would rate Cognos a nine out of ten from a pure infrastructure stability and support perspective because we are comfortable and know what to do, considering the long-term use of Cognos.
Overall, I would rate IBM Cloud Pak for Data a nine out of ten in terms of capabilities. It mirrors the traditional InfoSphere version of DataStage with a good ETL tool that covers all features expected from such tools.
We did not purchase through a marketplace such as AWS. This is all from a long association with IBM directly through negotiations with our procurement team, as we have been a large IBM customer for many years. I would rate this review a nine out of ten overall.
Starts strong with data management capabilities but needs a demo database
What is our primary use case?
My primary use case for Cloud Pak is that I am the reference data steward for the Africa regions in the banks where I work. My main objective is to capture the reference data in Cloud Pak for Data and ensure that people profile or QA their data.
This is due to the fact that a large percentage of data is actually reference data, not by volume, but by the number of tables. The group-approved reference data is used to assure quality and ensure people know what they have; that's my primary use case for Cloud Pak.
What is most valuable?
There's a whole bunch of stuff I really like. I love the way that I can start at a very basic level with my data management journey by capturing my policies, justifying my data, and putting them into different categories to say this is data relating to individuals, for example, or data relating to geography. Those base-level data management components, together with the reference data, can then be reused whether I want to figure out where the data is coming from—using Nantucket, for example—or checking the quality of my data.
Often, when I check the quality of my data, I might find an issue, but that data did not originate in the system where I found the issue. So, I need to use Nantucket to track back to where that data originally came from so I can fix it at the source. I love that component of Cloud Pak.
I do not do much with the machine learning or AI pieces. It is probably because I can start at a basic level with data management: policies, rules, categories, reference data, and business terms. From there, I can work my way into a more granular level, applying all of that information on top of my actual data to understand what my data looks like, where it came from, and where it went wrong, managing it throughout the cycle.
What needs improvement?
What I would love to see is an end-to-end, almost a training demo database of some sort, where one of the biggest problems with data management is demonstrated.
There are so many components to data management, and more often than not, people understand one thing really well. They may understand DataStage and how to move data around, but they do not see the impact of moving data incorrectly.
They also do not see the impact of everyone understanding a piece of data in the same way. I would love Cloud Pak to come with a demo database that illustrates the different components of data management in a logical way, so I can see the whole picture instead of just the area I'm specializing in.
It would be great if Cloud Pak, from a data modeling point of view, allowed us to import our PDMs, for example. It would be ideal to import and create business terms in Cloud Pak. The PEA would be great to create the technical data. The association between the business and the technical metadata could then be automated by pulling it through from your ACE models. The data modeling component is available in Cloud Pak.
Additionally, even though Cloud Pak has the NextGen DataStage built into it, there is Cloud Pak for Integration as well. Currently, I do not think we have a full enough understanding of how CP4D and CP4I can enhance each other.
For how long have I used the solution?
I have used the solution since the end of 2021.
What do I think about the scalability of the solution?
Scalability is endless if I can pay for it. Obviously, it is just for containers; however, I have to pay more.
How are customer service and support?
The response time is quick; however, solving the problem is not always as fast. Cloud Pak is a complicated system, and it's often difficult to find the right resource in IBM to help with specific issues.
How would you rate customer service and support?
Neutral
How was the initial setup?
The setup was very complete and very complex.
What about the implementation team?
We did the implementation with IBM.
What's my experience with pricing, setup cost, and licensing?
The setup cost is very expensive. The cost depends on the pieces of the solution I'm using, how much data I have, and whether it's on the cloud or on-prem.
Which other solutions did I evaluate?
I've looked at Talend, Collibra, Denodo, Purview, and AWS Glue. The choice depends on the client's maturity in data management. If the client is only looking to do data quality as a small piece of data management, Denodo would be an excellent choice. If they are looking for end-to-end data management and have the technical resources to get Cloud Pak running and enabled with all functionalities, then definitely Cloud Pak.
What other advice do I have?
Cloud Pak is a very, very, very good system. I'm super impressed with it. The learning curve is high, but I gain so much when I finally figure it out.
Overall product rating: seven out of ten.
From Data Silos to Actionable Insights: IBM Cloud Pak for Data Delivers
Provides IBM Watson Catalog and data pipelines, but catalog searching needs to be improved
What is most valuable?
IBM Watson Catalog and data pipelines are the most valuable features of the solution.
What needs improvement?
Previously, we used to extract the information in the DSX and XML formats. IBM Cloud Pak for Data exports information mostly in ISX, which is an encrypted format. The only challenge with the tool is understanding the metadata through queries.
We have to go with the lineage and other packages that come with IBM. Previously, we created our own reports depending on the existing command line export of the mappings. The solution's catalog searching or map search needs to be improved.
For how long have I used the solution?
I have been using IBM Cloud Pak for Data for two years.
What do I think about the scalability of the solution?
We usually recommend the solution for medium and large-scale organizations.
How are customer service and support?
My current organization is a Gold Partner with IBM. Whenever we reach out to the support team, the turnaround time is about 24 to 48 hours, which is pretty decent.
I rate the solution’s technical support an eight to nine out of ten.
How would you rate customer service and support?
Positive
How was the initial setup?
The solution’s initial setup is easy.
What's my experience with pricing, setup cost, and licensing?
The solution's pricing is competitive with that of other vendors. The pricing also depends on the number of users.
What other advice do I have?
If people are staying with their existing setup, I would definitely suggest they go with IBM Cloud Pak for Data. I usually recommend the solution for the financial sector, where I worked for about ten years. I worked with IBM for almost eight years. Unless they want to migrate to a new product completely, I recommend IBM Cloud Pak for Data to explore current business. It is easy to integrate the tool with other solutions.
Except for metadata queries, metadata validations, and metadata integrations, I don't see any issues with the solution. I would recommend the solution to other users if it supports their existing infrastructure.
Some people don't want to put their data in the cloud because they are concerned about how the data is secured with encryption and decryption. For such cases, we have listed all the pros and cons of the solution to share with users.
Overall, I rate the solution a seven out of ten.