AWS Architecture Blog
Deploying IBM Cloud Pak for Data on Red Hat OpenShift Service on AWS
Editor’s note, October 2024: This post is now obsolete. For the latest post, refer to Deploying IBM Cloud Pak for Data on Red Hat OpenShift Service on AWS.
Amazon Web Services (AWS) customers who want to deploy and use IBM Cloud Pak for Data (CP4D) on the AWS Cloud can use Red Hat OpenShift Service on AWS (ROSA).
ROSA is a fully managed service, jointly supported by AWS and Red Hat. It is managed by Red Hat Site Reliability Engineers and provides a pay-as-you-go pricing model, as well as a unified billing experience on AWS.
With ROSA, customers do not need to manage the lifecycle of Red Hat OpenShift Container Platform clusters. Instead, they are free to focus on developing new solutions and innovating faster, using IBM’s integrated data and artificial intelligence platform on AWS, to differentiate their business and meet their ever-changing enterprise needs.
In this post, we explain how to create a ROSA classic cluster and install an instance of IBM Cloud Pak for Data.
Cloud Pak for Data architecture
Here, we are implementing a highly available ROSA classic cluster with three Availability Zones (AZs), three master nodes, three infrastructure nodes, and three worker nodes.
Review the AWS Regions and Availability Zones documentation and the regions where ROSA is available to choose the best region for your deployment.
Figure 1 shows the solution’s architecture.
In our scenario, we are building a public ROSA classic cluster, with internet-facing Elastic Load Balancers providing access to our cluster. Consider using a ROSA private cluster when you are deploying CP4D in your AWS account.
We are using Amazon Elastic Block Store (Amazon EBS) and Amazon Elastic File System (Amazon EFS) for the cluster’s persistent storage. Review the IBM documentation for information about supported storage options.
Also, before deploying CP4D for production workloads, review the AWS prerequisites for ROSA and follow the Security best practices in IAM documentation to protect your AWS account.
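As a minimal sketch of how the architecture above could map to `rosa create cluster` flags, the following Python snippet builds and prints the command without running it. The cluster name `cp4d-demo`, the Region `us-east-1`, and the helper `rosa_create_cluster_args` are illustrative assumptions, and the flag selection is not necessarily the one used for the deployment in this post. Check `rosa create cluster --help` for your CLI version. Only the worker count appears, because the control plane (master) and infrastructure nodes are provisioned by the service.

```python
import shlex


def rosa_create_cluster_args(name, region, workers=3, private=False):
    """Build (but do not run) a `rosa create cluster` command line.

    Hypothetical helper, for illustration only.
    """
    if workers < 3 or workers % 3 != 0:
        # Multi-AZ clusters spread worker nodes evenly over three zones.
        raise ValueError("workers must be a positive multiple of 3")
    args = [
        "rosa", "create", "cluster",
        "--cluster-name", name,
        "--region", region,
        "--sts",             # use AWS STS short-lived credentials
        "--mode", "auto",    # create operator roles and OIDC provider automatically
        "--multi-az",        # spread the cluster over three Availability Zones
        "--replicas", str(workers),
    ]
    if private:
        args.append("--private")  # private cluster instead of a public one
    return args


# Assumed example values; replace them with your own.
print(shlex.join(rosa_create_cluster_args("cp4d-demo", "us-east-1")))
```

Passing `private=True` adds the `--private` flag, which corresponds to the private cluster option mentioned above.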
Cost
You are responsible for the cost of the AWS services used when deploying CP4D in your AWS account. For cost estimates, see the pricing pages for each AWS service you use.
Prerequisites
Before getting started, review the following prerequisites for this solution:
- This blog assumes familiarity with: CP4D, Terraform, Amazon Elastic Compute Cloud (Amazon EC2), Amazon EBS, Amazon EFS, Amazon Virtual Private Cloud, and AWS Identity and Access Management (IAM).
- Access to an AWS account, with permissions to create the resources described in the installation steps section.
- An AWS IAM user, with the permissions described in the AWS prerequisites for ROSA documentation.
- Verification of the required AWS service quotas to deploy ROSA. You can request service-quota increases from the AWS console.
- Access to an IBM entitlement API key: either a 60-day trial or an existing entitlement.
- Access to a Red Hat ROSA token; you can register on the Red Hat website to obtain one.
- A bastion host to run the CP4D installer; we have used an AWS Cloud9 workspace. You can use another device, provided it supports the required software packages:
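If you use a different bastion host, a quick check that the command-line tools are installed can save time later. The sketch below is a hedged example: the tool list (`aws`, `rosa`, `oc`, `podman`) is our assumption about what the installation typically needs, not a reproduction of the original package list, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumed tool list; adjust it to the packages your installation requires.
ASSUMED_TOOLS = ["aws", "rosa", "oc", "podman"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH, in input order."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools(ASSUMED_TOOLS)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All assumed tools are available.")
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use, call `missing_tools` with the tool list alone.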
Installation steps
Complete the following steps to deploy CP4D on ROSA:
- Navigate to the ROSA console to enable the ROSA service:
- Click Get started.
- On the Verify ROSA prerequisites page, select I agree to share my contact information with Red Hat.
- Choose Enable ROSA.
- Create an AWS Cloud9 environment to run your CP4D installation. We’ve used a t3.medium instance (Figure 2).
Figure 2. Create an AWS Cloud9 environment
- After your AWS Cloud9 environment is up, close the Welcome tab, open a new Terminal tab, and install the required packages:
- Create an IAM policy named cp4d-installer-permissions with the following permissions:
- Create an IAM role:
1. Select an AWS service and Amazon EC2, then click Next: Permissions.
2. Select the cp4d-installer-permissions policy, and click Next.
3. Name it cp4d-installer, and click Create role.
- From your AWS Cloud9 IDE, click the circle button on the top right, and select Manage EC2 Instance (Figure 3).
- On the Amazon EC2 console, select the AWS Cloud9 instance, then choose Actions / Security / Modify IAM Role.
- Choose cp4d-installer from the IAM role drop-down list, and click Update IAM role (Figure 4).
Figure 4. Attach the IAM role to your workspace
- Update the IAM settings for your AWS Cloud9 workspace:
- Set up your AWS environment:
- Navigate to the Red Hat Hybrid Cloud Console, and copy your OpenShift Cluster Manager API Token.
- Use the token and log in to your Red Hat account:
- Verify that your AWS account satisfies the quotas to deploy your cluster:
- When deploying ROSA for the first time, create the account-wide roles:
- Create your ROSA cluster:
- Once your cluster is ready, create a cluster-admin user and take note of the cluster API URL, username, and password:
- Log in to your cluster using the login information from the previous step. For example:
- Create an inbound rule in your worker nodes security group, allowing NFS traffic from your cluster’s VPC CIDR:
- Create an Amazon EFS file system:
- Log in to the Container software library on My IBM and copy your API key.
- In this blog, we are installing CP4D with IBM Watson Machine Learning and IBM Watson Studio.
- Review the IBM documentation to determine which CP4D components you need to install to support your requirements.
- Export environment variables for the CP4D installation. The COMPONENTS variable defines which services will be installed:
- Download and install the version of the CP4D command-line interface (CLI) that matches your supported Cloud Pak for Data version:
- Log in to your ROSA cluster:
- Set up persistent storage for your cluster:
- Create projects to deploy the CP4D software:
- Modify load balancer timeout settings to prevent connections from being closed before processes complete:
- Configure the global image pull-secret to pull images from the IBM container repository:
- Install certificate manager and the license service:
- Apply the required permissions by running authorize-instance-topology:
- Install the CP4D foundational services:
- Create the operators and operator subscriptions for your CP4D installation:
- Install the CP4D platform and services:
- Get your CP4D URL and admin credentials:
- The command output displays the URL of your CP4D instance and the password for your admin user (Figure 5):
Figure 5. CP4D URL and admin credentials
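As a reading aid, the CP4D installation steps above can be lined up with `cpd-cli manage` subcommands. The mapping below is a hedged sketch based on the step names in this post and on the cpd-cli releases that include `authorize-instance-topology`; it is an assumption, not a copy of the original commands. Flags are deliberately omitted because they vary by release, and the storage, project, and load balancer steps are left out. Take the exact arguments from the IBM documentation for your version.

```python
# Ordered mapping of installation steps to `cpd-cli manage` subcommands.
# Flags are intentionally omitted: they differ between CP4D releases.
CPD_CLI_STEPS = [
    ("Log in to the ROSA cluster", "login-to-ocp"),
    ("Configure the global image pull secret", "add-icr-cred-to-global-pull-secret"),
    ("Install certificate manager and the license service", "apply-cluster-components"),
    ("Apply the required permissions", "authorize-instance-topology"),
    ("Install the foundational services", "setup-instance-topology"),
    ("Create operators and operator subscriptions", "apply-olm"),
    ("Install the platform and services", "apply-cr"),
    ("Get the URL and admin credentials", "get-cpd-instance-details"),
]


def render_plan(steps):
    """Return one numbered, human-readable line per step."""
    return [
        f"{number}. {description}: cpd-cli manage {subcommand}"
        for number, (description, subcommand) in enumerate(steps, start=1)
    ]


for line in render_plan(CPD_CLI_STEPS):
    print(line)
```

Keeping the order explicit matters here: the topology must be authorized and set up before the operators are created, and the operators must exist before the platform and services are installed.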
- Using the information from the previous steps (CP4D URL, User, Admin Password), access your CP4D console.
- From the CP4D home (welcome page), click on Discover Services to be directed to the Services catalog.
- From the Services catalog, you can see all available CP4D services.
- Use the search bar to filter for Watson, and find the IBM Watson Machine Learning and IBM Watson Studio services. Note how they are displayed as Enabled (Figure 6).
Figure 6. Services enabled in your CP4D catalog
Congratulations! You have successfully deployed IBM CP4D on Red Hat OpenShift Service on AWS.
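For reference, the inbound rule created earlier in the installation, which allows NFS traffic from the cluster’s VPC CIDR so that worker nodes can mount the Amazon EFS file system, can be expressed as a small data structure. The sketch below validates a CIDR and builds a permission entry in the shape used by the Amazon EC2 AuthorizeSecurityGroupIngress API. The CIDR `10.0.0.0/16` and the helper name `nfs_ingress_permission` are assumptions for illustration; the snippet does not call AWS.

```python
import ipaddress

NFS_PORT = 2049  # standard NFS port, used by Amazon EFS mount targets


def nfs_ingress_permission(vpc_cidr):
    """Build an IpPermissions entry that allows NFS from the given CIDR."""
    # ip_network raises ValueError for malformed CIDRs or set host bits.
    network = ipaddress.ip_network(vpc_cidr)
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        "IpRanges": [
            {"CidrIp": str(network), "Description": "NFS from cluster VPC"}
        ],
    }


# Assumed example CIDR; use the CIDR of your cluster's VPC.
print(nfs_ingress_permission("10.0.0.0/16"))
```

Restricting the source to the VPC CIDR, rather than opening the port to any address, keeps the file system reachable only from inside the cluster network.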
Post-installation
Review the following topics when you install CP4D in production:
- Review the IBM system requirements documentation to calculate the size of your ROSA cluster.
- Review the administrative tasks to enable security, maintenance, monitoring, managing users, and backing up your environment.
- Set up services after you have installed the platform.
- Configure identity providers on ROSA.
- Enable auto scaling for your ROSA cluster.
- Configure logging and enable monitoring for your ROSA cluster.
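To illustrate the auto scaling item, here is a hedged sketch that builds a `rosa edit machinepool` command with autoscaling enabled. It assumes a machine pool that spans all three Availability Zones, where replica counts are multiples of three. The cluster name `cp4d-demo`, the machine pool name `worker`, and the helper `autoscaling_args` are hypothetical. The command is printed, not executed.

```python
import shlex


def autoscaling_args(cluster, machine_pool, min_replicas, max_replicas):
    """Build (but do not run) a `rosa edit machinepool` autoscaling command."""
    if not 0 < min_replicas <= max_replicas:
        raise ValueError("need 0 < min_replicas <= max_replicas")
    if min_replicas % 3 or max_replicas % 3:
        # Assumption: the pool spans three zones, so counts scale in threes.
        raise ValueError("replica counts must be multiples of 3")
    return [
        "rosa", "edit", "machinepool",
        "--cluster", cluster,
        "--enable-autoscaling",
        "--min-replicas", str(min_replicas),
        "--max-replicas", str(max_replicas),
        machine_pool,  # machine pool ID is the positional argument
    ]


# Assumed example values; list your pools with `rosa list machinepools`.
print(shlex.join(autoscaling_args("cp4d-demo", "worker", 3, 6)))
```

Size the minimum from the IBM system requirements for the services you install, so that scale-in never drops below what CP4D needs.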
Cleanup
Connect to your AWS Cloud9 workspace, and run the following steps to delete the CP4D installation, including ROSA. This avoids incurring future charges on your AWS account:
To monitor your cluster uninstallation logs, run:
Once the cluster is uninstalled, remove the operator-roles and oidc-provider, as instructed in the output of the rosa delete command. For example:
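The ROSA part of the cleanup can be sketched as an ordered list of commands. This is a minimal example under stated assumptions: the cluster name `cp4d-demo` and the cluster ID `example-cluster-id` are placeholders, `cleanup_plan` is a hypothetical helper, and the snippet only prints the commands. Use the exact commands and cluster ID shown in your own `rosa delete cluster` output.

```python
import shlex


def cleanup_plan(cluster_name, cluster_id):
    """Return the ROSA cleanup commands in the order they should run."""
    return [
        # Delete the cluster, then follow the uninstallation logs.
        ["rosa", "delete", "cluster", "--cluster", cluster_name, "--yes"],
        ["rosa", "logs", "uninstall", "--cluster", cluster_name, "--watch"],
        # After the cluster is gone, its name no longer resolves,
        # so the remaining resources are referenced by cluster ID.
        ["rosa", "delete", "operator-roles", "--cluster", cluster_id,
         "--mode", "auto", "--yes"],
        ["rosa", "delete", "oidc-provider", "--cluster", cluster_id,
         "--mode", "auto", "--yes"],
    ]


# Placeholder values; replace them with your cluster's name and ID.
for command in cleanup_plan("cp4d-demo", "example-cluster-id"):
    print(shlex.join(command))
```

This sketch covers only the ROSA resources. Other resources created during the installation, such as the Amazon EFS file system and the AWS Cloud9 environment, are deleted separately.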
Conclusion
In summary, we explored how customers can take advantage of a fully managed OpenShift service on AWS to run IBM CP4D. With this implementation, customers can focus on what is important to them: their workloads and their customers. They spend less time on the day-to-day operations of managing OpenShift to run CP4D.
If you are interested in learning more about CP4D on AWS, explore the IBM Cloud Pak for Data (CP4D) on AWS Modernization Workshop.
Visit the AWS Marketplace for IBM Cloud Pak for Data offers.
Further reading
- Building a healthcare data pipeline on AWS with IBM Cloud Pak for Data
- IBM Cloud Pak for Data Simplifies and Automates How You Turn Data into Insights
- Accelerate Data Modernization and AI with IBM Databases on AWS
- Build a Modern Data Architecture on AWS with your IBM Z Mainframe
- Making Data-Driven Decisions with IBM watsonx.data, an Open Data Lakehouse on AWS