Accelerate Data Modernization and AI with IBM Databases on AWS

Data quality is essential for successful artificial intelligence (AI) workloads. For this reason, customers want to move beyond basic database management tasks, such as backups and upgrades, and build modern data architectures. They seek efficient strategies to unify data, remove duplication, and implement governance that ensures secure data access.

These challenges aren’t new, and they highlight the need for flexible, cloud-native solutions that can address them and scale with the growing demands of AI workloads.

In this blog, we will look at how IBM’s portfolio of Software as a Service (SaaS) database solutions, available on Amazon Web Services (AWS), helps customers scale and accelerate their data modernization strategies, applications, analytics, and AI workloads on AWS.

Fit-for-purpose databases

IBM offers several fit-for-purpose database solutions on AWS, enabling customers to select the right tool for their workloads and optimize cost-effectiveness. With native integrations and support for open formats, solutions such as IBM watsonx.data, IBM Db2, IBM Db2 Warehouse, and IBM Netezza Performance Server facilitate the unification and sharing of a single copy of data and metadata. This eliminates the need for data migration or re-cataloging.

Customers can create a trusted data foundation for AI workloads and a data fabric methodology by using AWS services such as Amazon EMR and AWS Glue combined with IBM data solutions. This integration offers automated lineage, governance, and data reproducibility, helping accelerate and scale AI workloads.

Combining IBM’s database performance, scalability, security, and governance features with AWS’s flexibility, agility, and cost efficiency enables you to accelerate your data modernization strategy in the cloud. For customers with existing IBM database investments on-premises, like-for-like migrations to AWS are supported, allowing them to modernize at their preferred pace.

Let’s take a closer look at the different IBM database offerings available on AWS.

Amazon RDS for Db2

IBM Db2 runs the world’s mission-critical database workloads, handling complex transactions and large data volumes in support of a variety of data and AI-driven applications.

Launched at AWS re:Invent in 2023, Amazon Relational Database Service (RDS) for Db2 is a fully managed relational database service, simplifying setup, operation, protection and scaling of Db2 deployments on AWS. This service combines the ease and availability of Amazon RDS with IBM Db2’s capabilities.

Amazon RDS for Db2 integrates with AWS services, including AWS Key Management Service (AWS KMS), AWS Identity and Access Management (IAM), and Amazon Simple Storage Service (Amazon S3), ensuring multiple layers of data protection. Encryption of databases and backups is managed through AWS KMS keys, while credential management is simplified with AWS Secrets Manager. Additionally, it provides support for many compliance programs, including HIPAA and FedRAMP.

A range of monitoring tools is available to track database instances, including Amazon RDS Enhanced Monitoring, Amazon CloudWatch, IBM Data Management Console, and Db2’s dsmtop.

Amazon RDS for Db2 supports deployment across multiple Availability Zones (AZ) for high availability (Multi-AZ deployment). This includes a synchronous standby replica for data redundancy, enabling automatic failover and minimizing latency during system backups.
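The Multi-AZ, encryption, and credential settings described above can be sketched as parameters for the RDS `CreateDBInstance` API through boto3. This is a minimal sketch, not a production template: the identifier, user name, key alias, parameter group name, instance class, and storage size are placeholder assumptions, and the bring-your-own-license model shown here assumes a custom parameter group that already carries your IBM license identifiers.

```python
from typing import Any, Dict


def build_db2_instance_params(
    identifier: str,
    master_username: str,
    kms_key_id: str,
    parameter_group: str,
    instance_class: str = "db.m6i.xlarge",  # assumption: pick a class sized for your workload
    storage_gib: int = 100,                 # assumption: illustrative size only
) -> Dict[str, Any]:
    """Return keyword arguments for boto3's rds.create_db_instance()."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "db2-se",                  # Standard Edition; "db2-ae" is Advanced Edition
        "LicenseModel": "bring-your-own-license",
        "DBInstanceClass": instance_class,
        "AllocatedStorage": storage_gib,
        "MultiAZ": True,                     # synchronous standby in a second Availability Zone
        "StorageEncrypted": True,            # encrypt storage and backups with the KMS key below
        "KmsKeyId": kms_key_id,
        "MasterUsername": master_username,
        "ManageMasterUserPassword": True,    # let RDS keep the password in AWS Secrets Manager
        "DBParameterGroupName": parameter_group,
    }


def create_db2_instance(params: Dict[str, Any], region: str) -> Dict[str, Any]:
    """Submit the request; requires boto3 and AWS credentials."""
    import boto3  # imported here so the builder above can be used without boto3 installed

    rds = boto3.client("rds", region_name=region)
    return rds.create_db_instance(**params)
```

Keeping the parameter builder separate from the API call makes it easy to review or unit test the configuration before anything is provisioned.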

For cross-regional disaster recovery, configure Amazon RDS to replicate backups and transaction logs to a chosen AWS Region. Amazon RDS handles the cross-Region copy of all snapshots and transaction logs.

Figure 1 below shows a setup for achieving high availability and cross-region backup for Amazon RDS using multiple availability zones.

Figure 1. Amazon RDS for Db2 High Availability and Cross-Regional Backup.
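As a rough illustration of the cross-Region backup setup, the sketch below prepares a call to the RDS `StartDBInstanceAutomatedBackupsReplication` API with boto3. It assumes this is the replication feature the text refers to; the ARN, key, and Region values are placeholders you would supply, and the client is created in the destination Region.

```python
from typing import Any, Dict


def build_backup_replication_params(
    source_instance_arn: str,
    destination_kms_key_id: str,
    retention_days: int = 7,
) -> Dict[str, Any]:
    """Return keyword arguments for start_db_instance_automated_backups_replication()."""
    # RDS automated backups can be retained for 1 to 35 days.
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    return {
        "SourceDBInstanceArn": source_instance_arn,
        "BackupRetentionPeriod": retention_days,
        "KmsKeyId": destination_kms_key_id,  # KMS key that lives in the destination Region
    }


def start_backup_replication(params: Dict[str, Any], destination_region: str) -> Dict[str, Any]:
    """Issue the call from the destination Region; requires boto3 and AWS credentials."""
    import boto3

    rds = boto3.client("rds", region_name=destination_region)
    return rds.start_db_instance_automated_backups_replication(**params)
```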

On-premises Db2 customers can start using Amazon RDS for Db2 with minimal setup and pay-as-you-go pricing, without upfront fees. Migration to Amazon RDS is simplified through native Db2 tools or AWS Database Migration Service (AWS DMS), as illustrated in Figure 2 below.

This illustration shows the use of AWS Database Migration Service as part of a data flow from a corporate data center to Amazon RDS for Db2.

Figure 2. IBM Db2 database migration process flow using AWS DMS.
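To make the AWS DMS path more concrete, here is a minimal sketch that builds the table-mapping document and the arguments for a DMS replication task. The schema names, task identifier, and ARNs are hypothetical, the source and target endpoints and the replication instance are assumed to exist already, and `full-load-and-cdc` is only one of the migration types DMS offers.

```python
import json
from typing import Any, Dict, List


def build_table_mappings(schema_names: List[str]) -> str:
    """Build a DMS table-mapping JSON document that includes every table in each schema."""
    rules = []
    for index, schema in enumerate(schema_names, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(index),
            "rule-name": f"include-{schema.lower()}",
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        })
    return json.dumps({"rules": rules})


def build_replication_task_params(
    task_id: str,
    source_endpoint_arn: str,
    target_endpoint_arn: str,
    replication_instance_arn: str,
    schema_names: List[str],
) -> Dict[str, Any]:
    """Return keyword arguments for boto3's dms.create_replication_task()."""
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_endpoint_arn,    # self-managed Db2 endpoint
        "TargetEndpointArn": target_endpoint_arn,    # Amazon RDS for Db2 endpoint
        "ReplicationInstanceArn": replication_instance_arn,
        "MigrationType": "full-load-and-cdc",        # initial copy, then ongoing changes
        "TableMappings": build_table_mappings(schema_names),
    }
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("dms").create_replication_task(**params)`.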

Profile Centevo successfully modernized with Amazon RDS for Db2 for its mission critical workloads

One customer benefiting from IBM Db2 on AWS is Profile Centevo, which recently adopted Amazon RDS for Db2 to modernize and manage its business-critical Db2-based asset management applications.

This enabled Profile Centevo to scale their transaction handling capabilities, supporting thousands of users daily without compromising on performance, security or reliability.

With Amazon RDS for Db2, Profile Centevo was able to reduce their infrastructure costs by up to 4x, when compared to self-managing their Db2 database and applications on-premises.

Db2 Warehouse SaaS on AWS

IBM Db2 Warehouse is a cloud-native data warehouse designed for low-latency querying and real-time insights. It handles mission-critical workloads like operational analytics, business intelligence (BI), in-database machine learning (ML), and AI. Db2 Warehouse facilitates collaboration through governed data access.

Available as SaaS on AWS, Db2 Warehouse is fully managed, minimizing administrative tasks like indexing and maintenance. It supports different storage options, including Amazon S3 for efficient data storage. Additionally, it provides flexibility to store row-organized data in Amazon Elastic Block Store (EBS).

Db2 Warehouse optimizes performance by using in-memory processing, compressed data querying, and cached data for query execution. It offers independent scaling of storage and compute resources, improving the efficiency of analytics workloads while controlling storage costs.

It ensures resiliency through managed computation, highly available storage, and data replication. Db2 Warehouse detects unhealthy nodes and quickly replaces them from a standby pool. Additionally, it offers self-service snapshot backup and restore options, with disaster recovery backups stored and replicated via Amazon S3.

Full support for open data formats such as Apache Parquet, Apache Avro, ORC, and Apache Iceberg table format facilitates data sharing across different teams, without the need for data duplication or additional ETL. It also enables integration with IBM watsonx.data to provide a common metadata store, compatible with multiple query engines like Apache Spark and Presto. Additional integrations with AWS services like Amazon S3, Amazon EMR, or AWS Glue help you scale your analytics and AI workloads in the cloud.

Data scientists and engineers can use tools like Python, R, and Jupyter Notebooks to analyze and train ML models directly within the Db2 Warehouse engine, eliminating the need for data movement.

Existing Db2 Warehouse software or Integrated Analytics Appliance (IIAS) customers can migrate to AWS with full workload compatibility. This includes support for data definition language (DDL), data manipulation language (DML), extract, transform, and load (ETL), along with other on-premises tools.

Netezza Performance Server SaaS on AWS

IBM Netezza Performance Server is a cloud-native columnar data warehouse service with massively parallel processing technology. It’s designed for high-performance, petabyte-scale data warehousing, making it ideal for deep analytics, data mining, and AI tasks in the cloud. Available as SaaS on AWS, Netezza is fully managed and highly available. It minimizes administration tasks: no indexing or tuning is required, and maintenance is automated.

Netezza uses AI-driven elastic scaling to predict and schedule scaling based on workload demands. You can specify performance requirements in micro-increments for efficient resource utilization. Elastic scaling manages both planned and unplanned increases in data warehouse requests or computing demand. Additionally, Netezza enables independent scaling of storage density and compute capacity.

Existing Netezza appliance customers can easily migrate to Netezza Performance Server on AWS with commands like nz_migrate or nz_backup and nz_restore (figure 3).

Diagram showing backup data routing from on-premises Netezza appliance to AWS based Netezza using nz_migrate and nz_backup.

Figure 3. Migrating to Netezza Performance Server SaaS on AWS.

Native support for open formats like Parquet and Apache Iceberg helps data engineers, scientists, and analysts share data and run complex workloads without data duplication or ETL. Teams can use governed data from Netezza to develop, train, and run custom ML models directly inside the database for predictions and scoring, without data movement.

Integration with IBM watsonx.data enables centralized analytics and AI workload execution on governed data, at scale from a single point of entry. Analytics tasks can be performed using engines like Presto and Apache Spark. Other integrations with AWS services such as Amazon S3, Amazon EMR, or AWS Glue help you scale your analytics and AI workloads in the cloud.

Conestoga Wood Specialties uses Netezza Performance Server on AWS to transform their data analytics

One customer seeing improvement is Conestoga Wood Specialties, a leader in custom cabinet doors and wood cabinet components, which is using Netezza Performance Server SaaS on AWS to transform its data analytics and management systems.

Using Netezza combined with IBM Cognos, the company runs over 5,200 reports per day, giving leadership across the business the decision support they need to continue growing and expanding the enterprise. In addition, Netezza supports 1.2 terabytes of data and 10,000 daily queries, with the ability to scale the infrastructure as business needs change. This acceleration has enabled faster decision-making and a more agile response to market demands.

watsonx.data SaaS on AWS

IBM watsonx.data is an open data lakehouse solution for analytics and AI workloads, available as SaaS on AWS. Its open architecture includes built-in governance and supports different data sources and formats, ensuring secure data sharing across your organization and facilitating insights from your enterprise data through a unified platform.

This solution simplifies your data landscape, giving you access to all your data wherever it resides. This includes transactional data from Amazon RDS for Db2, analytics data from Db2 Warehouse and Netezza Performance Server, and data lakes that use Amazon S3, AWS Glue, Amazon EMR, and other services (Figure 4).

Diagram showing an AWS reference architecture for watsonx.data combined with IBM Netezza Performance Server and IBM Db2 Warehouse.

Figure 4. IBM watsonx.data open lakehouse architecture and integrations on AWS.

Supporting open table formats like Apache Iceberg, watsonx.data optimizes analytics for large datasets. It handles various data formats such as Parquet, Avro, JSON, and CSV. Customers can select from different open query engines, such as Presto and Spark, to match their workload requirements, including ad hoc analytics, data transformation, data sharing, BI workloads, and machine learning (ML) and generative AI workloads.

With watsonx.data, you can have a single copy of data that is usable across your organization, reducing the need for ETL processes, data movement, and duplication. Built-in data governance capabilities and compatibility with IBM Knowledge Catalog ensure secure data access and sharing across your organization. This makes your data available to the AI models or applications of your choice, with data governance, lineage, and reproducibility.

Data scientists and engineers will benefit from the platform’s support for open data formats, simplifying data management, discovery, transformation, and analysis. The platform accelerates insights through an AI-powered conversational interface, eliminating the need for SQL. For users preferring traditional methods, SQL is also available for exploring and transforming data.

You can now provision Milvus as a service in watsonx.data. Milvus is an embedded vector database for storing and querying vector data, used for tasks like similarity search and recommendation systems in generative AI use cases. It gives data scientists and engineers the ability to curate and prepare data for retrieval-augmented generation (RAG). With vectorized embedding capabilities, customers can refine and optimize their AI models to produce accurate responses and insights from trusted data.
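To illustrate what a similarity search does, the sketch below ranks stored embeddings against a query vector by cosine similarity. It is a brute-force, in-memory stand-in written in plain Python, not the Milvus API; the document IDs and three-dimensional vectors are made up, and a real vector database would use approximate indexes to search at scale.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    embeddings: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best match first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings; in a RAG pipeline these would come from an embedding model.
documents = {
    "pricing-faq": [0.9, 0.1, 0.0],
    "backup-guide": [0.1, 0.9, 0.2],
    "scaling-notes": [0.2, 0.8, 0.4],
}
matches = top_k([0.0, 1.0, 0.3], documents, k=2)
```

In a RAG flow, the text behind the top matches would then be passed to the model as context for generating a response.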

Conclusion

IBM databases on AWS offer a transformative opportunity for customers looking to scale their AI and analytics capabilities. By leveraging the benefits of fully managed services, integrated solutions for zero-ETL data preparation, and workload compatibility, customers can streamline their data management and minimize the risks of migrating to AWS.

With flexible deployment models, including SaaS on AWS, customers gain the agility they need to adapt to changing business needs. IBM’s solutions also support innovations in open-source technologies, which optimize query performance, governance, and workload management, allowing customers to get more from their data.

The end-to-end analytics and AI solutions from IBM on AWS help customers reduce time-to-value and simplify the deployment and management of their production workloads. This approach enables customers to unlock the potential of their data and drive strategic growth.

Visit the AWS Marketplace for the IBM Data and AI SaaS solutions on AWS:

IBM Db2 Warehouse as a Service

IBM watsonx.data as a Service on AWS

IBM Netezza Performance Server as a Service

Further content

Making Data-Driven Decisions with IBM watsonx.data, an Open Data Lakehouse on AWS

Migrate from self-managed Db2 to Amazon RDS for Db2 using AWS DMS

IBM watsonx.data on AWS

Introducing Db2 Warehouse on AWS

Netezza Performance Server on AWS