Driving Hybrid Cloud Analytics with Amazon Redshift and Denodo Data Virtualization
By Mitesh Shah, Sr. Cloud Product Manager at Denodo Technologies
By Saunak Chandra, Sr. Solutions Architect at AWS
Organizations of all sizes are facing an increasingly complex data landscape. Data now resides in multiple on-premises systems and across cloud environments, and applications need to consume data regardless of its location.
The growing numbers and volumes of data sources are increasingly difficult to manage, and enterprises are grappling with emerging business requirements such as artificial intelligence (AI), machine learning (ML), cloud integration, and more.
The need for greater agility and faster time-to-market is clear, and delays can mean lost business or customers to competitors that are utilizing more efficient, responsive IT architectures.
A data integration architecture that can virtually connect multiple data platforms provides business users with immediate access to data, with far less IT friction than traditional methods, so you can make faster, more data-driven decisions.
The Denodo Platform for Amazon Web Services (AWS) can aid organizations in managing their data by providing an alternative data integration method that combines data from traditional enterprise data warehouses, cloud data sources, and modern big data stores. You get all of this without the costs and complexity of alternative methods.
In this post, we’ll discuss how a data virtualization platform provides wider access to data and simplifies security and governance through in-place integrations.
As organizations augment data warehouses with expansive data lakes that feed advanced analytics, there are also practical reasons to consider data virtualization as a key component of your modern data architecture.
Denodo Virtualization Platform
Customers who employ data virtualization gain the benefits of lower-latency data access. They can also leverage newer architectural models, such as a logical data warehouse, which places a virtual data access layer over their existing data warehouse to make it faster and easier to connect, combine, and consume new data sources.
Figure 1 – Denodo acts as a middle tier data access layer, leveraging connect, combine, and consume capabilities to simplify data integration via data virtualization.
Additionally, customers can virtually combine data from disparate sources into views that can be readily consumed by business intelligence (BI) and analytics tools. Without such architectures, it’s a cumbersome process to integrate data from multiple silos across the organization and prepare it for discovery.
Another area where virtualization plays a key role is cloud modernization initiatives. The Denodo Platform simplifies the process of transitioning data to cloud data warehouses, such as Amazon Redshift, by providing a data abstraction layer between the consuming applications and physical data sources to avoid disrupting business operations.
Benefits of the Denodo Platform with Amazon Redshift
The native connector for Amazon Redshift streamlines connecting multiple data sources through the Denodo Platform. Businesses can quickly integrate petabytes of data in real time using Denodo’s advanced query performance optimizer.
A recommended best practice is to always deploy a Denodo Platform instance close to the data sources. Smart, built-in query optimization determines the optimal approach for speed and costs, enabling processing to be pushed down to the database to minimize performance overhead or network latency.
The Denodo connector to Amazon Redshift offers additional benefits:
- Seamless Interoperability: The native Amazon Redshift connector generates optimized queries and lets customers take advantage of advanced features, such as windowing functions and statistical functions, to support optimal query push-down.
- Advanced Optimization: Support for techniques like data movement (data shipping) that enable multi-pass executions based on intermediate temporary tables.
- High Performance Caching: Use of Amazon Redshift as a high-performance caching database for the Denodo Platform in the cloud.
- Frictionless Data Movement: Ability to leverage Amazon Redshift APIs with Denodo for frictionless movement of data to Amazon Simple Storage Service (Amazon S3).
- Metadata Integration: Integration with the Amazon Redshift metadata catalog provides metrics that help the optimizer choose a more efficient query plan.
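The push-down idea described above can be illustrated with a minimal sketch. It is not Denodo's optimizer or connector: an in-memory SQLite database stands in for Amazon Redshift (an assumption made so the example runs anywhere), and the `sales` table and both function names are hypothetical. The point is only the difference in how many rows leave the source when the aggregation is delegated to it.

```python
import sqlite3

# Stand-in for a remote warehouse such as Amazon Redshift (assumption:
# SQLite is used only so the sketch is runnable without any infrastructure).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 250), ("west", 75), ("west", 125), ("west", 50)],
)


def totals_without_pushdown(conn):
    """Fetch every row, then aggregate in the middle tier."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals, len(rows)


def totals_with_pushdown(conn):
    """Delegate the aggregation to the source; only summary rows move."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


local_totals, rows_moved_local = totals_without_pushdown(conn)
pushed_totals, rows_moved_pushed = totals_with_pushdown(conn)
```

Both functions return the same totals, but the pushed-down version transfers one row per region instead of one row per sale, which is where the network and processing savings come from on large tables.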
Customer Use Case Scenarios
Let’s take a look at some key use cases of the Denodo Platform for AWS, working in conjunction with Amazon Redshift.
Simplified and Frictionless Transition to Cloud
Modernizing the data warehouse by moving existing workloads to the cloud has become the new normal. Many companies share the goal of reducing the cost of managing the data warehouse while gaining flexibility in how they connect to the data.
Amazon Redshift powers mission-critical analytical workloads for Fortune 500 companies, startups, and everything in between, and customers gain peace of mind from the large ecosystem around it.
The Denodo Platform provides a data abstraction layer that decouples the underlying data sources from the consuming applications, enabling a frictionless transition of data to Amazon Redshift.
For example, users can migrate data from Teradata or Oracle Exadata to Amazon Redshift at their own pace by leveraging data virtualization as an abstraction layer to isolate the business from the effects of the change.
Hybrid Logical Data Warehousing
From a strategic standpoint, many organizations are not moving all of their data to the cloud. Rather, they are taking a hybrid approach to cloud data integration and keeping some data in on-premises data stores and some data in cloud-based options like Amazon Redshift.
The Denodo Platform offers a single virtual layer for accessing data across both types of sources simultaneously, facilitating access for reporting tools as well as providing the means for data scientists to quickly access the combined data for analysis.
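As a rough illustration of what a single virtual layer does for a hybrid setup, the sketch below combines data from two separate sources at query time without copying either one. Everything in it is an assumption for illustration: two in-memory SQLite databases stand in for an on-premises store and Amazon Redshift, and the tables, columns, and `revenue_by_customer` function are hypothetical, not part of the Denodo Platform.

```python
import sqlite3

# Stand-in for an on-premises system of record (hypothetical schema).
on_prem = sqlite3.connect(":memory:")
on_prem.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
on_prem.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
)

# Stand-in for a cloud warehouse such as Amazon Redshift (hypothetical schema).
cloud = sqlite3.connect(":memory:")
cloud.execute("CREATE TABLE orders (customer_id INTEGER, total INTEGER)")
cloud.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 40), (1, 60), (3, 25)],
)


def revenue_by_customer(on_prem, cloud):
    """Join the two sources in the middle tier, leaving the data in place."""
    # Aggregate at the cloud source so only one row per customer is returned.
    totals = dict(
        cloud.execute(
            "SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id"
        ).fetchall()
    )
    names = on_prem.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    # Inner join: keep only customers that have orders in the cloud source.
    return [(name, totals[cid]) for cid, name in names if cid in totals]


combined = revenue_by_customer(on_prem, cloud)
```

A reporting tool or notebook would call only the combined view; it would not need to know which rows came from which location.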
Cloud-Based Analytics and Data Science
The Denodo Platform is platform-agnostic, so it can be deployed in a hybrid fashion to support both on-premises and cloud environments. This allows it to provide a single entry point for data integration across multiple data sources.
Data virtualization provides an easier method to accelerate data acquisition, and enables business analysts and data scientists to more readily access, discover, and tag the data they are looking for.
Denodo’s data catalog, a core feature of the Denodo Platform, is a valuable tool for advanced analytics and data science projects. It provides seamless access to data via a searchable, contextualized interface, allowing business users to query, search, and browse information and metadata stored in the Denodo server.
Customer Success Story
Let’s look at a real-world scenario in which the Denodo Platform has made it easy to scale petabyte-volume cloud data integration in support of analytics.
Indiana University has a combined student body of more than 110,000 students. They had recently begun a Decision Support Initiative (DSI) to close the gaps between the abundance of available data and people whose decisions will help the university achieve its goals.
Historically, data and its corresponding business logic were stored across multiple, siloed systems, making it extremely time-consuming to gather and combine the relevant information that decision makers needed.
Indiana University chose the Denodo Platform to create a logical data warehouse in order to connect the university’s systems of record stored on AWS to data-consuming applications. This provided heterogeneous data connectivity, delivery, security, and governance services.
With Denodo and Amazon Redshift, Indiana University gained a frictionless transition to the cloud that significantly improved its information agility. Data can now be defined and accessed almost instantaneously and with minimal effort, no matter where it resides.
Core Business Intelligence logic is becoming centralized, thus reducing duplication of effort and enhancing development efficiency.
In many cloud initiatives, the center of gravity for the data is a moving target. It might start off being on-premises but will move to AWS as more data sets migrate to the cloud.
In this situation, the Denodo Platform may initially be deployed on-premises before moving to a multi-location deployment with Denodo deployed both on-premises and on AWS in response to the data becoming spread over the different locations.
One key objective of using data virtualization as a data integration technique is to minimize the movement of large result sets out of Amazon Redshift for local processing.
If the bulk of data is stored on-premises, with just a few data sets being moved to AWS, it may be preferable to start with the Denodo Platform deployed on-premises. If, however, the bulk of the data has been, or is being, moved to Amazon Redshift already, then deploying the Denodo Platform on AWS makes more sense.
There are three options for where to deploy the Denodo Platform when you have a distributed data architecture:
- Deployed on-premises with AWS data sources.
- Deployed in a hybrid multi-location architecture.
- Deployed on AWS only.
The choice depends on your answers to the following questions:
- Where is the majority of the data, or your data’s “center of gravity”?
- Where are the users and consuming applications located (on-premises or AWS)?
- What type and volume of queries are going to be executed against the Denodo Platform? Are the queries going to return large result sets and will they be frequently executed?
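The first of these questions can be turned into a simple rule of thumb. The sketch below is only that: the function name, the idea of expressing the center of gravity as a fraction of data held on AWS, and the 0.3 and 0.7 thresholds are all assumptions chosen for illustration, not guidance from Denodo or AWS. It also ignores user location and query volume, which a real decision would weigh as well.

```python
def recommend_deployment(cloud_data_fraction, low=0.3, high=0.7):
    """Map the data's center of gravity to one of the three options.

    cloud_data_fraction: share of the data (0.0 to 1.0) that resides on AWS.
    low, high: hypothetical thresholds, chosen only for illustration.
    """
    if not 0.0 <= cloud_data_fraction <= 1.0:
        raise ValueError("cloud_data_fraction must be between 0.0 and 1.0")
    if cloud_data_fraction >= high:
        # Bulk of the data is already on AWS.
        return "AWS only"
    if cloud_data_fraction <= low:
        # Bulk of the data is still on-premises.
        return "on-premises with AWS data sources"
    # Data is spread across both locations.
    return "hybrid multi-location"


mostly_on_prem = recommend_deployment(0.1)
spread_out = recommend_deployment(0.5)
mostly_cloud = recommend_deployment(0.9)
```

Because the center of gravity shifts as more data sets migrate, the same check can be rerun over time to see when a move from one option to the next is worth considering.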
The architecture below shows how a typical Denodo deployment can help kick off a data integration project.
Figure 2 – Typical deployment architecture of Denodo.
The Denodo Platform can be deployed both in the cloud and on-premises. On AWS, its instances are deployed in Auto Scaling groups to provide scalability and resilience.
Even if you only need, or are only licensed for, a single instance on AWS, you should still deploy the Denodo Platform instance in an Auto Scaling group for resilience.
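One way to express the single-instance case is an Auto Scaling group whose minimum, maximum, and desired capacity are all 1, so that a failed instance is replaced automatically. The sketch below only builds the request parameters; the group name, launch template name, and subnet IDs are hypothetical placeholders, and the health check settings are assumptions rather than Denodo recommendations.

```python
def single_instance_asg_params(group_name, launch_template_name, subnet_ids):
    """Build parameters for an Auto Scaling group pinned to one instance.

    All argument values are supplied by the caller; nothing here is
    specific to the Denodo Platform.
    """
    return {
        "AutoScalingGroupName": group_name,
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        # Capacity fixed at 1: the group replaces a failed instance
        # but never runs more than one at a time.
        "MinSize": 1,
        "MaxSize": 1,
        "DesiredCapacity": 1,
        # Subnets in more than one Availability Zone allow the replacement
        # instance to launch elsewhere if a zone is impaired.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": 300,  # seconds; an assumed value
    }


params = single_instance_asg_params(
    "denodo-asg",                 # hypothetical group name
    "denodo-launch-template",     # hypothetical launch template
    ["subnet-aaa", "subnet-bbb"],  # hypothetical subnet IDs
)
```

With AWS credentials configured, these parameters could be passed to boto3 as `boto3.client("autoscaling").create_auto_scaling_group(**params)`. Raising `MaxSize` and `DesiredCapacity` later turns the same group into a scaled-out deployment, subject to licensing.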
In this post, you have learned how companies are rapidly adopting cloud technologies to gain greater agility, flexibility, and scalability.
Migrating applications and data to the cloud, however, can be fraught with challenges related to downtime, security, compliance, and latency. Data virtualization addresses these challenges by creating an abstraction layer that provides location transparency across myriad data sources.
Denodo’s support for Amazon Redshift provides a smooth pathway to integrate data in the cloud, as well as a cost-effective way to modernize your data warehouse using data virtualization techniques.
To experience the capabilities of Denodo with Amazon Redshift, we have created a number of test drives that showcase functionality and performance capabilities. To get started, you can sign up for free on the Denodo web site.
Each test drive takes a maximum of two hours and allows users to experience the power of Denodo working with Amazon Redshift (and other native AWS data sources). You can choose between the business intelligence and analytics test drive and the data science and data catalog test drive to try out a range of different use cases.
Denodo also offers a 14-day free trial of the Denodo Platform for AWS, which can be initiated under your AWS account.
Denodo Platform offerings are also available on AWS Marketplace. You can choose between offerings that support two, five, or unlimited data sources, depending on how many sources need to be integrated in the cloud.
There are several valuable resources you can use as you begin your data virtualization journey. The Denodo User Community provides a wealth of information, and case studies can be found on the Denodo website. Feel free to reach out to Denodo at firstname.lastname@example.org.
The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.
Denodo – APN Partner Spotlight
Denodo is an APN Advanced Technology Partner. The Denodo Platform for data virtualization integrates and delivers your data in real time without replicating it.