AWS Database Blog

Amazon DynamoDB Accelerator (DAX): A Read-Through/Write-Through Cache for DynamoDB

Joseph Idziorek is a product manager at Amazon Web Services.

AWS recently launched Amazon DynamoDB Accelerator (DAX), a highly available, in-memory cache for Amazon DynamoDB. If you’re currently using DynamoDB or considering DynamoDB, DAX can offer you response times in microseconds and millions of requests per second.

When developers start using DAX, they tell us that the performance is great, and they love that DAX is API compatible with DynamoDB. This means they no longer have to set up and develop against a side cache. Instead of learning another database system with a new set of APIs and data types—and then rewriting applications to do the two-step dance needed for cache look-ups, population, and invalidation—you can simply point your existing DynamoDB application at the DAX endpoint. What used to take weeks and months now takes only moments with DAX.

How does DAX accomplish this? When you’re developing against DAX, instead of pointing your application at the DynamoDB endpoint, you point it at the DAX endpoint, and DAX handles the rest. As a read-through/write-through cache, DAX seamlessly intercepts the API calls that an application normally makes to DynamoDB so that both read and write activity are reflected in the DAX cache. For you, the API calls are the same, so there’s no need to rewrite the application.
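
To make this concrete, here is a minimal sketch in Python (boto3) of the read and write calls an application already makes against DynamoDB. The table name and key schema are hypothetical; with DAX, only the client construction changes (for example, building a DAX client from the amazon-dax-client package against your cluster endpoint), while the GetItem and PutItem calls below stay exactly the same.

import boto3

# Existing application code: a plain low-level DynamoDB client.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# With DAX, only this client construction changes: you would build a DAX client
# (for example, with the amazon-dax-client package) against your hypothetical
# cluster endpoint and use it in place of dynamodb below. The API calls themselves
# do not change.

TABLE = "Movies"  # hypothetical table name

def get_movie(year, title):
    # Same API call whether it is served by DynamoDB or by the DAX cache.
    resp = dynamodb.get_item(
        TableName=TABLE,
        Key={"year": {"N": str(year)}, "title": {"S": title}},
    )
    return resp.get("Item")

def put_movie(year, title, rating):
    # A write issued through DAX is written through to DynamoDB and
    # reflected in the cache, so subsequent reads see the new value.
    dynamodb.put_item(
        TableName=TABLE,
        Item={
            "year": {"N": str(year)},
            "title": {"S": title},
            "rating": {"N": str(rating)},
        },
    )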

In this post, we take a step back and describe what a read-through/write-through cache is, and how it’s different from other caching strategies. We also discuss the design considerations for these different caching strategies.

Side-cache
When you’re using a cache for a backend data store, a side-cache is perhaps the most commonly known approach. Canonical examples include both Redis and Memcached. These are general-purpose caches that are decoupled from the underlying data store and can help with both read and write throughput, depending on the workload and durability requirements.

For read-heavy workloads, side-caches are typically used as follows (see the sketch after this list):

  1. For a given key-value pair, the application first tries to read the data from the cache. If the cache is populated with the data (cache hit), the value is returned. If not, on to step 2.
  2. Because the desired key-value pair was not found in the cache, the application then fetches the data from the underlying data store.
  3. To ensure that the data is present when the application needs to fetch the data again, the key-value pair from step 2 is then written to the cache.
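
As a rough illustration of these three steps, the following Python sketch uses the redis-py client alongside boto3. The cache endpoint, table name, key name, and cache TTL are assumptions for the example, not part of the original post.

import json

import boto3
import redis

cache = redis.Redis(host="my-cache-host", port=6379)         # hypothetical cache endpoint
table = boto3.resource("dynamodb").Table("ProductCatalog")   # hypothetical table

def get_product(product_id, ttl_seconds=300):
    # Step 1: try the cache first.
    cached = cache.get(product_id)
    if cached is not None:
        return json.loads(cached)  # cache hit

    # Step 2: cache miss, so fetch the item from the underlying data store.
    item = table.get_item(Key={"Id": product_id}).get("Item")

    # Step 3: populate the cache so the next read for this key is a hit.
    if item is not None:
        # default=str handles DynamoDB's Decimal number type.
        cache.set(product_id, json.dumps(item, default=str), ex=ttl_seconds)
    return item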

Although the number of use cases for caching is growing, especially for Redis, the predominant use case is still supporting read-heavy workloads with repeat reads. Developers also use Redis to better absorb spikes in writes. One of the more popular patterns is to write directly to Redis and then asynchronously invoke a separate workflow to de-stage the data to a separate data store (for example, DynamoDB).

There are a few design points to note here. First, writes to the cache are eventually consistent and non-durable, so there is a possibility of data loss. Some applications in IoT, for example, can tolerate this trade-off. In addition, there are penalties in the form of multiple round trips and additional connection handling.

(more…)

AWS Named as a Leader in The Forrester Wave: Database-as-a-Service Q2 2017

Larry Weber and Yoav Eilat are product marketing managers at Amazon Web Services.

On April 24, Forrester published The Forrester Wave: Database-as-a-Service, Q2 2017, in which AWS is named as a Leader, receiving the highest scores for Current Offering, Strategy, and Market Presence. There were 13 vendors in the evaluation.

According to Forrester, “AWS not only has the largest adoption of DBaaS, it also offers the widest range of offerings to support analytical, operational, and transactional workloads.” AWS was the only vendor to receive a score of 5.0 out of 5 in the data security criterion. Forrester notes that “AWS’s key strengths lay in its dynamic scale, automated administration, flexibility of database offerings, strong security, and high-availability capabilities, which make it a preferred choice for customers.”

The Forrester report evaluated a number of AWS database services: Amazon Aurora, Amazon RDS, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for MariaDB, Amazon RDS for Oracle, Amazon RDS for SQL Server, Amazon DynamoDB, Amazon Redshift, and Amazon ElastiCache.

A few things especially struck us while reading the report, but one really stood out: the statement that database-as-a-service has become critical for all enterprises.

Just three or four years ago, very few people were moving their databases to the public cloud. Larry remembers demoing a managed database at a conference. He asked, “How many of you are leveraging the cloud for your transactional data?” The four or so hands that went up belonged to people from web-based startups. He then asked, “For the rest of you, do you have any plans to manage your production data in the cloud in the future?” Outside of a few pondering looks, the room was silent.

Fast forward to today, and it feels like DBaaS has become the new normal. Organizations are moving workloads as fast as possible to the cloud to reap the benefits of managed services, and we feel the Forrester report validates these trends. As with other cloud services, DBaaS isn’t just about saving money or moving from CapEx to OpEx. Instead, it’s about enabling companies to become more agile and competitive, and to rethink the way they do business. Cloud database services are fueling the growth of new business services, especially when scalability and rapid deployment are critical.

In the report, Forrester also mentions the AWS Database Migration Service, which helps you migrate your existing databases to AWS easily and securely. The service supports homogeneous migrations such as Oracle to Oracle, and also heterogeneous migrations between different platforms such as Oracle or Microsoft SQL Server to Amazon Aurora. AWS DMS now supports over 10 source databases and over 10 target databases. We recently announced that customers have migrated over 28,000 databases using the service.

As you compare database services from different cloud providers, think about what it is you want to accomplish. Are you looking for a fully managed relational database? A fast and flexible NoSQL database service? A data warehouse? Here at AWS, the choice is always up to you. We encourage you to use the right tool for the job, as opposed to trying to solve all problems with a single database.

We’re very excited that customers such as Unilever, Intuit, Infor, Cornell University, and many others give high marks to our managed database services. We’re committed to developing services for you, our customers, and helping you experience database freedom. This is just the beginning.

Download a copy of Forrester’s Database-as-a-Service Wave.

How to Migrate Your Oracle Database to PostgreSQL

Knievel Co is a database engineer at Amazon Web Services.

This blog post provides an overview of how you can migrate your Oracle database to PostgreSQL. The two biggest parts of a database migration are the schema conversion and the data replication. We go over how to use the AWS Schema Conversion Tool (AWS SCT) and the AWS Database Migration Service (AWS DMS) to tackle these two parts.

Before we cover AWS SCT and AWS DMS, you should take some preliminary steps that have proven helpful in every migration. One way to make the migration easier is to include what is usually called a modernization phase before you migrate. This phase involves taking an inventory of the objects in your Oracle database and then making a few decisions.

First, deprecate any objects that are no longer needed. Don’t waste time migrating objects that no one cares about. Also, purge any historical data that you no longer need; you don’t want to waste time replicating data you don’t need, for example, temporary tables and backup copies of tables from past maintenance. Second, move flat files and long strings stored in LOBs, CLOBs, LONGs, and so on into Amazon S3 or Amazon DynamoDB. This process requires client software changes, but it reduces the complexity and size of the database and makes the overall system more efficient. Last, try to move any PL/SQL packages and procedures, especially those containing business logic, back to the client software. These objects need to be changed manually anyway if AWS SCT cannot translate them.

The following steps are suggested for migrating to a different database engine (Oracle to PostgreSQL in this case). If you aren’t moving to a different platform, you might find more appropriate native tools or other techniques to move your database.

  1. Create your schema in the target database.
  2. Drop foreign keys and secondary indexes on the target database, and disable triggers (see the sketch after this list).
  3. Set up a DMS task to replicate your data – full load and change data capture (CDC).
  4. Stop the task when the full load phase is complete, and recreate foreign keys and secondary indexes.
  5. Enable the DMS task.
  6. Migrate tools and software, and enable triggers.
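
As a minimal sketch of step 2 on the PostgreSQL side, the following Python snippet (psycopg2) disables user-defined triggers and lists the foreign-key constraints to drop and later recreate. The connection string and schema name are placeholders; in practice, save the constraint definitions (for example, via pg_get_constraintdef) so you can recreate them in step 4.

import psycopg2

# Placeholder connection string and schema name.
conn = psycopg2.connect("host=my-target-host dbname=targetdb user=postgres password=secret")
conn.autocommit = True
cur = conn.cursor()
schema = "myschema"

# Disable user-defined triggers on every table in the target schema for the full load.
cur.execute("SELECT tablename FROM pg_tables WHERE schemaname = %s", (schema,))
for (tablename,) in cur.fetchall():
    # Identifiers are interpolated directly here for brevity; quote them properly in real code.
    cur.execute('ALTER TABLE {}."{}" DISABLE TRIGGER USER'.format(schema, tablename))

# List the foreign-key constraints so you can drop them now and recreate them
# from the saved definitions after the full load completes.
cur.execute("""
    SELECT conrelid::regclass AS table_name,
           conname,
           pg_get_constraintdef(oid) AS definition
    FROM pg_constraint
    WHERE contype = 'f'
      AND connamespace = %s::regnamespace
""", (schema,))
for table_name, conname, definition in cur.fetchall():
    print(table_name, conname, definition)  # save these to recreate the constraints later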

Create your schema in the target database

Begin your migration by taking a look at the schema that you want to migrate. In our case, we use the AWS Schema Conversion Tool (AWS SCT) to perform the analysis. When you start the application, you need to create a new project, with the source being Oracle and the target being PostgreSQL. When you’re connected, select the name of the schema that you want to migrate on the left side. Right-click the schema name and choose Convert Schema. Then choose View / Assessment Report View.

The AWS SCT assessment report is a high-level overview of the effort required to convert your Oracle database to PostgreSQL. The following is an example of what the assessment report looks like:

(more…)

Using the AWS Database Migration Service, Amazon S3, and AWS Lambda for Database Analytics

Jeff Levine is a solutions architect for Amazon Web Services.

The AWS Database Migration Service (AWS DMS) supports Amazon S3 as a migration target. The service enables you to extract information from any database supported by DMS and write it to Amazon S3 in a format that can be used by almost any application. You can extract the entire database and replicate ongoing changes, including additions, deletions, and updates, using change data capture (CDC) technology. You can even process the changes with AWS Lambda or Amazon Kinesis Firehose. By using Lambda or Firehose, you can extend the capabilities of AWS services to existing database environments, both those within AWS and in other locations.

Overview of the example

Let’s consider an example that shows how to bring these services together. We use the AWS Database Migration Service to migrate Twitter statistics produced by the City of Seattle, Washington, from Amazon RDS for PostgreSQL to Amazon S3, and we use AWS Lambda for analysis. We see how DMS handles both the migration of the initial database contents and ongoing change data capture. Here’s a diagram of what we plan to do.

We will do the following:

  1. Create two datasets containing Twitter account statistics. One initializes a database table. The other one performs subsequent updates to that table, processed by the change data capture feature.
  2. Create an Amazon RDS for PostgreSQL database instance, define a table, and load the first dataset.
  3. Create an Amazon S3 bucket to serve as the target.
  4. Create an AWS Lambda function to analyze S3 objects upon creation (see the sketch after this list).
  5. Configure and invoke the AWS Database Migration Service to migrate the initial dataset that we just loaded into RDS to the S3 bucket. Doing this demonstrates the initial load phase of DMS and shows AWS Lambda processing the dataset.
  6. Copy the second dataset into the database. Doing this demonstrates the change data capture feature of DMS and shows AWS Lambda processing the dataset.
  7. Examine the contents of the Amazon S3 bucket to see the results of both the initial migration and the change data capture.
  8. View the results of the analysis in Amazon CloudWatch.
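
As a rough sketch of the Lambda function in step 4, the handler below is triggered by S3 object-created notifications, reads each file that DMS writes (CSV by default), and logs a row count to CloudWatch Logs. The bucket, key layout, and analysis are illustrative assumptions; your function can perform much richer analysis on the rows.

import csv
import io

import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Triggered by S3 object-created notifications on the DMS target bucket.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        rows = list(csv.reader(io.StringIO(body)))

        # CDC files prepend an operation column (I/U/D); full-load files do not.
        # Printed output lands in CloudWatch Logs.
        print("Processed s3://{}/{} with {} rows".format(bucket, key, len(rows)))
    return {"objects": len(event["Records"])}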

Prerequisites and assumptions

You will need the following:

  • An AWS account that provides access to the services shown in the diagram.
  • Working knowledge of these services.
  • A utility that connects to a PostgreSQL database, such as psql.

Additionally:

  • We will configure all services in the same VPC and region to simplify networking considerations.
  • The VPC must have an S3 endpoint.

(more…)

Integrate Amazon RDS for SQL Server DB Instances with an Existing Active Directory Domain

Vlad Vlasceanu is a solutions architect at Amazon Web Services.

Last year, we launched support for Windows Authentication in Amazon RDS for SQL Server. This feature has been quite popular with customers. It lets you deploy Microsoft SQL Server-based workloads with Amazon RDS using the same centralized user management patterns that you use in your own data centers, by leveraging Microsoft Active Directory.

In Amazon RDS for SQL Server, you can join both new and existing DB instances to a managed Active Directory domain that is provided by AWS Directory Service. Just as Amazon RDS gives you the ability to operate fully managed database instances (including SQL Server), AWS Directory Service provides fully managed directories that you can use in your workload deployment. In both cases, you get a turnkey solution. AWS handles the deployment, operation, high availability, reliability, and security of the service—allowing you to focus on what matters most: operating your own workload effectively.

This post describes how you can use forest trust relationships to extend your existing Windows Authentication capabilities to SQL Server instances that are deployed using Amazon RDS.

Using AWS Directory Service with Amazon RDS for SQL Server

AWS Directory Service provides several different directory service types, each with its own feature set, and each designed for specific use cases. To use the Windows Authentication feature in RDS for SQL Server, you must use the Microsoft Active Directory (AD) directory service type.

With a Microsoft AD directory, you can build out a fully managed Active Directory environment in the cloud. You can join your Amazon EC2 instances to the provided domain and join your RDS for SQL Server DB instances to that same domain. You can create and manage users, groups, and policies directly in the managed domain. The result is an environment that looks and works a lot like a traditional Active Directory environment that you’re used to. However, you don’t have to spend time managing the day-to-day operational aspects of the resources underpinning your domain controllers or SQL Server databases.

But what if you already have an existing corporate Active Directory domain—either running in AWS in a virtual private cloud (VPC) using EC2 instances, on-premises in your data center, or even extended between both? In these cases, you can still use RDS for SQL Server and Windows Authentication, even if your users and computers are joined to your existing domain.

Forest trusts

With a Microsoft AD directory, you can establish forest trusts. Your RDS SQL Server resources must be joined to an AWS managed Microsoft AD directory, but you can establish a trust relationship between that AWS managed forest and the forest of your corporate domain. For Windows Authentication in RDS for SQL Server to work, you need only a one-way forest trust in which the AWS managed forest trusts the corporate domain forest.

Your corporate domain takes the role of the trusted domain, and the AWS Directory Service managed domain takes the role of the trusting domain. Validated authentication requests travel between the domains in only one direction—allowing accounts in your corporate domain to authenticate against resources shared in the managed domain. In this case, RDS for SQL Server resources interact only with the managed domain, and the managed domain then passes on the authentication requests to your corporate domain.
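
If you prefer to script the trust instead of creating it in the AWS Directory Service console, the following is a minimal sketch using boto3. The directory ID, domain name, password, and DNS addresses are placeholders, and a matching trust must also be configured on the corporate domain side.

import boto3

ds = boto3.client("ds", region_name="us-east-1")

# Placeholder identifiers: the AWS Managed Microsoft AD directory and the
# corporate forest it should trust.
response = ds.create_trust(
    DirectoryId="d-1234567890",                  # placeholder managed directory ID
    RemoteDomainName="corp.example.com",         # placeholder corporate forest FQDN
    TrustPassword="use-a-strong-shared-secret",  # must match the trust password set on the corporate side
    TrustDirection="One-Way: Outgoing",          # the managed forest trusts the corporate forest
    TrustType="Forest",
    ConditionalForwarderIpAddrs=["10.0.0.10"],   # placeholder corporate DNS server IPs
)
print(response["TrustId"])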

Trust relationships and authentication flow

The following diagram illustrates the trust relationship flow, factoring in an RDS for SQL Server resource and a user from the corporate domain:

(more…)

Automatically Archive Items to S3 Using DynamoDB Time to Live (TTL) with AWS Lambda and Amazon Kinesis Firehose

Adam Wagner is a solutions architect at Amazon Web Services.

Earlier this year, Amazon DynamoDB released Time to Live (TTL) functionality, which automatically deletes expired items from your tables, at no additional cost. TTL eliminates the complexity and cost of scanning tables and deleting items that you don’t want to retain, saving you money on provisioned throughput and storage. One AWS customer, TUNE, purged 85 terabytes of stale data and reduced their costs by over $200K per year.

Today, DynamoDB made TTL better with the release of a new CloudWatch metric for tracking the number of items deleted by TTL, which is also viewable for no additional charge. This new metric helps you monitor the rate of TTL deletions to validate that TTL is working as expected. For example, you could set a CloudWatch alarm to fire if too many or too few automated deletes occur, which might indicate an issue in how you set expiration time stamps for your items.

In this post, I’ll walk through an example of a serverless application using TTL to  automate a common database management task: moving old data from your database into archival storage automatically. Archiving old data helps reduce costs and meet regulatory requirements governing data retention or deletion policies. I’ll show how TTL—combined with DynamoDB Streams, AWS Lambda, and Amazon Kinesis Firehose—facilitates archiving data to a low-cost storage service like Amazon S3, a data warehouse like Amazon Redshift, or to Amazon Elasticsearch Service.

TTL and archiving data
Customers often use DynamoDB to store time series data, such as webpage clickstream data or IoT data from sensors and connected devices. Rather than delete older, less frequently accessed items, many customers want to archive them instead. One way to accomplish this is via “rolling tables,” that is, pre-creating tables to store data for particular months, weeks, or days. However, this requires custom application logic to handle creating and deleting tables and switching reads and writes to the new tables. Another approach is to run periodic jobs to delete old items, but this consumes write throughput and also requires custom application logic.

DynamoDB TTL simplifies archiving by automatically deleting items based on the time stamp attribute of your choice. Items deleted by TTL can be identified in DynamoDB Streams, which captures a time-ordered sequence of item-level modifications on a DynamoDB table and stores them in a log for up to 24 hours. You can then archive this stream data using applications like the one referenced later in this blog post. If you choose S3 as the archive, you can optimize your costs even further using S3 lifecycle configuration rules, which automatically transition older data to the S3 Infrequent Access storage class or to Amazon Glacier for long-term backup.
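
As a minimal sketch of such a lifecycle configuration, the boto3 call below transitions archived objects to Infrequent Access after 30 days and to Amazon Glacier after a year. The bucket name, prefix, and time thresholds are assumptions; choose values that match your retention requirements.

import boto3

s3 = boto3.client("s3")

# Hypothetical bucket and prefix used by the Firehose delivery stream for the archive.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-ttl-archive-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-items",
            "Filter": {"Prefix": "ttl-archive/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # move to Infrequent Access after 30 days
                {"Days": 365, "StorageClass": "GLACIER"},     # then to Amazon Glacier after a year
            ],
        }]
    },
)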

Sample application overview
This post shows how to build a solution to remove older items from a DynamoDB table and archive them to S3 without having to manage a fleet of servers (see the following simplified workflow diagram). You use TTL to automatically delete old items and DynamoDB Streams to capture the TTL-expired items. You then connect DynamoDB Streams to Lambda, which lets you run code without provisioning or managing any servers. When new items are added to the DynamoDB stream (that is, as TTL deletes older items), the Lambda function triggers, writing the data to an Amazon Kinesis Firehose delivery stream. Firehose provides a simple, fully managed solution to load the data into S3, which is the archive.

At a high level, this post takes you through the following steps:

  1. Activate TTL and DynamoDB Streams on your DynamoDB table.
  2. Create a Firehose delivery stream to load the data into S3.
  3. Create a Lambda function to poll the DynamoDB stream and deliver records from the stream to Firehose in batches (see the sketch after this list).
  4. Validate that the application works.
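
As a rough sketch of the Lambda function in step 3, the handler below filters the stream batch for items removed by TTL (these arrive as REMOVE events attributed to the dynamodb.amazonaws.com service principal) and forwards the old item images to Firehose. The delivery stream name is a placeholder.

import json

import boto3

firehose = boto3.client("firehose")
DELIVERY_STREAM = "ttl-archive-stream"  # placeholder Firehose delivery stream name

def lambda_handler(event, context):
    records = []
    for record in event["Records"]:
        # TTL deletions show up as REMOVE events performed by the DynamoDB service principal.
        user_identity = record.get("userIdentity", {})
        if (record["eventName"] == "REMOVE"
                and user_identity.get("type") == "Service"
                and user_identity.get("principalId") == "dynamodb.amazonaws.com"):
            # OldImage holds the item as it looked before TTL deleted it.
            old_image = record["dynamodb"].get("OldImage", {})
            records.append({"Data": (json.dumps(old_image) + "\n").encode("utf-8")})

    if records:
        # PutRecordBatch accepts up to 500 records per call; the default stream
        # batch size from this trigger stays within that limit.
        firehose.put_record_batch(DeliveryStreamName=DELIVERY_STREAM, Records=records)
    return {"archived": len(records)}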

Note: This post assumes that you already have a DynamoDB table created, and that you have an attribute on that table that you want to use as the time stamp for TTL. To get started with a simple DynamoDB table, see Getting Started with Amazon DynamoDB. For more details on setting up TTL, see Manage DynamoDB Items Using Time to Live on the AWS Blog.

Step 1: Enable DynamoDB TTL and DynamoDB streams
Start by signing in to the DynamoDB console and navigating to the table that contains the items that you want to archive. Go to the Overview tab for the table. Under Table details, choose Manage TTL.

Enter the table attribute containing the time stamp that will flag items for automated TTL deletion. Next, select the check box to enable DynamoDB streams with view type New and old images. This gives you both the new and old versions of items that are updated. Selecting this view type is required for your stream to contain items removed by TTL.
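
If you prefer to script these settings rather than use the console, the following boto3 calls enable TTL and the stream; the table name and TTL attribute name are placeholders for your own values.

import boto3

dynamodb = boto3.client("dynamodb")
TABLE = "my-table"  # placeholder table name

# Enable TTL on the attribute that holds the expiration time stamp (epoch seconds).
dynamodb.update_time_to_live(
    TableName=TABLE,
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expiration_time"},
)

# Enable a stream that carries both the new and old images of modified items.
dynamodb.update_table(
    TableName=TABLE,
    StreamSpecification={"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"},
)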

(more…)

Migrate Your Procedural SQL Code with the AWS Schema Conversion Tool

Wendy Neu is a big data architect at Amazon Web Services.

Database administrators and developers rely on relational databases to store data for applications. As Forbes noted in 2016, the development of open source offerings for database management systems like PostgreSQL is causing a growing number of enterprises to migrate to lower-cost solutions for their data storage. The move to the cloud often provides an excellent opportunity to make such a change.

If your system contains a lot of business logic in the database, it can be quite a large task to translate between SQL dialects line by line. Fortunately, Amazon Web Services offers the AWS Schema Conversion Tool (AWS SCT), a standalone application with a project-based user interface that does the heavy lifting of translating SQL dialects for you.

Using AWS SCT
This post demonstrates how easy it is to migrate procedural code from Oracle to PostgreSQL using AWS SCT. It includes how to spot the differences between the two dialects and how to modify the code when needed. You can use AWS SCT to generate a high-level overview of the effort required to convert an Oracle database to Amazon RDS for PostgreSQL. In the example, the tool is running on Windows, but it can run on a number of different operating systems. To download AWS SCT, see Installing and Updating the AWS Schema Conversion Tool. For general information about the tool, start with What Is the AWS Schema Conversion Tool?

The example uses an Oracle database on Amazon EC2 and the well-known SCOTT/TIGER database as a starting point. Three stored procedures are added to demonstrate how AWS SCT handles common dialect translations. For information about installing the base SCOTT schema objects, see the Oracle FAQ wiki. You can download additional Oracle procedures and the translated PostgreSQL examples from this repository.

Prerequisites
The example in this post requires the following:

  • A source Oracle database with SCOTT schema objects installed
  • Three additional stored procedures:
    • emp_by_job (source file: oracle-emp-by-job.sql)
    • get_emp_info (source file: oracle-get-emp-info.sql)
    • employee_report (source file: oracle-employee-report.sql)
  • Target Amazon RDS for PostgreSQL database
  • AWS SCT installed

Generate an assessment report
Typically, the first step in a migration is to understand its feasibility and the effort it will require. You can use AWS SCT to generate a high-level overview of the work ahead. In this example, open the context (right-click) menu for the SCOTT schema, and choose Create Report. The assessment report provides a high-level summary of how much of the schema can be converted automatically from Oracle to PostgreSQL, and the work that will be left after conversion.

The following screenshot shows the results of the assessment.

(more…)

Serverless Scaling for Ingesting, Aggregating, and Visualizing Apache Logs with Amazon Kinesis Firehose, AWS Lambda, and Amazon Elasticsearch Service

Pubali Sen and Shankar Ramachandran are solutions architects at Amazon Web Services.

In 2016, AWS introduced the EKK stack (Amazon Elasticsearch Service, Amazon Kinesis, and Kibana, an open source visualization tool from Elastic) as an alternative to ELK (Amazon Elasticsearch Service, the open source tool Logstash, and Kibana) for ingesting and visualizing Apache logs. One of the main features of the EKK stack is that the data transformation is handled via the Amazon Kinesis Firehose agent. In this post, we describe how to optimize the EKK solution by handling the data transformation in Amazon Kinesis Firehose through AWS Lambda.

In the ELK stack, the Logstash cluster handles the parsing of the Apache logs. However, you must design and maintain the Logstash cluster to keep up with scale, and this type of server management requires a lot of heavy lifting on the user’s part. The EKK solution eliminates this work with Amazon Kinesis Firehose, AWS Lambda, and Amazon Elasticsearch Service (Amazon ES).

Solution overview
Let’s look at the components and architecture of the EKK optimized solution.

Amazon Kinesis Firehose
Amazon Kinesis Firehose provides the easiest way to load streaming data into AWS. In this solution, Firehose helps capture and automatically load the streaming log data to Amazon ES, and backs it up in Amazon S3. For more information about Firehose, see What is Amazon Kinesis Firehose?

AWS Lambda
AWS Lambda lets you run code without provisioning or managing servers. It automatically scales your application by running code in response to each trigger. Your code runs in parallel and processes each trigger individually, scaling precisely with the size of the workload. In the EKK solution, Amazon Kinesis Firehose invokes the Lambda function to transform incoming source data and deliver the transformed data to the managed Amazon ES cluster. For more information about AWS Lambda, see the AWS Lambda documentation.
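
As a rough sketch of such a transformation function, the handler below decodes each Firehose record, parses an Apache Common Log Format line with a regular expression, and returns JSON in the shape that Firehose data transformation expects (recordId, result, and base64-encoded data). The field names are illustrative, and unparsable lines are marked as ProcessingFailed.

import base64
import json
import re

# Apache Common Log Format, e.g.
# 127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ (\S+) \[([^\]]+)\] "(\S+) (\S+) \S+" (\d{3}) (\S+)'
)

def lambda_handler(event, context):
    output = []
    for record in event["records"]:
        line = base64.b64decode(record["data"]).decode("utf-8").strip()
        match = LOG_PATTERN.match(line)
        if match:
            host, user, timestamp, method, path, status, size = match.groups()
            doc = {
                "host": host,
                "user": user,
                "timestamp": timestamp,
                "method": method,
                "path": path,
                "status": int(status),
                "bytes": 0 if size == "-" else int(size),
            }
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode((json.dumps(doc) + "\n").encode("utf-8")).decode("utf-8"),
            })
        else:
            # Let Firehose know this record could not be parsed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}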

Amazon Elasticsearch Service
Amazon ES is a popular search and analytics engine that provides real-time application monitoring and log and clickstream analytics. In this solution, the Apache logs are stored and indexed in Amazon ES. As a managed service, Amazon ES is easy to deploy, operate, and scale in the AWS Cloud. Using a managed service eliminates administrative overhead, including patch management, failure detection, node replacement, backups, and monitoring. Because Amazon ES includes built-in integration with Kibana, it eliminates having to install and configure that platform—simplifying your process even more. For information about Amazon ES, see What Is Amazon Elasticsearch Service?

Amazon Kinesis Data Generator
This solution uses the Amazon Kinesis Data Generator (KDG) to produce the Apache access logs. The KDG makes it easy to simulate Apache access logs and demonstrate the processing pipeline and scalability of the solution.

Architecture
The following diagram shows the architecture of the EKK optimized stack.

(more…)

How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS

Shree Kenghe is a solutions architect at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to help you migrate your existing Oracle data warehouse to Amazon Redshift. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Migrating your Oracle data warehouse to Amazon Redshift can substantially improve query and data load performance, increase scalability, and save costs.

When migrating your database from one engine to another, you have two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. AWS has tools to help do both. You can convert schema and code with AWS SCT and migrate data with AWS DMS. AWS DMS helps you migrate your data easily and securely with minimal downtime.

Let’s take a look at the architecture if you are migrating your data warehouse from Amazon RDS for Oracle to Amazon Redshift.

AWS SCT is a client application that can be installed on a workstation, laptop, or Amazon EC2 instance. AWS DMS creates a replication instance that runs on an Amazon EC2 instance in a VPC; you use this replication instance to perform the database migration. The DMS replication instance, the RDS Oracle database, and the Amazon Redshift cluster should all be in the same region, and ideally in the same account and Availability Zone.
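
To give a flavor of the DMS piece, here is a minimal boto3 sketch that creates a full-load-and-CDC replication task once the replication instance and the source and target endpoints exist. The ARNs and schema name are placeholders; you can create the same task through the DMS console with equivalent settings.

import json

import boto3

dms = boto3.client("dms", region_name="us-east-1")

# Select every table in a hypothetical Oracle schema for migration.
table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-schema",
        "object-locator": {"schema-name": "SALES", "table-name": "%"},
        "rule-action": "include",
    }]
}

response = dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-dw-to-redshift",
    SourceEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE",    # placeholder
    TargetEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:TARGET",    # placeholder
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:INSTANCE",  # placeholder
    MigrationType="full-load-and-cdc",
    TableMappings=json.dumps(table_mappings),
)
print(response["ReplicationTask"]["Status"])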

Now that we know how the architecture looks, let’s take a look at the steps to take.

(more…)

Migrating to PostgreSQL Lifecycle

This is a guest post by David Rader, the vice president of product development at OpenSCG.

OpenSCG is an AWS Consulting Partner and a leading expert in PostgreSQL, helping customers migrate to, operate, and optimize PostgreSQL and other data platforms on-premises and in the cloud.

There is a lot of interest these days in migrating data from relational databases to the open source PostgreSQL. PostgreSQL is a great choice for migration, offering advanced features, high performance, rock-solid data integrity, and a flexible open source license. PostgreSQL is ANSI SQL compliant, supports drivers for nearly all development languages, and has a strong community of active committers and companies to provide support.

This post describes the overall approach for migrating an application from your current database to PostgreSQL.

Choosing a database or application

When you’re deciding which of your applications and databases to start with, you should consider several factors. If this is your company’s first PostgreSQL migration, you want to start with a less challenging, lower-risk migration to get your feet wet. Just as with any technology switch, don’t choose your mission-critical, 24/7, high-volume, main revenue generator as the first application you try to run in production on a technology you’re just learning. PostgreSQL is certainly capable of running that important application, but you should build up some experience with the technology before making a “bet-the-company” migration. If, for some reason, you do have to jump into the deep end right away, make sure that you hire experienced database administrators (DBAs) who can help you do it right the first time.

For database migrations, you need to think about the applications that use the database. In general, an in-house application or website that you control the source code for is easier to modify for a new database. A third-party commercial software product that doesn’t provide support for PostgreSQL is usually a non-starter.

Departmental applications or reporting data marts are lower-risk choices that give you an opportunity to introduce the new technology to your organization. Dashboards or extract, transform, load (ETL) tools that work against multiple databases through ODBC or JDBC connections are great candidates because of the very high compatibility and low modification requirements.

Migration assessment

As a first step in planning your migration, analyze the application and estimate how easy or difficult it will be to migrate it from your existing database to PostgreSQL. This “Migration Assessment” stage should cover several areas, including technology, personnel, and processes.

Technology issues

The technical aspect is the easiest to evaluate. The level of effort needed to migrate the data and database code is determined by the size and complexity of the current database, the type of data access layer that is used, how many database objects there are (functions, triggers, and stored procedures), and whether any database-specific functionality is used.

You should analyze the application code in addition to the database to understand what changes might be required in the application. For example, an application that has a lot of dynamic SQL scattered throughout the code requires more effort to convert than an application with a clean data access layer.

(more…)