Microsoft Workloads on AWS

How Dow Jones migrated and modernized its business-critical databases in one motion to gain effectiveness and cost benefits

As one of the largest business and financial news companies in the world, millions of people around the world rely on Dow Jones properties for their daily market data needs. The Market Data platform, one of Dow Jones’ most critical business systems, serves the real-time markets data to all of its news sites, including WSJ.com, Factiva.com, Barrons.com, and MarketWatch.com.

In 2019, Dow Jones made a strategic business decision to change its provider for all market data. The transition, which created cost benefits and business opportunities, required an ambitious timeline. To meet the timeline, the team determined that a shift from on premises to the AWS Cloud would allow the increased flexibility and rapid scaling capabilities needed to develop and test in parallel during the migration period.

This article explores the tools and best practices Dow Jones used to successfully migrate and modernize its business-critical database workloads for the Market Data platform from on-premises MSSQL to Amazon Aurora.

Dow Jones migrated and modernized its business-critical databases

About Market Data’s Platform Architecture

At peak capacity, Dow Jones processes more than 3,000 requests per second through the Market Data platform. The data ranges from official end-of-day pricing from exchanges to real-time pricing data, company information, and mapping data. The Dow Jones Market Data platform then ingests, normalizes, and surfaces this market data across Dow Jones. Given its strategic value to the business, the platform must be highly reliable, scalable, and fast, delivering real-time market data to consumers within milliseconds — with protection from downtime.

Prior to modernizing the platform, the system consisted of more than 44 separate applications running across 200 on-premises servers split into four main groups.

The core data storage layer of the platform was an on-premises 2 TB relational database management system built on Microsoft SQL Server 2008 R2. This database served as a central repository for everything market data-related, storing all data for the application system. For example, it housed data about each company that mapped directly back to its listing on each stock exchange. The server database also stored Market Data’s operations data for processing transactions, including all historical pricing data going back to the 1970s.

Some applications that manage and serve market data to consumers were centered around the SQL Server database. The migration to AWS enabled the retirement of these systems, including:

  • DJ Symbology System, which provided the actual mapping of “company codes” (e.g. Dow Jones Ticker symbol) to real-time or delayed quotes for U.S., Canadian, and global companies.
  • The Real-time Feed Processor group of applications, which were responsible for maintaining up-to-the-second data when markets were open and stocks were trading. These applications were connected to multiple sources, ingesting data, and integrating it into the Market Data platform.
  • The Intraday Systems were tasked with maintaining the market state. And finally, the client-facing APIs were migrated. These are the reporting systems used to generate XML files for clients and the print product, as well as the charting system and main Market Data API.

Each of these applications were responsible for a different part of the Market Data architecture, such as managing alerts, calculating indexes, serving pricing, and updating its query engine.

From an architectural perspective, the on-premises Market Data SQL Server database consisted of as many as 15 individual database server instances, distributed across two disparate data centers in the United States. Dow Jones used SQL Server Database Mirroring for increased availability and data protection in case of failure. There were four primaries mirrored across the two disparate data centers, while the rest of the server instances were split into 2 Distributor instances and 9 Subscribers instances. The primary principal server instance served the database to all clients from a single data center. In the meantime, the primary mirror instance acted as a warm standby server in case the primary server failed. A Distributor instance in each data center handled synchronous replication to local Subscriber nodes. All writes were processed by the principal server in the primary data center, while reads were processed by the 9 Subscribers instances distributed across both data centers.

Dow Jones on premise SQL Server database

Overview of Dow Jones’ Migration and Modernization

To enable the fastest time to production, the Dow Jones team used a lift-and-shift cloud migration strategy to migrate the core components of its Market Data application to AWS. This strategy allowed Dow Jones to accelerate the migration with minimal code changes — without redesigning the entire application. This approach would be transitional: once Dow Jones lifted and shifted onto AWS, it planned to modernize, moving to a cloud-native approach. AWS’ cloud services and optimizations provided significant benefits that made it easier to upgrade and rearchitect the system.

In order to manage licensing costs during this process, the company upgraded its existing on-premises 2 TB MS SQL Server database to a cloud-native database, enabling it to take full advantage of the reliability, scalability, manageability, and cost optimization this type of AWS database provides. As such, Market Data was rearchitected using Amazon Aurora. Amazon Aurora is a MySQL-compatible, relational database built for the cloud that combines the performance and availability of high-end commercial databases with the simplicity and cost effectiveness of open source databases.

According to Luke Sawatsky, Software Engineering Manager at Dow Jones, “migrating from on-premises SQL Server database to Amazon Aurora MySQL was an easy process that entailed several key stages.”

Stage 1: Convert database schema

Automatic schema conversion using AWS Schema Conversion Tool

Before moving its 2 TB databases from MS SQL Server to an Amazon Aurora MySQL target database, Dow Jones needed to create the target schema. To help in this process, the company used the AWS Schema Conversion Tool (AWS SCT) to help convert its existing database schema from MS SQL Server to Amazon Aurora MySQL.

As part of this process, Dow Jones generated the AWS SCT database migration assessment report. This report evaluated how much of the project could be completed by using the AWS Schema Conversion Tool and what else needed to be done to complete the conversion. The report was an extremely valuable tool as it summarized all of the schema conversion tasks and detailed the action items for the schema that could not be converted to the Aurora MySQL target DB instance.

Dow Jones Database Migration Assessment Report

After completing this analysis, Dow Jones found that 99.8% of its database storage objects (ex: schemas, tables, indexes, types, table type, etc.) and 52% of database code objects (ex: triggers, views, procedures, functions) could be converted automatically or with minimal changes using Amazon Aurora MySQL Compatible as a migration target. In addition, 97% of Dow Jones’ entire database schema could be converted to Amazon Aurora (MySQL compatible) automatically.

Manual schema conversion

While most of the migration work could be automated, there were some aspects that required manual intervention. The report flagged one database storage object and 38 database code objects with “significant actions” that would require manual intervention.

DJ Database Migration Assessment Report

For example, one of the significant database code object issues was related to Dow Jones’ use of global cursors:

Manual conversion Issue

As it turns out, the Aurora MySQL Cursors framework was simpler than SQL Server and provides only the basic types of servers. If Dow Jones’ code had relied on advanced cursor features, it would have needed an entire rewrite. Instead, Dow Jones was able to resolve this issue by using temporary tables.

Dow Jones encountered a second issue related to a SQL Server function. This function used a nested SQL statement to get a tree of records and its parent records. Since this recursion could not be recreated in MySQL, Dow Jones had to rewrite the SQL statement in C#.

For each conversion issue, Dow Jones modified the objects on the source SQL Server database so AWS SCT could convert these objects to the target Aurora MySQL database successfully. Using SCT, Dow Jones was able to double-check the assessment report after each iteration.

During this stage, several small POCs were run in parallel to validate that code changes were performing as expected. Dow Jones continued this process until it found no further conversion issues. In total, Dow Jones estimates it took one engineer a month to rewrite code for its stored procedures. Dow Jones found that fixing one code issue often fixed other issues simultaneously, reducing the overall time spent. Once all conversions were complete, Dow Jones applied the schema changes to the Aurora MySQL database and was ready for the next stage – the data migration.

Stage 2: Migrating the data with AWS Database Migration Service

By using AWS Database Migration Service (AWS DMS), Dow Jones was able to quickly and securely migrate its data continuously from the on-premises SQL Server to Amazon Aurora MySQL. The source database remained fully operational during the migration, which minimized downtime to the Market Data applications that relied on its database. When configuring the database migration task, Dow Jones chose the “migrate existing data and replicate ongoing changes” option. This ensured that AWS DMS would capture and apply changes, even after the bulk data had been loaded. AWS DMS managed the complexities of the full data migration, successfully completing the process within 24 hours. As a final step, the AWS DMS replication task was updated to “CDC only” to ensure both databases remained in sync until the final switchover.

Stage 3: Internal Testing Phase and AWS Well Architected Review

In its next stage of migration, Dow Jones kicked off an extensive period of internal testing to ensure that the Market Data platform was ready for production. During this period, Dow Jones collaborated with AWS experts to perform an AWS Well-Architected Review (WAR) of the platform. This critical step ensured that Dow Jones was using best practices to ensure operational excellence, security, reliability, and performance efficiency. To ensure success, Dow Jones made a strategic business decision to overprovision capacity rather than optimize for cost. Cost optimization was then slated as part of a post-migration initiative.

The Market Data WAR was instrumental in uncovering several key technical issues with the current AWS architecture that required immediate attention and remedying. More importantly, the Market Data team was able to walk away with an actionable plan detailing the following best practice recommendations:

  • Provision identical database instance types/sizes for Read Replicas and Writer node in the Aurora Global Cluster to protect the database in the event of a failure.
  • Balance the number of instances between primary and backup regions.
  • Set an appropriate DNS TTL (1 sec) for the application.
  • Scale horizontally to meet demand.
  • Implement automation to ensure the cluster never fails below a defined threshold.
  • Over-provision for peak and downsize quantity and/or instance class off-peak using automation/auto-scaling.
  • Execute full “at scale” load and resiliency testing of the Aurora database prior to the go-live date.

The final outcome: Dow Jones was able to successfully modernize its on-premises legacy SQL Server database to a scalable and resilient cloud-native database architecture. This cloud-native database featured an Amazon Aurora Global Cluster with one Writer node and five Reader nodes in Virginia (us-east-1) and six Reader nodes in Ohio (us-east-2).

Amazon Aurora Global Cluster

Stage 4: Cutover Aurora MySQL database to production

At this stage in the migration process, Dow Jones had two parallel database environments up and running for Market Data:

  1. A cloud-native Aurora MySQL database operating in test mode on AWS; and
  2. An on-premises MS SQL Server serving production data to clients.

Next, Dow Jones worked to facilitate the cutover to the Aurora MySQL without requiring every client to make a change to its endpoints. In order to meet this challenge, the Market Data team used a proxy service called NGINX to redirect client data requests from on premises to AWS. This ensured uninterrupted service for Dow Jones’ clients. Within 8 hours, proxies were installed, DNS entries were updated, and Market Data clients were successfully connected to the Aurora MySQL database on AWS. Two weeks later, the on-premises MS SQL Server database was officially shut down and all production traffic was hitting the Aurora MySQL database on AWS. The cutover to Aurora MySQL was complete.

Stage 5: Optimizing databases post-migration

Dow Jones’ migration journey did not end once it had migrated to AWS. There was a post-migration stage. During this stage, Dow Jones addressed cost optimization and rightsizing opportunities across its entire AWS infrastructure. Using Amazon CloudWatch metrics and CloudHealth management software to analyze data, Dow Jones quickly determined, based on current performance and usage requirements for the Market Data workload, that the database instance types/sizes it was using for Aurora MySQL were significantly overprovisioned. As a result of this analysis, Dow Jones took steps to rightsize the Aurora MySQL database instance type/size to better match current capacity requirements and eliminate two Reader nodes per region for further cost savings. After redeploying, Dow Jones was able to reduce its spend significantly.

Conclusion

Dow Jones needed to ensure its mission-critical Market Data platform remained secure, high-performing, resilient, and efficient. By migrating and modernizing from on-premises Microsoft SQL Server to Amazon Aurora, Dow Jones not only manifested cost benefits, they also benefitted from the cloud-native tooling. With features like Amazon Aurora’s Auto Storage Provisioning and replicas, it’s easier for Dow Jones to maintain and replicate their data.


AWS can help you assess how your company can get the most out of the cloud. Join the millions of AWS customers that trust us to migrate and modernize their most important applications in the cloud. To learn more on modernizing Windows Server or SQL Server, visit Windows on AWSContact us to start your migration journey today.

Sheri Moran

Sheri Moran

Sheri Moran is a Principal Solutions Architect with Amazon Web Services based in New York. She works with major financial service institutions, architecting, and modernizing their large-scale applications using AWS cloud services. She brings over 20 years of technology experience across multiple domains including finance, media, publishing, and healthcare.

Luke Sawatsky

Luke Sawatsky

Luke has been with Dow Jones for 13 years. Over the years he has held a few different positions on a few different teams, but he is currently Director of Engineering on the Market Data team. His team is responsible for obtaining, normalizing, and distributing market data across all the Dow Jones properties. Outside of work he enjoys spending time with his three children, reading, and being disappointed by Minnesota sports teams.

Alex Zuo

Alex Zuo

Alex Zuo is a Product Manager in Amazon EC2 Commercial Software Service team. He focuses on making AWS the best place to run SQL Server workloads and enabling customers to migrate and modernize SQL Server workloads on AWS.