Nasdaq Uses AWS to Pioneer Stock Exchange Data Storage in the Cloud
2020
Nasdaq is a multinational financial services and technology corporation that owns and operates the Nasdaq Stock Exchange. Nasdaq operates a total of 27 markets, a central securities depository, and a clearinghouse across a variety of asset classes in North America and Europe. It is home to nearly 4,000 listed companies globally across its markets and also provides its mission-critical technology to other market infrastructure operators located in 50 countries.
The Nasdaq Stock Exchange is the largest equities franchise globally by volume, and it manages the matching of buyers and sellers at high volume and velocity, while providing data feeding the price quote for stocks in electronically entered trades. Nasdaq relies on an internal application to capture and store all protected exchange data. “This data includes orders, quotes, trades, and cancellations,” says Robert Hunt, vice president of software engineering for Nasdaq. Every night, Nasdaq receives billions of records that need to be loaded for billing and reporting processes before the markets open the following morning.
As automated trading platforms have entered the market, the pace and volume of transactions have grown. In 2014, to increase scale and performance and lower operational costs, Nasdaq moved from a legacy on-premises data warehouse to an Amazon Web Services (AWS) data warehouse powered by an Amazon Redshift cluster. Between 2014 and 2018, this Amazon Redshift cluster grew to 70 nodes as the company expanded the solution to support all its North American markets. By 2018, the solution ingested financial market data from thousands of sources nightly, ranging from 30 billion to 55 billion records and surpassing 4 terabytes.
Over time, growth in data led to a change in approach for managing that data for analytics. The overnight batch processing that runs against the warehouse caused challenges in processing enormous volumes to meet stringent deadlines. Users rely on the data to complete billing, reporting, and surveillance. “When market volatility increased in early 2018, data volumes for the warehouse grew substantially, peaking at about 55 billion records per day in 2018,” says Hunt.
More sophisticated trading practices led to massive growth in data, and it was critical that Nasdaq start planning a new architecture so that it could keep meeting the performance standards and operational excellence that the ecosystem expects. “We have to both load and consume the 30 billion records in a time period between market close and the following morning. Data loading delayed the delivery of our reports,” says Hunt. “We needed to be able to write or load data into our data storage solution very quickly without interfering with the reading and querying of the data at the same time.”
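To give a sense of the scale behind that overnight window, the following is a rough back-of-the-envelope sketch, not Nasdaq's actual figures or tooling. The record counts (30 billion and 70 billion) come from this case study; the 12-hour window, the `required_rate` helper, and the `ASSUMED_WINDOW_HOURS` constant are illustrative assumptions, since the exact length of the load window is not stated.

```python
def required_rate(records_per_night: int, window_hours: float) -> float:
    """Average records per second needed to finish loading within the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return records_per_night / (window_hours * 3600)


# ASSUMPTION: a hypothetical 12-hour window between market close and the
# next morning. The case study does not give the real window length.
ASSUMED_WINDOW_HOURS = 12.0

# Record counts quoted in the case study.
NIGHTLY_VOLUMES = {
    "30 billion": 30_000_000_000,
    "70 billion": 70_000_000_000,
}

for label, records in NIGHTLY_VOLUMES.items():
    rate = required_rate(records, ASSUMED_WINDOW_HOURS)
    print(f"{label} records in {ASSUMED_WINDOW_HOURS:g} h: "
          f"about {rate:,.0f} records/s on average")
```

Under this assumed window, the sustained load rate is in the hundreds of thousands to millions of records per second, which is why contention between loading and querying matters so much.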
“We were able to easily support the jump from 30 billion records to 70 billion records a day because of the flexibility and scalability of Amazon S3 and Amazon Redshift.”
Robert Hunt
Vice President of Software Engineering, Nasdaq
Using AWS Services for Flexibility, Scalability, and Performance
In 2018, Nasdaq chose to build the foundation of a new data lake on Amazon Simple Storage Service (Amazon S3), which enables the company to separate compute and storage and to scale each function independently. In traditional data warehouse deployments, scaling storage capacity often requires companies to scale compute capacity at the same time because the application and storage are tightly linked, with onsite hardware modifications needed for any change to the ratio of the two. “In addition to the flexibility that comes with separation of compute and storage, Amazon S3 has better scaling properties in terms of writing and reading large datasets at the same time,” Hunt says. “Amazon S3 gave us a solution that enables zero contention between data loading and querying processes.”
What began as a performance-focused solution has become a multi-use data lake shared between teams, creating additional benefit for the business.
Scaling to Support 70 Billion Records a Day
Loading Market Data for Reporting 5 Hours Faster
About Nasdaq
Benefits of AWS
- Ingests 70 billion records per day
- Loads financial market data 5 hours faster
- Runs Amazon Redshift queries 32 percent faster
- Enables business transformation with shared data
- Spurs innovation with additional use cases
AWS Services Used
Amazon Simple Storage Service
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.
Amazon Redshift
Amazon Redshift gives you the best of high-performance data warehouses with the unlimited flexibility and scalability of data lake storage.
AWS Identity and Access Management
AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely.
Amazon S3 Glacier
Amazon S3 Glacier and S3 Glacier Deep Archive are secure, durable, and extremely low-cost Amazon S3 cloud storage classes for data archiving and long-term backup.