AWS Marketplace

Analyzing impact of regulatory reform on the stock market using AWS and Refinitiv data

Every change in regulation has intended and unintended consequences for the stock market. By analyzing historical data, you can better estimate future impacts. In this blog post, Boris, Pramod, and Alex use Amazon Elastic MapReduce (Amazon EMR) to analyze historical market data provided by Refinitiv, part of London Stock Exchange Group (LSEG), on AWS Data Exchange.

Experiment

In December 2020, the US Securities and Exchange Commission (SEC) approved changes to its market data rules. These changes require the highest bid and lowest ask prices (top-of-book) feeds to include quotes for order amounts smaller than the normal unit of trading for that particular asset (odd-lot quotes). They also redefine the round-lot size to better reflect trading in smaller increments. More recently, in December 2022, the SEC proposed to accelerate the implementation of these changes and to introduce additional odd-lot information to the top-of-book feeds.

The key questions we attempt to answer are:

  • What is the odd-lot volume as a share of US equity trading (share count and dollar value)?
  • How does odd-lot volume change as share price increases? Do we see a change in odd-lot trading when a stock splits?
  • What’s the impact of the reform on transaction costs (measured by quoted spread)?
  • What impact will the approved odd-lot reform have on market data volumes?

Answering these questions requires high-fidelity market data with nanosecond precision and an infrastructure capable of processing billions of data points cost-effectively.

Terminology

  • Odd-lot: An odd lot is an order amount for a security that is less than the normal unit of trading for that particular asset. Odd lots are considered to be anything less than the standard 100 shares for stocks. Trading commissions for odd lots are generally higher on a percentage basis than those for standard lots since most brokerage firms have a fixed minimum commission level for undertaking such transactions.
  • Order book: An order book is an electronic list of buy and sell orders for a specific security or financial instrument organized by price level.
  • Bid price (bid): The price attached to an order to buy
  • Ask price (ask): The price attached to an order to sell
  • Mid-price (mid): Arithmetic average of bid and ask
  • Top of book: The highest bid and lowest ask prices
  • Market microstructure: A branch of finance concerned with the details of how exchange occurs in markets.
  • Quoted spread: The difference between the lowest ask and highest bid, expressed as a percentage of mid
  • PCAP: Network packet capture
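
To make the terms above concrete, the following minimal Python sketch builds a tiny order book and derives its top of book and mid-price. The orders, prices, and function names are illustrative assumptions and do not come from the Refinitiv dataset.

```python
from collections import defaultdict

# Hypothetical resting orders: (side, price, shares). Values are made up.
ORDERS = [
    ("bid", 99.98, 200),
    ("bid", 99.99, 40),
    ("bid", 99.99, 100),
    ("ask", 100.02, 300),
    ("ask", 100.01, 25),
]


def build_book(orders):
    """Aggregate orders into an order book: total shares per side and price level."""
    book = {"bid": defaultdict(int), "ask": defaultdict(int)}
    for side, price, shares in orders:
        book[side][price] += shares
    return book


def top_of_book(book):
    """Return (highest bid price, lowest ask price)."""
    return max(book["bid"]), min(book["ask"])


def mid_price(bid, ask):
    """Arithmetic average of bid and ask."""
    return (bid + ask) / 2


book = build_book(ORDERS)
best_bid, best_ask = top_of_book(book)
print(best_bid, best_ask, mid_price(best_bid, best_ask))
```

In this made-up book the lowest ask is backed by only 25 shares, which makes it an odd-lot quote of the kind the reform would add to top-of-book feeds.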

Experimentation methodology

  • We used Refinitiv’s Tick History – PCAP dataset, available on AWS Data Exchange. This dataset contains raw PCAP data reformatted into Apache Parquet files and stored in Amazon Simple Storage Service (Amazon S3).
  • To accelerate time-to-market and minimize undifferentiated heavy lifting, we licensed this third-party data on AWS Data Exchange. We accessed the data directly in Refinitiv’s S3 buckets using the recently announced AWS Data Exchange for Amazon S3 (Preview), which eliminates the need to move hundreds of terabytes of data back and forth.
  • Because we had to analyze hundreds of terabytes of data, we chose Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS) to enable analysis at scale while keeping costs down with Spot Instances.

Architecture diagram

The following diagram shows our resulting architecture:

  1. Data providers like Refinitiv use AWS Data Exchange for Amazon S3 (Preview) to license direct access to data sets stored in their S3 buckets without needing to create and manage copies.
  2. A provisioned Amazon EMR on EKS cluster is configured to access the shared dataset.
  3. We use the managed notebook environment provided by Amazon EMR Studio to author Apache Spark jobs and run them on the Amazon EMR on EKS cluster.
  4. Job results are durably stored in an Amazon S3 bucket in Apache Parquet format, which is optimized for data processing engines like Amazon EMR, Amazon Redshift, or Amazon Athena.
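
Steps 2 through 4 can be sketched as a small PySpark job. This is a hedged illustration, not the job we ran: the bucket names, the `date=YYYY-MM-DD` path layout, the `exchange` column, and all function names are hypothetical, and the sketch assumes a notebook session that already provides a SparkSession named `spark`.

```python
from datetime import date, timedelta


def weekdays(start, end):
    """Return the Monday-to-Friday dates from start to end, inclusive (holidays not handled)."""
    days = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            days.append(current)
        current += timedelta(days=1)
    return days


def input_paths(base_uri, days):
    """Build one S3 path per day; the date=YYYY-MM-DD layout is an assumption."""
    return [f"{base_uri}/date={day.isoformat()}/" for day in days]


def count_trades_by_exchange(spark, paths, output_uri):
    """Read the shared Parquet files in place and write a small summary as Parquet."""
    trades = spark.read.parquet(*paths)
    summary = trades.groupBy("exchange").count()  # "exchange" is a hypothetical column
    summary.write.mode("overwrite").parquet(output_uri)
    return summary


# Example call from a notebook that already provides a SparkSession named `spark`:
# days = weekdays(date(2022, 7, 13), date(2022, 7, 20))
# paths = input_paths("s3://example-provider-bucket/trades", days)
# count_trades_by_exchange(spark, paths, "s3://example-results-bucket/trades-by-exchange/")
```

The job reads from the provider's bucket and writes only the aggregated result to our own bucket, which is the point of the no-copy access pattern.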

Figure: Amazon EMR on EKS accesses data directly in the data provider's S3 bucket, using the AWS Data Exchange for Amazon S3 functionality launched at AWS re:Invent 2022.

For this blog post, we analyzed top-of-book, trade, and full-depth data from 16 US equity exchanges for the July 13, 2022 – July 20, 2022 timeframe.

We chose the Amazon EMR on EKS deployment option for Amazon EMR, together with no-copy data access through AWS Data Exchange for Amazon S3 (Preview), for the following reasons:

  1. Out-of-the-box ability to analyze at scale. We processed about 60 billion data points.
  2. Ease of package management via Docker.
  3. Fast scaling up and down in proportion to the workload, so you pay only for what you use.
  4. Apache Spark on Amazon EMR tolerates interruptions, which enables deployment on Spot Instances and about 80 percent cost savings over an On-Demand fleet.
  5. Collaborative notebook environment using Amazon EMR Studio.
  6. No need for “undifferentiated heavy lifting” related to data movement, optimization of storage, and egress costs.

Experimentation findings

1.    Odd-lot trading in 2022 continues to be a major component of US trading

In the three years since the SEC’s analysis, odd-lot trading continues to be a major component of US equity trading. The analysis yielded the following data for the period analyzed:

  • 55 percent of trades were odd-lots across all US equity securities.
  • From a share volume perspective, 7.6 percent of the traded share volume across all US equity securities was in odd-lots. Odd-lot trading volume was split between on-exchange trading, accounting for 80.46 percent of share volume, and off-exchange trading, accounting for 19.53 percent.
  • When examined from a notional value perspective, odd-lot trades represent 16 percent of notional value across all US equity securities.

There is a question as to whether odd-lot trading is impactful, given the lower notional value of low-quantity trades. Our analysis shows that although the odd-lot share of notional value is lower than the odd-lot share of trade count, it is still significant: 16 percent of notional value represents $75 billion on an average daily basis.
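
The three measures above (trade count, share volume, and notional value) can diverge widely for the same set of trades. The following minimal sketch shows how each share could be computed. The trades are made up, the function name is hypothetical, and the fixed 100-share round lot follows the terminology section.

```python
ROUND_LOT = 100  # standard round lot for stocks, per the terminology section

# Hypothetical trades: (price in dollars, shares). Values are made up.
TRADES = [
    (50.0, 100),
    (50.0, 400),
    (50.0, 10),
    (2000.0, 5),
    (2000.0, 2),
]


def odd_lot_shares(trades, round_lot=ROUND_LOT):
    """Return the odd-lot fraction of trade count, share volume, and notional value."""
    totals = {"count": 0, "volume": 0, "notional": 0.0}
    odd = {"count": 0, "volume": 0, "notional": 0.0}
    for price, shares in trades:
        row = {"count": 1, "volume": shares, "notional": price * shares}
        for key, value in row.items():
            totals[key] += value
            if shares < round_lot:  # odd lot: fewer shares than one round lot
                odd[key] += value
    return {key: odd[key] / totals[key] for key in totals}


result = odd_lot_shares(TRADES)
for key, fraction in result.items():
    print(f"{key}: {fraction:.1%}")
```

In this toy sample, odd-lots make up most of the trade count but a small part of share volume, with notional value in between because the odd-lot trades cluster in the high-priced stock. This is the same ordering as in the findings above.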

2.    Odd-lot vs. Round-lot quoted spread

We looked at quoted spreads to see how narrow spreads would be if odd-lots were used to calculate the National Best Bid and Offer (NBBO). To do this, we defined the Quoted Spread – Round-lot as the quoted spread calculated using the round-lot NBBO as published on the top-of-book feeds. The Quoted Spread – Odd-lot is the quoted spread calculated from a synthetic NBBO built from the direct exchange feeds, which have odd-lots at the top of the book.

We calculated the quoted spread as the difference between the lowest ask price and the highest bid price, as a proportion of the midpoint price: quoted spread = (lowest ask – highest bid) / midpoint price. By defining the quoted spread this way, we can normalize the spread across a wide range of share prices. The results confirm that for stocks priced $250 – $10,000, spreads would be narrower if odd-lots were included in the NBBO. It is worth noting that the quantity available at these narrower prices will, by definition, be less than 100 shares. The following table summarizes these results.

Share Price Bucket      Odd-lot Spread (basis points)   BBO Spread (basis points)   Spread Improvement (%)
$0 – $250               13.8                            14.6                        5.6%
$250.01 – $1,000        5.3                             6.7                         20.6%
$1,000.01 – $10,000     6.2                             8.0                         22.1%
$10,000+                11.3                            11.3                        0.0%
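
The comparison behind this table can be sketched as follows. This is a simplified illustration under stated assumptions: the quotes are made up, the round-lot NBBO is approximated by ignoring quotes below 100 shares, and the improvement is taken as the relative reduction against the round-lot spread. All function names are hypothetical.

```python
# Hypothetical top-of-book quotes from three exchanges: (exchange, side, price, shares).
QUOTES = [
    ("X1", "bid", 499.90, 100),
    ("X2", "bid", 499.95, 20),
    ("X1", "ask", 500.10, 200),
    ("X3", "ask", 500.05, 30),
]


def nbbo(quotes, min_shares=1):
    """Best bid and offer across exchanges, ignoring quotes smaller than min_shares."""
    bids = [price for _, side, price, size in quotes if side == "bid" and size >= min_shares]
    asks = [price for _, side, price, size in quotes if side == "ask" and size >= min_shares]
    return max(bids), min(asks)


def quoted_spread_bps(bid, ask):
    """(lowest ask - highest bid) / midpoint price, expressed in basis points."""
    mid = (bid + ask) / 2
    return (ask - bid) / mid * 10_000


round_lot_spread = quoted_spread_bps(*nbbo(QUOTES, min_shares=100))
odd_lot_spread = quoted_spread_bps(*nbbo(QUOTES, min_shares=1))
# Assumed definition: relative reduction versus the round-lot spread.
improvement_pct = (round_lot_spread - odd_lot_spread) / round_lot_spread * 100
print(round(round_lot_spread, 2), round(odd_lot_spread, 2), round(improvement_pct, 1))
```

With these made-up quotes, the spread narrows from about 4 basis points to about 2 once odd-lot quotes are allowed to set the best bid and offer.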

3.    Higher share prices increase odd-lot trading

We expected odd-lot trading to be higher in high-priced stocks as a percentage of all trading; our analysis confirmed that, as the following table shows.

Share Price Bucket      Trade Count (odd-lot only)   Share Volume (odd-lot only)   Notional Value (odd-lot only)
$0 – $250               54.38%                       7.31%                         13.39%
$250.01 – $1,000        76.23%                       20.02%                        21.57%
$1,000.01 – $10,000     92.93%                       46.97%                        47.46%

We also observed the impact a stock split has on odd-lot trading in an individual stock. Take, for example, the AMZN stock split on June 6, 2022. As the following graph shows, odd-lot trades made up around 95 percent of trade counts before the split and went down to around 70 percent after it. The first three columns of the graph cover June 1 to June 3, 2022, at around 95 percent, and the next three cover June 6 to June 8, 2022, at around 70 percent.

Figure: Odd-lot trade counts for AMZN were around 95 percent before the June 6, 2022 stock split and around 70 percent after it.
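
One mechanism behind this drop can be illustrated with a short sketch. It assumes that traders keep the same dollar exposure across the split, so every pre-split share quantity is multiplied by the split ratio. The trade sizes and function names are hypothetical. The 20-for-1 ratio of the AMZN split is not stated above and is included here as background.

```python
def classify_lot(shares, round_lot=100):
    """Label a trade size as 'odd' (below one round lot) or 'round-or-mixed'."""
    return "odd" if shares < round_lot else "round-or-mixed"


def shares_after_split(shares, split_ratio):
    """Scale a share quantity by a forward split ratio (20 for a 20-for-1 split)."""
    return shares * split_ratio


SPLIT_RATIO = 20  # AMZN's June 2022 split was 20-for-1
PRE_SPLIT_SIZES = [1, 3, 5, 10, 50]  # hypothetical pre-split trade sizes, all odd lots

for size in PRE_SPLIT_SIZES:
    after = shares_after_split(size, SPLIT_RATIO)
    print(size, classify_lot(size), "->", after, classify_lot(after))

odd_after = sum(
    classify_lot(shares_after_split(size, SPLIT_RATIO)) == "odd" for size in PRE_SPLIT_SIZES
)
```

All five hypothetical trades are odd-lots before the split, while only the two smallest remain odd-lots afterwards. That matches the direction of the change in the graph, though not necessarily its magnitude.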

4.    Impact of odd lots on the market data volumes

The SEC’s approved odd-lot reform will add odd-lot quotes to top-of-book data feeds and change the definition of the round lot as follows:

Share price              Round lot size
$250 and less            100 shares
$250.01 to $1,000        40 shares
$1,000.01 to $10,000     10 shares
$10,000.01 and more      1 share
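
The tiers above translate into a simple lookup. The following sketch is illustrative only: the function names are hypothetical, and it applies the thresholds to whatever price is passed in, without modeling how the reference price is determined.

```python
def round_lot_size(price):
    """Round-lot size in shares for a given share price, following the tiers above."""
    if price <= 250:
        return 100
    if price <= 1_000:
        return 40
    if price <= 10_000:
        return 10
    return 1


def is_odd_lot(price, shares, reformed=True):
    """True if `shares` is smaller than one round lot at this price.

    With reformed=False, the current fixed 100-share round lot is used.
    """
    lot = round_lot_size(price) if reformed else 100
    return shares < lot


print(is_odd_lot(500.0, 50, reformed=False), is_odd_lot(500.0, 50, reformed=True))
```

A 50-share quote in a $500 stock is an odd-lot under the current 100-share definition but not under the new 40-share tier, while a 30-share quote in the same stock is an odd-lot under both.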

The SEC’s odd-lot reform will affect not only trading but also market data volumes. For this analysis, we measure the increase in market data volume by counting the messages in the current direct exchange feeds where the top of book is an odd lot under the SEC’s new round-lot definition.

The following table shows the average daily number of round-lot quotes published on the top-of-book market data feeds and the number of top-of-book odd-lot quotes across each of the direct exchange feeds for the period we analyzed. The number of odd-lot quotes at the top of book provides a good estimate of the number of additional market data messages that we can expect to see.

Based on our analysis, if the odd-lot reform had been in effect during the period we analyzed, we would have seen an average increase of 23.1 percent in the number of market data messages. Refer to the following table.

Market Participant    Odd-lot message count   Current message count   Expected increase (%)
CBOE BZX              1,268,341,161           3,841,608,149           33.0%
CBOE BYX              138,960,542             1,434,075,392           9.7%
NYSE Chicago          12,654,732              598,255,696             2.1%
CBOE EDGA             222,622,808             2,166,583,402           10.3%
CBOE EDGX             855,038,497             3,059,904,090           27.9%
EPRL                  299,384,118             931,800,249             32.1%
INVESTORS EXCHANGE    592,643,942             1,914,748,061           31.0%
MEMX                  435,908,353             4,053,960,717           10.8%
NASDAQOMX             3,049,132,913           4,844,106,572           62.9%
NASDAQOMX BX          103,108,040             1,446,105,352           7.1%
NASDAQOMX PSX         283,261,332             1,177,685,027           24.1%
NYSE National         68,042,412              414,884,658             16.4%
NYSE                  1,235,950,826           3,337,435,792           37.0%
NYSE Arca             1,372,067,301           4,377,824,695           31.3%
NYSE American         103,842,566             1,017,413,012           10.2%
Average                                                               23.1%
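
The percentages in this table can be recomputed from the two message-count columns. In the sketch below, the counts are copied from the table and the variable and function names are hypothetical. The Average row appears to be the unweighted mean of the per-exchange percentages, which the sketch reproduces. It also computes a pooled ratio over all feeds for comparison.

```python
# (odd-lot message count, current message count), copied from the table above.
MESSAGE_COUNTS = {
    "CBOE BZX": (1_268_341_161, 3_841_608_149),
    "CBOE BYX": (138_960_542, 1_434_075_392),
    "NYSE Chicago": (12_654_732, 598_255_696),
    "CBOE EDGA": (222_622_808, 2_166_583_402),
    "CBOE EDGX": (855_038_497, 3_059_904_090),
    "EPRL": (299_384_118, 931_800_249),
    "INVESTORS EXCHANGE": (592_643_942, 1_914_748_061),
    "MEMX": (435_908_353, 4_053_960_717),
    "NASDAQOMX": (3_049_132_913, 4_844_106_572),
    "NASDAQOMX BX": (103_108_040, 1_446_105_352),
    "NASDAQOMX PSX": (283_261_332, 1_177_685_027),
    "NYSE National": (68_042_412, 414_884_658),
    "NYSE": (1_235_950_826, 3_337_435_792),
    "NYSE Arca": (1_372_067_301, 4_377_824_695),
    "NYSE American": (103_842_566, 1_017_413_012),
}


def expected_increase_pct(odd_lot_count, current_count):
    """Odd-lot messages as a percentage of the current message count."""
    return odd_lot_count / current_count * 100


increases = {name: expected_increase_pct(*counts) for name, counts in MESSAGE_COUNTS.items()}

# Unweighted mean of the per-exchange percentages (matches the Average row).
simple_average = sum(increases.values()) / len(increases)

# Pooled ratio: total odd-lot messages over total current messages.
total_odd = sum(odd for odd, _ in MESSAGE_COUNTS.values())
total_current = sum(current for _, current in MESSAGE_COUNTS.values())
pooled_increase = total_odd / total_current * 100

print(round(simple_average, 1), round(pooled_increase, 1))
```

The pooled figure differs from the simple average because feeds with more messages carry more weight in it.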

Conclusion

In this blog post, Boris, Pramod, and Alex used high-fidelity market data from Refinitiv to determine the following:

  1. Investors are expected to see a reduction in the top-of-book bid-ask spread, ranging from about 5 percent to about 20 percent, for stocks priced up to $1,000 per share. Note that the full benefit will be realized only for odd-lot trade sizes.
  2. Market data volumes are expected to increase on all analyzed exchanges, from about 2 percent to about 63 percent, with an average of 23.1 percent.

We further showed how AWS Data Exchange for Amazon S3 (Preview) and Amazon EMR enabled us to eliminate undifferentiated heavy lifting and analyze terabytes of data cost-effectively by deploying Amazon EMR on Spot Instances.

About the authors

Pramod Nayak is the Director of Product Management in the Low Latency Group of LSEG. He focuses on the software, data, and platform products for the low-latency market data industry. Pramod is a former software engineer and is passionate about market data and quantitative trading.
Alex Tarasov is a Senior Solutions Architect working with Fintech Startup customers, helping them to design and run their data workloads on AWS. He is a former data engineer and is passionate about all things data.
Boris Litvin is a Principal Solution Architect, responsible for Financial Services industry innovation. He is a former Quant and FinTech founder, passionate about quantitative trading and data science.