AWS for Industries

Healthcare Transparency in Coverage Rule (TCR) – Cost-effectively Hosting Machine-Readable Files On AWS

by Umair Khalid, Dana Bullukian, Gokhul Srinivasan, Keith Wood, and Suresh Patnam | in Healthcare

The Transparency in Coverage Rule (TCR) in the United States requires health insurers (payors) to disclose information on their pricing agreements with healthcare providers. There are two phases to this regulation:

  • Phase 1, which must be complete by July 1, 2022, requires that payors publicly publish machine-readable files (JSON, XML, or Parquet) for each plan that they offer.
  • Phase 2, which must be complete by January 1, 2023, requires that payors create a user-friendly tool that their subscribers can use to shop for procedures across providers.

The tool must initially expose the pricing for a list of 500 procedure types that the regulatory departments have specified. The requirement expands later to provide consumers access to all procedure types. The initial implementation guidance released by the Centers for Medicare & Medicaid Services (CMS) for the structure of the machine-readable files would have required payors to publish petabytes of data files each month, largely because of data duplication.

In addition, the regulation specifically calls out that there can’t be any barriers erected to prevent anyone from accessing the files, and that payors cannot capture personal information as a part of the process.  These issues caused significant concern amongst insurers that the hosting, storage, and data egress cost estimates provided by the government may have been inaccurate.

Since the initial release of the implementation guidance, CMS has been making improvements and has published v1.0 of the schema for the machine-readable files. The most consequential change has been the addition of Table of Contents files, which eliminated most of the duplication and reduced the size of the dataset from multiple petabytes to multiple gigabytes.

Throughout this time, AWS has been working with health insurance companies across the United States to create standard approaches to fulfill these regulatory requirements in a secure manner while providing ways to keep data egress costs low. This blog provides implementation guidance for insurers to meet the Phase 1 machine-readable files requirement of TCR.  We will publish Phase 2 guidance soon.

What do payors need to do to achieve CMS compliance?

There are two distinct bodies of work required for payors to meet the TCR requirements: data generation and data delivery.

Data generation requires that each payor aggregate and curate the data to produce machine-readable files.  The process will be unique to each payor and will depend upon internal systems, processes, integration, and systems of record.

For data delivery, payors are required to build a mechanism to deliver these machine-readable files to any parties interested in pricing data.  These files must be accessible to everyone, but the likely consumers of this data are data analytics firms, government entities, and other payors.

Solution Overview

AWS provides a well-architected way for payors to meet the TCR requirements without the need to create and manage complex on-premises infrastructures.

The creation of the machine-readable files during the data generation process will be different for each payor. Payors who have their data in AWS data lakes or Amazon Redshift will be able to create the machine-readable files using extract, transform, and load (ETL) processes with AWS Glue. Others that manage data in on-premises datacenters using mainframes or other technology may decide to use existing on-premises ETL processes to create the files, and then send the files to Amazon S3 for public availability.

Our solution focuses on data delivery. It incorporates Amazon S3 with Lifecycle Management and the Amazon CloudFront CDN with an AWS WAF web application firewall, configured across multiple regions. Using these components allows payors to keep storage costs low while maintaining high availability and providing a way to block downloads by bad actors.

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance.  Customers of all sizes and industries can use Amazon S3 to store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics.

Amazon CloudFront is a Content Delivery Network (CDN) that accelerates distribution of your static and dynamic web content. In the case of TCR, payors will be publishing compressed JSON files or Parquet files. CloudFront delivers your content through a worldwide network of data centers called edge locations. When a user requests content served with CloudFront, the request is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.

Architecture Diagram:

With Amazon S3, payors have unlimited capacity to store structured or unstructured data in files of any type, and many AWS customers use it to create a data lake.  This diagram depicts data copied from sources such as an on-premises datacenter or another AWS account, and placed into an Amazon S3 bucket.  If this data needs additional processing or validation, payors can utilize AWS Lambda or AWS Glue to create the final version of the files and place them into another Amazon S3 bucket.  From there, an Amazon CloudFront Distribution can make the files publicly available in a secure manner with an attached AWS WAF to prevent unwanted requests.  Payors can put links to the files on their web site to meet the TCR requirements.
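One way the attached AWS WAF could limit unwanted requests is a rate-based rule. The following is a minimal sketch that only builds the rule definition in the shape used by the AWS WAFv2 API; it does not call AWS. The rule name and the request limit are hypothetical values chosen for illustration, and any real limit would need to be high enough that legitimate bulk consumers are not blocked, given the TCR requirement that no barriers to access be erected.

```python
import json


def rate_limit_rule(name: str, request_limit: int, priority: int = 0) -> dict:
    """Build a WAFv2 rate-based rule that blocks a source IP once it
    exceeds request_limit requests within the rule's evaluation window."""
    return {
        "Name": name,
        "Priority": priority,
        "Statement": {
            "RateBasedStatement": {
                "Limit": request_limit,
                "AggregateKeyType": "IP",  # count requests per source IP
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }


# Hypothetical name and limit, for illustration only.
rule = rate_limit_rule("tcr-mrf-rate-limit", 2000)
print(json.dumps(rule, indent=2))
```

A rule like this would be added to the Rules list of a web ACL. Web ACLs that protect CloudFront distributions use the CLOUDFRONT scope and are created in the us-east-1 region.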

The public-facing files should use compression, which can reduce file size by up to 90% and can help to keep egress costs under control. If the file format is JSON, use GZIP. Data analysts tend to prefer Apache Parquet in this situation, and that format has built-in compression.
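To get a feel for the effect of GZIP on JSON, the following sketch compresses a synthetic payload using only the Python standard library. The record fields and values are hypothetical placeholders, not the CMS schema, and the savings on real pricing files will depend on how repetitive the data is.

```python
import gzip
import json


def compress_json(records: list) -> tuple:
    """Serialize records to compact JSON and return (raw, gzipped) bytes."""
    raw = json.dumps(records, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw, compresslevel=9)


# Hypothetical, repetitive sample rows standing in for pricing data.
sample = [
    {"code": str(10000 + i % 500), "provider": i, "rate": 125.0}
    for i in range(10_000)
]

raw, packed = compress_json(sample)
saved = 1 - len(packed) / len(raw)
print(f"raw={len(raw)} bytes, gzip={len(packed)} bytes, saved={saved:.0%}")

# Round trip: a consumer decompresses and parses the same records.
restored = json.loads(gzip.decompress(packed))
```

Running the same measurement on a sample of real output before publishing gives a more realistic basis for egress cost estimates than a generic percentage.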

The TCR regulation specifically calls out that the files must always be available.  Any downtime could put payors at risk of fines from the government.  Therefore, we encourage payors to leverage multiple regions for storing files that are ready for public delivery.
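One possible way to use a second region is a CloudFront origin group that fails over from a primary origin to a secondary one. The sketch below builds only the OriginGroups fragment of a CloudFront distribution configuration; it does not call AWS. It assumes two S3 buckets in different regions that are kept in sync by some other mechanism (for example, S3 replication) and that are already defined as origins. The group and origin IDs are hypothetical, and origin failover is an assumption made here for illustration rather than a design prescribed by this post.

```python
def failover_origin_group(group_id: str, primary_id: str, secondary_id: str) -> dict:
    """Build the OriginGroups section of a CloudFront DistributionConfig.

    CloudFront retries a request against the secondary origin when the
    primary origin returns one of the listed HTTP status codes.
    """
    failover_codes = [500, 502, 503, 504]
    return {
        "Quantity": 1,
        "Items": [
            {
                "Id": group_id,
                "FailoverCriteria": {
                    "StatusCodes": {
                        "Quantity": len(failover_codes),
                        "Items": failover_codes,
                    }
                },
                "Members": {
                    "Quantity": 2,
                    "Items": [
                        {"OriginId": primary_id},    # tried first
                        {"OriginId": secondary_id},  # used on failover
                    ],
                },
            }
        ],
    }


# Hypothetical IDs for two origins backed by buckets in different regions.
origin_groups = failover_origin_group("mrf-failover", "mrf-us-east-1", "mrf-us-west-2")
```

A cache behavior would then reference the group ID as its target origin, so that requests fall back to the second region when the first returns a server error.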

AWS publishes CloudFormation template examples that you can use to get started with creating your solution.  Check out the S3 Website with CloudFront Distribution template in the AWS GitHub template library.

AWS provides a comprehensive list of services that enables organizations to closely manage costs.  By using AWS Budgets along with AWS Cost Explorer and Amazon CloudWatch, payors can create alerts to track when data egress is larger than anticipated, giving them an opportunity to set new WAF rules to prevent abuse.
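As an illustration of an egress alert, the sketch below builds the parameters for an Amazon CloudWatch alarm on the CloudFront BytesDownloaded metric. It only constructs the dictionary; it does not call AWS. The distribution ID, threshold, and SNS topic ARN are hypothetical placeholders, and a daily sum is just one reasonable choice of alarm window.

```python
def egress_alarm_params(distribution_id: str, max_gib_per_day: float, sns_topic_arn: str) -> dict:
    """Build CloudWatch alarm parameters that fire when a CloudFront
    distribution serves more than max_gib_per_day in a single day."""
    return {
        "AlarmName": f"cloudfront-egress-{distribution_id}",
        "Namespace": "AWS/CloudFront",
        "MetricName": "BytesDownloaded",
        "Dimensions": [
            {"Name": "DistributionId", "Value": distribution_id},
            {"Name": "Region", "Value": "Global"},
        ],
        "Statistic": "Sum",
        "Period": 86400,  # one day, in seconds
        "EvaluationPeriods": 1,
        "Threshold": max_gib_per_day * 1024**3,  # GiB converted to bytes
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


# Hypothetical identifiers and threshold, for illustration only.
params = egress_alarm_params(
    "EDFDVBD6EXAMPLE",
    500,
    "arn:aws:sns:us-east-1:123456789012:example-egress-alerts",
)
```

With boto3, a dictionary like this could be passed as keyword arguments to the CloudWatch client's put_metric_alarm call. CloudFront metrics are published in the us-east-1 region, so the alarm would be created there.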

In addition to Cost Explorer and other management related features to help control costs, you can take advantage of the following capabilities and resources.

  • Amazon CloudFront Price Class: In most cases the audiences of these downloads will be in the United States. Selecting Price Class 100 for Amazon CloudFront limits edge locations to North America and Europe. This provides a lower price point for data served through Amazon CloudFront than the other price classes, which are more global in nature.
  • Lifecycle Management: S3 allows you to set up lifecycle policies to archive or delete files after the appropriate amount of time, so that you only pay for files that are being used. Depending on your retention policy, you may be able to use other storage classes that will further reduce costs. You can read more about storage classes and lifecycle policies in the Amazon S3 documentation.
  • Private Pricing: If you anticipate a large amount of egress traffic from Amazon CloudFront, contact your AWS account team and ask if private pricing is available. Depending on the amount of traffic you expect, you may be able to reduce egress costs by as much as 80%.
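The price class and lifecycle options above can be sketched as configuration fragments. The example below builds an S3 lifecycle configuration in the shape accepted by the S3 API, plus the CloudFront price class setting; it does not call AWS. The prefix and day counts are hypothetical and would need to match each payor's own retention policy.

```python
def lifecycle_config(prefix: str, ia_after_days: int, expire_after_days: int) -> dict:
    """Build an S3 lifecycle configuration that moves objects under prefix
    to the STANDARD_IA storage class, then deletes them later."""
    if ia_after_days < 30:
        # S3 requires objects to be at least 30 days old before
        # a lifecycle transition to STANDARD_IA.
        raise ValueError("ia_after_days must be at least 30")
    if expire_after_days <= ia_after_days:
        raise ValueError("expire_after_days must be greater than ia_after_days")
    return {
        "Rules": [
            {
                "ID": f"archive-then-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical prefix and retention periods, for illustration only.
config = lifecycle_config("published/", ia_after_days=60, expire_after_days=365)

# CloudFront DistributionConfig fragment selecting the lowest-cost price class.
price_class = {"PriceClass": "PriceClass_100"}
```

With boto3, the lifecycle dictionary could be passed as the LifecycleConfiguration argument of the S3 client's put_bucket_lifecycle_configuration call.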

Amazon S3 provides integration with many preferred analytics platform partners. You can leverage your existing partner tools to accelerate CMS compliance.


There may be additional components available that can simplify your TCR implementation.  Our account teams are happy to dive into this regulation and recommend architecture approaches with any health insurer.  We can help you figure out the best approach to creating data files, and will help you estimate associated costs.

Reach out to your AWS account team for further architecture guidance and discussion around private pricing. If you do not have a contact with AWS yet, please use this form to request that a sales representative contact you.

The healthcare team at AWS will continue innovating and will provide guidance for the 2023 and 2024 TCR requirements soon.

Umair Khalid

Umair is a Partner Solutions Architect supporting global AWS Partners who are building healthcare and life sciences solutions on the cloud. Umair is passionate about building frictionless experiences which simplify healthcare delivery.

Dana Bullukian

Dana is an Enterprise Solutions Architect based in Chicago. Dana has over 12 years of experience building automations and designing secure, resilient, and cost-effective systems. In his spare time, Dana enjoys writing music and making silly puppet videos with his wife and daughter.

Gokhul Srinivasan

Gokhul is a Senior Partner Solutions Architect supporting AWS ISV Startup Partners across the healthcare and life sciences industry. Gokhul has over 17 years of healthcare IT experience helping organizations modernize their digital platforms and deliver business outcomes.

Keith Wood

Keith is a Principal Solutions Architect at AWS based in Raleigh, NC. He builds relationships with new customers throughout the southeastern United States and helps them to design solutions on AWS.

Suresh Patnam

Suresh is a Principal Solutions Architect at AWS. He works with customers to build IT strategy, making digital transformation through the cloud more accessible, with a focus on big data, data lakes, and AI/ML. In his spare time, Suresh enjoys playing tennis and spending time with his family.