AWS Public Sector Blog

Modern data strategy for government tax and labor systems



Government authorities such as tax, unemployment insurance, and other finance agencies across the US and globally are seeking ways to innovate. They are trying to unlock insights from their data, deliver better customer experiences, and improve operations using cutting-edge technologies such as generative artificial intelligence (AI), machine learning (ML), and other data analytics tools.

Amazon Web Services (AWS) public sector customers say that realizing value from their data is challenging, in part because the data is siloed and difficult to access, share, and use. Customers also say that these challenges are often compounded by inflexible legacy systems, internal skill gaps, and anxiety about ensuring the security of sensitive data.

As the volume, velocity, and variety of data grow, government authorities are investing in modern data strategies that treat data as an asset and allow it to be used effectively. In this blog post, we cover why a modern data strategy is important, the key components of a modern data strategy, and practical next steps.

Why a modern data strategy is important

Historically, finance and administrative systems such as integrated tax systems have used a “one size fits all” approach to storing and processing data in a relational database such as SQL Server. This single database stores structured data, such as unemployment insurance (UI) claimant and taxpayer records, along with semi-structured and unstructured data such as PDF documents, all in one place.

Due to data retention requirements, these databases often grow over time to contain terabytes of data. While storage has become less costly, performing database operations such as indexing and data conversion requires high compute power and memory, leaving customers with slower system performance. Large databases also demand increased oversight of operations and maintenance activities.

This approach also led to the creation of data silos. For instance, a production system might store both regulated data such as Federal Tax Information (FTI) and non-FTI data in a single relational database. Commingling such data subjects all of the data to IRS Publication 1075 compliance, making it harder to retrieve non-FTI data for analytics.

Data is the centerpiece for making faster decisions, gaining business insights, optimizing your resources, and building modern AI-powered applications. These modern applications deliver capabilities such as a 360-degree customer view, advanced analytics, and generative AI features such as customer agents and employee training assistants.

For example, using a modern data strategy, a UI agency can use claims data to build AI models for fraud risk scoring. This allows the agency to quickly identify non-compliant UI claims and anomalies, prioritize limited agency resources for investigations, and respond to new patterns as they emerge.

Key components of a modern data strategy

An effective data strategy fits hand in glove with organizational goals. For example, let’s say your goal is to improve taxpayer compliance. The data available includes administrative tax data and federal tax data. You also have data from other state agencies, employers, and financial institutions. The desired outcome is a well-defined universe of taxpayers who may have under-reported income so you can focus your limited audit resources.
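To make this goal concrete, here is a minimal Python sketch of one way such a universe might be drawn: compare the income each taxpayer declared against the total reported about them by third parties, and keep the cases where the gap is large. The identifiers, the input shapes, and the 10 percent tolerance are all assumptions made for illustration. A real selection process would follow agency policy and would likely use richer models.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical tolerance: ignore gaps smaller than this share of third-party income.
GAP_THRESHOLD = 0.10


def third_party_totals(reports: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the income reported about each taxpayer by employers, banks, and so on."""
    totals: Dict[str, float] = defaultdict(float)
    for taxpayer_id, amount in reports:
        totals[taxpayer_id] += amount
    return dict(totals)


def audit_candidates(
    self_reported: Dict[str, float],
    reports: Iterable[Tuple[str, float]],
    threshold: float = GAP_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Return (taxpayer_id, gap) pairs, largest gap first.

    A taxpayer is kept when third-party income exceeds declared income
    by more than `threshold` as a share of the third-party total.
    """
    candidates = []
    for taxpayer_id, external in third_party_totals(reports).items():
        declared = self_reported.get(taxpayer_id, 0.0)  # no filing counts as zero declared
        gap = external - declared
        if external > 0 and gap / external > threshold:
            candidates.append((taxpayer_id, gap))
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)
```

Sorting by the size of the gap is one simple way to focus limited audit resources on the largest discrepancies first. Other orderings, such as by relative gap, may suit some agencies better.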

The next step in achieving that outcome is making effective use of available data, which requires secure, scalable, and performant cloud-based services that provide answers from data in time to make effective decisions without a massive capital investment. With the AWS Cloud, you can securely move and store any amount of data at scale and low cost, and access that data seamlessly.

Underpinning your data strategy are several key components including:

  • Data governance: Establish clear, transparent data governance policies that define roles, responsibilities, and access controls to ensure proper handling of taxpayer and UI data.
  • Data collection and integration: Bring tax-related data sources such as historical taxpayer information, financial records, and transaction data into a centralized repository where data can be put to work effectively across databases, data lakes, analytics, and ML.
  • Data security and disaster recovery: Implement security tools to protect sensitive taxpayer data and ensure regulatory compliance with data protection standards including IRS Publication 1075, FedRAMP, FIPS 140-2, and NIST 800-171. Set up backup and disaster recovery solutions to protect against data loss and ensure business continuity.
  • Data quality: Commit to cleansing your data to reduce errors and inconsistencies in data analyses. Continuously monitor the effectiveness of the data you collect, your data sources, and your data strategy.
  • Data scalability: Use cost-effective and scalable storage for data and optimize processing costs by using AWS cost management tools and employing data archiving strategies for older or less frequently accessed data.
  • Data proficiency: Invest in training and skill building, and allow staff time to earn certifications. Data literacy is a core skill to cultivate in government to drive better business outcomes.

With these components in place, you gain the ability to move and store any amount of data at scale, access that data seamlessly, and manage who has access to the data with proper security and data governance controls.

Putting a modern data strategy into practice

A modern data strategy moves from using a single data store to smaller, purpose-built data stores that use tools matched to the type of data and its access patterns. This allows you to build scalable data lakes or a lake house architecture and gives you the ability to opportunistically combine relevant datasets, including third-party datasets, to build advanced analytics and AI/ML use cases.

Moving to purpose-built data stores also facilitates seamless data movement and simplifies your security and compliance posture for mission-critical systems storing sensitive data. For example, you can separate regulated data from non-regulated data and apply appropriate security controls to reduce compliance risks. Additionally, you can move your archive data into a cold cloud storage tier such as Amazon S3 Glacier for long-term retention and free up storage in your relational database.
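The two examples above, separating regulated data and archiving older data, can be sketched as a small placement routine. This is an illustrative Python sketch, not a reference implementation. The `contains_fti` flag, the seven-year archive threshold, and all class and function names are assumptions. How a record is classified as FTI is a compliance determination for each agency.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Dict, Iterable, List, NamedTuple


class Zone(Enum):
    REGULATED = "regulated"          # store held to controls such as IRS Publication 1075
    NON_REGULATED = "non_regulated"  # store that is easier to open up for analytics


class Tier(Enum):
    ACTIVE = "active"    # operational database or frequently accessed storage
    ARCHIVE = "archive"  # cold storage tier, for example Amazon S3 Glacier


class Placement(NamedTuple):
    zone: Zone
    tier: Tier


@dataclass(frozen=True)
class Record:
    record_id: str
    contains_fti: bool   # hypothetical flag set by the agency's own classification
    last_activity: date


# Hypothetical threshold; real retention rules come from agency policy.
ARCHIVE_AFTER_DAYS = 7 * 365


def place(record: Record, today: date,
          archive_after_days: int = ARCHIVE_AFTER_DAYS) -> Placement:
    """Choose a zone by regulation status and a tier by age, independently."""
    zone = Zone.REGULATED if record.contains_fti else Zone.NON_REGULATED
    age_days = (today - record.last_activity).days
    tier = Tier.ARCHIVE if age_days > archive_after_days else Tier.ACTIVE
    return Placement(zone, tier)


def partition(records: Iterable[Record], today: date) -> Dict[Placement, List[str]]:
    """Group record IDs by the placement chosen for each record."""
    groups: Dict[Placement, List[str]] = {}
    for record in records:
        groups.setdefault(place(record, today), []).append(record.record_id)
    return groups
```

Zone and tier are kept as separate dimensions so that archived FTI stays in the regulated zone. For data that already lives in Amazon S3, the move to a Glacier storage class can also be expressed as a bucket lifecycle rule instead of application logic.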

Figure 1 shows the AWS components of an example modern data strategy for a UI/tax application. The image depicts an integrated data platform architecture built on AWS services. Data sources such as UI/tax applications, files and documents, and third-party data are ingested through AWS Database Migration Service, AWS DataSync, and AWS Glue into database and storage services such as Amazon Aurora, Amazon Relational Database Service, Amazon Redshift, and Amazon Simple Storage Service (Amazon S3). These services store the data, which can be queried through Amazon Athena. The analyzed data powers business intelligence with Amazon QuickSight and supports machine learning and generative AI through Amazon SageMaker, Amazon Bedrock, and Amazon Q. Data governance and cataloging are handled by AWS Lake Formation and Amazon DataZone, delivering insights to people, applications, and business processes.


Figure 1. AWS data services for building modern data architectures on cloud.

Lastly, modern data architectures use purpose-built data services and tools that allow you to build composable modern applications. For example, you can break down your large monolithic UI or integrated tax system into microservices and API-based architectures. In this model, each API can be completely decoupled, with its own purpose-built data store. This approach allows you to extract data from the relevant system modules and combine it as needed for analytics and AI applications.
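As a rough illustration of this decoupled model, the Python sketch below shows two hypothetical modules, each owning its own data store, and an analytics step that combines their exports. The service names, fields, and in-memory dictionaries are assumptions standing in for real APIs and purpose-built databases.

```python
from typing import Dict, List


class ClaimsService:
    """Hypothetical module that owns UI claim records in its own store."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def file_claim(self, claim_id: str, employer_id: str, weekly_amount: float) -> None:
        self._store[claim_id] = {
            "claim_id": claim_id,
            "employer_id": employer_id,
            "weekly_amount": weekly_amount,
        }

    def export(self) -> List[dict]:
        # Return copies so that consumers cannot mutate the service's own store.
        return [dict(row) for row in self._store.values()]


class EmployerService:
    """Hypothetical module that owns employer records in a separate store."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def register(self, employer_id: str, industry: str) -> None:
        self._store[employer_id] = {"employer_id": employer_id, "industry": industry}

    def export(self) -> List[dict]:
        return [dict(row) for row in self._store.values()]


def weekly_amount_by_industry(claims: List[dict], employers: List[dict]) -> Dict[str, float]:
    """Combine the two exports: total weekly claim amount per employer industry."""
    industry_of = {e["employer_id"]: e["industry"] for e in employers}
    totals: Dict[str, float] = {}
    for claim in claims:
        industry = industry_of.get(claim["employer_id"], "unknown")
        totals[industry] = totals.get(industry, 0.0) + claim["weekly_amount"]
    return totals
```

Neither service reads the other's store directly. The analytics function works only from what each module chooses to export, which is what keeps the modules decoupled.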

How AWS can help

The potential to drive better business outcomes by using purpose-built databases, analytics, and AI/ML with AWS is within reach. AWS has a number of training and acceleration programs, like the Data Driven Everything (D2E) program, to help you get started on your modern data strategy. AWS also offers in-person training, free online training, and certification programs. Finally, AWS has a number of partners and a Professional Services team who can help you develop an end-to-end data strategy.

To learn more about how you can use AWS to support your agency’s unique use case, contact the AWS Public Sector team.


Sohaib Tahir

Sohaib is a principal solutions architect and a technical leader at Amazon Web Services (AWS) for the US state and local government finance and administration team. He has more than 12 years of experience in the technology and engineering space and has helped customers deliver AWS-powered solutions since 2015. Sohaib specializes in designing mission-critical systems in the cloud, such as tax, unemployment insurance, enterprise resource planning (ERP), and department of motor vehicles (DMV) systems. He works with tax agencies globally to modernize tax systems and deliver mission outcomes using cloud technologies.

Danny Lee

Danny is the labor and workforce leader at Amazon Web Services (AWS), supporting US state and local government customers. Before joining AWS, he led digital transformation initiatives at the New York State Department of Labor, leveraging cloud technology to drive improved customer experience and better business outcomes. Danny also served as the state-appointed ombudsman for unemployment insurance in New York. He has more than 15 years of experience advocating for unemployed workers.

Tami Fillyaw

Tami is the finance and administration (F&A) leader at Amazon Web Services (AWS). She helps state and local government agencies leverage the cloud to improve business operations, financial performance, and customer interactions. Prior to AWS, Tami served 22 years in Florida government leading statewide policy initiatives and enterprise operations as the state’s deputy chief financial officer, deputy secretary of workforce operations, and chief of staff. During her time in public service, she founded Florida’s transparency portal for state expenditures and contracts, modernized the state’s $2.7 billion group health plan, and transformed human resources, retirement, and procurement functions using cloud-based tools.