Listing Thumbnail

    Estimating Confidence Intervals for 2020 Census Statistics Using Approximate Monte Carlo Simulation (2010 Census Proof of Concept)

     Info
    Open data
    |
    Deployed on AWS
    The 2010 Census Production Settings Demographic and Housing Characteristics (DHC) Approximate Monte Carlo (AMC) method seed Privacy Protected Microdata File (PPMF0) and PPMF replicates (PPMF1, PPMF2, ..., PPMF25) are a set of microdata files intended for use in estimating the magnitude of error(s) introduced by the 2020 Decennial Census Disclosure Avoidance System (DAS) into the Redistricting and DHC products. The PPMF0 was created by executing the 2020 DAS TopDown Algorithm (TDA) using the confidential 2010 Census Edited File (CEF) as the initial input; the replicates were then created by executing the 2020 DAS TDA repeatedly with the PPMF0 as its initial input. Inspired by analogy to the use of bootstrap methods in non-private contexts, U.S. Census Bureau (USCB) researchers explored whether simple calculations based on comparing each PPMFi to the PPMF0 could be used to reliably estimate the scale of errors introduced by the 2020 DAS, and generally found this approach worked[...]

    Overview

    The 2010 Census Production Settings Demographic and Housing Characteristics (DHC) Approximate Monte Carlo (AMC) method seed Privacy Protected Microdata File (PPMF0) and PPMF replicates (PPMF1, PPMF2, ..., PPMF25) are a set of microdata files intended for use in estimating the magnitude of error(s) introduced by the 2020 Decennial Census Disclosure Avoidance System (DAS) into the Redistricting and DHC products. The PPMF0 was created by executing the 2020 DAS TopDown Algorithm (TDA) using the confidential 2010 Census Edited File (CEF) as the initial input; the replicates were then created by executing the 2020 DAS TDA repeatedly with the PPMF0 as its initial input. Inspired by analogy to the use of bootstrap methods in non-private contexts, U.S. Census Bureau (USCB) researchers explored whether simple calculations based on comparing each PPMFi to the PPMF0 could be used to reliably estimate the scale of errors introduced by the 2020 DAS, and generally found this approach worked well.

    The PPMF0 and PPMFi files contained here are provided so that external researchers can estimate properties of DAS-introduced error without privileged access to internal USCB-curated data sets; further information on the estimation methodology can be found in Ashmead et. al 2024 .

    The 2010 DHC AMC seed PPMF0 and PPMF replicates have been cleared for public dissemination by the USCB Disclosure Review Board (CBDRB-FY24-DSEP-0002). The 2010 PPMF0 included in these files was produced using the same parameters and settings as were used to produce the 2010 Demonstration Data Product Suite (2023-04-03) PPMF, but represents an independent execution of the TopDown Algorithm. The PPMF0 and PPMF replicates contain all Person and Units attributes necessary to produce the Redistricting and DHC publications for both the United States and Puerto Rico, and include geographic detail down to the Census Block level. They do not include attributes specific to either the Detailed DHC-A or Detailed DHC-B products; in particular, data on Major Race (e.g., White Alone) is included, but data on Detailed Race (e.g., Cambodian) is not included in the PPMF0 and replicates.

    The 2020 AMC replicate files  for estimating confidence intervals for the official 2020 Census statistics are available.

    Features and programs

    Open Data Sponsorship Program

    This dataset is part of the Open Data Sponsorship Program, an AWS program that covers the cost of storage for publicly available high-value cloud-optimized datasets.

    Pricing

    This is a publicly available data set. No subscription is required.

    How can we make this page better?

    We'd like to hear your feedback and ideas on how to improve this page.
    We'd like to hear your feedback and ideas on how to improve this page.

    Legal

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

     Info

    Delivery details

    AWS Data Exchange (ADX)

    AWS Data Exchange is a service that helps AWS easily share and manage data entitlements from other organizations at scale.

    Open data resources

    Available with or without an AWS account.

    How to use
    To access these resources, reference the Amazon Resource Name (ARN) using the AWS Command Line Interface (CLI). Learn more 
    Description
    2010 Census Production Settings Demographic and Housing Characteristics Approximate Monte Carlo method seed Privacy Protected Microdata File and PPMF replicates
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::uscb-2020-product-releases/decennial/amc/2010/mdf/2010-dhc-mdf-replicates
    AWS region
    us-west-2
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://uscb-2020-product-releases/decennial/amc/2010/mdf/2010-dhc-mdf-replicates/
    Description
    Census Open Data S3 Inventory
    Resource type
    S3 bucket
    Amazon Resource Name (ARN)
    arn:aws:s3:::uscb-opendata-inventory
    AWS region
    us-west-2
    AWS CLI access (No AWS account required)
    aws s3 ls --no-sign-request s3://uscb-opendata-inventory/

    Resources

    Support

    How to cite

    Estimating Confidence Intervals for 2020 Census Statistics Using Approximate Monte Carlo Simulation (2010 Census Proof of Concept) was accessed on DATE from https://registry.opendata.aws/census-2010-amc-mdf-replicates .

    License

    CC0 1.0 Universal

    Similar products