
    Emissions Estimation Data Module

    Sold by: ESG Book 
    Deployed on AWS
    The ESG Book Estimated Emissions Data Module provides investors with estimated emissions for ~45,000 public corporate entities that do not disclose their emissions. The dataset includes estimations for Scope 1, Scope 2, Scope 3 (total) emissions, and 15 Scope 3 Categories. A confidence rating is also provided alongside each estimated emissions figure, indicating the degree of accuracy of the estimation based on the amount of available data used in the estimation process.

    Overview



    The ESG Book Estimated Emissions Data Module provides investors with estimated emissions for ~45,000 public corporate entities that do not disclose their emissions. The dataset includes estimations for Scope 1, Scope 2, and Scope 3 (total) emissions, as well as the 15 Scope 3 Categories in tonnes of CO2 equivalents. A confidence rating is also provided alongside each estimated emissions figure, indicating the degree of accuracy of the estimation based on the amount of available data used in the estimation process. Importantly, PCAF data quality indicators are included.

    The dataset additionally includes the actual reported emissions data of public companies. We currently cover 4000 public companies, approximately half of which disclose their emissions data. Our Estimated Emissions Data Module thus significantly expands the coverage of emissions data for uses such as portfolio analysis and index creation.


    Methodology Overview

    Emissions are estimated using an Extreme Gradient Boosting (XGBoost) model. XGBoost is a supervised machine learning model that learns complex relationships between large numbers of predictor variables and a known target in order to generate estimations for unknown data. In this case, the model learns the relationship between 15 financial and non-financial predictor variables and emissions for each region, country, sector and industry, and uses it to estimate the emissions of companies that do not disclose emissions data.

    We have chosen to use a machine learning estimation model rather than a traditional statistical regression model for several key reasons. Firstly, the XGBoost model is able to handle non-linear relationships. The predictor variables might be non-linearly correlated with emissions: for instance, a company with 500 employees might not generate 5 times the emissions of a company with only 100 employees, due to economies of scale. The ability of the XGBoost model to handle non-linear relationships therefore provides an extra layer of robustness in capturing the relationships between the predictor variables and emissions.

    Secondly, unlike conventional regression models or other machine learning models such as Adaptive Boosting, the XGBoost model is able to handle missing data. Although 15 predictor variables are used in the model, not all 15 datapoints are available for every company. A minimum threshold of available datapoints is therefore set, and the model estimates emissions only for companies that meet it. Conventional regression models cannot accommodate missing data: the missing values have to be interpolated or simply replaced with zeros. This introduces additional errors into the model and reduces the accuracy of the emissions estimations due to the ambiguity of the input data. This issue does not affect the XGBoost model due to its ability to handle missing data.
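    The minimum data threshold described above can be sketched in a few lines. This is only an illustration under stated assumptions: the predictor names, the threshold value and the sample records are hypothetical, since the listing does not name the 15 variables or state the actual threshold. Missing values are kept as None rather than replaced with zeros.

```python
from typing import Optional

# Hypothetical predictor names; the listing does not name the 15 variables.
PREDICTORS = ["revenue", "employees", "total_assets", "capex"]

# Hypothetical threshold; the listing does not state the actual value.
MIN_AVAILABLE = 3


def count_available(record: dict[str, Optional[float]]) -> int:
    """Count the predictors that are present (not None) for one company."""
    return sum(1 for name in PREDICTORS if record.get(name) is not None)


def eligible_for_estimation(record: dict[str, Optional[float]]) -> bool:
    """A company is estimated only if enough predictors are available."""
    return count_available(record) >= MIN_AVAILABLE


# Made-up sample records; None marks a missing datapoint.
companies = {
    "AAA": {"revenue": 120.0, "employees": 500.0, "total_assets": 300.0, "capex": None},
    "BBB": {"revenue": 40.0, "employees": None, "total_assets": None, "capex": None},
}

eligible = [ticker for ticker, rec in companies.items() if eligible_for_estimation(rec)]
print(eligible)  # ['AAA']
```

    Company AAA has three of the four hypothetical predictors and passes the filter; company BBB has only one and is skipped.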

    Lastly, the XGBoost model uses a decision-tree algorithm to identify and analyse the complex relationships between the predictor variables and emissions, which are subsequently used in the estimation process. Trees are built sequentially, and each new tree corrects the mistakes of the previous trees, which allows for greater accuracy. The parameters of the model are fine-tuned to increase the precision of estimations. This is done using Optuna, an open-source hyperparameter optimization framework that tests different configurations of hyperparameters on a holdout test set to determine the optimal values for a given regression.
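    The idea that each tree corrects the mistakes of the previous trees can be illustrated with a minimal sketch of plain gradient boosting under squared error, using one-split trees (stumps) on a single predictor. This is not XGBoost itself, which adds regularisation and other refinements, and it is not ESG Book's model. The employee and emissions figures are made-up toy values chosen to grow less than proportionally, and the number of trees and learning rate are arbitrary.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest x would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=50, learning_rate=0.3):
    """Fit stumps one after another, each on the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps, mse_history = [], []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        mse_history.append(sum(r * r for r in residuals) / len(residuals))
        stump = fit_stump(xs, residuals)  # the new tree targets the current mistakes
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    final = [y - p for y, p in zip(ys, predictions)]
    mse_history.append(sum(r * r for r in final) / len(final))
    return (base, learning_rate, stumps), mse_history


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data: emissions rise with headcount, but less than proportionally.
employees = [100, 200, 300, 400, 500]
emissions = [40.0, 70.0, 96.0, 121.0, 144.0]

model, mse_history = boost(employees, emissions)
print(f"training MSE before boosting: {mse_history[0]:.2f}")
print(f"training MSE after boosting:  {mse_history[-1]:.2f}")
print(f"prediction for 300 employees: {predict(model, 300):.1f}")
```

    Because every stump is fitted to the residuals of the ensemble so far, the training error shrinks as trees are added, and the piecewise-constant predictions can follow a non-linear curve without any assumption of proportionality.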

    Overall, due to the reasons explained above, the XGBoost model shows better accuracy when compared to traditional statistical models such as the Ridge Regression model or other machine learning models such as the Adaptive Boost model.


    Use Cases

    The Emissions data can be instrumental for Asset Managers and Corporates:

    Portfolio Management

    Emissions data can be used by Portfolio Managers during portfolio construction for:

    • Exclusion - The screening out of companies that are not aligned with the Paris Agreement temperature goals.
    • Carbon Intensity - The scaling of emissions data by financial metrics to compute carbon intensities, monitor the portfolio and benchmark against other portfolios
    • TCFD & SFDR Reporting - The reporting of climate-related financial metrics to understand the climate-related risks and opportunities of the companies within a portfolio
    • Portfolio Alignment to Climate Goals - Identify to what extent a portfolio is aligned with the Paris Agreement to minimise exposure to carbon-intensive companies
    • Regulation Compliance - Generate voluntary TCFD disclosures on how climate-related risks and opportunities are factored into relevant investment strategies
    • Alignment to Investor Demand - An increasing number of investors require asset managers to integrate climate risks and opportunities into their investing strategy
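    As an illustration of the carbon intensity use case in the list above, the following minimal sketch scales emissions by revenue and aggregates the result into a weighted average carbon intensity for a portfolio. The field names, tickers and figures are hypothetical and do not reflect the dataset's actual schema; revenue is only one of several possible financial metrics.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    ticker: str
    weight: float           # portfolio weight; weights are assumed to sum to 1
    emissions_tco2e: float  # emissions in tonnes of CO2 equivalent
    revenue_musd: float     # revenue in millions of USD


def carbon_intensity(holding: Holding) -> float:
    """Tonnes of CO2 equivalent per million USD of revenue."""
    return holding.emissions_tco2e / holding.revenue_musd


def weighted_average_carbon_intensity(holdings: list[Holding]) -> float:
    """Sum of each holding's weight multiplied by its carbon intensity."""
    return sum(h.weight * carbon_intensity(h) for h in holdings)


# Made-up holdings for illustration only.
portfolio = [
    Holding("AAA", weight=0.5, emissions_tco2e=1000.0, revenue_musd=100.0),
    Holding("BBB", weight=0.5, emissions_tco2e=3000.0, revenue_musd=100.0),
]

print(weighted_average_carbon_intensity(portfolio))  # 20.0
```

    The same function can be run on a benchmark's holdings to compare the two portfolios on a like-for-like basis.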

    Corporates

    Emissions data includes climate metrics related to emissions, reporting, policy and frameworks and enables:

    • Tailored benchmarking - The quality and granularity of the data allows corporations to analyse their climate performance against direct peers, industry, sector and region
    • Climate Reporting - The identification of climate-related topics that need to be reported on for a company to stay ahead of its peers
    • Tailor-made Comparison Metrics - Combining emissions data with financial metrics such as revenue or EBITDA or non-financial metrics such as production quantity enables the creation of innovative carbon intensity metrics relevant to each company
    • Market Positioning & Differentiation - Understand which climate-related topics corporations need to report on to be a leader among their peers

    Metadata

    Update Frequency: Weekly
    Data Source(s): Estimations produced using ESG Book raw emissions data and ESG raw data. Financial data from third-party provider
    Geographic coverage: Global
    Time period coverage: Present
    Is historical data “point-in-time”: YES
    Raw or scraped data: All input data is collected from public sources such as Annual Reports, CSR Reports, Investor Relation Presentations and Reports, ESG reports, Company Websites
    Number of companies covered: ~37,000
    Standard entity identifiers: Ticker (please contact for information on other identifiers)

    Pricing Information

    Pricing is determined on a use-case basis; please contact us for more information.

    When requesting pricing, please include the following information:

    • Organization Name
    • Position (non-mandatory)
    • Business Email Address or Telephone Number
    • Country
    • Use-case

    Regulatory and Compliance Information

    This product is licensed for internal use only. Users are not allowed to distribute the data externally.

    If you're interested in a use case that involves redistributing the data, please contact us.



    Pricing

    Emissions Estimation Data Module

    Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (1)

    Dimension: Product Access
    Description: Dimension that grants access to the product for subscribers.
    Cost/12 months: $15,000.00

    Vendor refund policy

    No refunds.


    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information


    Delivery details

    AWS Data Exchange (ADX)

    AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.

    Additional details

    Data sets (2)

    You will receive access to the following data sets.

    Data set name: sco-eem-100
    Historical revisions: All historical revisions
    Future revisions: All future revisions

    Data set name: sco-eem
    Historical revisions: All historical revisions
    Future revisions: All future revisions
