
Overview
AI Ready Data packages for US Upstream and International Upstream are now available!
**The offer comprises a sample data set containing approximately 100 dated records.**
The AI Ready Data dataset encompasses a comprehensive array of textual content across Energy publications produced by in-house editorial and research teams, including market reports, news articles, rationales, commentaries, fundamentals analyses, outlooks, and more - all in an LLM-friendly format prepared for seamless integration with AI systems.
Customers can use AI Ready Data in their Retrieval-Augmented Generation (RAG) solutions to strengthen analysis and support informed decision-making. The dataset places no restrictions on your choice of large language model (LLM), so you can integrate the model you prefer to uncover patterns, correlations, and insights across commodities. You can process the data to suit your organization's needs, using the provided embeddings or generating your own. You can also load the data into your own vector database and enrich it with other internal and external data sources.
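As a rough illustration of the retrieval step in a RAG workflow, the sketch below ranks segments against a query by cosine similarity. It assumes that each segment record is a dictionary keyed by the SEGMENT_METADATA field names and that `SEGMENTEMBEDDINGS` holds a plain list of floats; the actual delivery format may differ. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_segments(query_embedding, segments, k=3):
    """Return the k (score, segment) pairs most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_embedding, seg["SEGMENTEMBEDDINGS"]), seg)
        for seg in segments
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy records with made-up 3-dimensional vectors (assumption: real embeddings
# are much longer and must come from the same model as the query embedding).
toy_segments = [
    {"SEGMENTID": "s1", "PROCESSEDSEGMENTCONTENT": "toy text A", "SEGMENTEMBEDDINGS": [1.0, 0.0, 0.0]},
    {"SEGMENTID": "s2", "PROCESSEDSEGMENTCONTENT": "toy text B", "SEGMENTEMBEDDINGS": [0.0, 1.0, 0.0]},
    {"SEGMENTID": "s3", "PROCESSEDSEGMENTCONTENT": "toy text C", "SEGMENTEMBEDDINGS": [0.9, 0.1, 0.0]},
]

for score, seg in top_k_segments([1.0, 0.0, 0.0], toy_segments, k=2):
    print(seg["SEGMENTID"], round(score, 3))
```

The retrieved `PROCESSEDSEGMENTCONTENT` values would then be passed to the LLM of your choice as context. For larger volumes, a vector database would normally replace this linear scan.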
This dataset includes:
- Unstructured data in an AI-ready format broken down into documents and segments with LLM-friendly metadata
- Flexible data delivery
- Easy customization of your own search and relevancy-boosting algorithms
- Ease of discovery of relevant content for your end users
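
One possible form of relevancy boosting is to weight a similarity score by document age, so that newer publications rank higher. The sketch below is only an illustration: it assumes `PUBLISHED` is an ISO 8601 date or timestamp string, and the function name and the 30-day half-life are hypothetical choices, not part of the product.

```python
from datetime import datetime, timezone


def recency_boosted_score(similarity, published_iso, now, half_life_days=30.0):
    """Halve the similarity score for every half_life_days of document age."""
    published = datetime.fromisoformat(published_iso)
    if published.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        published = published.replace(tzinfo=timezone.utc)
    age_days = max((now - published).total_seconds() / 86400.0, 0.0)
    return similarity * 0.5 ** (age_days / half_life_days)


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(recency_boosted_score(0.8, "2024-01-01", now))  # 30 days old, so the score is halved
```

Other boosts, for example by `PRIMARYENTITYNAME` or `REPORTINGFREQUENCY`, could be combined with the score in the same multiplicative way.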
Sample Fields:
| DOCUMENT_METADATA |
|---|
| PUBLISHED |
| UPDATED |
| FILETYPE |
| FILESIZE |
| SOURCEURL |
| REPORTINGFREQUENCY |
| PRIMARYENTITYTYPE |
| PRIMARYENTITYNAME |
| DOCUMENT_PRIMARY_ENTITY_IDF |
| OTHERDOCUMENTMETADATA |

| SEGMENT_METADATA |
|---|
| DOCUMENTID |
| SEGMENTATIONSTRATEGY |
| SEGMENTID |
| SEGMENTTYPE |
| SEGMENTLOCATION |
| RAWSEGMENTCONTENT |
| PROCESSEDSEGMENTCONTENT |
| LANGUAGE |
| SEGMENTOVERLAP |
| OTHERSEGMENTMETADATA |
| SEGMENTEMBEDDINGS |
| SEGMENTORDER |

Tables:
| TABLE TITLE | TABLE DESCRIPTION |
|---|---|
| DOCUMENT_METADATA | Contains document-level metadata such as id, name, file type, size, sourceURL, and reportingFrequency. It also includes related tags such as primary entity, commodity, and geography, plus any additional metadata that helps identify the document. |
| SEGMENT_METADATA | Contains the chunked segments of each document, with metadata such as the related document id, segment id, type, and location, together with the raw and processed content of the segment. It also records the segmentation strategy used to chunk the data and the embedding ids for each segment. |
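
The relationship between the two tables can be sketched as a join on the document id followed by a sort on `SEGMENTORDER`. This is a minimal sketch under stated assumptions: rows are plain dictionaries, `DOCUMENT_METADATA` exposes its id under a `DOCUMENTID` key (the sample fields above do not list the id column by name), and `SEGMENTORDER` is a sortable number. The `SEGMENTS` and `TEXT` keys and the function name are hypothetical.

```python
def rebuild_documents(document_rows, segment_rows):
    """Attach each document's segments, sorted by SEGMENTORDER, and join their text."""
    docs = {doc["DOCUMENTID"]: {**doc, "SEGMENTS": []} for doc in document_rows}
    for seg in segment_rows:
        parent = docs.get(seg["DOCUMENTID"])
        if parent is not None:  # skip segments whose parent document is missing
            parent["SEGMENTS"].append(seg)
    for doc in docs.values():
        doc["SEGMENTS"].sort(key=lambda s: s["SEGMENTORDER"])
        doc["TEXT"] = "\n".join(s["PROCESSEDSEGMENTCONTENT"] for s in doc["SEGMENTS"])
    return docs


# Toy rows for illustration only.
documents = [{"DOCUMENTID": "d1", "PRIMARYENTITYNAME": "Example entity"}]
segments = [
    {"DOCUMENTID": "d1", "SEGMENTID": "s2", "SEGMENTORDER": 2, "PROCESSEDSEGMENTCONTENT": "Second segment."},
    {"DOCUMENTID": "d1", "SEGMENTID": "s1", "SEGMENTORDER": 1, "PROCESSEDSEGMENTCONTENT": "First segment."},
]

rebuilt = rebuild_documents(documents, segments)
print(rebuilt["d1"]["TEXT"])
```

If `SEGMENTOVERLAP` indicates that neighbouring segments share text, the joined `TEXT` will repeat the overlapping portion unless it is trimmed first.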
Details
Pricing
Vendor refund policy
Refunds are not offered for this product.
Delivery details
AWS Data Exchange (ADX)
AWS Data Exchange is a service that helps AWS customers easily share and manage data entitlements from other organizations at scale.
Additional details
You will receive access to the following data sets.
| Data set name | Type | Historical revisions | Future revisions | Sensitive information | Data dictionaries | Data samples |
|---|---|---|---|---|---|---|
| ai-ready-data-cet | | All historical revisions | All future revisions | Not included | Not included | |