Intellog proposes to co-ordinate the creation and maintenance of an AWS Public Data Set for the oil & gas industry. This industry is very data intensive, with much of the data in the public domain. Yet the data is locked up within proprietary systems, and/or available only from for-profit vendors. There is a vested interest in maintaining the status quo, as it benefits the data vendors' bottom line. However, this current paradigm reflects an entirely obsolete notion of what it costs to obtain and store the information. While the cost of provisioning the service has dropped dramatically -- to near zero with the advent of cloud-based services like AWS -- obtaining data from these vendors has remained expensive and arcane.
This public domain data is scattered around the globe in a variety of formats from a variety of different agencies. This problem is particularly acute in the United States and Canada, where natural resources are a state/provincial responsibility. Data is fragmented across a variety of state/provincial agencies, with no one group responsible for its integration and dissemination. Tracking all of the data down at the source is tedious and time-consuming, which makes the aggregation and analysis of the data difficult. However, it's a job which only needs to be done once, to everybody's mutual benefit. The advent of services like AWS creates the opportunity to put much of this industry's data in one spot, with consistent formatting and organization. This would free up industry stakeholders to focus on adding value to the data and/or building applications on it, as opposed to the tedious mechanics of organizing and storing the data in the first place.
There is room for an 'open' alternative where industry-relevant data which is public domain or non-proprietary can be easily accessed through an AWS Public Data Set. A preliminary list of data sources has been compiled and is available upon request. Intellog would take responsibility for a high-level taxonomy, rationalization and formatting of the data, and for getting it ready to publish on AWS as a Public Data Set. This idea is in the very early stages, and there is no preconceived notion as to what data should be included or excluded. Intellog seeks the opportunity for further discussion with AWS as to how this opportunity can be pursued.