Global Surface Summary of Day (GSOD) is a collection of daily weather measurements (temperature, wind speed, humidity, pressure, and more) from more than 9,000 weather stations around the world. The data were originally collected by the National Climatic Data Center (NCDC).

Global summary of day data for 18 surface meteorological elements are derived from the synoptic/hourly observations contained in USAF DATSAV3 Surface data and Federal Climate Complex Integrated Surface Data (ISD). Historical data are generally available from 1901 to the present, with data from 1973 to the present being the most complete. For some periods, one or more countries' data may not be available due to data restrictions or communications problems. To derive the summary of day data, a minimum of 4 observations for the day must be present (this allows for stations that report 4 synoptic observations per day). Since the data are converted to constant units (e.g., knots), slight rounding error from the originally reported values may occur (e.g., 9.9 instead of 10.0).
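
As a rough illustration of the two rules above (not the actual NCDC processing code), the sketch below computes a daily mean only when at least 4 observations are present, and shows how a round trip through another unit can turn 10.0 into 9.9. The function names, the metres-per-second intermediate unit, and the one-decimal rounding are assumptions made for this example.

```python
from statistics import mean
from typing import Optional, Sequence

MIN_OBSERVATIONS = 4      # minimum number of observations stated in the description
KNOTS_PER_MPS = 1.943844  # one metre per second expressed in knots


def daily_mean(observations: Sequence[float]) -> Optional[float]:
    """Return the mean rounded to 0.1, or None if fewer than 4 observations."""
    if len(observations) < MIN_OBSERVATIONS:
        return None
    return round(mean(observations), 1)


def knots_via_mps(knots: float) -> float:
    """Hypothetical round trip: store as m/s to 0.1, then convert back to knots."""
    mps = round(knots / KNOTS_PER_MPS, 1)
    return round(mps * KNOTS_PER_MPS, 1)


print(daily_mean([10.0, 12.0, 11.0]))        # None: only three observations
print(daily_mean([10.0, 12.0, 11.0, 13.0]))  # 11.5
print(knots_via_mps(10.0))                   # 9.9
```

Here 10.0 knots is about 5.14 m/s, which is stored as 5.1 m/s and converts back to roughly 9.91 knots, hence 9.9.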

The mean daily values described below are based on the hours of operation for the station. For some stations/countries, the visibility will sometimes 'cluster' around a value (such as 10 miles) due to the practice of not reporting visibilities greater than certain distances. The daily extremes and totals (maximum wind gust, precipitation amount, and snow depth) will only appear if the station reports the data often enough to provide a valid value. Therefore, these three elements will appear less frequently than other values. These elements are also derived from the stations' reports during the day, and may cover a 24-hour period that includes a portion of the previous day. The data are reported and summarized based on Greenwich Mean Time (GMT, 0000Z to 2359Z), since the original synoptic/hourly data are reported and based on GMT.
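
Because each day is bounded in GMT, an observation taken in the local evening can be counted toward the following calendar day. The sketch below shows this with Python's standard library. The helper name `gmt_day`, the fixed UTC-5 offset, and the sample timestamp are illustrative assumptions, not part of the dataset.

```python
from datetime import date, datetime, timedelta, timezone


def gmt_day(local_time: datetime) -> date:
    """Return the GMT (UTC) calendar date a timezone-aware timestamp falls on."""
    if local_time.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return local_time.astimezone(timezone.utc).date()


# Hypothetical observation at 20:00 local time in a UTC-5 zone.
utc_minus_5 = timezone(timedelta(hours=-5))
obs = datetime(2016, 3, 1, 20, 0, tzinfo=utc_minus_5)
print(gmt_day(obs))  # 2016-03-02
```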

Each station's yearly report is stored as a separate CSV file in the "aws-gsod" S3 bucket in the US East (N. Virginia) region. Each report is named after the station's USAF and WBAN identifiers and the year it is binned to. For example, the report named "007026-99999-2016.csv" covers 2016 and comes from a station with a USAF ID of 007026 and an unknown WBAN ID (indicated by the value 99999).
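
The naming scheme can be unpacked programmatically. The following is a minimal sketch; the `ReportName` type, the `parse_report_name` helper, and the exact pattern (a six-character alphanumeric USAF ID, a five-digit WBAN ID, a four-digit year) are assumptions inferred from the single example above rather than a documented specification.

```python
import re
from typing import NamedTuple

UNKNOWN_WBAN = "99999"  # per the description, 99999 marks an unknown WBAN ID

# Assumed layout: <USAF>-<WBAN>-<year>.csv
_REPORT_NAME = re.compile(
    r"^(?P<usaf>[0-9A-Za-z]{6})-(?P<wban>\d{5})-(?P<year>\d{4})\.csv$"
)


class ReportName(NamedTuple):
    usaf: str
    wban: str
    year: int

    @property
    def wban_known(self) -> bool:
        """True unless the WBAN field holds the 'unknown' placeholder."""
        return self.wban != UNKNOWN_WBAN


def parse_report_name(name: str) -> ReportName:
    """Split a report file name into its USAF ID, WBAN ID, and year."""
    match = _REPORT_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected report name: {name!r}")
    return ReportName(match["usaf"], match["wban"], int(match["year"]))


report = parse_report_name("007026-99999-2016.csv")
print(report)             # ReportName(usaf='007026', wban='99999', year=2016)
print(report.wban_known)  # False
```

The identifiers are kept as strings so that leading zeros, as in 007026, are not lost.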

In addition to the binned yearly reports for each station, there are two top-level metadata files in the root of the bucket. s3://aws-gsod/isd-inventory.csv contains metadata about how many daily measurements there are per station and per year. s3://aws-gsod/isd-history.csv contains detailed information (country, state, latitude, longitude, elevation, and more) about each station available in the dataset.

All of the data is publicly accessible via the S3 bucket's HTTPS endpoint at https://s3.amazonaws.com/aws-gsod. No authentication is required to download data over HTTPS. For example, the inventory file can be accessed at https://s3.amazonaws.com/aws-gsod/isd-inventory.csv and the example report mentioned above can be accessed at https://s3.amazonaws.com/aws-gsod/2016/007026-99999-2016.csv.
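
The URL layout shown in these examples can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the helper names `report_url` and `fetch_text` are hypothetical, the year-prefixed path is inferred from the example URL above, and UTF-8 decoding is assumed for the CSV contents.

```python
import urllib.request

BASE_URL = "https://s3.amazonaws.com/aws-gsod"


def report_url(usaf: str, wban: str, year: int) -> str:
    """Build the HTTPS URL for one station-year report (layout inferred)."""
    return f"{BASE_URL}/{year}/{usaf}-{wban}-{year}.csv"


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a file over HTTPS without credentials and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")  # UTF-8 is an assumption


url = report_url("007026", "99999", 2016)
print(url)  # https://s3.amazonaws.com/aws-gsod/2016/007026-99999-2016.csv

# Requires network access and a reachable bucket, so it is left commented out:
# header = fetch_text(url).splitlines()[0]
```

The same `fetch_text` helper could be pointed at the inventory file URL given above.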

If you use the AWS Command Line Interface, you can list the bucket contents, see how many reports are available, and calculate the total size of the reports with the "ls" command:

aws s3 ls s3://aws-gsod --recursive --human-readable --summarize

Source: National Climatic Data Center (NCDC)
Category: Regulatory
Format: CSV
License: The data is intended for free and unrestricted use in research, education, and other non-commercial activities. Per World Meteorological Organization (WMO) Resolution 40, redistribution of these data by others must provide this same notification, and non-U.S. data cannot be redistributed for commercial purposes.
Storage Service: Amazon S3
Location: s3://aws-gsod in the US East (N. Virginia) Region
Update Frequency: Currently updated infrequently. Last updated on September 13, 2016.

Special thanks to data scientist Sohier Dane for helping optimize this data for analysis from Amazon S3.

Educators, researchers and students can apply for free promotional credits to take advantage of Public Datasets on AWS. If you have a research project that could take advantage of GSOD on AWS, you can apply for AWS Cloud Credits for Research.