AWS Government, Education, & Nonprofits Blog

Smart City Data Analysis: Pushing Data into the Cloud

Cloud computing can help cities apply big data and analytics to gain intelligence from the data they collect. In the previous smart cities posts, we described how cities can acquire data from different sources, such as IoT devices, sensors, mobile applications, and interactions with citizens (read the past blogs here and here). After this data has fulfilled its primary purpose, it can be given a new life as a source for analytics that inform future projects.

Think about the years (if not centuries) of data potentially stored in city archives. From demographics and schools to transportation and GIS data, this historical data can be combined with real-time data and machine learning to learn more about how a city works and to make a meaningful impact.

How can cities easily push data into the cloud to be analyzed? Analyzing massive amounts of quickly changing data is not always easy, but AWS managed services remove much of the complexity:

  • Amazon Kinesis makes it easy to load and analyze near real-time streaming data for use cases such as fraud detection or inventory alerts in a critical care unit or a blood bank. This opens up the possibility of detecting device failure in a traffic system or a medical unit and triggering a corrective response.
  • With Amazon Simple Storage Service (Amazon S3), you can host massive data sets in a cost-effective way. S3 can be the target storage for a process that scans incoming documents, which can then be processed to feed the analytics engine or be stored in a common relational database.
  • With carefully chosen, self-documenting prefixes, S3 can go beyond simple storage and become an integration platform where you can inspect the data and apply policies and workflows. Additional AWS services that make this happen include Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), and Amazon Simple Workflow Service (SWF).
  • Another option is to take advantage of Amazon Elastic Block Store (EBS), which provides persistent block-level storage volumes with a range of IOPS options, or Amazon Elastic File System (EFS), which is useful as NFSv4-compatible shared storage that grows and shrinks automatically as you add and remove files.
  • With Amazon Elastic MapReduce (Amazon EMR), you can launch Hadoop clusters in minutes, resize them, and terminate them when their analysis is complete. Or, you can keep a cluster available to continuously process data as it arrives.
  • Additionally, for more sensitive workloads in smart cities, such as census data, polling data, and medical sciences, the EMR File System (EMRFS) can read objects from and write objects to Amazon S3 using S3 server-side encryption with AWS Key Management Service keys (SSE-KMS). You can launch a 10-node Hadoop cluster for as little as $0.11 per hour. Because EMR has native support for EC2 Spot Instances, you can also save 50-80% on the cost of the underlying EC2 instances. This lowers the cost barrier to big data analysis for cities of all sizes.
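
To make the S3 and Kinesis points above more concrete, here is a minimal Python sketch of how a city application might prepare data for the cloud. It is an illustration under stated assumptions, not a reference implementation: the department and dataset names, the bucket, the stream, and the KMS key ID are hypothetical placeholders, and the date-based prefix layout is just one possible self-documenting scheme. The sketch also encrypts a direct S3 upload with SSE-KMS, which is a simpler case than the EMRFS integration described above.

```python
import json
from datetime import datetime, timezone


def build_s3_key(department, dataset, captured_at, filename):
    """Return a self-documenting S3 key: department/dataset/date parts/filename."""
    return (
        f"{department}/{dataset}/"
        f"year={captured_at:%Y}/month={captured_at:%m}/day={captured_at:%d}/"
        f"{filename}"
    )


def build_put_object_args(bucket, key, body, kms_key_id):
    """Arguments for an S3 PutObject call that requests SSE-KMS encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


def build_kinesis_record(stream_name, sensor_id, reading):
    """Arguments for a Kinesis PutRecord call.

    Using the sensor ID as the partition key keeps readings from one
    sensor on the same shard, so they stay in order.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(reading, sort_keys=True).encode("utf-8"),
        "PartitionKey": sensor_id,
    }


def send(put_object_args, kinesis_record):
    """Send both requests. Needs boto3 and AWS credentials at call time."""
    import boto3  # imported here so the helpers above work without it

    boto3.client("s3").put_object(**put_object_args)
    boto3.client("kinesis").put_record(**kinesis_record)
```

Building the request arguments separately from sending them keeps the naming and encryption choices easy to review and test without an AWS account. Only `send` talks to AWS.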

As an example scenario for analyzing a massive number of documents, we can imagine a document data extraction flow as depicted below:
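
One step of such a flow can also be sketched in code. The Python example below is only an illustration and does not claim to reproduce the diagram: it assumes scanned documents land in an S3 bucket, that S3 event notifications are delivered to a worker (for example through SQS), and that the worker picks the next processing step from the file extension. The bucket name, the keys, and the routing rule are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def documents_from_s3_event(event_json):
    """Return (bucket, key) pairs for newly created objects in an S3 event notification."""
    event = json.loads(event_json)
    documents = []
    for record in event.get("Records", []):
        # Ignore deletions and other event types; we only process new documents.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        documents.append((bucket, key))
    return documents


def route_document(key):
    """Hypothetical routing rule: pick the next step from the file extension."""
    extension = key.rsplit(".", 1)[-1].lower() if "." in key else ""
    if extension in {"pdf", "tif", "tiff", "png", "jpg"}:
        return "ocr"  # scanned images need text extraction first
    if extension in {"csv", "json"}:
        return "load"  # already structured, ready for the analytics engine
    return "manual-review"
```

A worker would call `documents_from_s3_event` on each message body and hand every document to the step that `route_document` names.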

Flexible tools and services are important for data analytics: they let you gain insights from your information, uncover hidden patterns, and refine your algorithms until you find the answers you are looking for. This gives cities of all sizes the ability to run big data analysis on real-time streaming data, on historical data, or on a mix of both, without the burden of an expensive investment.
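
As a small illustration of the mixed approach, the Python sketch below compares incoming readings against a baseline computed from historical data. It is a deliberately simple example under our own assumptions: the sample values are made up, and flagging readings more than three standard deviations from the historical mean is just one common rule of thumb, not a recommended detection method.

```python
from statistics import mean, stdev


def build_baseline(historical_values):
    """Summarize historical readings as (mean, sample standard deviation)."""
    return mean(historical_values), stdev(historical_values)


def is_anomalous(reading, baseline, threshold=3.0):
    """Flag a reading that lies more than `threshold` standard deviations from the mean."""
    average, spread = baseline
    if spread == 0:
        # All historical values were identical, so any different value stands out.
        return reading != average
    return abs(reading - average) / spread > threshold


# Hypothetical hourly vehicle counts from the archive, then two live readings.
baseline = build_baseline([100, 102, 98, 101, 99])
for live_reading in (101, 120):
    print(live_reading, is_anomalous(live_reading, baseline))
```

In practice the baseline would come from a batch job over archived data, while the live readings would arrive from a stream.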

This technology can be widely used in many smart city applications, such as:

  • Early alert systems (bad weather, tsunamis, earthquakes, infections, and pollen counts).
  • Adaptive traffic light systems that consider pedestrian counts, road work, school holidays, and nearby key locations, such as tourist hot spots.
  • Monitoring systems for government agencies to complement their counterterrorism efforts.
  • Internet of Things (IoT) driven sensors with adaptive waste management for a more eco-friendly smart city.
  • Intelligent simulation taking advantage of analytics to model smarter cars and housing.
  • Smart health care.

We hope the availability of this technology will trigger new innovative solutions in city management that will lead to improved citizen services.


Post authored by Giulio Soro, Senior Solutions Architect, AWS, Steven Bryen, Manager, Solutions Architect, AWS, and Pratim Das, Specialist SA – Analytics, EME, AWS