AWS Public Sector Blog

How transit agencies can use AWS to improve safety and passenger experience

Transit authorities across the globe want to monitor the driving habits of their fleet operators to curtail unsafe driving, reduce safety-related incidents, and improve passenger experience. They also want to give operators meaningful feedback that includes relevant context about their unsafe driving. Fleet managers can use Amazon Web Services (AWS) to ingest and analyze fleet driver data. In this post, we share how a large public transit agency in the United States (referred to as “Agency”) worked with AWS to create a proof of concept (POC) that analyzes operator behavior and improves visibility into sudden acceleration-based events. We also share a few architectural patterns and a partner solution.

The three components of improving fleet operations

Historically, improving fleet operations has been challenging because determining operator behavior requires capturing and analyzing large, disparate sets of data. The Agency, which is responsible for public transportation in one of the largest cities in the US, worked with AWS to track bus operators’ driving trends. Its goal is to curtail unsafe behavior and instill better driving habits. The solution processes data received from on-board bus sensors and has three main components: data acquisition, data analysis, and data insights and visualization. Figure 1 is an architectural diagram of the Agency’s solution for collecting and analyzing bus operator behavior.

Figure 1. Architectural diagram of the Agency’s operator behavior analysis. The architecture uses AWS services such as Amazon Simple Storage Service, AWS Lambda, AWS Glue, Amazon Athena, and Amazon QuickSight.

  1. Data acquisition – In this POC, the raw log files from the buses are ingested to Amazon Simple Storage Service (Amazon S3) in zipped format on a nightly basis.
  2. Data analysis – The architecture uses an event-driven mechanism to trigger different AWS Lambda functions that unzip the raw log files and then parse and curate them. AWS Glue crawlers then scan the curated datasets and populate the AWS Glue Data Catalog, which serves as a central repository for the datasets’ metadata. Amazon Athena uses the Data Catalog to run interactive queries on the data.
  3. Data insights and visualization – Finally, Amazon QuickSight is used to securely visualize the insights derived from the data. Refer to the “Securely analyze your data with AWS Lake Formation and Amazon QuickSight” blog post to learn more about this architecture pattern.
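
To make the event-driven unzipping step more concrete, here is a minimal Python sketch of what such a Lambda function could look like. It is not the Agency's actual code: the `unzipped/` prefix and the function names are assumptions made for illustration, and the handler assumes it is triggered by an Amazon S3 object-created notification.

```python
import io
import zipfile


def extract_log_files(zip_bytes):
    """Return {member_name: file_bytes} for every regular file in the archive."""
    extracted = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # skip directory entries, keep only files
            extracted[member.filename] = archive.read(member)
    return extracted


def lambda_handler(event, context):
    """Unzip each archive named in an S3 object-created event (illustrative)."""
    # Imported here so extract_log_files can be exercised without AWS access.
    import boto3
    from urllib.parse import unquote_plus

    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        for name, data in extract_log_files(body).items():
            # "unzipped/" is a hypothetical prefix read by the parsing stage.
            s3.put_object(Bucket=bucket, Key=f"unzipped/{name}", Body=data)
```

If you write the unzipped files back to the same bucket, as this sketch does, scope the trigger to the prefix or suffix of the incoming archives so that the function does not invoke itself on its own output.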

The Agency is successfully using this solution to identify sudden acceleration and braking events and to view them on interactive dashboards. The next phase of the project will involve rolling out this POC to the Agency’s entire bus fleet in a production environment.
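
This post does not describe how the Agency defines a sudden acceleration or braking event. As one possible interpretation, the sketch below flags an event whenever the change in speed between two consecutive samples exceeds a threshold. The `SpeedSample` type, the function name, and the threshold values are all hypothetical and would need to be tuned to your own sensors and safety policy.

```python
from dataclasses import dataclass


@dataclass
class SpeedSample:
    timestamp_s: float  # seconds since the start of the trip
    speed_mps: float    # vehicle speed in meters per second


def find_sudden_events(samples, accel_limit=2.5, brake_limit=-3.0):
    """Flag consecutive-sample speed changes beyond the given limits (m/s^2).

    The default limits are illustrative assumptions, not industry standards.
    """
    events = []
    for prev, curr in zip(samples, samples[1:]):
        dt = curr.timestamp_s - prev.timestamp_s
        if dt <= 0:
            continue  # ignore duplicate or out-of-order samples
        accel = (curr.speed_mps - prev.speed_mps) / dt
        if accel >= accel_limit:
            events.append((curr.timestamp_s, "sudden_acceleration", accel))
        elif accel <= brake_limit:
            events.append((curr.timestamp_s, "sudden_braking", accel))
    return events
```

In practice you would likely smooth noisy sensor readings before applying a rule like this, because a single bad sample can otherwise produce a false event.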

Next, let’s dive deeper into the architectural patterns, which can be leveraged by other transit agencies to build similar solutions.

Data acquisition

The high volume of data coming from a bus fleet makes it challenging to ingest and store. Further, the data generated by various sensors is usually in different formats (such as proprietary electronic control unit (ECU) data, syslog data, or blob data). AWS provides services and capabilities to ingest, harmonize, and prepare data for different types of analysis. These are a few architecture patterns to leverage for data ingestion:

  1. AWS IoT FleetWise and Amazon Timestream – AWS IoT FleetWise can be used to collect vehicle data in different formats, transform it, and transfer it to the cloud in near real time. AWS provides a number of ways to store the ingested data. One option is Timestream, which is a fast, scalable, fully managed, and purpose-built time-series database. This architectural pattern gives you the capability to analyze the ingested data in near real time. Refer to the documentation to implement this architectural pattern. An example implementation of this architectural pattern is also available.
  2. AWS IoT FleetWise and Amazon S3 – This architectural pattern of using AWS IoT FleetWise to send data to Amazon S3 is suitable for batch processing. AWS provides you with a robust toolkit to clean and standardize the ingested data and extract actionable insights from it. Learn more about this architectural pattern in this AWS IoT FleetWise blog post.
  3. Offline transfer – For an offline data transfer use case, you can use the Amazon S3 console, an AWS SDK, the AWS Command Line Interface (AWS CLI), the Amazon S3 REST API, or AWS DataSync to initiate a data transfer from your on-premises storage to Amazon S3.
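
For the offline transfer pattern, the following is a minimal sketch of an upload using the AWS SDK for Python (Boto3). The key layout (`raw/bus_id=.../date=...`) and the function names are assumptions chosen for illustration. A partition-style layout like this one can make the data easier to catalog and query later, but it is not required.

```python
from datetime import date
from pathlib import Path


def build_object_key(local_path, bus_id, log_date):
    """Build a partition-style S3 key for a log archive (hypothetical layout)."""
    filename = Path(local_path).name
    return f"raw/bus_id={bus_id}/date={log_date.isoformat()}/{filename}"


def upload_log_archive(local_path, bucket, bus_id, log_date):
    """Upload one local archive to Amazon S3 and return the key that was used."""
    # Imported here so build_object_key can be used without AWS credentials.
    import boto3

    key = build_object_key(local_path, bus_id, log_date)
    boto3.client("s3").upload_file(str(local_path), bucket, key)
    return key
```

For example, `upload_log_archive("/data/export/logs.zip", "my-transit-bucket", "1042", date(2024, 1, 15))` would upload the file under `raw/bus_id=1042/date=2024-01-15/logs.zip`, where the bucket name is a placeholder for your own bucket.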

Data analysis

Data can come from different sources and in different formats, so it needs to be cleaned, standardized, and normalized before it’s analyzed. The first architectural pattern discussed (AWS IoT FleetWise and Timestream) standardizes data as part of the ingestion process itself. Once the data is in the Timestream database, you can use SQL to retrieve time series data from one or more tables. You can also integrate Timestream with other analytics and machine learning (ML) services for further analysis. Browse the available integrations to learn more.
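
As a rough illustration of querying Timestream with SQL, the sketch below uses the Boto3 `timestream-query` client to fetch recent speed readings. The `vehicle_id` dimension, the `speed` measure name, and the function names are assumptions; the actual column names depend on how your data is written to Timestream.

```python
def rows_to_dicts(query_response):
    """Flatten a Timestream Query response into a list of {column: value} dicts."""
    columns = [col["Name"] for col in query_response["ColumnInfo"]]
    return [
        # Scalar values are returned as strings; null cells have no ScalarValue.
        {name: cell.get("ScalarValue") for name, cell in zip(columns, row["Data"])}
        for row in query_response["Rows"]
    ]


def recent_speed_readings(database, table, minutes=15):
    """Return the first page of speed readings from the last `minutes` minutes."""
    # Imported here so rows_to_dicts can be tested without AWS access.
    import boto3

    # vehicle_id and the 'speed' measure name are hypothetical.
    query = (
        "SELECT vehicle_id, time, measure_value::double AS speed "
        f'FROM "{database}"."{table}" '
        f"WHERE measure_name = 'speed' AND time > ago({int(minutes)}m) "
        "ORDER BY time DESC"
    )
    client = boto3.client("timestream-query")
    return rows_to_dicts(client.query(QueryString=query))
```

This sketch reads only the first page of results. For larger result sets, you would keep calling `query` with the returned `NextToken` until no token is returned.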

For the other two architectural patterns, the vehicle telemetry data stored in Amazon S3 needs to be cataloged to make it available for search and query. AWS Glue, a serverless data integration service, can help identify the data formats and suggest schemas for the stored data without moving it anywhere. You can use the crawler functionality for data discovery and classification, and store the associated metadata in the AWS Glue Data Catalog in a query-optimized canonical format.
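
The cataloging step could be set up programmatically along the following lines. This is a sketch that assumes an existing IAM role for AWS Glue and an S3 prefix holding the curated data. The helper names are hypothetical, while `create_crawler` and `start_crawler` are Boto3 Glue client calls.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Assemble the arguments for a crawler that scans a single S3 path."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role that AWS Glue assumes
        "DatabaseName": database,  # Data Catalog database for the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request):
    """Create the crawler described by `request` and start its first run."""
    # Imported here so build_crawler_request can be tested without AWS access.
    import boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])
```

Once the crawler run finishes, the tables it infers appear in the Data Catalog database named in the request.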

You can use Athena, an interactive query service, to analyze data directly in Amazon S3 using standard SQL. Athena integrates with the AWS Glue Data Catalog and also offers Athena Federated Query for data in sources other than Amazon S3.
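
To show what an interactive query might look like in code, here is a minimal sketch that submits a SQL statement to Athena through Boto3 and waits for it to finish. The table and column names (`bus_id`, `event_time`, `acceleration`) and the helper names are assumptions for illustration; substitute the schema that your crawler produced.

```python
import time


def build_harsh_event_query(table, threshold):
    """Build SQL that selects rows whose acceleration magnitude meets a threshold.

    `table` must be a trusted identifier; do not pass unvalidated user input.
    """
    return (
        "SELECT bus_id, event_time, acceleration "
        f"FROM {table} "
        f"WHERE abs(acceleration) >= {float(threshold)} "
        "ORDER BY event_time"
    )


def run_athena_query(query, database, output_location, poll_seconds=2.0):
    """Run a query in Athena and return the first page of result rows."""
    # Imported here so the query builder can be tested without AWS access.
    import boto3

    athena = boto3.client("athena")
    execution_id = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=execution_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    return athena.get_query_results(QueryExecutionId=execution_id)["ResultSet"]["Rows"]
```

The `output_location` argument is an S3 URI where Athena writes the query results, for example a dedicated results prefix in a bucket that you own.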

In addition to collecting vehicle telemetry data, many transit agencies are installing multiple cameras on buses to capture contextual information about the driving environment. The suite of artificial intelligence (AI) and ML services on AWS can help you analyze your camera feeds to recognize signs of dangerous and distracted behavior. AWS Partner solutions such as Driver•i can capture driver behavior and vehicle performance on the road. Driver•i is a driver safety solution that uses vision-based technology to provide real-time feedback and generates thousands of data points. It uses a host of AWS services to provide efficient and scalable solutions to thousands of customers worldwide: Amazon S3 for highly durable, scalable, and accessible storage; AWS IoT for device management; Amazon Kinesis for streaming data processing; Lambda for event-driven, serverless computing; and Amazon Simple Queue Service (Amazon SQS), Amazon Simple Notification Service (Amazon SNS), and Amazon Kinesis Data Firehose. Using AWS helps Driver•i process hundreds of millions of API calls daily and provide near real-time insights into safety events across the fleet.

Data insights and visualization

Operators, depot managers, and executives at transit agencies need access to actionable insights about bus operators’ driving behavior to prevent unsafe driving practices and provide context-relevant feedback. With Amazon QuickSight, you can meet these analytics needs from the same source of truth through interactive dashboards, paginated reports, embedded analytics, and natural language queries. This serverless and fully managed service allows you to connect directly to, and import data from, a wide variety of cloud, third-party, and on-premises data sources in various file formats.


In conclusion, operator behavior is a critical aspect of transportation management that directly impacts the safety, efficiency, and overall experience of passengers. In this post, we shared common architectural patterns on AWS to collect, store, and analyze the vast volumes of data generated by transportation systems and turn them into actionable insights.

A special thanks to senior data scientist Tayo Ogunmakin, solutions architects Tariq Habib and Ola Ola, and solutions architect manager Mamta Vaidya for supporting the POC and architectural aspects of this solution.

Mehar Swarup

Mehar is a senior solutions architect at Amazon Web Services (AWS) with extensive experience in cloud computing, security, and networking. Mehar has helped a number of public sector customers establish their technical cloud strategy and migration efforts. He is passionate about leveraging the power of technology to better serve citizens. Outside of work, he enjoys spending time with his family, traveling, and music.

Ravi Tallury

Ravi is a principal solution architect at Amazon Web Services (AWS) and has more than 28 years of experience in architecting and delivering IT solutions. Prior to joining AWS, he led solution and enterprise architecture for automotive and life science verticals.