Building an Industrial Data Platform on AWS with Emerson’s Plantweb Optics Data Lake
Industrial companies often struggle to extract value from their operational technology (OT) data. One of the primary reasons is that industrial data is often locked away in siloed legacy systems, and even when this data is made available for analysis, the lack of context around it makes it challenging to gain the insights needed to deliver business outcomes. As analytics becomes central to unlocking business value, industrial companies are increasingly trying to realize the value of this OT data. They are embarking on digital transformation journeys that use growing volumes of data to produce insights, enabling better, faster decisions that reinvent industrial operations.
Unlocking the data from siloed legacy systems and providing the capability to holistically analyze data across the enterprise can greatly improve the value industrial companies can extract from their OT data. Establishing an enterprise-wide repository of real-time and historical data is an important step to enable innovation and automated workflows through industrial solutions for advanced analytics, machine learning, and artificial intelligence. The Industrial Data Platform on AWS provides a data strategy blueprint to securely collect, store, contextualize, and analyze OT data centrally. Cloud-based analytics services available on AWS democratize access to advanced technologies so that all users can extract insights from data. A hybrid model enables organizations to bridge cloud and on-premises solutions and helps them better utilize data to improve business outcomes.
Challenges of OT data ingestion
Collecting data from a wide variety of on-premises industrial systems at enterprise scale can be challenging. Manufacturing plants can have numerous disparate systems installed at distinct network levels, each with different user access and security requirements. These systems may also use different connectivity protocols and contain data in a variety of formats and at different time resolutions. Traditionally, historian software is used to collect and store time series data from various industrial systems. However, for holistic analysis, it is important to combine the time series data with data from other sources such as process alarms and events, maintenance and asset data, and system log files. A data platform must be able to ingest these different types of data at the OT layer.
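To make the idea of combining time series data with alarms and events concrete, here is a minimal Python sketch of an "as-of" join: each alarm is paired with the most recent sensor sample at or before the alarm time. The function name, the tuple layout, and the sample values are assumptions made for illustration; they are not part of Plantweb Optics or any AWS service.

```python
from bisect import bisect_right
from datetime import datetime


def asof_join(samples, events):
    """Attach to each event the latest sample at or before the event time.

    samples: list of (timestamp, value), sorted by timestamp (assumed layout)
    events:  list of (timestamp, message)
    """
    times = [t for t, _ in samples]
    joined = []
    for event_time, message in events:
        # Index of the last sample whose timestamp is <= event_time
        i = bisect_right(times, event_time) - 1
        value = samples[i][1] if i >= 0 else None
        joined.append((event_time, message, value))
    return joined


# Illustrative data: a 10-second temperature signal and one alarm
samples = [
    (datetime(2021, 1, 1, 8, 0, 0), 71.2),
    (datetime(2021, 1, 1, 8, 0, 10), 74.9),
    (datetime(2021, 1, 1, 8, 0, 20), 80.3),
]
events = [(datetime(2021, 1, 1, 8, 0, 14), "HIGH_TEMP")]

result = asof_join(samples, events)
```

Here the alarm raised at 08:00:14 is matched with the 08:00:10 reading, because that is the latest sample available when the alarm fired. An event that precedes every sample gets `None` instead of a value.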
Emerson’s Plantweb Optics Data Lake Solution
Emerson’s Plantweb Optics Data Lake provides a unified management interface with a modern architecture to ingest data from OT systems. It provides out-of-the-box connectors to interface with a wide variety of industrial systems. These connectors can be managed from a central administration console, providing the ability to manage configuration, security, and updates from one application. The key to generating value with these data sets is organizing the data in the context of the asset and its place in the overall enterprise. Plantweb Optics provides the ability to centrally organize different data types (such as time series, alarms, events, files, and pictures) in the context of the asset hierarchy. Its built-in object data models ensure proper context is brought to the data prior to cloud ingestion. Performing this lower-value contextualization work with on-premises computing, rather than in the cloud, lowers the total cost of ownership.
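The idea of organizing data in the context of an asset hierarchy can be illustrated with a small Python sketch. It is not the Plantweb Optics object model: the `Asset` class, the `contextualize` helper, and the asset names are hypothetical, chosen only to show how a record can carry its position in the hierarchy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    """A node in a hypothetical asset hierarchy (site > unit > equipment)."""
    name: str
    parent: Optional["Asset"] = None

    def path(self) -> str:
        # Walk up to the root, then reverse to get a root-first path
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def contextualize(asset: Asset, data_type: str, payload: dict) -> dict:
    """Tag a raw payload with its asset path and data type."""
    return {"asset_path": asset.path(), "data_type": data_type, **payload}


site = Asset("Site-A")
unit = Asset("Unit-1", parent=site)
pump = Asset("Pump-101", parent=unit)

record = contextualize(
    pump, "time_series", {"tag": "discharge_pressure", "value": 4.2}
)
```

With the path attached, a time series sample, an alarm, and a maintenance file for the same pump all share one key (`Site-A/Unit-1/Pump-101`), which is what lets downstream analytics relate them.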
As a centralized data management solution, Plantweb Optics addresses the challenges associated with OT data ingestion and combining data from multiple sources into a single stream. With Emerson’s Plantweb Optics, the communication interface to each OT system is carefully monitored and managed so that available compute and network resources are never overwhelmed by requests for data. It strictly adheres to a secure-by-design approach, allowing data to flow through multiple encrypted network layers. Typical enterprises today can have millions of data streams that need to be curated, stored, contextualized and egressed into AWS. Plantweb Optics enables enterprises to move this OT data at scale to AWS with low data latency.
Emerson’s Plantweb Optics Integration with AWS Services
As demonstrated in the reference architecture above, the Plantweb Optics system provides the ability to connect to AWS through various connectivity options. MQTT connectivity can be used to connect to AWS IoT Core, opening up the possibility of real-time analytics with AWS IoT services. Customers can also choose to use Apache Kafka or Amazon Kinesis for streaming analytics. Plantweb Optics also provides web API access, which allows customers to write their own custom code for data access and to securely write data back to Plantweb Optics. While the choice of connectivity option will depend on the use case, each option is capable of sending data to an Amazon Simple Storage Service (Amazon S3) data lake. In this scenario, the data lake serves as the single repository of OT data in the cloud, and Plantweb Optics serves as the single on-premises interface that collects and organizes data from multiple sources and sends it to the data lake.
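Whichever connectivity option is used, the data eventually lands as objects in Amazon S3. The Python sketch below shows one possible way to shape a message and derive a partitioned object key for it. The `raw/` prefix, the partition names, and the JSON fields are assumptions for illustration and do not describe the format Plantweb Optics actually emits; the upload itself (for example through an AWS SDK) is deliberately left out.

```python
import json
from datetime import datetime, timezone


def raw_object_key(site: str, data_type: str, ts: datetime) -> str:
    """Build a Hive-style partitioned key (hypothetical layout)."""
    return (
        f"raw/site={site}/type={data_type}/"
        f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/{ts:%H%M%S}.json"
    )


def to_message(asset_path: str, tag: str, value: float, ts: datetime) -> str:
    """Serialize one contextualized sample as JSON (assumed schema)."""
    return json.dumps(
        {
            "asset_path": asset_path,
            "tag": tag,
            "value": value,
            "timestamp": ts.isoformat(),
        }
    )


ts = datetime(2021, 3, 5, 14, 30, 0, tzinfo=timezone.utc)
key = raw_object_key("site-a", "time_series", ts)
body = to_message("Site-A/Unit-1/Pump-101", "discharge_pressure", 4.2, ts)
```

Partitioning keys by site, data type, and date is a common data lake convention: it lets query engines skip objects that fall outside the requested site or time range.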
Industrial Data Platform on AWS
The Amazon S3 data lake is central to the Industrial Data Platform strategy. It provides secure, cost-effective and scalable storage for industrial data. AWS Lake Formation is a service that simplifies the task of creating and managing data lake access and security policies from a central place. The raw data in the data lake also needs to be transformed to formats that are easier to query and allow self-discovery of data context. Some of these transformations can be done on premises with Plantweb Optics to homogenize data from various systems. At the cloud layer, AWS Glue provides the capability to automatically crawl data and create data catalogs. It also enables ETL (extract, transform and load) operations on incoming data. The end result is a second “processed data” location in the data lake which contains transformed, homogenized and cataloged data ready to be consumed by analytics applications.
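The homogenization step can be sketched in plain Python. The two input shapes below (a historian-style export with epoch milliseconds, and a record with an ISO 8601 timestamp) are hypothetical, as is the target schema; in practice this logic would typically run as an AWS Glue ETL job or on premises, not as a standalone script.

```python
from datetime import datetime, timezone


def homogenize(record: dict) -> dict:
    """Map records from two assumed source formats onto one schema."""
    if "epoch_ms" in record:
        # Hypothetical historian export: epoch milliseconds, short field names
        ts = datetime.fromtimestamp(record["epoch_ms"] / 1000, tz=timezone.utc)
        tag, value = record["tagName"], record["val"]
    else:
        # Hypothetical second source: ISO 8601 timestamp with UTC offset
        ts = datetime.fromisoformat(record["time"]).astimezone(timezone.utc)
        tag, value = record["tag"], record["value"]
    return {"tag": tag, "value": float(value), "timestamp": ts.isoformat()}


raw_records = [
    {"epoch_ms": 1614954600000, "tagName": "discharge_pressure", "val": 4.2},
    {"time": "2021-03-05T15:30:00+01:00", "tag": "discharge_pressure", "value": "4.2"},
]
processed = [homogenize(r) for r in raw_records]
```

Both inputs describe the same instant and the same reading, so after homogenization they produce identical rows with a UTC timestamp and a numeric value, ready to be written to the processed data location.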
Amazon Athena provides the capability to write SQL-like queries to analyze this data directly in Amazon S3. Customers can also use Amazon QuickSight to create business intelligence dashboards of OT data from multiple sources in the plant. The data in S3 can also be used to address manufacturing challenges such as predictive maintenance, using Amazon SageMaker to build and deploy machine learning models. Specialized industrial artificial intelligence (AI) services such as Amazon Lookout for Equipment can also be used to detect anomalies in machine operations without requiring ML expertise. Amazon Redshift can also be used to create an OT data warehouse for business analytics. Establishing a data platform on AWS allows industrial companies to break down data silos and combine different types of analytics to gain insights and guide better business decisions.
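As an illustration of the kind of SQL analysis described here, the sketch below runs an aggregate query over processed time series rows. It uses Python's built-in `sqlite3` module as a local stand-in so that it runs without an AWS account; it is not Athena. The table name `processed_timeseries`, its columns, and the sample rows are assumptions for illustration, and the query uses only basic `GROUP BY` aggregation.

```python
import sqlite3

# Average and peak value per asset and tag (assumed table and columns)
QUERY = """
SELECT asset_path, tag, AVG(value) AS avg_value, MAX(value) AS max_value
FROM processed_timeseries
GROUP BY asset_path, tag
ORDER BY asset_path, tag
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_timeseries (asset_path TEXT, tag TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO processed_timeseries VALUES (?, ?, ?)",
    [
        ("Site-A/Unit-1/Pump-101", "discharge_pressure", 4.0),
        ("Site-A/Unit-1/Pump-101", "discharge_pressure", 5.0),
        ("Site-A/Unit-1/Pump-102", "discharge_pressure", 3.5),
    ],
)
rows = conn.execute(QUERY).fetchall()
conn.close()
```

Each returned row summarizes one asset and tag, which is the sort of result a dashboard or a downstream model would consume.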
Establishing an Industrial Data Platform on AWS enables manufacturers to extract insights from their operations data. It powers multiple analytics applications, which in turn help deliver business outcomes. Emerson’s Plantweb Optics Data Lake enables industrial companies to use this cloud platform in their operations so that they can lower costs, become more agile, and drive performance improvement at scale. By enabling manufacturers to securely aggregate data from a variety of disparate OT systems and protocols without disrupting their operations, the Plantweb Optics Data Lake facilitates the movement of OT data to the Industrial Data Platform on AWS, where it can be used to unlock new insights and drive enterprise-wide gains.