Integrate Operational Technology data from OSIsoft PI with AWS services
Large industrial enterprises in manufacturing, oil and gas, and electric transmission and distribution collect operational technology (OT) data from millions of sensors. Much of this valuable data sits unused in historians: 80% of the world’s largest oil and gas companies and 65% of the industrial companies in the Fortune 500 rely on the PI System in their operations. OSIsoft is a leader in industrial digital transformation, with a comprehensive data infrastructure platform that ingests data from industrial sensors over a variety of industrial protocols. OSIsoft systems are deployed across the globe, collecting billions of data streams.
Data scientists and process engineering leaders recognize the need to use this operational data to gain insights, optimize processes, and enhance business value, and they are looking to adopt advanced analytics technologies such as multi-dimensional dashboarding and machine learning to do so. Organizations that successfully generate business value from their data will outperform their peers. An Aberdeen survey found that organizations that implemented a data lake outperformed similar companies by 9% in organic revenue growth. These leaders were able to run new types of analytics, such as machine learning, over new OT data sources stored in the data lake. This helped them identify and act on opportunities for business growth faster by attracting and retaining customers, boosting productivity, proactively maintaining equipment, and making better-informed decisions.
A secure cloud data lake is a centralized, curated, and secured repository that stores all your data, both in its original form and prepared for analysis. Storing OT data in an AWS data lake simplifies advanced analytics with cloud tools such as Amazon Athena, a serverless, interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) using standard SQL; Amazon SageMaker, which lets you build, train, and deploy machine learning models quickly; and Amazon QuickSight, a fast, cloud-powered business intelligence service that lets you create and publish interactive dashboards that include ML Insights.
However, engineers typically need to spend an immense amount of time and effort cleaning and transforming the OT data locked in these systems before it can be analyzed. OT data is not uniform and can contain irregularities such as varying identifiers, uneven spacing, spikes or out-of-range values, bad sensor readings, or even communication failures on some measurement points. This makes it challenging for process engineers to analyze the data quickly, and it may take months of cleaning and data preparation before analysis can even begin. OSIsoft estimates that 50-80% of a data scientist’s time is spent on data preparation. OSIsoft’s PI Integrator for Business Analytics helps solve this problem. With PI Integrator for Business Analytics, you can cleanse your data, apply transformations, and then store the data in an Amazon S3 data lake, ready for advanced analysis, with use cases including:
- Revealing insights from data to reduce operations costs, improve resource utilization, and lower the risks associated with business operations.
- Supporting business decisions with data from globally distributed operations and systems.
- Implementing predictive analysis to deliver actionable insights across your organization.
- Performing fleet-wide analysis to compare asset performance.
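To make the data-preparation problem described above more concrete, here is a minimal Python sketch of the kind of cleanup that uneven spacing, out-of-range spikes, and failed reads call for. It is an illustration only and does not represent how PI Integrator for Business Analytics works internally; the function name `clean_readings`, the sample values, and the range limits are all hypothetical.

```python
from statistics import mean


def clean_readings(readings, low, high, step_seconds):
    """Drop bad samples, then average the rest into evenly spaced buckets.

    `readings` is a list of (epoch_seconds, value) pairs. A value of None
    stands for a failed sensor read or a communication failure.
    """
    buckets = {}
    for ts, value in readings:
        # Skip failed reads and spikes outside the plausible range.
        if value is None or not (low <= value <= high):
            continue
        # Snap the timestamp down to the start of its bucket.
        bucket_start = ts - ts % step_seconds
        buckets.setdefault(bucket_start, []).append(value)
    # One averaged value per bucket, in time order.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical temperature samples with uneven spacing, one failed read,
# and one out-of-range spike.
raw = [
    (0, 20.0),
    (7, 20.4),
    (61, None),
    (65, 950.0),
    (118, 21.0),
    (130, 21.4),
    (170, 21.6),
]
cleaned = clean_readings(raw, low=-50.0, high=150.0, step_seconds=60)
for start, value in cleaned:
    print(start, round(value, 2))
```

Averaging into fixed buckets is only one possible choice; interpolation or last-known-value sampling may suit other measurements better.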
The PI Integrator for Business Analytics joins your advanced analytics infrastructure with OSIsoft’s PI System, allowing you to combine high-value OT data from the PI System with information technology (IT) data for reporting, analytics, and application integration. Check out this OSIsoft video for more info. Bringing data from automation and control systems into the AWS Cloud and enriching it with data from IT systems such as manufacturing execution systems, enterprise resource planning, warehouse management, and supply chain systems enables you to add business context to OT data. This provides transparency into industrial operations and business processes, giving you the ability to anticipate problems and identify opportunities for process improvements. For example, TransCanada is using the PI System together with AWS to realize new business value. View a video covering their use case <here>.
The architecture below shows a typical deployment scenario where there is a central roll-up or aggregation PI System on AWS.
Figure 1: A recommended deployment of the PI System and PI Integrator for Business Analytics in AWS
The following steps explain how OT data flows from an on-premises PI System, through a PI System in the AWS Cloud, to AWS services using PI Integrator for Business Analytics:
- PI System Connector sends OT data from the on-premises PI System to the roll-up PI System on AWS via AWS Direct Connect or AWS VPN.
- PI System data from on premises is sent to the PI Connector Relay, which persists the data into the PI System on the AWS Cloud.
- PI Integrator for Business Analytics exports asset and event view data from the PI System into AWS managed services, namely Amazon S3, Amazon Redshift, Amazon Kinesis, and Amazon Managed Streaming for Apache Kafka (Amazon MSK).
AWS customers benefit from security designed for the most sensitive industries, such as healthcare, government, and financial services. Below are some of the security requirements to set up this type of application; more details can be found in the OSIsoft PI Integrator for Business Analytics Deployment Guide for AWS.
Amazon S3 requirements include:
- An AWS Identity and Access Management (IAM) user with an AWS Access Key ID and AWS Secret Access Key is required to configure the Amazon S3 target.
- The IAM user needs the following permissions on the Amazon S3 target: List Objects, Write Objects, Read Bucket Permissions, and Write Bucket Permissions.
- With the Amazon Athena database, ensure that the IAM user has the following permissions on the database: SELECT, CREATE, UPDATE, SHOW DATABASE, SHOW TABLES, and CREATE EXTERNAL TABLE.
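As a rough illustration of the Amazon S3 permissions listed above, the following Python sketch builds an IAM policy document. The mapping from the permission names in the list to IAM actions (`s3:ListBucket`, `s3:PutObject`, `s3:GetBucketAcl`, `s3:PutBucketAcl`) is an assumption, and the bucket name and the function name `s3_target_policy` are hypothetical; confirm the exact actions against the deployment guide before using anything like this.

```python
import json


def s3_target_policy(bucket_name):
    """Build an IAM policy document for a hypothetical S3 target bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Assumed mapping: List Objects, Read/Write Bucket Permissions.
                "Sid": "BucketLevelAccess",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketAcl", "s3:PutBucketAcl"],
                "Resource": bucket_arn,
            },
            {
                # Assumed mapping: Write Objects.
                "Sid": "ObjectLevelAccess",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


# "example-pi-ot-data-lake" is a placeholder bucket name.
s3_policy = s3_target_policy("example-pi-ot-data-lake")
print(json.dumps(s3_policy, indent=2))
```

Bucket-level actions are scoped to the bucket ARN and object-level actions to the objects under it, which keeps the policy limited to the single target bucket.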
Amazon Redshift requirements include:
- Cluster database username and password are required to configure the Amazon Redshift target.
- Ensure that the Redshift user has the following minimum permissions on the Amazon Redshift target: SELECT, CREATE, DROP, UPDATE, and INSERT.
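One way a database administrator might express the Amazon Redshift permissions above is with SQL `GRANT` statements. The Python sketch below only generates the statement text and does not connect to a cluster. The user name, schema name, and the function `redshift_grant_statements` are hypothetical, and granting CREATE at the schema level (rather than the database level) is an assumption.

```python
import re

# Accept only plain identifiers so names can be placed in SQL text safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def redshift_grant_statements(user, schema):
    """Return GRANT statements covering SELECT, CREATE, DROP, UPDATE, INSERT."""
    for name in (user, schema):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return [
        # CREATE lets the user create tables inside the schema.
        f"GRANT CREATE ON SCHEMA {schema} TO {user};",
        # Table-level privileges on the tables that already exist in the schema.
        f"GRANT SELECT, INSERT, UPDATE, DROP ON ALL TABLES IN SCHEMA {schema} TO {user};",
    ]


# "pi_integrator" and "pi_data" are placeholder names.
grants = redshift_grant_statements("pi_integrator", "pi_data")
for statement in grants:
    print(statement)
```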
Amazon Kinesis requirements include:
- An IAM user with an AWS Access Key ID and AWS Secret Access Key to configure the Amazon Kinesis target.
- The Amazon Kinesis Data Streams producer requires permission for the following actions on the Amazon Kinesis target: DescribeStream, PutRecord, and PutRecords.
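The Amazon Kinesis producer permissions above could be captured in an IAM policy document along the following lines. This is a sketch under stated assumptions: the Region, account ID, stream name, and the function name `kinesis_target_policy` are placeholders and not values from the deployment guide.

```python
import json


def kinesis_target_policy(region, account_id, stream_name):
    """Build an IAM policy document for a hypothetical Kinesis target stream."""
    stream_arn = f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The three producer actions named in the requirements list.
                "Action": [
                    "kinesis:DescribeStream",
                    "kinesis:PutRecord",
                    "kinesis:PutRecords",
                ],
                "Resource": stream_arn,
            }
        ],
    }


# Placeholder Region, account ID, and stream name.
kinesis_policy = kinesis_target_policy("us-east-1", "123456789012", "example-pi-stream")
print(json.dumps(kinesis_policy, indent=2))
```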
Manufacturing, oil and gas, and utilities companies collect large real-time data streams from sensors into a data historian to make sense of their operations. Often, the historian of choice is OSIsoft’s PI System, which has installations around the world in small and large enterprises. With AWS and PI Integrator for Business Analytics, this OT data can be aggregated into a central PI System in the cloud and subsequently sent to Amazon S3, Amazon Redshift, or Amazon Kinesis. This simplifies analysis for data scientists and process engineers and speeds time to results, helping them detect the root cause of issues, optimize operations, and ultimately get real value out of their PI OT data.