Pearson Boosts Security and Productivity Using Amazon OpenSearch Service

2020

Global educational media company Pearson needed a more efficient way to analyze and gain insights from its log data. With a number of teams in various locations using Elasticsearch—the popular open-source tool for search and log analytics—Pearson found that keeping track of log data and managing updates led to high operating costs. Faced with this, as well as increasingly complex security log management and analysis, the company found a solution on Amazon Web Services (AWS). Pearson quickly saw improvements by migrating from its self-managed open-source Elasticsearch architecture to Amazon OpenSearch Service, a managed service that makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. Rather than spending considerable time and resources on managing the Elasticsearch clusters on its own, Pearson used the managed Amazon OpenSearch Service as part of its initiative to modernize its products. 


“As we migrate to [Amazon OpenSearch Service], we can start to focus on what’s necessary from a security perspective. We can bring in different skill sets and focus on what’s more important to the company rather than just maintaining standard hardware or infrastructure.”

Muthu Meyyappan
VP of Security Engineering and Product Security Officer, Pearson

Meeting the Needs of the Modern World

As one of the largest and oldest educational companies in the world, Pearson operates in more than 70 countries. The company provides a wide variety of educational content, assessments, and other services, often specialized for different target audiences. As the company moved toward digitization in the cloud, it began to use AWS services. At first, Pearson used Amazon Elastic Compute Cloud (Amazon EC2), a web service that provides secure, resizable compute capacity in the cloud, to power its self-managed open-source Elasticsearch. But the company found that a self-managed approach posed several challenges. “One of the major issues we had was with the security portion of the fine-grained access control—we weren’t able to work through that,” says Muthu Meyyappan, vice president of security engineering and product security officer for Pearson. “Another challenge was maintaining the upgrades and the usual service management below the line, which consumed the effort of a full-time engineer to maintain the platform.”

To tighten access control over its log data and reduce the time spent on updates, Pearson turned to Amazon OpenSearch Service. “When we were updating the open-source Elasticsearch, it took time to make sure we didn’t miss any data,” says Meyyappan. “We were looking to have someone else take that accountability. If there is data lost, for example, [Amazon OpenSearch Service] enables us to go back 14 days and get the index back. Features like that encouraged us to migrate to the managed service.”

Improving Analytics and Security

Pearson found the Amazon OpenSearch Service migration process to be straightforward. “The main task was migrating the users and making sure that their indexes were there,” says Meyyappan. “But it wasn’t a huge task for us, because we hadn’t been using a lot of data when we were using open source. So the migration path was fairly easy.” Once Pearson migrated to Amazon OpenSearch Service, the company could take advantage of fine-grained access control, which offers authorization management for particular indexes. “We didn’t have the security of fine-grained access control before,” says Meyyappan. “Once we migrated to [Amazon OpenSearch Service], we didn’t have to worry about authorization.”
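To make index-level authorization concrete, here is a minimal sketch of a read-only role in the shape used by the OpenSearch security plugin's roles API. The role name, index pattern, and helper function are hypothetical, and nothing here reflects Pearson's actual configuration; the snippet only builds the request body and sends nothing.

```python
import json


def build_read_only_role(index_patterns):
    """Build a role body that grants read access to the given index patterns only.

    The field names follow the OpenSearch security plugin's role format;
    the patterns passed in are illustrative, not taken from the case study.
    """
    return {
        "cluster_permissions": [],
        "index_permissions": [
            {
                "index_patterns": list(index_patterns),
                # "read" is a built-in action group in the security plugin.
                "allowed_actions": ["read"],
            }
        ],
    }


# Hypothetical role for a team that may only read application logs.
role = build_read_only_role(["app-logs-*"])

# This body would typically be sent with
# PUT _plugins/_security/api/roles/<role-name> on a domain that has
# fine-grained access control enabled (assumed here, not executed).
print(json.dumps(role, indent=2))
```

A user mapped to such a role could search indexes matching the pattern but would be denied access to, for example, a separate security-events index.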

Pearson now uses Amazon OpenSearch Service to retrieve application logs and infrastructure logs through Amazon Simple Storage Service (Amazon S3) buckets. The storage service offers industry-leading scalability, data availability, security, and performance. “Once we get the logs there,” says Meyyappan, “they get transferred into [Amazon OpenSearch Service] through Amazon Kinesis Data Streams.” This service offers massively scalable and durable real-time data streaming. “We’re using the analysis for multiple use cases, including for identifying security events and for general application support,” says Meyyappan. Pearson also integrated AWS Lambda, which makes it possible to run code without provisioning or managing servers. With Lambda, Pearson team members can more easily see the costs associated with ingesting data on a dashboard.
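The case study does not say which component reads from the stream and writes to the domain. As one assumed possibility, the sketch below shows how a stream consumer (for instance, a Lambda function triggered by Kinesis Data Streams) might decode a batch of records and assemble the newline-delimited body expected by the `_bulk` API. The function name, index name, and the assumption that each record is a JSON log line are all hypothetical.

```python
import base64
import json


def build_bulk_body(event, index_name):
    """Turn a Kinesis-style event into an NDJSON body for the _bulk API.

    Assumes each record's payload is a single JSON document (hypothetical;
    the source does not describe Pearson's log format).
    """
    lines = []
    for record in event.get("Records", []):
        # Kinesis delivers record payloads base64-encoded.
        payload = base64.b64decode(record["kinesis"]["data"])
        doc = json.loads(payload)
        # Each document is preceded by an action line naming the target index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline; an empty batch yields an empty body.
    return "\n".join(lines) + "\n" if lines else ""
```

The returned string would then be posted to the domain's `_bulk` endpoint by a signed HTTP client, which is omitted here.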

One of Pearson’s goals for data analytics is to establish sophisticated methods for identifying security events or patterns of behavior in order to protect everything from users’ personal information to the content of the company’s digital courses. When it ran a self-managed open-source Elasticsearch architecture, Pearson spent so much time building the foundation that it had little time left for defining security patterns and implementing anomaly detection. “As we migrate to [Amazon OpenSearch Service], we can start to focus on what’s necessary from a security perspective,” says Meyyappan. “We can bring in different skill sets and focus on what’s more important to the company rather than just maintaining standard hardware or infrastructure.” Pearson intends to adopt anomaly detection in the future. Using Amazon OpenSearch Service, Pearson can also now easily scale its data ingestion to six billion documents per day, maintain up to 40 TB in total storage, and enable multiple teams to make use of the solution.
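For a sense of scale, the daily ingestion figure can be converted into an average per-second rate. This is simple arithmetic on the number quoted above; it assumes ingestion is spread evenly across the day, which real traffic rarely is, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_docs_per_second(docs_per_day):
    """Average ingestion rate, assuming an even spread over 24 hours."""
    return docs_per_day / SECONDS_PER_DAY


rate = average_docs_per_second(6_000_000_000)
print(f"{rate:,.0f} documents per second on average")
```

Six billion documents per day works out to roughly 69,000 documents per second on average.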

Always Learning

By migrating from Elasticsearch to Amazon OpenSearch Service, Pearson was able to redirect a lot of time spent on self-management toward more mission-critical projects while also taking advantage of the security features in Amazon OpenSearch Service for threat detection and security event analysis. In the future, Pearson aspires to use the data it ingests to do more complex analytics involving machine learning. This is especially pertinent to things like anomaly detection and security management, and it’s all part of Pearson’s drive to become a more modernized, tech-savvy educational leader. In this regard, Pearson truly lives up to its motto: “always learning.” 


About Pearson

Founded in 1844 in the United Kingdom, Pearson is one of the largest education companies in the world. Pearson provides educational content and assessments to schools, corporations, nonprofits, and individual students in more than 70 countries. 

Benefits of AWS

  • Enabled fine-grained access control
  • Reduced time spent on updates
  • Scaled data ingestion to six billion documents per day 

AWS Services Used

Amazon OpenSearch Service

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open source, distributed search and analytics suite derived from Elasticsearch. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), and visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions).


Amazon Simple Storage Service

Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Amazon S3 is designed for 99.999999999% (11 9's) of durability, and stores data for millions of applications for companies all around the world.



Amazon Kinesis Data Streams

Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources. The data collected is available in milliseconds to enable real-time analytics use cases such as real-time dashboards, real-time anomaly detection, dynamic pricing, and more.


AWS Lambda

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. With Lambda, you can run code for virtually any type of application or backend service—all with zero administration. 


