Pearson Boosts Security and Productivity Using Amazon Elasticsearch Service
Global educational media company Pearson needed a more efficient way to analyze and gain insights from its log data. With a number of teams in various locations using Elasticsearch—the popular open-source tool for search and log analytics—Pearson found that keeping track of log data and managing updates led to high operating costs. Faced with this, as well as increasingly complex security log management and analysis, the company found a solution on Amazon Web Services (AWS). Pearson quickly saw improvements by migrating from its self-managed open-source Elasticsearch architecture to Amazon Elasticsearch Service, a fully managed service that makes it easy to deploy, secure, and run Elasticsearch cost effectively at scale. Rather than spending considerable time and resources on managing the Elasticsearch clusters on its own, Pearson used the managed Amazon Elasticsearch Service as part of its initiative to modernize its products.
“As we migrate to Amazon Elasticsearch Service, we can start to focus on what’s necessary from a security perspective. We can bring in different skill sets and focus on what’s more important to the company rather than just maintaining standard hardware or infrastructure.”
Muthu Meyyappan, VP of Security Engineering and Product Security Officer, Pearson
Meeting the Needs of the Modern World
As one of the largest and oldest educational companies in the world, Pearson operates in 70 different countries. The company provides a wide variety of educational content and assessments and other services, which are often specialized for different target audiences. As the company moved toward digitization in the cloud, it began to use AWS services. At first, Pearson used Amazon Elastic Compute Cloud (Amazon EC2)—a web service that provides secure, resizable compute capacity in the cloud—to power its self-managed open-source Elasticsearch. But the company found that a self-managed approach posed several challenges. “One of the major issues we had was with the security portion of the fine-grained access control—we weren’t able to work through that,” says Muthu Meyyappan, vice president of security engineering and product security officer for Pearson. “Another challenge was maintaining the upgrades and the usual service management below the line, which consumed the effort of a full-time engineer to maintain the platform.”
In order to iron out its access control to log data and reduce the amount of time spent on updates, Pearson turned to Amazon Elasticsearch Service. “When we were updating the open-source Elasticsearch, it took time to make sure we didn’t miss any data,” says Meyyappan. “We were looking to have someone else take that accountability. If there is data lost, for example, Amazon Elasticsearch Service enables us to go back 14 days and get the index back. Features like that encouraged us to migrate to the managed service.”
Improving Analytics and Security
Pearson found the Amazon Elasticsearch Service migration process to be straightforward. “The main task was migrating the users and making sure that their indexes were there,” says Meyyappan. “But it wasn’t a huge task for us, because we hadn’t been using a lot of data when we were using open source. So the migration path was fairly easy.” Once Pearson migrated to Amazon Elasticsearch Service, the company could take advantage of fine-grained access control, which offers authorization management for particular indexes. “We didn’t have the security of fine-grained access control before,” says Meyyappan. “Once we migrated to Amazon Elasticsearch Service, we didn’t have to worry about authorization.”
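To illustrate what index-level authorization can look like, here is a minimal sketch in Python. It is not Pearson's configuration. The role body follows the format used by the Open Distro for Elasticsearch security plugin, on which the service's fine-grained access control is based. The index pattern "app-logs-*" and both helper functions are hypothetical, and the pattern check only approximates how the service evaluates permissions.

```python
import fnmatch
import json


def build_read_only_role(index_patterns):
    """Build a role body that grants read access to matching indexes only."""
    return {
        "cluster_permissions": [],
        "index_permissions": [
            {
                "index_patterns": list(index_patterns),
                "allowed_actions": ["read"],  # "read" is a built-in action group
            }
        ],
    }


def role_covers_index(role, index_name):
    """Rough local check: does any pattern in the role match the index name?"""
    for permission in role["index_permissions"]:
        for pattern in permission["index_patterns"]:
            if fnmatch.fnmatchcase(index_name, pattern):
                return True
    return False


# Hypothetical pattern: one team may read only its own application logs.
role = build_read_only_role(["app-logs-*"])
print(json.dumps(role, indent=2))
```

A body like this would typically be sent to the domain's security REST API to create the role, and the role would then be mapped to users or backend roles.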
Pearson now uses Amazon Elasticsearch Service to retrieve application logs and infrastructure logs through Amazon Simple Storage Service (Amazon S3) buckets. The storage service offers industry-leading scalability, data availability, security, and performance. “Once we get the logs there,” says Meyyappan, “they get transferred into Amazon Elasticsearch Service through Amazon Kinesis Data Streams.” This service offers massively scalable and durable real-time data streaming. “We’re using the analysis for multiple use cases, including for identifying security events and for general application support,” says Meyyappan. Another AWS service that Pearson integrated is AWS Lambda, which makes it possible to run code without provisioning or managing servers. Through its use, Pearson team members can more easily see the costs associated with ingesting data on a dashboard.
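The article does not say which component moves records from the stream into the search domain. As one possible shape for that step, the sketch below shows a hypothetical AWS Lambda consumer that decodes Kinesis records and formats them for the Elasticsearch bulk API. The index name "app-logs", the function names, and the assumption that each record holds one JSON log document are all illustrative. Sending the payload to the domain (which requires a signed HTTP request) is deliberately left out.

```python
import base64
import json

INDEX_NAME = "app-logs"  # hypothetical index name


def decode_kinesis_records(event):
    """Decode the base64 payloads that Lambda receives from a Kinesis stream."""
    docs = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        docs.append(json.loads(raw))  # assumes one JSON document per record
    return docs


def build_bulk_body(docs, index_name=INDEX_NAME):
    """Format documents as newline-delimited JSON for the _bulk endpoint."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return ("\n".join(lines) + "\n") if lines else ""


def handler(event, context):
    """Lambda entry point: returns the bulk payload instead of sending it."""
    docs = decode_kinesis_records(event)
    return {"documents": len(docs), "bulk_body": build_bulk_body(docs)}
```

Batching records into a single bulk request, rather than indexing them one at a time, keeps the number of calls to the domain low as ingestion volume grows.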
One of Pearson’s goals for data analytics is to establish sophisticated methods for identifying security events or patterns of behavior in order to protect everything from users’ personal information to the content of the company’s digital courses. When it used a self-managed open-source Elasticsearch architecture, Pearson spent so much time building the foundation that it had little time left for defining security patterns and implementing anomaly detection. “As we migrate to Amazon Elasticsearch Service, we can start to focus on what’s necessary from a security perspective,” says Meyyappan. “We can bring in different skill sets and focus on what’s more important to the company rather than just maintaining standard hardware or infrastructure.” Pearson intends to add anomaly detection in the future. Using Amazon Elasticsearch Service, Pearson can also now easily scale its data ingestion to six billion documents per day, maintain up to 40 TB in total storage, and enable multiple teams to make use of the solution.
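To put the ingestion figure in perspective, the short calculation below converts six billion documents per day into an average per-second rate. It assumes a perfectly even load across the day, which real log traffic rarely has, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_docs_per_second(docs_per_day):
    """Average ingestion rate, assuming load is spread evenly over a day."""
    return docs_per_day / SECONDS_PER_DAY


rate = average_docs_per_second(6_000_000_000)
print(f"{rate:,.0f} documents per second on average")  # about 69,444
```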
By migrating from Elasticsearch to Amazon Elasticsearch Service, Pearson was able to redirect a lot of time spent on self-management toward more mission-critical projects while also taking advantage of the security features in Amazon Elasticsearch Service for threat detection and security event analysis. In the future, Pearson aspires to use the data it ingests to do more complex analytics involving machine learning. This is especially pertinent to things like anomaly detection and security management, and it’s all part of Pearson’s drive to become a more modernized, tech-savvy educational leader. In this regard, Pearson truly lives up to its motto: “always learning.”
Founded in 1844 in the United Kingdom, Pearson is one of the largest education companies in the world. Pearson provides educational content and assessments to schools, corporations, nonprofits, and individual students in more than 70 countries.
Benefits of AWS
- Enabled fine-grained access control
- Reduced time spent on updates
- Scaled data ingestion to six billion documents per day
AWS Services Used
Amazon Elasticsearch Service
Amazon Elasticsearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost effectively at scale. You can build, monitor, and troubleshoot your applications using the tools you love, at the scale you need.
Amazon Simple Storage Service
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Amazon S3 is designed for 99.999999999% (11 9's) of durability, and stores data for millions of applications for companies all around the world.
Amazon Kinesis Data Streams
Amazon Kinesis Data Streams (KDS) is a massively scalable and durable real-time data streaming service. KDS can continuously capture gigabytes of data per second from hundreds of thousands of sources. The data collected is available in milliseconds to enable real-time analytics use cases such as real-time dashboards, real-time anomaly detection, dynamic pricing, and more.
AWS Lambda
AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. With Lambda, you can run code for virtually any type of application or backend service—all with zero administration.