Using Seagate Lyve Mobile Cloud Import to accelerate mass geophysical data transfer into Amazon S3
The energy transition and digitalization have sparked dramatic growth in the volume and complexity of data in the energy industry. The importance of data has increased sharply over the last decade as companies rethink their strategies and embrace new ways of storing and processing data to drive actionable business insights. As enterprises continue to generate massive volumes of data, the strain on data storage infrastructure will only compound. To meet these challenges head-on, companies whose day-to-day operations rely on seamless, data-intensive workflows have looked to both Amazon Web Services (AWS) and Seagate Technology to unlock efficient and scalable data storage at the edge and frictionless mass data transfers to the cloud.
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Seagate Lyve Mobile—a high-capacity edge storage solution from Seagate Technology that helps businesses to aggregate, store, and move their data—is available on AWS Marketplace, a place to find, test, buy, and deploy software that runs on AWS. Scalable and modular, this integrated solution eliminates network dependencies so users can transfer mass datasets in a fast, secure, and efficient manner. Seagate also provides import services from the Seagate Lyve Mobile solution directly to Amazon S3.
In this blog post, we will explore how PXGEO, an innovative marine geophysical service provider, used Seagate Lyve Mobile to import data into Amazon S3, illustrating how anyone, from a midsize company to an enterprise organization, can move large amounts of data from the edge to the cloud.
A need for updated storage infrastructure
In the energy industry, exploration and production companies continuously strive to find and develop recoverable hydrocarbon reserves. For every segment of the energy industry—upstream, midstream, and downstream—activating critical data faster can give companies a competitive advantage.
However, data-intensive processing workflows performed at the field level have always presented significant roadblocks, including limited edge-infrastructure storage, barriers to scale, limited transmission bandwidth and network capacity, and pressure to decrease the time from data acquisition to engineering insight. With these limitations, the difficulty of aggregating, transporting, and uploading large datasets presents a clear challenge for turnaround time. It can take weeks for a data acquisition project to be fully completed in remote field locations. The inability to physically transfer those mass datasets back to a central processing center prevents customers from reaping actionable insights from that data. In effect, as data ages, it loses value.
Types of geophysical datasets
Dubai-based company PXGEO, one of the most innovative marine geophysical service providers in the world, wanted to accelerate and digitize its data capture, transfer, and delivery to its clients. Because its subsurface imaging solutions combine the strengths of ocean bottom node (OBN) and marine towed streamer (MTS) seismic data acquisition techniques, PXGEO needed an open, easy, and secure way to directly copy terabytes to petabytes of data from the equipment on their ships and quickly off-load the already digitized data to processing centers.
PXGEO uses OBN and MTS acquisitions for different purposes, according to project-specific imaging objectives. OBN acquisition is used extensively for reservoir monitoring and production optimization surveys over hydrocarbon-producing basins in water depths up to 3,000 meters. MTS acquisition is primarily suited to 3D exploration and appraisal seismic surveys as well as time-lapse (4D) applications for monitoring reservoirs and delivering narrow-azimuth, wide-azimuth, and multi-azimuth data sampling. With its in-house OBN and MTS capabilities, the company can offer hybrid seismic data acquisition solutions that combine both methodologies, providing their clients with fit-for-purpose data that optimizes subsurface imaging quality and seismic acquisition economics. However, because these techniques produce anywhere from terabytes to petabytes of data, PXGEO needed to reimagine the way its months-long field projects were aggregated and transported to lead to actionable data insights.
Use of Seagate Lyve Mobile to address edge storage needs
To address these data management challenges, PXGEO deployed twenty 96 TB hard disk drive (HDD) Lyve Mobile Arrays to collect more than one petabyte of data. By doing so, the company was able to bypass the limitations of traditional edge infrastructure, such as bandwidth issues, accelerate field data processing, and significantly improve data access times. Figure 1 illustrates the newly modernized workflow.
Figure 1. Lyve Mobile solution for subsurface imaging data acquisition and processing
Using Lyve Mobile Rackmount Receivers, which allow users to install up to two Lyve Mobile Arrays into a standard 19-inch data center rack, complete with redundant power and high-speed interfaces, PXGEO’s Seagate Lyve Mobile deployment was able to provide a digitized copy of all sensor data prior to secure delivery to the processing center, with up to 1.3 GB/s throughput. As a result, Seagate Lyve Mobile helped PXGEO to aggregate, store, process, and mobilize more data, improving field processing time, significantly reducing its customers’ time to data, and delivering valuable insights into ongoing operations.
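To put these figures in perspective, the following back-of-the-envelope sketch works out the raw capacity of the deployment and a best-case time to move one full array. It assumes decimal units (1 TB = 10^12 bytes), that each array holds its full 96 TB, and that the quoted peak of 1.3 GB/s is sustained for the whole transfer. Real-world results depend on RAID configuration, file sizes, and the host interface, so treat the output as a lower bound on time rather than a measured result.

```python
# Rough sizing arithmetic for the deployment described above.
# Assumptions (not from Seagate documentation): decimal units,
# arrays filled to their nominal capacity, and sustained peak throughput.

TB = 10**12  # bytes in one decimal terabyte
GB = 10**9   # bytes in one decimal gigabyte


def raw_capacity_tb(array_count: int, array_size_tb: int) -> int:
    """Total nominal capacity of the fleet, in terabytes."""
    return array_count * array_size_tb


def transfer_hours(size_bytes: float, throughput_bytes_per_s: float) -> float:
    """Best-case transfer time in hours at a constant throughput."""
    return size_bytes / throughput_bytes_per_s / 3600


fleet_tb = raw_capacity_tb(20, 96)                   # 20 arrays x 96 TB
hours_per_array = transfer_hours(96 * TB, 1.3 * GB)  # one array at 1.3 GB/s

print(f"Fleet capacity: {fleet_tb} TB ({fleet_tb / 1000:.2f} PB)")
print(f"Best-case time per 96 TB array: {hours_per_array:.1f} hours")
```

Under these assumptions, the twenty arrays provide 1,920 TB of nominal capacity, and a single full array takes a little over 20 hours to read or write at peak throughput.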
Data transfer challenges are not limited to on-vessel seismic data acquisition. Accelerating the path from field processing to client delivery posed challenges as well. The off-loaded data needed to be replicated, transferred to specialized processing partners, and delivered in the specified manner to the client. The challenges were twofold: (1) acquired data quadrupled in size during processing and interpretation, and (2) the time to data for the partners involved continued to increase as the datasets grew.
PXGEO embarked on a massive workflow migration away from tape, instead immediately digitizing the data off-loaded from the vessels. With the data on Lyve Mobile Arrays, the company was able to quickly pass data to its specialized partners, accelerating their time to data by 3–4 times. Seagate was also able to provide cloud import services to Amazon S3 for client delivery. The Lyve Mobile Arrays were sent to Seagate with the raw, preprocessed data and the fully processed, interpreted data (which was 4 times the size of the acquired data), and the data was uploaded directly to the client’s designated Amazon S3 bucket.
With Amazon S3, it is possible to upload large objects using Amazon S3 Transfer Acceleration and multipart upload to significantly speed up content transfers. Multipart upload helps you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If you’re uploading large objects over a stable, high-bandwidth network, using multipart upload can maximize the use of your available bandwidth by uploading object parts in parallel for multithreaded performance. Amazon S3 Transfer Acceleration can speed up content transfers to and from Amazon S3 by as much as 50–500 percent for long-distance transfer of larger objects. With Amazon S3, data is encrypted both in transit and at rest. In-transit data is secured with HTTPS and TLS 1.2 or higher protocols, while at-rest data can be protected with server-side or client-side encryption. AWS Direct Connect, a dedicated network connection to AWS, can further accelerate data transfer rates and facilitate smooth and reliable data transfers at a massive scale.
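As an illustration of how multipart upload divides an object into contiguous, independently uploadable parts, here is a minimal planning sketch in Python. It only computes part boundaries and does not call any AWS API. The function name `plan_parts` is hypothetical, and the two limits it checks (a 5 MiB minimum size for every part except the last, and at most 10,000 parts per upload) are taken from the Amazon S3 documentation as we understand it, so verify them against the current documentation before relying on them.

```python
# Illustrative planner for multipart upload part boundaries.
# It performs no network calls; it only shows how an object is split into
# contiguous byte ranges that could be uploaded independently and in any order.

MIB = 1024**2
MIN_PART_SIZE = 5 * MIB  # assumed S3 minimum for all parts except the last
MAX_PARTS = 10_000       # assumed S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int) -> list[tuple[int, int, int]]:
    """Return (part_number, byte_offset, length) for each part, numbered from 1."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the assumed 5 MiB minimum")

    part_count = (object_size + part_size - 1) // part_size  # integer ceiling
    if part_count > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")

    parts = []
    for index in range(part_count):
        offset = index * part_size
        length = min(part_size, object_size - offset)  # last part may be short
        parts.append((index + 1, offset, length))
    return parts


# Example: an object one byte larger than 1 GiB, split into 64 MiB parts.
example = plan_parts(1024**3 + 1, 64 * MIB)
print(len(example), "parts; last part:", example[-1])
```

Because each tuple describes a self-contained byte range, a client could hand the ranges to parallel workers, which is what lets multipart upload make fuller use of a high-bandwidth link.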
Because of the throughput at Seagate’s service centers, PXGEO was able to receive the final project data two weeks before the deadline. By deploying digital-first workflows, PXGEO was able to eliminate unnecessary copies, remove tape, and decrease the time to deliver the project data into AWS, all while modernizing its workflows to make future projects more efficient. The summary of the seismic data off-load is shown in Figure 2.
Figure 2. Seismic data off-load summary
Custom edge-to-cloud workflow with Lyve Mobile and Amazon S3
To deploy a comparable solution, you will first need to log on to your existing AWS account, or create one, and access the AWS Management Console. To move the data successfully to a designated Amazon S3 bucket, you will need to create a new bucket and allow Seagate Lyve Mobile access to do the import. It is recommended that users create a unique S3 bucket specific to this import project and follow Amazon S3 best practices as they relate to creating buckets, security, and user permissions. When completing this process, Seagate recommends implementing the following list of best practices:
- Create an Amazon S3 bucket dedicated to your Seagate Lyve Import project.
- Block all public access to your Amazon S3 bucket.
- Verify that Amazon S3 bucket versioning is deactivated.
- Verify that server-side encryption is activated.
- Create a permission policy with AWS Identity and Access Management (AWS IAM), a service that helps to securely manage identities and access to AWS services and resources.
- Create an AWS IAM role trusting Lyve Import Service, attaching the AWS IAM policy you created.
- Deactivate or delete the role after the cloud import project has ended.
- Deactivate or delete the policy after the cloud import project has ended.
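To make the policy and role items in the list above more concrete, the following Python sketch builds two JSON documents in standard IAM policy grammar: a permission policy scoped to a single import bucket, and a trust policy for the role. Everything specific here is an assumption for illustration only: the bucket name, the selection of S3 actions, and especially the trusted principal ARN, which is a placeholder. The actual principal and required permissions for the Lyve Import Service should be taken from the Seagate reference guide.

```python
import json

# Hypothetical helpers that assemble IAM policy documents as plain dicts.
# The actions and the principal below are illustrative assumptions, not
# values confirmed by Seagate or AWS for the Lyve Import Service.


def build_import_policy(bucket_name: str) -> dict:
    """Permission policy limited to one dedicated import bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListImportBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:ListBucketMultipartUploads"],
                "Resource": bucket_arn,
            },
            {
                "Sid": "WriteImportObjects",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


def build_trust_policy(trusted_principal_arn: str) -> dict:
    """Trust policy allowing the given principal to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder values; replace with your bucket and the principal Seagate provides.
permission_policy = build_import_policy("example-lyve-import-bucket")
trust_policy = build_trust_policy("arn:aws:iam::111122223333:root")

print(json.dumps(permission_policy, indent=2))
print(json.dumps(trust_policy, indent=2))
```

Scoping the policy to one bucket keeps the role's reach limited to the import project, which also makes it straightforward to deactivate or delete the role and policy once the import has ended.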
Figure 3 outlines a step-by-step guide on how to move large datasets into Amazon S3 with Seagate Lyve Mobile. A brief description of the steps is provided below, and for more information, see the Seagate reference guide.
Step 1: Opening an account in the Lyve Management Portal
Open an account in the Lyve Management Portal and create projects to move data. This account allows you to order, provision, and run your projects while controlling who can access and use the devices. After the account setup is completed, you can order and deploy the Lyve Mobile solutions by following the sub-steps below.
Sub-step 1.1: Device selection
- Filter devices based on HDD, solid-state drive (SSD), or both.
- Enter the device quantities for the Seagate Lyve Mobile device(s), including needed accessories for additional connection options or for mounting devices inside of rugged environments.
- Configure the redundant array of independent disks (RAID) level for your Lyve Mobile Array(s).
- Provide the project details and shipping information for the Seagate Lyve Mobile device(s).
Figure 3. Data ingest workflow from Seagate Lyve Mobile into Amazon S3
Sub-step 1.2: Security aspect and encryption
Seagate Lyve Mobile device security offers industry-standard AES 256-bit hardware encryption for data at rest and in flight. Seagate Lyve Mobile assures data integrity with Trusted Computing Group (TCG) industry-standard verifications such as authenticated firmware. The device includes tamper-evident labels along with a lockable, military-grade shipping case. To facilitate proper data destruction, the device can be securely crypto-erased, and a data destruction certificate is provided within the Lyve Management Portal.
Sub-step 1.3: User management
Identity and Access Management in the Lyve Management Portal provides an easy-to-use management system that helps you to select roles to determine user access to your projects and devices. You are in full control of your data and who can access the account or the device.
Step 2: Configuring AWS Cloud import
Create your upload destination by selecting Amazon S3 from the Cloud Destination drop-down menu and the Region your Amazon S3 bucket resides in. You will then complete your upload plan details by inputting your credentials and adding the bucket you would like to have your data uploaded to. Use the Lyve Management Portal to view all projects and their associated statuses in a centralized place.
Conclusion
In this blog post, we have explored a high-capacity physical data transfer solution that is designed to accommodate mass physical data transfers from the edge to the cloud and to scale alongside growing data needs. The energy industry is expected to see exponentially increasing data volumes from both digital transformations and new operations. In the seismic acquisition space alone, there is a growing demand for higher-resolution and higher-density 3D and 4D seismic surveys. Carbon capture, utilization, and storage (CCUS) could drive additional demand for more frequent seismic surveys that would range from 10 to 100 TB or more per survey. As data becomes a strategic resource across enterprises, customers will require a rugged, scalable, and cost-effective data storage solution built both for mass-capacity storage at the edge and for frictionless physical transfer to the cloud.