Data lake storage
Amazon S3 is the most secure, durable, and scalable storage to build your data lake. S3 hosts tens of thousands of data lakes for customers such as Sysco, Bristol Myers Squibb, GE, and Siemens, who are using them to securely scale with their needs and to discover business insights every minute.
Georgia-Pacific built a central data lake based on Amazon S3, allowing it to efficiently ingest and analyze structured and unstructured data at scale.
"AWS enables us to source, store, enrich, and deliver data in a centralized way, which we couldn’t do previously. Using this new model, we believe we can run more production lines in a more predictable manner. Using AWS, we can ensure the highest quality product running at the fastest possible rate, so we can best serve our customers.”"
Steve Bakalar, Vice President of IT/Digital Transformation, Georgia-Pacific
Sysco consolidated its data into a single data lake built on Amazon S3 and Amazon S3 Glacier to run analytics on its data and gain business insights.
"We're using S3 as our main data lake repository and S3 Glacier for archival. Our data lake in S3 allows us to transform, data load, extract, and query the data directly on S3. With S3 Glacier, we were able to reduce storage costs by over 40%."
Wesley Story, VP Business Technology Americas - Sysco
With a data lake based on Amazon S3 capable of collecting 6 TB of log data per day, the security staff at Siemens can perform forensic analysis on years' worth of data without compromising the performance or availability of the Siemens security information and event management (SIEM) solution.
"Our goal was to use cloud-based artificial intelligence to process these huge amounts of data and make immediate decisions about how best to counter any detected threats. Given the objective of an AI-enabled, high-speed, fully automated, and highly scalable platform, the decision to use AWS was easy."
Jan Pospisil, Senior Data Scientist - Siemens Cyber Defense Center
Bristol Myers Squibb
Bristol Myers Squibb collects a variety of clinical data from external sources, such as academic medical centers, healthcare providers, and other collaborations. The wide assortment of data sources results in broad variations in data formats. Amazon S3 and AWS Storage Gateway play central roles at Bristol Myers Squibb by moving scientific data into clinical data lakes.
“Bristol Myers Squibb has been using Amazon S3 and Storage Gateway for years, moving hundreds of terabytes of scientific data from our local premises to the AWS Cloud, daily. We have found AWS services to be efficient, reliable, and cost-effective, often bringing in more flexibility and scalability while reducing our dependency on hardware infrastructure.”
Oleg Moiseyenko, Senior Cloud Architect - Bristol Myers Squibb
GE Healthcare is known for its medical imaging equipment and diagnostic imaging agents, but has—over the last several years—continued in its digital transformation. The company launched the GE Health Cloud to provide radiologists and other healthcare professionals with a single portal to access enterprise imaging applications to view, process, and easily share images and patient cases.
“Every day, healthcare data flows through millions of medical devices, including more than 500,000 GE Healthcare medical imaging devices globally. Amazon S3 is the cornerstone of our solution, and it gives us the durability and reliability we need for storing critical data.”
Mitch Jackson, VP of cloud strategy and technology - GE Healthcare Digital
INVISTA is one of the world’s largest integrated producers of chemical intermediates, polymers, and fibers. INVISTA data is no longer siloed at sites around the world because of an ambitious initiative to transform its operations by moving from business intelligence (BI) to artificial intelligence (AI). The data now resides in an Amazon Web Services (AWS) data lake.
“With our old solution, it took us two months the first time we tried to get just one plant site's historical data into a data scientist's hands for analysis. Through our optimization and right-sizing efforts, migrating our data centers to AWS is saving us more than $2 million a year.”
Tanner Gonzalez, Analytics Leader - INVISTA
Backup and archive
AWS offers a complete set of cloud storage services for backup and archiving. Amazon S3 Glacier and S3 Glacier Deep Archive are secure, durable, and extremely low-cost Amazon S3 cloud storage classes for data archiving and long-term backup.
Celgene is a global biopharmaceutical company that develops drug therapies for cancer and inflammatory disorders. Celgene runs many HPC workloads on hundreds of Amazon EC2 instances and uses Amazon S3 and Amazon S3 Glacier for durable long-term storage of petabytes of genomic data.
"Some of our genomic files are very large in size, even after compression, so we need the robust storage capabilities of Amazon S3 and Amazon Glacier.”
Lance Smith, Associate Director of IT - Celgene
Ryanair switched tape backups to the cloud using AWS Storage Gateway’s Tape Gateway and stored them in Amazon S3 Glacier and Amazon S3 Glacier Deep Archive for long-term storage. Ryanair eliminated the need for resources for ongoing support and management of physical tapes, and realized 65% savings in backup costs. Ryanair is Europe's largest airline group, flying more than 150 million passengers per year to more than 200 destinations on 2,400 daily flights.
Autodesk is a leader in 3D design, engineering, and entertainment software. Autodesk makes software for people who make things. Autodesk needed to back up 2.4 petabytes of data to Amazon S3 Glacier to reduce on-premises storage costs.
Autodesk decided to use Amazon S3 because of its low cost, pay-as-you-go model, high durability, and availability. S3 also has lifecycle management capabilities for long-term archival storage in Amazon S3 Glacier or Amazon S3 Glacier Deep Archive. The goal was to move this dataset to S3 as soon as possible and eventually transition it to Amazon S3 Glacier for long-term retention. This petabyte-scale data migration from on-premises storage to S3 was accomplished swiftly with minimal effort and was completely self-managed with AWS DataSync.
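The lifecycle transition described above can be sketched in a few lines. This is only an illustrative sketch, not Autodesk's actual configuration: the rule ID, the 30-day threshold, and the bucket name are hypothetical, and the `apply_lifecycle` helper assumes the `boto3` library and valid AWS credentials are available.

```python
def build_archive_lifecycle(days=30, storage_class="GLACIER", prefix=""):
    """Build a lifecycle configuration that transitions objects to an
    archive storage class after `days` days.

    The dict follows the shape accepted by boto3's
    put_bucket_lifecycle_configuration. The rule ID and the default
    threshold are illustrative assumptions.
    """
    if storage_class not in ("GLACIER", "DEEP_ARCHIVE"):
        raise ValueError("storage_class must be GLACIER or DEEP_ARCHIVE")
    return {
        "Rules": [
            {
                "ID": "archive-after-%d-days" % days,  # hypothetical rule name
                "Filter": {"Prefix": prefix},          # empty prefix = whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class}
                ],
            }
        ]
    }


def apply_lifecycle(bucket_name, config):
    """Apply the configuration to a bucket (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket_name, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # Print the rule only; call apply_lifecycle("my-example-bucket", ...) to apply it.
    print(build_archive_lifecycle(days=30))
```

Passing `storage_class="DEEP_ARCHIVE"` would target S3 Glacier Deep Archive instead.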
Nasdaq is home to over 4000 company listings and the market technology provider to over 100 marketplaces around the globe in 50 countries.
Nasdaq has some of its most critical data on Amazon S3 and S3 Glacier, and AWS has been a trusted partner for many years. AWS enables Nasdaq to meet its long-term data retention policies, lifecycle management needs for all data types, compliance and security requirements, and scaling demands in a highly regulated industry.
Growth is Ambra Health's hallmark. Since its founding, the medical data and image-management software-as-a-service (SaaS) provider has grown to manage more than five billion medical images.
“Using AWS, we can easily scale our medical image management platform to meet the needs of healthcare customers worldwide. It was very easy to deploy and be operational globally. We didn’t have to put a lot of resources into building new data centers and training people to manage them.”
Andrew Duckworth, Vice President of Business Development - Ambra Health
Many customers use Amazon S3 to store enterprise application data, as well as to store cloud native application production data. With Amazon S3, you can upload any amount of data and access it anywhere in order to deploy applications faster and reach more end users.
Nielsen is a global measurement and data analytics company, measuring what consumers watch and the advertising they’re exposed to. In 2019, Nielsen migrated its National Television Audience Measurement platform to AWS, and built a cloud-native local television rating application. To do so, the company built a data lake capable of storing 30 petabytes of data in Amazon S3, increasing its scale from measuring 40,000 households to more than 30 million households each day.
"We drastically increased the amount of data ingested, processed, and reported to our clients each day. Working with AWS and the services they provide allows us to do all of that at a much faster pace, with much greater velocity than we could have ever achieved before."
Scott Brown, general manager of TV & Audio - Nielsen
Fileforce provides more than 300 domestic and global corporate customers with cloud file storage and document management services. Customers use the Fileforce cloud-based application to securely store and manage their business content in the same folder structure as their on-premises file storage solutions. Fileforce began running its application on Amazon EC2 instances and using Amazon S3 for data storage.
3M Health Information Systems needed the agility to develop and deploy new applications faster, and determined that moving to the cloud was the best way to address its challenges of scalability, speed, and security.
3M HIS's applications run on hundreds of Amazon EC2 instances and use Amazon S3 for application data storage.
Myriota was founded to revolutionize the Internet of Things (IoT) by offering disruptively low-cost and long-battery-life global connectivity. Based in Adelaide, Australia, a focal point of the Australian space industry and home of the Australian Space Agency, Myriota has a growing portfolio of more than 20 patents, and support from major Australian and international investors. The technology has a deep heritage in telecommunications research: the world-first transmission of IoT data direct to a nanosatellite was achieved in 2013. Myriota has made this ground-breaking technology commercially available for partners worldwide.
“We depend on Amazon S3 as a critical staging area where we hold data for processing the core of our data platform. AWS Transfer for SFTP has helped us simplify the security of sensor data coming over from our customers' widespread sites over the Myriota Network of satellites to Amazon S3.”
Andrew Beck, Director of Service Delivery – Myriota
Thousands of hiring managers worldwide rely on Lever software to find, nurture, and manage their job candidates in a central location. Since its founding, Lever has run its application environment on the AWS Cloud, taking advantage of services including Amazon EC2 for on-demand compute capacity, and Amazon S3 for storing customer data.
Monzo has grown from an idea to a fully regulated bank on the AWS Cloud. A bank that “lives on your smartphone,” Monzo has already handled £1 billion worth of transactions for half a million customers in the UK. Monzo runs more than 1600 core-banking microservices on AWS, using services including Amazon EC2 and Amazon S3.
"By using AWS, we can run a bank with more than 4 million customers with just eight people on our infrastructure and reliability team."
Matt Heath, Distributed Systems Engineer - Monzo Bank
Customers use Amazon S3 to save on storage costs, and new storage classes and features continually help optimize storage costs even further. With S3 Intelligent-Tiering and S3 Glacier Deep Archive, customers can automate storage cost savings or use the lowest-cost cloud storage across three Availability Zones.
Founded in 2008, Zalando is Europe’s leading online platform for fashion and lifestyle with over 32 million active customers. Amazon S3 is the cornerstone of the data infrastructure of Zalando, and they have utilized S3 Storage Classes to optimize storage costs.
"We are saving 37% annually in storage costs by using Amazon S3 Intelligent-Tiering to automatically move objects that have not been touched within 30 days to the infrequent-access tier."
Max Schultze, Lead Data Engineer - Zalando
Teespring, an online platform that lets creators turn unique ideas into custom merchandise, experienced rapid business growth, and the company’s data also grew exponentially—to a petabyte—and continued to increase. Like many cloud native companies, Teespring addressed the problem by using AWS, specifically storing data on Amazon S3.
By using Amazon S3 Glacier and S3 Intelligent-Tiering, Teespring now saves more than 30 percent on its monthly storage costs.
"Just as Teespring simplified the process for bringing physical products to market, AWS simplified how businesses approach cloud and infrastructure. Because of the services AWS provides and the ease with which we can implement them."
James Brady, Vice President of Engineering - Teespring
AppsFlyer, a marketing analytics and attribution platform, built its data lake on Amazon S3 to collect terabytes of data each day, enabling the company to improve its analytics products and increase customer satisfaction. AppsFlyer further optimizes its storage costs using Amazon S3 Intelligent-Tiering.
Using AWS, SimilarWeb manages large volumes of data, with which its data scientists build algorithms to improve its market-intelligence platform. By using Amazon S3 Intelligent-Tiering, SimilarWeb is able to democratize that data for its employees and save 20 percent on storage costs.
Photobox wanted to get out of the business of owning and maintaining its own IT infrastructure so it could redeploy resources toward innovation in artificial intelligence and other areas to create a better customer experience. Photobox is an online, personalized photo-products company that serves millions of customers each year in over ten markets.
By migrating from its EMC Isilon and IBM Cleversafe on-premises storage arrays to Amazon S3 using AWS Snowball Edge, Photobox significantly reduced storage costs for its 10 PB of photos.
Union Bank of the Philippines (UnionBank) aims to improve what it calls “prosperity inclusion,” by attracting a total of 50 million customers by the year 2020. Key to this objective is its digital transformation on AWS.
Since moving to Amazon S3 and S3 Glacier, the bank is saving 20 million pesos (US$380,500) annually, a figure that would double once it completely migrates its Tier 1 workloads. This figure excludes the savings on electricity for cooling the backup tapes and the reduction in staff hours required to monitor, change, and store the tapes.