AWS Storage Gateway in 2019
Tens of thousands of customers around the world use AWS Storage Gateway to connect their on-premises applications with AWS. Over the last 12 months, Storage Gateway launched over 20 new features. These range from 4x performance improvements to new deployment options, including hardware appliances and VMware high availability configurations, as well as several features across all three types of gateways (file gateway, tape gateway, and volume gateway). While backup remains a popular use case for Storage Gateway, customers are increasingly using it for additional hybrid use cases. Examples include replacing on-premises storage with cloud-backed storage and providing distributed access to data in AWS.
In this post, I want to share with you the new features we’ve launched over the last year and show you how customers are using Storage Gateway to access cloud storage from their on-premises apps.
What is AWS Storage Gateway?
AWS Storage Gateway provides on-premises access to virtually unlimited cloud storage. It supports standard storage protocols such as NFS, SMB, iSCSI, and iSCSI-VTL, so existing applications can use AWS storage without making any changes. It provides a cache that delivers low latencies for frequently accessed data and optimizes data transfers to AWS. In addition, gateways are managed through AWS, the same way you manage your other AWS services. This means using the AWS Management Console, CLI, SDK, and services such as Amazon CloudWatch and AWS CloudTrail.
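As a rough illustration of managing gateways through the SDK, the sketch below groups the gateways in a Region by type using the AWS SDK for Python (boto3). The helper names, the sample ARNs, and the sample gateway names are hypothetical; the API call only runs inside `list_gateways_by_type`, which needs AWS credentials, so the grouping logic can be tried offline first.

```python
from collections import defaultdict


def group_gateways_by_type(pages):
    """Group gateway names by gateway type across ListGateways response pages."""
    grouped = defaultdict(list)
    for page in pages:
        for gw in page.get("Gateways", []):
            # Fall back to the ARN when a gateway has no friendly name.
            name = gw.get("GatewayName") or gw["GatewayARN"]
            grouped[gw.get("GatewayType", "UNKNOWN")].append(name)
    return dict(grouped)


def list_gateways_by_type(region_name):
    """Call the Storage Gateway API and group the results (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above runs without the SDK installed

    client = boto3.client("storagegateway", region_name=region_name)
    pages = client.get_paginator("list_gateways").paginate()
    return group_gateways_by_type(pages)


# Offline illustration with a hand-written response page (all values hypothetical).
SAMPLE_PAGES = [
    {
        "Gateways": [
            {
                "GatewayARN": "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE1",
                "GatewayType": "FILE_S3",
                "GatewayName": "branch-office-files",
            },
            {
                "GatewayARN": "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE2",
                "GatewayType": "VTL",
                "GatewayName": "backup-tapes",
            },
        ]
    }
]

print(group_gateways_by_type(SAMPLE_PAGES))
```

With credentials configured, `list_gateways_by_type("us-east-1")` would return the same kind of dictionary for real gateways.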
AWS Storage Gateway enhancements over the last 12 months
- Increased write performance of File Gateway to up to 4 Gbps and Tape Gateway to up to 2.7 Gbps, expanding the use of Storage Gateway to applications and workloads that need even higher performance.
- Increased performance for reading files from File Gateway’s local cache to speeds up to 4.8 Gbps and for reading files from AWS up to 0.8 Gbps. This provides faster access to data stored in the cache or in AWS.
- Increased file gateway directory list performance 4x, providing NFS or SMB users and applications quicker access to file metadata.
- Increased read performance for virtual tape libraries managed by Tape Gateway 3x, up to 2 Gbps, helping further reduce time to recover tapes from AWS.
- File Gateway added support for Access Control Lists (ACLs) to SMB shares, enabling fine-grained access controls on individual files and folders in the gateway’s file share.
- File Gateway added options to enforce encryption and signing for SMB file shares, helping meet organization requirements for security and client compatibility.
- File Gateway enhancements to manage selective cache refreshes, further simplifying collaboration and content distribution workflows in geographically separated hybrid cloud environments.
- Tape Gateway support for S3 Glacier Deep Archive makes it possible to store virtual tapes in the lowest-cost Amazon S3 storage class, at $1 per TB per month.
- Tape Gateway support for moving tapes from S3 Glacier to S3 Glacier Deep Archive, further reducing the monthly cost to store long-term data in the cloud by up to 75%.
- Tape Gateway support for IBM Spectrum Protect. Tape Gateway now supports backup apps from leading ISVs including Veritas, Veeam, Commvault, IBM Spectrum Protect, Dell EMC, and Microsoft. We also increased tape size support from 2.5 TB to 5 TB, making it easier to manage tapes by storing more data on each tape.
- Volume Gateway integrated with AWS Backup, enabling use of AWS Backup to manage retention and scheduling of EBS Snapshots generated through the Volume Gateway.
- Volume Gateway detach and attach capability to make it easy to move volumes between different host platforms, which is useful during server migrations.
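To put the Tape Gateway cost figures above into numbers, here is a small worked calculation. It uses only the two figures from this post (about $1 per TB per month in S3 Glacier Deep Archive, and savings of up to 75% when moving tapes from S3 Glacier). The S3 Glacier rate is derived from those two figures rather than quoted from a price list, and the 500 TB archive size is a hypothetical example. Retrieval fees, request fees, and minimum storage durations are not modeled.

```python
DEEP_ARCHIVE_USD_PER_TB_MONTH = 1.00  # from the post: about $1 per TB per month
MAX_SAVINGS_FRACTION = 0.75           # from the post: "up to 75%"


def implied_source_rate(target_rate, savings_fraction):
    """Rate of the source tier implied by the target rate and the savings fraction."""
    return target_rate / (1 - savings_fraction)


def monthly_storage_cost(size_tb, rate_usd_per_tb_month):
    """Storage-only monthly cost; ignores retrieval and request charges."""
    return size_tb * rate_usd_per_tb_month


archive_tb = 500  # hypothetical archive size
glacier_rate = implied_source_rate(DEEP_ARCHIVE_USD_PER_TB_MONTH, MAX_SAVINGS_FRACTION)
before = monthly_storage_cost(archive_tb, glacier_rate)
after = monthly_storage_cost(archive_tb, DEEP_ARCHIVE_USD_PER_TB_MONTH)

print(f"Implied S3 Glacier rate: ${glacier_rate:.2f} per TB per month")
print(f"Monthly cost before: ${before:,.2f}, after: ${after:,.2f}, saved: ${before - after:,.2f}")
```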
Deployment configurations, management, and monitoring
- Introduced a High Availability configuration for VMware, which allows Storage Gateway to recover from most service interruptions in under 60 seconds, enabling business-critical applications to run uninterrupted. This protects storage workloads against hardware, hypervisor, or network failures. It also protects against software errors such as connection timeouts and file share or volume unavailability. This feature reduces the need to write custom scripts for health checks, auto-restart, and alerting.
- Expanded local storage managed through the hardware appliance from 5 TB to 12 TB, providing a larger local cache, and introduced 10 Gbps fiber optic network card support to provide additional datacenter deployment options. The appliance can now be purchased in Europe as well, from Amazon UK or Amazon Germany.
- Added support for VPC endpoints with AWS PrivateLink so the network connection between an on-premises Storage Gateway and AWS can be restricted to private network routes. This further secures storage workloads and administration activities.
- Introduced support for adding tags and setting fine-grained access controls on Storage Gateway resources, making it possible to easily organize, search, and identify resources, create cost allocation reports, and control access to resources.
- New embedded CloudWatch graphs for continuously monitoring throughput, cache utilization, and access patterns, which makes it easy to adapt storage, compute, or network resources assigned to the gateway to optimize the gateway’s performance as workloads change.
- New CloudWatch Logging to log configuration errors such as insufficient bucket access privileges, or when applications use the gateway to access files that have transitioned to long-term Amazon S3 storage classes, providing proactive notification to correct errors.
- Added granular controls to manage application of software updates to align Storage Gateway software updates with enterprise IT policies and maintenance windows.
- Launched Storage Gateway in Sweden, Bahrain, and Hong Kong, and Tape Gateway in South America (Sao Paulo). Storage Gateway is now available in 20 AWS Regions, including AWS GovCloud (US-West) and China (Beijing).
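The cache utilization shown in the embedded CloudWatch graphs can also be read programmatically. The sketch below builds a query for the CloudWatch `get_metric_statistics` call in boto3. The namespace `AWS/StorageGateway`, the metric name `CachePercentUsed`, and the `GatewayId`/`GatewayName` dimensions are written from memory of the Storage Gateway monitoring documentation and should be treated as assumptions to verify; the gateway ID and name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cache_usage_query(gateway_id, gateway_name, end, hours=3, period_seconds=300):
    """Build keyword arguments for cloudwatch.get_metric_statistics."""
    return {
        "Namespace": "AWS/StorageGateway",   # assumed namespace, verify in the docs
        "MetricName": "CachePercentUsed",    # assumed metric name, verify in the docs
        "Dimensions": [
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average", "Maximum"],
    }


def fetch_cache_usage(gateway_id, gateway_name, hours=3):
    """Run the query against CloudWatch (needs AWS credentials)."""
    import boto3  # imported lazily so the query builder works without the SDK

    query = cache_usage_query(gateway_id, gateway_name, datetime.now(timezone.utc), hours)
    response = boto3.client("cloudwatch").get_metric_statistics(**query)
    # Datapoints are not guaranteed to arrive in time order.
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])


# Offline illustration: inspect the query without calling AWS.
example = cache_usage_query(
    "sgw-EXAMPLE1", "branch-office-files", datetime(2019, 12, 1, 12, 0, tzinfo=timezone.utc)
)
print(example["StartTime"], "to", example["EndTime"])
```

A sustained high cache percentage could be a signal to add cache disk to the gateway, which is the kind of adjustment the monitoring features above are meant to support.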
How are customers using AWS Storage Gateway?
Customers in all stages of their AWS journey are using AWS Storage Gateway – whether they are just starting to use the cloud, are in the process of migrating to the cloud with applications and data both on-premises and in the cloud, or have moved to AWS but need on-premises access to data in the cloud.
Storage Gateway is an easy way for customers to start using AWS, since they can use it to move their on-premises backups to AWS. This allows customers to free up on-premises storage, while durably storing data in AWS.
Then there are customers that are moving to the cloud and want to minimize their on-premises storage footprint, but often need on-premises access to storage for their existing apps. These customers use Storage Gateway as a way to replace on-premises storage with cloud-backed storage, which allows their existing applications to operate without changes, while still getting the benefits of storing and processing this data in AWS.
The third category includes customers that run their apps in AWS and want to make the results available from multiple on-premises locations such as data centers or branch and remote offices. Also, customers that have moved their on-prem archives to AWS often want to make this data available for access from existing on-premises applications. These customers use Storage Gateway as a way to distribute and share data in AWS.
Let me tell you about these use cases in more detail, along with some of the features of Storage Gateway that help meet these use cases.
Use case 1: Move backups to the cloud
You can use cloud storage for on-premises data backups to reduce infrastructure and administration costs. You can use all three types of gateways for backup. This enables you to back up files, applications, databases, and volumes, either directly to Amazon S3, to EBS snapshots, or to virtual tape libraries. Customer examples include Analog Devices using Tape Gateway, Kellogg’s using File Gateway for database backups, and StemCell using Volume Gateway.
File Gateway for on-premises backups
Databases and applications are often backed up directly to an on-premises file server. You can instead point these backups at the File Gateway, which copies the data to Amazon S3. You can configure an S3 Lifecycle policy on your bucket to move this data to any storage class in Amazon S3, depending on your retention needs. The data is available as objects in S3, and you can process it in AWS or take advantage of Amazon S3 features such as S3 Object Lock for compliance. In addition, recently used data is kept in the cache, so you have immediate access to recent backups. You can restore the data on-premises through the gateway, or in AWS to databases running in the cloud, including Amazon RDS. Learn more by checking out this blog ‘Store SQL Server backups in Amazon S3 using AWS Storage Gateway.’
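Moving backup objects to a colder storage class is usually expressed as an S3 Lifecycle configuration on the bucket. The following is a minimal sketch using boto3's `put_bucket_lifecycle_configuration`; the helper names, the bucket name, the `sql-backups/` prefix, and the 30-day threshold are hypothetical. Note that files whose objects have transitioned to archival storage classes are not directly readable through the gateway, which is the situation the CloudWatch logging feature described earlier reports.

```python
ALLOWED_CLASSES = {"STANDARD_IA", "ONEZONE_IA", "INTELLIGENT_TIERING", "GLACIER", "DEEP_ARCHIVE"}


def backup_lifecycle_config(prefix, transition_after_days, storage_class, expire_after_days=None):
    """Build a LifecycleConfiguration dict with one transition rule for a key prefix."""
    if storage_class not in ALLOWED_CLASSES:
        raise ValueError(f"unsupported storage class: {storage_class}")
    if transition_after_days < 0:
        raise ValueError("transition_after_days must be non-negative")
    rule = {
        "ID": f"tier-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": transition_after_days, "StorageClass": storage_class}],
    }
    if expire_after_days is not None:
        # Optional: delete backups once the retention period has passed.
        rule["Expiration"] = {"Days": expire_after_days}
    return {"Rules": [rule]}


def apply_lifecycle(bucket, config):
    """Attach the configuration to the bucket (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


# Offline illustration: move backups under sql-backups/ to S3 Glacier after 30 days.
config = backup_lifecycle_config("sql-backups/", 30, "GLACIER", expire_after_days=365)
print(config)
```

Applying it would be a single call such as `apply_lifecycle("example-backup-bucket", config)`. Keep in mind that this call replaces any lifecycle configuration already on the bucket.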
Volume Gateway for on-premises backups
The Volume Gateway provides either a local cache or full volumes on-premises, while also storing full copies of your volumes in the AWS Cloud. Volume Gateway also provides Amazon EBS snapshots of your data for backup or disaster recovery. You can now also use AWS Backup to schedule these snapshots and manage their retention.
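As an illustration of managing snapshot scheduling and retention with AWS Backup, the sketch below builds a backup plan for boto3's `create_backup_plan`. The plan name, vault name, schedule, and retention period are hypothetical. Assigning specific Volume Gateway volumes to the plan is a separate step (a backup selection) that is not shown here.

```python
def snapshot_backup_plan(plan_name, vault_name, schedule_expression, retain_days):
    """Build the BackupPlan document for backup.create_backup_plan."""
    if retain_days < 1:
        raise ValueError("retain_days must be at least 1")
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": f"{plan_name}-rule",
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": schedule_expression,  # AWS cron syntax, in UTC
                "Lifecycle": {"DeleteAfterDays": retain_days},
            }
        ],
    }


def create_plan(plan):
    """Create the plan in AWS Backup and return its ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    response = boto3.client("backup").create_backup_plan(BackupPlan=plan)
    return response["BackupPlanId"]


# Offline illustration: daily snapshots at 05:00 UTC, kept for 35 days.
plan = snapshot_backup_plan("volume-gateway-daily", "Default", "cron(0 5 ? * * *)", 35)
print(plan)
```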
Tape Gateway for on-premises backups
You can use Tape Gateway to replace physical tapes with virtual tapes in AWS. Tape Gateway acts as a drop-in replacement for tape libraries, tape media, and archiving services, without requiring changes to existing software or archiving workflows (see our Tape Gateway blog for more information). Through Tape Gateway’s integration with Amazon S3 storage classes, including S3 Glacier Deep Archive, customers store virtual tapes with higher durability and lower cost than physical tape. This can be as little as about $1 per TB per month, which is the lowest-cost storage available in the cloud. AWS is the only cloud provider to provide a service such as Tape Gateway for virtual tape backups. Learn more about how you can start using Tape Gateway today for your backup and archival needs by watching this video.
Use Case 2: Shift on-premises storage to cloud-backed file shares
A number of our customers have told us they have on-premises applications that need easy-to-use, cost-effective, scalable file storage. However, they often run out of capacity on their on-premises storage arrays, and face expensive hardware replacement cycles every three-to-five years. Many on-premises file workloads (for example, web servers, logging, and database backups) do not need expensive storage arrays. Instead, you can use File Gateway, which gives on-premises access to virtually unlimited cloud storage for files stored as Amazon S3 objects. It provides on-premises Windows and Linux applications easy integration to durable storage in Amazon S3 using SMB or NFS interfaces. You simply point applications to write to the gateway and the data seamlessly arrives in AWS.
Using the File Gateway in such a way allows you to keep using your existing applications, but reduce the amount of storage you must provision on-premises. The cache provides low-latency access to the working dataset and you get the elasticity of cloud storage. The data is stored as S3 objects and you can access it using the S3 API. You can use the ‘Notify-When-Uploaded’ event to know when your data has arrived in AWS, and automatically trigger an AWS Lambda function for further processing data in AWS. This enables you to easily manage distributed data pipelines. File Gateway thus allows you to replace your entry-level and midrange on-premises NAS with cloud backed storage, with the additional benefit of getting access to data in the cloud for further processing. You can now access your data through NFS or SMB on-premises and through the S3 API in AWS.
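The upload notification flow described above can be sketched as a small AWS Lambda handler. The event layout used here (a `detail` object with `event-type` and `notification-id`, plus share and bucket ARNs under `resources`) is an assumption based on how the notification is delivered through CloudWatch Events, so check the field names against the current Storage Gateway documentation before relying on them. The sample event and its ARNs are hypothetical, and the processing step is a placeholder.

```python
def summarize_upload_event(event):
    """Extract the fields of interest from an upload notification event (assumed layout)."""
    detail = event.get("detail", {})
    resources = event.get("resources", [])
    return {
        "complete": detail.get("event-type") == "upload-complete",
        "notification_id": detail.get("notification-id"),
        "file_shares": [arn for arn in resources if ":share/" in arn],
        "buckets": [
            arn.split(":::", 1)[1] for arn in resources if arn.startswith("arn:aws:s3:::")
        ],
    }


def handler(event, context):
    """Lambda entry point: react only to completed uploads."""
    summary = summarize_upload_event(event)
    if not summary["complete"]:
        return {"status": "ignored"}
    # Placeholder for real work, for example starting a processing job on the bucket.
    print(f"Upload {summary['notification_id']} finished for {summary['buckets']}")
    return {"status": "processed", **summary}


# Hypothetical sample event for local testing; not captured from a real gateway.
SAMPLE_EVENT = {
    "source": "aws.storagegateway",
    "resources": [
        "arn:aws:storagegateway:us-east-2:111122223333:share/share-EXAMPLE1",
        "arn:aws:storagegateway:us-east-2:111122223333:gateway/sgw-EXAMPLE1",
        "arn:aws:s3:::example-backup-bucket",
    ],
    "detail": {"event-type": "upload-complete", "notification-id": "example-notification-id"},
}

print(handler(SAMPLE_EVENT, None))
```

Keeping the parsing in `summarize_upload_event` separate from the handler makes it easy to adjust if the real event fields differ from the assumed ones.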
Use Case 3: Low latency access to data in AWS for on-premises applications
With the first two use cases, I described how Storage Gateway allows you to store data in AWS, either for backups or as a replacement for your on-premises file servers. There is a third use case where customers have data in the cloud that they want to share with and distribute to remote locations. For example, data may be generated by a genomics application in AWS that researchers must also access from on-premises. In other cases, a media archive or geospatial dataset may have been moved from on-premises for archival in AWS using services such as AWS Snowball or AWS DataSync, but customers want “on-demand” access to this data from existing on-premises applications.
With File Gateway, customers have on-demand, low-latency access to data stored in AWS for application workflows that can span the globe. You can use features such as cache refresh to refresh data in geographically distributed locations so content can be easily shared across your offices. One of our customers recently told me, “File Gateway is magical, with a TB of cache on-premises, I have access to 10s of TBs in AWS.”
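A selective cache refresh can be requested through the Storage Gateway `RefreshCache` API. The sketch below builds the request for boto3's `refresh_cache` call; the file share ARN and folder names are hypothetical, and the helper names are illustrative rather than part of any AWS library.

```python
def refresh_cache_request(file_share_arn, folders=None, recursive=True):
    """Build keyword arguments for storagegateway.refresh_cache."""
    return {
        "FileShareARN": file_share_arn,
        # Refresh only the listed folders; fall back to the share root.
        "FolderList": list(folders) if folders else ["/"],
        "Recursive": recursive,
    }


def refresh_share(file_share_arn, folders=None, recursive=True):
    """Ask the gateway to refresh its view of the bucket (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("storagegateway")
    response = client.refresh_cache(**refresh_cache_request(file_share_arn, folders, recursive))
    return response.get("NotificationId")


# Offline illustration: refresh two folders without descending into subfolders.
request = refresh_cache_request(
    "arn:aws:storagegateway:eu-west-1:111122223333:share/share-EXAMPLE1",
    folders=["/renders/2019", "/genomics/results"],
    recursive=False,
)
print(request)
```

Limiting the refresh to the folders that changed keeps the operation short on large buckets, which is the point of the selective refresh enhancement listed earlier.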
If you’re coming to re:Invent, you can learn more at the following sessions:
- The breakout session STG305-R – [REPEAT] Build hybrid storage architectures with AWS Storage Gateway, R1, featuring speakers from FINRA and Bristol Myers Squibb.
- The breakout session STG217 – Shift your tape backups to AWS to save time and money, featuring speakers from Ryanair on how they used Tape Gateway to move their physical tapes to virtual tapes in AWS.
- The interactive chalk talk STG354 – Large-scale file migrations with AWS DataSync, featuring best practices from Cox Automotive’s migration of 700 million files to AWS, and how they then access this data via File Gateway.
- The interactive chalk talk STG220 – How to move 700 TB over the wire to AWS with AWS DataSync, where we show how Autodesk migrated over 700 terabytes (TB) of data from their Dell EMC Data Domain storage systems to Amazon S3 within two months, with minimal setup and administration.
- The breakout session STG211 – How to use AWS storage for on-premises file-based applications, where you’ll learn how you can use AWS storage for on-premises use cases, including user home directories, cloud-backed file shares for applications, content repositories, analytics workloads, and enterprise business applications.
- The breakout session STG204-R – [REPEAT] Get your data to AWS: How to choose and use data migration services, R1, where you’ll learn how to choose and combine services—including AWS DataSync, the AWS Snow family, CloudEndure, and AWS Transfer for SFTP—for your different use cases.
You can also try out Storage Gateway yourself in a few other sessions:
- Workshop STG313 – Hybrid architectures for database backups & file migrations
- Workshop STG316 – Get hands-on & learn best practices for AWS data migrations
- Builders sessions STG225-R – [REPEAT] Getting started with hybrid file storage using File Gateway, R1, R2
- Builders sessions STG226-R – [REPEAT] Hands-on with hybrid block storage using a Volume Gateway, R1