Amazon S3 has various features you can use to organize and manage your data in ways that support specific use cases, enable cost efficiencies, enforce security, and meet compliance requirements. Data is stored as objects within resources called “buckets”, and a single object can be up to 5 terabytes in size. S3 features include capabilities to append metadata tags to objects, move and store data across the S3 Storage Classes, configure and enforce data access controls, secure data against unauthorized users, run big data analytics, and monitor data at the object and bucket levels.
Storage management and monitoring
Amazon S3’s flat, non-hierarchical structure and various management features are helping customers of all sizes and industries organize their data in ways that are valuable to their businesses and teams. All objects are stored in S3 buckets and can be organized with shared names called prefixes. You can also append up to 10 key-value pairs called S3 object tags to each object, which can be created, updated, and deleted throughout an object’s lifecycle. To keep track of objects and their respective tags, buckets, and prefixes, you can use an S3 Inventory report that lists your stored objects within an S3 bucket or with a specific prefix, and their respective metadata and encryption status. S3 Inventory can be configured to generate reports on a daily or a weekly basis.
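As a minimal sketch of the tagging limit described above, the following Python helper turns a plain dictionary into a tag set and rejects more than 10 key-value pairs. The function name `build_tag_set` and the sample tags are assumptions made for illustration; the `TagSet` list of `Key`/`Value` pairs follows the shape used by the S3 object tagging API.

```python
MAX_TAGS_PER_OBJECT = 10  # limit stated in the text above


def build_tag_set(tags):
    """Convert a plain dict into a TagSet structure for object tagging."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(
            f"an object can carry at most {MAX_TAGS_PER_OBJECT} tags, got {len(tags)}"
        )
    return {"TagSet": [{"Key": str(k), "Value": str(v)} for k, v in tags.items()]}


# Hypothetical tags, for illustration only.
tagging = build_tag_set({"project": "atlas", "classification": "internal"})
```

With an SDK such as boto3, a structure like `tagging` would typically be passed as the `Tagging` argument of `put_object_tagging`.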
With S3 bucket names, prefixes, object tags, and S3 Inventory, you have a range of ways to categorize and report on your data, and you can then configure other S3 features to take action. S3 Batch Operations makes it simple to manage your data in Amazon S3 at any scale, whether you store thousands of objects or a billion. With S3 Batch Operations, you can copy objects between buckets, replace object tag sets, modify access controls, and restore archived objects from Amazon S3 Glacier, with a single S3 API request or a few clicks in the Amazon S3 Management Console. You can also use S3 Batch Operations to run AWS Lambda functions across your objects to execute custom business logic, such as processing data or transcoding image files. To get started, specify a list of target objects by using an S3 Inventory report or by providing a custom list, and then select the desired operation from a pre-populated menu. When an S3 Batch Operations request completes, you receive a notification and a completion report of all changes made. S3 Batch Operations is currently available in preview.
Amazon S3 also supports features that help maintain data version control, prevent accidental deletions, and replicate data in other AWS Regions. With S3 Versioning, you can easily preserve, retrieve, and restore every version of an object stored in Amazon S3, which allows you to recover from unintended user actions and application failures. To prevent accidental deletions, enable Multi-Factor Authentication (MFA) Delete on an S3 bucket. If you try to delete an object stored in an MFA Delete-enabled bucket, it will require two forms of authentication: your AWS account credentials and the concatenation of a valid serial number, a space, and the six-digit code displayed on an approved authentication device, like a hardware key fob or a Universal 2nd Factor (U2F) security key. With S3 Cross-Region Replication (CRR), you can replicate objects (and their respective metadata and object tags) into other AWS Regions for reduced latency, compliance, security, disaster recovery, and other use cases. S3 CRR is configured to a source S3 bucket and replicates objects into a destination bucket in another AWS Region.
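The second authentication factor for MFA Delete is just a string built from three parts. A minimal sketch of that concatenation follows; the function name `build_mfa_value` and the sample serial number and code are hypothetical.

```python
import re


def build_mfa_value(serial_number, code):
    """Join the device serial number and the six-digit code with one space."""
    if not serial_number or " " in serial_number:
        raise ValueError("the serial number must be non-empty and contain no spaces")
    if not re.fullmatch(r"\d{6}", code):
        raise ValueError("the authentication code must be exactly six digits")
    return f"{serial_number} {code}"


# Hypothetical serial number and code, for illustration only.
mfa_value = build_mfa_value("arn:aws:iam::123456789012:mfa/example-user", "123456")
```

In boto3, a value like this is what the `MFA` parameter of `delete_object` expects when the bucket has MFA Delete enabled.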
You can also enforce write-once-read-many (WORM) policies with S3 Object Lock. This S3 management feature blocks object version deletion during a customer-defined retention period so that you can enforce retention policies as an added layer of data protection or to meet compliance obligations. You can migrate workloads from existing WORM systems into Amazon S3, and configure S3 Object Lock at the object- and bucket-levels to prevent object version deletions prior to a pre-defined Retain Until Date or Legal Hold Date. Objects with S3 Object Lock retain WORM protection, even if they are moved to different storage classes with an S3 Lifecycle policy. To track what objects have S3 Object Lock, you can refer to an S3 Inventory report that includes the WORM status of objects. S3 Object Lock can be configured in one of two modes. When deployed in Governance mode, AWS accounts with specific IAM permissions are able to remove S3 Object Lock from objects. If you require stronger immutability in order to comply with regulations, you can use Compliance Mode. In Compliance Mode, the protection cannot be removed by any user, including the root account.
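To make the two modes and the Retain Until Date concrete, here is a small sketch that builds a retention setting. The helper `build_retention` and the one-year period are assumptions for illustration; the `Mode` and `RetainUntilDate` keys mirror the retention structure used by the S3 API.

```python
from datetime import datetime, timedelta, timezone

VALID_MODES = {"GOVERNANCE", "COMPLIANCE"}


def build_retention(mode, retain_days, now=None):
    """Return a retention setting with a mode and a Retain Until Date."""
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if retain_days <= 0:
        raise ValueError("retain_days must be positive")
    # Default to the current UTC time when no reference time is supplied.
    now = now or datetime.now(timezone.utc)
    return {"Mode": mode, "RetainUntilDate": now + timedelta(days=retain_days)}


# Fixed reference time so the example is reproducible.
retention = build_retention(
    "compliance", 365, now=datetime(2024, 1, 1, tzinfo=timezone.utc)
)
```

A dictionary like `retention` could then be supplied as the `Retention` argument of boto3's `put_object_retention`.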
In addition to these management capabilities, you can use S3 features and other AWS services to monitor and control how your S3 resources are being used. You can apply tags to S3 buckets in order to allocate costs across multiple business dimensions (such as cost centers, application names, or owners), and then use AWS Cost Allocation Reports to view usage and costs aggregated by the bucket tags. You can also use Amazon CloudWatch to track the operational health of your AWS resources and configure billing alerts that are sent to you when estimated charges reach a user-defined threshold. Another AWS monitoring service is AWS CloudTrail, which tracks and reports on bucket-level and object-level activities. You can configure S3 Event Notifications to trigger workflows, send alerts, and invoke AWS Lambda functions when a specific change is made to your S3 resources. S3 Event Notifications can be used to automatically transcode media files as they are uploaded to Amazon S3, process data files as they become available, or synchronize objects with other data stores.
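When an event notification invokes a Lambda function, the function receives a list of records that name the affected bucket and object. The sketch below shows one way a handler might extract them; the handler body, the bucket name, and the object key are hypothetical, while the `Records`, `s3`, `bucket`, and `object` fields follow the S3 event message structure.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Return (bucket, key) pairs for every record in an S3 event notification."""
    changed = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in the event payload, so decode them.
        key = unquote_plus(s3_info["object"]["key"])
        changed.append((bucket, key))
    return changed


# Hypothetical event with one uploaded object, for illustration only.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-media-bucket"},
                "object": {"key": "uploads/holiday+video.mp4"},
            },
        }
    ]
}
result = handler(sample_event)
```

A real handler would go on to transcode, process, or copy each object rather than only listing it.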
With Amazon S3, you can store data across a range of different S3 Storage Classes: S3 Standard, S3 Intelligent-Tiering, S3 Standard-Infrequent Access (S3 Standard-IA), S3 One Zone-Infrequent Access (S3 One Zone-IA), Amazon S3 Glacier (S3 Glacier), and Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive) (coming soon).
Every S3 Storage Class supports a specific data access level at corresponding costs. This means you can store mission-critical production data in S3 Standard for frequent access, save costs by storing infrequently accessed data in S3 Standard-IA or S3 One Zone-IA, and archive data at the lowest costs in the archival storage classes — S3 Glacier and S3 Glacier Deep Archive. You can use S3 Storage Class Analysis to monitor access patterns across objects to discover data that should be moved to lower-cost storage classes. Then you can use this information to configure an S3 Lifecycle policy that makes the data transfer. S3 Lifecycle policies can also be used to expire objects at the end of their lifecycles. You can store data with changing or unknown access patterns in S3 Intelligent-Tiering, which automatically moves your data based on changing access patterns between a frequent access tier and a lower-cost infrequent access tier for cost savings.
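An S3 Lifecycle policy of the kind described above can be sketched as a plain configuration structure. In this example the rule ID, prefix, and day counts are assumptions chosen for illustration; the keys (`Rules`, `Filter`, `Transitions`, `Expiration`) and the storage class names follow the lifecycle configuration format of the S3 API.

```python
def build_lifecycle_rule(rule_id, prefix, transitions, expire_after_days):
    """Build one lifecycle rule: transition objects over time, then expire them."""
    # Order transitions by age so the rule reads from warmest to coldest storage.
    ordered = sorted(transitions, key=lambda t: t[0])
    if ordered and expire_after_days <= ordered[-1][0]:
        raise ValueError("expiration must come after the last transition")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in ordered
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical rule: move logs to S3 Standard-IA after 30 days,
# to S3 Glacier after 90 days, and expire them after one year.
lifecycle_configuration = {
    "Rules": [
        build_lifecycle_rule(
            "archive-logs",
            "logs/",
            [(90, "GLACIER"), (30, "STANDARD_IA")],
            expire_after_days=365,
        )
    ]
}
```

With boto3, a structure like this would typically be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`.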
Access management and security
To protect your data in Amazon S3, by default, users only have access to the S3 resources they create. You can grant access to other users by using one or a combination of the following access management features: AWS Identity and Access Management (IAM) to create users and manage their respective access; Access Control Lists (ACLs) to make individual objects accessible to authorized users; bucket policies to configure permissions for all objects within a single S3 bucket; and Query String Authentication to grant time-limited access to others with temporary URLs. Amazon S3 also supports Audit Logs that list the requests made against your S3 resources for complete visibility into who is accessing what data.
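As an illustration of a bucket policy, the sketch below generates a policy document that lets a single principal read every object in one bucket. The bucket name, the account ID, and the user name are hypothetical, and a real policy would usually be narrower or carry conditions.

```python
import json


def build_read_only_policy(bucket_name, principal_arn):
    """Return a bucket policy JSON string granting read access to one principal."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForOnePrincipal",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                # The trailing /* applies the statement to all objects in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical bucket and principal, for illustration only.
policy_json = build_read_only_policy(
    "example-bucket", "arn:aws:iam::123456789012:user/example-user"
)
```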
Amazon S3 offers flexible security features to block unauthorized users from accessing your data. Use VPC endpoints to connect to S3 resources from your Amazon Virtual Private Cloud (Amazon VPC). Amazon S3 supports both server-side encryption (with three key management options) and client-side encryption for data uploads. Use S3 Inventory to check the encryption status of your S3 objects (see storage management for more information on S3 Inventory).
S3 Block Public Access is a set of security controls that ensures S3 buckets and objects do not have public access. With a few clicks in the Amazon S3 Management Console, you can apply the S3 Block Public Access settings to all buckets within your AWS account or to specific S3 buckets. Once the settings are applied to an AWS account, any existing or new buckets and objects associated with that account inherit the settings that prevent public access. S3 Block Public Access settings override other S3 access permissions, making it easy for the account administrator to enforce a “no public access” policy regardless of how an object is added, how a bucket is created, or if there are existing access permissions. S3 Block Public Access controls are auditable, provide a further layer of control, and use AWS Trusted Advisor bucket permission checks, AWS CloudTrail logs, and Amazon CloudWatch alarms.
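S3 Block Public Access is expressed as four boolean settings. The following sketch builds a configuration with all four enabled and checks whether an existing configuration is fully locked down; the helper names are assumptions, while the four setting names are those used by the S3 API.

```python
PUBLIC_ACCESS_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def block_all_public_access():
    """Return a configuration with every Block Public Access setting enabled."""
    return {name: True for name in PUBLIC_ACCESS_SETTINGS}


def is_fully_blocked(configuration):
    """Report whether all four settings are present and set to True."""
    return all(configuration.get(name) is True for name in PUBLIC_ACCESS_SETTINGS)


configuration = block_all_public_access()
```

In boto3, a dictionary like `configuration` is the shape expected by the `PublicAccessBlockConfiguration` argument of `put_public_access_block`.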
Customers can also use Amazon Macie to discover, classify, and protect sensitive data stored in Amazon S3. It uses machine learning capabilities to recognize sensitive data, such as personally identifiable information (PII) or intellectual property, and provides dashboards and alerts for visibility into how this data is being accessed or moved. Amazon Macie also monitors data access patterns for anomalies, and generates alerts when it detects risk of unauthorized access or inadvertent data leaks.
Query in place
Amazon S3 has a built-in feature and complementary services that let you query data without copying and loading it into a separate analytics platform or data warehouse. This means you can run big data analytics directly on your data stored in Amazon S3. S3 Select is an S3 feature designed to increase query performance by up to 400%, and reduce querying costs by as much as 80%. It works by retrieving a subset of an object’s data (using simple SQL expressions) instead of the entire object, which can be up to 5 terabytes in size.
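To show what a simple SQL expression for S3 Select can look like, here is a sketch that assembles request parameters for selecting a few columns from a CSV object. The helper name, the bucket, the key, and the column names are hypothetical; the parameter names match those accepted by boto3's `select_object_content`.

```python
def build_select_request(bucket, key, columns, where=None):
    """Build request parameters that ask for selected CSV columns only."""
    # Column names and the filter are interpolated as-is, so they must be trusted.
    expression = f"SELECT {', '.join('s.' + c for c in columns)} FROM S3Object s"
    if where:
        expression += f" WHERE {where}"
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # Treat the first CSV line as a header so columns can be named.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical object and columns, for illustration only.
request = build_select_request(
    "example-bucket", "data/customers.csv", ["name", "city"], where="s.city = 'Seattle'"
)
```

Only the rows and columns matching the expression are returned, which is where the performance and cost savings come from.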
Amazon S3 is also compatible with AWS analytics services Amazon Athena and Amazon Redshift Spectrum. Amazon Athena queries your data in Amazon S3 without needing to extract and load it into a separate service or platform. It uses standard SQL expressions to analyze your data, delivers results within seconds, and is commonly used for ad hoc data discovery. Amazon Redshift Spectrum also runs SQL queries directly against data at rest in Amazon S3, and is more appropriate for complex queries and large data sets (up to exabytes). Because Amazon Athena and Amazon Redshift share a common data catalog and data formats, you can use them both against the same data sets in Amazon S3.
Transferring large amounts of data
AWS has a suite of data migration services that make transferring data into the AWS Cloud simple, fast, and secure. S3 Transfer Acceleration is designed to maximize transfer speeds to S3 buckets over long distances. For very large data transfers, consider using AWS Snowball, AWS Snowball Edge, and AWS Snowmobile to move petabytes to exabytes of data to the AWS Cloud for as little as one-fifth the cost of high-speed Internet. These AWS Snow services work by using secure physical devices to transport data via roads, and solve for migration problems such as high network costs, long transfer times, and security.
Customers who want to keep their on-premises applications and enable a cloud storage architecture can use AWS Storage Gateway (a hybrid cloud storage service) to seamlessly connect on-premises environments to Amazon S3. You can automate transferring data between on-premises storage and AWS (including Amazon S3) by using AWS DataSync, which can transfer data at speeds up to 10 times faster than open-source tools. If you want to transfer files directly into and out of Amazon S3 using the Secure File Transfer Protocol (SFTP), use AWS Transfer for SFTP — a fully managed service that enables secure file exchanges with third parties.
Customers can also work with third-party providers from the AWS Partner Network (APN) to deploy hybrid storage architectures, integrate Amazon S3 into existing applications and workflows, and transfer data to and from the AWS Cloud.
Intended usage and restrictions
Your use of this service is subject to the Amazon Web Services Customer Agreement.