AWS Storage Blog

Getting started with Amazon S3 compatible storage on AWS Snowball Edge devices

Customers across numerous industries need to extend their applications from the cloud to rugged, mobile edge, and disconnected environments to capture and process data closer to where it is created. Today, many of these customers use third-party storage solutions at the edge. This approach requires significant effort to rearchitect applications for both cloud and edge environments, and it often lacks Amazon Simple Storage Service (Amazon S3) features available in the cloud. Deploying and managing separate storage infrastructure requires custom development and maintenance, adding to the operational overhead and overall cost and complexity. This also includes the financial investment in hardware without the option of on-demand pricing. Customers deploying and managing compute and storage workloads at the edge require a scalable, repeatable experience that aligns across AWS services.

Amazon S3 compatible storage on Snow brings automated storage management, robust Amazon S3 API functionality, and high availability storage clusters to on-premises and disconnected environments. Designed specifically for AWS Snowball Edge Compute Optimized devices, it allows customers to build in AWS Regions and then deploy on premises, leveraging the same Amazon S3 APIs for storage and reducing the need to re-architect software for on-premises deployments. Amazon S3 compatible storage on Snow empowers customers to focus their time and money on core services instead of the underlying hardware and disparate product versions.

In this post, we explain the advantages of Amazon S3 compatible storage on AWS Snowball Edge devices, show you how to get started and order a device, and walk through other features.

Picture of two small dogs, Chewie and Mr. Waggles standing on 2 Snowball Edge Devices.

Chewie and Mr. Waggles love when it Snows!

Amazon S3 compatible storage on Snow

AWS customers can use Amazon S3 APIs to store and retrieve data in rugged, mobile edge, and disconnected environments. This means that many tools, apps, scripts, or utilities that already use Amazon S3 APIs, either directly or through SDKs, can now be configured to store and interact with that data locally on your Snowball Edge. AWS Snow Family devices are compatible with native AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (Amazon EBS), Amazon S3, Amazon EKS Anywhere, AWS Identity and Access Management (IAM), and AWS IoT Greengrass. The AWS Snow Family does not require internet connectivity, making it ideal for running workloads in denied, disrupted, intermittent, or limited (DDIL) environments.

Snowball Edge Compute Optimized devices pack up to 104 vCPUs, 416 GB of memory, and 42 TB of high-performance storage into a ruggedized enclosure the size of a suitcase that has even withstood military high-impact shock testing. Although these devices previously had local data storage in the form of the Amazon S3 Adapter, Amazon S3 compatible storage brings new features to the platform. It lets you use Amazon S3 APIs to manage buckets, configure notifications, enforce lifecycle rules, store data, and even create resilient storage clusters. You can use Amazon S3 compatible storage to meet local storage needs in DDIL environments, to meet data-residency requirements, or to satisfy demanding performance needs by keeping data close to on-premises applications. It also lets you reduce data transfers to AWS Regions, since you can perform filtering, compression, or other pre-processing on your data locally without having to send all of it to an AWS Region, particularly in DDIL environments.

Amazon S3 compatible storage on Snow is secure, scalable, and optimized. Encryption is enabled by default, and the tamper-resistant hardware and auto-lockout protections mean that data is safeguarded at every level. Amazon S3 compatible storage on Snow can scale to meet even the most demanding storage needs: with up to 500 TB per cluster, it provides real-time data redundancy that protects against accidents and hardware failures. Customers can even automate data retention enforcement, simplify regulatory compliance, and reduce operational overhead with the lifecycle management and data tagging features.

Getting started

There are many features, so let’s check them out together! I’ll show you how to order, set up, and access a Snowball Edge, and then we can activate Amazon S3 compatible storage and try it out.

Now, you can do all of this by clicking through the AWS OpsHub desktop software, but because the new API is so powerful for automation, we use the AWS Snowball Edge Client and AWS Command Line Interface (AWS CLI) in this post. Imagine ordering a Snowball Edge pre-loaded with your software, and when it arrives at its location, all the recipient has to do is plug it in and run a simple script for full setup! That’s the power of the AWS CLI and APIs.

Note that if you prefer to configure your Snowball Edge differently than this method, refer to the Snowball Edge Developer Guide.

Ordering a Snowball Edge with Amazon S3 compatible storage

Upgrading to the Amazon S3 compatible storage feature requires re-provisioning the device. If you have existing Snowball Edge devices, order replacements with Amazon S3 compatible storage on Snow enabled, and then return the existing devices.

To order Amazon S3 enabled Snow devices, create a job in the AWS Snow console. Remember to choose Local compute and storage only, a Snowball Edge Compute Optimized device, and to select Amazon S3 compatible storage. Storage is provisioned prior to shipment, so select the full amount of Amazon S3 compatible storage that you require. For more details, refer to the AWS Snowball Edge Developer Guide.

Unpacking your Snowball Edge

To get started, notice the panels on the top, front, and back of the Snowball Edge device. Make sure that the front and back panels remain open and unblocked at all times, as this is where airflow enters and exits the device. On these panels, press the metal circle inward, and then slide the lock up. This lets you fold up and stow the front and back panel doors.

Picture of the front of a Snowball Edge with the protective lid open and slid into the case.

The top panel contains the power cord.

Picture of the top of a Snowball Edge with the protective lid open revealing a power cord.

Now just plug in the device and press the power button on the front. The fans spin all the way up as it goes through POST checks. But don’t worry – it quiets down within a few minutes. Use an Ethernet cable to plug the device into your network.

Picture of the back of a Snowball Edge with Ethernet and power cables plugged in.

Once the device is done powering up, use the touchscreen to set up either a static or DHCP connection.

Picture of the front screen of a Snowball Edge with IP information displayed.

Configuring the Snowball Edge Client

The Snowball Edge Client is used to interact with the device itself, and the AWS CLI is used to interact with services once they are running. Alternatively, OpsHub software wraps many functions of both the Snowball Edge Client and AWS CLI into a graphical user interface. Today we use the Snowball Edge Client to unlock and perform some initial setup of the device.

To protect customer data, all Snow devices are shipped in a locked state, and they return to that state anytime they are powered off. To access the device, you must unlock it using the manifest file and unlock code associated with your order.

Visit the AWS Snow console, pull up the job for the order, and save the unlock code and manifest file to the computer you’re using to access the device. The unlock code and manifest file cannot be modified once the device is on premises. These are sensitive credentials, so make sure to keep cyber security best practices in mind when transferring, storing, or sharing them. Note the file path of the manifest file.

Picture of the AWS Console "Credentials" section with unlock code and manifest file available. For educational purposes only, sensitive information.

Download and install the latest version of the Snowball Edge Client. Open your preferred console to begin running commands using the Snowball Edge Client. Run the configure command to create a snowballEdge profile. When prompted, enter the file path of your manifest file, the unlock code, and the IP address of the Snowball Edge. This lets you run Snowball Edge Client commands without manually adding these parameters each time in the future.

snowballEdge configure

snowballEdge Client Command: "Configure"
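
When you run the configure command, the client walks you through each value. The prompts look roughly like the following sketch (the manifest path, unlock code, and IP address are illustrative placeholders, and the exact wording may vary by client version):

snowballEdge configure
Snowball Edge Manifest Path: C:\Users\myusername\Downloads\JID-manifest.bin
Unlock Code: 12345-abcde-12345-abcde-12345
Default Endpoint: https://192.168.1.5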

Unlocking your Snowball Edge

To unlock the Snowball Edge, all you have to do now is run the following command:

snowballEdge unlock-device

snowballEdge Client Command: "unlock-device"

Notice that the status on the Snowball Edge screen has also changed, and it no longer says “Locked.”

Picture of the front of a Snowball Edge with screen showing "Unlocking" status.

After the device is done unlocking, you’ll see it says “Ready.”

Picture of the front of a Snowball Edge with screen showing "Ready" status.

Starting the Amazon S3 compatible storage service

The Amazon S3 compatible storage service uses two virtual network interfaces (VNIs): s3control for administrative management and s3api for interacting with data stored within buckets. We want to create these before we start the service, and since this is done with the Snowball Edge Client, let’s do that now. Run the following command to pull the physical network interface IDs. Here we are using the physical interface we’ve already connected, so I select the PhysicalNetworkInterfaceId that shows an IP address already assigned.

snowballEdge describe-device
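
The output includes a PhysicalNetworkInterfaces array. An abridged, illustrative example is shown below (the interface ID and addresses are placeholders); the entry with an IpAddress already assigned is the one we want:

{
    "PhysicalNetworkInterfaces": [
        {
            "PhysicalNetworkInterfaceId": "s.ni-0123456789abcdef0",
            "PhysicalConnectorType": "RJ45",
            "IpAddressAssignment": "DHCP",
            "IpAddress": "192.168.1.5",
            "Netmask": "255.255.255.0",
            "DefaultGateway": "192.168.1.1"
        }
    ]
}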

Now we create the two VNIs. Run the following create-virtual-network-interface command twice to create them. Copy the newly created VirtualNetworkInterfaceArn for each, as we assign them to the Amazon S3 compatible storage service when we start it.

snowballEdge create-virtual-network-interface --ip-address-assignment dhcp --physical-network-interface-id "PhysicalNetworkInterfaceId"

snowballEdge Client Command: "create-virtual-network-interface"
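
Each run returns the new interface’s details. An abridged, illustrative response looks something like this (the ARN, IDs, and IP address are placeholders); the VirtualNetworkInterfaceArn is the value to copy:

{
    "VirtualNetworkInterface": {
        "VirtualNetworkInterfaceArn": "arn:aws:snowball-device:::interface/s.ni-0fedcba9876543210",
        "PhysicalNetworkInterfaceId": "s.ni-0123456789abcdef0",
        "IpAddressAssignment": "DHCP",
        "IpAddress": "192.168.1.9"
    }
}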

To start the service, run the following command, entering the ARNs we just copied down (separated by a space) after --virtual-network-interface-arns.

snowballEdge start-service --service-id s3-snow --virtual-network-interface-arns vni-arn-1 vni-arn-2

snowballEdge Client Command: "start-service"

As seen above, when I checked the service status right after starting it, it showed “ACTIVATING”. Give it about five minutes, and then check the service status with the following command. When the output has changed to “ACTIVE”, we’re good to go!

snowballEdge describe-service --service-id s3-snow

snowballEdge Client Command: "describe-service", status now active.
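
For reference, an abridged describe-service response once the service is ready looks something like this (values are illustrative, and the Endpoints entries are trimmed here; they contain the addresses we use later):

{
    "ServiceId": "s3-snow",
    "Status": {
        "State": "ACTIVE"
    },
    "Endpoints": [ ... ]
}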

Pulling access keys and CA Bundle

To interact with services, we must retrieve the access keys and CA Certificate Bundle from the device. First, run the following commands to pull the access key ID, and then use it to pull the secret access key. Note both the IAM access key ID, which is similar to a username, and the secret access key, which is used as the password. These are sensitive credentials, so make sure to keep cyber security best practices in mind when transferring, storing, or sharing them.

snowballEdge list-access-keys
snowballEdge get-secret-access-key --access-key-id "access-key-we-just-pulled"

snowballEdge Client Command: "list-access-keys" and "get-secret-access-key"
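
For reference, list-access-keys returns the key IDs as a JSON array, and get-secret-access-key prints the pair in credential-file format. Illustrative output, using AWS’s documentation example keys as placeholders, looks roughly like this:

{
    "AccessKeyIds": ["AKIAIOSFODNN7EXAMPLE"]
}

[snowballEdge]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY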

Next, we pull the CA Certificate Bundle so that we can interact over HTTPS. Look up the certificate ARN using list-certificates. If there are multiple certificates, then identify the correct one using the describe-service command. Run the following command and copy the ARN, then run the next command using the output ARN from the first.

snowballEdge list-certificates
snowballEdge get-certificate --certificate-arn "arn-we-just-pulled-from-list-certificates-command"

snowballEdge Client Command: "list-certificates"

We must take that certificate and save it to a file. Copy “-----BEGIN CERTIFICATE-----”, “-----END CERTIFICATE-----”, and everything in between to a text file on your computer, and save it with a .pem extension. I have named mine “SnowBall-CA.pem”.

Picture of certificate being created and saved in a text editor.
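
If your terminal prints the certificate block directly, you can skip the manual copy and redirect the output straight to a file, as in the convenience sketch below (verify afterward that the file contains only the certificate block):

snowballEdge get-certificate --certificate-arn "arn-we-just-pulled-from-list-certificates-command" > SnowBall-CA.pem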

Configuring AWS CLI

From now on, we interact with the service itself. To do this, we use the AWS CLI. Download and install the latest version of the AWS CLI. To use the AWS CLI, open your preferred console and enter aws followed by the AWS CLI command and parameters.

Instead of manually entering our authentication every time we send a command, let’s create a profile to reference. Using the AWS CLI, enter the following command with your preferred name after --profile. It prompts you to enter the access key ID and secret access key we pulled earlier, as well as the default Region name, where you should enter “snow”. For the default output format, we choose JSON. Note the profile name you just created, as you use it to authenticate most AWS CLI commands to your Snow device by adding --profile MyProfileName.

aws configure --profile ProfileName
AWS Access Key ID [None]: access-key-id-we-pulled-earlier
AWS Secret Access Key [None]: secret-access-key-we-pulled-earlier
Default region name [None]: snow
Default output format [None]: json

Picture of AWS CLI command: AWS configure

Next, let’s edit that profile, and add the CA Bundle we created. The profile you just created is in ~/.aws/config. In my case, this is “C:\Users\myusername\.aws\config“. Edit the file, and add a new line pointing to the filepath of the CA Bundle.

Snowball profile shown in text editor
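
The AWS CLI reads the certificate path from the ca_bundle setting. Assuming the profile name and file location from earlier (both placeholders), the finished entry in ~/.aws/config looks something like this:

[profile ProfileName]
region = snow
output = json
ca_bundle = C:\Users\myusername\.aws\SnowBall-CA.pem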

Creating a bucket

Now that everything is set up, let’s start using Amazon S3 compatible storage on Snow! As I mentioned earlier, creating buckets, uploading data, and downloading data can all be done manually with just a few clicks in OpsHub. However, because of its automation and app integration capabilities, we want to learn how to work with API commands. Let’s pull the VNI IPs so we know which to use for s3control commands, and which is used for s3api commands.

Run the following command, copying and labeling the IP addresses for each VNI. In my case, 192.168.1.9 is the bucket (s3control) API address, and 192.168.1.10 is the object (s3api) address.

snowballEdge describe-service --service-id s3-snow

Picture of snowball Edge "describe-service" command

To create a new bucket, I specify the profile name with --profile [my-profile], the bucket name with --bucket [my-new-bucket], and the management virtual network interface with --endpoint-url [https://my.interface.ip]. The output includes a BucketArn, so we know the command completed successfully.

aws s3control --profile your-profile create-bucket --bucket my-new-bucket --endpoint-url https://my-s3control-endpoint-ip
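
The response contains the bucket’s location and ARN. An illustrative example looks roughly like this (the field names come from the s3control CreateBucket API; the account ID and ARN value are simplified placeholders):

{
    "Location": "/my-new-bucket",
    "BucketArn": "arn:aws:s3:snow:123456789012:bucket/my-new-bucket"
}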

Uploading a file

Let’s upload a file to the bucket. Run the following command, adding in your bucket name, object path, profile name, and endpoint IP address. Remember that since this is an s3api command, and it is interacting with objects inside a bucket rather than managing the bucket itself, we are using the s3api endpoint address.

aws s3api put-object --bucket my-new-bucket --key sample-object.xml --body sample-object.xml --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

Picture of AWS CLI "s3api put-object" command

Let’s look at the uploaded file. Run the following command, entering your own bucket name, profile, and endpoint URL to list the objects within the S3 bucket.

aws s3api list-objects-v2 --bucket my-new-bucket --profile your-profile --endpoint-url https://my-s3api-endpoint-ip
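
If the upload succeeded, the object appears in the Contents array. An abridged, illustrative response (the timestamp and size are placeholders) looks like this:

{
    "Contents": [
        {
            "Key": "sample-object.xml",
            "LastModified": "2023-05-01T12:00:00+00:00",
            "Size": 1024
        }
    ]
}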

Downloading a file

Now let’s download the object. Run the following command, entering your bucket name, the object name, the path you’re downloading to (with the object name included), your profile name, and the endpoint URL. After running it, verify that the file was downloaded.

aws s3api get-object --bucket my-new-bucket --key sample-object.xml path/to/download/to/sample-object.xml --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

Applying object tags

Object tagging in Amazon S3 is powerful, as it allows us to group files and apply changes and policies to those groups. Let’s upload another file, following the same syntax we used earlier, and then apply a tag. In this case, we mark the “designation” as “confidential”.

aws s3api put-object --bucket my-new-bucket --key sample-object-with-tag.xml --body sample-object-with-tag.xml --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

aws s3api put-object-tagging --bucket my-new-bucket --key sample-object-with-tag.xml --tagging "{\"TagSet\": [{ \"Key\": \"designation\", \"Value\": \"confidential\" }]}" --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

Picture of AWS CLI "s3api put-object-tagging" command
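
To confirm the tag was applied, you can read it back with get-object-tagging, which returns the object’s TagSet:

aws s3api get-object-tagging --bucket my-new-bucket --key sample-object-with-tag.xml --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

{
    "TagSet": [
        {
            "Key": "designation",
            "Value": "confidential"
        }
    ]
}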

Applying a bucket lifecycle rule

In Amazon S3 compatible storage on Snow, bucket lifecycle rules let us automatically delete files after a certain amount of time. This is especially useful for compliance information. If a company is required to keep security logs for one year, it can easily tag security logs with [Key: “file”, Value: “logs”] and [Key: “log-type”, Value: “security”], then apply a bucket lifecycle rule for those specific tags. This ensures the company stays compliant without running out of space, and limits the need for time-intensive manual storage management.

In other scenarios, you may want to expire files quickly. Let’s say you run a website that handles social security numbers, and have committed to the customer that the confidential form they upload is only kept temporarily on your servers. A lifecycle policy can automatically expire files tagged with a key of “designation” and value of “confidential”, further protecting the data and limiting your liability.

Let’s protect our tagged file that way by creating a bucket lifecycle rule and applying it to the bucket.

First, we create the policy by saving the following locally as a JSON file (for example, lifecycle-example.json). Note that the empty Prefix means the rule matches any object carrying the designation: confidential tag, regardless of its key.

{
    "Rules": [{
        "ID": "id-1",
        "Filter": {
            "And": {
                "Prefix": "",
                "Tags": [{
                    "Key": "designation",
                    "Value": "confidential"
                }]
            }
        },
        "Status": "Enabled",
        "Expiration": {
            "Days": 1
        }
    }]
}

Now, to apply the policy, we run the following command:

aws s3control put-bucket-lifecycle-configuration --bucket my-new-bucket --profile your-profile --account-id 12-digit-account-id-you-ordered-from --endpoint-url https://my-s3control-endpoint-ip --lifecycle-configuration file://path/to/lifecycle-example.json

Picture of AWS CLI "s3control put-bucket-lifecycle-configuration" command
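
You can also confirm the rule is in place right away by reading the configuration back with get-bucket-lifecycle-configuration (same account ID, profile, and s3control endpoint as above); it should echo the Rules array from the JSON file:

aws s3control get-bucket-lifecycle-configuration --bucket my-new-bucket --account-id 12-digit-account-id-you-ordered-from --profile your-profile --endpoint-url https://my-s3control-endpoint-ip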

To verify, check back a few hours after the expiration time you set has passed and execute a list-objects-v2 command such as the one below. If properly configured, the files will be gone.

aws s3api list-objects-v2 --bucket my-new-bucket --profile your-profile --endpoint-url https://my-s3api-endpoint-ip

Transferring data using Amazon S3 compatible storage on Snow

Now that you can store and retrieve data using Amazon S3 compatible storage on your Snowball Edge, you might want to transfer that data to Amazon S3 in an AWS Region. Or you might want to transfer data from AWS Regions to your Snowball Edge for frequent local access, processing, and storage. This is especially important for workloads in DDIL environments, which may need to backup data and download updates each time connectivity is restored.

You can use AWS DataSync to do this in AWS Commercial and AWS GovCloud (US) Regions with the newly launched support for Amazon S3 compatible storage. With DataSync, you can choose which objects to transfer, when to transfer them, and how much network bandwidth to use. DataSync also encrypts your data in transit, verifies data integrity in transit and at rest, and provides granular visibility into the transfer process through Amazon CloudWatch metrics, logs, and events. See the AWS DataSync User Guide to get started.

More features

I’ve walked through the basics, but really only skimmed the surface of what can be done with Amazon S3 compatible storage on Snow. You can create clusters, integrate the SDK into your applications, and so much more. To explore further, refer to the Snowball Edge Developer Guide, or reach out to our sales team or your AWS account team for architecture guidance or Professional Services.

Conclusion

Amazon S3 compatible storage on Snow helps customers perform local data processing and meet data residency requirements in DDIL environments. With this feature, customers have a consistent Amazon S3 experience, using the same Amazon S3 APIs, AWS CLI, and SDKs across Snow and AWS Regions.

To get started, visit the AWS Snow console. For information about Snow and Amazon S3 compatible storage pricing, check out AWS Snowball pricing and the AWS Pricing Calculator. If you’d like to discuss your Snow purchase in more detail, then reach out to your AWS account team or to our sales team.

Happy Storing!

Joe DeSantis

Joe is a Sr. Solutions Architect and U.S. Air Force Combat Communications veteran with extensive cybersecurity experience in commercial and federal sectors. He works primarily with AWS Federal Partners on migration, modernization, and compliance efforts, but also specializes in cyber architecture and risk management. When he's not working with customers, he enjoys ethical hacking, traveling in his RV, and playing with his dogs Chewie and Mr. Waggles.

Jared Novotny

Jared is a Specialist Solutions Architect for AWS World Wide Public Sector (WWPS) Hybrid Edge and Government Regions for the Americas. Jared enables customers to consume core AWS services that can’t be moved to AWS Regions due to data residency, latency, local data processing, or distributed edge compute requirements. Jared has a background in a diverse array of on-premises and cloud solutions, supporting Oil and Gas and the industrial space for 10 years before moving to Public Sector.