How does my SageMaker container access Amazon S3 data and push logs to CloudWatch when I turn on network isolation?


I turned on network isolation for my Amazon SageMaker container. I want to know how the container accesses the Amazon Simple Storage Service (Amazon S3) data and pushes the logs to Amazon CloudWatch.

Resolution

When you turn on network isolation, SageMaker accesses Amazon S3 and CloudWatch separately from the training or inference containers. If you don't specify an Amazon Virtual Private Cloud (Amazon VPC), then data access and logging happen on the server side. In this case, the container can't access the network.
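As an illustration of the case without a VPC, the following is a minimal sketch of a training job request that turns on network isolation and omits VpcConfig. The helper names, the channel name "training", the instance settings, and the example values are assumptions for illustration only. The parameter names follow the boto3 create_training_job request.

```python
def build_isolated_training_request(job_name, image_uri, role_arn,
                                    input_s3_uri, output_s3_uri):
    """Build CreateTrainingJob parameters with network isolation and no VpcConfig.

    Hypothetical helper: the channel name and resource settings are
    placeholder choices, not values from the article.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # SageMaker, not the container, downloads this channel from Amazon S3.
        "InputDataConfig": [
            {
                "ChannelName": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": input_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # Blocks all outbound network calls from the container.
        "EnableNetworkIsolation": True,
    }


def submit_training_job(request):
    """Send the request to SageMaker. Requires boto3 and AWS credentials."""
    import boto3  # imported here so that building the request needs no AWS setup

    return boto3.client("sagemaker").create_training_job(**request)
```

Because the request has no VpcConfig key, data download and log delivery for this job happen on the SageMaker side rather than through your own network.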

When you specify VpcConfig in your training, processing, or model configuration, SageMaker by default creates two elastic network interfaces in the specified Amazon VPC and routes all traffic through them. One elastic network interface is for the algorithm container, and the other is for Amazon S3 and logging access. If you turn on network isolation together with VpcConfig, then SageMaker doesn't create the elastic network interface for the algorithm container. This blocks the container from all outbound network calls. The other elastic network interface remains, and SageMaker uses it for Amazon S3 and logging access.
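To show how the two settings combine, here is a small sketch that adds VpcConfig and network isolation to an existing request dictionary. The helper name with_vpc_and_isolation and the subnet and security group IDs are hypothetical. The VpcConfig keys (Subnets, SecurityGroupIds) and EnableNetworkIsolation are the parameter names that the boto3 create_training_job and create_processing_job requests accept. For create_model, network isolation is set in the same way, but check the API reference for the exact request shape.

```python
import copy


def with_vpc_and_isolation(request, subnet_ids, security_group_ids):
    """Return a copy of a job request with VpcConfig and network isolation set.

    Hypothetical helper: `request` is any dict of job parameters.
    The original dict is left unchanged.
    """
    updated = copy.deepcopy(request)
    # SageMaker uses these subnets and security groups for the elastic
    # network interface that handles Amazon S3 and logging access.
    updated["VpcConfig"] = {
        "Subnets": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
    }
    # With isolation on, the algorithm container gets no network interface.
    updated["EnableNetworkIsolation"] = True
    return updated


# Example with placeholder IDs (assumptions, not real resources).
base_request = {"TrainingJobName": "demo-job"}
vpc_request = with_vpc_and_isolation(
    base_request,
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
)
```

With this configuration, the subnets typically need a route to Amazon S3 and CloudWatch, for example through VPC endpoints, because that traffic flows through your VPC.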

In both cases, internal SageMaker processes that run on the nodes download the data. The input channels that are specified in the job definition then make this data available to the algorithm container. The same internal processes send the logs and metrics to CloudWatch. When you specify VpcConfig, the nodes connect to CloudWatch through the VPC that's in VpcConfig.
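From inside the container, this means that the code reads local files and writes to standard output instead of calling Amazon S3 or CloudWatch. The following sketch assumes a training container in File input mode, where SageMaker places each channel under /opt/ml/input/data/<channel name>. The function names and the channel name "training" are hypothetical.

```python
import os
import sys


def list_channel_files(channel_name, base_dir="/opt/ml/input/data"):
    """Return the sorted paths of all files that SageMaker downloaded for a channel.

    Returns an empty list if the channel directory doesn't exist.
    """
    channel_dir = os.path.join(base_dir, channel_name)
    if not os.path.isdir(channel_dir):
        return []
    files = []
    for root, _dirs, names in os.walk(channel_dir):
        for name in names:
            files.append(os.path.join(root, name))
    return sorted(files)


def log(message):
    """Write a log line to stdout. SageMaker, not the container, forwards it."""
    print(message, file=sys.stdout, flush=True)


if __name__ == "__main__":
    # No network call and no AWS SDK are needed inside the container.
    for path in list_channel_files("training"):
        log(f"found input file: {path}")
```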

Whether or not you specify VpcConfig, the container doesn't have access to any AWS credentials to make API calls to AWS services.


Related information

Network isolation

AWS OFFICIAL, updated a year ago