How can I be sure that manually installed libraries persist in Amazon SageMaker if my lifecycle configuration times out when I try to install the libraries?

Last updated: 2019-05-09

When I try to install additional libraries, my lifecycle configuration scripts run for more than five minutes, which causes the Amazon SageMaker notebook instance to time out. How can I resolve this and be sure that my manually installed libraries persist between Amazon SageMaker notebook instance sessions?

Short Description

If a lifecycle configuration script runs for longer than five minutes, it fails, and the notebook instance is not created or started. There are two ways to resolve this issue:

  • nohup: The nohup command forces the lifecycle configuration script to continue running in the background until the packages are installed. This method is recommended for less technical users, and is more appropriate as a short-term solution.
  • Move the Python packages: Copy the /home/ec2-user/anaconda3 directory to the Amazon SageMaker persistent path (/home/ec2-user/SageMaker/anaconda3/). Then, use a bash script to create a soft link from the original path to the copy. This method is recommended for more technical users and is a better long-term solution.

Resolution

Use one of the following methods to resolve lifecycle configuration timeouts.

nohup

Use the nohup command to force the lifecycle configuration script to continue running in the background even after the five-minute timeout period expires. Example:

#!/bin/bash
set -e
nohup pip install xgboost &

The script stops running after the libraries are installed. You aren't notified when this happens, but you can use the ps command to find out if the script is still running.

Note: You can also use the nohup command if your lifecycle configuration script times out in other scenarios, such as when you download large Amazon Simple Storage Service (Amazon S3) objects.
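Because the background job sends no notification when it ends, it can help to record its process ID when it starts so that the ps check is easy to repeat later. The following is a minimal sketch and not part of the original procedure: the function names and the PID_FILE and LOG_FILE paths are hypothetical. It starts any command with nohup, saves the PID, and uses ps -p to report whether that process still exists.

```shell
#!/bin/bash
# Hypothetical locations for the saved PID and the command's output.
PID_FILE="${PID_FILE:-/tmp/lifecycle-background.pid}"
LOG_FILE="${LOG_FILE:-/tmp/lifecycle-background.log}"

# Start the given command with nohup in the background and save its PID.
start_in_background() {
  nohup "$@" > "$LOG_FILE" 2>&1 &
  echo "$!" > "$PID_FILE"
}

# Print "running" if the saved PID still exists, otherwise "finished".
background_status() {
  local pid
  pid="$(cat "$PID_FILE" 2>/dev/null || true)"
  # ps -p exits with status 0 only when a process with that PID exists.
  if [ -n "$pid" ] && ps -p "$pid" > /dev/null 2>&1; then
    echo "running"
  else
    echo "finished"
  fi
}

# Example use in a lifecycle configuration script:
#   start_in_background pip install xgboost
# Later, from a terminal on the notebook instance (with the functions defined):
#   background_status
```

The same helper works for other slow commands, such as a large download. If the job seems to have ended early, the file at LOG_FILE holds whatever the command printed.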

Move the Python packages

1.    Copy the Anaconda3 files to the Amazon SageMaker persistent path. You need to perform this step only one time. Example:

$ cp -ruf /home/ec2-user/anaconda3 /home/ec2-user/SageMaker/

Note: These are large files, and copying them can take hours. This is because you are copying from the Docker container that is running on the notebook instance to the persistent Amazon Elastic Block Store (Amazon EBS) volume. To speed up this operation, use a larger notebook instance type. Then, when all files are copied, resize your notebook instance to its original size.

2.    Create a bash script similar to the following:

#!/bin/bash
 
# Rename the path. This is faster.
sudo mv /home/ec2-user/anaconda3 /home/ec2-user/anaconda33

# Be sure that the custom copy is at /home/ec2-user/SageMaker/anaconda3/.
# Then, create a soft link.

ln -s /home/ec2-user/SageMaker/anaconda3/ /home/ec2-user/anaconda3

echo "Listing Anacondas"
cd
pwd
ls -l | grep anaconda

echo "Environment: ec2-user :"
env | grep ec2-user

# Wait about one minute.
echo "Note: The terminal might freeze for a few seconds."
echo "This is expected behavior."

3.    Navigate to the Amazon SageMaker directory. Add the script that you created in the previous step, and then make it executable. In the following example, the script is named persist.sh.

$ cd /home/ec2-user/SageMaker/
$ chmod u+x persist.sh

4.    Restart the Amazon SageMaker notebook instance.

5.    Navigate to the Amazon SageMaker directory where the bash script is located and then run the script.

$ cd /home/ec2-user/SageMaker/
$ ./persist.sh

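6.    Install your custom packages in the conda environment that your notebook kernel uses, with either pip or conda. Replace the placeholders with your own environment and package names. Example:

$ source activate <environment-name>
$ pip install <package-name>    # or: conda install <package-name>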

If you stop and then start your notebook instance, your custom packages will still be available. You don't need to install them again.
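
The sample script in step 2 runs its commands unconditionally. As a sketch only, the same rename-and-link logic can be wrapped in checks so that it stops if the persistent copy is missing and does nothing if the soft link already exists. The function name and the .orig backup suffix are assumptions made for illustration; the default paths are the ones used in this article. Unlike the sample script, the sketch calls mv without sudo, so add sudo if the rename fails with a permission error.

```shell
#!/bin/bash
# Guarded variant of the sample script (illustrative sketch).
# Arguments are optional; the defaults are the paths used in this article.
link_persistent_anaconda() {
  local original="${1:-/home/ec2-user/anaconda3}"
  local persistent="${2:-/home/ec2-user/SageMaker/anaconda3}"
  local backup="${original}.orig"   # hypothetical backup name

  # Stop if the persistent copy has not been created yet.
  if [ ! -d "$persistent" ]; then
    echo "Persistent copy not found at $persistent" >&2
    return 1
  fi

  # Nothing to do if the soft link is already in place.
  if [ -L "$original" ]; then
    echo "Already linked: $original -> $(readlink "$original")"
    return 0
  fi

  # Do not overwrite or merge into an earlier backup.
  if [ -e "$backup" ]; then
    echo "Backup already exists at $backup" >&2
    return 1
  fi

  # Rename the original directory, then create the soft link.
  if [ -e "$original" ]; then
    mv "$original" "$backup"
  fi
  ln -s "$persistent" "$original"
  echo "Linked: $original -> $persistent"
}
```

Calling link_persistent_anaconda with no arguments uses the article's paths. Passing two directories lets you try the function on scratch directories first.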