AWS News Blog
Software Package Management with AWS CodeArtifact
|
Software artifact repositories and their associated package managers are an essential component of development. Downloading and referencing pre-built libraries of software with a package manager, at the point in time the libraries are needed, simplifies both development and build processes. A variety of package repositories can be used, for example Maven Central, npm public registry, and PyPi (Python Package Index), among others. Working with a multitude of artifact repositories can present some challenges to organizations that want to carefully control both versions of, and access to, the software dependencies of their applications. Any changes to dependencies need to be controlled, to try and prevent undetected and exploitable vulnerabilities creeping into the organization’s applications. By using a centralized repository, it becomes easier for organizations to manage access control and version changes, and gives teams confidence that when updating package versions, the new versions have been approved for use by their IT leaders. Larger organizations may turn to traditional artifact repository software to solve these challenges, but these products can introduce additional challenges around installation, configuration, maintenance, and scaling. For smaller organizations, the price and maintenance effort of traditional artifact repository software may be prohibitive.
Generally available today, AWS CodeArtifact is a fully managed artifact repository service for developers and organizations to help securely store and share the software packages used in their development, build, and deployment processes. Today, CodeArtifact can be used with popular build tools and package managers such as Maven and Gradle (for Java), npm and yarn (for Javascript), and pip and twine (for Python), with more to come. As new packages are ingested, or published to your repositories, CodeArtifact automatically scales, and as a fully managed service, CodeArtifact requires no infrastructure installation or maintenance on your part. Additionally, CodeArtifact is a polyglot artifact repository, meaning it can store artifact packages of any supported type. For example, a single CodeArtifact repository could be configured to store packages from Maven, npm and Python repositories side by side in one location.
CodeArtifact repositories are organized into a domain. We recommend that you use a single domain for your organization, and then add repositories to it. For example you might choose to use different repositories for different teams. To publish packages into your repositories, or ingest packages from external repositories, you simply use the package manager tools your developers are used to. Let’s take a look at the process of getting started.
Getting started with CodeArtifact
To get started with CodeArtifact, I first need to create a domain for my organization, which will aggregate my repositories. Domains are used to perform the actual storage of packages and metadata, even though I consume them from a repository. This has the advantage that a single package asset, for example a given npm package, would be stored only once per domain no matter how many repositories it may appear to be in. From the CodeArtifact console, I can select Domains from the left-hand navigation panel, or instead create a domain as part of creating my first repository, which I’ll do here by clicking Create repository.
First, I give my repository a name and optional description, and I then have the option to connect my repository to several upstream repositories. When requests are made for packages not present in my repository, CodeArtifact will pull the respective packages from these upstream repositories for me, and cache them into my CodeArtifact repository. Note that a CodeArtifact repository can also act as an upstream for other CodeArtifact repositories. For the example here, I’m going to pull packages from the npm public registry and PyPi. CodeArtifact will refer to the repositories it creates on my behalf to manage these external connections as npm-store and pypi-store.
Clicking Next, I then select, or create, a domain which I do by choosing the account that will own the domain and then giving the domain a name. Note that CodeArtifact encrypts all assets and metadata in a domain using a single AWS Key Management Service (AWS KMS) key. Here, I’m going to use a key that will be created for me by the service, but I can elect to use my own.
Clicking Next takes me to the final step to review my settings, and I can confirm the package flow from my selected upstream repositories is as I expect. Clicking Create repository completes the process, and in this case creates the domain, my repository, and two additional repositories representing the upstreams.
After using this simple setup process, my domain and its initial repository, configured to pull upstream from npm and PyPi, are now ready to hold software artifact packages, and I could also add additional repositories if needed. However my next step for this example is to configure the package managers for my upstream repositories, npm and pip, with access to the CodeArtifact repository, as follows.
Configuring package managers
The steps to configure various package managers can be found in the documentation, but conveniently the console also gives me the instructions I need when I select my repository. I’m going to start with npm, and I can access the instructions by first selecting my npm-pypi-example-repository and clicking View connection instructions.
In the resulting dialog I select the package manager I want to configure and I am shown the relevant instructions. I have the choice of using the AWS Command Line Interface (AWS CLI) to manage the whole process (for npm, pip, and twine), or I can use a CLI command to get the token and then run npm commands to attach the token to the repository reference.
Regardless of the package manager, or the set of instructions I follow, the commands simply attach an authorization token, which is valid for 12 hours, to the package manager configuration for the repository. So that I don’t forget to refresh the token, I have taken the approach of adding the relevant command to my startup profile so that my token is automatically refreshed at the start of each day.
Following the same guidance, I similarly configure pip, again using the AWS CLI approach:
C:\> aws codeartifact login --tool pip --repository npm-pypi-example-repository --domain my-example-domain --domain-owner ACCOUNT_ID
Writing to C:\Users\steve\AppData\Roaming\pip\pip.ini
Successfully logged in to codeartifact for pypi
That’s it! I’m now ready to start using the single repository for dependencies in my Node.js and Python applications. Any dependency I add which is not already in the repository will be fetched from the designated upstream repositories and added to my CodeArtifact repository.
Let’s try some simple tests to close out the post. First, after changing to an empty directory, I execute a simple npm install
command, in this case to install the AWS Cloud Development Kit (AWS CDK).
npm install -g aws-cdk
Selecting the repository in the CodeArtifact console, I can see that the packages for the AWS Cloud Development Kit (AWS CDK), and its dependencies, have now been downloaded from the upstream npm public registry repository, and added to my repository.
I mentioned earlier that CodeArtifact repositories are polyglot, and able to store packages of any supported type. Let’s now add a Python package, in this case Pillow, a popular image manipulation library.
> pip3 install Pillow
Looking in indexes: https://aws:****@my-example-domain-123456789012.d.codeartifact.us-west-2.amazonaws.com/pypi/npm-pypi-example-repository/simple/
Collecting Pillow
Downloading https://my-example-domain-123456789012.d.codeartifact.us-west-2.amazonaws.com/pypi/npm-pypi-example-repository/simple/pillow/7.1.2/Pillow-7.1.2-cp38-cp38-win_amd64.whl (2.0 MB)
|████████████████████████████████| 2.0 MB 819 kB/s
Installing collected packages: Pillow
Successfully installed Pillow-7.1.2
In the console, I can see the Python package sitting alongside the npm packages I added earlier.
Although I’ve used the console to verify my actions, I could equally well use CLI commands. For example, to list the repository packages I could have run the following command:
aws codeartifact list-packages --domain my-example-domain --repository npm-pypi-example-repository
As you might expect, additional commands are available to help with work with domains, repositories, and the packages they contain.
Availability
AWS CodeArtifact is now generally available in the Frankfurt, Ireland, Mumbai, N.Virginia, Ohio, Oregon, Singapore, Sweden, Sydney, and Tokyo regions. AWS CloudFormation support for CodeArtifact is coming soon.
For additional best practice considerations on using CodeArtifact, see this blog post, and tune in on June 12th at noon (PST) to Twitch.tv/aws or LinkedIn Live, where we will be showing how you can get started with CodeArtifact.
— Steve