AWS DevOps & Developer Productivity Blog
AWS CodeArtifact and your package management flow – Best Practices for Integration
You often use artifact repositories to store and share software or deployment packages. Centralized artifacts enable teams to operate independently and share versioned software artifacts across your organization. Sharing versioned artifacts across organizations increases code reuse and reduces delivery time. Having a central artifact store enables tighter artifact governance and improves security visibility. This post describes patterns for integrating AWS CodeArtifact in an effective, cost-controlled, and efficient manner.
AWS CodeArtifact concepts
AWS CodeArtifact uses the following elements:
- Asset – An individual file stored in AWS CodeArtifact that is associated with a package version, such as an npm .tgz file or Maven POM and JAR files
- Package – A package is a bundle of software and the metadata that is required to resolve dependencies and install the software. AWS CodeArtifact supports npm, PyPI, and Maven package formats.
- Repository – A CodeArtifact repository contains a set of package versions, each of which maps to a set of assets. Repositories are polyglot: a single repository can contain packages of any supported type. Each repository exposes endpoints for fetching and publishing packages using tools like the npm CLI, the Maven CLI (mvn), and pip.
- Domain – Repositories are aggregated into a higher-level entity known as a domain. The domain allows organizational policy to be applied across multiple repositories. A domain deduplicates storage of its repositories’ packages.
Creating a domain based on organizational ownership
When you create a domain in CodeArtifact, it’s important to organize the domain by ownership within the organization. For example, a company would be a domain and its products would be repositories. Domains allow you to apply organizational policies across multiple repositories. Generally, we recommend creating one domain per company. In some cases it may also be beneficial to have a sandbox domain where prototype repositories reside. In a sandbox domain, teams are at liberty to create their own repositories and experiment as needed without affecting product deliverable assets. A sandbox domain has trade-offs: it isolates repositories, because you cannot copy packages between domains, and it duplicates packages and increases costs, because package deduplication is handled at the domain level. Organizing packages by domain ownership increases the cache hits on a package within the domain and reduces cost for each subsequent package fetch request.
Whenever a package is fetched from a repository, the asset is cached in your CodeArtifact domain to minimize the cost of subsequent downstream requests. A given asset only needs to be stored once in a domain, even if it’s available in two (or two thousand) repositories. That means you only pay for storage once. Copying a package version with the CopyPackageVersions API is only possible between repositories within the same CodeArtifact domain.
You can create a domain for your organization by calling create-domain in the AWS Command Line Interface (AWS CLI), AWS SDK, or on the CodeArtifact console. See the following code:
aws codeartifact create-domain --domain "my-org"
After you create the domain, it is listed in the Domains section on the CodeArtifact console.
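If you prefer to check from the command line instead of the console, a minimal sketch along these lines may help. The function name domain_status is a hypothetical helper introduced only for illustration; it wraps the describe-domain command and prints the status field of the domain.

```shell
# Sketch: print the status of a CodeArtifact domain.
# domain_status is a hypothetical helper name, not part of the AWS CLI.
domain_status() {
  # $1 = domain name
  aws codeartifact describe-domain --domain "$1" \
    --query "domain.status" --output text
}
```

For example, domain_status "my-org" prints the status of the domain created above.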
Using a shared repository
A shared repository is applicable when a team feels that a component is useful to the rest of the organization. It is not the place for components that are experimental, personal projects, or otherwise not meant for wide distribution within the organization. Examples of shared components are packages from open-source public repositories (npm, PyPI, and Maven) and authentication, logging, or helper libraries. Shared libraries aren’t related to product libraries; for instance, a service contract library shouldn’t live in a shared repository. The shared repository should be marked read-only to all users except for the publishing IAM role. At Amazon, we have found that many teams want to consume common packages as part of their application build and don’t need to publish any packages themselves. Those teams don’t need their own repository and can pull packages from the shared repository. Overall, approximately 80% of packages are downloaded from the shared repository, and 20% from team- or project-specific repositories.
You can create a shared repository by calling the create-repository command and then setting a resource policy that makes the repository read-only.
First, create the repository with the AWS CLI. See the following code:
aws codeartifact create-repository --domain "my-org" \
--domain-owner "account-id" --repository "my-shared-repo-name" \
--description "My new repository"
Next you make the repository read-only by setting a resource policy. See the following code:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "codeartifact:DescribePackageVersion",
        "codeartifact:DescribeRepository",
        "codeartifact:GetPackageVersionReadme",
        "codeartifact:GetRepositoryEndpoint",
        "codeartifact:ListPackages",
        "codeartifact:ListPackageVersions",
        "codeartifact:ListPackageVersionAssets",
        "codeartifact:ListPackageVersionDependencies",
        "codeartifact:ReadFromRepository"
      ],
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::444455556666:root"
      },
      "Resource": "*"
    }
  ]
}
To attach the resource policy to the repository, call the put-repository-permissions-policy command. See the following code:
aws codeartifact put-repository-permissions-policy --domain "my-org" \
--domain-owner "account-id" --repository "my-shared-repo-name" \
--policy-document file:///PATH/TO/policy.json
When you have created the repository, you will see it listed in the Repositories section on the CodeArtifact console.
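To confirm which policy is attached to a repository, one option is to read it back with get-repository-permissions-policy. The following is a minimal sketch; show_repository_policy is a hypothetical helper name, and the domain and owner values are the same placeholders used in the commands above.

```shell
# Sketch: print the resource policy document attached to a repository.
# show_repository_policy is a hypothetical helper name.
show_repository_policy() {
  # $1 = repository name
  aws codeartifact get-repository-permissions-policy \
    --domain "my-org" --domain-owner "account-id" \
    --repository "$1" \
    --query "policy.document" --output text
}
```

For example, show_repository_policy "my-shared-repo-name" prints the policy document so you can check that it grants only read actions.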
External repository connections
CodeArtifact enables you to set external repository connections and replicate their packages within CodeArtifact. An external connection reduces your downstream dependency on the remote external repository. When you request a package that’s not already present in the CodeArtifact repository, the package can be fetched through the external connection. This makes it possible to consume open-source dependencies used by your application. An external connection also reduces interruptions to your development process caused by external dependencies: for example, if a package is removed from a public repository, you still have a copy of the package stored in CodeArtifact. You should have a one-to-one mapping with external repositories, rather than multiple CodeArtifact repositories pointing to the same public repository. Each asset that CodeArtifact imports into your repository from a public repository is billed as a single request, and each connection must reconcile and fetch the package before the response is returned. With a one-to-one mapping, you increase cache hits, reduce the time to download an application dependency from CodeArtifact, and reduce the number of external package resolution requests. To associate an external repository connection with your repository, use the associate-external-connection command. See the following code:
aws codeartifact associate-external-connection \
--domain "my-org" --domain-owner "account-id" \
--repository "my-external-repo" --external-connection public:npmjs
Once you have associated an external connection with your repository, the connection is visible in the repository’s details in the Repositories section. In this example, we’ve connected the repository to the external npmjs repository.
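The one-to-one mapping described above can be sketched as one CodeArtifact repository per public source. The repository names below (npm-store, pypi-store, maven-central-store) and the helper function names are hypothetical, and the sketch assumes each repository already exists in the my-org domain. The connection names public:npmjs, public:pypi, and public:maven-central are the identifiers CodeArtifact uses for those public repositories.

```shell
# Sketch: connect one CodeArtifact repository to each public source.
# Repository names are hypothetical placeholders; create them first.
connect_external() {
  # $1 = repository name, $2 = external connection name
  aws codeartifact associate-external-connection \
    --domain "my-org" --domain-owner "account-id" \
    --repository "$1" --external-connection "$2"
}

connect_all_external() {
  # Stops at the first association that fails.
  connect_external "npm-store" "public:npmjs" &&
  connect_external "pypi-store" "public:pypi" &&
  connect_external "maven-central-store" "public:maven-central"
}
```

Calling connect_all_external associates each repository in turn, so every public source is reached through exactly one CodeArtifact repository.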
Team and product repositories
When working in distributed teams, you often align repositories to the product or service ownership. Teams working on their own repository can update as needed. An example would be creating a private package that your team only uses internally.
To create a team repository, see the following code:
aws codeartifact create-repository --domain my-org \
--domain-owner account-id --repository my-team-repo \
--description "My new team repository"
As teams develop the package, they need to publish their changes to the repository. You would typically publish the package as part of your development pipeline. See the following code for an example:
# Log in to CodeArtifact
aws codeartifact login --tool npm \
--domain "my-org" --domain-owner "account-id" \
--repository "my-team-repo"
# Run build commands here
...
# Set $VERSION from your build system
npm version $VERSION
# Publish to CodeArtifact
npm publish
After you test the feature and find that it is useful across your organization, you can copy the package into your shared repository. See the following code:
# Promoting to a shared repo
aws codeartifact copy-package-versions --domain "my-org" \
--domain-owner "account-id" --source-repository "my-team-repo" \
--destination-repository "my-shared-repo-name" \
--package my-package --format npm \
--versions '["6.0.2"]'
Once you’ve created your shared repository, you will see it listed with your other repositories in the Repositories section on the CodeArtifact console.
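To check a promotion from the command line, one possible approach is to query the destination repository with list-package-versions. This is a sketch only: promoted_version_status is a hypothetical helper name, and the repository, package, and version are passed in by the caller.

```shell
# Sketch: print the status of one npm package version in a repository.
# Prints nothing if the version is not present in that repository.
promoted_version_status() {
  # $1 = destination repository, $2 = package name, $3 = version
  aws codeartifact list-package-versions \
    --domain "my-org" --domain-owner "account-id" \
    --repository "$1" \
    --format npm --package "$2" \
    --query "versions[?version=='$3'].status" --output text
}
```

Call it with the destination repository from the copy command, for example promoted_version_status "<destination-repository>" "my-package" "6.0.2".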
Sharing repositories across accounts
Often teams or workloads have separate accounts within an organization. This is a recommended practice because it clearly defines operational boundaries and domain of ownership and establishes security boundaries. If your organization uses a multi-account strategy, you can share repositories across accounts using CodeArtifact resource policy. Teams can develop in their own account and publish to a CodeArtifact repository controlled in a shared account.
In this setup, the list of repositories on the CodeArtifact console includes both a shared and a team repository.
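As an illustration of cross-account sharing, the following sketch builds a repository resource policy that lets principals in a second account read from and publish to a repository owned by the shared account. The account ID 111122223333, the variable PUBLISH_POLICY, and the function name are hypothetical. The publishing account also needs to obtain an authorization token for the domain, which is typically granted through a separate domain policy that is not shown here.

```shell
# Sketch: allow a team account to publish to a repository in a shared account.
# 111122223333 is a hypothetical team account ID; replace it with your own.
PUBLISH_POLICY=$(cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
      "Action": [
        "codeartifact:DescribeRepository",
        "codeartifact:GetRepositoryEndpoint",
        "codeartifact:PublishPackageVersion",
        "codeartifact:PutPackageMetadata",
        "codeartifact:ReadFromRepository"
      ],
      "Resource": "*"
    }
  ]
}
EOF
)

grant_cross_account_publish() {
  # $1 = repository owned by the shared account
  aws codeartifact put-repository-permissions-policy \
    --domain "my-org" --domain-owner "account-id" \
    --repository "$1" --policy-document "$PUBLISH_POLICY"
}
```

Because put-repository-permissions-policy replaces the policy already attached to a repository, merge these statements with any existing ones rather than applying this document on top of a read-only policy.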
Using Amazon CloudWatch Events when a package is pushed
When a package is pushed into a repository, its change can affect software dependencies, teams, or process dependencies. When an artifact is pushed to CodeArtifact, an Amazon CloudWatch Events event is triggered, which you can use to trigger additional functionality. You can react to these events by subscribing to a CodeArtifact event in Amazon EventBridge. Some examples of reactions to a change you could take are: checking dependencies, deploying dependent services, notifying teams or services of a change, or building the dependencies.
You can also use EventBridge to start a pipeline in AWS CodePipeline, notify an Amazon Simple Notification Service (Amazon SNS) topic, and have that call AWS Chatbot. For more information, see CodeArtifact event format and example. If you are looking to integrate AWS Chatbot into your delivery flow, see Receive AWS Developer Tools Notifications over Slack using AWS Chatbot.
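A subscription of this kind could be sketched as an EventBridge rule with an event pattern that matches package version changes in one repository. The rule name codeartifact-package-change, the target ID notify, the variable EVENT_PATTERN, and the function name are hypothetical; the target ARN is supplied by the caller, and the target itself must allow EventBridge to invoke it.

```shell
# Sketch: an EventBridge rule that matches package version changes
# in the my-team-repo repository of the my-org domain.
EVENT_PATTERN=$(cat <<'EOF'
{
  "source": ["aws.codeartifact"],
  "detail-type": ["CodeArtifact Package Version State Change"],
  "detail": {
    "domainName": ["my-org"],
    "repositoryName": ["my-team-repo"]
  }
}
EOF
)

create_package_event_rule() {
  # $1 = ARN of the target to invoke (for example, an SNS topic)
  aws events put-rule \
    --name "codeartifact-package-change" \
    --event-pattern "$EVENT_PATTERN" &&
  aws events put-targets \
    --rule "codeartifact-package-change" \
    --targets "Id=notify,Arn=$1"
}
```

Narrowing the pattern to a domain and repository keeps the rule from firing for unrelated repositories; drop the detail section to match every package version change in the account.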
Deploying code in a hybrid environment
You can enable seamless software deployment into AWS and on-premises environments by integrating CodeArtifact with software build and deployment services. You can use CodeArtifact with your existing development pipeline tooling such as npm, pip, and Maven. With native support for these package managers, you can access CodeArtifact wherever you operate today.
First, log in to CodeArtifact, then build your code, and finally publish using npm publish with the following code:
# Log in to CodeArtifact
aws codeartifact login --tool npm \
--domain "my-org" --domain-owner "account-id" \
--repository "my-team-repo"
# Run build commands here
...
# Set $VERSION from your build system
npm version $VERSION
# Publish to CodeArtifact
npm publish
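The same flow can be adapted to the other supported package managers. As a rough sketch for a Python package, the following assumes that the build and twine packages are installed and that the project is already set up for packaging; publish_python_package is a hypothetical helper name. The login command with --tool twine configures a twine repository named codeartifact, which the upload step then targets.

```shell
# Sketch: publish a Python package to CodeArtifact with twine.
# Assumes the "build" and "twine" Python packages are installed.
publish_python_package() {
  # Log in to CodeArtifact and configure twine
  aws codeartifact login --tool twine \
    --domain "my-org" --domain-owner "account-id" \
    --repository "my-team-repo" &&
  # Build the distribution files into dist/
  python3 -m build &&
  # Publish to CodeArtifact
  python3 -m twine upload --repository codeartifact dist/*
}
```

Run publish_python_package from the root of the Python project, either in a pipeline step or on an on-premises build host with AWS credentials.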
Cleaning up
When you’re ready to clean up the repositories and domains you’ve created, you need to remove them in a specific order. Be aware that deleting a repository is a destructive action that removes any stored packages. To delete the repository and domain created in the previous sections of this post, use the delete-repository and delete-domain commands.
You will need to remove the domain and repository in the following order:
- Remove any repositories in a domain
- Remove the domain
To delete the repository and domain, see the following code:
# Delete the repository
aws codeartifact delete-repository --domain "my-org" --domain-owner "account-id" --repository "my-team-repo"
# Delete the domain
aws codeartifact delete-domain --domain "my-org" --domain-owner "account-id"
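If the domain contains several repositories, the ordering above can be sketched as a loop that deletes each repository before deleting the domain. The function name delete_domain_and_repositories is hypothetical. The sketch is destructive, so treat it as an illustration under those assumptions rather than a ready-made cleanup script.

```shell
# Sketch: delete every repository in a domain, then the domain itself.
# Destructive: all packages stored in these repositories are removed.
delete_domain_and_repositories() {
  # $1 = domain name, $2 = domain owner account ID
  repos=$(aws codeartifact list-repositories-in-domain \
    --domain "$1" --domain-owner "$2" \
    --query "repositories[].name" --output text) || return 1
  # Remove the repositories first
  for repo in $repos; do
    aws codeartifact delete-repository \
      --domain "$1" --domain-owner "$2" --repository "$repo" || return 1
  done
  # Remove the domain last
  aws codeartifact delete-domain --domain "$1" --domain-owner "$2"
}
```

For example, delete_domain_and_repositories "my-org" "account-id" stops at the first repository that fails to delete, leaving the domain in place.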
Conclusion
This post covered how to integrate CodeArtifact into your delivery flow and use CodeArtifact effectively. A shared repository approach aids in creating reusable components across your organization. Using team repositories and promoting to a consumable repository allows your teams to iterate independently. For more information, see Getting started with CodeArtifact.