Migration to AWS CodeCommit, AWS CodePipeline and AWS CodeBuild From GitLab
This walkthrough shows you how to migrate multiple repositories to AWS CodeCommit from GitLab and set up a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild. Event notifications and pull requests are sent to Amazon Chime for project team member communication.
AWS CodeCommit supports all Git commands and works with existing Git tools. I can keep using my preferred development environment plugins, continuous integration/continuous delivery (CI/CD) systems, and graphical clients with AWS CodeCommit.
Over the years, the number of repositories hosted in my GitLab environment grew beyond 100, and maintaining it with patches, updates, and backups was time-consuming and risky. Migrating to AWS CodeCommit manually, project by project, would have been tedious and error-prone. I wanted a script to handle the AWS setup and the code migration for me.
The AWS CodeCommit documentation has an example of how to migrate a single repository, but I wanted to migrate many.
As part of the migration, I had a requirement to set up a CI/CD pipeline using AWS CodePipeline and send notifications on activity in the repository to Amazon Chime, which I use for communication between project members.
The migration script calls the GitLab API to get a list of git repositories and subsequently runs
git clone --mirror <ssh-repository-url> <project-name>
commands against the SSH endpoint of the repositories.
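The listing and cloning steps could be sketched roughly as follows. This is a minimal illustration, not the migration script's actual code: it assumes the GitLab REST API v4 `projects` endpoint with a `PRIVATE-TOKEN` header and the `ssh_url_to_repo` and `path_with_namespace` fields of its response, and the function names are hypothetical.

```python
import json
import urllib.request


def fetch_page(gitlab_url, token, page, per_page=100):
    """Fetch one page of projects from the GitLab REST API (v4)."""
    url = f"{gitlab_url.rstrip('/')}/api/v4/projects?per_page={per_page}&page={page}"
    request = urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_projects(gitlab_url, token, fetch=fetch_page):
    """Collect all projects visible to the token, page by page."""
    projects, page = [], 1
    while True:
        batch = fetch(gitlab_url, token, page)
        if not batch:  # an empty page means there is nothing left to read
            return projects
        projects.extend(batch)
        page += 1


def mirror_clone_command(project):
    """Build the git command for one project entry returned by the API."""
    # Assumption: the last path segment serves as the local directory name.
    name = project["path_with_namespace"].split("/")[-1]
    return ["git", "clone", "--mirror", project["ssh_url_to_repo"], name]
```

Each command list can then be passed to `subprocess.run(command, check=True)`. Keeping the HTTP call behind the `fetch` parameter makes the pagination logic easy to try out without a live GitLab server.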
For every GitLab repository, a CloudFormation template creates an AWS CodeCommit repository along with the AWS CodePipeline and AWS CodeBuild resources. If an Amazon Chime webhook is configured, the Lambda function that posts to Amazon Chime is created as well.
One S3 bucket for artifacts is also set up with the first AWS CodeCommit repository and shared across all other AWS CodeCommit and AWS CodePipeline resources.
The migration script can be executed on any system that can communicate with the existing GitLab environment through SSH and the GitLab API, can reach the AWS endpoints, and has permissions to create AWS CloudFormation stacks, AWS IAM roles and policies, and AWS Lambda, AWS CodeCommit, and AWS CodePipeline resources.
To pull all the projects from GitLab without needing to define them previously, a GitLab personal access token is used.
You can configure the script to migrate the GitLab projects of specific users, the repositories of specific groups, or individual projects, or to do a full migration of all projects.
Following best practices, I use CloudFormation templates to automate the creation of the AWS CodeCommit, CodePipeline, and CodeBuild resources.
The Amazon Chime notifications are optional. They are set up using a serverless Lambda function triggered by CloudWatch Events rules.
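A notification function of this kind might look like the sketch below. It is not the Lambda code shipped with the solution. It assumes that the Chime incoming webhook accepts a JSON body with a `Content` field, that the event carries `detail-type`, `detail.event`, and `detail.repositoryName` (or `detail.repositoryNames` for pull requests), and that the webhook URL is provided in a hypothetical `CHIME_WEBHOOK_URL` environment variable.

```python
import json
import os
import urllib.request


def format_message(event):
    """Turn a CloudWatch Events payload into a one-line chat message."""
    detail = event.get("detail", {})
    kind = event.get("detail-type", "Repository event")
    # Assumed field names; fall back gracefully when they are missing.
    repository = (
        detail.get("repositoryName")
        or ", ".join(detail.get("repositoryNames", []))
        or "unknown repository"
    )
    action = detail.get("event", "unknown action")
    return f"{kind}: {action} in {repository}"


def handler(event, context):
    """Lambda entry point: post the formatted message to the Chime webhook."""
    # CHIME_WEBHOOK_URL is a hypothetical environment variable name.
    webhook_url = os.environ["CHIME_WEBHOOK_URL"]
    body = json.dumps({"Content": format_message(event)}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Separating the message formatting from the HTTP call keeps the formatting testable without a webhook.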
I wrote and tested the solution in Python 3.6 and assume pip and git are installed. Python 2 is not supported.
The GitLab version that we migrated off of and tested against was 10.5. I expect the script to work fine against other versions that support REST calls as well, but didn’t test it against those.
For this walkthrough, you should have the following prerequisites:
- An AWS account
- An EC2 instance running Linux with access to your GitLab environment, or a laptop or desktop running macOS or Linux. The solution has not been tested on Windows/Cygwin.
- Git installed
- AWS CLI installed.
- Run a pip install on a command line:
pip install gitlab-to-codecommit-migration
- Create a personal access token in GitLab (instructions)
- Configure ssh-key based access for your user in GitLab (Create and add your SSH public key in GitLab Docs)
- Set up your AWS account for CodeCommit following (Setup Steps for SSH Connections to AWS CodeCommit Repositories on Linux, macOS, or Unix). You can use the same SSH key for both GitLab and AWS.
- Set up your ~/.ssh/config to have one entry for the GitLab server and one for the CodeCommit environment. Example:
Host my-gitlab-server-example.com
  IdentityFile ~/.ssh/<your-private-key-name>
Host git-codecommit.*.amazonaws.com
  User APKEXAMPLEEXAMPLE-replace-with-your-user
  IdentityFile ~/.ssh/<your-private-key-name>
This way the git client uses the key for both domains and the correct user. Make sure to use the SSH key ID and not the AWS Access key ID.
- Configure your AWS Command Line Interface (AWS CLI) environment. This environment helps execute the CloudFormation template creation part of the script. For setup instructions, see (Configuring the AWS CLI).
- When executing the script on a remote server on AWS or in your data center, use a terminal multiplexer like tmux so that the migration keeps running if your connection drops.
- If you migrate more than 33 repositories, check the CloudWatch Events rule limit, which defaults to 100 (https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/cloudwatch_limits_cwe.html). The link to request an increase is on the same page. The setup uses one CloudWatch Events rule to trigger the pipeline and two rules for notifications to Amazon Chime, for a total of three rules per pipeline.
- For even larger migrations of more than 200 repositories, check the CloudFormation limits, which default to a maximum of 200 (aws cloudformation describe-account-limits). CodePipeline has a default limit of 300, and CodeCommit and CodeBuild each have a default limit of 1000. All of these limits can be increased through a support ticket, and the link to create it is on the limits page in the documentation.
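The arithmetic behind these thresholds can be written down as a short check. The sketch below uses only the default limits quoted in the list above and assumes three rules plus one of every other resource per repository. It ignores resources that already exist in the account and any shared stacks, so treat the result as a rough estimate. The names are illustrative.

```python
# Default limits as quoted in the prerequisites; your account may differ.
DEFAULT_LIMITS = {
    "CloudWatch Events rules": 100,
    "CloudFormation stacks": 200,
    "CodePipeline pipelines": 300,
    "CodeCommit repositories": 1000,
    "CodeBuild projects": 1000,
}

# Resources created per migrated repository (assumed: three rules, one of each other).
PER_REPOSITORY = {
    "CloudWatch Events rules": 3,
    "CloudFormation stacks": 1,
    "CodePipeline pipelines": 1,
    "CodeCommit repositories": 1,
    "CodeBuild projects": 1,
}


def exceeded_limits(repository_count, limits=DEFAULT_LIMITS):
    """Return the names of the limits a migration of this size would exceed."""
    return [
        name
        for name, limit in limits.items()
        if repository_count * PER_REPOSITORY[name] > limit
    ]
```

With these defaults, 33 repositories need 99 rules and stay under every limit, while 34 repositories need 102 rules and exceed the CloudWatch Events rule limit.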
After you have set up the environment, I recommend testing the migration with one sample project. On a command line, type:
gitlab-to-codecommit --gitlab-access-token youraccesstokenhere --gitlab-url https://yourgitlab.yourdomain.com --repository-names namespace/sample-project
It takes around 30 seconds for the CloudFormation template to create the AWS CodeCommit repository and the AWS CodePipeline and to deploy the Lambda function. During the deployment, or whenever you want to inspect the setup, you can check the state and look at the template in the CloudFormation section of the AWS Management Console.
The time it takes to push the code depends on the size of your repository. Once you see this run successfully, you can continue to push all or a subset of the projects.
gitlab-to-codecommit --gitlab-access-token youraccesstokenhere --gitlab-url https://gitlab.yourdomain.com --all
I also included a script that sets repositories to read-only in GitLab. Once you have migrated to CodeCommit, this is a good way to keep users from pushing to the old remote in GitLab.
gitlab-set-read-only --gitlab-access-token youraccesstokenhere --gitlab-url https://gitlab.yourdomain.com --all
To avoid incurring future charges for test environments, delete the resources by deleting the account-setup CloudFormation stack and the stack for the repository you created.
The CloudFormation template sets DeletionPolicy: Retain on the CodeCommit repository to avoid accidentally deleting the code when the CloudFormation stack is deleted. If you want to remove the CodeCommit repository as well at some point, you can change the default behavior or delete the repository through the API, CLI, or console. During testing, a deployment would sometimes fail because I had not deleted the CodeCommit repository after deleting the CloudFormation stack. For migration purposes, you will not run into this issue, and you will not delete a CodeCommit repository by mistake when deleting a CloudFormation stack.
To delete the repository, open the AWS CodeCommit service in the AWS Management Console, select the repository, and click the delete button.
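For test environments with many repositories, the same cleanup can be done through the API. The following is a minimal sketch using boto3, assuming boto3 is installed and AWS credentials are configured. The function name and the idea of passing the stack name explicitly are illustrative, because the stack naming used by the migration script is not described here.

```python
def delete_migrated_repository(repository_name, stack_name,
                               cloudformation=None, codecommit=None):
    """Delete the stack for one repository, then the retained repository."""
    if cloudformation is None or codecommit is None:
        # Imported lazily so the function can be exercised with stub clients.
        import boto3
        cloudformation = cloudformation or boto3.client("cloudformation")
        codecommit = codecommit or boto3.client("codecommit")

    # Remove the pipeline, build project, and related resources first.
    cloudformation.delete_stack(StackName=stack_name)
    cloudformation.get_waiter("stack_delete_complete").wait(StackName=stack_name)

    # The repository survives the stack deletion because of DeletionPolicy: Retain,
    # so it has to be deleted explicitly. This permanently removes the code.
    response = codecommit.delete_repository(repositoryName=repository_name)
    return response.get("repositoryId")
```

Because the repository deletion cannot be undone, it is worth running this only against repositories created during testing.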
This blog post showed how to migrate repositories to AWS CodeCommit from GitLab and set up a CI/CD pipeline using AWS CodePipeline and AWS CodeBuild.
The source code is available at https://github.com/aws-samples/gitlab-to-codecommit-migration
Please create issues or pull requests on the GitHub repository if you have additional requirements or use cases.