Building Our Game on AWS – Lessons Learned by Leaftail Labs
Guest post authored by Eli Tayrien, CTO and Co-founder, Leaftail Labs.
Leaftail Labs was formed in 2017 by a pair of game industry veterans who saw exciting possibilities in mobile AR gaming. Our first game, Nibblity, recently launched worldwide, and we are excited to continue to deliver more exciting content for you and your Nibblins to enjoy.
When we started Leaftail Labs, we knew social and connected components would be an important part of our game’s success, meaning the game’s online services would be critical. We had experience maintaining on-premises servers and were not excited to go down that road again. As a small team, it was important for us to spend as much time as possible iterating on game features and to minimize the time spent on devops and server infrastructure. With those goals in mind, we decided to adopt AWS as our services provider, despite having relatively little experience with it.
AWS is now an integral part of our game’s technology ecosystem. In this blog, we will share what has worked well for us, what we have learned, and what we recommend for teams that are in a similar situation.
Nibblity is a mobile AR pet simulation game. It is fun and cheerful, and incorporates influences from a variety of different genres including collector and idle games. As players explore the world around them, they discover new Nibblins to make friends with, as well as new treats to scan and feed to their new friends. Working together with others is a core piece of our game, and players can share items to help one another’s Nibblins become even happier.
Our first iteration of game services was a standalone application running on Amazon Elastic Compute Cloud (Amazon EC2). This was enough to get us off the ground but was not a sustainable long-term solution: we found ourselves spending too much time maintaining the environment the code runs in and not enough time working on the code itself. Manual tasks, such as installing updates on machine images and writing complicated custom CI/CD scripts to upload new versions of the services code, were a burden for our small team, and we found ourselves searching for a more managed solution. After some early experimentation, we made the move to AWS Fargate and AWS CloudFormation.
Fargate and CloudFormation
The core of our services is now a set of Docker containers running on a Fargate cluster. The cluster and all its associated resources, including our code’s external dependencies, are created by a set of CloudFormation stacks. This has been a very effective combination for us. Deploying new updates is extremely simple, and our CI/CD scripts are just a few steps. The creation of new test environments is also straightforward. We can have a new service up and running in a matter of minutes. This has also been a nice cost-saving measure for us: because it is so easy to set up test environments, we are free to delete them when they are not in use, minimizing the number of resources that are sitting unused.
External dependencies and networking infrastructure, alongside other components of our service workflow, are all defined in a set of CloudFormation documents. CloudFormation can have a steep learning curve, but we recommend spending the time needed to become familiar with it. The syntax can be difficult and support for some AWS features is missing, but the ability to declare all of our code’s external dependencies in one place has totally changed how we work with our cloud services.
Moving to this model has made our infrastructure more resilient and repeatable. When we want to deploy a new test environment, we can do so quickly with no manual configuration steps required. When we want to roll back to an earlier version, we can do so with confidence, knowing that all of that version’s dependencies are explicitly contained within the CloudFormation documents.
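To illustrate the idea of declaring an environment in one place, here is a minimal sketch in Python that builds a CloudFormation template as a plain dictionary and serializes it to JSON. It is not our production template: the function name, the `game-` naming scheme, the container name, and the CPU and memory sizes are all hypothetical, and a deployable stack would also need an execution role, networking, and an ECS service.

```python
import json


def build_environment_template(env_name: str, image_uri: str) -> dict:
    """Build a minimal, illustrative CloudFormation template for one environment.

    All names and sizes below are assumptions made for this example.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Game services for the {env_name} environment",
        "Resources": {
            # The ECS cluster that the Fargate tasks run in.
            "Cluster": {
                "Type": "AWS::ECS::Cluster",
                "Properties": {"ClusterName": f"game-{env_name}"},
            },
            # The task definition pins the exact container image, so rolling
            # back means redeploying an earlier version of this document.
            "TaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": f"game-services-{env_name}",
                    "RequiresCompatibilities": ["FARGATE"],
                    "NetworkMode": "awsvpc",  # required for Fargate tasks
                    "Cpu": "256",
                    "Memory": "512",
                    "ContainerDefinitions": [
                        {
                            "Name": "game-services",
                            "Image": image_uri,
                            "Essential": True,
                        }
                    ],
                },
            },
        },
    }


def render_template(env_name: str, image_uri: str) -> str:
    """Serialize the template to JSON, ready to hand to CloudFormation."""
    return json.dumps(build_environment_template(env_name, image_uri), indent=2)
```

Because the environment name and image are the only inputs, the same function can describe a short-lived test environment or a production one, which is the property that makes environments cheap to create and delete.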
Our team finds Fargate to be a straightforward, reliable, secure, and easy-to-use service. It allows us to focus on the task of writing code to power great gameplay features while Fargate manages the code’s runtime environment. That means no fiddling with patches and updates and no OS settings to configure. We just upload our Docker images to Amazon ECR and let Fargate do the rest.
Moving forward, we are interested in using AWS CodePipeline to improve our CI/CD pipeline and fully automate CloudFormation stack updates and creation whenever a change occurs. We are also interested in integrating Fargate’s autoscaling functionality to keep costs down while still maintaining responsive game sessions at peak play times.
Other AWS Services
Our workflow uses several other AWS services including Amazon S3 and Amazon DynamoDB for persistent storage. DynamoDB is easy to get started with, but we do recommend spending the time at the beginning to ensure you understand how hash and range keys work, and to think through how you will access your data. Mistakes made early on can be time consuming to correct. For example, secondary indices are a very powerful feature that we wish we had known about earlier.
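As a rough illustration of how hash and range keys and a secondary index fit together, the sketch below builds a DynamoDB table definition as a Python dictionary. The table layout is hypothetical and not our actual schema: `player_id`, `item_id`, `item_type`, and the `by_item_type` index name are assumptions chosen for the example. The small checker reflects a rule that is easy to trip over early on: every key attribute must be declared, and only key attributes may be declared.

```python
def build_table_definition(table_name: str) -> dict:
    """Return an illustrative table definition (hypothetical schema)."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",
        "AttributeDefinitions": [
            {"AttributeName": "player_id", "AttributeType": "S"},
            {"AttributeName": "item_id", "AttributeType": "S"},
            {"AttributeName": "item_type", "AttributeType": "S"},
        ],
        # Hash key groups all of one player's items together; the range key
        # orders items within that group and makes each item unique.
        "KeySchema": [
            {"AttributeName": "player_id", "KeyType": "HASH"},
            {"AttributeName": "item_id", "KeyType": "RANGE"},
        ],
        # A global secondary index supports a second access pattern:
        # "find items of a given type" without scanning the whole table.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "by_item_type",
                "KeySchema": [
                    {"AttributeName": "item_type", "KeyType": "HASH"},
                    {"AttributeName": "player_id", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


def key_attributes_match(definition: dict) -> bool:
    """Check that declared attributes are exactly the ones used in key schemas."""
    declared = {a["AttributeName"] for a in definition["AttributeDefinitions"]}
    used = {k["AttributeName"] for k in definition["KeySchema"]}
    for index in definition.get("GlobalSecondaryIndexes", []):
        used |= {k["AttributeName"] for k in index["KeySchema"]}
    return declared == used
```

A dictionary in this shape matches the keyword arguments of the boto3 DynamoDB client's `create_table` call, so it could be passed as `client.create_table(**definition)`. Writing the access patterns down first, then deriving the keys and indexes from them, is the part worth doing early.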
All user accounts are managed by Amazon Cognito, which has been another success for us. Using Cognito allows us to support users signing in with a variety of different social providers or creating their own Leaftail-specific accounts with minimal code written. One obstacle for us is that customization of the Cognito Hosted UI is rather limited, but overall we find the benefits outweigh the drawbacks in this case. When creating a new user pool, we would also suggest not marking any standard attributes as required: These settings cannot be changed after the user pool has been created, and if your application requirements change at a later date, migrating data to a new user pool can be time consuming.
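The advice about required attributes can be sketched as a CloudFormation resource built in Python. This is an illustrative fragment only: the pool name and the choice of `email` and `name` as attributes are assumptions, and the helper function is hypothetical. The point is that every standard attribute is declared with `Required` set to false, and a small check can guard against someone flipping it later.

```python
def build_user_pool_resource(pool_name: str) -> dict:
    """Return an illustrative AWS::Cognito::UserPool resource (hypothetical names)."""
    return {
        "Type": "AWS::Cognito::UserPool",
        "Properties": {
            "UserPoolName": pool_name,
            "Schema": [
                # Required cannot be changed once the pool exists, so keep
                # it False and enforce any requirement in application code.
                {
                    "Name": "email",
                    "AttributeDataType": "String",
                    "Mutable": True,
                    "Required": False,
                },
                {
                    "Name": "name",
                    "AttributeDataType": "String",
                    "Mutable": True,
                    "Required": False,
                },
            ],
        },
    }


def required_attributes(resource: dict) -> list:
    """List the names of schema attributes marked as required."""
    schema = resource["Properties"].get("Schema", [])
    return [attr["Name"] for attr in schema if attr.get("Required")]
```

Running `required_attributes` on the resource in a CI step and failing the build when the list is non-empty is one inexpensive way to keep this rule from being broken by accident.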
Although we have moved away from using Amazon EC2 for core services, it does still have a place in our tech stack hosting our CI/CD server.
Finally, Amazon CloudWatch is used to monitor the health of our services. The same CloudFormation documents that create our services also create a set of CloudWatch alarms based on various integrated and custom health metrics. Whenever a threshold is breached, CloudWatch automatically sends us a warning message. Integrating the alarms into CloudFormation in this way is highly useful to us; whenever we deploy a new environment, we are confident that the accompanying alarms are configured correctly as well.
We hope this post is useful to you and other game developers who find themselves in circumstances similar to our own. May the lessons we’ve learned help you simplify your service infrastructure so you can focus on making great games for us all to enjoy.
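A sketch of an alarm declared alongside the service might look like the following. It is written in Python as a function that returns a CloudFormation resource, and it is an example under stated assumptions rather than our actual configuration: the function name, the 80 percent threshold, the five one-minute evaluation periods, and the use of an SNS topic to deliver the warning are all illustrative choices.

```python
def build_cpu_alarm(
    cluster_name: str,
    service_name: str,
    topic_arn: str,
    threshold_percent: float = 80.0,
) -> dict:
    """Return an illustrative AWS::CloudWatch::Alarm resource for an ECS service."""
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "AlarmDescription": f"High CPU on {service_name} in {cluster_name}",
            # Built-in ECS service metric, identified by cluster and service.
            "Namespace": "AWS/ECS",
            "MetricName": "CPUUtilization",
            "Dimensions": [
                {"Name": "ClusterName", "Value": cluster_name},
                {"Name": "ServiceName", "Value": service_name},
            ],
            "Statistic": "Average",
            "Period": 60,            # seconds per data point
            "EvaluationPeriods": 5,  # consecutive breaching periods required
            "Threshold": threshold_percent,
            "ComparisonOperator": "GreaterThanThreshold",
            # Assumed delivery mechanism: notify an SNS topic on alarm.
            "AlarmActions": [topic_arn],
        },
    }
```

Because the alarm takes the cluster and service names as inputs, adding it to the same template that creates the service means each new environment gets a matching alarm without a separate setup step.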