AWS for Games Blog
Optimize game server hosting with containers
Authored by: Serge Poueme, Senior Solutions Architect and Yahav Biran, Principal Solutions Architect
Game developers look for ways to improve the online multiplayer experience while balancing availability and cost effectiveness. Containers and Kubernetes help modernize game server hosting on virtual or bare metal machines by improving compute allocation density, optimizing data-transfer costs, and maintaining high performance. Our containerization approach covers the game server lifecycle: packaging game binaries and assets, and running and disposing of the game server. We apply the method to SuperTuxKart, an open-source, C++ based kart racing game, and provide a configuration example.
Build containerized game servers with AWS Developer tools
Game server images contain executable binaries and media assets such as graphics, sound, network, and physics files, which can take up gigabytes of storage. As a result, loading these artifacts from the game assets registry onto compute nodes can slow game operations even when a change of only tens of bytes is rolled out. You might therefore want to separate the code from the media files so that changes by artists do not force developers to recompile and reintegrate the code unnecessarily.
In addition to optimizing the packaging process, Docker images introduce layers that allow pushing and deploying game assets and code individually, avoiding unnecessary data-transfer costs when the image is distributed to many nodes continuously.
The Docker multi-stage build approach is applied to package the SuperTuxKart binaries and assets. The first stage, denoted debian_base in the Dockerfile, installs the packages required for compiling the code. The second stage, build_arts, reuses the base image and downloads the art assets. The last stage, build_code, compiles the code on top of the build_arts stage. The three stages allow the developer to recompile the image without reinstalling the build packages, reducing build time and compute costs.
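A minimal Dockerfile sketch of the three stages might look like the following; the stage names follow the article, but the package list, paths, and build flags are illustrative assumptions:

```dockerfile
# Sketch of the multi-stage build; package names, paths, and cmake flags
# are illustrative, not the article's exact Dockerfile.
FROM debian:bullseye-slim AS debian_base
# Build toolchain needed to compile the C++ game server
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential cmake subversion git ca-certificates && \
    rm -rf /var/lib/apt/lists/*

FROM debian_base AS build_arts
# Game assets change independently of the code, so they get their own layer
RUN svn checkout https://svn.code.sf.net/p/supertuxkart/code/stk-assets /stk-assets

FROM build_arts AS build_code
# Compile the server binaries on top of the assets layer
COPY . /stk-code
RUN cmake -S /stk-code -B /stk-code/build -DSERVER_ONLY=ON && \
    cmake --build /stk-code/build -j"$(nproc)"
```

Because the assets stage sits below the code stage, an artist-only change invalidates at most the last layer's cache inputs, while a code-only change never re-downloads the assets.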
Now that the packaging is defined, we create a continuous integration pipeline using CDK v2 that continuously builds the SuperTuxKart binaries when the code or assets change.
The pipeline sources the Dockerfile and configuration for the game server from an AWS CodeCommit repository. The solution uses AWS CodeBuild to create an updated Docker image. CodeBuild gives flexibility to assemble the files required for the game servers from multiple artifact stores. In the current example, the game engine code is pulled from GitHub and the game assets are imported from a custom SVN repository.
CodeBuild can be used to create multi-architecture Docker images to support workloads running on Graviton hosts. Graviton processors are custom designed by AWS with 64-bit Arm Neoverse N1 cores. Customers can use Graviton to take advantage of performance optimizations while saving money. Graviton processors power the T4g*, M6g*, R6g*, and C6g* Amazon Elastic Compute Cloud (Amazon EC2) instance types and offer up to 40% better price performance than current-generation x86-based instances for gaming workloads.
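A CodeBuild buildspec for such a multi-architecture build could be sketched as follows; the variable names (such as ECR_REPO_URI) and parameter path are assumptions, not taken from the article:

```yaml
# Illustrative buildspec: builds the image for x86 and Graviton (arm64)
# hosts with docker buildx and pushes it to ECR.
version: 0.2
phases:
  pre_build:
    commands:
      - aws ecr get-login-password --region "$AWS_REGION" |
        docker login --username AWS --password-stdin "$ECR_REPO_URI"
      - IMAGE_TAG=$(date +%Y%m%d%H%M%S)   # unique date/time tag
  build:
    commands:
      - docker buildx create --use
      - docker buildx build --platform linux/amd64,linux/arm64
        -t "$ECR_REPO_URI:$IMAGE_TAG" --push .
  post_build:
    commands:
      # Record the tag so the delivery pipeline can pick it up
      - aws ssm put-parameter --name /game/image-tag
        --value "$IMAGE_TAG" --type String --overwrite
```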
When a change is made to the game engine code, the pipeline executes automatically and creates a new version of the game server container. The container has a unique tag defined from the date and time when the image was built.
Deploy containerized game servers to EKS
The next step is to deploy the game servers to make the game accessible to players.
In our example we create a Continuous Delivery pipeline with CodePipeline and CodeBuild to deploy the newly created image to an EKS cluster.
The Continuous Delivery pipeline is created with a CDK stack that generates the required resources in AWS. The pipeline requires the name of the EKS cluster and the ECR repository as parameters to be deployed. The deployment manifests for the game servers are stored as assets in an S3 bucket. A deployment role is created and added to the pipeline to grant permissions to execute commands on the EKS cluster. We also grant that role permissions to access S3 to retrieve the deployment manifests and Systems Manager Parameter Store to access the latest Docker image tag.
CDK returns a command to update the EKS authentication configuration map and add a mapping for the CodeBuild role that will be used during the deployment.
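The resulting aws-auth mapping might look like the following ConfigMap entry; the account ID, role name, and group are placeholders:

```yaml
# Example aws-auth ConfigMap entry mapping the CodeBuild deployment role
# to a Kubernetes group; ARN and names are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/GameServerDeployRole
      username: codebuild-deploy
      groups:
        - system:masters
```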
The pipeline uses ECR as a source action to trigger deployments. When a newly created game server container is pushed to ECR, the pipeline starts and executes the deployment flow in CodeBuild. CodeBuild first assigns environment variables such as the AWS account ID where the deployment is performed and the AWS Region where the code is running, and populates the IMAGE_TAG variable from Systems Manager Parameter Store. The latest eksctl command line is downloaded and used to retrieve the Kubernetes configuration of the cluster. CodeBuild then retrieves the latest deployment manifests from S3 and replaces their placeholder values, and kubectl deploys the game servers to EKS. The deployed manifest includes the desired number of replicas for the game server and the pod's memory and CPU constraints (requests and limits).
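A deployment manifest of that shape could look like the following sketch; the placeholder tokens, replica count, and resource values are illustrative:

```yaml
# Illustrative game-server Deployment; IMAGE_URI/IMAGE_TAG are placeholders
# that CodeBuild would substitute before running kubectl apply.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stk-server
spec:
  replicas: 3                      # desired number of game servers
  selector:
    matchLabels:
      app: stk-server
  template:
    metadata:
      labels:
        app: stk-server
    spec:
      containers:
        - name: stk-server
          image: IMAGE_URI:IMAGE_TAG
          resources:
            requests:              # scheduling guarantees per pod
              cpu: "500m"
              memory: 512Mi
            limits:                # hard caps per pod
              cpu: "1"
              memory: 1Gi
```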
CDK provides a flexible way to define pipelines for various components of the game. It is possible to reuse the same approach to generate a build and deploy pipeline for game clients.
Once the game server is deployed, players can connect using an Elastic IP address and start playing.
Performance Efficiency & Cost Optimization
Game developers gain immediate benefits from the lightweight size of containers, which addresses issues related to the density of game servers. Game server density refers to the number of server binaries that can run on a single host (bare metal or virtual machine) and has a direct impact on the number of game sessions a single instance can host. Containerized game servers use less memory, which allows developers to scale the number of game servers running on a single host and thereby increase the number of sessions hosted on a single instance.
Game studios that are looking to reduce their hosting spend can use a combination of observability and node provisioning mechanisms to maintain the right amount of compute resources on Kubernetes at any time. When the number of game sessions is low, a node provisioning solution can reduce the number of nodes running in the cluster. When the number of sessions increases, more nodes are added to meet the capacity demand. With this approach, game developers can achieve economies of scale for large game server fleets.
One way to scale game servers is to use CloudWatch to observe a Kubernetes cluster running game servers and generate events when game sessions peak. In this case, EventBridge can initiate a scale-up of the cluster based on the current number of sessions. The same process can be used to scale the cluster down. It is important to benchmark the game and establish a baseline for minimum and maximum capacity. That baseline can change based on the game's results and can be updated in CloudWatch.
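The baseline logic can be sketched as a small function that sizes the cluster from the current session count and clamps the result between the benchmarked minimum and maximum; the sessions-per-node ratio is a hypothetical parameter a team would derive from load testing:

```python
import math

def desired_nodes(active_sessions: int,
                  sessions_per_node: int,
                  min_nodes: int,
                  max_nodes: int) -> int:
    """Return the node count to target for the current session load,
    clamped to the benchmarked min/max capacity baseline."""
    if sessions_per_node <= 0:
        raise ValueError("sessions_per_node must be positive")
    needed = math.ceil(active_sessions / sessions_per_node)
    return max(min_nodes, min(needed, max_nodes))

# 75 sessions at 10 per node -> 8 nodes; quiet periods fall back to the
# minimum, and spikes are capped at the maximum.
print(desired_nodes(75, 10, 2, 12))   # -> 8
print(desired_nodes(0, 10, 2, 12))    # -> 2
print(desired_nodes(500, 10, 2, 12))  # -> 12
```

An EventBridge-triggered function could run logic like this and adjust the node group's desired capacity accordingly.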
Another way to scale game servers is to rely on a node provisioning mechanism such as Karpenter to maintain the right amount of compute resources on Kubernetes at any time.
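A Karpenter Provisioner for this purpose might be sketched as follows (v1alpha5 API); the limits and TTL values are illustrative defaults, not from the article:

```yaml
# Example Karpenter Provisioner that lets the cluster add Graviton or x86
# nodes on demand and reclaim empty ones when sessions drop.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: game-servers
spec:
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["arm64", "amd64"]
  limits:
    resources:
      cpu: "100"                 # cap total provisioned CPU
  ttlSecondsAfterEmpty: 60       # scale empty nodes down quickly
```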
Reliability & Operational Excellence
Game development teams looking to adopt cloud-native operating models can use containers to modernize development and change management processes. Containers are portable, which means developers can use local workstations to code, build, and test the latest versions of game servers on minified Kubernetes runtimes or with Docker Compose. That practice improves code quality, since game developers can push code that has well-defined unit tests and is assessed before being merged into the main development branch. Local development also reduces the compute resources needed to maintain expensive test environments.
Containers are well suited to modern deployment strategies such as rolling updates and canary deployments. It is also easier for game development teams to adopt chaos engineering to assess the quality of their games under unexpected scenarios. This will enable them to prevent outages when game servers are in production.
Game session hosting can be improved with Kubernetes' ability to add custom labels to nodes or deployments. Game development teams can use game characteristics as node or deployment labels, and players can be assigned to specific game servers by matching those labels during game session requests. Teams can also use node selectors to run specific game servers on preferred nodes to meet the CPU and memory requirements of matches.
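For example, a node could be labeled with a game characteristic and matching servers pinned to it with a node selector; the label key, values, and image reference below are illustrative:

```yaml
# First, label a node with a game characteristic (illustrative):
#   kubectl label node ip-10-0-1-23.ec2.internal game-mode=ranked
apiVersion: v1
kind: Pod
metadata:
  name: stk-server-ranked
  labels:
    game-mode: ranked
spec:
  nodeSelector:
    game-mode: ranked            # schedule only on matching nodes
  containers:
    - name: stk-server
      image: IMAGE_URI:IMAGE_TAG
```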
As your game development teams scale servers in the cloud using containers, they need to put the right observability stack in place to gain visibility into the player experience. On AWS, game development teams can use managed observability services such as Amazon Managed Grafana, Amazon Managed Service for Prometheus, or CloudWatch Container Insights to query game server metrics, build dashboards, and generate alerts that are useful to a game's live operations.
Kubernetes can ensure that games deployed on a cluster are scheduled and in the desired readiness state. Any accidental termination of a game server on Kubernetes triggers its replacement by a fresh instance to maintain the desired capacity. To meet the SLAs of their games, game developers can tune the liveness and readiness probes of the containerized game servers to minimize downtime and automatic restarts when pods fail.
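Probe tuning for a game server container might look like the following sketch; the port and all timing values are assumptions to be adjusted per game:

```yaml
# Illustrative liveness/readiness probes for a containerized game server.
containers:
  - name: stk-server
    image: IMAGE_URI:IMAGE_TAG
    ports:
      - containerPort: 2759      # example server port
    readinessProbe:              # gate traffic until the server is up
      tcpSocket:
        port: 2759
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:               # restart a hung server process
      tcpSocket:
        port: 2759
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
```

A generous liveness `failureThreshold` avoids restarting a server mid-session on a transient stall, while a short readiness delay brings new capacity into rotation quickly.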
Security
Game studios need to move fast and stay secure when releasing games or patches to players. Containerized game servers can be scanned for vulnerabilities in CI/CD pipelines or offline when stored in a container registry. Adding security checks to game server pipelines improves the quality of game releases and mitigates security risks that can expose games to player data exfiltration or DDoS attacks.
Dedicated game servers are standalone TCP or UDP endpoints that are exposed to the internet and are subject to DDoS attacks. The longer a server runs, the greater the risk of the endpoint being scanned and attacked. Therefore, it is recommended to keep the game server running only for the duration of the game session. Kubernetes allows easy game server scheduling upon player demand and also offers hooks that let you dispose of the game server when it is no longer needed.
Finally, Kubernetes supports Role-Based Access Control (RBAC) authorization, which allows game developers to set granular access to game artifacts such as game server deployments and configuration. Game operators can define roles for their teams, such as developers, sysops, and devops, that authorize them to perform specific actions such as setting application secrets. Learn more about EKS security best practices here.
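Such a role might be sketched as follows; the namespace, role, and group names are placeholders, and the verb lists would be narrowed to each team's needs:

```yaml
# Example RBAC Role limiting a developer group to game-server Deployments
# and Secrets in one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: game-servers
  name: game-developer
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "update", "patch"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: game-servers
  name: game-developer-binding
subjects:
  - kind: Group
    name: developers
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: game-developer
  apiGroup: rbac.authorization.k8s.io
```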
Conclusion
In this blog post, we presented the opportunity containers and Kubernetes bring to game server hosting in the cloud. We looked at strategies to containerize game servers and the benefits containers bring to resource consumption. When deployed on Kubernetes, they provide agility, reliability, security, cost, and sustainability benefits to modern game server hosting. In addition, we described the elasticity and resilience improvements game development teams gain from containerized game servers, and explained how teams can use containers to accelerate the adoption of DevOps practices and improve the security and quality of games. Try our code sample to get started!