In "Containerized Unreal Engine Pixel Streaming on g4dn in Practice (Part 1)" we explained how to compile an Unreal Engine 4 project on g4dn, build a Docker image, and deploy the UE4 pixel streaming demo project with docker-compose. In this article we will deploy an elastically scalable pixel streaming platform on Amazon Elastic Kubernetes Service (EKS).
This article covers the following:
- Architecture overview
- Creating the Amazon EKS cluster, configuring worker nodes, and configuring the AWS Load Balancer Controller
- Writing the required YAML files, including the TURN/STUN, Matchmaker, Streamer, and Envoy components
- Testing elastic scaling
1. Architecture overview
Epic Games provides a reference architecture for multi-user/multi-game pixel streaming (see the figure below): a Matchmaker service routes each user request to a different Signaling Server (Web Server). However, the official Matchmaker simply returns the Signaling Server's IP and port and does not provide a unified entry point, so it must be modified to serve all sessions through a single port.
Epic Games' official multi-user/multi-game pixel streaming reference architecture
We restructured this design around the characteristics of Amazon EKS/Kubernetes:
- Two worker node types, CPU and GPU, separated via node affinity and taints/tolerations: CPU nodes run the Matchmaker, STUN/TURN, and Envoy routing services; GPU nodes (g4dn) run the UE4 project/Streamer.
- The platform is exposed through a single Ingress entry point, with Envoy as the intermediate router.
- The Matchmaker is modified so that no port redirection is needed.
The Amazon EKS reference architecture is shown below:
2. Create the Amazon EKS cluster
2.1 Create the Amazon EKS cluster
We will create two worker node groups at the same time: one node group uses m5.large (CPU workloads) and mainly runs the STUN/TURN, Envoy, Matchmaker, and Player services; the other uses g4dn.xlarge and runs the UE4 pixel streaming project, pushing the WebRTC streams.
Cluster configuration file (cluster.yaml); the default Kubernetes version is currently 1.21:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ue4-pixelsteraming-eks
  region: us-east-1
nodeGroups:
  - name: ng-cpu-group01
    instanceType: m5.large
    desiredCapacity: 1
    minSize: 1
    maxSize: 4
    labels:
      app.pixel/turn: "true"
      app.pixel/envoy: "true"
    ssh:
      allow: true
      publicKeyName: <replace with your own EC2 key pair name>
  - name: ng-gpu-group01
    instanceType: g4dn.xlarge
    desiredCapacity: 1
    minSize: 1
    maxSize: 4
    labels:
      app.pixel/streamer: "true"
    taints:
      app.pixel/streamer: "true:NoSchedule"
    ssh:
      allow: true
      publicKeyName: <replace with your own EC2 key pair name>
Create the cluster:
eksctl create cluster -c cluster.yaml
2.2 Deploy the AWS Load Balancer Controller (required for the ALB Ingress)
# Create the IAM role, policy, and service account used by the Ingress
# (this assumes the AWSLoadBalancerControllerIAMPolicy IAM policy has already been created in your account)
eksctl utils associate-iam-oidc-provider --region=us-east-1 --cluster=ue4-pixelsteraming-eks --approve
eksctl create iamserviceaccount \
  --cluster=ue4-pixelsteraming-eks \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --attach-policy-arn=arn:aws:iam::<your 12-digit account ID>:policy/AWSLoadBalancerControllerIAMPolicy \
  --override-existing-serviceaccounts \
  --approve
# Install cert-manager
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.yaml
# Download the ALB controller 2.2.1 installation manifest
curl -OL https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.2.1/docs/install/v2_2_1_full.yaml
Edit v2_2_1_full.yaml and change --cluster-name=your-cluster-name to --cluster-name=ue4-pixelsteraming-eks (the cluster name created above).
After modifying v2_2_1_full.yaml, install the aws-load-balancer-controller:
kubectl apply -f v2_2_1_full.yaml
The aws-load-balancer-controller is deployed automatically into the kube-system namespace:
eks:~/environment/ue4-on-eks/deploy $ kubectl get deploy -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
aws-load-balancer-controller 1/1 1 1 5d19h
3. Write the required YAML files
Based on the architecture designed in section 1, we prepare the required YAML files. I have placed all of them in the deploy directory of the UE4-PixelStreaming-AWS-EKS repository:
git clone https://github.com/stevensu1977/UE4-PixelStreaming-AWS-EKS
3.1 Create the namespace and configure permissions
To obtain the TURN server's public address, we will deploy a pod that runs the kubectl command-line tool, so it needs access to the Kubernetes API server.
Create the namespace:
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app.kubernetes.io/component: unrealengine
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
  name: ue4
Create two service accounts and grant them the proper permissions through a ClusterRole and ClusterRoleBinding.
# service account excerpt
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: unrealengine
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
  name: stream-svc-account
  namespace: ue4
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/component: turn
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
  name: turn-svc-account
  namespace: ue4
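The ClusterRole and ClusterRoleBinding themselves live in the repository's deploy directory. Conceptually, the TURN service account needs read access to cluster resources so the in-cluster kubectl pod can look up the node's public address. The sketch below is illustrative only; the resource names and exact rules are assumptions, so use the manifests from the repository:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: turn-reader            # illustrative name
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: turn-reader-binding    # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: turn-reader
subjects:
  - kind: ServiceAccount
    name: turn-svc-account
    namespace: ue4
```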
3.2 STUN/TURN server
We use coturn as the STUN/TURN server; it relays WebRTC traffic for the Streamers (game instances) running on private network nodes. Using the label "app.pixel/turn=true", coturn is deployed as a DaemonSet onto the designated EC2 nodes.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app.kubernetes.io/component: turn
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
    app: turn
  name: turn
  namespace: ue4
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: turn
      app.kubernetes.io/part-of: ue4-on-eks
      app.kubernetes.io/version: 0.0.1
      app: turn
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
      labels:
        app: turn
        app.kubernetes.io/name: turn
        app.kubernetes.io/part-of: ue4-on-eks
        app.kubernetes.io/version: 0.0.1
        version: 0.0.1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: app.pixel/turn
                    operator: In
                    values:
                      - "true"
      containers:
        - env:
            - name: INTERNAL_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: TURN_PORT
              value: "3478"
            - name: TURN_MIN_PORT
              value: "49152"
            - name: TURN_MAX_PORT
              value: "65535"
            - name: TURN_REALM
              value: app.pixel
            - name: TURN_USER
              valueFrom:
                secretKeyRef:
                  key: username
                  name: turn-secret
            - name: TURN_PASS
              valueFrom:
                secretKeyRef:
                  key: password
                  name: turn-secret
          image: ghcr.io/stevensu1977/ue4-pixelstreaming/turnserver
          imagePullPolicy: Always
          name: turn
          ports:
            - containerPort: 3478
              hostPort: 3478
              name: turn-udp
              protocol: UDP
            - containerPort: 3478
              hostPort: 3478
              name: turn-tcp
              protocol: TCP
      hostNetwork: true
      serviceAccountName: turn-svc-account
      terminationGracePeriodSeconds: 10
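The DaemonSet above reads the TURN credentials from a Secret named turn-secret. If the repository's deploy directory does not already create it, a minimal sketch looks like the following; the username and password values are placeholders you must replace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: turn-secret
  namespace: ue4
type: Opaque
stringData:
  username: <turn-username>   # placeholder
  password: <turn-password>   # placeholder
```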
3.3 matchmaker, player, streamer
The matchmaker is responsible for handing idle streamers to clients; player serves the static web pages/JavaScript; the streamer runs an ARPG demo project provided by Epic Games.
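The core of the matchmaker is a pool of idle streamers that are handed out to connecting players and returned when a session ends. The following Python sketch is not the actual Matchmaker code (which is a Node.js service); it only illustrates the assignment logic described above:

```python
class Matchmaker:
    """Illustrative sketch of idle-streamer assignment (not the real implementation)."""

    def __init__(self):
        self.idle = []       # signaling-server addresses with no connected player
        self.assigned = {}   # player_id -> streamer address

    def register_streamer(self, address):
        # A new streamer replica announces itself and joins the idle pool.
        self.idle.append(address)

    def request_streamer(self, player_id):
        # Hand the first idle streamer to the player, or None if the pool
        # is empty (the browser then shows "Waiting for available streamer").
        if not self.idle:
            return None
        address = self.idle.pop(0)
        self.assigned[player_id] = address
        return address

    def release(self, player_id):
        # When a player disconnects, its streamer returns to the idle pool.
        address = self.assigned.pop(player_id, None)
        if address is not None:
            self.idle.append(address)
```

Scaling the stream deployment simply registers more streamers into this pool, which is why adding replicas immediately unblocks waiting browsers.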
3.4 ingress, envoy-router
We create an ALB Ingress; the AWS Load Balancer Controller provisions the corresponding ALB resource:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  labels:
    app.kubernetes.io/component: unrealengine
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
  namespace: ue4
  name: pixelstreaming-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: envoy-router
                port:
                  number: 80
We also deploy an Envoy gateway that maps the different routes to the player and matchmaker services.
# Envoy YAML excerpt
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: routing
    app.kubernetes.io/part-of: ue4-on-eks
    app.kubernetes.io/version: 0.0.1
    app: envoy-router
  name: envoy-router
  namespace: ue4
spec:
  selector:
    matchLabels:
      app: envoy-router
      app.kubernetes.io/component: routing
      app.kubernetes.io/part-of: ue4-on-eks
      app.kubernetes.io/version: 0.0.1
  template:
    metadata:
      labels:
        app: envoy-router
        app.kubernetes.io/component: routing
        app.kubernetes.io/part-of: ue4-on-eks
        app.kubernetes.io/version: 0.0.1
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: app.pixel/envoy
                    operator: In
                    values:
                      - "true"
      containers:
        - image: envoyproxy/envoy:v1.21.1
          imagePullPolicy: IfNotPresent
          name: envoy-router
          ports:
            - containerPort: 11000
              name: http
            - containerPort: 12000
              name: api
          resources:
            limits:
              cpu: 200m
              memory: 128Mi
            requests:
              cpu: 100m
              memory: 64Mi
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
                - CHOWN
                - SETGID
                - SETUID
              drop:
                - all
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 10001
          volumeMounts:
            - mountPath: /etc/envoy
              name: config
      volumes:
        - configMap:
            name: envoy-routing-config
          name: config
# Envoy routing rules
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 11000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                upgrade_configs:
                  - upgrade_type: websocket
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/matchmaker" }
                          route:
                            cluster: service_matchmaker
                        - match: { prefix: "/ws" }
                          route:
                            cluster: service_matchmaker
                        - match: { prefix: "/" }
                          route:
                            cluster: service_player
                http_filters:
                  - name: envoy.filters.http.router
  clusters:
    - name: service_matchmaker
      connect_timeout: 1s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: service_matchmaker
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: matchmaker
                      port_value: 3000
    - name: service_player
      connect_timeout: 1s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: service_player
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: player
                      port_value: 80
4. Deployment and testing
All of the YAML files are in the deploy directory, so we can apply them directly:
kubectl apply -f ./deploy
4.1 Verify the deployment
Check the Ingress and Services:
kubectl get ingress -n ue4
kubectl get svc -n ue4
Check that the application pods are running.
4.2 Test the demo project
Get the access address:
eks:~/environment/ue4-on-eks $ kubectl get ingress -n ue4 -o json | jq .items[].status.loadBalancer.ingress[].hostname
# Sample output
"k8s-ue4-pixelstr-1111111111111-1111111111111.us-east-1.elb.amazonaws.com"
4.3 Elastic scaling test
The stream deployment has only one replica by default. Once one browser is connected through the Ingress, the matchmaker hands the currently idle game instance to that client to establish a WebRTC connection; when we open a second browser (Firefox) against the same address, it shows "Waiting for available streamer".
At this point we only need to increase the number of stream replicas: the matchmaker pushes the new game connection information over WebSocket, and the Firefox tab changes from "Waiting for available streamer" to a connected game session.
kubectl get deploy stream -n ue4
kubectl scale deploy stream -n ue4 --replicas=2
Now Firefox and Chrome use the same ALB address but are connected to two independent UE4 game instances, each playing its own session. This shows that scaling the replica count elastically scales the demo game, and we can keep increasing the stream replica count to support more concurrent clients.
Interested readers can also explore combining this with the Kubernetes HPA component to autoscale UE4 pixel streaming automatically.
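As a starting point for such experiments, the sketch below shows a CPU-based HorizontalPodAutoscaler for the stream deployment. It is an assumption, not part of the repository: it requires the Metrics Server to be installed and the stream pods to declare CPU requests, the name stream-hpa is illustrative, and the autoscaling/v2beta2 API matches the Kubernetes 1.21 cluster created earlier:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: stream-hpa            # illustrative name
  namespace: ue4
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stream
  minReplicas: 1
  maxReplicas: 4              # bounded by the g4dn node group's maxSize
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
```

Note that new replicas can only be scheduled while the g4dn node group has capacity; pairing the HPA with the Cluster Autoscaler lets the GPU node group grow as well.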
5. Summary
This article walked through packaging and deploying an Unreal Engine 4 game on Amazon Elastic Kubernetes Service (EKS), using built-in Kubernetes features and Amazon EC2 g4dn instances to build a multi-user, elastically scalable UE4 pixel streaming platform.