The Next-Generation Kubernetes L7 Traffic Entry Point: A First Look at Envoy Gateway, a Kubernetes API Gateway

What Is Envoy Gateway

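Envoy Gateway is an open source project for managing Envoy Proxy, either as a standalone gateway or as the gateway for applications running in Kubernetes. It passes the Gateway API core conformance tests and uses Gateway API as its only configuration language for managing Envoy proxies, with support for resources such as GatewayClass, Gateway, HTTPRoute, and GRPCRoute.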

Community Direction

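As the community has grown, there are now many ways to expose a service. Nginx is still the mainstream choice, but its limited feature set is increasingly unable to meet cloud-native requirements, and teams need a more capable gateway to govern the north-south traffic of their microservices.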

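Approach	Controller	Features
NodePort/LoadBalancer	Kubernetes	Load balancing
Ingress	Ingress Controller	Load balancing, TLS, virtual hosts, traffic routing
Istio Gateway	Istio	Load balancing, TLS, virtual hosts, advanced traffic routing, other advanced Istio features
API Gateway	API Gateway	Load balancing, TLS, virtual hosts, traffic routing, API lifecycle management, authentication and authorization, data aggregation, billing, and rate limiting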

Integrating Envoy Gateway with EKS

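Envoy Gateway acts as a Layer 7 proxy for services and runs in EKS as a Kubernetes Service. Envoy Gateway itself is exposed externally through an AWS NLB. In this post, we show how to install Envoy Gateway and integrate it with EKS, and then verify its rule-based routing, rate limiting, canary release, authentication, security, and observability features.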

Installation Guide

1. Deploy EKS

eksctl create cluster \
--name envoy-gateway-demo \
--version 1.27 \
--region ap-southeast-1 \
--nodegroup-name workernode \
--node-type t3.medium \
--nodes 2 \
--nodes-min 2 \
--nodes-max 4 \
--ssh-access \
--ssh-public-key Envoygateway \
--managed

2. Install the latest version of Helm

wget https://get.helm.sh/helm-v3.13.1-linux-amd64.tar.gz
tar -zxvf helm-v3.13.1-linux-amd64.tar.gz
mv linux-amd64/helm /usr/local/bin/helm

3. Install Envoy Gateway (EG) with Helm

helm install eg oci://docker.io/envoyproxy/gateway-helm --version v0.5.0 -n envoy-gateway-system --create-namespace
 
Pulled: docker.io/envoyproxy/gateway-helm:v0.5.0
Digest: sha256:e7900970b1cb20f57d896ac4d72dfadb50eade130cec7c0440f6ddbfcf94e66e
NAME: eg
LAST DEPLOYED: Sat Oct 14 15:46:05 2023
NAMESPACE: envoy-gateway-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
**************************************************************************
*** PLEASE BE PATIENT: Envoy Gateway may take a few minutes to install ***
**************************************************************************
 
Envoy Gateway is an open source project for managing Envoy Proxy as a standalone or Kubernetes-based application gateway.
 
Thank you for installing Envoy Gateway!
 
 Your release is named: eg. 
 
  Your release is in namespace: envoy-gateway-system.
 
 To learn more about the release, try:
 
	$ helm status eg -n envoy-gateway-sys
  $ helm get all eg -n envoy-gateway-sys
  To have a quickstart of Envoy Gateway, please refer to https://gateway.envoyproxy.io/latest/user/quickstart.html
 
To get more details, please visit https://gateway.envoyproxy.io and https://github.com/envoyproxy/gateway.

4. Install the other required CRDs

kubectl apply -f https://github.com/envoyproxy/gateway/releases/download/v0.5.0/install.yaml

5. Replace the CLB with an NLB

We expose Envoy Gateway through an NLB. To expose a Service of type LoadBalancer this way, first install aws-load-balancer-controller. Start by creating the IAM policy that aws-load-balancer-controller requires:

aws iam create-policy --policy-name AWSLoadBalancerControllerIAMPolicy-Envoy \
  --policy-document file://<(curl -s https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.6.0/docs/install/iam_policy.json)

Use eksctl to create an IRSA mapping for aws-load-balancer-controller so that its ServiceAccount has sufficient permissions to create NLB resources:

export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)

eksctl create iamserviceaccount \
  --cluster=envoy-gateway-demo \
  --region=ap-southeast-1 \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --role-name AmazonEKSLoadBalancerControllerRole-Envoy \
  --attach-policy-arn=arn:aws:iam::${AWS_ACCOUNT_ID}:policy/AWSLoadBalancerControllerIAMPolicy-Envoy \
  --approve

Add the Helm chart repository for aws-load-balancer-controller:

helm repo add eks https://aws.github.io/eks-charts
helm repo update eks

Install aws-load-balancer-controller and bind it to the aws-load-balancer-controller ServiceAccount:

helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=envoy-gateway-demo \
  --set serviceAccount.create=false \
  --set serviceAccount.name=aws-load-balancer-controller

Check the installation status of aws-load-balancer-controller:

kubectl get deployments.apps -n kube-system aws-load-balancer-controller 
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
aws-load-balancer-controller   2/2     2            2           47m

Feature Verification

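The Kubernetes Gateway API involves resources such as GatewayClass, Gateway, HTTPRoute, TCPRoute, and Service. In the scenarios below, we create the relevant resources step by step, from the top of that hierarchy down: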

1. Rule-based routing

Envoy Gateway can route requests to backend Services in EKS based on host, header, and path. In the following steps, we create a GatewayClass (example-gateway-class), a Gateway (example-gateway), and three HTTPRoutes (example-route, foo-route, and bar-route) that route by host, path, and header respectively.

1.1 Set up the environment

kubectl apply -f https://raw.githubusercontent.com/envoyproxy/gateway/v0.5.0/examples/kubernetes/http-routing.yaml

gatewayclass.gateway.networking.k8s.io/example-gateway-class created
gateway.gateway.networking.k8s.io/example-gateway created
service/example-svc created
deployment.apps/example-backend created
httproute.gateway.networking.k8s.io/example-route created
service/foo-svc created
deployment.apps/foo-backend created
httproute.gateway.networking.k8s.io/foo-route created
service/bar-svc created
deployment.apps/bar-backend created
service/bar-canary-svc created
deployment.apps/bar-canary-backend created
httproute.gateway.networking.k8s.io/bar-route created

Check the status of the GatewayClass and the Gateway, and get the load balancer address:

kubectl get gc --selector=example=http-routing
 
NAME                    CONTROLLER                                      ACCEPTED   AGE
example-gateway-class   gateway.envoyproxy.io/gatewayclass-controller   True       3m3s

kubectl get gateways --selector=example=http-routing
NAME              CLASS                   ADDRESS                                                                       PROGRAMMED   AGE
example-gateway   example-gateway-class   a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com   True         4m8s

Check the status of the HTTPRoutes

kubectl get httproutes --selector=example=http-routing -o yaml
 
apiVersion: v1
items:
- apiVersion: gateway.networking.k8s.io/v1beta1
  kind: HTTPRoute
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"gateway.networking.k8s.io/v1beta1","kind":"HTTPRoute","metadata":{"annotations":{},"labels":{"example":"http-routing"},"name":"bar-route","namespace":"default"},"spec":{"hostnames":["bar.example.com"],"parentRefs":[{"name":"example-gateway"}],"rules":[{"backendRefs":[{"name":"bar-canary-svc","port":8080}],"matches":[{"headers":[{"name":"env","type":"Exact","value":"canary"}]}]},{"backendRefs":[{"name":"bar-svc","port":8080}]}]}}
    creationTimestamp: "2023-10-15T07:42:05Z"
    generation: 1
    labels:
      example: http-routing
    name: bar-route
    namespace: default
    resourceVersion: "218855"
    uid: 942917bb-defd-4099-936f-6a14c5bd2a19
  spec:
    hostnames:
    - bar.example.com
    parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: example-gateway
    rules:
    - backendRefs:
      - group: ""
        kind: Service
        name: bar-canary-svc
        port: 8080
        weight: 1
      matches:
      - headers:
        - name: env
          type: Exact
          value: canary
        path:
          type: PathPrefix
          value: /
    - backendRefs:
      - group: ""
        kind: Service
        name: bar-svc
        port: 8080
        weight: 1
      matches:
      - path:
          type: PathPrefix
          value: /
  status:
    parents:
    - conditions:
      - lastTransitionTime: "2023-10-15T07:42:05Z"
        message: Route is accepted
        observedGeneration: 1
        reason: Accepted
        status: "True"
        type: Accepted
      - lastTransitionTime: "2023-10-15T07:42:05Z"
        message: Resolved all the Object references for the Route
        observedGeneration: 1
        reason: ResolvedRefs
        status: "True"
        type: ResolvedRefs
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: example-gateway
- apiVersion: gateway.networking.k8s.io/v1beta1
  kind: HTTPRoute
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"gateway.networking.k8s.io/v1beta1","kind":"HTTPRoute","metadata":{"annotations":{},"labels":{"example":"http-routing"},"name":"example-route","namespace":"default"},"spec":{"hostnames":["example.com"],"parentRefs":[{"name":"example-gateway"}],"rules":[{"backendRefs":[{"name":"example-svc","port":8080}]}]}}
    creationTimestamp: "2023-10-15T07:42:01Z"
    generation: 1
    labels:
      example: http-routing
    name: example-route
    namespace: default
    resourceVersion: "218764"
    uid: b4239e9d-45d3-44d2-b261-0e21f57c78e8
  spec:
    hostnames:
    - example.com
    parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: example-gateway
    rules:
    - backendRefs:
      - group: ""
        kind: Service
        name: example-svc
        port: 8080
        weight: 1
      matches:
      - path:
          type: PathPrefix
          value: /
  status:
    parents:
    - conditions:
      - lastTransitionTime: "2023-10-15T07:42:01Z"
        message: Route is accepted
        observedGeneration: 1
        reason: Accepted
        status: "True"
        type: Accepted
      - lastTransitionTime: "2023-10-15T07:42:01Z"
        message: Resolved all the Object references for the Route
        observedGeneration: 1
        reason: ResolvedRefs
        status: "True"
        type: ResolvedRefs
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: example-gateway
- apiVersion: gateway.networking.k8s.io/v1beta1
  kind: HTTPRoute
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"gateway.networking.k8s.io/v1beta1","kind":"HTTPRoute","metadata":{"annotations":{},"labels":{"example":"http-routing"},"name":"foo-route","namespace":"default"},"spec":{"hostnames":["foo.example.com"],"parentRefs":[{"name":"example-gateway"}],"rules":[{"backendRefs":[{"name":"foo-svc","port":8080}],"matches":[{"path":{"type":"PathPrefix","value":"/login"}}]}]}}
    creationTimestamp: "2023-10-15T07:42:02Z"
    generation: 1
    labels:
      example: http-routing
    name: foo-route
    namespace: default
    resourceVersion: "218796"
    uid: f61fa7a8-a18e-4045-a22b-5a94a963a1ac
  spec:
    hostnames:
    - foo.example.com
    parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: example-gateway
    rules:
    - backendRefs:
      - group: ""
        kind: Service
        name: foo-svc
        port: 8080
        weight: 1
      matches:
      - path:
          type: PathPrefix
          value: /login
  status:
    parents:
    - conditions:
      - lastTransitionTime: "2023-10-15T07:42:02Z"
        message: Route is accepted
        observedGeneration: 1
        reason: Accepted
        status: "True"
        type: Accepted
      - lastTransitionTime: "2023-10-15T07:42:02Z"
        message: Resolved all the Object references for the Route
        observedGeneration: 1
        reason: ResolvedRefs
        status: "True"
        type: ResolvedRefs
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: example-gateway
kind: List
metadata:
  resourceVersion: ""

Verify the test configuration

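1) Test host-based routing for "example.com". The request should return status code 200, and the body should contain "pod": "example-backend-*", indicating that the traffic was routed to the example backend service.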

curl -vvv --header "Host: example.com" "http://a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com/"
*   Trying 18.140.155.10:80...
* Connected to a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com (18.140.155.10) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.85.0
> Accept: */*
< HTTP/1.1 200 OK
... (output omitted)

 "pod": "example-backend-69b78f679c-vh4nj"
* Connection #0 to host a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com left intact
}%

The response shows that the serving pod is example-backend-69b78f679c-vh4nj.

2) Next, test path-based routing: requests to foo.example.com/login/* are forwarded to the foo backend.

curl -vvv --header "Host: foo.example.com" "http://a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com/login"
*   Trying 18.140.155.10:80...
* Connected to a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com (18.140.155.10) port 80 (#0)
> GET /login HTTP/1.1
> Host: foo.example.com
> User-Agent: curl/7.85.0
> Accept: */*
< HTTP/1.1 200 OK
... (output omitted)

 "pod": "foo-backend-5cdf894969-8l472"
* Connection #0 to host a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com left intact
}%

As the curl output shows, the HTTP request was forwarded to the foo backend pod (foo-backend-5cdf894969-8l472).

3) Test routing for requests with Host bar.example.com and the header env: canary.

curl -vvv --header "Host: bar.example.com" --header "env: canary" "http://a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com/"
*   Trying 18.140.155.10:80...
* Connected to a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com (18.140.155.10) port 80 (#0)
> GET / HTTP/1.1
> Host: bar.example.com
> User-Agent: curl/7.85.0
> Accept: */*
< HTTP/1.1 200 OK
... (output omitted)

 "pod": "bar-canary-backend-5488694869-68bxw"
* Connection #0 to host a87b9dc7b4a5b4266b5675af3b1788c7-411544734.ap-southeast-1.elb.amazonaws.com left intact
}%

As the curl output shows, the request returns status code 200 and the traffic is sent to "pod": "bar-canary-backend-5488694869-68bxw".
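
The three checks above can be summarized in a small Python sketch of how a request is mapped to a backend Service. It is only an illustration under simplifying assumptions: the `Rule` class and `pick_backend` function are hypothetical names, rules are tried in the order listed, and header names are compared case-sensitively, whereas the Gateway API defines its own precedence rules.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Rule:
    """One HTTPRoute rule: optional matches plus the Service it forwards to."""
    backend: str
    path_prefix: str = "/"
    headers: Dict[str, str] = field(default_factory=dict)


# Hostname -> rules, copied from the three HTTPRoutes listed above.
ROUTES: Dict[str, List[Rule]] = {
    "example.com": [Rule("example-svc")],
    "foo.example.com": [Rule("foo-svc", path_prefix="/login")],
    "bar.example.com": [
        Rule("bar-canary-svc", headers={"env": "canary"}),
        Rule("bar-svc"),
    ],
}


def path_has_prefix(path: str, prefix: str) -> bool:
    """PathPrefix matches whole segments: /login matches /login/x, not /loginx."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def pick_backend(host: str, path: str,
                 headers: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the Service name for a request, or None if no rule matches."""
    headers = headers or {}
    for rule in ROUTES.get(host, []):
        if not path_has_prefix(path, rule.path_prefix):
            continue
        # Every header listed in the rule must be present with the exact value.
        if all(headers.get(name) == value for name, value in rule.headers.items()):
            return rule.backend
    return None
```

For example, `pick_backend("bar.example.com", "/", {"env": "canary"})` selects bar-canary-svc, while the same request without the header falls through to bar-svc.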

2. Rate limiting

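Consider using rate limiting in the following scenarios:

Preventing attacks such as DDoS
Preventing applications and databases from being overloaded
Protecting APIs based on user entitlements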

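Envoy Gateway supports global rate limiting, which applies one shared limit across all Envoy proxies. For example, with two Envoy pods and a limit of 10 requests/second, the limit is shared: it is reached when, say, 5 requests pass through each pod within the same second. For finer-grained control, use the rate limit CRD (RateLimitFilter in v0.5.0; the examples below use BackendTrafficPolicy, which replaces it from v0.6.0), for example to limit per user based on a header, by client IP, or by JWT claims.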
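
The idea of one limit shared by several proxies can be pictured with a short Python simulation. This is a sketch under stated assumptions, not how Envoy Gateway is implemented: `SharedCounter` is an in-memory stand-in for the shared cache, `EnvoyReplica` is a hypothetical class, and a single fixed time window is assumed.

```python
from typing import Dict, List


class SharedCounter:
    """In-memory stand-in for the shared cache that stores request counts."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class EnvoyReplica:
    """Hypothetical proxy replica that consults the shared counter per request."""

    def __init__(self, counter: SharedCounter, limit: int) -> None:
        self.counter = counter
        self.limit = limit

    def handle(self, window: str) -> int:
        """Return 200 while the shared count is within the limit, else 429."""
        return 200 if self.counter.increment(window) <= self.limit else 429


def simulate(total_requests: int, limit: int = 10) -> List[int]:
    """Alternate requests between two replicas inside one time window."""
    counter = SharedCounter()
    replicas = [EnvoyReplica(counter, limit), EnvoyReplica(counter, limit)]
    return [replicas[i % 2].handle("window-0") for i in range(total_requests)]
```

With `simulate(12)`, each replica serves 5 successful requests before the shared limit of 10 is exhausted, and the remaining 2 requests are rejected regardless of which replica receives them.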

2.1 Installation and deployment

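Global rate limiting is configured through a ConfigMap and depends on a Redis instance as the cache layer that stores request counts. We first install Redis and then enable rate limiting in the ConfigMap.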

1) Install a Redis instance

cat <<EOF | kubectl apply -f -
kind: Namespace
apiVersion: v1
metadata:
  name: redis-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - image: redis:6.0.6
        imagePullPolicy: IfNotPresent
        name: redis
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: redis-system
  labels:
    app: redis
  annotations:
spec:
  ports:
  - name: redis
    port: 6379
    protocol: TCP
    targetPort: 6379
  selector:
    app: redis
---
 
EOF

2) Configure the ConfigMap to enable rate limiting

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoy-gateway-config
  namespace: envoy-gateway-system
data:
  envoy-gateway.yaml: |
    apiVersion: gateway.envoyproxy.io/v1alpha1
    kind: EnvoyGateway
    provider:
      type: Kubernetes
    gateway:
      controllerName: gateway.envoyproxy.io/gatewayclass-controller
    rateLimit:
      backend:
        type: Redis
        redis:
          url: redis.redis-system.svc.cluster.local:6379
EOF

After updating the ConfigMap, restart the envoy-gateway Deployment with the following command so that the configuration takes effect.

kubectl rollout restart deployment envoy-gateway -n envoy-gateway-system
 
deployment.apps/envoy-gateway restarted

2.2 Scenario tests

Test scenario 1: rate limit a specific user

Here we limit the request rate of a specific user, identified by a custom header named x-user-id with the value one. Deploy the rate limit policy and the HTTPRoute as follows.

cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: policy-httproute
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: http-ratelimit
    namespace: default
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - name: x-user-id
            value: one
        limit:
          requests: 3
          unit: Hour
EOF

Create the HTTPRoute

cat <<EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: http-ratelimit
spec:
  parentRefs:
  - name: eg
  hostnames:
  - ratelimit.example
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
EOF

Verify the result

First get the address of the Envoy Gateway with the following command:

export GATEWAY_HOST=$(kubectl get gateway/eg -o jsonpath='{.status.addresses[0].value}')

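Test description: we send 4 requests to ratelimit.example/get with the header x-user-id: one. Because the limit is set to 3 requests per hour for requests carrying this header and value, the first 3 requests should return status code 200 from the example gateway and the fourth should return 429. The output looks like this: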

for i in {1..4}; do curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://${GATEWAY_HOST}/get ; sleep 1; done
 
HTTP/1.1 200 OK
content-type: application/json
x-content-type-options: nosniff
date: Tue, 07 Nov 2023 08:53:04 GMT
content-length: 547
x-envoy-upstream-service-time: 0
x-ratelimit-limit: 3, 3;w=3600
x-ratelimit-remaining: 2
x-ratelimit-reset: 416
server: envoy
 
HTTP/1.1 200 OK
... (omitted)
 
HTTP/1.1 200 OK
... (omitted)
 
HTTP/1.1 429 Too Many Requests
... (omitted)
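
The x-ratelimit-* response headers in the first response above can be read programmatically. The following Python sketch parses them, assuming the layout seen in that output: the current limit, followed by a policy such as 3;w=3600 (quota and window length in seconds), plus the remaining quota and the seconds until the window resets. `parse_ratelimit_headers` is a hypothetical helper name.

```python
from typing import Dict


def parse_ratelimit_headers(headers: Dict[str, str]) -> Dict[str, int]:
    """Extract limit, window, remaining and reset values from response headers."""
    limit_field = headers["x-ratelimit-limit"]  # e.g. "3, 3;w=3600"
    current, _, policy = limit_field.partition(",")
    result = {
        "limit": int(current.strip()),
        "remaining": int(headers["x-ratelimit-remaining"]),
        "reset_seconds": int(headers["x-ratelimit-reset"]),
    }
    # The policy part looks like "3;w=3600": quota, then parameters such as
    # the window length in seconds.
    for param in policy.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name == "w":
            result["window_seconds"] = int(value)
    return result
```

Applied to the headers of the first response, this yields a limit of 3 per 3600-second window, with 2 requests remaining and 416 seconds until the window resets.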

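This time we set the x-user-id header to a different value (two) and send the requests again. All of them return status code 200, because requests with this header value are not subject to the limit.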

for i in {1..4}; do curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://${GATEWAY_HOST}/get ; sleep 1; done
 
HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 200 OK

Test scenario 2: rate limit distinct users

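We make a small change to the previous configuration: set the header type to Distinct and provide the header name x-user-id without a value. Now each distinct value of x-user-id gets its own limit, so requests are rate limited whether the value is one or two.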

cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy 
metadata:
  name: policy-httproute
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: http-ratelimit
    namespace: default
  rateLimit:
    type: Global
    global:
      rules:
      - clientSelectors:
        - headers:
          - type: Distinct
            name: x-user-id
        limit:
          requests: 3
          unit: Hour
EOF

Create the HTTPRoute

cat <<EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: http-ratelimit
spec:
  parentRefs:
  - name: eg
  hostnames:
  - ratelimit.example 
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
EOF

We send requests with the x-user-id header; they are limited whether the value is one or two.

for i in {1..4}; do curl -I --header "Host: ratelimit.example" --header "x-user-id: one" http://${GATEWAY_HOST}/get ; sleep 1; done

or

for i in {1..4}; do curl -I --header "Host: ratelimit.example" --header "x-user-id: two" http://${GATEWAY_HOST}/get ; sleep 1; done

# Output

HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 429 Too Many Requests

Test scenario 3: rate limit all requests

The following test shows how to limit all requests that match the HTTPRoute rule to 3 requests/hour by leaving the clientSelectors field unset.

cat <<EOF | kubectl apply -f -
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: policy-httproute
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: http-ratelimit
    namespace: default
  rateLimit:
    type: Global
    global:
      rules:
      - limit:
          requests: 3
          unit: Hour
EOF

Create the HTTPRoute

cat <<EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: http-ratelimit
spec:
  parentRefs:
  - name: eg
  hostnames:
  - ratelimit.example
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
EOF

We access the sample application without setting any header. The first 3 requests return 200, and the fourth returns 429.

for i in {1..4}; do curl -I --header "Host: ratelimit.example" http://${GATEWAY_HOST}/get ; sleep 1; done
 
HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 200 OK
HTTP/1.1 429 Too Many Requests
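
The three scenarios differ only in which counter a request is charged against. The Python sketch below replays them with an in-memory counter. It is an illustration, not Envoy Gateway code: `bucket_key` and `replay` are hypothetical names, and all requests are assumed to fall within the same one-hour window.

```python
from typing import Dict, List, Optional

LIMIT = 3  # requests per hour, as in the three policies above


def bucket_key(scenario: str, headers: Dict[str, str]) -> Optional[str]:
    """Map a request to the counter it is charged against (None = not limited)."""
    user = headers.get("x-user-id")
    if scenario == "exact":      # scenario 1: only x-user-id: one is limited
        return "user:one" if user == "one" else None
    if scenario == "distinct":   # scenario 2: one counter per distinct value
        return f"user:{user}" if user is not None else None
    if scenario == "all":        # scenario 3: no clientSelectors, one counter
        return "all"
    raise ValueError(f"unknown scenario: {scenario}")


def replay(scenario: str, requests: List[Dict[str, str]]) -> List[int]:
    """Return the status code each request would receive, in order."""
    counts: Dict[str, int] = {}
    statuses: List[int] = []
    for headers in requests:
        key = bucket_key(scenario, headers)
        if key is None:
            statuses.append(200)
            continue
        counts[key] = counts.get(key, 0) + 1
        statuses.append(200 if counts[key] <= LIMIT else 429)
    return statuses
```

Replaying four requests with x-user-id: one under "exact" gives three 200 responses and one 429, while four requests with x-user-id: two all succeed. Under "distinct", each of the two users gets its own quota of 3.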

3. Canary release

We can implement a canary release by attaching multiple backendRefs to one HTTPRoute, and we can assign different weights to the backendRefs to split traffic proportionally.

1) Deploy a new backend-2 service

cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: backend-2
---
apiVersion: v1
kind: Service
metadata:
  name: backend-2
  labels:
    app: backend-2
    service: backend-2
spec:
  ports:
    - name: http
      port: 3000
      targetPort: 3000
  selector:
    app: backend-2
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend-2
      version: v1
  template:
    metadata:
      labels:
        app: backend-2
        version: v1
    spec:
      serviceAccountName: backend-2
      containers:
        - image: gcr.io/k8s-staging-ingressconformance/echoserver:v20221109-7ee2f3e
          imagePullPolicy: IfNotPresent
          name: backend-2
          ports:
            - containerPort: 3000
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
EOF

2) Check the backend and backend-2 services

[ec2-user@ip-172-31-4-195 ~]$ kubectl get svc
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
backend      ClusterIP   10.100.240.255   <none>        3000/TCP   9d
backend-2    ClusterIP   10.100.117.200   <none>        3000/TCP   4m39s

3) Create an HTTPRoute with multiple backendRefs

cat <<EOF | kubectl apply -f -
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: http-headers
spec:
  parentRefs:
  - name: eg
  hostnames:
  - backends.example
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
    - group: ""
      kind: Service
      name: backend-2
      port: 3000
EOF

4) Check the HTTPRoute status

[ec2-user@ip-172-31-4-195 ~]$ kubectl get httproute
NAME           HOSTNAMES              AGE
backend        ["www.example.com"]    3d
http-headers   ["backends.example"]   7s

5) Test

By default, every backendRef has a weight of 1, so traffic is split evenly.

[ec2-user@ip-172-31-4-195 ~]$ curl  --header "Host: backends.example" -s "http://${GATEWAY_HOST}/bar" | grep 'pod'
 "pod": "backend-2-677948d499-xhz4g"
[ec2-user@ip-172-31-4-195 ~]$ curl  --header "Host: backends.example" -s "http://${GATEWAY_HOST}/bar" | grep 'pod'
 "pod": "backend-5cbb9cd947-wjrkz"

6) Canary

To shift traffic for a canary release, edit the HTTPRoute and adjust the ratio of the weight values.

kubectl edit httproute http-headers

For example, if the backend service has a weight of 1 and the backend-2 service has a weight of 2, traffic is split between them at a ratio of 1:2.
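
The arithmetic behind the weights is simple: each backend receives its weight divided by the sum of all weights. A minimal Python sketch, where `traffic_share` is a hypothetical helper name:

```python
from fractions import Fraction
from typing import Dict


def traffic_share(weights: Dict[str, int]) -> Dict[str, Fraction]:
    """Return the expected fraction of requests each backend receives."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    return {name: Fraction(weight, total) for name, weight in weights.items()}
```

With weights of 1 for backend and 2 for backend-2, backend is expected to receive 1/3 of the requests and backend-2 the remaining 2/3. Over a small number of requests the observed split can deviate from these expected values.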

4. Authentication

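Envoy Gateway can authenticate HTTP requests: when the gateway receives a request, it first checks whether the request carries a valid JWT. Currently the JWT can only be validated from an HTTP header. We continue to use the backend microservice from quickstart.yaml for verification.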

1) Create an AuthenticationFilter and associate it with an HTTPRoute

kubectl apply -f https://raw.githubusercontent.com/envoyproxy/gateway/v0.5.0/examples/kubernetes/authn/jwt.yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: AuthenticationFilter
metadata:
  name: jwt-example
spec:
  type: JWT
  jwtProviders:
  - name: example
    remoteJWKS:
      uri: https://raw.githubusercontent.com/envoyproxy/gateway/main/examples/kubernetes/authn/jwks.json
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: backend
spec:
  parentRefs:
  - name: eg
  hostnames:
  - "www.example.com"
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
      weight: 1
    filters:
    - extensionRef:
        group: gateway.envoyproxy.io
        kind: AuthenticationFilter
        name: jwt-example
      type: ExtensionRef
    matches:
    - path:
        type: PathPrefix
        value: /foo
  - backendRefs:
    - group: ""
      kind: Service
      name: backend
      port: 3000
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /bar

After this is applied, /bar is accessible without authentication, while /foo requires JWT authentication.

2) Prepare for verification

First find the Service that belongs to the Envoy Gateway:

export ENVOY_SERVICE=$(kubectl get svc -n envoy-gateway-system --selector=gateway.envoyproxy.io/owning-gateway-namespace=default,gateway.envoyproxy.io/owning-gateway-name=eg -o jsonpath='{.items[0].metadata.name}')

Then find the corresponding NLB address:

export GATEWAY_HOST=$(kubectl get svc/${ENVOY_SERVICE} -n envoy-gateway-system -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

[ec2-user@ip-172-31-4-195 ~]$ echo $GATEWAY_HOST
k8s-envoygat-envoydef-0d595b4a4b-b9e0b7e9965ca737.elb.ap-southeast-1.amazonaws.com

3) Verify that accessing /foo without a JWT fails

[ec2-user@ip-172-31-4-195 ~]$ curl http://$GATEWAY_HOST/foo -sS -o /dev/null -H "Host: www.example.com" -w "%{http_code}\n"
401

4) Verify that accessing /bar without a JWT succeeds

[ec2-user@ip-172-31-4-195 ~]$ curl http://$GATEWAY_HOST/bar -sS -o /dev/null -H "Host: www.example.com" -w "%{http_code}\n"
200
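
When a protected path such as /foo is called with a token, it can help to look at what the token actually contains. The Python sketch below assumes the JWT is sent as a bearer token in the Authorization header and decodes its payload for inspection only. It does not verify the signature, which the gateway does against the configured JWKS; `bearer_token` and `unverified_claims` are hypothetical helper names.

```python
import base64
import json
from typing import Dict, Optional


def bearer_token(headers: Dict[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer TOKEN' header, if any."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def unverified_claims(token: str) -> dict:
    """Decode the JWT payload for inspection only; the signature is NOT checked."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(padded))
```

A request without an Authorization header yields `None` from `bearer_token`, which corresponds to the 401 case for /foo shown above.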

5. Security

NLB TLS Termination

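In production, external traffic usually has to be encrypted with a TLS certificate, and services are exposed over HTTPS. This section shows how to terminate TLS on the NLB and register the Envoy Gateway pods as load balancer targets of the NLB.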

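Normally, NLB TLS termination can be configured directly on a Kubernetes Service. However, the Envoy Gateway CRDs do not yet support such detailed Service configuration, so we create the NLB and the target group manually and then use a TargetGroupBinding to associate the Kubernetes Service with the ELB target group.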

Create the NLB

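A load balancer that serves external traffic must be created in a public subnet. We recommend enabling Cross-zone load balancing and Deletion protection. The NLB can use a certificate from AWS Certificate Manager (ACM) directly.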

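Note that NLB now supports security groups, which require some configuration. Here the NLB is associated with two security groups. The first, EnvoyGateway, controls traffic from the internet to the NLB and only needs to allow HTTPS. The second, k8s-traffic-poc, keeps the default rule that allows all outbound traffic and mainly serves as the source referenced by the inbound rule of the EKS cluster security group.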

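Find the security group of the EKS cluster and add an inbound rule for TCP port 10080 (the HTTP listener port of the Envoy Gateway pods) with k8s-traffic-poc as the source. This allows the NLB to reach the listener port of the Envoy Gateway pods directly.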

Create the target group

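The Envoy Gateway pods listen for HTTP on port 10080 by default. Because the NLB does no L7 processing of internet requests, set the target group protocol and port to TCP:10080; the health check below uses the same port.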

Set the health check port to Traffic port.

We recommend enabling Stickiness and Preserve client IP addresses.
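
A TCP health check on the traffic port essentially asks whether a TCP connection to the target can be established. When troubleshooting unhealthy targets, the same check can be reproduced from a host that can reach the pod IPs. The Python sketch below is a minimal version; `tcp_port_open` is a hypothetical helper name, and the host and port passed to it are up to you.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, calling `tcp_port_open(pod_ip, 10080)` with the IP of an Envoy Gateway pod should return True once the security group rule described above is in place.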

TargetGroupBinding

Official documentation: TargetGroupBinding - AWS Load Balancer Controller

SERVICE_NAME=$(kubectl get svc -n envoy-gateway-system \
  --selector gateway.envoyproxy.io/owning-gateway-name=apps,gateway.envoyproxy.io/owning-gateway-namespace=default \
  -o jsonpath='{.items[0].metadata.name}')

cat > targetgroupbinding.yaml << EOF
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: envoyproxy
  namespace: envoy-gateway-system
spec:
  serviceRef:
    name: $SERVICE_NAME # route traffic to the service
    port: 80
  targetGroupARN: "arn:aws:elasticloadbalancing:ap-northeast-1:865557051408:targetgroup/EnvoyTargetgroup/118b1ae65b97955e"
EOF

kubectl apply -f targetgroupbinding.yaml
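
The targetGroupARN must point to the target group you created, in the same region as the cluster. Note that the sample ARN above is in ap-northeast-1, while the cluster in this post was created in ap-southeast-1, so substitute your own value. The Python sketch below splits a target group ARN into its parts so that the region can be checked before applying the manifest; `parse_target_group_arn` and `same_region` are hypothetical helper names.

```python
from typing import Dict


def parse_target_group_arn(arn: str) -> Dict[str, str]:
    """Split an Elastic Load Balancing target group ARN into its components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "elasticloadbalancing":
        raise ValueError(f"not an Elastic Load Balancing ARN: {arn}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "targetgroup":
        raise ValueError(f"not a target group ARN: {arn}")
    return {
        "region": parts[3],
        "account": parts[4],
        "name": resource[1],
        "id": resource[2],
    }


def same_region(arn: str, cluster_region: str) -> bool:
    """Check that the target group lives in the cluster's region."""
    return parse_target_group_arn(arn)["region"] == cluster_region
```

A check such as `same_region(arn, "ap-southeast-1")` returning False is a hint that the ARN was copied from another environment.
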

Next, create an HTTPRoute attached to the apps Gateway:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: httpbin
spec:
  parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: apps
  hostnames:
    - "<replace-with-your-own-domain>"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /httpbin/
      filters:
        - type: URLRewrite
          urlRewrite:
            path:
              type: ReplacePrefixMatch
              replacePrefixMatch: /
      backendRefs:
        - group: ""
          kind: Service
          name: httpbin
          port: 8000
          weight: 1

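Remember to replace "hostnames" in the configuration above with your own domain. Finally, in Route 53 or a third-party DNS service, create a CNAME record that points to the NLB's DNS name. Your site can now serve HTTPS externally, with TLS terminated on the NLB before traffic enters Envoy Gateway, which strikes a good balance between security and performance.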

6. Observability

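Envoy Gateway supports metrics, logs, and traces that follow the OpenTelemetry specification, providing observability for Envoy Gateway itself. These signals can be scraped by an OpenTelemetry collector or pushed to it, and the collector can process and enrich them before they are stored and visualized. Access logs also support a rich set of custom fields. For configuration, see the "telemetry" section: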

apiVersion: config.gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: custom-proxy-config
  namespace: envoy-gateway-system
spec:
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        type: ClusterIP
      envoyDeployment:
        replicas: 2
        container:
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
  telemetry:
    metrics:
      prometheus: {}
    accessLog:
      settings:
        - format:
            type: JSON
            json:
              time: "%START_TIME%"
              trace_id: "%REQ(TRACEPARENT)%"
              request_id: "%REQ(X-REQUEST-ID)%"
              response_code: "%RESPONSE_CODE%"
              response_flag: "%RESPONSE_FLAGS%"
              method: "%REQ(:METHOD)%"
              request: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
              protocol: "%PROTOCOL%"
              bytes_received: "%BYTES_RECEIVED%"
              bytes_send: "%BYTES_SENT%"
              duration: "%DURATION%"
              client_ip: "%REQ(X-FORWARDED-FOR)%"
              user_agent: "%REQ(USER-AGENT)%"
              authority: "%REQ(:AUTHORITY)%"
              upstream_host: "%UPSTREAM_HOST%"
          sinks:
            - type: File
              file:
                path: /dev/stdout
    tracing:
      # sample 100% of requests
      samplingRate: 100
      provider:
        host: otel-collector.monitoring.svc.cluster.local
        port: 4317
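
With the JSON access log format configured above, each log line is a JSON object that is easy to post-process. The Python sketch below counts 5xx responses and extracts the trace ID from the trace_id field, assuming that field holds a W3C traceparent value (version, trace ID, parent ID, and flags separated by hyphens). `summarize` and `trace_id_from_traceparent` are hypothetical helper names.

```python
import json
from typing import Dict, Iterable, Optional


def trace_id_from_traceparent(value: str) -> Optional[str]:
    """Extract the 32-hex-digit trace ID from a W3C traceparent header value."""
    fields = value.split("-")
    if len(fields) == 4 and len(fields[1]) == 32:
        return fields[1]
    return None


def summarize(log_lines: Iterable[str]) -> Dict[str, int]:
    """Count requests and 5xx responses in JSON access log lines."""
    total = errors = 0
    for line in log_lines:
        entry = json.loads(line)
        total += 1
        # response_code may be logged as a number or as a string.
        if int(entry["response_code"]) >= 500:
            errors += 1
    return {"total": total, "errors": errors}
```

The log lines could come from the stdout of the Envoy proxy pods, since the sink above writes to /dev/stdout. The extracted trace ID can then be used to look up the matching trace in the tracing backend.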

Summary

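This post gave a brief introduction to the features of Envoy Gateway and the community's latest direction, showed how to install it and integrate it with EKS, and used several examples to verify rule-based routing, rate limiting, canary release, authentication, security settings, and observability. The Envoy Gateway community is developing quickly and had reached v0.6 at the time of writing. We believe that Envoy Gateway will soon become a standard for governing north-south traffic.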

References

https://gateway.envoyproxy.io/v0.6.0/user/

https://tetrate.io/blog/how-to-use-envoy-gateway-with-aws-nlb/

https://gateway-api.sigs.k8s.io/

AWS Official Blog