Exposing Kubernetes Applications, Part 1: Service and Ingress Resources
The Exposing Kubernetes Applications series focuses on ways to expose applications running in a Kubernetes cluster for external access.
In Part 1 of the series, we explore the Service and Ingress resource types that define two ways to control inbound traffic in a Kubernetes cluster. We discuss the handling of these resource types via Service and Ingress controllers, followed by an overview of the advantages and drawbacks of some of the controllers’ implementation variants.
In Part 2, we provide an overview of the AWS open-source implementation of both Service and Ingress controllers, AWS Load Balancer Controller. We demonstrate the controller’s setup, configuration, possible use cases and limitations.
Kubernetes is a container orchestration engine that allows you to deploy, scale, and manage containerized applications.
Cluster administrators and application developers define logical resources that describe the desired state of the cluster and the applications in it. Kubernetes’ various mechanisms then work to achieve and maintain that state.
For some applications, usually those that handle batch or asynchronous operations, enabling network access may not be a requirement. For others, like RESTful backend services or Web applications, it is mandatory.
Understanding the different use cases and appropriate ways to expose these applications is key to setting up scalable, operationally sound cluster infrastructure.
Kubernetes applications consist of one or more Pods running one or more containers. Every Pod gets its own IP address. Whether IP addresses of the Pods are accessible outside of the cluster depends on the implementation of the cluster’s Container Network Interface (CNI) plugin. Some CNI plugin implementations create an overlay network across Kubernetes nodes, keeping Pods’ IPs internal to the cluster. The Amazon Virtual Private Cloud (VPC) CNI plugin uses VPC IP addressing, so every Pod gets a valid IP address from the VPC CIDR range, which makes the same IP accessible both internally and externally to the cluster.
While possible, accessing applications by their Pods’ IPs is usually an anti-pattern. Pods are non-permanent objects that may be created and destroyed, “moving” between the cluster’s nodes, because of a scaling event, a node replacement, or a configuration change. Additionally, the direct access method doesn’t address applications’ load balancing or routing requirements.
To address these concerns and reliably expose an application, the Service resource type has been introduced. A Service is an abstraction over a dynamically constructed set of Pods defined by a set of label selectors.
During this series of articles, we will follow the Kubernetes terminology, where a resource type (e.g., a Service) is the logical definition that, when created via a Kubernetes Application Programming Interface (API) call, becomes a resource (e.g., a Service), sometimes also referred to as an object.
Here is an example of a Service definition that matches, using a label selector, a set of Pods in a Deployment:
apiVersion: v1
kind: Service
metadata:
  name: some-service
  namespace: some-namespace
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: some-app
  ports:
    - name: svc-port
      port: 80
      targetPort: app-port
      protocol: TCP
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-deployment
  namespace: some-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: some-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: some-app
    spec:
      containers:
        - name: nginx
          image: public.ecr.aws/nginx/nginx
          ports:
            - name: app-port
              containerPort: 80
...
Exposing a Service
The way in which a Service resource is exposed is controlled via its spec.type setting, with the following being relevant to our discussion:
- ClusterIP (as in the example above), which assigns it a cluster-private virtual IP and is the default
- NodePort, which exposes the above ClusterIP via a static port on each cluster node
- LoadBalancer, which automatically creates a ClusterIP, sets the NodePort, and indicates that the cluster’s infrastructure environment (e.g., cloud provider) is expected to create a load balancing component to expose the Pods behind the Service
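As a minimal sketch of the NodePort option, the ClusterIP Service from the earlier example can be exposed on a static port of every node by changing its type. The nodePort value below is an assumption for illustration; if the field is omitted, Kubernetes allocates a port from its default 30000-32767 range:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: some-service
  namespace: some-namespace
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: some-app
  ports:
    - name: svc-port
      port: 80            # the ClusterIP port is still created
      targetPort: app-port
      nodePort: 30080     # assumed static port; omit to let Kubernetes allocate one
      protocol: TCP
```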
Once a Pod targeted by the Service is created and ready, its IP is mapped to the ClusterIP, providing load balancing between the Pods. The kube-proxy daemon on each cluster node defines that mapping in iptables rules (by default) for the Linux kernel to use when routing network packets, but kube-proxy itself is not actually on the data path.
Kubernetes also provides a built-in internal service discovery and communication mechanism. Each Service’s ClusterIP is provided with a cluster-private DNS name of
<service-name>.<namespace-name>.svc.cluster.local form, accessible from Pods in the cluster.
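As a sketch of how that discovery mechanism is consumed, a Pod could reach the Service from the earlier example by its cluster-private DNS name. The curlimages/curl image below is an assumption; any image that ships curl would do:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-check
  namespace: some-namespace
spec:
  containers:
    - name: curl
      image: curlimages/curl   # assumed image; any image with curl works
      command:
        - curl
        # resolves to the Service's ClusterIP via cluster DNS
        - http://some-service.some-namespace.svc.cluster.local:80
  restartPolicy: Never
```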
To allow external access, LoadBalancer type is usually the preferred solution, as it combines the other options with load balancing capabilities and, possibly, additional features. In AWS these features, depending on the load balancer type, include Distributed Denial of Service (DDoS) protection with the AWS WAF service, certificate management with AWS Certificate Manager and many more.
In Kubernetes, a controller is an implementation of the control loop pattern; its responsibility is to reconcile the desired state, defined by various Kubernetes resources, with the actual state of the system. A Service controller watches for new Service resources and, for those with a spec.type of LoadBalancer, provisions a load balancer using the cloud provider’s APIs. It then configures the load balancer’s listeners and target groups and registers the Pods behind the Service as targets.
The way a provisioned load balancer routes to the Pods is specific to the load balancer type and the Service controller. For example, with AWS Network Load Balancers and Application Load Balancers you can configure target groups that use the instance target type and route to NodeIP:NodePort on the relevant nodes in the cluster, or the ip target type and route directly to the Pods’ IPs.
Schematically, assuming the ip target type, it can be shown in the following diagram:
In-tree Service Controller
Today, alongside each Kubernetes version, a matching version of cloud provider-specific code is released that is responsible for the integration between the cluster and the cloud provider’s API. Because this code, until recently, resided within the Kubernetes repository, it is referred to as the in-tree cloud provider, and it is installed by default on the corresponding cloud provider’s Kubernetes clusters. Amazon EKS clusters, via the AWS cloud provider, come preinstalled with the AWS Cloud Controller Manager, which contains the Service controller, referred to here as the in-tree Service controller.
On an empty (controller-wise) Amazon EKS cluster it’s the in-tree controller that handles the Service objects, so the following Service definition creates a Classic Load Balancer and wires it to the nodes (because it only supports the equivalent of the instance target type):
apiVersion: v1
kind: Service
metadata:
  name: some-service
  namespace: apps
  labels:
    app.kubernetes.io/name: some-service
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: some-service
  ports:
    - name: svc-port
      port: 80
      targetPort: app-port
      protocol: TCP
You can control the configuration of the in-tree controller via annotations. For example, the following would provide an AWS Network Load Balancer instead of the Classic one (again, only the instance target type is supported):
apiVersion: v1
kind: Service
metadata:
  name: some-service
  namespace: apps
  labels:
    app.kubernetes.io/name: some-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  ...
The in-tree controller is relatively limited in functionality, as is evident by its lack of support for ip target type, which is more suitable for containerized applications. We will discuss a much more feature-rich alternative, the AWS Load Balancer Controller, in Part 2 of the series.
Handling Many Services
A nontrivial application can contain dozens of API endpoints that need to be exposed externally. Such an application would require dozens of load balancers, with one for each of the Services exposing these API endpoints. This introduces additional operational complexity and infrastructure cost.
A possible solution may be to create a single load balancer and then connect Services and their Pods to the load balancer’s target groups. This is a relatively complex implementation, especially considering that different Services may belong to different teams with independent development and deployment cycles. It requires establishing prioritization and merging processes and providing these teams some form of access to the load balancer configuration, which is often undesirable.
Fortunately, Kubernetes provides an abstraction over that process, which is the Ingress resource type.
What is Ingress?
Ingress is a built-in Kubernetes resource type that works in combination with Services to provide access to the Pods behind these Services. It defines a set of rules to route incoming HTTP/HTTPS traffic (it doesn’t support TCP/UDP) to backend Services and a default backend if none of the rules match. Each rule can define the desired host, path, and backend to receive the traffic if there is a match.
An example Ingress definition may look something like the following:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: some-ingress
  annotations:
    alb.ingress.kubernetes.io/load-balancer-name: ingress
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  ingressClassName: alb
  rules:
    - host: my.example.com
      http:
        paths:
          - path: /some-path
            pathType: Prefix
            backend:
              service:
                name: service-a
                port:
                  number: 80
          - path: /some-other-path
            pathType: Exact
            backend:
              service:
                name: service-b
                port:
                  number: 81
    - host: '*.example.com'
      http:
        paths:
          - path: /some-path
            pathType: Prefix
            backend:
              service:
                name: service-c
                port:
                  number: 82
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-d
                port:
                  number: 80
The code above defines that any traffic:
- for my.example.com with a path that:
  - starts with /some-path should be routed to service-a on port 80
  - equals /some-other-path should be routed to service-b on port 81
- otherwise, for any immediate subdomain of example.com (matching *.example.com), with a path that:
  - starts with /some-path should be routed to service-c on port 82
- without a matching hostname should be routed to service-d on port 80
If none of these scenarios match, then the request is routed to a predefined default backend.
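That default backend can also be declared explicitly on the Ingress itself via the spec.defaultBackend field; a minimal sketch, assuming a hypothetical catch-all Service named service-default exists:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: some-ingress-with-default
spec:
  ingressClassName: alb
  defaultBackend:
    service:
      name: service-default   # hypothetical catch-all Service
      port:
        number: 80
```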
Additionally, there are several annotations: alb.ingress.kubernetes.io/load-balancer-name and alb.ingress.kubernetes.io/scheme set the name and scheme of the load balancer, while alb.ingress.kubernetes.io/target-type sets its target type to ip. The spec.ingressClassName field (the successor to the legacy kubernetes.io/ingress.class annotation) defines the connection between the Ingress resource and its implementation. Other annotations may be used to provide additional, implementation-specific configuration parameters.
If you were to create the above Ingress resource in a Kubernetes cluster, it would not do much, aside from creating its object representation in Kubernetes’ key-value data store, etcd.
For Ingress objects, it’s the responsibility of an Ingress controller to create the necessary wiring, provision the load balancing component, and reconcile the state of the cluster. There is no default Ingress controller included in the controller manager distributed with Kubernetes, so it must be installed separately.
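The link between an Ingress and the controller that handles it is typically expressed through an IngressClass resource. A sketch of the class that the alb ingressClassName in the earlier example would refer to; the controller string below matches the value documented for the AWS Load Balancer Controller, but verify it against your installed version:

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: alb
spec:
  # controller identifier claimed by the AWS Load Balancer Controller
  controller: ingress.k8s.aws/alb
```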
In this post, we will discuss two possible approaches to an Ingress controller implementation: External Load Balancer and Internal Reverse Proxy, the differences between them, and the pros and cons of each implementation.
External Load Balancer
The approach is similar to the Service-based one we’ve described previously:
The AWS implementation for Ingress controller, AWS Load Balancer Controller, translates the Ingress rules, parameters, and annotations into the Application Load Balancer configuration, creating listeners and target groups and connecting their targets to the backend Services.
This setup offloads the complexity of monitoring, managing, and scaling a routing mechanism to the cloud provider by using a managed, highly available, and scalable load balancing service. Additional features like Distributed Denial of Service (DDoS) protection or authentication can also be handled by the load balancer service.
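As an illustration of such offloading, the AWS Load Balancer Controller exposes several of these features through annotations. The sketch below uses annotation names from the controller’s documentation, with placeholder ARNs; treat it as an assumption to verify against your controller version:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: some-secure-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    # TLS certificate from AWS Certificate Manager (placeholder ARN)
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:region:account:certificate/id
    # attach an AWS WAFv2 web ACL (placeholder ARN)
    alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:region:account:regional/webacl/name/id
spec:
  ingressClassName: alb
  ...
```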
Internal Reverse Proxy
In this case, the implementation is delegated to an internal, in-cluster Layer 7 reverse proxy (e.g., NGINX), that receives the traffic from outside the cluster and routes it to the backend Services based on Ingress configuration.
The installation, configuration, and maintenance of the reverse proxy is handled by the cluster operator, in contrast to the usage of the fully managed AWS Elastic Load Balancing service of the previous approach. As a result, while it may provide a higher degree of customization to fit the needs of the applications running in the cluster, this flexibility comes at a price.
The reverse proxy implementation places an additional element on the data path, which impacts the latency, but more importantly, it significantly increases the operational burden. Unlike the fully managed AWS Elastic Load Balancing service used by the previous implementation, it is the responsibility of the cluster’s operator to monitor, maintain, scale, and patch the proxy software and instances it runs on.
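From the application’s point of view, switching to an in-cluster proxy largely amounts to referencing a different class; a hedged sketch, assuming the NGINX Ingress Controller is installed under the class name nginx:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: some-proxied-ingress
spec:
  ingressClassName: nginx   # assumes an installed NGINX Ingress Controller with this class name
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: service-d
                port:
                  number: 80
```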
It is worth mentioning that in some cases, the two controller implementations can be used in parallel, handling separate segments of the cluster or in combination to form a complete solution. We’ll see an example of this in Part 3 of the series.
Kubernetes Gateway API
While we won’t go into its implementation details, Kubernetes Gateway API (currently in beta) is another specification that provides a way to expose applications in Kubernetes clusters, in addition to Service and Ingress resource types.
Gateway API is not currently supported by Amazon EKS, having only recently graduated to beta.
The Gateway API deconstructs the abstraction further into:
- GatewayClass that, similar to IngressClass, denotes the controller to handle the API objects
- Gateway that, similarly to Ingress, defines the entry point and triggers the creation of load balancing components, based on the handling controller and the gateway’s definition
- HTTPRoute (with TLSRoute, TCPRoute, and UDPRoute to come), which defines routing rules that connect the gateway to the Services behind it, allowing traffic to be matched based on host, headers, and paths, and split by weight
- ReferencePolicy, which controls which routes can be exposed via which gateways and which Services those routes can expose (including across namespaces)
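To make the deconstruction concrete, a hedged sketch of a Gateway and an HTTPRoute using the beta v1beta1 API; the class and resource names are assumptions for illustration:

```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: some-gateway
spec:
  gatewayClassName: some-gateway-class  # assumed GatewayClass handled by an installed controller
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: some-route
spec:
  parentRefs:
    - name: some-gateway   # attaches the route to the gateway above
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /some-path
      backendRefs:
        - name: service-a
          port: 80
          weight: 100      # weights allow traffic splitting across backends
```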
This is merely a cursory overview, but it should look very similar to what we’ve seen throughout the post:
The actual provisioning of the load balancing components by the controller, missing from the diagram above, can take either of the routes (external load balancer or in-cluster reverse proxy) or hook into a different paradigm, like service mesh.
In Part 1 of the series, we discussed several ways to expose applications running in a Kubernetes cluster: Service-based with an external load balancer and Ingress-based with an external load balancer or an in-cluster Layer 7 reverse proxy. We briefly touched on the up-and-coming Kubernetes Gateway API that aims to provide further control over how the applications are exposed.
During the rest of the series we focus on the Ingress-based implementations with AWS Load Balancer Controller and NGINX Ingress Controller, discuss their setup and configuration, and walk through a set of examples.