AWS Open Source Blog

kube-aws: Highly Available, Scalable, and Secure Kubernetes on AWS

Kube-AWS: Better Management of Kubernetes Clusters on AWS


There are many ways to manage a Kubernetes cluster on AWS. Kube-AWS is a Kubernetes incubator project that allows you to create, update, and destroy a highly scalable and available Kubernetes cluster on AWS. It provides seamless integration with several AWS features such as KMS, Auto Scaling groups, Spot fleet, node pools, and others. Complete project details are at github.com/kubernetes-incubator/kube-aws, but this post from project creator Yusuke Kuoka provides a good rundown on how to get started with kube-aws, and how to engage with the project for further questions.

– Arun Gupta


Kubernetes is well known as a container orchestration system that helps manage many containers in a fairly automated manner. I often call it a “Rails for microservices,” as it provides a set of primitives to build one’s own framework for managing apps. (Primitives are the container, pod, deployment, service, and all the rich set of object types, and the API to deal with them.) Kubernetes also has various extension points, and awesome client libraries for major programming languages.

Developing and operating many microservices is challenging, especially when you have limited automation and tooling. Starting with Kubernetes as a framework, you can add complementary tools such as Fluentd, Zipkin, and Prometheus, to create a modern platform for managing microservices. I believe this is why Kubernetes is getting more and more attention these days.

Have you ever suffered from any of the following problems?

  • Fast development environment for multiple microservices: You need to work on a microservice that depends on several other microservices, each of which takes minutes to provision for development.
  • Distributed tracing: You need to locate the root cause of occasionally-seen slow transaction spans across several microservices.
  • Distributed logging: You need to track a user’s activity through access logs from multiple nginx reverse proxies, for debugging purposes.
  • Resource monitoring: You need to figure out why your system is not working well.

Two years ago, I tried hard to find a silver bullet for all these. I went looking for a PaaS that fit my use case perfectly, only to find that no such thing was available at the time.

So, I started from the foundation of Kubernetes on AWS, envisioning that I’d gradually build the PaaS on top of it. Even though Kubernetes itself isn’t a PaaS, it does help me in building one, as its ecosystem and general applicability increase with each release. Projects like Helm, Helmfile, and Brigade complement my PaaS needs, all developed by contributors who use different cloud providers. Without Kubernetes’ general versatility, this wouldn’t have happened.

The result is an open source tool called kube-aws, which I’ve been maintaining for almost two years now. I have used it extensively for provisioning my own production Kubernetes clusters on AWS at several different companies, and it is now used widely around the world, by companies such as Hotels.com, Netquest, Checkr, ChatWork, freee, and others, to serve business-critical applications on top of Kubernetes on AWS.

You might be thinking: “There are many Kubernetes provisioning tools available today. What is the point of using, let alone maintaining, yet another tool, especially when there’s already an awesome tool available, kops?” Read on to understand how kube-aws is different.

So What is kube-aws?

A Kubernetes Incubator project, kube-aws is a tool for provisioning production-ready Kubernetes clusters on AWS. It relies heavily on, and is specialized for, Container Linux from CoreOS, along with well-known AWS services such as EC2, KMS, S3, Auto Scaling groups, ELB, and CloudFormation, and it allows you to provision highly available, scalable, and secure Kubernetes clusters on AWS in a highly customizable manner.

Basic features of kube-aws include:

  • Availability: multi-AZ etcd clusters, the Kubernetes control plane, and worker node pools.
  • Security: support for public and private subnets for all node types, plus internet-facing and internal load balancers for the Kubernetes API endpoints. You can have separate API endpoint load balancers: one for access from the Internet and one for access from within the VPC. Cluster credentials are encrypted with KMS.
  • Zero-downtime updates: CloudFormation resource signals are used for rolling updates of nodes without downtime.
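For example, the separate internet-facing and internal API endpoint load balancers are declared in the cluster configuration file, cluster.yaml (described below). The key names in this sketch follow the kube-aws cluster.yaml reference as I recall it and may differ between releases, and the DNS names are hypothetical, so treat it as an illustration rather than a copy-paste config:

```yaml
# Illustrative sketch only: verify the exact schema against the
# cluster.yaml generated by your kube-aws release.
apiEndpoints:
- name: public
  dnsName: api.quick-start-k8s.mycompany.com            # hypothetical DNS name
  loadBalancer:
    private: false   # internet-facing load balancer
- name: internal
  dnsName: api-internal.quick-start-k8s.mycompany.com   # hypothetical DNS name
  loadBalancer:
    private: true    # internal load balancer, reachable only from within the VPC
```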
A key feature of kube-aws is flexibility: you can customize many aspects of your cluster from within a single file called “cluster.yaml”. You can also customize all aspects of your nodes and stacks, as long as they can be expressed in cloud-config and CloudFormation stack templates. This means that:

  • It is IAM friendly: you can reuse existing, pre-configured AWS resources with kube-aws. Let’s say you are in an enterprise situation where you have no permission to create IAM roles, so kube-aws cannot create roles on your behalf. In this case, you can just instruct kube-aws to use IAM roles created by an administrator beforehand. The same applies to VPCs, subnets, security groups, ELBs, and so on. kube-aws also supports CloudFormation service roles, so you aren’t forced to use an IAM admin user role to run kube-aws.
  • When you need to add many organization-specific settings and files to your worker/controller/etcd nodes, just add snippets to cloud-config (see the sketch after this list). Previously, this would complicate your node provisioning script, as the 16KB size limit on EC2 user data would force you to separate your node provisioning scripts and configuration sources from the instance user data. kube-aws solves this issue for you by putting the user data into an S3 bucket automatically.
  • Similarly, when you need to add specific customizations and additional AWS resources relevant to the Kubernetes cluster, you might trip on the 51,200-byte limit on stack template size in CloudFormation. Again, kube-aws solves this for you by automatically putting stack templates into an S3 bucket.
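As a concrete illustration of the cloud-config point above, an organization-specific snippet for worker nodes might look like the following Container Linux cloud-config sketch. The file path, proxy address, and systemd unit are all made up for illustration, and where exactly you splice such a snippet into the rendered cloud-config depends on your kube-aws version:

```yaml
#cloud-config
# Hypothetical organization-specific additions for worker nodes.
write_files:
  - path: /etc/mycompany/proxy.env            # made-up path
    permissions: "0644"
    content: |
      HTTP_PROXY=http://proxy.mycompany.internal:3128
coreos:
  units:
    - name: mycompany-audit.service           # made-up unit
      command: start
      content: |
        [Unit]
        Description=Ship audit logs to an in-house collector
        [Service]
        ExecStart=/opt/bin/ship-audit-logs
        Restart=always
```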

kube-aws isn’t necessarily the answer to everything; as always, you should ensure that it is the appropriate tool for your needs. But it removes the need for a lot of yak shaving when provisioning highly available, scalable, and secure Kubernetes clusters. It is especially helpful when your primary need goes beyond just operating Kubernetes clusters.

My own primary job is Site Reliability Engineering and Developer Productivity. Kube-aws helps me focus on more complex parts of my job by allowing me to:

  • Provision Kubernetes clusters.
  • Update Kubernetes clusters, including adding/removing/updating nodes and load balancers, security groups, IAM policies, etc.
  • Share cluster configuration across teams.

Getting Started with kube-aws

There’s a detailed Getting Started guide on the kube-aws documentation site. It boils down to running:

```console
kube-aws init \
  --cluster-name=quick-start-k8 \
  --region=us-west-1 \
  --availability-zone=us-west-1a \
  --hosted-zone-id=ZBN159WIK8JJD \
  --external-dns-name=quick-start-k8s.mycompany.com \
  --key-name=ec2-key-pair-name \
  --kms-key-arn="arn:aws:kms:us-west-1:123456789012:key/c4f79cb0-f9fb-434a-ac3c-47c5697d51e6" \
  --s3-uri=s3://kube-aws-assets/
kube-aws render credentials --generate-ca
kube-aws render stack
kube-aws validate
kube-aws up
```

Here’s what each command does:

  • kube-aws init generates a `cluster.yaml`, which defines everything your Kubernetes cluster might contain. What you can define includes: whether or not to use an existing VPC/subnet/route table/internet gateway/NAT gateway, the number of etcd and controller nodes, the number and size of node pools, whether to enable GPUs, apiserver and kubelet flags, systemd configs, and so on.
  • kube-aws render credentials generates various TLS assets required for running K8S system components.
  • kube-aws render stack generates various cloud-configs and stack-templates.
  • kube-aws validate runs lint checks on your cluster configuration.
  • kube-aws up brings up the whole cluster by calling out to CloudFormation.

After running kube-aws up, you can even Ctrl-C and go have some coffee for 10 minutes or so, depending on the cluster size defined in cluster.yaml. CloudFormation creates several stacks, each containing a set of AWS resources for etcd, the Kubernetes control plane, and node pools.

You can view a sample cluster.yaml template used by `kube-aws init`.
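For orientation, a heavily trimmed cluster.yaml might look roughly like the sketch below. The key names follow the kube-aws documentation as I remember it and can change between releases, and the values echo the hypothetical quick-start flags above, so rely on the generated file rather than on this excerpt:

```yaml
# Trimmed, illustrative excerpt only; the cluster.yaml generated by
# `kube-aws init` is the source of truth for your release.
clusterName: quick-start-k8
region: us-west-1
availabilityZone: us-west-1a
externalDNSName: quick-start-k8s.mycompany.com
hostedZoneId: ZBN159WIK8JJD
keyName: ec2-key-pair-name
kmsKeyArn: "arn:aws:kms:us-west-1:123456789012:key/c4f79cb0-f9fb-434a-ac3c-47c5697d51e6"

etcd:
  count: 3                  # multi-AZ etcd members
controller:
  count: 2                  # control-plane nodes behind the API load balancer
  instanceType: t2.medium
worker:
  nodePools:
  - name: nodepool1
    count: 3
    instanceType: m4.large
```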

A Note About kube-aws’s Flexibility

Users with advanced use cases and requirements may want to modify the generated stack templates and cloud-configs. You don’t need to deal with golang and rebuild kube-aws just to make a small tweak to what kube-aws provides out of the box. To customize the cluster configuration, you can make any modification allowed by CoreOS-flavored cloud-config and CloudFormation stack templates.
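Purely to illustrate the kind of addition you might make, the fragment below shows an extra CloudFormation resource, written as plain CloudFormation YAML, that could be grafted onto a rendered stack template. The logical name, security group ID, and CIDR range are all placeholders, and kube-aws renders its templates in its own (Go-templated) format, so adapt the fragment to whatever your version actually emits:

```yaml
# Hypothetical extra resource to merge under the existing Resources
# section of a rendered stack template; all values are placeholders.
Resources:
  WorkerSSHIngress:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: sg-0123456789abcdef0   # existing worker security group (placeholder)
      IpProtocol: tcp
      FromPort: 22
      ToPort: 22
      CidrIp: 10.0.0.0/8              # e.g., your corporate network range
```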

Just be aware that this would make it harder to upgrade your clusters to future kube-aws releases. If you need customization, I’d encourage you to open feature requests in GitHub issues, and ask questions in the #kube-aws channel in the Kubernetes Slack, so that together we can shape what we can do to solve your problem.

Recommendations Before Going to Production

Production clusters vary across use cases. I suggest checking the dedicated GitHub issue on production-quality deployments for DOs and DON’Ts to consider for your own production cluster.

My personal choices include:

  • If you’re going to make huge changes to stack-templates and/or cloud-configs, version-control your cluster assets with Git.

I’d appreciate it if you could share your choices and experiences, and any questions and feature requests that you come up with!

Future Work

kube-aws is specialized for AWS users. Naturally, our response to the recent introduction of AWS EKS, AWS’s long-awaited managed Kubernetes service, is that we’re planning to add first-class EKS support to kube-aws.

The EKS integration would look like this: EKS manages the Kubernetes control plane, which consists of etcd and controller nodes. All the etcd members and the Kubernetes system components like the apiserver and controller-manager run on EKS-managed nodes, while kube-aws manages only the worker node pools. Compared to etcd and controller nodes, worker nodes tend to have more varied requirements, such as auditing, logging, network security, and IAM permissions, because they may run user-facing containers. The integration would keep your Kubernetes operational cost to a minimum thanks to EKS, while preserving maximum flexibility thanks to kube-aws-managed worker nodes.

There are many Kubernetes provisioning tools, and researching which is best for a particular use case is time-consuming for the user. I am considering whether to consolidate kube-aws with one or more of the other provisioning tools, to create a better user experience. Although I don’t have a concrete plan to do so yet, I’m looking forward to the future. Kubernetes Cluster API would be a good starting point.

Stay tuned, or – even better! – collaborate with us by letting us know what you want and expect from this integration!


Yusuke Kuoka is a Software Development Engineer at freee K.K., leading the design and development of a highly available, scalable, and secure developer platform and infrastructure for microservices. He maintains several OSS projects including kube-aws, Brigade, Helmfile, and Habitus, which he finds important for resolving real-world problems encountered while running production workloads on Kubernetes clusters on AWS.

Arun Gupta

Arun Gupta is a Principal Open Source Technologist at Amazon Web Services. He focuses on everything containers and open source at AWS. He is responsible for CNCF strategy within AWS, and participates at CNCF Board and technical meetings actively. He has built and led developer communities for 12+ years at Sun, Oracle, Red Hat and Couchbase. He has extensive speaking experience in more than 40 countries on myriad topics and is a JavaOne Rock Star for four years in a row. Gupta also founded the Devoxx4Kids chapter in the US and continues to promote technology education among children. A prolific blogger, author of several books, an avid runner, a globe trotter, a Docker Captain, a Java Champion, a JUG leader, NetBeans Dream Team member, he is easily accessible at @arungupta.