Has enabled seamless infrastructure configuration while improving identity integration and monitoring capabilities
What is our primary use case?
In my current project, I am handling a US customer who is completely focused on working with LLM and security. Their clients are expected to give us the data sources, and we provision from Greengrass, which works as an agent running on the end client to fetch the customer false alarms. It's primarily focused on the SOC 2 side, where data goes to the SOC dashboard or SOC data source, leveraging LLM to filter how many alarms are false positives versus true positives. In doing this, the NOC engineers observing the SOC 2 dashboard do not have to worry about how many are true positives; our LLM API or tool filters it out and indicates how many true positives and false positives are displayed on the dashboard. This significantly eases the burden on the NOC engineers.
We run our application on Amazon EKS because most components are LLM-based and require GPUs, so we have specific managed node groups and use Karpenter along with HPA. Work items are queued in Redis on ElastiCache; the queues are read from Redis and the work is handed over to pods that HPA scales dynamically. We rely on HPA and Karpenter within Amazon EKS to scale in and out.
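As a rough illustration of queue-driven scaling, the sketch below applies the ratio documented for the Kubernetes Horizontal Pod Autoscaler, ceil(currentReplicas × currentMetric / targetMetric), to a queue length. The function names, the per-pod target, and the replica bounds are hypothetical, and the real controller also applies a tolerance and stabilization window that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA ratio, clamped to the configured replica bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


def replicas_for_queue(queue_length, current_replicas, target_items_per_pod,
                       min_replicas=1, max_replicas=10):
    """Treat 'queued items per pod' as the metric the autoscaler tracks (assumption)."""
    pods = max(current_replicas, 1)
    items_per_pod = queue_length / pods
    return desired_replicas(pods, items_per_pod, target_items_per_pod,
                            min_replicas, max_replicas)


if __name__ == "__main__":
    # 120 queued items, 4 pods, target of 10 items per pod
    print(replicas_for_queue(120, 4, 10, max_replicas=20))
```

HPA only changes the pod count; a node provisioner such as Karpenter would then add or remove nodes when the resulting pods cannot be scheduled or nodes sit idle.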
How has it helped my organization?
Amazon EKS is simple due to its support for diverse AWS tools and various integrations, significantly influencing my application development and management processes. There is the aspect of trust relationships and permissions. Every service we create involves setting a role, and all authentication processes link to a policy, deciding access. This seamless process extends across accounts thanks to ARN (Amazon Resource Names). With the security guardrails in place, services are accessed effortlessly while maintaining high security, giving me confidence in its management capabilities.
What is most valuable?
The HPA feature within Kubernetes is one aspect I appreciate about Amazon EKS; it's beneficial for scalability. The managed node groups or dedicated node groups with GPU capacity allow us to scale in or out depending on our capacity planning, and Karpenter, which we have provisioned for scaling, contributes to that. These features are not available by default in EC2s or ECS, where you lack control over the nodes. Many people attempt to adopt Lambda functions or serverless solutions, but that approach does not suffice for GDPR and HIPAA compliance, as it demands a solid identity of the node and clarity on where it is being provisioned. Hence, we cannot depend on any of the Lambda or serverless components, which may be unstable, and this often leads to increased billing due to the dynamic nature of scheduled procedures that require a standalone node rather than a transient one.
When it comes to the integration with IAM, I have two thoughts on the authentication process of Amazon EKS. When I began using AWS, access was managed through the aws-auth ConfigMap within Amazon EKS. Initially, only the person who creates the cluster has access to it. In enterprise roles (as I experienced while working in the UK for Santander), that presents challenges. Essentially, only the creator had the master role for accessing the node, and the onboarding process was rather manual, with companies relying on ServiceNow and other tools for onboarding new joiners or team members. Now, the EKS API and eksctl have many settings that are not enabled by default, so you need to enable them. Most clusters are created using Terraform, and you need to create a role that can manage cross-account access, as many customers don't operate with a single account.
My previous employers, such as Fidelity Investment, Nokia, and Ford, have worked across multiple accounts, necessitating single sign-on. This setup allows for cross-account access to the cluster through eksctl and the EKS APIs, which leverage single sign-on to onboard team members. Once the role has access to the cluster, it can onboard users as a dev user, admin, or tester, simplifying the onboarding process. Previously manual tasks can be automated this way, which is a significant improvement. Earlier, we had to edit the ConfigMap to onboard users, but with eksctl and the EKS API, the integration between EKS, the Kubernetes service, and the cloud side has improved tremendously, alleviating many worries.
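One way to picture API-based onboarding is with EKS access entries, which may or may not be the exact mechanism the reviewer used. The sketch below only builds request payloads shaped like the keyword arguments of the boto3 EKS client calls `create_access_entry` and `associate_access_policy`; the persona-to-policy mapping, the function name, and the example ARNs are assumptions for illustration.

```python
from typing import Optional

# AWS-managed EKS access policies; mapping personas to them is an assumption.
ACCESS_POLICIES = {
    "admin": "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy",
    "dev": "arn:aws:eks::aws:cluster-access-policy/AmazonEKSEditPolicy",
    "tester": "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy",
}


def build_access_requests(cluster: str, role_arn: str, persona: str,
                          namespaces: Optional[list] = None):
    """Return (create_access_entry kwargs, associate_access_policy kwargs)."""
    if persona not in ACCESS_POLICIES:
        raise ValueError(f"unknown persona: {persona}")
    # Scope to namespaces when given, otherwise to the whole cluster.
    if namespaces:
        scope = {"type": "namespace", "namespaces": list(namespaces)}
    else:
        scope = {"type": "cluster"}
    entry = {"clusterName": cluster, "principalArn": role_arn, "type": "STANDARD"}
    association = {
        "clusterName": cluster,
        "principalArn": role_arn,
        "policyArn": ACCESS_POLICIES[persona],
        "accessScope": scope,
    }
    return entry, association


if __name__ == "__main__":
    entry, assoc = build_access_requests(
        "demo-cluster",                                # hypothetical cluster name
        "arn:aws:iam::111122223333:role/sso-dev",      # hypothetical SSO role
        "dev",
        namespaces=["team-a"],
    )
    print(entry)
    print(assoc)
```

With boto3 installed and credentials configured, the two dictionaries could be passed as `client.create_access_entry(**entry)` and `client.associate_access_policy(**assoc)`, provided the cluster's authentication mode allows access entries.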
Self-healing nodes help minimize administrative burdens in my projects. Coming from a telecom background where I worked for over seven years, I'm familiar with SON (self-organizing network), which provides self-organizing and self-healing functionality. At a logical level, these are the layers we interact with, but AWS handles the physical layer through its software components. For instance, if a node is not ready and you have enabled the Auto Mode feature, AWS manages that for you, including AMI upgrades and malfunctioning nodes. I've seen these features in the UI and enabled them, and patches roll out every 10 or 15 days. I can check via Amazon Inspector whether any node-level or AMI-level patches are necessary. AWS takes care of these issues automatically. I appreciate that I don't need to manually check the dashboard and apply upgrades one by one, which is a significant improvement.
I measure the impact of Amazon EKS on the organization's management of complex workloads, in terms of effectiveness and efficiency, through my background in development and systems. I spent five years as a Java developer before transitioning to DevOps. With my understanding of end-to-end application architecture, I assess workloads based on system and application planning. For example, when I worked on a data lake product at Fidelity Investment, I observed that the cloud onboarding process, including Amazon EKS, had roadmaps extending over five years, from 2019 to 2024. I understand the nuances of enterprise or legacy applications and any system-related complexities. It all boils down to two components: system planning and application planning. First, we identify the type of application, whether it is database-related or has high GPU demands. Most applications today involve GPUs, which tend to incur high costs, and customers are often unaware of how to handle dynamic workloads effectively. It's crucial to assess not just one part of the system but several elements, such as CPU, memory, and IOPS, since the underlying hardware interacts with those components regardless of domain. We also need to evaluate the application's requirements, such as its dependency on node storage. With EKS, Kubernetes provides solutions such as CSI, CNI, and CRI. By understanding the application's demands, I can apply the right Kubernetes configurations for performance optimization, such as taking advantage of Amazon EKS's ability to adjust the container network interface settings to suit the client's workload requirements. This loose coupling allows us to optimize our resources irrespective of whether we're using on-prem or cloud environments.
What needs improvement?
I started using Amazon EKS in 2019. At that time, it was completely new, and few people were using it yet; it started from version 1.21, and right now we are on 1.33. Version 1.34 has recently been launched, but it's not yet available in the service catalog, where we can see only 1.33. A lot of improvements have been made.
We had numerous add-ons to install manually because Kubernetes is a completely separate system from the AWS cloud provider, even though everyone has opted to use it. After opting in, there are two identities you have to maintain: one at the IAM level and one within the Amazon EKS cluster. A few things do not make sense within the add-ons, particularly the secret providers that read a secret from Secrets Manager and then mount it as a volume. We use the Secrets Store CSI driver, which reads secrets or sensitive data from Secrets Manager and mounts them as a volume to the pod at runtime. However, it doesn't have a dynamic feature where, if the secret changes, the new value is read and populated in the environment.
Consider what happens when your RDS or OpenSearch password rotates. Amazon EKS doesn't have a feature to pick up the new value when the password has changed overnight; there is no functionality from the provider to detect the change and then restart the pod or fetch the new value. This often leads to downtime of 6 or even 12 hours, depending on when you realize it, so that needs improvement.
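A workaround for this gap can be sketched as a small watcher that compares a hash of the current secret value with the last one seen and triggers a reaction when they differ. This is a minimal illustration, not a feature of EKS or of any add-on: `SecretWatcher`, `fetch`, and `on_change` are hypothetical names, and in practice `fetch` might read a mounted file or call Secrets Manager while `on_change` might trigger a rolling restart.

```python
import hashlib
from typing import Callable


class SecretWatcher:
    """Detects a changed secret value by comparing SHA-256 digests."""

    def __init__(self, fetch: Callable[[], str], on_change: Callable[[], None]) -> None:
        self._fetch = fetch
        self._on_change = on_change
        # Remember only the digest, never the secret itself.
        self._last_digest = self._digest(fetch())

    @staticmethod
    def _digest(value: str) -> str:
        return hashlib.sha256(value.encode("utf-8")).hexdigest()

    def poll_once(self) -> bool:
        """Fetch the secret once; run on_change and return True if it changed."""
        digest = self._digest(self._fetch())
        if digest == self._last_digest:
            return False
        self._last_digest = digest
        self._on_change()
        return True


if __name__ == "__main__":
    values = iter(["old-password", "old-password", "new-password"])
    restarts = []
    watcher = SecretWatcher(lambda: next(values), lambda: restarts.append("restart"))
    print(watcher.poll_once())  # value unchanged
    print(watcher.poll_once())  # value rotated
    print(restarts)
```

Calling `poll_once` on a schedule keeps the window between a rotation and the pod picking up the new value bounded by the polling interval rather than by when someone notices the outage.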
Nonetheless, the add-on side has developed a lot; earlier we were installing add-ons manually, but now with EKS Auto Mode, many things, such as the VPC CNI and the Pod Identity agent (around four plugins), are installed by default, which is a good thing. However, I believe there should be a self-contained solution covering generic use cases.
With the 1.33 release, they have addressed most of my earlier concerns, but I am still looking for some improvements, particularly in CloudWatch monitoring. In IT, we manage two aspects: either the system or the application. Currently, the application logs and monitoring are not very robust in CloudWatch; you can only find things if you are familiar with them. Fortunately, we are familiar, as most of the monitoring involves two types of databases: one is a time series for monitoring data, and the other is an indexing solution for a streaming service. This means we need to get the logs from each node, index them, and populate them on a screen. That part remains a separate service, but if they managed it within Amazon EKS service, where the monitoring is consolidated in one place, you wouldn't need to rely on Prometheus, Grafana, or different services. It would be advantageous to have a consolidated platform for EKS, as Kubernetes is leveraged; monitoring and logging should also be integrated simply by enabling parameters or tags. This would create a self-contained platform where people can onboard and start using it. Currently, I still need to enable logging and monitoring among other things myself; that shouldn't be the case after six or seven years in the market.
On a scale from 1 to 10, I would rate Amazon EKS tech support an eight. Some individuals have a deep understanding of the services and can identify potential bottlenecks, especially with load balancer endpoints and certificate management. The shift from NGINX to AWS load balancers has diminished many previous issues. However, not every support engineer meets the same level of expertise, hence why I rate it a solid eight, which I consider decent.
For how long have I used the solution?
I have been using Amazon EKS for seven years.
How are customer service and support?
Amazon's customer support has its merits; it is good overall. However, when it comes to enterprise licenses, the quality declines significantly. Startups may not recognize this at first, engaging with support during their initial phase, but they soon discover the lack of expert guidance and the costs associated with it—it's quite expensive.
Which solution did I use previously and why did I switch?
I have experience with other Kubernetes engines, such as those from Oracle, Azure, and Google. I also find Red Hat OpenShift to be an excellent competitor. Many have transitioned to Azure due to the hosting incentives it offers for running OpenAI models. While people explore Amazon EKS or other Kubernetes solutions, the simplicity in moving between platforms remains an advantage since your manifests mostly stay the same, needing only minor adaptations to services. My experience with Red Hat OpenShift tells me it's a robust solution with competitive attributes against AWS.
What was our ROI?
In terms of ROI, I have clear examples. When we began, we utilized managed nodes for applications needing dynamic processes. Some applications simply required a pod to be created once data was queued for processing, leaving the node free afterward. For one customer whose setup was not cost-effective (we had provisioned a node that saw little CPU usage but substantial memory consumption), we implemented effective resource quotas and HPA. This setup enables the system to use resources dynamically based on actual demand, observed by reading metrics through the Prometheus adapter. Notably, the adapter isn't an out-of-the-box feature provided by Prometheus; you need to create your own adapter for it. Using tools such as HPA and Karpenter allowed us to scale resources based on requirements. Initially, not having them resulted in an unoptimized solution. With these tools in place, costs fell to roughly a quarter of what they were: if it was $100 beforehand, we brought it down to $25.
What's my experience with pricing, setup cost, and licensing?
Regarding the pricing aspect and the licensing cost of Amazon EKS, sometimes it is not clear. Most discussions revolve around the data transfer costs from one region to another, and there are certain concerns regarding GPU nodes. However, if you optimize your node usage, with tools such as Kubecost, you can analyze how effectively you utilize your nodes. If you manage to optimize usage, you won't face steep costs. Otherwise, the cloud provider will certainly benefit from inefficient usage. Ultimately, it's not out of the box—if you want to monitor costs effectively, applying separate tools and acting accordingly in advance is essential.
Which other solutions did I evaluate?
I notice key differences between Amazon EKS and its competitors, with both pros and cons. The seamless integration is sometimes lacking in other offerings. When managing software on Kubernetes platforms, including EKS, AKS, GKE, Rancher Kubernetes, and Oracle's Kubernetes engine, I've faced specific challenges, particularly with user management in Oracle's solution, which isn't as seamless as it is in Amazon EKS. Comparatively, OpenShift from Red Hat has notable strengths. Oracle is making improvements, especially with its longstanding database solutions. Among cloud providers, though, OpenShift from Red Hat is a formidable competitor, offering robust out-of-the-box solutions around resource limits and dashboard configurations that do not require command-line interventions.
What other advice do I have?
I have some recommendations for people considering Amazon EKS. Teams often attempt to enforce infrastructure as code with tools such as Terraform from HashiCorp to maintain workspaces and the like. I advise using services within a single environment, especially for LLM applications. It's prudent to have multiple LLM sources across various cloud providers while utilizing the same keys within your AWS environment.
My second piece of advice is to establish a separate CI/CD platform independent of AWS. This keeps things loosely coupled; with minimal tweaks in CI/CD pipelines, you can seamlessly migrate from one platform to another—say from EKS to AKS to GKE or OpenShift—thus keeping the focus on feature development rather than migration headaches. This leads to a modular approach in your code and infrastructure, ensuring that only the cloud provider specifics require adjustments.
Overall rating: 8 out of 10
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has supported production workloads consistently but requires simpler configuration and clearer troubleshooting tools
What is our primary use case?
We use Amazon EKS mostly for deploying production workloads, and there are multiple AI models that we run on Amazon EKS.
What is most valuable?
My favorite feature of Amazon EKS is the ecosystem that it provides, including the integration with S3, along with EBS, and the networking that is smooth to run Kubernetes.
What needs improvement?
I have experience with Azure, and in comparison to Azure, a downside of Amazon EKS is that even if you want to deploy a dev workload or do some experimentation, you have to pay the charges for the control plane, with no free option.
Additionally, I have faced many issues while configuring the node groups and the whole configurations; bringing up the nodes was a bit hectic, and I was not able to determine which node was failing and for what reason.
Specifically, the pricing for the control plane of Amazon EKS is hefty, and there is no cost-cutting that can be done on that side.
For how long have I used the solution?
I have been using Amazon EKS in my career for four years overall.
What do I think about the stability of the solution?
Amazon EKS is pretty stable, and I have not seen any lagging, crashing, or downtime.
What do I think about the scalability of the solution?
Amazon EKS is good in terms of scaling.
How are customer service and support?
I have contacted technical support and customer support for Amazon EKS.
Which solution did I use previously and why did I switch?
I have used Azure AKS, Civo, and also experimented with GCP as alternatives to Amazon EKS. If I have to rank the three major providers, GCP comes at number one, Amazon EKS comes at number two, and Azure AKS comes at number three.
On the costing part of Kubernetes, Azure is beating Amazon EKS since I can do some experimentation without paying for the control plane; I can just pay for the node groups, which is an area where Amazon can improve.
How was the initial setup?
From my point of view, the initial deployment of Amazon EKS is difficult.
I had to configure many components, such as IAM policies and other things; it was not a simple click-through process. The major issue is that there is no single point where I can see all the logs; while there is CloudWatch, it is not easily accessible, and you have to go through a hectic process to search and find information. There is nothing where you can go and click to get all the logs in a single place.
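To make the log search less tedious, a query can be prepared in code and run through CloudWatch Logs Insights. The sketch below only builds the query; it assumes Container Insights with the default Fluent Bit setup is enabled, so that application logs land in `/aws/containerinsights/<cluster>/application` with fields such as `kubernetes.pod_name`, `kubernetes.namespace_name`, and `log`. The function name and the example values are hypothetical.

```python
def build_pod_log_query(cluster: str, namespace: str,
                        contains: str = "", limit: int = 50) -> dict:
    """Build a Logs Insights query for the newest log lines in one namespace."""
    lines = [
        "fields @timestamp, kubernetes.pod_name, log",
        f'| filter kubernetes.namespace_name = "{namespace}"',
    ]
    if contains:
        # Optional substring match on the raw log line.
        lines.append(f'| filter log like "{contains}"')
    lines.append("| sort @timestamp desc")
    lines.append(f"| limit {limit}")
    return {
        # Log group name follows the Container Insights convention (assumption).
        "logGroupName": f"/aws/containerinsights/{cluster}/application",
        "queryString": "\n".join(lines),
    }


if __name__ == "__main__":
    query = build_pod_log_query("demo-cluster", "payments", contains="ERROR")
    print(query["logGroupName"])
    print(query["queryString"])
```

The resulting dictionary, together with a start and end time, could be passed to the CloudWatch Logs `start_query` call or pasted into the Logs Insights console.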
What other advice do I have?
Amazon EKS requires maintenance on my end to continue functioning; we have to upgrade from time to time since every Kubernetes version is only supported for about one year.
On a scale from one to ten, I rate Amazon EKS a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Has supported seamless migration of numerous legacy applications to a reliable platform
What is our primary use case?
Amazon EKS is used for running any type of application, and it is one of the more stable platforms. We use everything in AWS, CloudFront, and we probably have a hundred applications in AWS.
Several of our platforms have transitioned to Kubernetes because it is more economical than some other methods, and we have lots of apps that use Kubernetes.
The deployment of Amazon EKS took time mainly because of the amount of stuff we had to move, such as old legacy applications and systems, which need changes every few years, and AWS is just better.
We have several applications that run in Amazon EKS.
What is most valuable?
The best features of Amazon EKS include Kubernetes, as several of our platforms have transitioned to Kubernetes because it is more economical than some other methods, and we have lots of apps that use Kubernetes.
The support for AWS tools integration with Amazon EKS has influenced our management process significantly. The cloud affects everything, with probably 80% of our stuff in the cloud, mainly in AWS.
The self-healing nodes with Amazon EKS help minimize administrative burdens, and while I'm not a system admin myself, I do disaster recovery testing and resiliency assessments for cloud applications.
What needs improvement?
Improvements could include better support and pricing, which is always important.
There are definitely areas for improvement with Amazon EKS.
For how long have I used the solution?
My experience with AWS products spans the last six to seven years.
What do I think about the stability of the solution?
What do I think about the scalability of the solution?
Amazon EKS scales effectively.
How are customer service and support?
I would rank their support close to a 10; they are very responsive.
How was the initial setup?
The initial setup with Amazon EKS is straightforward, and all of our teams are involved with AWS. Some use EC2 containers, and I've seen everything there is to see in AWS.
What other advice do I have?
Amazon EKS is easy to use, and you don't experience the problems that you encounter with some solutions like EC2 containers or S3 buckets.
The support for AWS tools integration with Amazon EKS has influenced our management process significantly. The cloud affects everything, with probably 80% of our stuff in the cloud, mainly in AWS, though we have some in Google Cloud and Azure.
In my role, I don't set anything up; I set up test disaster scenarios and have teams practice against those scenarios. We have encountered challenges during that migration process, and we utilize our AWS dedicated person whenever something arises that we can't handle or don't understand, so we can seek their help.
Amazon EKS helps us manage complex workflows effectively.
On a scale of 1-10, I rate Amazon EKS a 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has improved deployment efficiency and eliminated manual infrastructure management
What is our primary use case?
I'm actually working for a company that uses AWS as a cloud platform, and for our clients, we use Amazon EKS. We utilize multiple clusters and other requirements, making Amazon EKS our choice for deployment service or orchestration service.
The usual use case for Amazon EKS is to deploy an application intended for heavy user load and traffic. In technical terms, there are multiple services to choose from, but we choose Amazon EKS for its orchestration, load balancing, and auto-scaling capabilities. With this service, you don't have to worry about manual scaling or manual load balancing. Before Kubernetes, manual intervention was needed to scale applications, leading to potential crashes if capacity was exceeded. Amazon EKS alleviates those concerns with its auto-scaling feature, where predefined thresholds automatically trigger the launch of additional resources to handle increased traffic. Amazon EKS also allows configurations such as minimum and maximum server counts, ensuring scalability while minimizing costs.
What is most valuable?
The features of Amazon EKS that I find most valuable include load balancing, auto-scaling, networking, security, and scalability.
Scalability in Amazon EKS refers to the ability to automatically scale up or down your application based on traffic needs. For instance, if you initially expect 10 users but suddenly have 20, Amazon EKS automatically handles the scaling, thereby preventing application crashes and maintaining service availability.
Reliability is crucial when running an application on Amazon EKS, as it ensures your application never crashes. With Amazon EKS, you don't manage the infrastructure yourself; Amazon takes care of it all. You simply need to deploy your container, select the required configurations, and Amazon EKS handles the rest without requiring you to manage the underlying resources.
I have utilized Amazon EKS's integration with IAM, which stands for identity and access management. IAM restricts access to services, ensuring only authorized personnel can access certain capabilities. This prevents mistakes or unauthorized actions, maintaining security throughout the platform.
The support for AWS tools integration in Amazon EKS influences our application development and management significantly. With integrated features related to security, scalability, and billing, we ensure the efficiency of our processes. At my company, we manage around 600 clusters on Kubernetes and emphasize reliability by integrating Amazon EKS with various third-party applications. This integration aids in deployment, security, and ultimately, efficiency, as it ensures that applications remain available and perform efficiently.
What needs improvement?
One area of Amazon EKS that could be improved is the manual process for adjusting the number of nodes. When I've already defined configurations in Docker or YAML files, it seems unnecessary to go back and make similar adjustments in the console.
For how long have I used the solution?
I have been working with Amazon EKS for 4.7 years.
How are customer service and support?
I do not often communicate with the technical support and customer service of Amazon EKS.
Which other solutions did I evaluate?
Currently, I am using GKE in Google Cloud, which is similar to Amazon EKS. The differences between GKE, Amazon EKS, and AKS mainly come down to minor functional variations; overall, they provide similar capabilities.
What other advice do I have?
Regarding the pricing and licensing of Amazon EKS, I am not entirely certain, but from my perspective, it's somewhat comparable to AWS's compute instances. While it may be on the pricier side due to being a managed service provided by Amazon, the features and functionalities justify the cost, especially for applications requiring reliability and scalability.
I participate in the setup and deployment of Amazon EKS, though I don't do it directly through the console. I use a third-party application called Argo CD, which allows me to deploy Kubernetes applications without accessing the Amazon console directly, making the process efficient and straightforward.
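For readers unfamiliar with this workflow, the sketch below generates a minimal Argo CD `Application` manifest, the resource that tells Argo CD which Git path to sync into which cluster namespace. The repository URL, application name, path, and namespaces are hypothetical placeholders, and the automated sync policy is one possible choice rather than the reviewer's configuration.

```python
import json


def argocd_application(name: str, repo_url: str, path: str,
                       namespace: str, revision: str = "HEAD") -> dict:
    """Build an Argo CD Application manifest as a plain dictionary."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Where the Kubernetes manifests live in Git.
            "source": {"repoURL": repo_url, "path": path, "targetRevision": revision},
            # Deploy into the same cluster Argo CD runs in.
            "destination": {"server": "https://kubernetes.default.svc",
                            "namespace": namespace},
            # Keep the cluster in sync with Git automatically.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


if __name__ == "__main__":
    app = argocd_application(
        "orders-api",                                  # hypothetical app name
        "https://example.com/acme/deployments.git",    # hypothetical repository
        "apps/orders-api",
        "orders",
    )
    # JSON is valid YAML, so this output can be applied with kubectl.
    print(json.dumps(app, indent=2))
```

Once such a resource exists, deployments happen by committing to the Git repository, which is why the AWS console is not needed for day-to-day releases.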
On a scale of one to ten, I rate Amazon EKS a nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has improved deployment efficiency and reduced admin overhead across cloud and edge environments
What is our primary use case?
Our usual use cases for Amazon EKS include IoT applications for edge computing devices, where we deploy some of our proprietary IoT applications to edge devices running in multiple locations, and artificial intelligence deployment to multiple systems, with a couple of them purely on the cloud where we manage bundled infrastructures into Amazon EKS for several proprietary customers.
What is most valuable?
The features and capabilities of Amazon EKS that I have found most valuable are the ease of deployment and, interestingly, the cost, which is not as expensive as setting up other cloud infrastructure.
Amazon EKS's support for AWS tools integration has influenced our application development and management process by being quite easy, with the integration being straightforward. Whenever issues arise, we talk to the support team who provide us with documentation, which is how we basically sort out most of those issues.
Amazon EKS's self-healing nodes help minimize administrative burdens in my organization: they reduce the need for a large team to handle issues, and the experience has been seamless for us.
What needs improvement?
An area of Amazon EKS that could be improved in the future is its use for edge computing, which has been a small issue for us, especially since most of our recent work has been on edge computing devices such as Raspberry Pi and Jetson. If they could integrate something such as K3s, that would really be helpful, as K8s feels a little bit bulky for edge computing deployment.
For how long have I used the solution?
I have been working with Amazon EKS since last year, when we started moving some of our solutions to AWS EKS.
What do I think about the stability of the solution?
My experience with the stability and reliability of Amazon EKS has been very positive, with only a couple of intermittent shutdowns previously, but recently there have been no issues at all.
What do I think about the scalability of the solution?
My impression of Amazon EKS's scalability has been positive, though we have not done very large-scale deployments. Most of what we've done has been on a much smaller but continuous scaling for multiple systems, and there has not been an issue on that aspect so far, although we haven't scaled up to a million or five million devices yet.
How are customer service and support?
I often communicate with Amazon EKS technical support, as they have been our main go-to people.
An example of my interaction with Amazon EKS technical support was during the initial setup when we talked with them, and they provided us an easier route by suggesting how we should bundle our solutions in Docker for easy deployment.
Which solution did I use previously and why did I switch?
Before Amazon EKS, I did not use a different solution for these use cases, as our path has always been with Amazon EKS.
How was the initial setup?
My experience with the initial setup of Amazon EKS was straightforward with no challenges at all on my part, although the interns might complain about some snags. It's basically about studying the documentation.
What about the implementation team?
My setup process involves building the application on GitHub, bundling it in Docker, and connecting with Amazon EKS.
What was our ROI?
I have seen a return on investment with Amazon EKS in terms of time, as it has helped me save a significant amount of it. However, the cost side has not been as positive since some of the applications are billed in dollars, leading to complaints from customers and ourselves about the cost, particularly when providing services to customers across Nigeria and some other African countries. The return on investment has not been great due to the foreign exchange rate, but for time savings, it has been wonderful in helping with deployment.
What's my experience with pricing, setup cost, and licensing?
My opinion on the pricing and licensing of Amazon EKS is that it is quite varied, especially when doing projects in the African continent. It's quite expensive considering the local currency with respect to the conversions to dollars or euros, and if this could be lowered, it would help more deployments on our side with Amazon EKS.
Which other solutions did I evaluate?
Before choosing Amazon EKS, I evaluated other options or technologies, including Kubernetes on AWS, Google Cloud, and Microsoft Azure, but most of our experiences came from AWS, so we stayed with AWS.
What other advice do I have?
I have not utilized Amazon EKS's integration with IAM solution.
I have not encountered specific benefits using Amazon EKS's automated patching feature for my Kubernetes clusters, but it has been satisfactory as we haven't actually had many issues with using Kubernetes.
The impact of Amazon EKS on my organization's ability to manage complex workflows effectively has been purely managed by my colleague, and it has been quite seamless with no issues on that particular aspect.
Some of the benefits and positive impact that I have received from Amazon EKS include getting cloud credits through Activate and certain deployments around migration, which have been quite beneficial, along with business support credit and support during certain issues. During the initial times of integrations and migrations, AWS connected us with more specialists in different locales with much more experience while also paying for their services.
Based on my overall experience with Amazon EKS, I would rate it an eight out of ten due to the lack of K3s from Rancher.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has simplified managing microservices and improved security through automation and integrations
What is our primary use case?
Our use cases for Amazon EKS include deploying and managing microservice-based applications, where Kubernetes excels at orchestrating microservices and Amazon EKS handles the heavy lifting of managing the control plane. We also use it for application modernization, such as migrating legacy applications to containers, and for hybrid and multi-cloud deployments, running Kubernetes workloads across on-premises and cloud environments. Additionally, we run workloads that require strict security and compliance, utilizing AWS IAM, VPC, and security services.
Furthermore, we leverage CI/CD pipelines to automate build, test, and deployment processes, and for machine learning, we implement SageMaker.
What is most valuable?
The most valuable features of Amazon EKS are the managed Kubernetes control plane, where AWS handles the provisioning, scaling, and maintenance, ensuring high availability and automatic patching. Integrations with AWS services offer seamless access, such as IAM for access control, CloudWatch for monitoring, ELB and ALB for load balancing, and storage options including EBS, EFS, and S3. In terms of security and compliance, we utilize fine-grained access control through IAM roles for service accounts, support for private clusters, and network policies.
Amazon EKS supports both EC2 for full control over nodes and Fargate for serverless Kubernetes pods.
The positive impacts I have seen from using Amazon EKS include enhanced security and compliance with a managed control plane, automatic patching, updates, IAM integration for secure access to AWS services, private clusters, network policies, and encryption options. Additionally, I experience operational efficiency, scalability, performance, developer productivity, flexibility, portability, and observability and monitoring through CloudWatch, Prometheus, Grafana, and OpenTelemetry, which assists in troubleshooting issues and optimizing resources, ultimately leading to cost optimization.
What needs improvement?
Areas for improvement within Amazon EKS include the management of infrastructure. Prior to using Amazon EKS, we handled manual provisioning, patching, and scaling of our Kubernetes cluster, but now AWS manages control plane operations, automatic patching, and scaling, which has reduced our operational burden and resulted in fewer infrastructure-related incidents.
I believe only operational management could be improved in the next releases of Amazon EKS.
What do I think about the stability of the solution?
When it comes to stability and reliability in Amazon EKS, the reliability of the control plane managed by AWS is paramount, running across three availability zones in each region to ensure high availability and fault tolerance. AWS also automatically manages the scalability and health of crucial components such as the Kubernetes API server and etcd cluster. We have options for worker nodes, including auto mode, Fargate, managed node groups, and self-managed nodes, ensuring data plane reliability.
What do I think about the scalability of the solution?
Regarding scalability in Amazon EKS, we see managed node groups and Fargate profiles, where we can automatically scale the number of EC2 instances in a node group using Cluster Autoscaler or Karpenter. For serverless pods, Amazon EKS can scale without managing EC2 nodes, and we can utilize horizontal pod auto-scaling based on CPU, memory, or custom metrics, along with support for cluster limits, multi-cluster, and multi-region load scalability.
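As a rough illustration of the horizontal pod autoscaling mentioned above, the Kubernetes documentation describes the core rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below is only an approximation of that rule: the function name and the sample numbers are assumptions made for illustration, and it leaves out the minimum and maximum replica bounds, pod readiness handling, and stabilization windows that the real controller also applies.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Approximate the documented HPA scaling rule (hypothetical helper).

    tolerance mirrors the controller's default 10% band, inside which
    no scaling action is taken.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Close enough to the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Illustrative numbers only: 4 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90.0, 60.0))
```

With these sample inputs the ratio is 1.5, so the sketch asks for more pods; a ratio inside the tolerance band would leave the count unchanged.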
Amazon EKS is highly scalable, showing improvement in areas such as infrastructure management, security, and cost efficiency, with features such as auto-scaling for pods and nodes, making it suitable for bursty and high-demand workloads.
Which solution did I use previously and why did I switch?
Before using Amazon EKS, we relied on self-managed Kubernetes on EC2 as well as Docker Swarm for our workloads.
We decided to switch from Docker Swarm to Amazon EKS because it is a managed service that simplifies the handling of complex, scalable, and modern application workflows.
How was the initial setup?
Setting up Amazon EKS for the first time involves prerequisites such as installing and configuring the AWS CLI, then installing `kubectl`, and while `eksctl` is optional, I install it for easier setup. IAM permissions are also needed to create EKS resources.
My experience with the initial setup has been straightforward, and I did not face any challenges so far, especially with `eksctl`, although there are common challenges such as IAM role configuration, network complexity, and cluster access control.
What was our ROI?
We have managed to estimate savings of around 20 to 40% using Amazon EKS, specifically achieving savings on Fargate ranging from $30 to $45 per month based on our usage.
What's my experience with pricing, setup cost, and licensing?
I consider Amazon EKS to be an affordable product overall.
Which other solutions did I evaluate?
Before choosing Amazon EKS, I did not evaluate other solutions as I found it to be the best one for us after checking the market.
What other advice do I have?
The integration of Amazon EKS with IAM enhances our authentication process, as IAM users or roles can be granted access to the Kubernetes API server. This is managed via the `aws-auth` ConfigMap in the EKS cluster, which allows us to map IAM roles or users to Kubernetes RBAC roles.
When it comes to Amazon EKS integrating IAM into application development, we utilize IAM roles for service accounts that allow our application pods to securely access services such as S3 and DynamoDB without storing credentials. We first create a Kubernetes service account and associate it with IAM roles using annotations, enabling the pod to use this role to access AWS services via temporary credentials, providing a significant developer benefit by eliminating the need to manage secrets manually and ensuring access is secure and scoped per pod.
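The service account step described above can be sketched as follows. This is a minimal illustration, not our actual configuration: the service account name, namespace, and role ARN are placeholders, and the helper function is hypothetical. The one fixed detail is the annotation key `eks.amazonaws.com/role-arn`, which is how EKS links a service account to an IAM role. The sketch also assumes the role's trust policy already trusts the cluster's OIDC provider.

```python
import json


def service_account_manifest(name: str, namespace: str, role_arn: str) -> dict:
    """Build a Kubernetes ServiceAccount manifest annotated with an IAM role."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Annotation key that ties the service account to the IAM role.
                "eks.amazonaws.com/role-arn": role_arn,
            },
        },
    }


if __name__ == "__main__":
    manifest = service_account_manifest(
        name="s3-reader",  # hypothetical service account name
        namespace="default",
        role_arn="arn:aws:iam::111122223333:role/example-s3-reader",  # placeholder ARN
    )
    # kubectl accepts JSON manifests, so this output could be saved and applied.
    print(json.dumps(manifest, indent=2))
```

Pods that set this service account in their spec then receive temporary credentials for the role, so no access keys need to be stored in the cluster.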
The benefits of Amazon EKS's automated patching feature for our Kubernetes clusters primarily include improved security through the automatic application of critical security patches to the control plane and worker nodes, which reduces exposure to known vulnerabilities such as CVEs and ensures compliance with security standards. A second benefit is the reduction of operational overhead, and thirdly, enhanced cluster stability, minimized downtime, and consistency across environments. With intelligent patch management, Amazon EKS often tests patches before release.
When it comes to managing complex workflows effectively on Amazon EKS, I find that it simplifies infrastructure management by abstracting away the complexity of managing Kubernetes control planes, allowing us not to worry about patching, scaling, or securing the master nodes. It also supports scalability for high-demand applications with auto-scaling features for both pods and nodes and provides enterprise-grade security.
I utilize the AWS EKS official documentation, accessible via docs.aws.amazon.com.
My impression of the documentation is that it is very easy to learn from scratch, making it accessible even for beginners, as it is comprehensive, well-structured, and production-ready. Especially for developers and DevOps engineers such as myself, we find the user guide, best practice guide, API reference, CI tools, and workshops to be highly reliable, developer-friendly, scalable, and flexible for deployment needs.
On a scale of 1-10, I rate Amazon EKS an 8.
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has supported end-to-end administration of complex workloads with seamless deployment and monitoring
What is our primary use case?
I have been working in this field since I started working in the cloud, and I have worked with AWS and Amazon EKS because all of my customers use Amazon EKS or other application container solutions.
My use cases for Amazon EKS involve working for an AWS partner, one of the biggest in LATAM, where we have many customers or clients around the world with different solutions. I have a client that had many applications in Amazon EKS, and I had the full administration of the cluster. I manage not only networking, pods, or resources but also security, monitoring, the billing for the expenses, access to the cluster, and to the AWS account. I think that is a big field that I can handle within Amazon EKS.
What is most valuable?
What I appreciate most about Amazon EKS is that I use tools such as Datadog for monitoring and reporting, and if I have a problem with the cost, I use Prometheus, Grafana, and Loki, as this Prometheus package is cheaper than Datadog. I use Helm to install packages or applications, making it easier and secure to install, uninstall, or update them. I also use Argo CD for CI/CD workflows. In general, these are the main package solutions that I use for Amazon EKS.
What needs improvement?
Regarding the downsides of Amazon EKS, I remember a case where I used a network add-on different from what is provided by AWS because the pods request one IP address for each pod. The solution had many pods, and the blocks in the VPC were limited. I didn't have enough addresses to assign to the pods, and I had to change the add-on to handle the IP address using another third-party solution not from AWS. This was one of the first challenges I encountered with Amazon EKS.
Since then, AWS still hasn't fixed this issue or given me an opportunity to use the IPs that I needed.
Basically, the problem was that we did not have enough IP addresses for the pods, and we had to change the network add-on in Amazon EKS.
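To show why the address shortage appears, here is a small sketch that estimates how many pod IPs a set of subnets can supply when every pod takes one VPC address. The subnet ranges and pod count are made-up examples, and the helper names are hypothetical. The only fixed fact used is that AWS reserves five addresses in every subnet. The estimate is optimistic, since nodes, load balancers, and pre-allocated warm addresses also consume IPs.

```python
import ipaddress
from typing import Iterable

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one


def usable_ips(cidr: str) -> int:
    """Addresses in a subnet that AWS leaves available for assignment."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def pod_ip_headroom(subnet_cidrs: Iterable[str], planned_pods: int,
                    other_ips: int = 0) -> int:
    """Remaining addresses after pods and other consumers; negative means a shortfall."""
    total = sum(usable_ips(cidr) for cidr in subnet_cidrs)
    return total - other_ips - planned_pods


if __name__ == "__main__":
    # Hypothetical example: two /24 subnets and 600 planned pods.
    subnets = ["10.0.1.0/24", "10.0.2.0/24"]
    print(pod_ip_headroom(subnets, planned_pods=600))
```

A negative result means the subnets cannot give every pod its own VPC address, which is the situation that pushed us to a different network add-on.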
For how long have I used the solution?
I have been using Amazon EKS in my career for three and a half years since I worked in the cloud.
What was my experience with deployment of the solution?
It took me approximately one month to learn how to use Amazon EKS. The tricky part was implementing Amazon EKS cluster completely from Terraform, as creating all of the resources is challenging. If you create the cluster from the AWS console, many resources are created behind the scenes, but in Terraform, you must create each resource one by one, which was quite difficult. Overall, it took about one month.
What do I think about the stability of the solution?
Regarding stability, I have not experienced any lagging, crashing, downtime, or instability with Amazon EKS. I think that Amazon EKS, and Kubernetes in general, is stable. I have had problems with billing because it's expensive, but not with stability.
What do I think about the scalability of the solution?
When it comes to scalability, I use Karpenter with Amazon EKS because it is a tool that offers significant granularity for configuration, and it works really well and fast. The inherent scalability of Kubernetes is not the best for me based on resources, but Karpenter works really nicely.
How are customer service and support?
I have contacted the technical support of Amazon EKS when I had issues with the IP addresses, and they helped me solve it by installing another network add-on. On a scale from 1 to 10, I would give the support of Amazon EKS a nine because it was nice and fast.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I have used a direct alternative to Amazon EKS, which is ECS in AWS. I have many clients using ECS, Elastic Container Service, which is the native service for containers in AWS. It's not the same as Amazon EKS; it's another orchestrator, but it works fine when your application is not big.
How was the initial setup?
The initial deployment with Amazon EKS was easy, but I remember that my first deployment was done with a YAML file, just using kubectl. I used kubectl run and the YAML file, and after that, I learned about Argo CD, which made the process much easier.
What other advice do I have?
Amazon EKS does require maintenance on my end. Last year, Kubernetes had many updates, which made upgrading a difficult task. This is why I use Helm to install all of the applications in the cluster. If the application is built for you, you can create the Helm chart for this application and install it using this tool. I think that is the best option when you need to update the cluster. I know that AWS now offers many new application add-ons in the console, making it easier, but I still think that maintenance is one of the most complicated aspects of the cluster.
On a scale of 1-10, I rate Amazon EKS a nine.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Reliable integration streamlines complex workflows but cost-management and specific role configurations need enhancement
What is our primary use case?
We use Amazon EKS for hosting our applications. It is also a version of compute service with its own perks. Amazon EKS is built on Kubernetes. Kubernetes is complex and is not something used for basic or simple applications due to its complex nature. It is really meant for complex apps such as banking applications or AI-enabled applications that have many services.
When dealing with microservices, if an application has around 20 microservices, then Kubernetes generally begins to make sense. Then it becomes a question of whether to host it on Amazon EKS, Azure Kubernetes Service, or Google Kubernetes Engine. That is basically what we use Amazon EKS for.
What is most valuable?
Amazon EKS is fairly reliable. The latest feature that was added last year, Amazon EKS auto mode, helps manage compute instances and EC2 instances. Amazon EKS auto mode is a very good addition as it helps reduce stress since users do not have to worry about upgrading Kubernetes versions. For example, when Kubernetes 1.34 is released, Amazon EKS handles the upgrade automatically.
Another beneficial feature of Amazon EKS is the Fargate offering. It lets us run some workloads on AWS Fargate, so they consume compute only when needed. Unlike typical EC2 instances, which keep accruing charges for as long as they are turned on, Fargate charges apply only while a pod is running. For instance, in a banking app with multiple services, including a reviews service, Fargate can be used so that the reviews service incurs charges only while its pods are running to serve that feature.
Amazon EKS is fairly stable and highly available. Once configured properly, it requires minimal maintenance. It integrates effectively with other services such as API Gateway, security groups, and load balancers.
What needs improvement?
The integration capabilities could be improved compared to Azure. While AWS services are integrated with Amazon EKS, there is room for enhancement.
For example, Azure DevOps provides better pipeline integration. When writing pipelines in Azure DevOps, users can easily import various built-in tasks into pipeline YAML files, such as kubectl tasks or native Kubernetes plugins, once a service connection to Azure is created.
We encountered challenges with WebSocket integration when implementing chat functionality on Amazon EKS. The chat service, which was part of our microservices running on Amazon EKS, needed to be exposed on application load balancer. Despite both application load balancer and network load balancer having native WebSocket integration on AWS, the connections were unstable. This required extensive tweaking of network load balancer configurations to manage API calls through the API Gateway. AWS could improve WebSocket integration across API Gateway, network load balancer, and Amazon EKS.
For how long have I used the solution?
We have not used it recently because we prefer to make patches ourselves.
What do I think about the stability of the solution?
When auto mode is enabled, self-healing functionality becomes active. If a node encounters issues or someone makes incorrect configurations, Amazon EKS automatically resets it to maintain standard configurations. This is particularly useful when someone connects to Amazon EKS instances over SSH and modifies Linux kernel configurations, as the self-healing node resets them to normal, helping reduce administrative burden.
How are customer service and support?
We only escalated questions regarding increasing CPU and memory allocations for Fargate. We contacted AWS through their service quota system. The process required submitting a request with justification for increasing the quota for CPU and memory on Fargate. The resolution was quick after providing a brief justification for the quota increase.
How would you rate customer service and support?
How was the initial setup?
The setup process is very straightforward.
What other advice do I have?
When considering Amazon EKS, it is important to use Infrastructure as Code (IaC), not just Terraform. Having a repeatable configuration of infrastructure as code is essential for creating clusters, as manual cluster creation is not common in professional production environments.
It is crucial to consider Fargate carefully, as it can help save costs. Fargate is particularly useful when parts of an application or the entire application are not used constantly, as it can reduce costs compared to running on EC2 instances.
On a scale of 1-10, this solution rates as an 8.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Accelerate development and streamline resource management with seamless integration
What is our primary use case?
The main use cases for Amazon EKS are that we use it normally in some new projects to optimize our costs. Instead of having many ECS services running, we prefer to set up a Kubernetes cluster and set everything there. For me, it is primarily for optimizing our resources.
What is most valuable?
What I find valuable about Amazon EKS is that it manages the Kubernetes control plane for us. It does not manage the workload; it manages the main part of Kubernetes, the head of the whole cluster. Automatic updates are available, and we can use everything we created in AWS from Kubernetes, including IAM configuration. We can create policies such as creating a private endpoint for S3. The integration of Kubernetes with the AWS ecosystem is the best feature that Amazon EKS provides.
The IAM integration in Amazon EKS helps enhance the authentication processes because we can do this in a more granular way. Using IAM, you can set exactly what the service needs. If a service or application needs to upload objects or data to S3, connect to RDS, or perform other tasks, using IAM is the easiest way. The benefit is that it works in a granular way and it's easy to set up and validate. When you examine the permissions and rules to ensure everything has the correct permission at the correct moment, using IAM is perfect because you can validate and set up everything effectively.
Amazon EKS's support for different AWS tools integrations has accelerated our application development because we can think about all aspects comprehensively. We can architect using AWS services and objects, and Amazon EKS accepts this seamlessly. We don't need to translate the idea for AWS. We can write this idea using AWS objects and services, and Amazon EKS corresponds to that. It accelerates projects and is easy to manage because we can use Terraform to implement it.
I am using the self-healing nodes in the Amazon EKS solution. We have a client with a production workload running on spot instances. When a spot instance or node crashes, Amazon EKS starts a new node and moves everything over before the old node stops. This self-healing is excellent because we don't experience disruptions. We don't face situations where a node stops and we need five minutes to start a new one. We use it in specific environments and can observe the difference when enabling or disabling Amazon EKS self-healing.
We are utilizing the automated patching in Amazon EKS. The valuable benefits I have experienced using the automated patching feature for the Kubernetes clusters directly increase security. Kubernetes typically releases patches focused on security rather than new features. It's beneficial because we can focus on our work without constantly thinking about new patch releases or upgrade deployments. Amazon EKS handles this automatically for us.
What needs improvement?
We face some issues with Amazon EKS when using the node group to control which nodes can start. We have a limitation where we need to set just one kind of instance - only large instances, only small instances, or only extra-large instances. This is a problem. It would be beneficial if we could specify that certain containers or services start on small instances rather than large ones.
I am uncertain whether Amazon EKS supports all LTS versions, and I think this would be something beneficial. Additionally, AWS has great AI features, so when we need to make updates to Amazon EKS, it would be helpful if AI could assist with planning, identifying migration requirements, and considering costs.
For how long have I used the solution?
I have been working with Amazon EKS for about two years in production. Including study time and other experiences, I have been involved with it for approximately four years.
What was my experience with deployment of the solution?
I faced challenges in the initial stages with Amazon EKS. The main challenge is that when we set up the cluster, it appears as a huge infrastructure just for a small application. When you set up Amazon EKS, it is configured at a large scale by default. You can't start small and gradually expand. This makes sense because for smaller applications, ECS works effectively. If you want a more integrated ecosystem, you can use Amazon EKS. The challenge lies in migrating everything, as you can't start using Amazon EKS on a small scale. It typically requires a big cluster with one, two, or three nodes. We also faced challenges with developers needing to adapt their mindset to the new way of doing things.
How are customer service and support?
I have escalated questions to the technical support of Amazon EKS two or three times, and they always provided good solutions. When we don't understand the questions, we schedule a call to demonstrate the issue, and we always receive the correct answer.
I reached out for technical support with Amazon EKS because we faced issues starting a service. The way we declared the services was incorrect, but we weren't aware of this. We called AWS support for assistance. Another issue involved a security problem that we identified and reported to AWS.
I would rate the technical support of Amazon EKS a 10. The documentation is good, and when human interaction is needed, it's readily available.
How would you rate customer service and support?
What other advice do I have?
From my perspective, I don't see any disadvantages of Amazon EKS compared to competitors in the market. Amazon EKS represents the state of the art. While Google has a powerful engine that offers more granular control, the additional configuration can be overwhelming. Amazon EKS balances the power of custom configuration with ease of setup.
I find the pricing of Amazon EKS complicated because I live in Brazil, where we use reals. With the exchange rate and taxes, the price appears six times higher. However, when viewed in dollars, it offers great features at reasonable pricing. Lower prices are always beneficial, and a reduction in hourly cost or promotional discounts would be appreciated, but the current price-to-benefit ratio is worthwhile.
My advice to other organizations considering Amazon EKS for their environment is to plan carefully. I strongly recommend planning and reading the documentation because Amazon EKS is resourceful and typically offers multiple ways to accomplish the same task. Careful planning, reviewing case studies for comparison, and thoughtful migration to Amazon EKS are worthwhile investments. Overall, I rate Amazon EKS a 9 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Experience with the setup and configuration has been positive, with seamless integration into the existing infrastructure
What is our primary use case?
I have a total of around four years of experience in multiple clouds, especially in AWS, and many times I used Amazon EKS for our multiple products and projects.
In our environment, all our other infrastructure and services already run on AWS, so we benefit from using Amazon EKS because those services can easily communicate with it. For example, some of our services need to access S3, where our application objects reside, so we can easily integrate them with Amazon EKS. We also use IAM roles to provide granular access to resources. Most of our clients and resources reside in AWS, which is why we prefer to deploy other services there, and most of our development environment uses Lightsail. This gives us an edge, allowing us to easily move from development to staging or production environments within the same cloud.
What is most valuable?
When we compare other clouds, AWS has an edge among all the crowded options right now. My observations and reviews with AWS also affirm this because it provides a user-friendly experience and offers many options that other clouds do not provide. When my team and I work with AWS, we always feel comfortable. I do not know the exact reason behind it; maybe we have a lot of previous experience, and we are familiar with AWS. That is why other reviews from my colleagues at previous companies indicate that AWS has some edge compared to other clouds.
I would recommend Amazon EKS to other organizations because it provides simple configuration, easy management, safety, granular access, and vast monitoring capabilities where we can easily monitor our clusters using CloudWatch. However, I would think about a clearer dashboard for Amazon EKS, but overall, I think it is sufficient.
What needs improvement?
When we deploy the application, we require a large number of instances. I have not yet experienced traffic of around 50,000 plus, so I have not hit out-of-capacity issues in AWS, and I hope I will not face them in Amazon EKS the next time we deploy with a large number of worker nodes.
Additionally, Kubernetes rolls out new releases rapidly, so it should be easier to upgrade Amazon EKS clusters in our production environment. Before we upgrade an Amazon EKS cluster, we need to check our backups, and we should have an option to export our configurations, for example to S3, so that they can serve as backups. Other tools, such as Velero, provide this functionality to back up configurations, so I hope a built-in backup process will help us fulfill our backup policies and other requirements.
For the pricing aspect of Amazon EKS, one specific issue arises when we deploy applications, especially as we provide SaaS services to our clients. We would like to know the cost for each customer, but we face issues because AWS charges $70 USD for the Amazon EKS engine. We struggle to divide the worker nodes' fees and the engine cost among clients, as some users have low traffic and visibility while others have large amounts of visibility and traffic. Thus, we face cost-related issues when running multiple customers on the same Amazon EKS cluster.
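One possible way to approach the split described above is to divide the fixed engine fee evenly and the worker node cost in proportion to each customer's measured usage. The sketch below only illustrates that idea and is not something we run today. The node cost, tenant names, and usage figures are invented, the usage unit (for example CPU-hours) is an assumption, and the function is hypothetical; only the $70 engine fee comes from our bill.

```python
from typing import Dict


def allocate_cluster_cost(control_plane_cost: float, node_cost: float,
                          usage_by_tenant: Dict[str, float]) -> Dict[str, float]:
    """Split the fixed fee evenly and the node cost by each tenant's usage share."""
    if not usage_by_tenant:
        raise ValueError("at least one tenant is required")
    tenants = len(usage_by_tenant)
    total_usage = sum(usage_by_tenant.values())
    fixed_share = control_plane_cost / tenants
    result = {}
    for tenant, usage in usage_by_tenant.items():
        if total_usage > 0:
            variable_share = node_cost * usage / total_usage
        else:
            # No recorded usage at all: fall back to an even split.
            variable_share = node_cost / tenants
        result[tenant] = round(fixed_share + variable_share, 2)
    return result


if __name__ == "__main__":
    # Invented figures: $70 engine fee, $900 of worker nodes, usage in CPU-hours.
    usage = {"tenant-a": 100.0, "tenant-b": 300.0, "tenant-c": 600.0}
    for tenant, cost in allocate_cluster_cost(70.0, 900.0, usage).items():
        print(tenant, cost)
```

The hard part in practice is collecting trustworthy per-customer usage, for example by giving each customer its own namespace and reading metrics per namespace; the arithmetic itself is simple once that data exists.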
For how long have I used the solution?
I have been using the solution for approximately four years.
What was my experience with deployment of the solution?
The initial setup and deployment of Amazon EKS was straightforward; I easily provisioned the correct cluster. Sometimes, due to the deprecation of the AMIs, AWS provides warnings to use the latest version of the EKS AMIs because some of our scripts or Terraform scripts are old. I think it is good practice for AWS to show messages in the console prompting you to upgrade your cluster, and overall, my experience with provisioning the Amazon EKS cluster is good, and I highly appreciate it.
What do I think about the stability of the solution?
I have not faced any issues that require escalation to customer support, and until today, I do not feel any need to escalate anything to the support team. I am happy with the service.
What do I think about the scalability of the solution?
To meet our requirements, our services need a large amount of CPU and memory, so we need high-spec machines. When we deploy applications, we require a large number of instances.
How are customer service and support?
I have not faced any issues that require escalation to customer support, and until today, I do not feel any need to escalate anything to the support team. I am happy with the service.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I also work with other clouds, such as Oracle Cloud, but I feel more comfortable with AWS because I faced some issues there, such as running out of capacity when we designed the infrastructure for our large traffic.
How was the initial setup?
The initial setup and deployment of Amazon EKS was straightforward; I easily provisioned the correct cluster. Sometimes, due to the deprecation of the AMIs, AWS provides warnings to use the latest version of the EKS AMIs because some of our scripts or Terraform scripts are old. I think it is good practice for AWS to show messages in the console prompting you to upgrade your cluster, and overall, my experience with provisioning the Amazon EKS cluster is good, and I highly appreciate it.
What about the implementation team?
I used Amazon EKS through its console rather than through the AWS Marketplace.
What was our ROI?
I have not checked the pricing yet, but I will look into it to see if EKS brings a return on investment for us.
What other advice do I have?
Regarding self-healing nodes in Amazon EKS, I have not worked on that self-healing feature.
For automated patching in Amazon EKS, I have not used that feature.
Regarding disadvantages of Amazon EKS compared to competitors in the market, I think every cloud provider has the same Kubernetes engine and worker nodes. However, I believe AWS provides a more user-friendly environment, which is why many of our customers are trying to deploy their infrastructure or applications on AWS. I do not think there is any specific reason not to prefer Amazon EKS.
I have not integrated IAM tools with Amazon EKS yet, but my other teams have. I think they used Okta, but I'm not certain about it. I have some demos from a long time ago, but I think Okta is for SSO.
On a scale of one to ten, I rate Amazon EKS a nine out of ten.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)