High availability has boosted our AI reporting workflows but navigation and pricing still need work
What is our primary use case?
My main use case for Amazon EKS is hosting an AI agent that I recently built at my company. It is a sprint report agent that fetches the company's data from Jira and prepares a sprint report for the complete team. The agent runs on Amazon EKS nodes for high availability and reliability, with a GPT-OSS model in the back end.
I have designed this agent for high availability on Amazon EKS because if I hosted it on any other platform, there would be chances of downtime. I have set it up in such a way that if there is any downtime, a new node is already up and running so that my use case is not affected and the users can use it seamlessly without any issues.
What is most valuable?
The best features Amazon EKS offers in my experience are its reliability and high availability, and also the support that AWS provides. The reliability and high availability helped me specifically because configuration issues on my end caused the agent to go down repeatedly. Thanks to the high availability features, every time a node went down, a new node was brought up automatically, so the AI agent stayed up while I checked the logs and corrected the issues. AWS support is also very helpful; whenever I face issues regarding costs or anything else, I just create a ticket and they assist me in resolving them.
Amazon EKS has positively impacted my organization by improving the efficiency and working capacity of my team. Because of how reliably Amazon EKS works, we worry less about whether the platform is functioning. Since the infrastructure is maintained by Amazon EKS, we can divert our attention to other tasks and perform well there too.
What needs improvement?
Functionality-wise, Amazon EKS is acceptable, but navigation in the AWS console could be improved; it could be more interactive and more intuitive for new users. The pricing could also be reviewed, as it is sometimes a bit high, particularly for extended support. When a new version is released, we cannot always upgrade right away because of production workloads, and clusters running on extended support cost six times more, which is something that could be reduced.
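To make the "six times more" remark concrete, here is a minimal sketch of the monthly control plane fee for one cluster. The two hourly rates and the 730-hour month are assumptions used for illustration, not figures from the review; check the current AWS pricing page before relying on them.

```python
# Rough monthly control plane fee for a cluster on standard vs extended support.
# Both hourly rates are ASSUMPTIONS for illustration; verify against AWS pricing.
HOURS_PER_MONTH = 730          # common billing approximation for one month
STANDARD_SUPPORT_RATE = 0.10   # assumed USD per cluster-hour
EXTENDED_SUPPORT_RATE = 0.60   # assumed USD per cluster-hour


def monthly_cluster_fee(rate_per_hour: float, clusters: int = 1,
                        hours: int = HOURS_PER_MONTH) -> float:
    """Return the fixed control plane fee in USD, rounded to cents."""
    return round(rate_per_hour * hours * clusters, 2)


standard = monthly_cluster_fee(STANDARD_SUPPORT_RATE)
extended = monthly_cluster_fee(EXTENDED_SUPPORT_RATE)
print(f"standard: ${standard:.2f}/month, extended: ${extended:.2f}/month")
print(f"extended is {extended / standard:.1f}x the standard fee")
```

This covers only the per-cluster fee; node, storage, and data transfer charges are separate and usually larger.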
I believe documentation could be improved on the AWS website so a new user who is starting with Amazon EKS could work much better with it.
For how long have I used the solution?
I have been working in my current field for three years.
What do I think about the stability of the solution?
In my experience, Amazon EKS is stable.
What do I think about the scalability of the solution?
The scalability of Amazon EKS is good.
How are customer service and support?
The customer support of Amazon EKS is good. I would rate the customer support on a scale of one to ten as a seven.
What was our ROI?
I have seen a return on investment because the reliability has helped me save time; I can rely on Amazon EKS's reliability and then divert my attention to other tasks, so it has definitely saved my time. I estimate that my team and I save roughly one to one and a half hours a day since using Amazon EKS.
What's my experience with pricing, setup cost, and licensing?
Regarding pricing, setup cost, and licensing: the pricing for the cluster itself is a bit high, the setup cost is acceptable, and the extended support charges are also a bit pricey, which I think could be reduced.
What other advice do I have?
Amazon EKS is a good product; if you are starting with Kubernetes, it is a good choice, but the pricing is substantial, so you should review that. Also, regarding support, there are sometimes cases where you need to upgrade your AWS plan to get particular support, which can also be a bit pricey. I would rate this product a seven overall.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Modern microservices have delivered faster deployments and stronger security for our teams
What is our primary use case?
My main use case for Amazon EKS is to run and manage containerized applications at scale with high availability, security, and automated deployments.
I am using Amazon EKS to run microservices in production. I host stateless backend APIs and web services; each service runs in Docker containers and scales horizontally based on traffic. I chose Amazon EKS because of its Kubernetes-native scaling, self-healing, and rolling updates. I also use it for continuous deployment of containers: Docker images built in CI are deployed using Helm and manifests for versioned releases, with support for rolling, blue-green, and canary deployments. Security and access control are also major reasons for using it, with IAM-integrated authentication, network policies, and pod-level isolation. It also offers secrets management integrations and auto-scaling worker nodes, so the infrastructure scales from a few pods to hundreds without redesign.
The main use case for Amazon EKS is running and scaling production-grade, containerized microservices with automated deployments, high availability, and strong security on AWS.
What is most valuable?
The best feature Amazon EKS offers, especially for production workloads, is its fully managed Kubernetes control plane, where AWS manages the Kubernetes master, etcd, upgrades, and HA. The control plane runs across multiple AZs, providing high availability and resilience for the control plane and workloads, along with automatic pod restarts and self-healing. Native AWS IAM integration provides fine-grained access using IAM roles and policies, including IAM roles for service accounts (IRSA). Other great features include deep AWS ecosystem integrations, a standard Kubernetes experience, strong security and compliance, and managed upgrades and version control. If I have to name the top three features or the biggest impacts, they would be the managed control plane, the IAM plus IRSA security model, and auto-scaling with EC2 or Karpenter.
Amazon EKS has had a strong positive impact on our organization by improving key aspects that matter to every organization, such as reliability, scalability, deployment speed, and operational efficiency for containerized workloads.
What needs improvement?
There are one or two areas for improvement that I can suggest, starting with operational complexity. There is a steep learning curve for teams new to Kubernetes, with many moving parts such as VPC, CNI, IAM, add-ons, and node groups. Improvements could include better out-of-the-box defaults and a simplified setup and management workflow. Out-of-the-box observability could also be enhanced: monitoring and logging currently require multiple add-ons, and there is no single unified observability experience. Better built-in metrics, logs, and tracing, along with a native dashboard that does not need heavy setup, would be beneficial.
For how long have I used the solution?
I have been using Amazon EKS for more than four years.
What do I think about the stability of the solution?
Amazon EKS is quite stable.
What do I think about the scalability of the solution?
The scalability of Amazon EKS has been excellent and production-grade in my experience, as it scales both applications and infrastructure reliably with minimal manual intervention. The scaling practices I have worked with in Amazon EKS include pod-level scaling, node-level scaling, and traffic and load scaling, all of which have been great.
How are customer service and support?
Customer support has been great; I have reached out a few times, and the responses have been very quick, ensuring that any issues are resolved as soon as possible. I would rate the customer support a 10.
Which solution did I use previously and why did I switch?
I have not used a different solution; I am using Amazon EKS only.
How was the initial setup?
Pricing felt reasonable for production clusters given the managed control plane and high availability, although the fixed cost for smaller non-production clusters felt relatively high. Amazon EKS charges a fixed hourly fee per cluster, which applies regardless of workload size. In our experience, auto-scaling and right-sizing help control costs, and combining on-demand instances, spot instances, and scaling policies reduced compute spending. Our initial setup required moderate engineering effort, especially for teams new to Kubernetes, but using Terraform and AWS best practices significantly reduced setup time and errors.
What was our ROI?
We have seen a clear and measurable return on investment from using Amazon EKS, both in cost efficiency and operational productivity. Improvements in deployment speed and MTTR are evident, alongside infrastructure cost optimizations. Developer productivity and onboarding have also improved, leading to 60 to 70% faster onboarding and faster time to market. Additionally, engineering costs have decreased because processes that were previously manual are now automated, reducing the number of engineers needed to handle those tasks.
Which other solutions did I evaluate?
We evaluated some other applications and services, including Amazon ECS, which is Elastic Container Service, self-managed Kubernetes on EC2, and Docker Swarm. However, we decided to move to Amazon EKS because it proved to be more reliable and scalable than the others.
What other advice do I have?
I share every bit of advice that I feel is valuable regarding scalability at both the application and infrastructure levels, along with all the features that Amazon EKS offers. I share the positive impacts we have seen in our organization and team, including improved reliability and uptime, faster and safer deployments, and scalability without needing re-architecture. I provide metrics highlighting how it has improved our team's efforts and reduced manual tasks.
Amazon EKS has positively impacted deployment speed with a 65 to 75% reduction in deployment time: before Amazon EKS, a deployment took 30 to 60 minutes, and now it takes only 10 to 15 minutes. Rollback and recovery have also greatly improved, with a 75 to 85% reduction in MTTR: a manual rollback used to take up to 45 minutes, and an automated rollback now takes only 5 to 10 minutes. We have seen fewer production incidents, specifically outages caused by configuration drift and manual deployments, because Amazon EKS lets us perform deployments automatically; this has resulted in 50 to 60% fewer release-related incidents. Scalability and traffic handling are also great, as it can handle traffic spikes of two to three times the normal load without manual intervention, with auto-scaling triggered within minutes and zero downtime during peak loads. Operational efficiency has improved as well, with less time spent managing clusters and fewer failures, amounting to a 30 to 40% reduction in Kubernetes operational effort. In summary, after adopting Amazon EKS, we have reduced deployment time by about 70%, MTTR by over 75%, and release-related incidents by around 55 to 60%, significantly improving scalability and operational efficiency. Overall, it has been a great experience, and I find it very useful and helpful for my team and organization.
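The percentages above can be sanity-checked from the before and after timings with a short calculation. This is only a sketch: pairing the low ends together and the high ends together is an assumption, since the review does not say how its ranges were derived.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`, to one decimal place."""
    if before <= 0:
        raise ValueError("before must be positive")
    return round((before - after) / before * 100, 1)


# Deployment time in minutes: 30 to 60 before, 10 to 15 after (from the review).
deploy_low = percent_reduction(30, 10)
deploy_high = percent_reduction(60, 15)

# Rollback time in minutes: up to 45 before, 5 to 10 after (from the review).
rollback_low = percent_reduction(45, 10)
rollback_high = percent_reduction(45, 5)

print(f"deployment time reduction: {deploy_low}% to {deploy_high}%")
print(f"rollback time reduction:   {rollback_low}% to {rollback_high}%")
```

Under that pairing, the deployment figures land inside the stated 65 to 75% range, while the rollback figures come out slightly above the stated 75 to 85%.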
Amazon EKS is a great service to use or implement in an organization or team, and I would rate this product 8 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Managed Kubernetes workflows have streamlined deployments and improved our cloud automation
What is our primary use case?
I used Amazon EKS throughout my internship while working on Docker-based deployments, Kubernetes orchestration, and cloud infrastructure tasks at Cognizant.
I used Amazon EKS when working as an intern at Cognizant, where it was used to deploy, manage, and scale containerized web applications. Our workflow started with building Docker images, storing them in Amazon ECR, and deploying them to Amazon EKS clusters running on EC2 worker nodes with CI/CD pipelines. We used many tools; for example, Jenkins was part of our CI/CD pipelines that automated the build and deployment process.
Additionally, I worked on application deployments, updating Kubernetes manifests, managing pods and services, and verifying application health. Amazon EKS acted as a central platform that connected Docker, AWS infrastructure, and DevOps automation into one consistent system.
I used Amazon EKS during my internship at Cognizant as part of a cloud and DevOps-focused environment, where it served as the core Kubernetes platform to run containerized applications built with Docker, deployed through CI/CD pipelines, and hosted on AWS infrastructure. We deployed numerous web applications, and we wanted to learn Amazon EKS through dummy projects with dummy web interfaces. Beyond dummy projects, we also deployed some client websites into the Kubernetes environment and managed traffic, although I cannot name the clients.
Amazon EKS is an excellent choice for organizations already invested in AWS. I recommend having a solid foundation in Docker, Kubernetes basics, and AWS core services before implementing Amazon EKS. Using infrastructure-as-code tools and following AWS best practices can significantly improve maintainability and security. Amazon EKS is particularly strong for enterprise environments and microservice-based architectures.
Amazon EKS is ideal for teams already using Docker, CI/CD, and AWS infrastructure, which our team was already utilizing. I strongly recommend learning Kubernetes fundamentals and AWS networking, containers, and security before using Amazon EKS in production, as it positively impacted our organization by making it easy to connect all our existing AWS services.
I deployed Docker applications to Amazon EKS using CI/CD pipelines, integrating with EC2, ECR, IAM, and automated workflows.
What is most valuable?
The most valuable feature of Amazon EKS is the managed Kubernetes control plane, which allows teams to focus on DevOps and application workflow instead of cluster maintenance. Another major benefit is how naturally Amazon EKS fits into the AWS ecosystem, integrating with EC2 for compute, IAM for access control, Amazon ECR for container images, and CloudWatch for monitoring to create a complete DevOps pipeline. The self-healing nature of Kubernetes combined with AWS scalability makes the environment reliable and suitable for real-world workloads.
The most promising feature, which I prefer the most, is its integration with all the AWS services, including EC2, IAM, VPC, ECR, and CloudWatch, making it a key part of my workflow.
Amazon EKS works very well with Docker-based container workflows; it is highly scalable and self-healing, complemented by its rolling update capabilities.
What needs improvement?
There are some drawbacks regarding Amazon EKS; pricing can increase as clusters and workloads scale, and there is an initial configuration learning curve. A beginner has to learn about IAM, networking, and cluster setup. Improved built-in cost visibility and simplified monitoring tools would be beneficial.
Pricing can be improved, especially for small teams or learning projects. The initial setup, as well as understanding IAM, networking, and cluster configuration, can be complex for beginners; improving this would enhance the experience. Troubleshooting sometimes requires deeper AWS and Kubernetes knowledge, which could also use improvement.
What do I think about the stability of the solution?
Amazon EKS is very stable; during my usage, the clusters remained consistently available and workloads ran reliably. The self-healing capabilities of Kubernetes, combined with AWS managed infrastructure, help ensure minimal disruption to running applications.
What do I think about the scalability of the solution?
Amazon EKS is highly scalable; it supports horizontal scaling of pods and seamless scaling of worker nodes. It works with AWS auto-scaling to adjust node capacity, making it suitable for handling varying workloads and preparing the environment for real-world production demands.
How are customer service and support?
Support and documentation from AWS are very strong, with extensively available official documentation, and AWS support channels are very kind and responsive. Most issues can be resolved through AWS knowledge bases, documentation, and our organizational support teams.
Which solution did I use previously and why did I switch?
Previously, I used plain open-source Kubernetes. I wanted to switch to Amazon EKS because it connects with numerous AWS services, making it easier to deploy my applications than with open-source Kubernetes.
How was the initial setup?
The initial setup is moderately complex; creating clusters, configuring IAM roles, setting up the network, and connecting with the worker nodes requires careful planning. However, once the environment is configured, ongoing operations such as deployment, updates, and scaling are smooth and efficient.
What was our ROI?
Time is definitely saved because Amazon EKS works with automated CI/CD pipelines, so we simply monitor them, and if there is any fault, we see it immediately in the pipeline. As for needing fewer employees, I do not think that will happen, since only a few employees know about Amazon EKS deployments in the cloud and that knowledge is necessary. We do see a return on investment; we can save a lot of money by using Amazon EKS, as it is directly connected with all AWS services and can be integrated with source code hosting options such as GitHub and GitLab.
Which other solutions did I evaluate?
I did not evaluate many options; I evaluated some, specifically open-source Kubernetes, which I thought would be more difficult than Amazon EKS. Therefore, I chose Amazon EKS.
What other advice do I have?
The benefits that I observed after adopting Amazon EKS are improved deployment speed and reliability. Resource usage remains the same, but we can handle a larger amount of traffic and more application deployments without any issues. The user interface is very good, providing clarity on where the fault or error lies while deploying our application. I would rate this product nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Critical microservices have been managed reliably and support secure, flexible operations
What is our primary use case?
My main use case for Amazon EKS is deploying and sustaining the services and microservices of an application on critical infrastructure.
Our application has more than 20 services and microservices, such as authentication, login, account management, a notification service, and a billing service, which all work together to form a large, useful application.
What is most valuable?
The best features Amazon EKS offers are scalability and deployment control, the ingress configuration with path-pattern and host-header rules for routing to all the services and microservices, and the HPA configuration.
The most important aspect to me is scalability, because you can easily scale any service or microservice to handle sudden changes in connection flow safely. This is useful for the application and helps day-to-day by giving us reliability and stability, so we can perform all maintenance and deployment of our system.
Reliability is a very important thing. Security and operational consistency are very important aspects, and the flexibility offered in node management and network options is also valuable. Amazon EKS is a service that is reliable and scalable, and it gives us a solid and dependable solution.
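As an illustration of the HPA configuration mentioned above, the sketch below applies the replica calculation described in the Kubernetes documentation: scale the current replica count by the ratio of the observed metric to the target, round up, and skip changes inside a small tolerance. The function name, the default bounds, and the sample numbers are hypothetical; the real controller also handles missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Simplified HPA-style calculation: ceil(current * observed / target)."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured minimum and maximum replica counts.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical example: 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(current_replicas=4, current_metric=90, target_metric=60))
```

The minimum and maximum bounds matter in practice: they keep a noisy metric from scaling a service to zero or past what the node groups can hold.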
What needs improvement?
I think the documentation is sometimes not clear enough, and it is slow to provide in-depth instructions and examples for bigger, critical implementations. As a result, some difficulties take us a lot of time to understand, test, and put into production with security and guarantees.
For how long have I used the solution?
I have been using Amazon EKS for almost five years now.
What other advice do I have?
I advise doing a POC first: get all the details, test, and ensure very good alignment between the DevOps and development departments. Prepare the CDN and plan how connections get into your cluster, how you configure your ingress, and how each service or microservice is prepared to receive that traffic with secure and optimized code, processes, and communication with other resources. I would rate this product an 8.
Which deployment model are you using for this solution?
Private Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Running critical financial workloads has delivered reliable low-latency deployments
What is our primary use case?
I use Amazon EKS mainly for deploying my application onto the Kubernetes infrastructure, which is provided by the underlying Amazon infrastructure.
I have many services, many applications, and many web services and web applications which are deployed using Amazon EKS across different regions in Amazon web service locations and data centers.
Other than deploying applications onto Kubernetes infrastructure using Amazon EKS, I don't have any other use cases for this tool in my current organization.
What is most valuable?
Amazon EKS has very good scalability with 100% uptime and zero latency.
Amazon EKS is the most cost-effective solution that I am using currently. If I want to reduce my downtime, I can deploy it in a multiple region architecture, which can reduce the downtime.
It is a cloud-based solution which is managed by Amazon, a global cloud services provider. I have observed very negligible issues while running my applications on Amazon EKS.
It is very pocket-friendly, cost-effective, and the setup is very simple.
If you are running a few applications that require high scalability, you can go for Amazon EKS. It is a very good tool if you want a managed Kubernetes service. It will definitely work wonders for your project.
Currently I work for a financial global giant where a millisecond latency costs around a million dollars. With Amazon EKS, I have a lot of benefits.
What needs improvement?
The only thing I would suggest is keeping Amazon EKS updated with the current trends and requirements of the global giants that use this tool.
For how long have I used the solution?
I have been using Amazon EKS for the past three years. It is very brilliant.
What do I think about the stability of the solution?
The only time I see issues in Amazon EKS is when Amazon itself goes down, and I have worked on several improvements to handle those instances.
It is pretty good and pretty stable.
What do I think about the scalability of the solution?
I have not experienced any scalability issues.
How are customer service and support?
I have not had any issues with customer service.
Which solution did I use previously and why did I switch?
I have not used any previous solutions.
Which other solutions did I evaluate?
I don't have any alternate solutions.
What other advice do I have?
It is already a very good tool in the market. I would rate this product a 10 out of 10.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Automated deployments and cost controls have increased team efficiency and reduced downtime
What is our primary use case?
We are currently using Amazon EKS as a production environment where we deploy multiple services; banks use our services for threat analysis and for identifying synthetic identities. Our main use case for Amazon EKS involves deploying scoring services that determine whether a particular user is valid or invalid based on their social security number and any fraudulent details they provide. We analyze this information and use Amazon EKS as the platform for deploying that application. We mainly deploy applications using Kubernetes deployments, which create the pods that contain the application containers.
Amazon EKS helps us manage these deployments for threat analysis and synthetic identity because the control plane is managed by AWS. This addresses our concern about managing it ourselves: in a large environment like ours, Amazon EKS lets us concentrate on managing our application rather than the control plane. We have implemented Karpenter, which manages the nodes and continuously monitors whether any nodes are underutilized or any pods cannot be scheduled. It identifies the requirements from the pod side, automatically creates a node, and schedules the pod based on its resource needs. This automation allows us to respond quickly, especially during high load times when pods may get stuck or be unable to pull images. The alerts help us react promptly, and another pod will be scheduled to fulfill the user's request within seconds.
The best feature Amazon EKS offers is automation, particularly with the automatic scheduling of pods on nodes. Karpenter, the new feature added recently that manages the node initialization, is something I really appreciate as it makes independent decisions based on the requirements, unlike older autoscaling configurations. Karpenter prioritizes cost-effectiveness by selecting the cheapest options for nodes, whether they are spot instances or on-demand. Additionally, whenever there is an issue, the pod can be automatically recreated using the defined replica set, providing a significant advantage of Kubernetes.
Karpenter helps keep costs down by consistently evaluating whether it should take on-demand instances or lower-cost alternatives. It constantly monitors for the lowest price model available in AWS. However, because the cheapest instance can be terminated with little notice, Karpenter quickly transitions to the next available on-demand instance. This proactive cost management is a key feature.
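The cost behavior described here can be pictured with a small selection sketch: from a set of node offerings, pick the cheapest one that fits the pending pod's requests, and fall back to on-demand when spot capacity is gone. This is an illustrative simplification, not Karpenter's actual algorithm or API; the `Offering` type, instance names, sizes, and prices are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Offering:
    """One hypothetical way to buy a node (all values are made up)."""
    instance_type: str
    capacity_type: str   # "spot" or "on-demand"
    vcpu: float
    memory_gib: float
    hourly_price: float
    available: bool = True


def cheapest_fit(offerings: List[Offering], cpu_request: float,
                 memory_request_gib: float) -> Optional[Offering]:
    """Return the cheapest available offering that fits the request, or None."""
    candidates = [o for o in offerings
                  if o.available
                  and o.vcpu >= cpu_request
                  and o.memory_gib >= memory_request_gib]
    return min(candidates, key=lambda o: o.hourly_price, default=None)


OFFERINGS = [
    Offering("type-a", "on-demand", 4, 16, 0.20),
    Offering("type-a", "spot", 4, 16, 0.07),
    Offering("type-b", "on-demand", 8, 32, 0.40),
    Offering("type-b", "spot", 8, 32, 0.15),
]

# Simulate spot capacity being reclaimed: mark every spot offering unavailable.
NO_SPOT = [replace(o, available=False) if o.capacity_type == "spot" else o
           for o in OFFERINGS]

print(cheapest_fit(OFFERINGS, cpu_request=2, memory_request_gib=8))
print(cheapest_fit(NO_SPOT, cpu_request=2, memory_request_gib=8))
```

In this toy data the spot offering wins while it is available, and the same request falls back to the on-demand offering once spot is marked unavailable.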
Amazon EKS has positively impacted my organization as Kubernetes offers orchestration of container-based applications, allowing us to rapidly deploy and fulfill user requests. During busy business days or promotional offers, we experience increased traffic, and Amazon EKS enables quick deployment of containers to meet this demand. If a pod is unresponsive, Amazon EKS can easily launch another pod to maintain service delivery. This adaptability not only enhances user service but also contributes to cost savings since we leverage Karpenter to manage nodes dynamically based on usage.
What is most valuable?
Since I joined this company ten months ago, we have reduced the number of failures significantly. Earlier, before using Karpenter, issues with pods hindered our efficiency and resulted in higher costs, as we did not have the option to minimize expenses effectively. Now, with Karpenter, costs have decreased and our efficiency has improved because we can swiftly address alerts and redeploy applications without delay. This transition has greatly improved our operational performance.
After adding Karpenter to our Amazon EKS setup, we have seen efficiency improvements of approximately thirty to forty percent, along with a reduction in our costs and a decrease in the number of issues our team encounters.
What needs improvement?
In my experience with Amazon EKS, I have not encountered aspects that require improvement, as AWS has invested intelligently in its development, especially with the addition of Karpenter. I do not have any particular improvement feedback for Amazon EKS or Kubernetes at this time.
For how long have I used the solution?
I have been working in my current field for the last ten months. In terms of my total experience, I have been using Amazon EKS for the last three years, and in my current company, I have been using it for the last ten months.
What do I think about the stability of the solution?
Amazon EKS is stable in my experience.
How are customer service and support?
Customer support for Amazon EKS is excellent, particularly at the enterprise level, as we can readily raise tickets and receive prompt responses. This interactive support is incredibly useful, allowing for quick resolutions and solutions through live sessions and screen sharing.
Which solution did I use previously and why did I switch?
I have been using Amazon EKS since I started. Although I know about Docker Swarm, it poses issues for large environments like ours, which is why we opted for Amazon EKS.
We evaluated Docker Swarm before choosing Amazon EKS. Docker Swarm is not well-suited for large environments and long-term deployments, which made Amazon EKS the clear choice based on insights from my seniors.
What was our ROI?
I have experienced a return on investment with Amazon EKS. Since implementing Karpenter, costs have been reduced and operational efficiency has increased. We can deploy applications swiftly, and our monitoring tools such as DataDog alert us to any pod issues so we can act quickly. This responsiveness enables us to focus on critical issues that require our immediate attention.
What's my experience with pricing, setup cost, and licensing?
I appreciate the overall pricing model of AWS, where you pay based on usage, which allows for a clear understanding of costs associated with services. The setup cost is reduced significantly since Amazon EKS removes the laborious process of setting up the control plane, which typically requires substantial human resources and effort. Licensing is straightforward, making it easy to start using the service.
What other advice do I have?
I would certainly recommend using Amazon EKS due to its managed services, which alleviate the complexities of controlling the Kubernetes cluster. The scalability features ensure issues with any pods are managed effectively by automatic relaunch processes, maintaining desired states. Karpenter's cost-efficient design is another highlight worth noting for anyone looking to balance container orchestration with spending.
Amazon EKS has proven to be an exceptional product, particularly as it gains popularity due to its scalability and rapid application deployment capabilities, benefiting organizations across various sectors. I would rate this product nine out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has enabled seamless infrastructure configuration while improving identity integration and monitoring capabilities
What is our primary use case?
In my current project, I am handling a US customer who is completely focused on working with LLM and security. Their clients are expected to give us the data sources, and we provision from Greengrass, which works as an agent running on the end client to fetch the customer false alarms. It's primarily focused on the SOC 2 side, where data goes to the SOC dashboard or SOC data source, leveraging LLM to filter how many alarms are false positives versus true positives. In doing this, the NOC engineers observing the SOC 2 dashboard do not have to worry about how many are true positives; our LLM API or tool filters it out and indicates how many true positives and false positives are displayed on the dashboard. This significantly eases the burden on the NOC engineers.
We are utilizing Amazon EKS for our application because most components are LLM-based and require GPUs, so we have specific managed node groups and we also use Karpenter along with HPA. Whenever the need arises, processing queues are kept as lists in Redis on Amazon ElastiCache, and the items are read from Redis and handed over to worker pods, which HPA scales dynamically. We leverage HPA and Karpenter within Amazon EKS for scaling in or out.
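The queue-driven scaling described above can be sketched with the replica calculation that the Kubernetes documentation gives for the Horizontal Pod Autoscaler. This is a minimal sketch, not the reviewer's configuration: the queue length, the target of 10 items per pod, and the replica bounds are assumptions chosen for illustration.

```python
import math


def desired_replicas(current_replicas, current_per_pod, target_per_pod,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Documented HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_per_pod / target_per_pod
    # The HPA skips scaling when the ratio is within the tolerance of 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the bounds set on the HPA object.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical example: 2 worker pods, 90 queued items in Redis, target of 10 per pod.
queue_length = 90
pods = 2
print(desired_replicas(pods, queue_length / pods, target_per_pod=10))
```

When the extra pods do not fit on existing nodes, a node provisioner is what adds capacity; that step is outside this sketch.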
How has it helped my organization?
Amazon EKS is simple to work with due to its support for diverse AWS tools and various integrations, which significantly influences my application development and management processes. There is the aspect of trust relationships and permissions. Every service we create involves setting a role, and all authentication processes link to a policy that decides access. This seamless process extends across accounts thanks to ARNs (Amazon Resource Names). With the security guardrails in place, services are accessed effortlessly while maintaining high security, giving me confidence in its management capabilities.
What is most valuable?
The HPA feature within Kubernetes is one aspect I appreciate about Amazon EKS; it's beneficial for scalability. The managed node groups or dedicated node groups with GPU capacity allow us to scale in or out depending on our capacity planning, and Karpenter, which we have provisioned for scaling, contributes to that. These features are not available by default in EC2 or ECS, where you lack control over the nodes. Many people attempt to adopt Lambda functions or serverless solutions, but that approach does not suffice for GDPR and HIPAA compliance, as it demands a solid identity of the node and clarity on where it is being provisioned. Hence, we cannot depend on any of the Lambda or serverless components, which may be unstable, and this often leads to increased billing due to the dynamic nature of scheduled procedures that require a standalone node rather than a transient one.
When it comes to the integration with IAM, I have two thoughts on the authentication process of Amazon EKS. Previously, when I began using AWS, access was controlled through the aws-auth ConfigMap within Amazon EKS. Initially, only the person who creates the cluster has access to it. In enterprise settings, as I found while working in the UK for Santander, this presents challenges. Essentially, only the creator had the master role for accessing the node, and the onboarding process was rather manual, with companies relying on ServiceNow and other tools for onboarding new joiners or team members. Now, the EKS API and eksctl have many settings that are not enabled by default, so you need to enable them. Most clusters are created using Terraform, and you need to create a role that can manage cross-account access, as many customers don't operate with a single account.
My previous employers, such as Fidelity Investments, Nokia, and Ford, have worked across multiple accounts, necessitating single sign-on. This setup allows for cross-account access to the cluster through eksctl and the EKS API, which work with single sign-on to onboard team members. Once the role has access to the cluster, it can onboard users as a dev user, admin, or tester, simplifying the onboarding process. This way, previously manual tasks can be automated, which is a significant improvement. Earlier, we had to edit the ConfigMap to onboard users, but with eksctl and the EKS API, the integration between EKS, Kubernetes, and the cloud side has improved tremendously, alleviating many worries.
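One way to picture API-driven onboarding is as a pair of requests per role: one that registers the IAM role with the cluster and one that attaches a permission level. The sketch below only builds the request parameters and is an assumption about how such a setup could look, not the reviewer's actual tooling. The cluster name, account ID, role name, and the mapping of dev, admin, and tester to AWS-managed access policies are all hypothetical.

```python
# AWS-managed EKS access policies; the persona mapping below is an assumption.
POLICY_PREFIX = "arn:aws:eks::aws:cluster-access-policy/"
PERSONA_POLICY = {
    "admin": "AmazonEKSClusterAdminPolicy",
    "dev": "AmazonEKSEditPolicy",
    "tester": "AmazonEKSViewPolicy",
}


def access_entry_requests(cluster, role_arn, persona, namespaces=None):
    """Build keyword arguments for the two EKS API calls that onboard a role."""
    if namespaces:
        scope = {"type": "namespace", "namespaces": namespaces}
    else:
        scope = {"type": "cluster"}
    create = {"clusterName": cluster, "principalArn": role_arn}
    associate = {
        "clusterName": cluster,
        "principalArn": role_arn,
        "policyArn": POLICY_PREFIX + PERSONA_POLICY[persona],
        "accessScope": scope,
    }
    return create, associate


# Hypothetical SSO-backed role in a second account, limited to one namespace.
create, associate = access_entry_requests(
    "demo-cluster", "arn:aws:iam::111122223333:role/sso-dev", "dev", ["team-a"])
print(create)
print(associate)
```

With boto3, these dictionaries would be passed to the EKS client as `create_access_entry(**create)` and `associate_access_policy(**associate)`; the sketch stops short of calling AWS so it runs without credentials.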
Self-healing nodes assist in minimizing administrative burdens in my projects. Coming from a telecom background where I've worked for over seven years, I'm familiar with a concept called SON, which covers self-organizing and self-healing functionality. At a logical level, these are the layers we interact with, but AWS handles the physical layer through their software components. For instance, if one node is not ready and you enable the Auto Mode feature, AWS manages that for you, including AMI upgrades and any malfunctioning nodes. I've seen these features in the UI; I've enabled them, and every 10 or 15 days, patches roll out. I can check them via Amazon Inspector to see if there are any node-level or AMI-level patches necessary. AWS takes care of these issues automatically. I appreciate that I don't need to manually check the dashboard and apply upgrades one by one, which is a significant improvement.
I measure the impact of Amazon EKS on the organization's management of complex workloads, in terms of effectiveness and efficiency, through my background in development and systems. I spent five years as a Java developer before transitioning to DevOps. With my understanding of end-to-end application architecture, I assess workloads based on system and application planning. For example, when I worked on a data lake product at Fidelity Investments, I observed that the cloud onboarding process, including Amazon EKS, had roadmaps extending over five years, from 2019 to 2024. I understand the nuances of enterprise or legacy applications and any system-related complexities. It all boils down to two components: system planning and application planning. First, we identify the type of application, whether it is database-related or has high GPU demands. Most applications today involve GPUs, which tend to incur high costs, and often customers are unaware of how to handle dynamic workloads effectively. It's crucial to assess not just one part of the system but various elements such as CPU, memory, and IOPS, since the underlying hardware interacts with those components regardless of domain. We also need to evaluate the application's requirements, such as its dependency on node storage. With EKS, Kubernetes provides pluggable interfaces such as CSI, CNI, and CRI. By understanding the application's demands, I can apply the right Kubernetes configurations for performance optimization, such as adjusting the container network interface settings to suit the client's workload requirements. This loose coupling allows us to optimize our resources irrespective of whether we're using on-prem or cloud environments.
What needs improvement?
I started using Amazon EKS in 2019. At that time, it was completely new, and many people were not using it yet; it started from version 1.21, and right now we are on 1.33. Recently, 1.34 has been launched, but it's not yet available in the service catalog; we can see only 1.33. A lot of improvements have been made.
We had numerous add-ons to install manually because Kubernetes is a separate system from the AWS cloud provider, even though everyone has opted to use it. Once you adopt it, you have to maintain two identities: one at the IAM level and one within the Amazon EKS cluster. A few things do not make sense within the add-ons, such as the secret providers that read a secret from Secrets Manager and then mount it as a volume. We use the Secrets Store CSI Driver, which reads the secrets or sensitive data from Secrets Manager and then mounts it as a volume to the pod at runtime. However, it doesn't have a dynamic feature where, if the secret changes, the new value is read and populated into the environment.
Consider what happens when your RDS password or OpenSearch password rotates. Amazon EKS doesn't have a feature to detect that the password has changed overnight; there is no functionality from the provider to see the change and then restart the pod or fetch the new value. This often leads to downtime of 6 or even 12 hours, depending on when you realize it, so that needs improvement.
Nonetheless, on the add-on side they have developed a lot; earlier we were installing add-ons manually, but now with EKS Auto Mode, many things, such as the VPC CNI and the Pod Identity agent (around four plugins in total), are installed by default, which is a good thing. However, I believe there should be a self-contained solution that covers generic use cases.
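One way a team could close this gap themselves is to compare a fingerprint of the value the pod started with against the current value and restart the workload when they differ. The following is a minimal sketch under stated assumptions: both secret values are hard-coded stand-ins (in practice the current value would come from Secrets Manager), and the restart is only reported, not performed.

```python
import hashlib


def fingerprint(secret_value: str) -> str:
    """Hash the secret so values can be compared without logging them."""
    return hashlib.sha256(secret_value.encode("utf-8")).hexdigest()


def needs_restart(mounted_value: str, current_value: str) -> bool:
    """True when the value the pod started with no longer matches the source."""
    return fingerprint(mounted_value) != fingerprint(current_value)


# Hypothetical values: the pod mounted "old-pass", then the password rotated overnight.
mounted = "old-pass"
current = "new-pass"
if needs_restart(mounted, current):
    print("secret rotated: trigger a rolling restart of the deployment")
else:
    print("secret unchanged")
```

A check like this could run on a schedule, with the restart itself done through something like `kubectl rollout restart deployment/<name>`.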
With the 1.33 release, they have addressed most of my earlier concerns, but I am still looking for some improvements, particularly in CloudWatch monitoring. In IT, we manage two aspects: either the system or the application. Currently, the application logs and monitoring are not very robust in CloudWatch; you can only find things if you are familiar with them. Fortunately, we are familiar, as most of the monitoring involves two types of databases: one is a time series for monitoring data, and the other is an indexing solution for a streaming service. This means we need to get the logs from each node, index them, and populate them on a screen. That part remains a separate service, but if they managed it within Amazon EKS service, where the monitoring is consolidated in one place, you wouldn't need to rely on Prometheus, Grafana, or different services. It would be advantageous to have a consolidated platform for EKS, as Kubernetes is leveraged; monitoring and logging should also be integrated simply by enabling parameters or tags. This would create a self-contained platform where people can onboard and start using it. Currently, I still need to enable logging and monitoring among other things myself; that shouldn't be the case after six or seven years in the market.
On a scale from 1 to 10, I would rate Amazon EKS tech support an eight. Some individuals have a deep understanding of the services and can identify potential bottlenecks, especially with load balancer endpoints and certificate management. The shift from NGINX to AWS load balancers has diminished many previous issues. However, not every support engineer meets the same level of expertise, hence why I rate it a solid eight, which I consider decent.
For how long have I used the solution?
I have been using Amazon EKS for seven years.
How are customer service and support?
Amazon's customer support has its merits; it is good overall. However, when it comes to enterprise licenses, the quality declines significantly. Startups may not recognize this at first, engaging with support during their initial phase, but they soon discover the lack of expert guidance and the costs associated with it—it's quite expensive.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I have experience with other Kubernetes engines, such as those from Oracle, Azure, and Google. For instance, I find Red Hat OpenShift to be an excellent competitor. Many have transitioned to Azure due to the hosting incentives it offers for running OpenAI models. While people explore Amazon EKS or other Kubernetes solutions, the simplicity in moving between platforms remains an advantage since your manifests mostly stay the same, needing only minor adaptations to services. My experience with Red Hat OpenShift tells me it's a robust solution with competitive attributes against AWS.
What was our ROI?
In terms of ROI, I have clear examples. When we began, we used managed nodes for applications needing dynamic processes. Some applications simply required pods to be created once data was queued for processing, allowing the node to remain free afterward. For one customer whose setup was not cost-effective (we had provisioned a node that saw little CPU usage but substantial memory consumption), we implemented effective resource quotas and HPA. This setup enables the system to use resources dynamically based on the actual demand observed by reading metrics from the Prometheus adapter. Notably, the adapter isn't an out-of-the-box feature provided by Prometheus; you need to create your own adapter for it. Using tools like HPA and Karpenter allowed us to scale resources based on requirements. Initially, not having them resulted in an unoptimized solution. With these tools in place, we saw costs fall substantially: if the bill was $100 beforehand, we brought it down to $25.
What's my experience with pricing, setup cost, and licensing?
Regarding the pricing aspect and the licensing cost of Amazon EKS, sometimes it is not clear. Most discussions revolve around the data transfer costs from one region to another, and there are certain concerns regarding GPU nodes. However, if you optimize your node usage, with tools such as Kubecost, you can analyze how effectively you utilize your nodes. If you manage to optimize usage, you won't face steep costs. Otherwise, the cloud provider will certainly benefit from inefficient usage. Ultimately, it's not out of the box—if you want to monitor costs effectively, applying separate tools and acting accordingly in advance is essential.
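The kind of node-usage analysis mentioned here can be approximated by hand: compare what pods request with what each node provides. The sketch below uses made-up node figures and a made-up threshold for flagging imbalance; it does not represent Kubecost's API or its method.

```python
def utilization(requested: float, capacity: float) -> float:
    """Share of a node's capacity that pods have requested, as a percentage."""
    return round(100.0 * requested / capacity, 1)


# Hypothetical nodes: (name, vCPU requested, vCPU capacity, GiB requested, GiB capacity).
nodes = [
    ("gpu-node-1", 1.0, 8.0, 28.0, 32.0),
    ("cpu-node-1", 3.0, 4.0, 4.0, 16.0),
]

for name, cpu_req, cpu_cap, mem_req, mem_cap in nodes:
    cpu = utilization(cpu_req, cpu_cap)
    mem = utilization(mem_req, mem_cap)
    # Flag nodes where one resource sits mostly idle while the other is nearly full.
    skewed = abs(cpu - mem) > 50
    print(f"{name}: cpu {cpu}% mem {mem}% skewed={skewed}")
```

A node flagged this way is a candidate for a different instance shape, since you pay for the idle resource either way.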
Which other solutions did I evaluate?
I notice key differences between Amazon EKS and its competitors, with pros and cons on both sides. The seamless integration is sometimes lacking in other offerings. When managing software on Kubernetes platforms, including EKS, AKS, GKE, Rancher Kubernetes, and Oracle's Kubernetes engine, I've faced specific challenges, particularly with user management in Oracle's solution, which isn't as seamless as it is in Amazon EKS. Comparatively, OpenShift from Red Hat has notable strengths. Oracle is making improvements, especially with its longstanding database solutions. Among cloud providers, though, OpenShift from Red Hat is a formidable competitor, offering robust out-of-the-box solutions around resource limits and dashboard configurations that do not require command-line interventions.
What other advice do I have?
I have some recommendations for people considering Amazon EKS. Teams often attempt to enforce infrastructure as code with tools like Terraform from HashiCorp to maintain workspaces and everything else. I advise using services within a single environment, especially for LLM applications. It's prudent to have multiple LLM sources across various cloud providers while utilizing the same keys within your AWS environment.
My second piece of advice is to establish a separate CI/CD platform independent of AWS. This keeps things loosely coupled; with minimal tweaks in CI/CD pipelines, you can seamlessly migrate from one platform to another—say from EKS to AKS to GKE or OpenShift—thus keeping the focus on feature development rather than migration headaches. This leads to a modular approach in your code and infrastructure, ensuring that only the cloud provider specifics require adjustments.
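The loose coupling described here can be sketched as a shared base manifest plus a small per-provider overlay, so that a pipeline only swaps the overlay when the target platform changes. This is an illustrative sketch, not a real pipeline: the claim name and size are hypothetical, and the storage class names are assumptions that may differ in a real cluster.

```python
import copy

# Provider-neutral part of a PersistentVolumeClaim, shared by every platform.
BASE_PVC = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "model-cache"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "20Gi"}},
    },
}

# Only the provider-specific detail changes; these class names are illustrative.
STORAGE_CLASS = {"eks": "gp2", "aks": "managed-csi", "gke": "standard-rwo"}


def render(provider: str) -> dict:
    """Return the base manifest with the provider's storage class applied."""
    manifest = copy.deepcopy(BASE_PVC)
    manifest["spec"]["storageClassName"] = STORAGE_CLASS[provider]
    return manifest


for provider in STORAGE_CLASS:
    print(provider, render(provider)["spec"]["storageClassName"])
```

Copying the base before editing keeps the shared manifest untouched, which is what lets the same definition serve every provider.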
Overall rating: 8 out of 10
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)
Has supported production workloads consistently but requires simpler configuration and clearer troubleshooting tools
What is our primary use case?
We use Amazon EKS mostly for deploying production workloads, and there are multiple AI models that we run on Amazon EKS.
What is most valuable?
My favorite feature of Amazon EKS is the ecosystem that it provides, including the integration with S3, along with EBS, and the networking that is smooth to run Kubernetes.
What needs improvement?
I have experience with Azure, and in comparison to Azure, a downside of Amazon EKS is that even if you want to deploy a dev workload or do some experimentation, you have to pay for the control plane, with no free option.
Additionally, I have faced many issues while configuring the node groups and the whole configurations; bringing up the nodes was a bit hectic, and I was not able to determine which node was failing and for what reason.
Specifically, the pricing for the control plane of Amazon EKS is hefty, and there is no cost-cutting that can be done on that side.
For how long have I used the solution?
I have been using Amazon EKS in my career for four years overall.
What do I think about the stability of the solution?
Amazon EKS is pretty stable, and I have not seen any lagging, crashing, or downtime.
What do I think about the scalability of the solution?
Amazon EKS is good in terms of scaling.
How are customer service and support?
I have contacted technical support and customer support for Amazon EKS.
How would you rate customer service and support?
Which solution did I use previously and why did I switch?
I have used Azure AKS, Civo, and also experimented with GCP as alternatives to Amazon EKS. If I have to rank them, GCP comes at number one, Amazon EKS comes at number two, and Azure AKS comes at number three.
On the costing part of Kubernetes, Azure is beating Amazon EKS since I can do some experimentation without paying for the control plane; I can just pay for the node groups, which is an area where Amazon can improve.
How was the initial setup?
From my point of view, the initial deployment of Amazon EKS is difficult.
I had to configure many components, such as IAM policies and other things; it was not a simple click-through process. The major issue is that there is no single point where I can see all the logs; while there is CloudWatch, it is not easily accessible, and you have to go through a hectic process to search and find information. There is nothing where you can go and click to get all the logs in a single place.
What other advice do I have?
Amazon EKS requires maintenance on my end to continue functioning; we have to upgrade from time to time since every Kubernetes version is supported for only about one year.
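Upgrade planning of this kind comes down to simple date arithmetic. The sketch below assumes a 12-month support window to match the "one year" figure above, and the release date is hypothetical; real dates should be taken from the published EKS release calendar.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to 28 to stay valid."""
    total = start.month - 1 + months
    return date(start.year + total // 12, total % 12 + 1, min(start.day, 28))


def months_left(today: date, end_of_support: date) -> int:
    """Whole calendar months remaining before support ends (negative if past)."""
    return (end_of_support.year - today.year) * 12 + end_of_support.month - today.month


# Hypothetical release date for a cluster version, with an assumed 12-month window.
released = date(2025, 1, 15)
end = add_months(released, 12)
print(end.isoformat(), months_left(date(2025, 10, 1), end))
```

Running a check like this per cluster gives an early warning before a version falls out of support.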
On a scale from one to ten, I rate Amazon EKS a seven out of ten.
Which deployment model are you using for this solution?
Public Cloud
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?