Guidance for Galaxy Deployment on AWS
Overview
How it works
These technical details include an architecture diagram that illustrates how to use this solution effectively. The diagram shows the key components and their interactions, walking through the architecture's structure and functionality step by step.
Well-Architected Pillars
The architecture diagram above is an example of a solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance uses services that give you full visibility into your workloads through monitoring and logging, while also providing reliable and stable applications. For example, CloudWatch provides observability through metrics, customizable dashboards, and logs, along with alarms defined on metrics throughout this Guidance, so you can monitor the health of your workloads and minimize the impact of incidents. In addition, Amazon EKS clusters can identify unhealthy containers and automatically replace them with new ones, keeping your workloads available to respond to incidents and events.
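The self-healing behavior described above relies on container health checks. As a minimal sketch (the resource names, image, port, and health-check path here are illustrative, not taken from this Guidance), a Kubernetes Deployment for a Galaxy web pod could define a liveness probe so that the cluster restarts unhealthy containers automatically:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: galaxy-web                      # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: galaxy-web
  template:
    metadata:
      labels:
        app: galaxy-web
    spec:
      containers:
        - name: galaxy
          image: galaxy/galaxy:latest   # placeholder image reference
          ports:
            - containerPort: 8080
          livenessProbe:                # the kubelet replaces the container if this check fails
            httpGet:
              path: /health             # assumed health endpoint; adjust for your Galaxy build
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 15
```

With this probe in place, a hung Galaxy process is detected within one probe interval and the pod is restarted without operator intervention.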
Security
By default, all incoming connections to Galaxy originate from the public Internet and are directed to the Galaxy server through a publicly accessible Application Load Balancer. Alternatively, this Guidance can be configured to use an internal Application Load Balancer in a private subnet, where traffic is routed through a virtual private network (VPN) connection or through AWS Direct Connect. In both cases, compute resources are deployed within private subnets and are not directly accessible from the public Internet. Galaxy handles application-level authentication and authorization through its own user management or through Active Directory Federation Services (AD FS).
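The choice between a public and an internal Application Load Balancer is typically a one-line change when the load balancer is provisioned by the AWS Load Balancer Controller. A hedged sketch (the Ingress name, service name, and port are illustrative assumptions):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: galaxy-ingress                               # illustrative name
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internal       # use "internet-facing" for the public default
    alb.ingress.kubernetes.io/target-type: ip
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: galaxy-web                     # assumed backend service
                port:
                  number: 8080
```

With `scheme: internal`, the load balancer receives only private IP addresses and is reachable only over the VPN or Direct Connect path described above.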
Reliability
To implement a reliable application-level architecture, the individual components of this Guidance are deployed as loosely coupled Kubernetes pods. The message broker is the fully managed service Amazon MQ, which, in the default configuration, includes a standby broker. Finally, the shared file system is provided through Amazon EFS and is highly available, as is the database provided through Aurora Serverless.
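As an illustration of the standby broker mentioned above, the following CloudFormation fragment is a sketch of an Amazon MQ broker in active/standby mode. It assumes an ActiveMQ engine (the broker name, instance type, and engine version are illustrative and should be checked against currently supported versions):

```yaml
Resources:
  GalaxyBroker:
    Type: AWS::AmazonMQ::Broker
    Properties:
      BrokerName: galaxy-broker                 # illustrative name
      EngineType: ACTIVEMQ
      EngineVersion: "5.17.6"                   # verify against currently available versions
      DeploymentMode: ACTIVE_STANDBY_MULTI_AZ   # standby instance in a second Availability Zone
      HostInstanceType: mq.m5.large
      AutoMinorVersionUpgrade: true
      PubliclyAccessible: false                 # keep the broker in private subnets
      Users:
        - Username: galaxy
          Password: '{{resolve:secretsmanager:galaxy-mq-secret}}'  # assumed secret name
```

In this mode, Amazon MQ maintains a standby instance in a second Availability Zone and fails over automatically if the active broker becomes unavailable.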
Performance Efficiency
Amazon EKS is an AWS native service, and this Guidance focuses on cost-efficient ways to deploy and configure it with selected resources so that you can achieve a reliable Kubernetes application with high availability and low operational costs. The Amazon EKS architecture spans multiple Availability Zones for high availability. While some traffic flows between subnets in different Availability Zones, the added latency should not significantly impact performance.
Amazon EFS is designed to provide serverless, fully elastic file storage that allows you to share file data without the need to provision or manage storage capacity and performance. It provides a Portable Operating System Interface (POSIX) file system with the necessary performance for bioinformatic workloads.
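The shared POSIX file system described above is typically exposed to pods through the Amazon EFS CSI driver. A minimal sketch, assuming dynamic provisioning through EFS access points (the file system ID and claim name are placeholders):

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap              # dynamic provisioning via EFS access points
  fileSystemId: fs-0123456789abcdef0    # placeholder; replace with your EFS file system ID
  directoryPerms: "700"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: galaxy-data                     # illustrative claim name
spec:
  accessModes:
    - ReadWriteMany                     # shared across pods and Availability Zones
  storageClassName: efs-sc
  resources:
    requests:
      storage: 100Gi                    # nominal value; EFS capacity is elastic
```

`ReadWriteMany` access is what allows Galaxy's web, job, and worker pods to read and write the same datasets concurrently, without provisioning storage capacity up front.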
Cost Optimization
A significant factor in data transfer costs within Amazon EKS clusters is calls to Kubernetes services from external clients routed through Application Load Balancers. These calls can translate into communications between pods running in different Availability Zones, which incur cross-AZ data transfer charges.
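One way to reduce the cross-AZ traffic described above is Kubernetes topology-aware routing, which prefers service endpoints in the caller's own Availability Zone when capacity allows. A sketch, assuming Kubernetes 1.27 or later (service name and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: galaxy-web                                 # illustrative name
  annotations:
    service.kubernetes.io/topology-mode: Auto      # prefer same-AZ endpoints (Kubernetes 1.27+)
spec:
  selector:
    app: galaxy-web
  ports:
    - port: 80
      targetPort: 8080
```

On older clusters, the equivalent annotation is `service.kubernetes.io/topology-aware-hints: auto`; in both cases routing falls back to all zones if endpoints are unevenly distributed.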
Because the autoscaling minimum, maximum, and desired number of compute nodes, along with their corresponding Amazon Elastic Compute Cloud (Amazon EC2) instance parameters, are highly configurable, compute resources are managed efficiently.
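The autoscaling bounds mentioned above are set when the node group is defined. A hedged sketch using an eksctl configuration (the cluster name, region, instance type, and sizes are illustrative assumptions, not values from this Guidance):

```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: galaxy              # illustrative cluster name
  region: us-east-1         # illustrative region
managedNodeGroups:
  - name: galaxy-workers
    instanceType: m5.large  # illustrative instance type; right-size for your workloads
    minSize: 2              # floor for availability
    maxSize: 10             # ceiling that caps cost
    desiredCapacity: 2      # start small; scale out only under load
```

Keeping `desiredCapacity` at the minimum and letting the autoscaler expand toward `maxSize` means you pay for additional nodes only while jobs are actually running.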
Finally, serverless architectures have a pay-per-value pricing model and scale based on demand. This includes the Aurora Serverless database and Amazon EFS. We recommend you tag AWS resources that belong to a project programmatically, and then create custom reports in AWS Cost Explorer using the tags to visualize and monitor costs.
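The programmatic tagging recommended above can be applied where resources are defined, so every resource created for the project carries the same tag. An illustrative eksctl excerpt (the tag key and value are assumptions; once the tag is activated as a cost allocation tag, you can filter by it in AWS Cost Explorer):

```yaml
metadata:
  name: galaxy                      # illustrative cluster name
  region: us-east-1
  tags:
    project: galaxy-deployment      # cost allocation tag to filter on in Cost Explorer
```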
Sustainability
By choosing right-sized instances, you use only the resources you need, thereby reducing unnecessary emissions. Also, by using services with dynamic scaling, you minimize the environmental impact of the backend services and ensure that compute resources scale with your workload's needs. Additionally, the use of fully managed services, such as Amazon EFS, minimizes the required resources.
Deploy with confidence
Ready to deploy? Review the sample code on GitHub for detailed deployment instructions, then deploy as-is or customize it to fit your needs.