Containers

Implement SPIFFE/SPIRE authorization on Amazon EKS

When running distributed applications across multiple Amazon Elastic Kubernetes Service (Amazon EKS) clusters, teams face two critical security challenges: establishing secure communication in untrusted networks and authenticating workloads without relying on network-based controls.

The mTLS Challenge: Traditional approaches rely on network security to determine message sender identity and ensure message integrity. However, in complex distributed applications spanning multiple networks with services deployed by different teams, network-based protection becomes insufficient. Teams need mutual Transport Layer Security (mTLS) to cryptographically prove sender and recipient identity while ensuring messages haven’t been viewed or modified in transit.

The Authentication Challenge: While mTLS secures channels between specific workloads, teams also need flexible authentication for scenarios where direct mTLS isn’t possible—such as when Layer 7 load balancers sit between services, or when multiple workloads send messages over a single encrypted channel. JSON Web Tokens (JWTs) provide this flexibility but require proper key management and validation.

SPIFFE (Secure Production Identity Framework for Everyone) and its reference implementation SPIRE (SPIFFE Runtime Environment) help address both challenges by:

  • Attesting workload identity at runtime in distributed systems
  • Delivering short-lived, automatically rotated X.509 certificates (X.509-SVIDs) for mTLS directly to workloads
  • Generating and validating JWT-SVIDs for flexible authentication scenarios
  • Minimizing manual certificate management through the SPIFFE Workload API

This guide demonstrates deploying SPIRE in a nested configuration across multiple Amazon EKS clusters, enabling secure workload identity and authentication in distributed environments.

In this post, we show you how to implement SPIFFE/SPIRE on Amazon EKS to establish secure service-to-service communication using a nested architecture. You’ll learn how to deploy SPIRE across multiple Amazon EKS clusters, configure workload attestation, and implement fine-grained authorization policies that scale with your infrastructure.

Understanding SPIFFE/SPIRE and Service-to-Service Authorization

Before diving into the implementation, let’s establish the foundational concepts of SPIFFE and SPIRE, and understand why they’re essential for modern service-to-service authorization.

What is SPIFFE?

SPIFFE is a set of open-source standards for securely identifying software systems in dynamic, heterogeneous environments. It provides a framework for workloads to present cryptographically verifiable identities, enabling secure communication without relying on network-level controls like firewalls or IP whitelists.

What is SPIRE?

SPIRE is the reference implementation of the SPIFFE standards. It serves as a production-ready SPIFFE Runtime Environment that manages the lifecycle of SPIFFE identities, including attestation, registration, and key rotation. SPIRE can be deployed in various modes, including nested hierarchies for complex, multi-cluster environments.

Why implement SPIFFE/SPIRE? In dynamic containerized environments, traditional security approaches break down:

  • IP-based security fails when containers move between nodes and clusters
  • Shared secrets don’t scale across hundreds of microservices
  • Manual certificate management becomes operationally complex
  • Network perimeters blur in multi-cloud and hybrid deployments

SPIRE can help address these challenges by providing cryptographic identity that moves with workloads, enabling automatic credential rotation, and supporting zero-trust authentication that is designed to work across different network locations.

The Need for Service-to-Service Authorization

Traditional security models often rely on network policies or API gateways, which can be cumbersome and error-prone in microservices architectures. SPIFFE/SPIRE helps address this by:

  • Providing workload identities that are cryptographically verifiable
  • Enabling mutual TLS (mTLS) authentication between services
  • Supporting fine-grained authorization policies based on identity
  • Automating key rotation and certificate management

What is Nested SPIRE?

Nested SPIRE is a deployment architecture that allows SPIRE Servers to be “chained” together, enabling all servers to issue identities within the same trust domain. This means workloads across different clusters receive identity documents that can be verified against the same root keys, creating a unified security boundary.

How the Chaining Works:

  • A SPIRE Agent is co-located with every downstream SPIRE Server
  • The downstream server obtains credentials via the Workload API from its local agent
  • These credentials are used to authenticate directly with the upstream SPIRE Server
  • The upstream server issues an intermediate Certificate Authority (CA) to the downstream server
  • All servers in the chain are configured to issue SVIDs within the same trust domain

Key Benefits:

  • Unified Trust Domain: All workloads across clusters share the same cryptographic root of trust
  • Multi-Cloud Ready: Different child servers can use different node attestors for various cloud environments
  • Scalable Architecture: New clusters can be added by chaining additional SPIRE servers
  • Centralized Root Management: Single root server manages the trust domain while child servers handle local workloads

This architecture is particularly well-suited for multi-cloud deployments where workloads need to authenticate across different infrastructure environments while maintaining a consistent security model.

Architecture Overview

Architecture diagram showing a hierarchical SPIRE (Secure Production Identity Runtime Environment) deployment on AWS, with a root cluster managing two child clusters inside a private VPC subnet, backed by Amazon Aurora MySQL

Figure 1: Nested SPIRE architecture showing root and child servers across multiple EKS clusters

Components

  • Root Cluster (spire-root-cluster): Hosts the root SPIRE server
  • Child Clusters (spire-child-cluster-01, spire-child-cluster-02): Host child SPIRE servers and workloads
  • Amazon Aurora: Persistent datastore for the root SPIRE server
  • Kubeconfig Generator: Script to create secure cluster access configurations

Node Attestation in Nested SPIRE

Before diving into the deployment process, it’s crucial to understand how SPIRE establishes trust between servers and agents across clusters through node attestation.

What is Node Attestation?

Node attestation is the process where each SPIRE agent authenticates and verifies itself when first connecting to a SPIRE server. During this process, the agent and server work together to verify the identity of the node the agent is running on using plugins called node attestors.

Node attestors interrogate a node and its environment for information that only that specific node would possess, proving the node’s identity through various methods:

  • Cloud platform identity documents (such as AWS Instance Identity Documents)
  • Hardware Security Module or TPM private keys
  • Manual verification via join tokens
  • Multi-node software system credentials (such as Kubernetes Service Account tokens)
  • Deployed server certificates

Upon successful node attestation, the SPIRE server issues a unique SPIFFE ID to the agent, which serves as the “parent” identity for all workloads it manages.

Kubernetes Node Attestation (k8s_psat)

In our Nested SPIRE deployment, we use the k8s_psat plugin for node attestation. This plugin:

  1. Validates signed projected service account tokens provided by agents
  2. Uses the Kubernetes Token Review API to perform validation
  3. Queries the Kubernetes API server for additional node metadata (UID, namespace, service account)
  4. Generates SPIFFE IDs based on the validated node information

Why Cross-Cluster Access is Required

In a nested architecture, the root SPIRE server must validate tokens from child cluster nodes. This is why our deployment process includes:

  1. Kubeconfig generation script – Creates secure access credentials for each child cluster
  2. Base64-encoded kubeconfig injection – It provides the root server with child cluster API access:
    --set "external-spire-server.kubeConfigs.child01.kubeConfigBase64=$(cat ../script/spire-child-cluster-01.kubeconfig)"
    

This cross-cluster access enables the root server to perform Kubernetes Token Review API calls against child clusters, validating that agents requesting intermediate CA certificates are legitimate nodes within the trusted infrastructure.

Prerequisites and Environment Setup

Before deploying SPIFFE/SPIRE on Amazon EKS, verify you have the following prerequisites:

Required Tools and Software

AWS Account Requirements

You’ll need an AWS account with permissions to:

  • Create EKS clusters
  • Manage VPCs and subnets
  • Deploy IAM roles and policies

Infrastructure Components

Our Nested SPIRE deployment consists of several key AWS infrastructure components that work together to provide secure workload identity across multiple clusters.

1. EKS Clusters

The deployment consists of three EKS clusters:

Cluster Purpose Terraform Path Region Node Groups
spire-root-cluster Hosts the root SPIRE server infrastructure/eks/spire-root-cluster/ us-east-1 Optimized for SPIRE server workloads
spire-child-cluster-01 Hosts child SPIRE server and application workloads infrastructure/eks/spire-child-cluster-01/ us-east-1 Mixed instance types for diverse workloads
spire-child-cluster-02 Hosts child SPIRE server and application workloads infrastructure/eks/spire-child-cluster-02/ us-east-1 Mixed instance types for diverse workloads

2. Amazon Aurora MySQL Datastore (infrastructure/rds/spire-datastore/)

The root SPIRE server requires persistent storage for audit logs, registration entries, node attestation data, and certificate authority information.

Configuration:

  • Engine: MySQL 8.0
  • Instance Class: Serverless v2 (auto-scaling)
  • Storage: Encrypted with AWS KMS
  • Backup: Automated daily backups
  • Multi-AZ: Enabled for high availability

3. Networking Architecture

Each cluster is deployed in its own or shared VPC with:

  • Public Subnets: For load balancers and NAT gateways
  • Private Subnets: For EKS nodes and Amazon Aurora instances
  • Security Groups: Restrictive rules for inter-cluster communication
  • VPC Peering: Enables secure communication between clusters

Environment Variables

Set the following environment variables:

export AWS_REGION=your-aws-region # e.g., us-east-1
export CLUSTER_NAME_ROOT=your-root-cluster-name # e.g., spire-root-cluster
export CLUSTER_NAME_CHILD1=your-child-cluster-1-name # e.g., spire-child-cluster-01
export CLUSTER_NAME_CHILD2=your-child-cluster-2-name # e.g., spire-child-cluster-02

Deploying Amazon EKS Infrastructure with Terraform

Now that we understand the architecture and components, let’s deploy the AWS infrastructure using Terraform to create our Nested SPIRE environment.

Infrastructure Overview

We’ll deploy three EKS clusters: a root cluster for the SPIRE trust anchor and two child clusters for workload deployment.

Deployment Steps

Follow these steps to deploy the infrastructure components, starting with networking, then the database, and finally the EKS clusters.

1. Obtain Networking Values

Before deploying the SPIRE infrastructure, you’ll need to gather the following networking values from your existing VPC setup:

  • VPC ID
  • Private subnets (list)
  • Public subnets (list)
  • VPC CIDR block
  • Route 53 DNS zone (for external-dns to create SPIRE server domain pointing to ALB)

These values will be required for the database and EKS cluster variables.tf files in the subsequent deployment steps.

2. Deploy Aurora MySQL

Update the variables.tf file with the networking values from the previous step.

cd infrastructure/rds/spire-datastore/
terraform init
terraform plan
terraform apply

After deployment, retrieve the database endpoint and secrets ARN that will be used in subsequent configurations:

terraform output aurora_mysql_v2_cluster_endpoint
terraform output aurora_mysql_v2_cluster_master_user_secret_arn

Save these values as they will be required for the root EKS cluster and root-values.yaml configuration in the SPIRE installation steps.

3. Deploy Root EKS Cluster

Update the variables.tf file with the networking values from step 1, the database values from step 2, your Route 53 DNS zone, and RDS cluster ARN.

cd ../../eks/spire-root-cluster/
terraform init
terraform plan
terraform apply

After deployment, retrieve the IAM role ARNs that will be needed for the child cluster configurations:

terraform output cluster_iam_role_arn
terraform output node_iam_role_arn
terraform output spire_server_service_account_role_arn

Save these ARN values as they will be required for the child cluster variables.tf files.

4. Deploy Child Cluster 01

Update the variables.tf file with the networking values from step 1, the database values from step 2, and the IAM role ARNs from step 3.

cd ../spire-child-cluster-01/
terraform init
terraform plan
terraform apply

5. Deploy Child Cluster 02

Update the variables.tf file with the networking values from step 1, the database values from step 2, and the IAM role ARNs from step 3.

cd ../spire-child-cluster-02/
terraform init
terraform plan
terraform apply

6. Setup Kubeconfigs Locally

Update your kubeconfig files for all three clusters:

aws eks --region us-east-1 update-kubeconfig --name spire-root-cluster
aws eks --region us-east-1 update-kubeconfig --name spire-child-cluster-01
aws eks --region us-east-1 update-kubeconfig --name spire-child-cluster-02

7. Run the Generate Kubeconfig Script

In the script directory, execute the script to generate base64-encoded kubeconfigs:

cd script
./generate_kubeconfig.sh

What this script does:

  1. Creates temporary kubeconfig files for each child cluster
  2. Extracts cluster certificate authority data and endpoints
  3. Retrieves service account tokens from the spire-system namespace
  4. Generates base64-encoded kubeconfig files for secure cluster access
  5. Creates both encoded and decoded versions for verification

Installing SPIRE Server on EKS

With the infrastructure deployed and kubeconfigs generated, we can now install SPIRE across our clusters using Helm charts.

Helm Values Structure

The SPIRE deployment uses multiple Helm values files to organize configuration settings for different cluster roles and shared parameters.

your-values.yaml

Contains common configuration shared across all clusters:

  • Image registry settings (for private registries)
  • Resource limits and requests
  • Security contexts
  • Logging configuration

root-values.yaml

Specific configuration for the root SPIRE server:

  • Database connection settings
  • Trust domain configuration
  • External cluster access settings
  • Certificate authority configuration

spire-child-cluster-01.yaml and spire-child-cluster-02.yaml

Child-specific configurations:

  • Parent server connection details
  • Local cluster settings
  • Node attestation configuration

Key Configuration Parameters

Root Server Configuration:

spire-server:
  dataStore:
    sql:
      databaseType: mysql
      databaseName: sharedqaspireserver
      host: "demo-serverless-mysqlv2.cluster-abc123def456.us-east-1.rds.amazonaws.com"
      port: 3306
      region: "us-east-1"
      username: spireadmin
      externalSecret:
        enabled: true
        name: spire-db-secret
        key: password

  trustDomain: "spireserver.example.com"
  
  nodeAttestor:
    k8sPSAT:
      serviceAccountAllowList:
        - "spire-mgmt:spire-agent"
        - "default:default"

Child Server Configuration:

spire-server:
  upstreamAuthority:
    spire:
      enabled: true
      upstreamDriver: "spire-plugin"
      serverAddress: "spire-server.spire-mgmt.svc.cluster.local"
      serverPort: 8081
      
  trustDomain: "spireserver.example.com"

Critical Configuration Note: Service Name Length Constraints

IMPORTANT: When deploying Nested SPIRE, the cluster identifiers used in Helm commands must be 7 characters or fewer. This is due to Internet Assigned Numbers Authority (IANA) that constraints the limit of the service names to 15 characters maximum.

RFC 6335 Section 5.1:

  • Valid service names are hereby normatively defined as follows:
    • MUST be at least 1 character and no more than 15 characters long

The SPIRE Helm chart appends these identifiers to service names, causing deployment failures if too long.

  • Template generates: prom-cm-{CLUSTER_ID}
  • With ABCDEFG-qa-01: becomes prom-cm-ABCDEFG-qa-01 (21 characters) ❌
  • With child01: becomes prom-cm-child01 (15 characters) ✅

Example of the constraint:

# ❌ This will FAIL - "ABCDEFG-qa-01" is too long (12 characters)
--set "external-spire-server.kubeConfigs.ABCDEFG-qa-01.kubeConfigBase64=..."

# ✅ This will WORK - "child01" is short enough (7 characters)  
--set "external-spire-server.kubeConfigs.child01.kubeConfigBase64=..."

The cluster identifier (e.g., child01) is:

  • Used only for Helm configuration – it doesn’t need to match the actual EKS cluster name
  • Appended to service names by the Helm template
  • Must comply with RFC 6335 naming constraints

1: Setup Root Cluster spire-root-cluster

We’ll start by configuring the root cluster, which serves as the trust anchor for our Nested SPIRE deployment.

1a. Use the Root Cluster Context

Switch to the root cluster context:

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-root-cluster

1b. Install CRDs on the Root Cluster

Navigate back to the helm directory and install the CRDs on the root cluster:

cd ../helm
helm upgrade --install --create-namespace -n spire-mgmt spire-crds spire-crds \
--repo https://spiffe.github.io/helm-charts-hardened/

1c. Install the Root Server

Install the root server using the encoded kubeconfigs for child clusters:

helm upgrade --install -n spire-mgmt spire spire-nested --repo https://spiffe.github.io/helm-charts-hardened/ \
--set "external-spire-server.kubeConfigs.child01.kubeConfigBase64=$(cat ../script/spire-child-cluster-01.kubeconfig)" \
--set "external-spire-server.kubeConfigs.child02.kubeConfigBase64=$(cat ../script/spire-child-cluster-02.kubeconfig)" \
-f your-values.yaml -f root-values.yaml

Command Explanation:

  • child01 and child02 are short identifiers (≤7 chars) to avoid service name length issues
  • These identifiers are used internally by Helm and don’t need to match cluster names
  • The actual cluster access is provided by the base64-encoded kubeconfig files
  • Each kubeConfigBase64 parameter contains the complete authentication information for accessing the respective child cluster

2. Setup Child Cluster spire-child-cluster-01

Now we’ll configure the first child cluster, which will connect to the root server to obtain its intermediate CA certificate.

2a. Switch to spire-child-cluster-01 Cluster

Switch to the spire-child-cluster-01 cluster context:

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-child-cluster-01

2b. Mark spire-system namespace as Helm-managed for the spire release

kubectl label namespace spire-system app.kubernetes.io/managed-by=Helm --overwrite && kubectl annotate namespace spire-system meta.helm.sh/release-name=spire meta.helm.sh/release-namespace=spire-mgmt --overwrite

2c. Install CRDs and Server for spire-child-cluster-01

Install the CRDs and server for spire-child-cluster-01:

helm upgrade --install --create-namespace -n spire-mgmt spire-crds spire-crds \
--repo https://spiffe.github.io/helm-charts-hardened/
helm upgrade --install -n spire-mgmt spire spire-nested --repo https://spiffe.github.io/helm-charts-hardened/ \
-f your-values.yaml -f spire-child-cluster-01.yaml

3. Setup Child Cluster spire-child-cluster-02

Next, we’ll configure the second child cluster using the same process as the first child cluster.

3a. Switch to spire-child-cluster-02 Cluster

Switch to the spire-child-cluster-02 cluster context:

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-child-cluster-02

3b. Mark spire-system namespace as Helm-managed for the spire release

kubectl label namespace spire-system app.kubernetes.io/managed-by=Helm --overwrite && kubectl annotate namespace spire-system meta.helm.sh/release-name=spire meta.helm.sh/release-namespace=spire-mgmt --overwrite

3c. Install CRDs and Server for spire-child-cluster-02

Install the CRDs and server for spire-child-cluster-02 using the specific child values file:

helm upgrade --install --create-namespace -n spire-mgmt spire-crds spire-crds \
--repo https://spiffe.github.io/helm-charts-hardened/
helm upgrade --install -n spire-mgmt spire spire-nested --repo https://spiffe.github.io/helm-charts-hardened/ \
-f your-values.yaml -f spire-child-cluster-02.yaml

Deploy Envoy Test Application

With SPIRE installed across all clusters, we’ll now deploy a test application to demonstrate secure service-to-service communication using SPIFFE identities.

1. Switch to spire-child-cluster-01

Ensure you’re connected to the child cluster:

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-child-cluster-01

2. Update SPIFFE Trust Domain in ConfigMaps

Before deploying, update the trust domain in each configmap.yaml file under the envoy/ directory to match your root cluster’s trust domain.

For example, change:

- name: "spiffe://spirenested.example.com/ns/ecommerce/sa/edge-proxy-service-account"

The domain must match the trustDomain value in helm/root-values.yaml:

trustDomain: spirenested.example.com

3. Create Namespace and Deploy

Create the ecommerce namespace and deploy the application:

kubectl create ns ecommerce
kubectl apply -R -f envoy/

4. Get Network Load Balancer DNS

Wait for the NLB to provision, then retrieve its DNS:

kubectl get svc -n ecommerce edge-proxy -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

5. Test the Application

Access the GraphQL endpoint using the NLB DNS:

http://<NLB-DNS>:8081/v1/graphql

Example:

http://k8s-ecommerc-edgeprox-a1b2c3d4e5-f6g7h8i9j0k1l2m3.elb.us-east-1.amazonaws.com:8081/v1/graphql

Run this query to verify the deployment:

query {
  orders {
    id
    orderFor
    product {
      id
      name
    }
  },
  products {
    id
    name
  }
}

You should see data for orders and products returned successfully.

6. Introduce a SPIFFE ID Mismatch

Edit envoy/graphql/configmap.yaml line 182 and change the SPIFFE ID from plural to singular:

exact: "spiffe://spirenested.example.com/ns/ecommerce/sa/order-service-account"

Apply the change and restart the deployment:

kubectl apply -f envoy/graphql/configmap.yaml
kubectl rollout restart deployment graphql -n ecommerce

7. Observe the Error

Run the same GraphQL query. You’ll see an error for orders:

{
  "errors": [
    {
      "message": "Request failed with status code 503",
      "locations": [
        {
          "line": 2,
          "column": 3
        }
      ],
      "path": [
        "orders"
      ]
    }
  ]
}

This occurs because the SPIFFE ID no longer matches the orders service.

8. Fix the Issue

Revert the change back to the correct SPIFFE ID:

exact: "spiffe://spirenested.example.com/ns/ecommerce/sa/orders-service-account"

Apply and restart:

kubectl apply -f envoy/graphql/configmap.yaml
kubectl rollout restart deployment graphql -n ecommerce

The query should now work correctly again.

Common Issues and Solutions

1. Service Name Length Constraints

Problem: Helm template generates service names exceeding 15 characters (RFC 6335 limit) Error: must be no more than 15 characters

Root Cause: The SPIRE Helm chart appends cluster identifiers to service names. For example:

  • Template generates: prom-cm-{CLUSTER_ID}
  • With ABCDEFG-qa-01: becomes prom-cm-ABCDEFG-qa-01 (21 characters) ❌
  • With child01: becomes prom-cm-child01 (15 characters) ✅

Solution: Use cluster identifiers of 7 characters or fewer:

# ❌ Too long - will cause deployment failure
--set "external-spire-server.kubeConfigs.routing-qa-01.kubeConfigBase64=..."

# ✅ Correct - short identifier
--set "external-spire-server.kubeConfigs.child01.kubeConfigBase64=..."

Important: The cluster identifier is only used for Helm templating and doesn’t need to match the actual EKS cluster name.

2. Image Registry Issues

Problem: Cannot pull images from public registries Solution: Configure private registry in your-values.yaml:

global:
  spire:
    image:
      registry: "your-private-registry.com"

3. Storage Class Issues

Problem: pod has unbound immediate PersistentVolumeClaims Solution: Ensure default storage class is configured:

kubectl patch storageclass gp2 -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

4. IP Exhaustion

Problem: Not enough IP addresses in VPC CIDR Solution:

  • Use larger CIDR blocks for VPCs
  • Implement IP address management (IPAM)
  • Consider using AWS VPC CNI with prefix delegation

5. IMDSv2 Compatibility Issue

SPIRE does not currently support IMDSv2 when strictly enforced. The Terraform configuration sets http_tokens = "optional" to allow both IMDSv1 and IMDSv2:

eks_managed_node_groups = {
  general-instances = {
    ...
    metadata_options = {
      http_endpoint = "enabled"
      http_tokens   = "optional"
    }
  }
}

Security Implications: Allowing IMDSv1 increases the risk of Server-Side Request Forgery (SSRF) attacks, as IMDSv1 does not require session tokens. To mitigate this risk:

  • Implement network policies to restrict pod access to the metadata service
  • Monitor IMDSv1 usage through CloudWatch metrics and VPC Flow Logs
  • Apply defense-in-depth controls such as IAM role restrictions and security groups
  • Review the Amazon EKS security best practices guide for additional recommendations

For an extensive review of IMDSv2 security enhancements, see Add defense in depth against open firewalls, reverse proxies, and SSRF vulnerabilities with enhancements to the EC2 Instance Metadata Service. For more details on the SPIRE limitation, see: IMDSv2 Requirement Breaks aws_mysql and aws_postgres Authentication

Verification Commands

Check SPIRE server status:

kubectl exec -n spire-mgmt spire-server-0 -- /opt/spire/bin/spire-server healthcheck

List registration entries:

kubectl exec -n spire-mgmt spire-server-0 -- /opt/spire/bin/spire-server entry show

Check agent status:

kubectl exec -n spire-mgmt daemonset/spire-agent -- /opt/spire/bin/spire-agent healthcheck

Troubleshooting Commands

Check SPIRE server logs:

kubectl logs -n spire-system deployment/spire-server

Verify agent connectivity:

kubectl exec -it spire-agent-pod -- spire-agent api fetch

Test mTLS communication:

openssl s_client -connect my-service:8443 -cert /path/to/cert.pem -key /path/to/key.pem

Best Practices and Security Considerations

1. Infrastructure Management

  • Use Infrastructure as Code (Terraform) for reproducible deployments
  • Implement proper tagging strategies for resource management
  • Use separate AWS accounts for different environments

2. Security

  • Rotate certificates regularly
  • Implement comprehensive monitoring and alerting
  • Use AWS IAM roles for service accounts (IRSA)
  • Enable AWS CloudTrail for audit logging

3. Operations

  • Implement automated backup strategies for Amazon Aurora
  • Use GitOps for configuration management
  • Implement proper CI/CD pipelines for updates
  • Monitor resource utilization and costs

4. Scalability

  • Plan for cluster growth and additional child clusters
  • Implement horizontal pod autoscaling
  • Use cluster autoscaling for dynamic node management
  • Consider multi-region deployments for disaster recovery

5. Additional Security Considerations

  • Network Policies: Implement Kubernetes network policies to restrict traffic
  • RBAC: Use least-privilege service accounts
  • Secrets Management: Store sensitive data in AWS Secrets Manager
  • Encryption: Enable encryption at rest and in transit
  • Audit Logging: Enable comprehensive audit logging

Cleanup

Before destroying the Terraform infrastructure, you must first uninstall Helm releases to remove AWS resources created by Kubernetes controllers (such as Application Load Balancers, security groups, and IAM roles).

1. Uninstall Helm Releases from Child Cluster 02

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-child-cluster-02
helm -n spire-mgmt uninstall spire
helm -n spire-mgmt uninstall spire-crds
kubectl -n spire-system delete pvc -l app.kubernetes.io/instance=spire
kubectl delete crds clusterfederatedtrustdomains.spire.spiffe.io clusterspiffeids.spire.spiffe.io clusterstaticentries.spire.spiffe.io

2. Uninstall Helm Releases from Child Cluster 01

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-child-cluster-01
helm -n spire-mgmt uninstall spire
helm -n spire-mgmt uninstall spire-crds
kubectl -n spire-system delete pvc -l app.kubernetes.io/instance=spire
kubectl delete crds clusterfederatedtrustdomains.spire.spiffe.io clusterspiffeids.spire.spiffe.io clusterstaticentries.spire.spiffe.io

3. Uninstall Helm Releases from Root Cluster

kubectx arn:aws:eks:us-east-1:111122223333:cluster/spire-root-cluster
helm -n spire-mgmt uninstall spire
helm -n spire-mgmt uninstall spire-crds
kubectl -n spire-mgmt delete pvc -l app.kubernetes.io/instance=spire
kubectl delete crds clusterfederatedtrustdomains.spire.spiffe.io clusterspiffeids.spire.spiffe.io clusterstaticentries.spire.spiffe.io

4. Destroy Child Cluster 02

cd infrastructure/eks/spire-child-cluster-02/
terraform destroy

5. Destroy Child Cluster 01

cd ../spire-child-cluster-01/
terraform destroy

6. Destroy Root EKS Cluster

cd ../spire-root-cluster/
terraform destroy

7. Destroy Aurora MySQL

cd ../../rds/spire-datastore/
terraform destroy

Cost Considerations

  • EKS clusters: Pricing based on instance types and usage
  • Amazon Aurora MySQL: Serverless v2 pricing based on compute and storage
  • Data transfer: Costs for inter-cluster communication

This blog post is based on Nested SPIRE deployment patterns and AWS EKS infrastructure. Always refer to the latest SPIRE documentation and AWS best practices for the most current deployment guidance.

Conclusion

This Nested SPIRE deployment provides a robust foundation for secure workload identity management across multiple Kubernetes clusters. The architecture scales well and provides the flexibility needed for complex microservices environments while maintaining strong security postures.

By following this post, you’ve learned how to deploy SPIRE in a nested configuration on Amazon EKS, configure workload attestation and identity management, implement service-to-service authorization policies, monitor and troubleshoot your SPIFFE/SPIRE deployment, and apply best practices for production deployments.

The combination of AWS managed services (Amazon EKS, Amazon Aurora) with the Nested SPIRE architecture creates a production-ready identity infrastructure that can support enterprise-scale applications with stringent security requirements.

As cloud-native architecture continues to evolve, SPIFFE/SPIRE will play an increasingly important role in securing distributed systems. Consider exploring integrations with service meshes like Istio and monitoring tools like Prometheus to further enhance your security posture.

Ready to get started? Clone the sample repository and deploy your own nested SPIRE environment on Amazon EKS. For questions or to share your implementation experiences, join the SPIFFE community Slack or engage with the AWS community on re:Post.

For additional support and advanced configurations, refer to the SPIRE documentation and the Helm charts repository.


About the authors

Eric Cavalcanti

Eric Cavalcanti is a Delivery Consultant at Amazon Web Services, specializing in containers and cloud-native architectures. He works with customers across a wide range of domains, quickly adapting to each customer’s unique needs and helping them deliver practical, production-ready solutions on AWS.

Gaurav Gundal

Gaurav Gundal is a DevOps consultant with AWS Professional Services, helping customers build solutions on the customer platform. When not building, designing, or developing solutions, Gaurav spends time with his family, plays guitar, and enjoys traveling to different places.