AWS AI Factories FAQs
General
AWS AI Factories are purpose-built, dedicated environments that AWS deploys and fully manages in your data center on your behalf, combining the latest AI accelerators (including AWS Trainium and NVIDIA GPUs), specialized networking, and high-performance storage. Each AWS AI Factory is available exclusively to you or your designated trusted community, so you can meet data residency and sovereignty requirements while maintaining access to a broad set of AWS AI capabilities.
Each AWS AI Factory includes AWS AI services such as Amazon Bedrock and Amazon SageMaker AI, enabling immediate access to leading foundation models without the need to negotiate separate agreements with individual model providers.
AWS AI Factories are built for organizations operationalizing AI at scale, delivering purpose-built, dedicated infrastructure for training, fine-tuning, and inference workloads in secure and sovereign environments. This includes government organizations advancing digital transformation with full control over data and infrastructure, as well as enterprises with compute-intensive AI workloads that require dedicated, secure, and highly scalable environments.
AWS deploys AI Factories exclusively within your data centers, whether you own the facility or lease space through a co-location provider. AWS works closely with you to implement and manage these environments in your facilities while maintaining AWS's security standards.
AWS AI Factories use the same AWS-designed networking, compute, and storage infrastructure deployed across AWS Regions, purpose-built and optimized to deliver a consistent, high-performance experience for AI workloads. This includes AWS Trainium and NVIDIA GPU-based EC2 instances, high-performance networking technologies such as Elastic Fabric Adapter (EFA) and NVLink, and scalable storage and security services designed for large-scale AI training and inference.
Yes. AWS AI Factories support both multi-account and multi-tenant environments.
For example:
Single customer: An enterprise can run a centralized AWS AI Factory while different business units or teams operate in separate AWS accounts with independent governance and isolation.
Trusted multi-tenant community: Multiple organizations can securely share AWS AI Factory infrastructure while maintaining tenant isolation and access controls.
Start by engaging with your AWS account team to scope your requirements. From there, AWS works with you on site readiness assessment, data center preparation, and AWS AI Factory configuration.
To get started, fill out this form or contact your AWS account team.
Services
AWS AI Factories come with a broad set of AWS services, purpose-built to support demanding AI inference, training, and fine-tuning workloads.
AI/ML Services: Amazon Bedrock and Amazon SageMaker AI give you immediate access to leading foundation models and tools to build, train, and deploy machine learning models without managing the underlying infrastructure.
Compute: Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), and AWS Batch provide flexible, high-performance compute for AI workloads at any scale.
Storage: Amazon S3 Express One Zone, Amazon EBS, and Amazon FSx for Lustre deliver the high-throughput, low-latency storage that AI training and inference workloads demand.
Networking and security: Amazon Virtual Private Cloud (Amazon VPC), AWS Direct Connect, Elastic Load Balancing, and AWS Shield deliver the secure, high-throughput, low-latency connectivity that large-scale AI training and inference demand.
Additionally, you get access to the majority of services in AWS Regions, including AWS CloudFormation and Amazon CloudWatch, to support your AI and sovereign workloads.
Amazon Bedrock in AWS AI Factories is delivered via the Bedrock distributed inference engine and provides access to leading foundation models. AWS works closely with third-party and closed-weight model providers to validate and enable model availability, so that you can use the most suitable models based on use case, performance, and regulatory requirements.
Yes. AWS AI Factories are fully managed and include Amazon EBS, Amazon FSx for Lustre, and Amazon S3 Express One Zone for high performance block, file, and object storage.
AWS AI Factories support the following Amazon EBS volume types:
- General Purpose SSD – gp3, gp2
- Provisioned IOPS SSD – io1, io2
- Throughput Optimized HDD – st1
- Cold HDD – sc1
These volume types give customers flexibility to match storage performance and cost to their workload requirements.
You can create S3 directory buckets with Amazon S3 Express One Zone, a high-performance, single-zone storage class purpose-built to deliver consistent single-digit millisecond data access for your most frequently accessed data and latency-sensitive applications. Amazon S3 Express One Zone delivers data access speeds up to 10x faster and request costs up to 80% lower than Amazon S3 Standard.
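As a minimal sketch, the helper below builds a directory bucket name using the `base-name--zone-id--x-s3` convention that S3 Express One Zone uses in AWS Regions. The function name is hypothetical, the zone ID `usw2-az1` is a Region example, and the zone identifier that an AWS AI Factory exposes is not stated here, so treat that value as an assumption.

```python
import re

# Lowercase letters, digits, and hyphens; must start and end with a letter or digit.
_PART = re.compile(r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$")


def directory_bucket_name(base_name: str, zone_id: str) -> str:
    """Return a name of the form '<base-name>--<zone-id>--x-s3' (hypothetical helper)."""
    for label, value in (("base_name", base_name), ("zone_id", zone_id)):
        if _PART.match(value) is None:
            raise ValueError(
                f"{label} {value!r} must use only lowercase letters, digits, and hyphens"
            )
    name = f"{base_name}--{zone_id}--x-s3"
    # Region naming rules cap the full name, suffix included, at 63 characters.
    if len(name) > 63:
        raise ValueError(f"bucket name {name!r} is longer than 63 characters")
    return name


# Example with a Region zone ID; an AI Factory zone ID is an assumption.
print(directory_bucket_name("training-data", "usw2-az1"))
```

Validating the name locally is only a convenience; the service remains the authority on which names it accepts.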
AWS AI Factories deliver high-bandwidth, non-blocking networking through Elastic Fabric Adapter (EFA), enabling your AI training and inference workloads to scale across thousands of accelerators without performance degradation.
AWS AI Factories connect to your selected AWS Region over the high-bandwidth, low-latency AWS Global Network, giving you seamless access to the majority of AWS services such as Amazon DynamoDB and Amazon Aurora running in that Region.
Yes. AWS establishes a private AWS Direct Connect Point of Presence (PoP) at your AWS AI Factory location. This enables you to connect your on-premises network directly to the AWS-managed AWS AI Factory infrastructure within the same facility, delivering low-latency, high-bandwidth connectivity without traversing the public internet.
Yes, you can use NVIDIA Run:ai in AWS AI Factories through two options: an NVIDIA managed SaaS solution available through AWS Marketplace, or a self-managed deployment. Run:ai integrates with Amazon EKS and Amazon SageMaker HyperPod to orchestrate AI workloads and optimize GPU utilization.
Yes, in addition to AWS managed services like Amazon Bedrock and Amazon SageMaker, you can deploy NVIDIA AI Enterprise with flexible licensing options: bring your own license or purchase licensing through AWS Marketplace. This provides access to NVIDIA NIM and NeMo microservices.
AI accelerators and compute
AWS AI Factories support AI accelerators from both AWS and NVIDIA through fully managed EC2 instances optimized for large-scale training and inference workloads.
Available accelerator options include AWS Trainium (Trn2, Trn3) and NVIDIA GPUs (P6-B200, P6-B300, P6e-GB200 UltraServers, P6e-GB300 UltraServers). You can deploy multiple accelerator types within a single AWS AI Factory to optimize for different workload requirements. AWS also plans to offer NVIDIA Rubin GPUs and the Vera Rubin platform once they are generally available, subject to timing and configuration requirements.
AWS AI Factories also include EC2 instance families including compute optimized (C), memory optimized (R), general purpose (M), and storage optimized (I) to support the broader range of AI application and infrastructure needs.
Deployment and management
Deployment takes approximately 3-6 months once the data center is ready and handed over to AWS, depending on configuration complexity and component availability. AWS AI Factories use the same infrastructure as AWS Regions and are delivered and managed end to end by AWS, reducing the need to independently coordinate multiple infrastructure vendors and helping accelerate your time to value.
Your AWS AI Factory is accessible through the standard AWS Management Console and API endpoints for its parent Region. Your administrator designates which AWS accounts or organizations are authorized to use your AWS AI Factory. Once access is granted, the AWS AI Factory appears alongside other Availability Zones in your AWS account, allowing your teams to use familiar AWS tools, APIs, and console interfaces from day one.
Capacity
Yes. AWS works with you to plan and scale capacity over time. As part of the configuration process, AWS collaborates with you to define long-term scaling requirements so that your data center has sufficient power and space to support future expansion.
AWS provides the flexibility to adopt new EC2 instances as they become available. You can deploy multiple accelerator types within a single AWS AI Factory and work with AWS to add newer generations through a private pricing agreement addendum.
Yes, you can reserve EC2 instance capacity in AWS AI Factories using Amazon EC2 Capacity Reservations, providing reserved compute capacity for your workloads.
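A minimal sketch of what such a reservation request could look like, assuming the AWS AI Factory is addressed through an Availability Zone name in its parent Region. The helper name and the zone name `example-ai-factory-az` are hypothetical; the keys mirror the parameters of the EC2 `CreateCapacityReservation` API as exposed by boto3.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def capacity_reservation_request(
    instance_type: str,
    availability_zone: str,
    instance_count: int,
    end_date: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for EC2 CreateCapacityReservation (hypothetical helper)."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    params: Dict[str, Any] = {
        "InstanceType": instance_type,
        "InstancePlatform": "Linux/UNIX",
        "AvailabilityZone": availability_zone,  # assumed AI Factory zone name
        "InstanceCount": instance_count,
        # "targeted" means only launches that name this reservation consume it.
        "InstanceMatchCriteria": "targeted",
        "EndDateType": "unlimited",
    }
    if end_date is not None:
        # A reservation with an end date is released automatically at that time.
        params["EndDateType"] = "limited"
        params["EndDate"] = end_date
    return params


params = capacity_reservation_request("trn2.48xlarge", "example-ai-factory-az", 4)
print(params)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="<parent-region>")
#   ec2.create_capacity_reservation(**params)
```

Building the parameters separately from the API call keeps the sketch runnable without credentials and makes the request easy to review before it is sent.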
Pricing and SLAs
Pricing for AWS AI Factories is tailored to each customer's requirements and depends on factors such as deployment location, scale, selected accelerators and services, and existing customer infrastructure. Pricing is made available after AWS and the customer complete a joint initial assessment of the AWS AI Factory configuration.
AWS services running in AWS AI Factories are covered by service-specific SLAs. For details on applicable SLA commitments, refer to the AWS Service Level Agreements page. These SLAs are backed by comprehensive data center assessments against AWS's infrastructure standards and by the consistent infrastructure and services delivered in each AWS AI Factory.
Security and sovereignty
AWS AI Factories are sovereign by design, giving you control over where your data resides, who can access it, and how AI workloads are secured and operated. They apply the same security, monitoring, management, and auditing capabilities used in AWS Regions, while enabling you to meet data residency and sovereignty requirements within your own environment.
At the hardware level, AWS AI Factories use the AWS Nitro System, which is designed with no operator access. There is no mechanism for AWS personnel or systems to log in to EC2 Nitro hosts, access instance memory, or access customer data stored on encrypted local instance storage or encrypted Amazon EBS volumes.
At the account, application, and agent level, access to AWS AI Factory resources is controlled through AWS Identity and Access Management (IAM) policies and customer-defined permissions. AWS Control Tower provides preventive, detective, and proactive controls to help organizations implement governance frameworks and support data residency and compliance requirements.
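As one illustration of a customer-defined permission, the sketch below builds an IAM policy document that denies `ec2:RunInstances` for instances outside a single Availability Zone, using the `ec2:AvailabilityZone` condition key. It assumes the AWS AI Factory can be targeted by zone name; the function name and the zone name `example-ai-factory-az` are hypothetical, and this is not a policy prescribed by AWS.

```python
import json
from typing import Any, Dict


def deny_launch_outside_zone(zone_name: str) -> Dict[str, Any]:
    """Return an IAM policy document that blocks instance launches outside zone_name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRunInstancesOutsideZone",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # The Deny applies whenever the target zone differs from zone_name.
                "Condition": {
                    "StringNotEquals": {"ec2:AvailabilityZone": zone_name}
                },
            }
        ],
    }


# Hypothetical zone name standing in for the AI Factory.
policy = deny_launch_outside_zone("example-ai-factory-az")
print(json.dumps(policy, indent=2))
```

A Deny statement like this only narrows access; principals still need a separate Allow to launch instances at all.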
AWS AI Factories also provide control over where data is stored and processed. The AWS AI Factory data plane—including model training and inference workloads—remains within the AWS AI Factory perimeter unless you explicitly choose to integrate with AWS Region services such as Amazon S3. This helps ensure that sensitive data and AI workloads stay within your defined sovereignty boundary. Inference workloads, including Amazon Bedrock endpoints, can operate within the AWS AI Factory to keep prompts, responses, and model interactions within the sovereignty boundary.
Most AWS services support encryption at rest and in transit, with many offering customer-managed keys inaccessible to AWS operators. For organizations requiring keys to remain outside the AWS Cloud, the AWS Key Management Service External Key Store enables the use of externally managed keys.
AWS can also work with you to implement additional sovereignty requirements, such as data access monitoring and audit programs, controls restricting infrastructure access to designated AWS accounts, and operational controls enforcing nationality or security clearance requirements for local personnel.
Yes, you have full control over which accounts can access your AI Factory, and AWS provides identity, governance, and access management tools to manage permissions and centralize control across accounts.
No, only AWS personnel are authorized to operate AWS AI Factory infrastructure and services.