Innovium Speeds Innovation by Running Chip-Design Workloads on AWS

Looking for Technology to Enable Scalability and Elasticity

In the world of Ethernet switch manufacturing, it’s all about speed. Companies like Innovium, which designs and makes silicon for Ethernet switches, must get products into customers’ hands as fast as possible to meet tight delivery deadlines. As a startup, however, Innovium couldn’t accommodate the compute and storage resources needed to support its business requirements. “We serve some of the biggest companies in the world, and we need to reliably deliver our products on time,” says Keith Ring, vice president of technology for Innovium. “We knew that would be impossible in our on-premises infrastructure model. We couldn’t get the compute capacity we needed to complete jobs cost-effectively because we lacked the necessary physical space, cooling capacity, and power.”

Innovium also needed more elasticity, so it could scale its electronic design automation (EDA) workloads up or down as necessary. “We use a handful of servers in the early phase of designing our products, but in the final months of design, we need to scale our compute and storage resources significantly,” says Mohammad Issa, Innovium’s founder and chief development officer.

“Using AWS, we can remove the compute scalability barrier, so we can focus exclusively on product innovations.”

– Keith Ring, Vice President of Technology, Innovium


  • About Innovium
    • Based in Silicon Valley, California, Innovium provides high-performance silicon switches for data centers worldwide. The company’s TERALYNX product family delivers software-compatible products ranging from 3.2Tbps to 12.8Tbps.

  • Benefits of AWS
    • Speeds innovation by scaling HPC workloads on demand to hundreds of cores
    • Meets strict SLAs for product delivery
    • Competes against much larger companies

Running HPC Workloads on High-Memory Amazon EC2 Instances

To meet its needs for scalability and elasticity, Innovium chose to set up its high-performance computing (HPC) environment on the Amazon Web Services (AWS) Cloud. “We selected AWS because it is a leader in cloud computing, with proven capabilities, and some of our developers had experience working with AWS,” Issa says.

Innovium uses high-memory Amazon Elastic Compute Cloud (Amazon EC2) X1 instances powered by Intel Xeon E7 processors to support memory-intensive HPC workloads. “We need the largest possible footprint when we’re running HPC jobs, so we can scale the number of cores in a linear fashion,” Ring says. “That’s what we get from Amazon EC2 X1 instances.”

Scaling to Hundreds of Cores on Demand

Innovium has seen an eightfold improvement in HPC processing throughput since moving to AWS. “Many EDA tools are designed for hundreds of cores, and AWS provided us with the solution to get that kind of scalability,” Ring says. “Using AWS, we took advantage of the ability to scale to 264 cores per job, distributed across multiple machines, delivering better performance than achievable on our local servers. This means we can scale quickly and easily to support our integrated circuit design workloads.”

Meeting Strict SLAs for Product Delivery

With more scalability and elasticity, Innovium can confidently and consistently deliver high-quality products to customers on time and focus on innovation in chip development instead of infrastructure management. “The elasticity of the AWS Cloud enables us to quickly turn cores on or off during the final phase of product design,” Issa says. “With the availability of AWS Cloud services, Innovium can build a local data center to accommodate average usage while offloading peak workload to the cloud.” The availability and reliability of the AWS Cloud also enable Innovium to reduce schedule risk. “We can deliver our products in the agreed time frame much more reliably on AWS,” says Ring. “We don’t miss schedules for lack of compute capacity.”

Also, because Innovium can launch compute resources and terminate them when EDA jobs are complete, the company avoids the costs of overprovisioning local capacity. “By gaining access to a large number of machines for a short period of time, we don’t have to invest in overbuilding our local data center for peak loads,” Issa says.
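The split between local capacity sized for average usage and cloud capacity used only for peaks can be illustrated with a small calculation. The sketch below is purely illustrative: the weekly demand figures, the 32-core local cluster, and the function name `split_demand` are assumptions made for this example and do not describe Innovium’s actual environment or scheduler.

```python
def split_demand(demand_cores, local_capacity):
    """Split each period's core demand into a locally served part
    and a part that would be offloaded (burst) to the cloud."""
    local = [min(d, local_capacity) for d in demand_cores]
    burst = [max(d - local_capacity, 0) for d in demand_cores]
    return local, burst


# Hypothetical weekly core demand over a design cycle (made-up numbers):
# a handful of cores early on, then a sharp ramp in the final phase.
demand = [16, 16, 24, 32, 200, 264]
local_capacity = 32  # assumed size of the on-premises cluster, in cores

local, burst = split_demand(demand, local_capacity)
peak_burst = max(burst)  # largest number of cloud cores needed in any week

print("local:", local)
print("burst:", burst)
print("peak cloud cores:", peak_burst)
```

In this made-up scenario, cloud cores are needed only in the last two weeks, so the local cluster never has to be sized for the peak.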

Competing with Much Larger Companies

The availability of reliable and scalable compute is key to leveling the playing field for small companies. “There are barriers beyond integrated circuit innovations to delivering products in this industry, of which infrastructure scalability stands out. For a small company, those barriers can be even greater,” says Ring. “Using AWS, we can remove the compute scalability barrier, so we can focus exclusively on product innovation.”


Learn More

Learn more about high-performance computing on AWS.