

Designing with the whole system in mind

The Annapurna Labs office looks like a typical working space with a mix of employees typing at desktops and brainstorming in conference rooms. Many of these employees are at the front line of machine learning acceleration, developing the layers of software that power silicon chips. They make up a critical part of Annapurna’s secret sauce—a system-first mindset.

“Instead of building a chip and then integrating it into a system and writing software for it, we flipped the process on its head,” said Ron Diamant, lead architect. “We first design the full system and work backwards from that to specify the most optimal chip for that system. And this allows us to create a much more tailored chip for the workloads that we’re trying to accelerate.”
Past the rows of cubicles are three different labs where the hardware comes into play. Engineers at cable-covered workstations use power tools to build boards and specialized microscopes to view tiny chip components. Dozens of fans keep the equipment (and humans) cool as servers run in the background. Despite its scattered state, everything in the lab has a purpose and serves as a reminder that learning can be messy.

“When you go into the lab, you see equipment everywhere. It’s organized chaos,” said Rami Sinno, director of silicon engineering. “We iterate quickly, we fail early, and we fix it. And this is what allows us to continuously deliver very high-performance, low-cost products to our customers.”

Annapurna Labs’ vertically integrated process enables control of the entire stack of components required for machine learning accelerator servers. Both software and hardware engineers collaborate at every stage of development, from chip design to server deployment in AWS data centers.

“As we’re developing the chip, we develop the software in parallel. We use both of them in testing so that we make sure everything’s working well together and we can do trade-off analysis,” said Laura Sharpless, software engineering manager. “Every day I come into the office, I get to solve a new problem. Maybe today we’re working on hardware, physical boards. And tomorrow we’re looking at how do we actually scale the software and support multiple generations really seamlessly to scale faster.”

Annapurna’s testing and validation processes are critical to ensuring the reliability and robustness of components for 24/7 operation in AWS data centers. Engineers test all software and hardware components at every level, from chip to board to server. The lab consists of stations where engineers use specialized equipment to verify functionality and to vary conditions such as voltage and temperature.

“Testing significantly cuts down the development time so our software engineers can iterate faster,” said Prashant Pappu, principal hardware engineer, “and hardware engineers can focus on finding issues early on in the cycle.”

Prior to the acquisition, Annapurna Labs and AWS worked together on the production of the next-generation AWS Nitro hardware and its supporting hypervisor. Just over a decade later, Nitro is essential to every AWS server. The technology is the foundation of EC2 instances and enables AWS to innovate faster, further reduce costs for customers, and deliver increased security. Shortly after joining AWS, Annapurna Labs embarked on Graviton, its second product line. Now in its fourth generation, Graviton gives customers more computing capability while reducing their carbon footprint.

Annapurna Labs’ machine learning chips—Inferentia and Trainium—are the third product line. Their names are a direct reflection of their use cases. Customers use Inferentia to run machine learning inference at scale and Trainium to run large-scale training workloads like generative AI and computer vision. Trainium2, the second-generation chip, is an essential part of Annapurna Labs’ development of increasingly powerful AI computing systems like Trainium2 instances and UltraServers.

“An UltraServer combines four Trainium2 servers and 64 Trainium2 chips into one server with very fast connections between them,” said Tobias Edler von Koch, principal software engineer. “As machine learning models become too large to be handled by an individual chip, or even by an individual machine, you need to scale out and have multiple servers collaborate.”

Annapurna Labs’ stealthy setup and eager engineers make it uniquely suited to meet the demand for continuous innovation in the rapid race for AI advancement. Along with developing its next-generation chips, Annapurna is partnering with AI startup Anthropic to take on its most ambitious challenge yet: building Project Rainier, expected to be the world’s largest supercomputer.

“It’s so exciting to be in this fast-moving environment, innovating on behalf of customers and working closely with customers to make sure that we are building the right things in the future,” said Gadi Hutt, director of product and customer engineering. “My prediction is the next celebration in 20 years will come much faster because we’re having so much fun.”