Meta has announced advances in its AI infrastructure with the unveiling of the next-generation Meta Training and Inference Accelerator (MTIA), a major leap forward in its efforts to improve AI-based research and products. The MTIA program is designed to improve the efficiency of Meta's unique AI workloads: its platforms improve the user experience by serving deep-learning recommendation models.
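To make the workload concrete: deep-learning recommendation models of this kind typically look up embeddings for sparse user and item features and combine them into a click-probability score. The pure-Python sketch below is purely illustrative (the tables, sizes, and `score` function are invented for this example, not Meta's implementation):

```python
import math

# Hypothetical minimal recommendation-model scorer: embedding lookup
# plus a dot-product interaction, the pattern DLRM-style models use.
# All tables and dimensions here are illustrative, not Meta's.
user_table = {0: [0.1, 0.2, 0.3, 0.4], 1: [0.5, 0.1, 0.0, 0.2]}
item_table = {0: [0.3, 0.1, 0.2, 0.1], 1: [0.0, 0.4, 0.1, 0.3]}

def score(user_id: int, item_id: int) -> float:
    """Score a user-item pair via a dot product of their embeddings."""
    u = user_table[user_id]
    v = item_table[item_id]
    logit = sum(a * b for a, b in zip(u, v))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> click probability

if __name__ == "__main__":
    # Rank both items for user 0 by predicted probability.
    ranked = sorted(item_table, key=lambda i: score(0, i), reverse=True)
    print(ranked)
```

At production scale, the embedding lookups dominate memory traffic, which is why the accelerator's memory capacity and bandwidth figures discussed below matter so much for these models.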
First Gen vs. Next Gen MTIA
To meet the increasing demands of AI workloads, the first-generation MTIA has received a significant technology upgrade. The first-gen MTIA was built on TSMC's 7nm process, ran at frequencies up to 800MHz, and delivered up to 102.4 TFLOPS for INT8. The chip paired 64GB of off-chip LPDDR5 with 128MB of on-chip SRAM within a 25-watt TDP. Memory bandwidth was 800GB/s for on-chip memory and 400GB/s for local (per-PE) memory. This setup was optimized for power efficiency and performance.
The move to TSMC's 5nm process enables the next-generation accelerator to run at a higher frequency of 1.35GHz and doubles the gate count to 2.35 billion, a significant boost to the chip's processing power: peak INT8 compute rises from 102.4 TFLOPS to 354 TFLOPS dense (708 TFLOPS with sparsity). The next-gen MTIA triples the local PE memory, doubles the on-chip SRAM to 256MB, and doubles the off-chip LPDDR5 to 128GB. Memory bandwidth also increases, reaching 1TB/s per PE for local memory and 2.7TB/s for on-chip memory, ensuring higher efficiency and data throughput. The TDP rises to 90 watts to support the new performance levels, and the host connection is upgraded to PCIe Gen5 x8, doubling bandwidth to 32GB/s for faster data transfer between host and accelerator. These enhancements provide a solid foundation for developing and deploying AI-driven services and applications.
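Tallying the generation-over-generation improvement factors from the figures above is simple arithmetic (the numbers are taken from this article; this is not an official Meta comparison):

```python
# First-gen vs. next-gen MTIA figures as reported above.
first_gen = {"freq_mhz": 800, "sram_mb": 128, "lpddr5_gb": 64,
             "tdp_w": 25, "pcie_gbps": 16}
next_gen = {"freq_mhz": 1350, "sram_mb": 256, "lpddr5_gb": 128,
            "tdp_w": 90, "pcie_gbps": 32}

# Improvement factor for each spec.
ratios = {k: round(next_gen[k] / first_gen[k], 2) for k in first_gen}
print(ratios)
# Frequency rises ~1.69x, while on-chip SRAM, LPDDR5 capacity, and
# PCIe bandwidth each double; TDP grows 3.6x (25W -> 90W).
```

The TDP growing faster than the clock frequency reflects the much larger compute and memory budget of the new chip, not a regression in efficiency per operation.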
Features
The next-gen MTIA's core is an 8×8 grid of processing elements (PEs), which significantly improves both dense and sparse compute performance. This gain comes from architectural improvements together with substantial increases in bandwidth, on-chip SRAM, and local PE storage. The accelerator's improved network-on-chip (NoC) architecture allows faster coordination between PEs and low-latency data processing, which is essential for complex AI tasks. Meta's innovation extends well beyond the silicon: the next-generation MTIA is supported by an advanced rack-based platform that houses 72 accelerators, letting Meta's AI-focused projects scale up significantly. Its modular design allows the system to run at higher frequencies, yielding greater efficiency across a wide range of model complexity.
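To make the 8×8 PE grid concrete, the sketch below shows one hypothetical way a 2-D output could be block-partitioned across 64 PEs, each owning one tile. This is only an illustration of grid tiling; the actual work distribution on MTIA is handled by Meta's compiler stack and is not public:

```python
GRID = 8  # 8x8 grid of processing elements, as on the MTIA accelerator

def assign_tiles(rows: int, cols: int, grid: int = GRID):
    """Map each PE (r, c) to the slice of an rows x cols output it owns.

    Purely illustrative block partitioning, not Meta's scheduler.
    """
    rstep = -(-rows // grid)  # ceiling division
    cstep = -(-cols // grid)
    tiles = {}
    for r in range(grid):
        for c in range(grid):
            tiles[(r, c)] = (
                slice(r * rstep, min((r + 1) * rstep, rows)),
                slice(c * cstep, min((c + 1) * cstep, cols)),
            )
    return tiles

tiles = assign_tiles(1024, 1024)
print(len(tiles), tiles[(0, 0)])  # 64 tiles; PE (0,0) owns a 128x128 block
```

Under a partitioning like this, neighboring PEs hold adjacent tiles, which is exactly the access pattern a low-latency NoC is designed to serve well.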
The MTIA ecosystem also relies on software integration: Meta leverages its work on PyTorch to guarantee seamless compatibility, which in turn increases developer productivity. Triton-MTIA, a compiler backend, translates AI models into instructions that run efficiently on the MTIA hardware, simplifying the development process.
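Conceptually, a compiler backend like Triton-MTIA walks a model graph and emits an instruction stream for the device. The toy lowering pass below is entirely hypothetical (none of these opcodes are real MTIA instructions, and real codegen is far more involved); it only illustrates the idea of translating model ops into hardware instructions:

```python
# Hypothetical op -> instruction lowering; real MTIA codegen is not public.
OPCODE = {"matmul": "GEMM", "relu": "MAX0", "add": "VADD"}

def lower(graph):
    """Translate a list of (op, dst, srcs) nodes into instruction strings."""
    program = []
    for op, dst, srcs in graph:
        opcode = OPCODE[op]
        program.append(f"{opcode} {dst}, {', '.join(srcs)}")
    return program

# A small MLP fragment: y = relu(W @ x + b)
graph = [("matmul", "t0", ["W", "x"]),
         ("add", "t1", ["t0", "b"]),
         ("relu", "y", ["t1"])]
print(lower(graph))
```

Keeping this translation inside the PyTorch toolchain is what lets developers target the accelerator without hand-writing device code.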
First Performance Results of the Next-Generation MTIA
Meta states that preliminary performance metrics show a significant increase over the previous generation, demonstrating the chip's ability to efficiently process both simple and complex ranking and recommendation algorithms. The chip handles models of varying size and complexity while outperforming commercial GPUs on these workloads, and the company is focusing on improving energy efficiency as it deploys the chips in its systems. Initial testing shows the next-generation MTIA tripling the performance of its predecessor across key models. With an upgraded platform that doubles the number of devices and uses dual-socket CPUs, Meta achieved six times the model serving efficiency and a 1.5x improvement in performance per watt over the first-generation MTIA system. These improvements result from extensive optimization of the computing components and server infrastructure. The developer ecosystem has matured, making model optimization faster, and many opportunities for efficiency improvements remain. MTIA is now active in Meta's data centers, enhancing the company's AI workload processing, and has proven to be an important strategic complement to commercial GPUs. This release is a major step toward advancing AI technologies and applications, and several initiatives are underway to expand MTIA's functionality.
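A quick sanity check on those platform-level numbers, taking the reported gains as 6x serving throughput at 1.5x performance per watt (simple arithmetic on this article's figures, not an official Meta derivation): since performance per watt is throughput divided by power, the upgraded platform must draw about 4x the power of the first-generation system.

```python
# Reported platform-level gains from this article.
throughput_gain = 6.0       # 6x model serving throughput
perf_per_watt_gain = 1.5    # 1.5x performance per watt

# perf/W = throughput / power, so the implied power ratio is:
power_ratio = throughput_gain / perf_per_watt_gain
print(power_ratio)  # 4.0
```

That 4x power increase is consistent with the upgraded setup doubling the device count while each next-gen chip carries a much higher TDP than its 25W predecessor.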