Meta showcases the hardware that will power recommendations for Facebook and Instagram: low-cost RISC-V cores and mainstream LPDDR5 memory are at the heart of its MTIA recommendation inference accelerator
In 2023, Meta unveiled its first-generation in-house AI inference accelerator, designed to power the ranking and recommendation models that are key components of Facebook and Instagram.
The Meta Training and Inference Accelerator (MTIA) chip, which can handle inference but not training, was updated in April, doubling the compute and memory bandwidth of the first-generation part.
At the Hot Chips symposium last month, Meta gave a presentation on its next-generation MTIA and admitted that using GPUs for recommendation engines is not without challenges. The social media giant noted that peak performance doesn’t always translate to effective performance, large deployments can be resource-intensive, and capacity constraints are exacerbated by the growing demand for generative AI.
Mysterious memory expansion
Taking this into account, Meta’s development goals for the next generation of MTIA include improving performance per TCO and per watt compared to the previous generation, efficiently handling models across multiple Meta services, and enhancing developer efficiency to quickly achieve high-volume deployments.
Meta’s latest MTIA gains a significant generation-over-generation performance boost: GEMM throughput rises 3.5x to 177 TFLOPS at BF16, hardware-based tensor quantization delivers accuracy comparable to FP32, and optimized support for PyTorch Eager Mode enables job launch times under 1 microsecond and job replacement in less than 0.5 microseconds. Additionally, TBE (table batched embedding) optimization speeds up the download and prefetch of embedding indices, achieving run times 2-3x faster than the previous generation.
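To make the quantization claim more concrete, here is a minimal software sketch of per-tensor symmetric INT8 quantization. It is a generic textbook scheme, not Meta’s hardware implementation, which the presentation as reported here does not detail. The function names `quantize_int8` and `dequantize_int8` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one scale for the whole tensor.

    Assumption: symmetric, per-tensor scaling. Real accelerators may use
    per-channel or per-block scales, or different rounding modes.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when the tensor is empty or all zeros.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize_int8(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("codes:", codes)
    print("scale:", scale)
    print("worst-case absolute error:", worst)
```

The round-trip error of each element is bounded by half the scale, which is why a well-chosen scale can keep results close to the full-precision ones while the arithmetic runs on narrow integers.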
The MTIA chip, built on TSMC’s 5nm process, operates at 1.35 GHz with a gate count of 2.35 billion and offers GEMM performance of 354 TOPS (INT8) and 177 TFLOPS (FP16). It uses 128GB of LPDDR5 memory with 204.8GB/s of bandwidth, all within a 90-watt TDP.
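The quoted figures can be combined into a few derived metrics. The sketch below is back-of-the-envelope arithmetic on the numbers in this article, not data from Meta: it computes peak throughput per watt, the arithmetic intensity at which the chip would stop being limited by LPDDR5 bandwidth (a simple roofline "ridge point"), and the previous-generation GEMM rate implied by the 3.5x uplift. It assumes peak rates and ignores any on-chip memory, which would raise the effective bandwidth. All function and constant names are hypothetical.

```python
# Figures quoted in the article for the next-generation MTIA.
INT8_TOPS = 354.0      # peak GEMM throughput, INT8
FP16_TFLOPS = 177.0    # peak GEMM throughput, FP16/BF16
MEM_BW_GB_S = 204.8    # LPDDR5 bandwidth in GB/s
TDP_WATTS = 90.0       # thermal design power
GEMM_UPLIFT = 3.5      # quoted generation-over-generation GEMM gain


def per_watt(tera_ops: float, watts: float) -> float:
    """Peak tera-operations per second delivered per watt of TDP."""
    return tera_ops / watts


def ridge_point(tera_ops: float, bandwidth_gb_s: float) -> float:
    """Operations needed per byte of DRAM traffic to become compute-bound."""
    return (tera_ops * 1e12) / (bandwidth_gb_s * 1e9)


def implied_previous(tera_ops: float, uplift: float) -> float:
    """Previous-generation rate implied by the quoted uplift factor."""
    return tera_ops / uplift


if __name__ == "__main__":
    print(f"INT8 TOPS per watt:       {per_watt(INT8_TOPS, TDP_WATTS):.2f}")
    print(f"FP16 TFLOPS per watt:     {per_watt(FP16_TFLOPS, TDP_WATTS):.2f}")
    print(f"INT8 ridge point (op/B):  {ridge_point(INT8_TOPS, MEM_BW_GB_S):.1f}")
    print(f"FP16 ridge point (op/B):  {ridge_point(FP16_TFLOPS, MEM_BW_GB_S):.1f}")
    print(f"Implied prior BF16 rate:  {implied_previous(FP16_TFLOPS, GEMM_UPLIFT):.1f} TFLOPS")
```

A high ridge point relative to LPDDR5 bandwidth suggests why embedding-heavy recommendation workloads, which move a lot of data per operation, depend on prefetching and on-chip storage rather than on peak GEMM throughput alone.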
The Processing Elements are built on RISC-V cores, featuring both scalar and vector extensions, and Meta’s accelerator module includes dual CPUs. At Hot Chips 2024, ServeTheHome noticed a Memory Expansion linked to the PCIe switch and the CPUs. When asked if this was CXL, Meta rather coyly said, “it is an option to add memory in the chassis, but it is not being deployed currently.”