Yet another tech startup wants to topple Nvidia with ‘orders of magnitude’ better energy efficiency; Sagence AI bets on analog in-memory compute to deliver 666K tokens/s on Llama2-70B
- Sagence brings analog in-memory compute to redefine AI inference
- Ten times lower power and 20 times lower costs
- Also offers integration with PyTorch and TensorFlow
Sagence AI has introduced an advanced analog in-memory compute architecture designed to address issues of power, cost, and scalability in AI inference.
According to the company, the analog approach improves energy efficiency and lowers cost while delivering performance comparable to existing high-end GPU and CPU systems.
This bold step positions Sagence AI as a potential disruptor in a market dominated by Nvidia.
Efficiency and performance
Sagence says the gains show up on large language models such as Llama2-70B. With performance normalized to 666,000 tokens per second, the company claims its technology uses 10 times less power, costs 20 times less, and occupies 20 times less rack space than leading GPU-based solutions.
This design prioritizes the demands of inference over training, reflecting the shift in AI compute focus within data centers. With its efficiency and affordability, Sagence offers a solution to the growing challenge of ensuring return on investment (ROI) as AI applications expand to large-scale deployment.
At the heart of Sagence’s innovation is its analog in-memory computing technology, which merges storage and computation within memory cells. By eliminating the need for separate storage and scheduled multiply-accumulate circuits, this approach simplifies chip designs, reduces costs, and improves power efficiency.
Sagence also employs deep subthreshold computing in multi-level memory cells, which it describes as an industry first, to achieve the efficiency gains required for scalable AI inference.
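To make the idea of computing inside memory more concrete, here is a minimal numerical sketch in Python. It is not Sagence's circuit or software: the function names, the 9-level cell, and the sample weights are all assumptions chosen for illustration. It only models the general principle that each cell both stores a weight and scales its input, while each output line sums the contributions of its cells.

```python
# Illustrative model only; names and cell parameters are hypothetical,
# not taken from Sagence's design.

def quantize_to_levels(weight, levels, w_max):
    """Snap a weight in [-w_max, w_max] to one of `levels` evenly spaced cell states."""
    step = 2 * w_max / (levels - 1)
    clipped = max(-w_max, min(w_max, weight))
    return round((clipped + w_max) / step) * step - w_max


def program_array(weights, levels=16, w_max=1.0):
    """Store a weight matrix in hypothetical multi-level cells, one weight per cell."""
    return [[quantize_to_levels(w, levels, w_max) for w in row] for row in weights]


def in_memory_matvec(cells, inputs):
    """Each cell scales its input (multiply); each output line sums its cells (accumulate)."""
    return [sum(c * x for c, x in zip(row, inputs)) for row in cells]


weights = [[0.50, -0.25, 1.00],
           [-1.00, 0.75, 0.00]]
inputs = [1.0, 2.0, -1.0]

# With 9 levels the step between cell states is 0.25, so these weights store exactly.
cells = program_array(weights, levels=9)
outputs = in_memory_matvec(cells, inputs)
print(outputs)  # [-1.0, 0.5]
```

In this toy model the number of levels per cell sets the precision of each stored weight: a value such as 0.3 would be stored as 0.25 in a 9-level cell, which is the kind of trade-off any multi-level storage scheme has to manage.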
Traditional CPU and GPU-based systems rely on complex dynamic scheduling, which adds hardware complexity, inefficiency, and power consumption. Sagence’s statically scheduled architecture simplifies these processes, mirroring biological neural networks.
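The difference between dynamic and static scheduling can be sketched in a few lines of Python. This is a generic illustration, not Sagence's compiler: the four-step graph, the step names, and the toy operations are hypothetical. The point is that the execution order is resolved once, ahead of time, and then simply replayed for every input.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical four-step network; each step maps to the steps it depends on.
GRAPH = {
    "embed": set(),
    "attention": {"embed"},
    "mlp": {"attention"},
    "logits": {"mlp"},
}

# Toy stand-ins for the real work done at each step.
OPS = {
    "embed": lambda x: x + 1,
    "attention": lambda x: x * 2,
    "mlp": lambda x: x - 3,
    "logits": lambda x: x * 10,
}


def compile_schedule(graph):
    """Resolve the execution order once, before any input arrives."""
    return tuple(TopologicalSorter(graph).static_order())


def run(schedule, ops, x):
    """Replay the fixed order; no dependency checks happen at run time."""
    for name in schedule:
        x = ops[name](x)
    return x


SCHEDULE = compile_schedule(GRAPH)
print(SCHEDULE)               # ('embed', 'attention', 'mlp', 'logits')
print(run(SCHEDULE, OPS, 1))  # 10
```

A dynamically scheduled system would instead decide at run time which step is ready to execute next, and that bookkeeping is the overhead a fixed schedule avoids.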
The system is also designed to integrate with existing AI development frameworks such as PyTorch, ONNX, and TensorFlow. Once trained neural networks are imported, Sagence’s architecture removes the need for further GPU-based processing, simplifying deployment and reducing costs.
“A fundamental advancement in AI inference hardware is vital to the future of AI. Use of large language models (LLMs) and Generative AI drives demand for rapid and massive change at the nucleus of computing, requiring an unprecedented combination of highest performance at lowest power and economics that match costs to the value created,” said Vishal Sarin, CEO & Founder, Sagence AI.
“The legacy computing devices today that are capable of extreme high-performance AI inferencing cost too much to be economically viable and consume too much energy to be environmentally sustainable. Our mission is to break those performance and economic limitations in an environmentally responsible way,” Sarin added.
Via IEEE Spectrum