BitTorrent for LLM? Exo software is a distributed LLM solution that can run even on old smartphones and computers

February 26, 2025

Exo supports LLaMA, Mistral, LLaVA, Qwen, and DeepSeek
Can run on Linux, macOS, Android, and iOS, but not Windows
AI models needing 16GB RAM can run on two 8GB laptops

Running large language models (LLMs) typically requires expensive, high-performance hardware with substantial memory and GPU power. However, Exo software now looks to offer an alternative by enabling distributed artificial intelligence (AI) inference across a network of devices.

The software lets users combine the computing power of multiple computers, smartphones, and even single-board computers (SBCs) such as the Raspberry Pi to run models that would otherwise be out of reach.

This decentralized approach shares similarities with the SETI@home project, which distributed computing tasks across volunteer machines. By leveraging a peer-to-peer (P2P) network, Exo eliminates the need for a single, powerful system, making AI inference more accessible to individuals and organizations.

How Exo distributes AI workloads

Exo aims to challenge the dominance of large technology companies in AI development. By decentralizing inference, it seeks to give individuals and smaller organizations more control over AI models, similar to initiatives focused on expanding access to GPU resources.

“The fundamental constraint with AI is compute,” argues Alex Cheema, co-founder of EXO Labs. “If you don’t have the compute, you can’t compete. But if you create this distributed network, maybe we can.”

The software dynamically partitions LLMs across available devices in a network, assigning model layers based on each machine’s available memory and processing power. Supported LLMs include LLaMA, Mistral, LLaVA, Qwen, and DeepSeek.
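To make the idea of layer partitioning concrete, here is a minimal sketch that splits a model's layers into contiguous ranges sized in proportion to each device's memory. It is an illustration only, not Exo's actual code: the `Device` class, the `partition_layers` function, and the device names are hypothetical, and the sketch weighs memory alone and ignores processing power.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str         # hypothetical device label
    memory_gb: float  # memory the device can offer to the model


def partition_layers(devices, num_layers):
    """Split layer indices 0..num_layers-1 into contiguous ranges,
    one per device, sized in proportion to each device's memory.

    Assumption: memory is the only weight; processing power is ignored.
    """
    total_memory = sum(d.memory_gb for d in devices)
    assignments = {}
    start = 0
    cumulative = 0.0
    for i, d in enumerate(devices):
        cumulative += d.memory_gb
        if i == len(devices) - 1:
            # The last device takes whatever remains, so every layer is assigned.
            end = num_layers
        else:
            # Boundary = this device's cumulative share of total memory.
            end = round(num_layers * cumulative / total_memory)
        assignments[d.name] = range(start, end)
        start = end
    return assignments


# Hypothetical cluster: one 16GB machine and two 8GB machines, 32-layer model.
cluster = [Device("macbook", 16), Device("laptop", 8), Device("raspberry-pi", 8)]
layout = partition_layers(cluster, num_layers=32)
```

In this hypothetical cluster, the 16GB machine holds half of the total memory and so receives half of the layers, while the two 8GB machines receive a quarter each.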

Users can install Exo on Linux, macOS, Android, or iOS, though Windows support is not currently available. Python 3.12.0 or later is required, along with additional dependencies on Linux systems with NVIDIA GPUs.

One of Exo’s key strengths is that, unlike traditional setups that rely on high-end GPUs, it enables collaboration between different hardware configurations.

For example, an AI model requiring 16GB of RAM can run on two 8GB laptops working together. A more demanding model like DeepSeek R1, requiring approximately 1.3TB of RAM, could theoretically operate on a cluster of 170 Raspberry Pi 5 devices with 8GB RAM each.
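The arithmetic behind these figures can be checked with a short sketch. The `min_devices` function is hypothetical and makes two simplifying assumptions: 1.3TB is treated as 1,300GB, and every byte of a device's RAM is available to the model. Under those assumptions the bare minimum for DeepSeek R1 comes to 163 devices, so the article's figure of 170 leaves some headroom.

```python
import math


def min_devices(model_memory_gb, device_memory_gb):
    """Smallest number of identical devices whose combined RAM
    covers the model's memory requirement.

    Assumption: all of each device's RAM is usable by the model.
    """
    return math.ceil(model_memory_gb / device_memory_gb)


# A 16GB model on 8GB laptops.
laptops_needed = min_devices(16, 8)

# DeepSeek R1 at roughly 1.3TB (taken as 1,300GB) on 8GB Raspberry Pi 5 boards.
pis_needed = min_devices(1300, 8)
```

In practice the operating system and other processes take a share of each device's memory, which is one reason a real cluster would need more devices than this lower bound.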

Network speed and latency are critical concerns. Exo’s developers acknowledge that adding lower-performance devices may increase inference latency, but insist that overall throughput improves with each device added to the network.

Security risks also arise when multiple machines share workloads, requiring safeguards to prevent data leaks and unauthorized access.

Adoption is another hurdle, as developers of AI tools currently rely on large-scale data centers. The low cost of Exo’s approach may appeal, but it simply won’t match the speed of those high-end AI clusters.

Via CNX Software
