RecursiveMAS Speeds Multi-Agent AI by Up to 2.4x

Why this is here: RecursiveMAS reduces token usage by 75.6% by the third round of recursion, significantly lowering compute costs for complex AI tasks.

Researchers at the University of Illinois Urbana-Champaign and Stanford University developed RecursiveMAS, a new framework for multi-agent AI systems. Current systems are often slow and costly because agents communicate by generating text. RecursiveMAS instead lets agents pass information to one another directly in embedding space rather than generating text for each exchange.
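To make the idea of an embedding-space hand-off concrete, here is a minimal toy sketch. It is not the RecursiveMAS implementation: the FrozenAgent and Adapter classes, the 4-dimensional vectors, and the random weights are all hypothetical stand-ins, and the small bridging layer between agents is an assumption made for illustration only.

```python
import math
import random

HIDDEN_DIM = 4  # toy size; real models use thousands of dimensions


def make_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random weights in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class FrozenAgent:
    """Hypothetical stand-in for a frozen model: a fixed nonlinear map."""

    def __init__(self, seed):
        self.weights = make_matrix(HIDDEN_DIM, HIDDEN_DIM, random.Random(seed))

    def forward(self, vec):
        return [math.tanh(v) for v in matvec(self.weights, vec)]


class Adapter:
    """Hypothetical small bridge between two agents' embedding spaces.

    In this sketch it is the only part that would be trained.
    """

    def __init__(self, seed):
        self.weights = make_matrix(HIDDEN_DIM, HIDDEN_DIM, random.Random(seed))

    def forward(self, vec):
        return matvec(self.weights, vec)


def latent_handoff(sender, adapter, receiver, query_vec):
    """Pass the sender's hidden state to the receiver without decoding text."""
    hidden = sender.forward(query_vec)    # sender "thinks" in vector form
    bridged = adapter.forward(hidden)     # map into the receiver's space
    return receiver.forward(bridged)      # receiver consumes the vector directly


planner = FrozenAgent(seed=1)
solver = FrozenAgent(seed=2)
bridge = Adapter(seed=3)
result = latent_handoff(planner, bridge, solver, [0.5, -0.2, 0.1, 0.9])
print(len(result))
```

The point of the sketch is the shape of the pipeline: no token is generated between the two agents, so the hand-off costs one small vector transformation instead of a full text-decoding loop.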

Experiments across code generation, medical reasoning, and search show that RecursiveMAS improves accuracy. Compared with text-based systems, it runs inference 1.2 to 2.4 times faster and uses roughly 76% fewer tokens. The framework also lowers training costs: it updates only about 0.31% of the system’s parameters and keeps the underlying models frozen.
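A quick back-of-the-envelope calculation shows what these percentages mean in practice. The reported figures (75.6% token reduction, 0.31% trainable parameters, 1.2x to 2.4x speedup) come from the article; the baseline values of 10,000 tokens, 8 billion parameters, and 60 seconds are hypothetical inputs chosen only for illustration.

```python
TOKEN_REDUCTION = 0.756       # reported reduction by the third recursion round
TRAINABLE_FRACTION = 0.0031   # reported share of parameters that are updated
SPEEDUP_RANGE = (1.2, 2.4)    # reported inference speedup range


def tokens_after_reduction(baseline_tokens, reduction=TOKEN_REDUCTION):
    """Tokens remaining once the reported reduction is applied."""
    return round(baseline_tokens * (1.0 - reduction))


def trainable_params(total_params, fraction=TRAINABLE_FRACTION):
    """Number of parameters updated during training."""
    return round(total_params * fraction)


def latency_range(baseline_seconds, speedups=SPEEDUP_RANGE):
    """Best-case and worst-case latency for a given baseline latency."""
    low, high = speedups
    return baseline_seconds / high, baseline_seconds / low


# Hypothetical baselines, not figures from the research.
print(tokens_after_reduction(10_000))     # 2440
print(trainable_params(8_000_000_000))    # 24800000
print(latency_range(60.0))
```

Under these assumed baselines, a 10,000-token exchange shrinks to about 2,440 tokens, roughly 24.8 million of 8 billion parameters are trained, and a 60-second job finishes in about 25 to 50 seconds.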

The team evaluated RecursiveMAS on open-weights models such as Qwen and Llama-3. While the system shows promise in streamlining multi-agent workflows, the researchers acknowledge that further work is needed to find the best configurations for different applications. The code and model weights are available under the Apache 2.0 license, allowing for continued development.