Microsoft built a fake online marketplace to see how its AI agents would work selling unsupervised – and let’s just say the results were… unsurprising
- Microsoft’s Magentic Marketplace exposes AI agents’ inability to act independently
- Customer-side agents were easily influenced by business agents during simulated transactions
- AI agents slow down significantly when presented with too many choices
A new Microsoft study has raised questions about whether AI agents are currently suited to operating without full human supervision.
The company recently built a synthetic environment, the “Magentic Marketplace”, designed to observe how AI agents perform in unsupervised situations.
The project took the form of a fully simulated ecommerce platform which allowed researchers to study how AI agents behave as customers and businesses – with perhaps predictable results.
Testing the limits of current AI models
The project included 100 customer-side agents interacting with 300 business-side agents, giving the team a controlled setting to test agent decision-making and negotiation skills.
The source code for the marketplace is open source; therefore, other researchers can adopt it to reproduce experiments or explore new variations.
Ece Kamar, CVP and managing director of Microsoft Research’s AI Frontiers Lab, noted this research is vital for understanding how AI agents collaborate and make decisions.
The initial tests used a mix of leading models, including GPT-4o, GPT-5, and Gemini-2.5-Flash.
The results were not entirely unexpected, as several models showed weaknesses.
Customer agents could easily be influenced by business-side agents into selecting products, revealing potential vulnerabilities when agents interact in competitive environments.
The agents’ efficiency dropped sharply when faced with too many options, overwhelming their attention span and leading to slower or less accurate decisions.
AI agents also struggled when asked to work toward shared goals, as the models were often unsure which agent should take on which role, which reduced their effectiveness in joint tasks.
Their performance improved only when step-by-step instructions were provided.
“We can instruct the models – like we can tell them, step by step. But if we are inherently testing their collaboration capabilities, I would expect these models to have these capabilities by default,” Kamar noted.
The results show AI tools still need substantial human guidance to function effectively in multi-agent environments.
AI agents are often promoted as capable of independent decision-making and collaboration, but the results show unsupervised agent behavior remains unreliable, so humans must improve coordination mechanisms and add safeguards against AI manipulation.
Microsoft’s simulation shows that AI agents remain far from operating independently in competitive or collaborative scenarios and may never achieve full autonomy.