Researchers trained an OpenAI rival in half an hour for less than $50


Researchers at Stanford and the University of Washington used a method known as distillation, which lets a smaller model learn from the answers produced by a larger one, to refine their model, called s1, on answers from Google's AI reasoning model, Gemini 2.0 Flash Thinking Experimental. Google's terms of service note that you can't use Gemini's API to "develop models that compete with" the company's AI models. The Verge reached out to Google for comment but didn't immediately hear back.
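In practice, this kind of distillation amounts to supervised fine-tuning on a teacher model's outputs. The sketch below is not the researchers' pipeline; `teacher_answer` is a hypothetical stand-in for a real API call to a hosted reasoning model, and the dataset-building step is a minimal illustration of the idea:

```python
# Distillation sketch: collect a "teacher" model's full answers (including
# its reasoning trace) and pair them with the original questions, producing
# records a smaller "student" model can be fine-tuned on.
# `teacher_answer` is a hypothetical stub, NOT a real API.

def teacher_answer(question: str) -> str:
    """Stand-in for querying the teacher model's API."""
    canned = {
        "What is 7 * 8?": "Reasoning: 7 * 8 = 56. Answer: 56",
    }
    return canned.get(question, "Answer: unknown")

def build_distillation_set(questions: list[str]) -> list[dict]:
    """Pair each question with the teacher's response, yielding
    (prompt, target) records for standard supervised fine-tuning."""
    return [{"prompt": q, "target": teacher_answer(q)} for q in questions]

dataset = build_distillation_set(["What is 7 * 8?"])
```

The resulting records would then be fed to an ordinary fine-tuning loop; the student never sees the teacher's weights, only its answers.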
The researchers based s1 on Qwen2.5, an open-source model from Alibaba Cloud. They initially started with a pool of 59,000 questions to train the model on, but found that the larger data set didn’t offer “substantial gains” over a whittled-down set of just 1,000. The researchers say they trained the model on just 16 Nvidia H100 GPUs.
The s1 model also uses a technique called test-time scaling, which lets the model "think" for longer before producing an answer. As noted in the paper, the researchers forced the model to continue reasoning by appending "Wait" to the model's response. "This can lead the model to double-check its answer, often fixing incorrect reasoning steps," the paper says.
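The "Wait" trick can be sketched as a decoding-time intervention: when the model tries to end its reasoning, the stop marker is suppressed and "Wait" is appended so generation continues. The loop below is only an illustration; `toy_generate` is a hypothetical stand-in for a real model's sampler, and the marker name is an assumption:

```python
# Test-time scaling sketch: force extra "thinking" by intercepting the
# end-of-thinking marker and appending "Wait" so decoding continues.
# `toy_generate` is a stand-in for a real model's next-chunk sampler.

END_OF_THINKING = "</think>"  # assumed marker, not from the paper

def toy_generate(trace: str) -> str:
    """Stand-in for one decoding step; a real model would sample tokens."""
    if "Wait" in trace:
        return "On reflection, 6 * 7 = 42. " + END_OF_THINKING
    return "6 * 7 is maybe 41? " + END_OF_THINKING

def generate_with_budget_forcing(prompt: str, extra_rounds: int = 1) -> str:
    trace = prompt
    forced = 0
    while True:
        chunk = toy_generate(trace)
        if END_OF_THINKING in chunk and forced < extra_rounds:
            # Suppress the stop marker and append "Wait" to keep reasoning.
            trace += chunk.replace(END_OF_THINKING, "") + "Wait. "
            forced += 1
        else:
            trace += chunk
            return trace

out = generate_with_budget_forcing("Compute 6 * 7. ")
```

In this toy run the forced second pass is where the model revises its first guess, mirroring the paper's observation that the extra step often fixes incorrect reasoning.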
OpenAI’s o1 reasoning model uses a similar approach, one the buzzy AI startup DeepSeek sought to replicate with the launch of its R1 model, which it claims was trained at a fraction of the cost. OpenAI has since accused DeepSeek of distilling information from its models to build a competitor, violating its terms of service. As for s1, the researchers claim it “exceeds o1-preview on competition math questions by up to 27%.”
The rise of smaller and cheaper AI models threatens to upend the entire industry. They could prove that major companies like OpenAI, Microsoft, Meta, and Google don’t need to spend billions of dollars training AI or building massive data centers filled with thousands of Nvidia GPUs.