Mom’s website ready to put OpenAI in a time-out after learning the AI firm may have scraped its data


British parenting hub Mumsnet has filed a lawsuit against OpenAI, claiming it violated copyright law by using its data to train its AI models, including those powering ChatGPT. It’s the first such legal action taken against OpenAI in the United Kingdom, but one of a growing number of similar cases spread internationally accusing OpenAI of illicitly scraping information for its models without permission. Mumsnet claims its forums host more than six billion words and that OpenAI employed those words to teach its AI models about parenting and related topics.
“Such scraping without permission is an explicit breach of our terms of use, which clearly state that no part of the site may be distributed, scraped or copied for any purpose without our express approval,” Mumsnet co-founder Justine Roberts explained in a post on the website. “The LLMs are building models like ChatGPT to provide the answers to any and all prospective questions that will mean we’ll no longer need to go elsewhere for solutions. And they’re building those models with scraped content from the websites they are poised to replace.”
The legal complaint points to the timing of the data collection as another point of contention since it mainly happened before websites were paying close attention to whether AI companies were scraping their data. Mumsnet alleges that third-party research institutions initially performed the majority of this data scraping.
Roberts wrote that Mumsnet reached out to OpenAI about licensing its content, pointing out that the platform has a concentrated collection of writing by women that is unlike the majority of internet content. But OpenAI turned them down, citing interest in “datasets that are not easily accessible online,” according to Roberts.
Scrape Scraps
Mumsnet is hardly alone in voicing complaints about OpenAI’s data scraping and is now part of an expanding cohort of companies taking OpenAI to court on the matter. For instance, the Authors Guild has sued OpenAI, alleging that copyrighted books were used to train its AI models, as has a group of academics claiming their articles were similarly lifted by OpenAI. Reuters and The New York Times have both sued OpenAI, not only over data scraping but also over claims that ChatGPT generates responses with content far too close to their copyrighted articles. Even Creative Commons has filed suit against the AI developer, claiming that the company used Creative Commons-licensed content to train its AI models in ways that violated the terms of the licenses.
OpenAI has defended its practices as falling under the fair use doctrine. In the UK, the company responded to a House of Lords inquiry by acknowledging that training its AI models requires copyrighted materials and that it should do more to support content creators, but it still maintains that what it does is legal. While this is OpenAI’s first UK case on the matter, Getty Images has a similar case going in the country’s courts against Stability AI for its image-generating AI.
The outcome of Mumsnet’s lawsuit and other cases may set precedents for how AI companies handle copyrighted content and might influence future regulations and licensing practices. The effort to balance AI innovation and intellectual property rights is far from settled and probably won’t be for a long while.
To be fair, Mumsnet isn’t against LLMs and AI as a concept. In fact, Mumsnet employed OpenAI’s models to build an AI chatbot called MumsGPT last year. MumsGPT was only available to executives at Mumsnet when it was announced and hasn’t been mentioned since, so it may not be around anymore, but the idea was to offer it as a research tool and even as something policymakers could use in developing parenting-related regulations. Roberts didn’t mention MumsGPT but made a point of saying that there are positive potential uses for AI in her explanation of the lawsuit.
“But if the LLMs are allowed to simply steal content from publishers and communities like Mumsnet they risk destroying them,” Roberts wrote. “We know that taking on a multinational giant like OpenAI, with its $3bn of revenues, is not an easy task in the face of the huge resources they’ll throw at us but this is too important an issue to simply roll over. Not just for Mumsnet but for every website you’ve ever landed on for news, advice or simply to ask if you’re being unreasonable.”