OpenAI admits it screwed up testing its ‘sycophant-y’ ChatGPT update


Last week, OpenAI pulled a GPT-4o update that made ChatGPT “overly flattering or agreeable” — and now it has explained what exactly went wrong. In a blog post published on Friday, OpenAI said its efforts to “better incorporate user feedback, memory, and fresher data” could have partly led to “tipping the scales on sycophancy.”
In recent weeks, users have noticed that ChatGPT seemed to constantly agree with them, even in potentially harmful situations. The effect of this can be seen in a report by Rolling Stone about people who say their loved ones believe they have “awakened” ChatGPT bots that support their religious delusions of grandeur, behavior that reportedly predates the now-removed update. OpenAI CEO Sam Altman later acknowledged that its latest GPT-4o updates have made it “too sycophant-y and annoying.”
In these updates, OpenAI had begun using data from the thumbs-up and thumbs-down buttons in ChatGPT as an “additional reward signal.” However, OpenAI said, this may have “weakened the influence of our primary reward signal, which had been holding sycophancy in check.” The company notes that user feedback “can sometimes favor more agreeable responses,” likely exacerbating the chatbot’s overly agreeable statements. The company said memory can amplify sycophancy as well.
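To make the dynamic OpenAI describes more concrete, here is a minimal, hypothetical sketch in Python. It is not OpenAI’s actual training code, and the scores, weights, and linear blend are illustrative assumptions only; it simply shows how giving a secondary, engagement-style signal more weight can flip which response a system prefers.

```python
# Hypothetical illustration (not OpenAI's code) of how adding a second
# reward signal can tip preferences toward agreeable answers.

def blended_reward(primary: float, thumbs: float, thumbs_weight: float) -> float:
    """Combine a primary reward score with a user-feedback signal.

    primary: score from the main reward signal (assumed to discourage sycophancy)
    thumbs: proxy for thumbs-up likelihood (agreeable replies tend to score high)
    thumbs_weight: how heavily the feedback signal counts against the primary one
    """
    return primary + thumbs_weight * thumbs

# Two made-up candidate replies to a user's questionable claim.
candidates = {
    "pushes back politely": {"primary": 0.9, "thumbs": 0.4},
    "agrees enthusiastically": {"primary": 0.5, "thumbs": 0.9},
}

for weight in (0.0, 0.5, 1.0):
    best = max(
        candidates,
        key=lambda name: blended_reward(
            candidates[name]["primary"], candidates[name]["thumbs"], weight
        ),
    )
    print(f"thumbs_weight={weight}: preferred reply -> {best}")
```

With a weight of 0.0 or 0.5 the accurate-but-firm reply wins, but at 1.0 the enthusiastic agreement comes out ahead, which is the kind of “tipping the scales” effect the blog post gestures at.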
OpenAI says one of the “key issues” with the launch stems from its testing process. Though the model’s offline evaluations and A/B testing had positive results, some expert testers suggested that the update made the chatbot seem “slightly off.” OpenAI moved forward with the update anyway.
“Looking back, the qualitative assessments were hinting at something important, and we should’ve paid closer attention,” the company writes. “They were picking up on a blind spot in our other evals and metrics. Our offline evals weren’t broad or deep enough to catch sycophantic behavior… and our A/B tests didn’t have the right signals to show how the model was performing on that front with enough detail.”
Going forward, OpenAI says it’s going to “formally consider behavioral issues” as having the potential to block launches, as well as create a new opt-in alpha phase that will allow users to give OpenAI direct feedback before a wider rollout. OpenAI also plans to ensure users are aware of the changes it’s making to ChatGPT, even if the update is a small one.