I had a big audio transcription problem – Gemini solved it, and ChatGPT didn’t
You know how they say, “It’s not a competition!” Well, don’t let them lie to you; everything is a competition, especially when it comes to AI. There’s rarely a day when I am not testing AI capabilities among multiple chatbots, and I am almost always surprised at the results. Some platforms really are better than others – at least for some tasks.
This journey started with Notes on my iPhone 17 Pro Max. Usually, I like to record interviews on an Android smartphone like the Google Pixel 10 Pro Fold, where the fantastic Recorder app expertly captures every utterance and, in the transcription, does a deft job of separating and labeling each speaker.
However, I arrived for this interview with just my iPhone. I know that buried inside Notes, an app I use obsessively across my iPhone and desktop (I have almost 2,500 notes), are audio recording capabilities hidden under the attachment icon (a paperclip).
Notes does a good job of recording audio, and I found my 20-minute recording perfectly captured in a note. Included was what appeared to be a useful transcription. A quick scan confirmed its accuracy, but there was a big problem: it didn’t label the speakers; everything blended into one long soliloquy. This would make it difficult to scan and pick apart my subject’s quotes from my own queries and observations.
I resigned myself to a relisten, during which I added my own labels…until I had a different thought: What if Gemini could help?
Gemini 3 Pro puts on its gloves
In recent months, I’ve been impressed with Google Gemini’s capabilities, especially those of the latest 3 Pro model, which seems to handle almost any prompt with aplomb.
Now that I had the idea, I had to figure out how to get Gemini to listen to the recording. Playing back the audio on my iPhone speakers and asking Gemini to listen was out because I worried about how well, say, my desktop mics might pick up the sound coming out of the iPhone speakers. Plus, I was in the office and didn’t want people to overhear the private conversation (until I published a story).
First, I found that I could export the audio file from Notes. In playback, under the three dots, there’s a Share button that let me AirDrop the audio file to my 14-inch MacBook Pro. It comes down as an MPEG-4 audio (M4A) file.
Back in Gemini 3 Pro, I selected the “+” sign in the prompt field, chose the M4A audio file, and added this brief prompt: “Listen to this, transcribe it and be sure to identify the different speakers.”

There was no back and forth. Gemini 3 Pro quickly started spitting out the full transcript with speakers identified as “Interviewer” and the name and title of my subject. It’s worth noting here that the name is the one thing Gemini 3 Pro inexplicably got completely wrong. Even though my subject spelled out his name at the end of the chat, Gemini chose a different one. Other than that, though, Gemini perfectly identified when it was me or my subject speaking. And the accuracy was truly impressive.
For the sake of completeness, I asked Gemini 3 Pro to correct the identification of my subject and list me as the “interviewer”. With that fixed, I happily used the transcript to help drive my full story.
In this corner, ChatGPT
Naturally, though, I was curious if ChatGPT 5.1 (with a Plus account) could accomplish the same task.
In the ChatGPT prompt window, I selected the audio file and entered the exact same prompt. ChatGPT told me, “I can definitely transcribe audio, but I can’t access or play the .m4a file directly from the location you referenced.”
What followed was an extensive back-and-forth in which ChatGPT kept suggesting different ways for me to upload the file, including transforming it into a zip file. No matter what I did, ChatGPT would show the audio file in the prompt window, but it couldn’t listen to it.
In this little competition, it seems, Gemini 3 Pro is the victor, turning a frustrating problem into an easy win. The less said about how useless Apple’s Notes transcription is, the better.
