Can ChatGPT Transcribe Audio?

Are you wondering if ChatGPT can transcribe audio? ChatGPT, OpenAI’s flagship chatbot built on large language models (LLMs), offers various functionalities, including writing, research, coding, visualization, and more. Businesses worldwide are scrambling to integrate it into their workflows through application programming interfaces (APIs) to see if it will work. Some are even trying to ascertain if ChatGPT can transcribe audio, and if it’s good enough to replace professional business transcription companies.

While ChatGPT has many useful functions, some business processes are better left to the pros. Let’s talk about it.

In this article, you’ll learn:

  • AI accuracy tops out at 86%, and AI struggles with accents and technical terms.
  • Although AI offers lightning-fast transcription, it may make frustrating mistakes with accented speech, so you’ll need to proofread extensively.
  • Human transcripts remain the gold standard for accuracy, especially in medical, legal, technical, financial, or other areas where accuracy is non-negotiable.

What Is Automated Transcription?

Automated transcription technology turns spoken words or voice recordings into written text using AI (machine learning and speech recognition software). It analyzes live or recorded speech to generate a written version.

However, even under the best conditions, the most advanced AI transcription systems achieve only up to 86% accuracy. From experience, accents, background noise, technical terms, multiple speakers, punctuation, and even sarcasm are some of the biggest weaknesses of AI transcription tools.
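This article does not say how the 86% figure was measured. Transcription accuracy is commonly reported as one minus the word error rate (WER), so the sketch below assumes that definition. The function name and the sample sentences are illustrative only, not taken from any benchmark.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical human-verified transcript vs. hypothetical AI output.
human = "the patient had a myocardial infarction two days ago"
machine = "the patient had a mytocardial infarction to days ago"

wer = word_error_rate(human, machine)
print(f"WER: {wer:.1%}, accuracy: {1 - wer:.1%}")
```

Two wrong words in a nine-word sentence already pull accuracy below 80%, which is why short misrecognitions matter so much in medical or legal recordings.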

While we’re at it, here are some of the most popular AI audio-to-text tools and platforms.

Platform | Key Features | Best For
Otter.ai | Real-time transcription, speaker ID, mobile app | Remote meetings & team collaboration
Rev.ai | API access, custom vocabulary, timestamps | Developers & integration projects
Google Speech-to-Text | Multi-language, noise reduction, API | Multi-language content & global teams
Amazon Transcribe | Custom vocabulary, batch processing, AWS integration | AWS users & enterprise solutions
Sonix | 40+ languages, automated translation, editing tool | Content creators & international media
VIQ Solutions | Quick turnaround thanks to AI | Quick drafts & personal use
Trint | Collaboration tools, vocabulary builder, editing suite | Media teams & newsrooms
Happy Scribe | 119+ languages, subtitle generator, export options | Video content & social media
Verbit.ai | Industry-specific AI models, workflow automation | Large organizations & institutions

Can ChatGPT Transcribe Audio?

ChatGPT itself cannot transcribe audio or video files. However, Whisper, OpenAI’s speech recognition model, can, sort of. Whisper claims to convert audio conversations into text across 50+ languages, although it does not label speakers in multi-voice recordings on its own. The model itself is open source, but OpenAI’s hosted Whisper API is a paid service that you can call from your own apps alongside ChatGPT.
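For readers who want to try Whisper themselves, here is a minimal sketch. It assumes the open-source `openai-whisper` Python package and ffmpeg are installed locally; the file name, the `base` model size, and both helper function names are placeholders chosen for illustration. The output still needs human proofreading.

```python
def format_segments(segments):
    """Turn Whisper-style segments into '[MM:SS] text' lines."""
    lines = []
    for seg in segments:
        start = int(seg["start"])  # segment start time in whole seconds
        stamp = f"{start // 60:02d}:{start % 60:02d}"
        lines.append(f"[{stamp}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path, model_name="base"):
    """Transcribe an audio file with a locally installed Whisper model."""
    # Imported here so format_segments stays usable without the package.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(path)  # dict with "text", "segments", "language"
    return format_segments(result["segments"])


# Example call (hypothetical file name):
# print(transcribe_file("board_meeting.mp3"))
```

Each segment carries a start time, an end time, and the recognized text, so the helper above only adds timestamps for easier review.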

Whisper can also detect the spoken language and translate foreign audio directly into English text. However, you can’t expect the highest level of accuracy since the model is trained primarily on English audio. Even with English transcriptions, research indicates the accuracy rate of AI transcriptions is 86% and will likely stay at that level for some time.

Not only that, there’s been an uptick in concerns about Whisper, particularly its tendency to make up words that were never spoken.

Pros and Cons of Letting OpenAI’s Whisper Transcribe Audio

Of course, this discussion would be incomplete without weighing the pros and cons of audio transcription via ChatGPT or Whisper.

Pros of Whisper Audio Transcription

Let’s start with the good news (or the alleged benefits).

  • Fast processing: It can transcribe a 1-hour audio file in roughly 5-10 minutes, compared to 4-5 hours of manual typing. 
  • Multi-language support: Handles common languages like Spanish and Mandarin, plus less common ones like Finnish or Thai. However, expect less accuracy since AI models are trained mainly on English.
  • Cost-effectiveness: The average cost is around $0.006 per minute.
  • Always available: There are no scheduling or time zone issues. Upload audio at 3 AM and get results at 3:10 AM.
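To put the speed and cost figures above in concrete terms, the sketch below does the arithmetic. It assumes a flat $0.006 per audio minute with no minimum charge and a processing time of 5 to 10 minutes per hour of audio, both taken from the bullets above. The function names are illustrative, and the estimate leaves out the proofreading time the transcript will still need.

```python
PRICE_PER_MINUTE_USD = 0.006          # per-minute rate quoted above
PROCESSING_MIN_PER_HOUR = (5, 10)     # minutes of processing per hour of audio


def estimate_cost_usd(audio_minutes, price=PRICE_PER_MINUTE_USD):
    """Estimated charge in dollars, rounded to the nearest cent."""
    return round(audio_minutes * price, 2)


def estimate_processing_minutes(audio_minutes):
    """(fastest, slowest) processing time in minutes for the given audio."""
    low, high = PROCESSING_MIN_PER_HOUR
    return (audio_minutes * low / 60, audio_minutes * high / 60)


one_hour = 60
print(f"Cost for 1 hour of audio: ${estimate_cost_usd(one_hour):.2f}")
print(f"Processing time: {estimate_processing_minutes(one_hour)} minutes")
```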

Cons of Whisper Audio Transcription

Now, let’s jump to the not-so-pretty side of it.

  • Requires extensive quality and error checks: Believe me when I say it makes many mistakes, like writing “their” instead of “there” or “to” instead of “two.”
  • Accent problems: Whisper often misunderstands speakers with Scottish, Indian, Australian, or other accents. With these, you can expect two results: frustrating or hilarious transcripts.
  • Hallucinations: Whisper is known to hallucinate, adding words that were never spoken.
  • Technical terms: The AI might write “mytocardial” instead of “myocardial” in medical recordings. The model has broad general knowledge but limited depth in niche fields.
  • Background noise: It struggles with recordings that have music, traffic, or crowd noise.
  • Fact-checking: It can’t verify if spoken numbers or dates are correct. Sure, you can let it browse. However, ChatGPT pulls up the info from the SERP, and anyone who understands SEO knows that NOT everything from the SERP is accurate.
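One way to speed up proofreading for technical terms is to compare each transcript word against a glossary you maintain and flag near misses. The sketch below is only an illustration, not a feature of Whisper or ChatGPT: the function name, the sample glossary, and the 0.85 similarity cutoff are hypothetical choices, and it relies only on Python’s standard `difflib` module.

```python
import difflib
import re


def flag_suspect_terms(transcript, glossary, cutoff=0.85):
    """Return (word, suggested_term) pairs for near-miss glossary terms."""
    terms = sorted({term.lower() for term in glossary})
    suspects = []
    for word in re.findall(r"[A-Za-z]+", transcript):
        lowered = word.lower()
        if lowered in terms:
            continue  # spelled exactly like a known term, nothing to flag
        close = difflib.get_close_matches(lowered, terms, n=1, cutoff=cutoff)
        if close:
            suspects.append((word, close[0]))
    return suspects


# Hypothetical glossary and AI transcript line.
medical_terms = ["myocardial", "infarction"]
line = "Patient had a mytocardial infarction"
print(flag_suspect_terms(line, medical_terms))
```

A check like this only flags likely misspellings of terms you already know to look for; it cannot tell you whether a correctly spelled word was the one actually spoken.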

Other Effective Methods to Transcribe Audio

“So, what should I do if I can’t use ChatGPT? Manually transcribing hours of recorded audio is a grueling, monotonous grind, and I’m extremely busy with other things!”

I hear you. Time is precious, and we shouldn’t waste it. Below are some effective alternatives to ChatGPT transcribing audio; these often yield superior results. 

Hire a Freelance Transcriptionist

Websites like Freelancer, Upwork, and Fiverr have vast pools of individual contractors who can handle your transcription needs.

The upside of hiring freelancers is that they can do the work at a relatively reasonable rate and with almost the same turnaround time as a transcription company.

However, choosing this route requires trial and error. You may get a trustworthy transcriber on your first try, but we all know that hardly ever happens, and you always get what you pay for.

Professional Transcription Services

With Ditto, you won’t have to guess. Our affordable services come with our 99% accuracy guarantee, along with plenty of features and perks that you’ll only get from professional transcription companies. 

Since 2010, Ditto Transcripts has provided transcription solutions to businesses, legal firms, law enforcement agencies, academic institutions, and more. We’ve transcribed audio for authors, writers, journalists, city councils, museums, oil and gas companies, and families—you name it, we’ve done it. 

Our U.S.-based human transcription services are 100% CJIS and HIPAA compliant. We can also certify our transcripts if you need them. 

It’s no surprise that Ditto is one of the country’s leading transcription service providers—it meets all the crucial criteria regarding accuracy, security, and affordability.

Why Should You Let Ditto Transcripts Do The Work?

Ditto offers 100% human transcription—no AI, no automated tools, no soulless machines like ChatGPT listening to your recordings and spitting out inaccurate transcripts by the boatload. 

We’re a professional transcription company, so we won’t settle for giving our clients the bare minimum. Sign up for our services and enjoy the following perks:

  • 100% human transcription: Ditto’s human transcription—from initial checks to final edits—gives the highest possible accuracy guarantee. 
  • U.S.-based Transcribers: We only work with native English speakers to ensure quality, comprehension, and accuracy.
  • Certified Transcripts: Any transcripts involved in litigation can be certified for an extra layer of protection.
  • No long-term contracts: We operate on a pay-as-you-go basis; give us as much or as little work as you like without paying through the nose.
  • Fast turnaround times: To ensure your workflow runs smoothly, you’ll get your transcripts in as little as 24 hours.
  • Different pricing options: We offer rush jobs or economical rates for longer turnaround times to match different budgets. 
  • Free trial: We stand behind everything we say and do, yet you don’t just have to take our word for it. Take us out for a test drive and see the difference. 

Let’s Do Business

We understand the importance of accurate transcription, so despite the trends, we do things the human way. 

You don’t need to settle for inaccurate automatic solutions to transcribe your audio. Increase your productivity while keeping things cost-effective with Ditto’s transcription services. 

Need anything else? We also offer document-to-document conversion and translation from different languages. Call us to learn more. 

Ditto Transcripts is a Denver, Colorado-based FINRA, HIPAA, and CJIS-compliant transcription services company that provides fast, accurate, and affordable transcripts for individuals and companies of all sizes. Call (720) 287-3710 today for a free quote, and ask about our free five-day trial.
