Are you weighing your options between AI and human transcription?
AI technology keeps getting smarter. Every day, people use it to tell Alexa to turn the music on or to ask Siri to call a friend. It’s crazy to think about what AI will be capable of in several years as technology advances. There are now so many AI transcription options with decent accuracy rates that it’s hard to pinpoint which one is the best.
On the other hand, human transcription has been around since 3400 BCE. A line of products like foot pedals, headsets, and transcription software has helped human transcription reach such high accuracy rates that it’s a no-brainer solution for legal transcription needs and especially law enforcement transcription requirements.
This article will discuss whether AI transcription is reliable enough for your business and compare it with human transcription.
Tools That Have Recently Come Out With Transcription Features
In the 1970s and 1980s, speech recognition systems became so advanced that they were placed in children’s toys. Then, in the ’90s, speech recognition was propelled forward thanks to the personal computer. Software like Dragon Dictate was easy to use with faster processors. BellSouth introduced the voice portal (VAL), a dial-in interactive voice recognition system.
Not long after that, AI transcription tools started to develop. Google launched the Google Voice Search app, accessible to millions of people. And in 2011, Apple launched Siri, a product similar to Google Voice Search.
As these tools evolve and get smarter, there are always new features to be aware of.
For example, Microsoft Word’s web version now supports automated voice-to-text transcription. The application transcribes automatically as users speak.
Users can also upload recordings of up to 200MB directly into Word in MP3, WAV, MP4, or M4A format. Upload times vary with file size, and the transcript itself is instant. Further, the application captures audio straight from your PC. That means you can automatically transcribe YouTube videos, meetings, and calls.
The software currently offers English and will add more language options soon. As of this writing, the transcription feature is available to Microsoft subscribers, with a limit of five hours of transcription per month.
Another emerging feature is transcription in Google Translate. Speech in an unfamiliar language is translated in real time on your phone’s screen, making it easy for you to follow along.
The app currently supports English, French, German, Hindi, Portuguese, Russian, Spanish, and Thai.
Google states, “We’ll continue to make speech translations available in a variety of situations. Right now, the transcribe feature will work best in a quiet environment, with one person speaking at a time. In other situations, the app will still do its best to provide the gist of what’s being said.”
On a similar note, Trint released real-time transcription and translation to help newsrooms and content producers collaboratively verify, share, and edit coverage of breaking news and live events as they happen.
The app works by plugging a livestream feed into Trint’s online platform. The transcript then appears in the Trint Editor, which combines a text editor with an audio/video player. It syncs the live source audio with the words on the screen within seconds. Dozens of users in different locations can access the transcript, edit it, and publish it.
“Trint Translation removes one of the greatest barriers to communication today: language. Although the A.I.-powered translation isn’t perfect, it gives users a first draft that they can use to draw the insight they need to shape their stories and content. Translations can then be easily polished to perfection with the Trint Editor,” said Kofman, a journalist and war correspondent.
AI transcription apps will continue rolling out new features to help people in different situations. These features are handy and quick, and there’s something for everyone. The trouble is that they still can’t provide the level of accuracy most businesses need. Businesses will spend more time than necessary on editing alone.
With human transcription, you record the audio files and send them to your transcription provider. It takes an average of 3-5 days to get your transcripts back, and they will be error-free, giving you time to focus on more important things. You can even get rushed transcripts if you need them sooner, even the same day.
Breakdown of Industries That Use Transcription Services
Nearly all industries require transcription services for one reason or another. They are vital for the smooth functioning of your business.
What are the most common industries that use transcription, and what are their specific requirements?
Lawyers and everyone in the legal industry are busy with meetings, attending court hearings, and fighting on their clients’ behalf. The hustle and bustle leaves no time for transcription work. Law firms are seeing the significant benefits of outsourcing their transcription needs because it saves time and money, and they can count on presenting an accurate transcript to the court.
In the legal industry, accuracy is of the utmost importance. Errors in transcripts can cause confusion in court, wrongful sentences, and even dismissed trials!
In which situations does the legal industry need transcripts?
- Resources for studying past cases
- Witness and suspect interviews
- 911 and jail calls
- Using the transcripts as evidence
Let’s look at in-depth examples of how the legal industry uses transcripts:
Transcripts act as an aid.
Legal transcripts are mandatory in court if you want to win a case. They serve as references and cue cards, and they help you deliver your case correctly.
Text is easier to work with.
Listening to an audio recording over and over again to find critical points is unnecessary and time-consuming. Likewise, having the jury and everyone else involved listen to the audio file in court can be distracting and very time-consuming.
With a transcribed document, you can highlight, mark pages, and use the handy keyboard shortcut Ctrl + F to search for specific information. Everyone in the courtroom is given the same document and can refer to it during the trial.
Transcripts are used in defendant appeals.
Lawyers use transcripts to research and prepare new strategies for an appeal.
Documents are easier to organize.
Legal documents are easier to group, prioritize, search, and sort. They can be stored as printed documents or filed as digital files in folders on your computer, in the cloud, or on your smartphone.
Is learning more effective by watching videos or reading? In 2018, MIT’s Integrated Learning Initiative (MITili) conducted a study where researchers hypothesized that students would prefer watching videos over reading text. However, 30% preferred reading, 20% preferred video, and 50% preferred another learning type. The students who used text also scored better on their exams.
Three groups use academic transcription: students, faculty, and schools.
What are the uses of transcription for students?
Students often record lectures. When they transcribe the audio recording, it gives them clearly written notes. They can go through highlighting essential sections and use the document for homework and studying.
Major papers like a thesis also often involve extended interviews. There’s usually one person at the interview rushing to type everything. Using an academic transcription company allows them to focus on the interview without worrying about taking so many notes as it happens.
What are the uses of transcription for faculty?
When a professor is giving a lecture, they can count on students to take notes. However, many students can’t keep up, and key points get missed.
An electronic copy of the day’s lecture posted online is an effective way for students to review material. Teachers can also send notes to students who missed class, to other teachers, and save them to their portfolio for future job considerations.
English-as-a-second-language (ESL) students will also benefit greatly from having classes transcribed, since many can understand text better than audio.
What are the uses of transcription for schools?
Many universities offer online courses and programs to supplement their existing on-campus degree programs. They can grow their student bodies while minimizing the cost of additional classrooms and facilities. Transcripts are also required by the Americans with Disabilities Act for deaf and hard-of-hearing students. Transcribed lectures also make good material for creating new courses.
Clinics, hospitals, and private practices all over the US use medical transcription services to create various documents such as:
- History & Physical
- Clinic notes
- Chart notes
- Procedure notes
- Operative reports
- Autopsy reports
- Clinical reports
- Progress & follow-up
- Narrative summaries
- Board summaries
- Discharge summary
- Clinical trial & research
Healthcare providers must take diligent notes as part of the record-keeping process and to stay HIPAA compliant. Using a hand-held digital recorder is the most convenient way to record files. In fact, most physicians’ main concern is documentation and the time required to take notes.
Studies in 2010 and 2012 revealed resident physicians spend 40% to 49% of their time using a computer and 70% on documentation and order entry. The administrative work often extends into the evenings and contributes to the inaccuracy of health care records. With a reliable medical transcription provider, large backlogs of medical charting can be transcribed and sent back to medical facilities within hours, eliminating this extra work for health care professionals. That, in turn, will greatly help reduce physician burnout.
Here are some other ways transcription helps keep the medical industry moving:
Trustworthy and Accurate Medical Records
For doctors, physicians, and primary care workers, accurate medical documentation makes it possible to quickly and carefully assess a patient’s records and decide the best course of action and treatment. It also allows doctors to reference their previous treatment strategies and take follow-up measures to ensure the patient’s condition is consistently improving.
Communication Throughout the Entire Facility
When patients aren’t with their doctors, nurses and support staff oversee them. Everyone having the same information is vital for proper patient care. Administrative staff also need to enter patient data and give updates to the family.
Even in small clinics, it’s rare that one doctor solely works on an entire case. Sharing documents is critical, especially when the case requires opinions from doctors in different specialties. Medical transcription helps facilitate the flow of information from one doctor to the next, ensuring proper treatment for each patient.
The Records are HIPAA Compliant
HIPAA (Health Insurance Portability and Accountability Act) revolutionized medical record-keeping and compliance throughout the US by creating standards medical practices must follow when handling confidential information. HIPAA-compliant documents serve as a basis for legal arguments should any complications occur or a lawsuit be brought against the medical provider.
Consistency for Insurance
Insurance companies need accurate records for billing purposes. Without the submission of consistently accurate records, the medical facility may not get paid.
Both large and small businesses benefit from transcription. Business meetings, interviews, seminars, teleclasses, webinars, presentations, workshops, conference calls, and more all need to be recorded and transcribed. The documents are sent to staff members, shared with other corporations and investors, and sometimes used as proof that an important topic was discussed.
Business transcription often deals with financial data, new businesses, earnings, and competitive strategies. It’s about transcribing with maximum accuracy and security for confidential content. Most business transcription agencies will agree to sign an NDA (Non-Disclosure Agreement). The document prohibits the transcription service from sharing confidential information.
Because business transcription services can be a broad topic, let’s break it down into the segments that use it the most:
- Commercial and private bankers
- Corporate financial advisors
- Financial planners
- Insurance companies
- Investment bankers
- Money managers
- Call centers
Of course, outsourcing transcription saves time. Every staff member in a corporation is valuable, and if they’re spending large amounts of time on transcription work, it could hinder the company’s success.
Transcription also helps to:
- Cover many legal situations that can come about as a company grows, lets people go, or even goes public on a stock exchange.
- Access information for future referencing and recording purposes, which is crucial for the strategy development of many businesses.
- Save time, effort, and resources. Companies don’t need to hire, train, buy equipment, and pay salaries for an in-house transcription team.
- Improve SEO and online presence by converting audio and video into text.
Large and small businesses always have files that need transcribing. Outsourcing can help lighten the load with fast turnaround times and accurate results, completed by an unbiased third party.
General transcription services cover everything that’s not mentioned above. They can be used to convert an oral history, a handwritten memoir, an inspiring church sermon, or anything in between you need in text format.
Common clientele for general transcription include:
- Life coaches
- Video production firms
- Publicity coaches
- Freelance writers
- SEO experts
- Insurance industry
- Talk radio shows
A general transcriptionist usually does not specialize in a specific field. They are well-rounded and experienced in transcribing a variety of subjects or industries.
Not sure which kind of transcription your file falls under? When you give a human transcription provider the details of your project, they’ll know exactly which category to place you in, which helps achieve the high accuracy we’ve been referring to.
With AI transcription, there are no categories or specialized transcriptionists. Instead, the software is trained on examples or datasets until it gains more experience and builds a stronger algorithm.
Next, let’s look at the difference between AI and human transcription accuracy to see how they compare.
The Difference Between AI and Human Transcription Accuracy
Lately, our dependence on AI (artificial intelligence) has increased tremendously. Machines have developed over the past few years and are now used for transcription more than ever before.
The truth is, AI transcription is good. Or at least it’s good compared to what it was ten years ago. Google and similar companies claim a 95% transcription accuracy rate and offer lower prices, making AI an attractive choice for businesses of all kinds.
This 95% accuracy does, however, come with some exceptions. The most accurate results appear when the audio file is easy to decipher: one speaker, no background noise, and no thick accents. Yet, even when the audio is clear, AI transcription can still fail.
Here are some results of Siri’s transcription as an example:
Here, the speaker said, “Hey Siri, what time is Soriana (grocery store) open until?”
In another case, the speaker said, “what is AI transcription’s error rate?”
Lastly, the speaker said, “what can I buy with one million dollars?”
Researchers say that functional transcription is only a matter of time, though the amount of time remains a very open question.
“We used to joke that, depending who you ask, speech recognition is either solved or impossible,” says Gerald Friedland, the director of audio and multimedia at the International Computer Science Institute, affiliated with UC Berkeley. “The truth is somewhere in between.”
According to a 2020 study by Voicegain, Microsoft’s transcription tool had the lowest median WER (Word Error Rate) at 10.78%.
This is much higher than the error rate of the human transcriptionists we employ, which averages less than 1% across our teams.
“If you have people transcribe conversational speech over the telephone, the error rate is around 4 percent,” notes Xuedong Huang, a senior scientist at Microsoft, whose Project Oxford has provided a public API for voice recognition entrepreneurs to play with. “If you put all the systems together—IBM and Google and Microsoft and all the best combined—amazingly, the error rate will be around 8 percent.”
Huang also estimates commercially available systems are probably closer to 12 percent. “This is not as good as humans,” Huang admits, “but it’s the best the speech community can do. It’s about as twice as bad as humans.”
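To make the error-rate figures above more concrete, here is a minimal sketch of how word error rate is conventionally calculated: the number of word substitutions, deletions, and insertions needed to turn the machine transcript into the reference transcript, divided by the number of words in the reference. The function name `word_error_rate` and the sample sentences are our own illustration, not the method or data used in the Voicegain study, and real benchmarks usually normalize punctuation and numbers before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words (word-level Levenshtein distance).
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference word missing
                current[j - 1] + 1,      # insertion: extra hypothesis word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


# Hypothetical example: one homophone error ("by" for "buy") in eight words.
reference = "what can I buy with one million dollars"
hypothesis = "what can I by with one million dollars"
print(f"{word_error_rate(reference, hypothesis):.1%}")  # prints 12.5%
```

A single wrong word in an eight-word sentence already produces a 12.5% error rate, which is why short voice commands can look far worse than long recordings when scored this way.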
So, why is human transcription much more accurate than AI? What can it do that AI can’t?
Humans Can Handle Background Noise
Humans can transcribe files with loud background noises (cars honking, music, people talking, machinery, office bustle, etc.). Of course, the more distractions, the longer it may take. However, human transcription can still provide you with an accurate transcript because the human ear is more attuned to different external factors. We’re used to listening to our friends speak in loud restaurants, ordering in busy coffee shops, and tuning out distracting noises in general.
Meanwhile, even with a clear audio file, AI transcription may produce a transcript with an error rate of around 8%.
Humans Understand Different Accents and Dialects
In 2018, a study showed Amazon Alexa and Google Assistant having problems with accuracy when identifying different accents irrespective of how fluent the English speaker may be. Accuracy dropped 2.6% for a speaker with a Chinese accent, and 4.2% for a Spanish accent!
An AI service’s vocabulary is based on a dictionary, meaning it can only understand a short series of commands and a limited set of words. Because of this, it has difficulty understanding accents, colloquialisms, and overlapping speech.
Humans are continually exposed to different dialects and accents, so they are well accustomed to a wide range of them.
Humans Can Differentiate Homophones (buy, by, bye)
One of the quickest ways of telling whether a human or a machine produced a transcript is to check the homophones. Homophones are words with the same pronunciation and different meanings (too, two, to). AI transcription services must rely on the sentence structure to predict which word to use, often leading to misused homophones. In contrast, humans rely on the context of the sentence to determine the appropriate homophone.
AI transcription can act as a quick and useful solution in certain situations, such as recording personal notes in a quiet room with one speaker.
Next, let’s look at when AI transcription does well and when human transcription is necessary.
Use Cases of AI Transcription Tools
AI transcription tools can be used in every situation human transcription can. The question is: where can AI do well, and where can human transcription undoubtedly provide more accuracy than AI?
AI can do well in the following areas:
- Simple business meetings with few speakers and extremely easy-to-understand language. AI can manage interviews with two speakers and no background noise. You won’t get speaker identification, and there will be some errors with homophones and punctuation, but the document most likely shouldn’t take too long to edit. Plan on editing taking roughly 6 to 7 times the length of the audio file.
- Recordings with one speaker and no background noise, where AI works best. If you are alone and in a quiet space, AI transcription should give you the best transcript possible, requiring only minor fixes. Plan on editing taking 2 to 3 times the length of the audio.
While using AI in the situations above results in the most accurate transcripts, you can use AI in any situation. If there are multiple speakers, background noises, and accents, the best AI transcription can give you is a draft. You will spend time editing and formatting, and the time depends on how rough the draft is.
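The editing-time rules of thumb above (2 to 3 times the audio length for a single speaker in a quiet room, 6 to 7 times for a simple meeting or two-speaker interview) can be turned into a quick estimate. The sketch below is only an illustration of that arithmetic; the scenario names and the function `estimate_editing_minutes` are made up for this example, and rougher recordings will fall outside these ranges.

```python
# Editing-time multipliers taken from the rules of thumb in this article.
# The scenario keys are hypothetical labels chosen for this sketch.
EDIT_MULTIPLIERS = {
    "single_speaker_quiet": (2, 3),  # one speaker, no background noise
    "simple_meeting": (6, 7),        # few speakers, clear language
}


def estimate_editing_minutes(audio_minutes: float, scenario: str) -> tuple[float, float]:
    """Return the (low, high) estimate of editing time in minutes."""
    low, high = EDIT_MULTIPLIERS[scenario]
    return audio_minutes * low, audio_minutes * high


# A 30-minute two-speaker interview transcribed by AI:
low, high = estimate_editing_minutes(30, "simple_meeting")
print(f"Plan for {low} to {high} minutes of editing")  # prints Plan for 180 to 210 minutes of editing
```

In other words, a half-hour interview can still mean three to three and a half hours of cleanup, which is worth weighing against the cost of a finished human transcript.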
For more complicated audio recordings like panel interviews, business meetings with multiple speakers using acronyms and industry jargon, lecture recordings, courtroom recordings, and anything with background noise, a human transcriptionist is the better choice and comes with a 99% accuracy guarantee. You will be provided with a complete transcript right away.
What Can Human Transcriptionists Do That AI Tools Can’t?
Many benefits come with outsourcing your transcription needs to humans, such as more personalized service, support, and much better accuracy. Moreover, they use the best transcription equipment and ensure high-quality output. Most transcription services will match you with the transcriptionist best suited for your project. You can speak with them and go over any details, including customizing your format to your exact needs.
Generally, transcription styles and formats include the following:
- Full verbatim: the text is transcribed exactly as it sounds with repetitions and speech errors.
- Clean verbatim: all stammers, stutters, and filler words (hmm, like, uhmm) are removed.
- Edited verbatim: the document is made ready for publishing by improving grammar and tenses for better readability.
You can also choose the following extras:
Speaker identification: If your file has multiple speakers, they can be identified by inserting labels such as S1, S2, and S3. Speaker identification helps readers understand the flow, how many speakers there are, and who is speaking. Most transcription services can even include the speakers’ names.
Timestamps: Timestamps are inserted every 2 minutes or every time a new person starts speaking. Timestamps can be put at different intervals, depending on your needs.
SRT/Closed Captions: This is a specialized format that uses timestamps at particular intervals. It’s most commonly used in closed captioning and subtitles, and it works seamlessly with closed captioning software.
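For readers curious what the SRT format looks like under the hood, here is a small sketch. Each caption in an SRT file is a numbered block with a start and end time written as `HH:MM:SS,mmm --> HH:MM:SS,mmm`, followed by the caption text and a blank line. The helper names `srt_timestamp` and `to_srt` and the sample segments are invented for this illustration; a transcription provider’s own tooling may work differently.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, caption) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Joining with a newline leaves one blank line between caption blocks.
    return "\n".join(blocks)


# Hypothetical two-speaker excerpt using the S1/S2 labels described above.
segments = [
    (0.0, 2.5, "S1: Thanks for joining us today."),
    (2.5, 5.0, "S2: Happy to be here."),
]
print(to_srt(segments))
```

The same segment list could just as easily be rendered with timestamps every two minutes or at each speaker change, which is why providers can usually accommodate different timestamp intervals on request.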
Don’t see the format you need? Transcription providers are flexible and can usually make the format you request. Make sure you specify your desired format with your transcription provider.
Next, human transcription makes documents extremely readable by using grammar, punctuation, and proper paragraphs to clarify the subject.
On another note, when someone is sworn in at court, many sentences are repeated. It’s not necessary to enter that information into a document over and over again. A human transcriptionist can easily leave out the repetition for you.
In the same sense, in case of a discrepancy in court, the transcriptionist, being a real human being, can show up to testify to the accuracy of what is typed in the transcript.
When you deal with real people, communication is much clearer. Customer service also plays a massive role in the service and extra features you receive. In contrast, AI transcription is programmed to transcribe. For obvious reasons, it can’t show up in court, be on a phone call with you, or have the same intuition for formatting and grammar that humans have.
Which transcription option seems the most reliable for your business? If you produce simple notes and have no background noise or speakers with accents, AI transcription tools should be good enough for your business. If you often work with audio files that have multiple speakers and accents, or you are in the legal or medical industry where accuracy is crucial, human transcription will save you a lot of time on edits and present you with an accurate transcript right away.
If human transcription is the right choice for your business, Ditto Transcripts comprises 100% US-based human transcribers. We are here to help give you better service and the most accurate transcription. You can send us your project details now to firstname.lastname@example.org or call us (720) 287-3710, and we’d be happy to hash out the details of your project with you on the phone.