Technology simplifies our lives every day: by handing routine tasks over to it, we save time and energy. Converting audio to text is one example. For a person, this is a kind of "endurance quest": listening to a recording again and again, trying to make out unintelligible words, and rejoicing when the speaker talks clearly and slowly.
Modern solutions make this process much easier. There are many online transcription tools that can do the job quickly and efficiently. We at "Mezh" selected four services and checked how well they work. Each of them supports Ukrainian and offers different capabilities.
For testing, we recorded an excerpt from the story "Kaydash's Family" by Ivan Nechuy-Levytsky and converted it into text using the selected tools. Below, we describe the results.
Any to Text
This online service automatically converts both audio and video to text. It uses artificial intelligence and works quite simply: the user needs to upload the recording, click the Transcribe button, and wait for the result.
The website states that Any to Text supports various audio and video formats, including MP4, MKV, FLV, AVI, MOV, WMV, M4A, MP3, OGG, AAC, WAV, FLAC, and WMA. The service automatically recognizes over 50 languages and allows you to export the finished file in .docx, .xlsx, .srt, or .txt formats.
To try the tool, users can transcribe 15 minutes of recordings for free. After that, Any to Text offers several plans at a discounted price: 100 minutes ($3.20), 500 minutes ($14), or 1,000 minutes ($25).
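To compare these packages, it can help to work out the price of a single minute. The short sketch below does that arithmetic using only the prices quoted above; the names in it are illustrative and are not part of any Any to Text API.

```python
# Per-minute price of each Any to Text package, based on the prices quoted in this article.
# All names here are illustrative; nothing below calls the service itself.
PACKAGES = {100: 3.20, 500: 14.00, 1000: 25.00}  # minutes -> price in USD


def price_per_minute(minutes: int, price_usd: float) -> float:
    """Return the cost of one minute of transcription, rounded to a tenth of a cent."""
    return round(price_usd / minutes, 3)


PER_MINUTE = {
    minutes: price_per_minute(minutes, price) for minutes, price in PACKAGES.items()
}

if __name__ == "__main__":
    for minutes, rate in PER_MINUTE.items():
        print(f"{minutes} minutes: ${rate:.3f} per minute")
```

On these figures, the largest package works out to about 2.5 cents per minute, compared with 3.2 cents for the smallest one.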
Any to Text promises up to 98% accuracy. However, testing has shown that the quality is actually somewhat lower. You can see how the tool coped with Ukrainian text in the screenshot below. On the plus side, the whole process takes just a few minutes.
Good Tape
The service was developed by journalists who "spent thousands of hours in newsrooms, media centers, and editorial offices pulling their hair out because they hated having to transcribe manually." They decided to make things easier, both for themselves and for everyone else.
This tool also uses artificial intelligence and is notable for its simplicity. As in the previous case, Good Tape allows you to upload a file, click Transcribe, and see the finished result.
In terms of functionality, the service transcribes audio/video and supports most, if not all, formats. Users can upload files up to 2 GB each.
The tool recognizes over 100 languages and dialects and allows you to export files in .docx, .srt, or .txt formats. The quality of the converted text seems slightly better than in the previous case, but it is still far from perfect.
Meanwhile, Good Tape emphasizes that it does not use user files for AI training or any other purposes. The team assures: "Your files belong to you." The service also indicates that it complies with the requirements of the GDPR (General Data Protection Regulation).
Good Tape has "Free", "Professional" and "Team" subscription plans. Under the first, you can convert up to three files per month for free, each up to 30 minutes long. One disadvantage is the slow processing: for example, a 35-second audio clip took about an hour.
The "Professional" plan offers 20 hours of recordings and faster processing. Users also get automatic speaker recognition, a short summary of the transcript, and benefits such as priority access to new features.
The cost of the "Professional" plan is 15 euros per month, but the first month costs 9 euros. With an annual subscription, the monthly cost drops to 13.75 euros.
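A rough calculation shows how the two billing options compare over the first year. The sketch below uses the prices quoted above and assumes that the 9-euro introductory price applies only to the first month of monthly billing; that is our reading of the offer, not a confirmed billing rule, and the names are illustrative.

```python
# First-year cost of the Good Tape "Professional" plan, in euros.
# Assumption: the 9-euro introductory price covers only the first month
# of monthly billing, and the remaining 11 months cost the regular price.
MONTHLY_PRICE = 15.00
FIRST_MONTH_PRICE = 9.00
ANNUAL_PLAN_MONTHLY_RATE = 13.75


def first_year_monthly_billing() -> float:
    """One discounted month followed by eleven regular months."""
    return FIRST_MONTH_PRICE + 11 * MONTHLY_PRICE


def first_year_annual_billing() -> float:
    """Twelve months at the reduced annual-subscription rate."""
    return 12 * ANNUAL_PLAN_MONTHLY_RATE


if __name__ == "__main__":
    print(f"Monthly billing, first year: {first_year_monthly_billing():.2f} EUR")
    print(f"Annual billing, first year:  {first_year_annual_billing():.2f} EUR")
```

Under that assumption, the first year costs 174 euros with monthly billing and 165 euros with the annual subscription.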
Good Tape's "Team" plan is designed for groups of five or more people. The cost and specific terms of this plan are not listed; the company promises to prepare them on individual request.
Sonix AI
Of the four services on the list, this one has perhaps the most functionality. Using the latest AI technologies, the company has created a platform for transcription, translation, and analysis.
First of all, you can get a verbatim transcription of an audio or video file up to 4 GB in size, including all exclamations, pauses, and stutters. To do this, upload the file to the platform from your PC, Dropbox, Google Drive, or YouTube, or provide a link to the video.
Next, Sonix AI offers to automatically or manually identify the language or dialect in the recording from among 53+ available options. The next step is to move on to the actual conversion.
Whether you use the paid version or not, the whole process takes a few minutes. The finished text can be edited, copied, highlighted, annotated, and so on. It contains timestamps and identifies speakers.
The platform supports most audio and video formats. You can export the result in the following formats: .docx, .txt, .pdf, .srt, .vtt, .ttml, .csv, .sesx, .xml, .fcpxml, .wav, or .mp3. Apart from a few errors, the quality of the result is good.
As for translation, the company promises to overcome "language barriers" by translating quickly and accurately, again thanks to AI. According to Sonix AI, the technology does not simply translate words but adapts content to the target audience and context.
The analytics tools, in turn, let users dig deeper into their content, identify its key themes, and even gauge audience sentiment.
However, Sonix AI is not only about current capabilities, but also about future ones. For example, the company is currently offering early access to a real-time audio and video to text conversion feature, which is planned for release soon.
In addition, the company reports high demand for medical transcription and offers early access to a service that will comply with HIPAA (Health Insurance Portability and Accountability Act) requirements.
Sonix AI offers 30 minutes of free audio-to-text conversion, but you need to register on the platform to get it. There are also three paid plans: Standard, Premium, and Enterprise.
"Standard" has no subscription fee. It gives access to the transcription and translation services at $10 per hour of recording.
"Premium" is a plan with access for multiple users. It offers transcription or analysis for $5 per hour and translation for $3 per hour. In addition to the hourly fee, you must pay for a subscription: $22 per month per user when paying monthly or $16.50 per month when paying annually.
The "Enterprise" plan is intended for more than five users; its prices and terms are negotiated individually.
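Whether "Standard" or "Premium" is cheaper depends on how many hours you transcribe per month. The sketch below estimates the break-even point from the prices quoted above. It covers transcription only and a single user, and the function names are illustrative rather than anything offered by Sonix AI.

```python
# Monthly transcription cost on the Sonix AI "Standard" and "Premium" plans, in USD.
# Assumptions: one user, transcription only (no translation or analysis),
# and the prices quoted in this article.
STANDARD_RATE = 10.00        # per hour, no subscription fee
PREMIUM_RATE = 5.00          # per hour, on top of the subscription
PREMIUM_FEE_MONTHLY = 22.00  # per month when paying monthly
PREMIUM_FEE_ANNUAL = 16.50   # per month when paying annually


def standard_cost(hours: float) -> float:
    """Cost of transcribing `hours` of audio in one month on "Standard"."""
    return hours * STANDARD_RATE


def premium_cost(hours: float, fee: float = PREMIUM_FEE_MONTHLY) -> float:
    """Cost of transcribing `hours` of audio in one month on "Premium"."""
    return hours * PREMIUM_RATE + fee


def break_even_hours(fee: float) -> float:
    """Hours per month at which both plans cost the same."""
    return fee / (STANDARD_RATE - PREMIUM_RATE)


if __name__ == "__main__":
    print(f"Break-even, monthly billing: {break_even_hours(PREMIUM_FEE_MONTHLY):.1f} h")
    print(f"Break-even, annual billing:  {break_even_hours(PREMIUM_FEE_ANNUAL):.1f} h")
    print(f"10 hours on Standard: ${standard_cost(10):.2f}")
    print(f"10 hours on Premium:  ${premium_cost(10):.2f}")
```

On these assumptions, "Premium" starts to pay off at about 4.4 hours of recordings per month with monthly billing, or about 3.3 hours with annual billing. For 10 hours a month, that is $72 on "Premium" against $100 on "Standard".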
TurboScribe
This AI service primarily focuses on converting audio and video to text in over 98 languages. However, its working principle is slightly different from the tools described above.
Like the previous services, TurboScribe supports the most common formats. However, after uploading the file, you need to choose one of three transcription modes: Cheetah (fastest), Dolphin (balanced), or Whale (most accurate).
In addition, you need to choose the language yourself and can optionally configure speaker recognition and background-noise removal to improve sound quality. After that, you can start processing the file, which happens quickly.
The finished text can then be edited, translated, shared, summarized, and exported. The quality of TurboScribe's transcription is quite good, although not perfect.
One of the distinctive features of the service is its terms of use. Users can transcribe up to three files per day for free, each lasting up to 30 minutes.
Those interested can also choose the TurboScribe Unlimited plan and get unlimited transcription and editing of recordings up to 10 hours long. Files of up to 5 GB each can be uploaded. The discounted cost of the plan is $20 per month when paying monthly or $10 per month when paying annually.
There's also TurboScribe for Teams, a team plan with unlimited transcription that costs $120 per user per year.