Recently, Ukrainian President Volodymyr Zelenskyy gave a lengthy interview to the well-known US podcaster Lex Fridman. The conversation can be watched and listened to in three languages – Ukrainian, English, and Russian – thanks to the American AI service ElevenLabs, which provided the translation.

The three-hour conversation focused on peace talks, NATO and security guarantees for Ukraine, US military assistance, martial law and elections, as well as Donald Trump and Elon Musk. Notably, the AI preserved Volodymyr Zelenskyy’s own voice in translation: his intonation, emotions, and speaking style come through no matter which language you listen to the interview in.

We tell you how the Polish-founded startup ElevenLabs got its “unicorn” status (valued at $1 billion), what its key features are, and how the service is used by trolls and Russian propagandists.

What is ElevenLabs and how does it work?

Headquartered in New York City, ElevenLabs was founded in 2022 by former Google ML engineer Piotr Dabkowski and former Palantir deployment strategist Mati Staniszewski. The idea for the company was born out of a desire to improve the quality of dubbing: as children, Piotr and Mati hated the poor dubbing of Hollywood movies.

ElevenLabs AI service translated Zelensky’s interview with Lex Fridman. What is this startup and how does it work?
Piotr Dabkowski and Mati Staniszewski, founders of ElevenLabs; photo from forbes.pl

The startup’s goal is to help overcome language barriers in content. The company develops tools for creating and editing synthetic voices. Essentially, ElevenLabs is an AI-based online voice generator that can be accessed directly from a web browser. As early as June 2023, the company raised $19 million: the Series A round was led by the American venture fund Andreessen Horowitz, ex-GitHub CEO Nat Friedman, and Daniel Gross, former head of the AI department at the Y Combinator business incubator. The company was valued at about $100 million.

In January 2024, the startup raised another $80 million: Bloomberg reported that the company’s valuation had reached $1.1 billion, citing its CEO, Mati Staniszewski. Thus, in less than a year, ElevenLabs became a unicorn. After raising the funds, Staniszewski noted that ElevenLabs plans to apply its technology to audiobooks, video games, media, movie dubbing, and the creation of full-fledged AI actors. As of January 2024, the company employed only 40 people, but planned to expand its staff to 100.

ElevenLabs startup team; photo from the company’s website

A few months after launching its tools in beta, ElevenLabs reached its first million users. The company also built on its AI voice research by launching AI Dubbing, a speech-to-speech tool that translates audio and video into 32 different languages while preserving the voice, emotions, and style of the original speaker.

The company’s technology currently offers two main ways to create audio: text-to-speech and speech-to-speech, which converts an existing recording into a different voice. ElevenLabs is suitable for a variety of tasks, from creating audiobooks and podcasts to dubbing training videos and working with virtual assistants. The service also provides tools for customizing the voice: you can change the tone, speed, and even emotions, which makes voice creation flexible and helps the result match the user’s intent.

According to Mati Staniszewski, ElevenLabs’ technology combines context recognition and high compression to deliver ultra-realistic speech. “Rather than generate sentences one by one, the company’s proprietary model is built to understand word relationships and adjusts delivery based on the wider context. It also has no hardcoded features, meaning it can dynamically predict thousands of voice characteristics while generating speech. We are constantly entering into new B2B partnerships, with over 100 established to date. AI voices have wide applicability – from enabling creators to enhance audience experiences, to broadening access to education and providing innovative solutions in publishing, entertainment, and accessibility,” the startup’s co-founder told VentureBeat.

The key features of ElevenLabs include the following:

▪️high quality voice with artificial intelligence – the service offers several language models that can be selected for different languages and needs;

▪️convenient and easy customization – in the online interface, you can adjust voice parameters for the entire project: adjust stability, similarity, and add style;

▪️voice library – the service provides more than 100 ready-made voices; you can also create your own voice in the VoiceLab section; and you can customize new voices or upload audio files for cloning;

▪️simple and intuitive interface – it’s easy to understand how everything works; the voice generator allows you to change voices directly on the same page, without the need to use additional tools.
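
The voice parameters listed above (stability, similarity, style) can also be set programmatically. The following is a minimal sketch, not an official client: it only builds a text-to-speech request and does not send it. The endpoint path, the `xi-api-key` header, the `voice_settings` field names, and the model ID are assumptions based on the publicly documented ElevenLabs HTTP API and should be checked against the current documentation; `build_tts_request` is a hypothetical helper name.

```python
import json

# Assumed base URL of the public ElevenLabs API; verify against the current docs.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, text,
                      stability=0.5, similarity=0.75, style=0.0,
                      model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request description.

    The three sliders from the web interface are assumed to map to
    values between 0.0 and 1.0 in the request body.
    """
    for name, value in (("stability", stability),
                        ("similarity", similarity),
                        ("style", style)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")

    body = {
        "text": text,
        "model_id": model_id,  # assumed model identifier
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity,  # assumed field name for "similarity"
            "style": style,
        },
    }
    return {
        "method": "POST",
        "url": f"{API_BASE}/text-to-speech/{voice_id}",
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps(body),
    }


if __name__ == "__main__":
    request = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Hello, world!")
    print(request["url"])
    print(request["body"])
```

The resulting dictionary can be passed to any HTTP client. Keeping construction separate from sending makes it easy to inspect the settings before spending credits.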

ElevenLabs offers several pricing plans for different needs:

Plan | Price | Credits/month | Audio/month | Main functions
Free | $0 | 10,000 | ~10 minutes | Voice generation in 32 languages, translation with automatic dubbing, creation of unique voices, sound effects, access to API
Starter | $5/month | 30,000 | ~30 minutes | Everything from Free, plus voice cloning (from 1 minute of audio), access to the dubbing studio to customize translation and synchronization, a license for commercial use
Creator | $11/month | 100,000 | ~2 hours | Everything from Starter, plus professional voice cloning, projects for long content with multiple voices, better sound
Pro | $99/month | 500,000 | ~10 hours | Everything from Creator, plus improved sound output (44.1 kHz PCM)
Scale | $330/month | 2,000,000 | ~40 hours | Everything from Pro, plus priority support
Business | $1,320/month | 11,000,000 | ~180 hours | Everything from Scale, plus Turbo model ($50/million credits with annual payment), 3 professional voice clones, priority support
Enterprise | Individually | Unlimited | Unlimited | Everything from Business, plus full API access, customizable terms, security guarantees, security questionnaires, SSO support, more voices and transactions every month
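
To compare the paid plans, it helps to convert each one into a price per hour of generated audio. The sketch below does this arithmetic using only the figures from the table above. The audio durations are the approximate values listed there, so the results are rough estimates, and the function names are illustrative.

```python
# Figures copied from the pricing table above:
# plan -> (price in USD per month, credits per month, approx. audio minutes per month)
PLANS = {
    "Starter": (5, 30_000, 30),
    "Creator": (11, 100_000, 120),
    "Pro": (99, 500_000, 600),
    "Scale": (330, 2_000_000, 2_400),
    "Business": (1_320, 11_000_000, 10_800),
}


def cost_per_hour(price_usd, minutes):
    """Approximate price in USD for one hour of generated audio."""
    return price_usd / (minutes / 60)


def credits_per_minute(credits, minutes):
    """Approximate number of credits consumed by one minute of audio."""
    return credits / minutes


def summarize(plans):
    """Return {plan: (USD per audio hour, credits per audio minute)}."""
    return {
        name: (cost_per_hour(price, minutes), credits_per_minute(credits, minutes))
        for name, (price, credits, minutes) in plans.items()
    }


if __name__ == "__main__":
    for name, (usd_hour, cr_min) in summarize(PLANS).items():
        print(f"{name:<9} ~${usd_hour:.2f} per audio hour, ~{cr_min:.0f} credits per minute")
```

By these figures, Starter works out to about $10 per hour of audio and Creator to about $5.50, while the larger plans fall between roughly $7 and $10 per hour.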

In 2023, the service added 20 languages to its AI dubbing, among them Ukrainian, Polish, Spanish, Japanese, and Arabic. In 2024, the “unicorn startup” launched a consumer app: ElevenLabs Reader: AI Audio can recognize and voice text from web pages and PDFs using 11 different voices. In addition to the web version, the service is now available on the major mobile platforms, Android and iOS.

The company’s technologies are also used for voice interaction on the Rabbit r1 device, as well as for text-to-speech functions in the Perplexity artificial intelligence search engine and the PocketFM and KukuFM audio platforms. ElevenLabs Reader: AI Audio was the service’s first product aimed at mass consumers. The app’s closest competitor is Speechify, which offers additional features such as document scanning for text recognition and integration with Gmail and Canvas. ElevenLabs has recently introduced a feature that allows you to upload different types of content to create a multi-speaker podcast: the GenFM feature can be found in the ElevenLabs Reader app for iOS. GenFM currently supports 32 languages. In 2024, the company also launched a text-to-music model, as well as the Voice Isolator feature, which removes background noise from audio.

The company also announced an $11 million investment in the Polish startup ecosystem and the opening of an office in Warsaw, which will become a development center to attract AI talent. The startup also wants to expand its presence in India, where it has already hired a business manager and is building a team.

Scandals and ElevenLabs: trolls, Russians, and outraged actors

The launch of ElevenLabs was not without its problems. Internet trolls immediately took advantage of the open access to the technology and began distributing fake audio recordings on social media, in which the voices of famous people uttered offensive statements, declared wars, or quoted Hitler. For example, some users of the anonymous English-language image board 4chan used the ElevenLabs voice synthesis platform to clone celebrity voices and read out audio ranging from memes and erotica to hate speech and disinformation. The misuse of the company’s software was first reported by Motherboard, which found posters on the 4chan website who were distributing voice clips generated by artificial intelligence that resembled the voices of famous personalities, including Emma Watson and Joe Rogan:

“In one example, a generated voice that sounds like actor Emma Watson reads a section of Mein Kampf. In another, a voice very similar to Ben Shapiro makes racist remarks about Alexandria Ocasio-Cortez. In a third, someone saying “trans rights are human rights” is strangled.”

4chan used celebrity voices for abuse

ElevenLabs then acknowledged the abuse on social network X (ex-Twitter) and said it would explore ways to mitigate these issues. The company noted that it could “trace any generated audio back to the user” and would explore security measures such as verifying the user’s identity and manually checking each voice cloning request.

According to various human rights organizations, voice actors are now increasingly being asked to sign away the rights to their voices so that clients can use artificial intelligence to create synthetic versions that can eventually replace them, sometimes without additional compensation. These contractual obligations are just one of the many concerns that actors have about the development of voice-generating artificial intelligence. AI, they say, threatens to push entire segments of the industry out of work.

On the one hand, ElevenLabs sets clear rules for the use of its technology, prohibiting the cloning of voices for abusive purposes such as fraud, discrimination, and hate speech. On the other hand, the company supports the use of the platform for “caricatures, parodies, and satire,” as well as for “artistic and political speeches and debates.” The company also said it has the authority to suspend the accounts and remove the content of users found to be in violation of these rules.

On the eve of the US Democratic Party primary in New Hampshire in January 2024, thousands of citizens received automated calls generated by artificial intelligence, allegedly from Joe Biden, urging them not to vote on the day of the primary. The New Hampshire Attorney General’s Office then launched an investigation into the incident and linked it to a Texas-based company, while experts concluded that the call was made using ElevenLabs’ technology. In response to the incident, ElevenLabs CEO Mati Staniszewski said that ElevenLabs is “committed to preventing the misuse of artificial intelligence audio tools,” but did not comment on specific incidents.

In December 2024, it was reported that Russia was using ElevenLabs’ artificial intelligence to generate voices in order to undermine support for Ukraine in the West. According to a report by the cybersecurity company Recorded Future, Russia is using generative AI in a new propaganda campaign aimed at discrediting Ukraine and undermining assistance from European countries. The fake videos actively used voices generated by the ElevenLabs service.

Screenshot from Recorded Future’s report

This campaign was organized by the Social Design Agency, a Russian entity under US sanctions. The videos, targeted at a European audience, accused Ukrainian politicians of corruption and presented Western equipment, including American Abrams tanks, as ineffective. Experts found that several of the videos were voiced by real people, which was revealed by a noticeable Russian accent. Journalists have previously reported that this agency uses disinformation to discredit Ukraine and promote the interests of Russia. It is run by Russian political strategist Ilya Gambashidze.

As reported by TechCrunch, the Russians used ElevenLabs’ AI to quickly translate the videos into several European languages: English, German, French, Polish, and Turkish. Recorded Future used ElevenLabs’ AI Speech Classifier tool to confirm that the voices in the videos were created by AI. In response to these incidents, ElevenLabs introduced new security measures to limit the use of its technology for deception, including automatically blocking the voices of politicians.