Lightspeed-backed audio platform Pocket FM has announced a partnership with voice-cloning company ElevenLabs to quickly convert text into audio series using AI.
After raising $103 million in Series D funding in March, Pocket FM said it was already piloting AI text-to-audio conversion powered by ElevenLabs. Now the India-based company is expanding the feature, making the conversion tool available to all creators in the coming weeks.
During the pilot, Pocket FM produced 30,000 hours of audio series using ElevenLabs’ AI technology. With the wider rollout, the startup expects to triple its content library, which currently exceeds 100,000 hours of audio, by the end of this year. Pocket FM also reported that its AI-driven tools cut audio production costs by up to 90% during the pilot.
Prateek Dixit, Pocket FM’s co-founder and CTO, explained that the goal of this partnership is to simplify the process for writers to convert their written works into engaging audio series. “We have over 250,000 writers, including those on our Pocket Novel platform, and this collaboration lowers the barriers for them to produce audio,” he noted.
Dixit also highlighted the impact of AI tools: “Typically, a writer can create around 30 minutes of high-quality audio daily with the right recording set-up. With these AI tools, that output can increase up to tenfold.”
Pocket FM has built a tool on top of ElevenLabs’ technology that gives writers a choice of 50 voices for converting their content. ElevenLabs co-founder Mati Staniszewski said the tool understands the context of the writing and automatically adds emotion to the generated voice. “Working with Pocket FM, we are deploying newer models that better understand writing genres and emotional expressions,” he stated.
To further enhance user experience, Pocket FM plans to recommend specific voice options tailored to different genres based on user engagement data.
Pocket FM isn’t alone in leveraging AI for audio series. Google-backed Kuku FM also utilizes technologies like GPT-4, Claude, BandLab, and ElevenLabs to assist writers at various stages of content creation, including refining scripts, generating thumbnails, adding sound effects, and converting text into audio.
Kuku FM has also reported experimenting with visual creation tools like Midjourney and Runway to develop promotional content related to its offerings.
Quality Control and Artist Impact
While AI-driven tools promise rapid content generation, ensuring quality remains a challenge. Pocket FM’s strategy for maintaining standards is to refine its content discovery algorithm and closely analyze user engagement. “When a writer publishes an audio series, we showcase it to a select audience and monitor their engagement. Positive metrics trigger further promotion,” Dixit said.
Kuku FM is also investing in quality assurance, with a dedicated team focused on promoting only high-quality content on its platform. “We recognized the need for a human quality control team at the heart of our decisions for audio content production,” said Lal Chand Bisu, co-founder and CEO of Kuku FM.
However, the rise of AI could threaten the roles of traditional voiceover artists. The Association of Voiceover Artists (AVA) in India has raised concerns that AI could jeopardize its members’ jobs. Amarinder Singh Sodhi, the association’s general secretary, stated, “If AI takes over, our livelihoods are at risk. We need regulatory measures to safeguard our profession.” He also mentioned instances where voiceover artists were invited to record samples for AI training without proper consent.
Delhi-based artist Aditya Mattoo shared similar worries, stating, “Using AI diminishes the emotional connection inherent in storytelling.” He cautioned that giving unskilled creators access to premium voices could flood the market with subpar content.
Voiceover artists globally have joined the conversation, sharing apprehensions about AI’s impact on their careers. Despite collaborating with AI companies, many remain uncomfortable with alterations to their voices.
Asked about the effects of AI-generated content at Pocket FM, the company did not give a direct answer but noted that user engagement with AI-generated audio was “on par with human narrations.” Additionally, Pocket FM is exploring technology that enables multiple voices in a single audio output.
Currently, neither Pocket FM nor Kuku FM labels its content to indicate whether AI was used in its creation.