Podcast recording and editing platform Podcastle is joining other companies in the AI-powered text-to-speech race by releasing its own AI model, called Asyncflow v1.0. An API will also be available, allowing developers to integrate the text-to-speech model directly into their apps.

Thanks to the new model, the company is able to offer more than 450 AI voices that can narrate your text. The startup said that it developed the technology and model in such a way that its training and inference costs are low, giving it an advantage against competitors.

With the move, Podcastle joins a number of startups, including ElevenLabs, Speechify, and WellSaid, that have developed technology and AI models to convert any kind of text into a voice clip narrated by AI. This technology spans use cases like marketing, advertisement, content creation, education, and corporate training.

Podcastle’s founder, Arto Yeritsyan, told TechCrunch that the company had always wanted to build a text-to-speech model, but the training costs and data requirements were very high.

“We wanted to build a robust text-to-speech model since our inception. However, the costs of development were very high. Thanks to recent large language model developments, we were able to reach a breakthrough last year to get to a place where we could build a high-quality voice model without needing a ton of data,” Yeritsyan said.

The company was also aided in its efforts by its $13.5 million Series A fundraise last year.

Yeritsyan said that while Podcastle charges around $40 per 500 minutes of text-to-speech conversion, ElevenLabs charges $99 for the same amount.
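
To put those figures on a per-minute basis, here is a minimal sketch of the arithmetic. It assumes the prices are exactly as Yeritsyan quoted them ($40 and $99 per 500 minutes); the constant and function names are illustrative, and actual plan pricing may differ.

```python
# Prices as quoted by Yeritsyan in this article (assumed, not verified
# against either company's current pricing pages).
MINUTES = 500
PODCASTLE_USD = 40.0
ELEVENLABS_USD = 99.0


def cost_per_minute(price_usd: float, minutes: int) -> float:
    """Return the price of one minute of text-to-speech conversion."""
    return price_usd / minutes


podcastle_rate = cost_per_minute(PODCASTLE_USD, MINUTES)
elevenlabs_rate = cost_per_minute(ELEVENLABS_USD, MINUTES)

# Relative saving of the cheaper quoted price versus the higher one.
savings_pct = (1 - PODCASTLE_USD / ELEVENLABS_USD) * 100

print(f"Podcastle:  ${podcastle_rate:.3f} per minute")
print(f"ElevenLabs: ${elevenlabs_rate:.3f} per minute")
print(f"Difference: {savings_pct:.1f}% lower")
```

On the quoted numbers, that works out to roughly 8 cents per minute versus about 20 cents per minute.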

Podcastle’s voice cloning feature is also getting an upgrade that speeds up the training process.

Earlier, the training process involved reading roughly 70 different sentences. Now, it needs just a few seconds of recording from you to create a clone of your voice. The new process also uses Podcastle’s Magic Dust AI, which was released last year, to improve audio recording quality.

In our testing, the voice created with the new process sounded a bit robotic, though it mimicked our tone. The company said that, over time, it will improve the feature. Plus, you can train different samples of your voice to get different results.

Podcastle said that, apart from costs, having tools for audio, video, podcasts, and AI-powered narration under one redesigned site will give it an edge over competitors. Yeritsyan said that while the majority of users work on audio content in Podcastle, video is catching up.