Kokoro 82M Text to Speech AI Model

Kokoro 82M is a text-to-speech (TTS) model with 82 million parameters, built on the StyleTTS 2 and ISTFTNet architectures. Released under the Apache 2.0 license, it pairs a compact size with high-quality speech synthesis in American and British English.

How to Use Kokoro 82M

A quick guide to getting started with text-to-speech generation using Kokoro 82M.

Install dependencies: Clone the Kokoro 82M repository, install the Python packages with pip, and install the espeak-ng system package.
Load the model: Use the provided code to build the Kokoro model and select your desired voicepack.
Generate speech: Input your text and generate 24kHz audio output using the built-in functions.
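
The last step yields audio samples at 24 kHz. As a minimal sketch of what saving that output can look like, the following uses only the Python standard library. The sine tone is a placeholder standing in for model output, and `write_wav_24k` is a hypothetical helper name, not a function from the Kokoro repository.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # Kokoro 82M output rate stated above (24 kHz)


def write_wav_24k(path, samples):
    """Write float samples in [-1.0, 1.0] to a mono 16-bit PCM WAV file."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # guard against out-of-range values
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))


# Placeholder for model output (assumption): half a second of a 440 Hz tone.
placeholder = [
    0.3 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE // 2)
]
write_wav_24k("kokoro_demo.wav", placeholder)
```

In a real session, the list of placeholder samples would be replaced by the audio array returned from the model's generation function.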

Frequently Asked Questions

What makes Kokoro 82M unique among TTS models?

Kokoro 82M stands out for its efficient architecture and compact size of just 82 million parameters. It outperforms much larger models such as MetaVoice (1.2B parameters) and XTTS (467M parameters), while remaining open source and licensed for commercial use.
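
To put those parameter counts in perspective, here is a rough back-of-the-envelope sketch. It uses only the counts quoted above and assumes 4 bytes per parameter for fp32 weights and 2 bytes for fp16. The result estimates weight storage only and ignores activations and runtime overhead; the function names are illustrative.

```python
# Parameter counts as quoted in the answer above.
PARAMS = {
    "Kokoro 82M": 82e6,
    "XTTS": 467e6,
    "MetaVoice": 1.2e9,
}


def weight_megabytes(n_params, bytes_per_param):
    """Approximate size of the weights alone, in megabytes (10^6 bytes)."""
    return n_params * bytes_per_param / 1e6


def size_ratio(model, baseline="Kokoro 82M"):
    """How many times larger `model` is than `baseline`, by parameter count."""
    return PARAMS[model] / PARAMS[baseline]


for name, n in PARAMS.items():
    print(
        f"{name}: {size_ratio(name):.1f}x Kokoro, "
        f"~{weight_megabytes(n, 4):.0f} MB fp32, "
        f"~{weight_megabytes(n, 2):.0f} MB fp16"
    )
```

By this estimate, XTTS has roughly 5.7 times as many parameters as Kokoro 82M and MetaVoice roughly 14.6 times as many.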

Is Kokoro 82M suitable for commercial use?

Yes. Kokoro 82M is released under the Apache 2.0 license, which permits commercial use, so you can build high-quality TTS into commercial applications without proprietary licensing restrictions.

How does Kokoro 82M handle different accents?

Kokoro 82M supports both American and British English. You can select specific voicepacks like Bella, Sarah, Adam, and others to match your preferred accent.
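
Voicepack identifiers for Kokoro typically look like `af_bella` or `bf_emma`. The sketch below assumes the two-letter prefix encodes accent (`a` for American, `b` for British) followed by gender (`f` or `m`). That mapping is an assumption inferred from the published voice names, and `describe_voice` and `voices_for_accent` are hypothetical helpers, not part of the model's code.

```python
# Assumed prefix convention: first letter = accent, second letter = gender.
ACCENTS = {"a": "American English", "b": "British English"}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice_id):
    """Split an identifier such as 'af_bella' into name, accent, and gender."""
    prefix, sep, name = voice_id.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected voice id format: {voice_id!r}")
    accent_code, gender_code = prefix
    if accent_code not in ACCENTS or gender_code not in GENDERS:
        raise ValueError(f"unknown prefix in voice id: {voice_id!r}")
    return {
        "name": name.capitalize(),
        "accent": ACCENTS[accent_code],
        "gender": GENDERS[gender_code],
    }


def voices_for_accent(voice_ids, accent_code):
    """Keep only the voice ids whose prefix starts with the given accent code."""
    return [v for v in voice_ids if v.startswith(accent_code)]


available = ["af_bella", "af_sarah", "am_adam", "bf_emma"]
british = voices_for_accent(available, "b")
print(british, describe_voice("af_bella"))
```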

What are the system requirements for running Kokoro 82M?

Kokoro 82M is lightweight and can run on consumer-level hardware. It supports both GPU and CPU configurations, and the ONNX version provides even broader compatibility for real-time applications.
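
As a hedged illustration of the GPU-or-CPU choice, the sketch below picks a device and computes a real-time factor (synthesis time divided by audio duration, where values below 1.0 mean faster than real time). It assumes a PyTorch-based setup and falls back to CPU if PyTorch is not installed. `pick_device` and `real_time_factor` are illustrative names, and the timing figures in the example call are made up for demonstration.

```python
SAMPLE_RATE = 24_000  # Kokoro 82M output rate (24 kHz)


def pick_device():
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # optional dependency; absence means CPU-only here
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def real_time_factor(synthesis_seconds, n_samples, sample_rate=SAMPLE_RATE):
    """Synthesis time divided by the duration of the generated audio."""
    audio_seconds = n_samples / sample_rate
    if audio_seconds <= 0:
        raise ValueError("no audio was generated")
    return synthesis_seconds / audio_seconds


device = pick_device()
# Hypothetical measurement: 0.5 s of compute for 2 s (48,000 samples) of audio.
rtf = real_time_factor(0.5, 48_000)
print(f"device={device}, real-time factor={rtf:.2f}")
```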

Can Kokoro 82M handle multilingual text?

Currently, Kokoro 82M is optimized for English text-to-speech synthesis. However, its architecture has the potential to support other languages with additional training data.

Is Kokoro 82M capable of voice cloning?

No. Kokoro 82M does not currently support voice cloning, because its training dataset is limited (under 100 hours of audio). Its existing voicepacks nonetheless deliver high quality for the specific voice styles they cover.