XTTS-v2

Free

Multilingual voice cloning with only 6 seconds of audio.

Multilingual · Voice Cloning · Open-Source

Overview

XTTS-v2 clones a voice from about six seconds of reference audio and can then speak in that voice in more than a dozen languages. Because it supports cross-lingual synthesis, a voice recorded in English can be used to generate speech in Spanish or Japanese. It remains one of the most popular open voice-cloning models.
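
As a rough illustration of the cross-lingual workflow described above, the following sketch uses the Coqui TTS Python package (`pip install TTS`) to speak Spanish text in a voice taken from an English reference clip. The helper names `build_request` and `clone_and_speak`, the file names, and the hard-coded language list are assumptions made for this example, not part of the library; the language codes should be checked against the current model card.

```python
from pathlib import Path

# Model identifier used by the Coqui TTS package for XTTS-v2.
MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"

# Assumption: language codes as listed on the XTTS-v2 model card at the
# time of writing; verify against the current card before relying on them.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}


def build_request(text, speaker_wav, language, out_path="output.wav"):
    """Hypothetical helper: validate inputs and build tts_to_file kwargs."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    return {
        "text": text,
        "speaker_wav": str(speaker_wav),  # short reference clip of the voice
        "language": language,             # language of the generated speech
        "file_path": str(out_path),
    }


def clone_and_speak(text, speaker_wav, language, out_path="output.wav"):
    """Hypothetical helper: synthesize `text` in the reference voice."""
    # Imported lazily so the validation helper works without the package.
    from TTS.api import TTS

    tts = TTS(MODEL_NAME)  # downloads the model weights on first use
    tts.tts_to_file(**build_request(text, speaker_wav, language, out_path))
    return Path(out_path)
```

A call such as `clone_and_speak("Hola, ¿qué tal?", "reference_en.wav", "es")` would then write the Spanish speech to `output.wav`, assuming `reference_en.wav` is a clean recording of the target speaker.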

What makes XTTS-v2 special?

Focus: Cross-lingual voice cloning.
Availability: Free & Open-Source (Coqui).
Use cases: Localization, dubbing, and personalised voices.

Related free models

Chatterbox

A lightweight, fast TTS model built on LLaMA.

Dia

A 1.6B parameter TTS model from Nari Labs.

Kokoro

An 82M parameter TTS model by Hexgrad.
