resemble-ai

    resemble-ai/chatterbox

    SoTA open-source TTS

    backend
    Python
    MIT
    22.7K stars
    3.0K forks
    22.7K watching
    Updated 2/27/2026

    Health Score

    75

    Weekly Growth

    +0

    +0.0% this week

    Contributors

    1

    Total contributors

    Open Issues

    300

    Generated Insights

    About chatterbox


    Chatterbox TTS


    Made with ♥️ by Resemble AI

    We're excited to introduce Chatterbox, Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations.

    Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out. Try it now on our Hugging Face Gradio app.

    If you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service (link). It delivers reliable performance with ultra-low latency under 200 ms, which makes it well suited to production use in agents, applications, or interactive media.

    Key Details

    • SoTA zero-shot TTS
    • 0.5B Llama backbone
    • Unique exaggeration/intensity control
    • Ultra-stable with alignment-informed inference
    • Trained on 0.5M hours of cleaned data
    • Watermarked outputs
    • Easy voice conversion script
    • Outperforms ElevenLabs

    Tips

    • General Use (TTS and Voice Agents):

      • The default settings (exaggeration=0.5, cfg_weight=0.5) work well for most prompts.
      • If the reference speaker has a fast speaking style, lowering cfg_weight to around 0.3 can improve pacing.
    • Expressive or Dramatic Speech:

      • Try lower cfg_weight values (e.g. ~0.3) and increase exaggeration to around 0.7 or higher.
      • Higher exaggeration tends to speed up speech; reducing cfg_weight helps compensate with slower, more deliberate pacing.
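
The tips above can be captured as a small set of presets. This is a minimal sketch, not part of the Chatterbox package: the preset names `general` and `expressive` and the helpers `generation_kwargs` and `synthesize` are hypothetical, and the sketch assumes that `model.generate` accepts `exaggeration` and `cfg_weight` as keyword arguments, as the default settings quoted above suggest.

```python
# Presets taken from the tips above. The labels "general" and "expressive"
# are illustrative names, not part of the Chatterbox API.
PRESETS = {
    "general": {"exaggeration": 0.5, "cfg_weight": 0.5},
    "expressive": {"exaggeration": 0.7, "cfg_weight": 0.3},
}


def generation_kwargs(style="general", fast_speaker=False):
    """Return keyword arguments for model.generate() for a given style."""
    kwargs = dict(PRESETS[style])
    # A fast-talking reference speaker tends to benefit from a lower cfg_weight.
    if fast_speaker:
        kwargs["cfg_weight"] = min(kwargs["cfg_weight"], 0.3)
    return kwargs


def synthesize(model, text, style="general", fast_speaker=False,
               audio_prompt_path=None):
    """Call model.generate() with a preset (assumes a loaded ChatterboxTTS)."""
    kwargs = generation_kwargs(style, fast_speaker)
    if audio_prompt_path is not None:
        kwargs["audio_prompt_path"] = audio_prompt_path
    return model.generate(text, **kwargs)
```

With a loaded model, `synthesize(model, text, style="expressive")` would request more dramatic delivery with slower pacing. Treat the values as starting points to adjust by ear.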

    Installation

    pip install chatterbox-tts
    

    Alternatively, you can install from source:

    # conda create -yn chatterbox python=3.11
    # conda activate chatterbox
    
    git clone https://github.com/resemble-ai/chatterbox.git
    cd chatterbox
    pip install -e .
    

    We developed and tested Chatterbox on Python 3.11 on Debian 11; the dependency versions are pinned in pyproject.toml to ensure consistency. You can modify the code or dependencies in this installation mode.

    Usage

    import torchaudio as ta
    from chatterbox.tts import ChatterboxTTS
    
    model = ChatterboxTTS.from_pretrained(device="cuda")
    
    text = "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
    wav = model.generate(text)
    ta.save("test-1.wav", wav, model.sr)
    
    # If you want to synthesize with a different voice, specify the audio prompt
    AUDIO_PROMPT_PATH = "YOUR_FILE.wav"
    wav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)
    ta.save("test-2.wav", wav, model.sr)
    

    See example_tts.py and example_vc.py for more examples.
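
The usage example above hard-codes `device="cuda"`. On machines without an NVIDIA GPU, a fallback such as the following sketch may help. The helper `pick_device` is hypothetical, and whether `from_pretrained` accepts `"mps"` or `"cpu"` is an assumption that the text above does not confirm.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; report the CPU fallback.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Apple Silicon backend; older PyTorch builds may not expose it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
# Assumption: from_pretrained accepts the selected device string.
# model = ChatterboxTTS.from_pretrained(device=device)
```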

    Supported Language

    Currently, only English is supported.

    Acknowledgements

    Built-in PerTh Watermarking for Responsible AI

    Every audio file generated by Chatterbox includes Resemble AI's PerTh (Perceptual Threshold) Watermarker: imperceptible neural watermarks that survive MP3 compression, audio editing, and common manipulations while maintaining nearly 100% detection accuracy.

    Watermark extraction

    You can look for the watermark using the following script.

    import perth
    import librosa
    
    AUDIO_PATH = "YOUR_FILE.wav"
    
    # Load the watermarked audio
    watermarked_audio, sr = librosa.load(AUDIO_PATH, sr=None)
    
    # Initialize watermarker (same as used for embedding)
    watermarker = perth.PerthImplicitWatermarker()
    
    # Extract watermark
    watermark = watermarker.get_watermark(watermarked_audio, sample_rate=sr)
    print(f"Extracted watermark: {watermark}")
    # Output: 0.0 (no watermark) or 1.0 (watermarked)
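
To check many files at once, the extraction step can be wrapped in a loop. The sketch below is illustrative only: `scan_for_watermarks` is a hypothetical helper, `detect` stands for any callable that runs the script above on one file, and the 0.5 threshold assumes that the result is a single number near 0.0 or 1.0, as the output comment indicates.

```python
from pathlib import Path


def scan_for_watermarks(folder, detect, threshold=0.5):
    """Map each .wav file name in `folder` to True if it looks watermarked.

    `detect` is a caller-supplied function that takes a path and returns
    a score (assumed here to be 0.0 for no watermark, 1.0 for watermarked).
    """
    results = {}
    for path in sorted(Path(folder).glob("*.wav")):
        score = float(detect(path))
        results[path.name] = score >= threshold
    return results
```

A `detect` function could load the file with `librosa.load(path, sr=None)` and return the value of `watermarker.get_watermark(...)`, exactly as in the script above.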
    

    Official Discord

    👋 Join us on Discord and let's build something awesome together!

    Citation

    If you find this model useful, please consider citing.

    @misc{chatterboxtts2025,
      author       = {{Resemble AI}},
      title        = {{Chatterbox-TTS}},
      year         = {2025},
      howpublished = {\url{https://github.com/resemble-ai/chatterbox}},
      note         = {GitHub repository}
    }
    

    Disclaimer

    Don't use this model to do bad things. Prompts are sourced from freely available data on the internet.
