mudler/LocalAI

:robot: The free, Open Source alternative to OpenAI, Claude, and others. Self-hosted and local-first. A drop-in replacement that runs on consumer-grade hardware; no GPU required. Runs gguf, transformers, diffusers, and many more model formats. Features: text generation, MCP, audio, video, images, voice cloning, and distributed, P2P, decentralized inference.

Topics: ai, api, web3, llm, computer-vision, nlp, audio-generation, decentralized, distributed, gemma, image-generation, libp2p, llama, mamba, mcp, mistral, musicgen, object-detection, rerank, rwkv, stable-diffusion, text-generation, tts
Go · MIT license · 43.0K stars · 3.6K forks · Updated 2/27/2026


    About LocalAI





:bulb: Get help - ❓ FAQ · 💭 Discussions · :speech_balloon: Discord · :book: Documentation website

💻 Quickstart · 🖼️ Models · 🚀 Roadmap · 🥽 Demo · 🌍 Explorer · 🛫 Examples · Try on Telegram


LocalAI is the free, Open Source OpenAI alternative. It acts as a drop-in replacement REST API compatible with OpenAI (and Elevenlabs, Anthropic, ...) API specifications for local AI inferencing. It lets you run LLMs and generate images, audio, and more, locally or on-prem on consumer-grade hardware, supporting multiple model families. It does not require a GPU. LocalAI is created and maintained by Ettore Di Giacinto.
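Because the API is OpenAI-compatible, any OpenAI client can be pointed at a local instance. A minimal sketch with curl, assuming a server from the quickstart is listening on localhost:8080; the model name here is illustrative, substitute one you have installed:

```shell
# Build a chat-completions request for a local LocalAI instance.
# BASE_URL and MODEL are assumptions: adjust them to your own setup.
BASE_URL="${LOCALAI_BASE_URL:-http://localhost:8080}"
MODEL="llama-3.2-1b-instruct:q4_k_m"

REQUEST=$(cat <<EOF
{"model": "$MODEL", "messages": [{"role": "user", "content": "Say hello"}]}
EOF
)
echo "$REQUEST"

# Requires a running server; uncomment to actually send the request:
# curl -s "$BASE_URL/v1/chat/completions" \
#   -H "Content-Type: application/json" \
#   -d "$REQUEST"
```

The same request shape works for any OpenAI-compatible client library by setting its base URL to the local server.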

    📚🆕 Local Stack Family

    🆕 LocalAI is now part of a comprehensive suite of AI tools designed to work together:

    LocalAGI Logo

    LocalAGI

    A powerful Local AI agent management platform that serves as a drop-in replacement for OpenAI's Responses API, enhanced with advanced agentic capabilities.

    LocalRecall Logo

    LocalRecall

A RESTful API and knowledge base management system that provides persistent memory and storage capabilities for AI agents.

Screenshots

Talk Interface · Generate Audio
Models Overview · Generate Images
Chat Interface · Home
Login · Swarm (P2P dashboard)

    💻 Quickstart

    Run the installer script:

    # Basic installation
    curl https://localai.io/install.sh | sh
    

    For more installation options, see Installer Options.

    Or run with docker:

    CPU only image:

    docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
    

    NVIDIA GPU Images:

    # CUDA 12.0
    docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12
    
    # CUDA 11.7
    docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11
    
    # NVIDIA Jetson (L4T) ARM64
    docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64
    

    AMD GPU Images (ROCm):

    docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas
    

    Intel GPU Images (oneAPI):

    docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel
    

    Vulkan GPU Images:

    docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan
    

    AIO Images (pre-downloaded models):

    # CPU version
    docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
    
    # NVIDIA CUDA 12 version
    docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12
    
    # NVIDIA CUDA 11 version
    docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11
    
    # Intel GPU version
    docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel
    
    # AMD GPU version
    docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas
    

    For more information about the AIO images and pre-downloaded models, see Container Documentation.
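For longer-lived setups, the same images can be run under Docker Compose. A minimal sketch, assuming the CPU AIO image; the host models path and the container-side mount point are illustrative, check the Container Documentation for the directory your image actually uses:

```yaml
services:
  local-ai:
    image: localai/localai:latest-aio-cpu
    ports:
      - "8080:8080"
    volumes:
      # Hypothetical paths: persists downloaded models across restarts
      - ./models:/models
```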

    To load models:

    # From the model gallery (see available models with `local-ai models list`, in the WebUI from the model tab, or visiting https://models.localai.io)
    local-ai run llama-3.2-1b-instruct:q4_k_m
    # Start LocalAI with the phi-2 model directly from huggingface
    local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
    # Install and run a model from the Ollama OCI registry
    local-ai run ollama://gemma:2b
    # Run a model from a configuration file
    local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
    # Install and run a model from a standard OCI registry (e.g., Docker Hub)
    local-ai run oci://localai/phi-2:latest
    

    Automatic Backend Detection: When you install models from the gallery or YAML files, LocalAI automatically detects your system's GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see GPU Acceleration.
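The configuration-file option above points at a model YAML. A minimal sketch of what such a file can look like; the field values are illustrative, not taken from the gist referenced above, and the full schema is described in the model-configuration documentation:

```yaml
# Hypothetical phi-2 model configuration
name: phi-2
context_size: 2048
parameters:
  model: huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf
  temperature: 0.2
```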

    For more information, see 💻 Getting started

    📰 Latest project news

    Roadmap items: List of issues

    🚀 Features

    🧩 Supported Backends & Acceleration

    LocalAI supports a comprehensive range of AI backends with multiple acceleration options:

    Text Generation & Language Models

| Backend | Description | Acceleration Support |
|---|---|---|
| llama.cpp | LLM inference in C/C++ | CUDA 11/12, ROCm, Intel SYCL, Vulkan, Metal, CPU |
| vLLM | Fast LLM inference with PagedAttention | CUDA 12, ROCm, Intel |
| transformers | HuggingFace transformers framework | CUDA 11/12, ROCm, Intel, CPU |
| exllama2 | GPTQ inference library | CUDA 12 |
| MLX | Apple Silicon LLM inference | Metal (M1/M2/M3+) |
| MLX-VLM | Apple Silicon Vision-Language Models | Metal (M1/M2/M3+) |

    Audio & Speech Processing

| Backend | Description | Acceleration Support |
|---|---|---|
| whisper.cpp | OpenAI Whisper in C/C++ | CUDA 12, ROCm, Intel SYCL, Vulkan, CPU |
| faster-whisper | Fast Whisper with CTranslate2 | CUDA 12, ROCm, Intel, CPU |
| bark | Text-to-audio generation | CUDA 12, ROCm, Intel |
| bark-cpp | C++ implementation of Bark | CUDA, Metal, CPU |
| coqui | Advanced TTS with 1100+ languages | CUDA 12, ROCm, Intel, CPU |
| kokoro | Lightweight TTS model | CUDA 12, ROCm, Intel, CPU |
| chatterbox | Production-grade TTS | CUDA 11/12, CPU |
| piper | Fast neural TTS system | CPU |
| kitten-tts | Kitten TTS models | CPU |
| silero-vad | Voice Activity Detection | CPU |

    Image & Video Generation

| Backend | Description | Acceleration Support |
|---|---|---|
| stablediffusion.cpp | Stable Diffusion in C/C++ | CUDA 12, Intel SYCL, Vulkan, CPU |
| diffusers | HuggingFace diffusion models | CUDA 11/12, ROCm, Intel, Metal, CPU |

    Specialized AI Tasks

| Backend | Description | Acceleration Support |
|---|---|---|
| rfdetr | Real-time object detection | CUDA 12, Intel, CPU |
| rerankers | Document reranking API | CUDA 11/12, ROCm, Intel, CPU |
| local-store | Vector database | CPU |
| huggingface | HuggingFace API integration | API-based |

    Hardware Acceleration Matrix

| Acceleration Type | Supported Backends | Hardware Support |
|---|---|---|
| NVIDIA CUDA 11 | llama.cpp, whisper, stablediffusion, diffusers, rerankers, bark, chatterbox | NVIDIA hardware |
| NVIDIA CUDA 12 | All CUDA-compatible backends | NVIDIA hardware |
| AMD ROCm | llama.cpp, whisper, vllm, transformers, diffusers, rerankers, coqui, kokoro, bark | AMD graphics |
| Intel oneAPI | llama.cpp, whisper, stablediffusion, vllm, transformers, diffusers, rfdetr, rerankers, exllama2, coqui, kokoro, bark | Intel Arc, Intel iGPUs |
| Apple Metal | llama.cpp, whisper, diffusers, MLX, MLX-VLM, bark-cpp | Apple M1/M2/M3+ |
| Vulkan | llama.cpp, whisper, stablediffusion | Cross-platform GPUs |
| NVIDIA Jetson | llama.cpp, whisper, stablediffusion, diffusers, rfdetr | ARM64 embedded AI |
| CPU Optimized | All backends | AVX/AVX2/AVX512, quantization support |

    🔗 Community and integrations

    Build and deploy custom containers:

    WebUIs:

Model galleries:

    Other:

    🔗 Resources

    :book: 🎥 Media, Blogs, Social

    Citation

If you utilize this repository or its data in a downstream project, please consider citing it with:

    @misc{localai,
      author = {Ettore Di Giacinto},
      title = {LocalAI: The free, Open source OpenAI alternative},
      year = {2023},
      publisher = {GitHub},
      journal = {GitHub repository},
  howpublished = {\url{https://github.com/go-skynet/LocalAI}},
}

    ❤️ Sponsors

    Do you find LocalAI useful?

    Support the project by becoming a backer or sponsor. Your logo will show up here with a link to your website.

A huge thank you to our generous sponsors, who support this project and cover its CI expenses. See also our Sponsor list:


    🌟 Star history

    LocalAI Star history Chart

    📖 License

    LocalAI is a community-driven project created by Ettore Di Giacinto.

    MIT - Author Ettore Di Giacinto [email protected]

    🙇 Acknowledgements

    LocalAI couldn't have been built without the help of great software already available from the community. Thank you!

    🤗 Contributors

    This is a community project, a special thanks to our contributors! 🤗
