LocalAI

Self-hosted API server replacing OpenAI, Anthropic, and ElevenLabs locally.

LocalAI is profiled here as a LLM tool for engineering teams. Read about features, pricing, and how it compares to related options in the tools directory.

LLM Backend RAG Framework Embeddings Authentication Deployment Agentic Capabilities Model RoutingOpen Source

Visit Website GitHub

Description

Short Intro: LocalAI is a self-hosted AI inference server created by Ettore Di Giacinto, an Italian open-source infrastructure engineer who also maintains Kairos, a cloud-native immutable OS targeting Kubernetes edge deployments. Created in 2023, released under MIT, and supported entirely by GitHub Sponsors and Spectro Cloud compute donations rather than any VC funding, the project provides drop-in REST API compatibility with OpenAI, Anthropic, and ElevenLabs from a single local endpoint. Where Ollama focuses on LLM text generation with minimal setup, LocalAI covers the full multi-modal surface — text, images, audio, video, voice cloning, face recognition, and distributed cluster serving — across 36+ interchangeable backends.

Key Capabilities:

Drop-in API compatibility for OpenAI, Anthropic, and ElevenLabs endpoints
36+ backends including llama.cpp, vLLM, transformers, whisper.cpp, diffusers, SGLang, and MLX
Multi-modal support covering text generation, image generation, audio, video, voice cloning, and face recognition with antispoofing liveness
Speaker diarization and WebRTC realtime audio-to-audio with tool calling
Distributed cluster mode with VRAM-aware smart routing and autoscaling
No GPU required with CPU fallback and automatic backend detection
Hardware acceleration for NVIDIA CUDA, AMD ROCm, Intel oneAPI, Apple Silicon Metal, Vulkan, and NVIDIA Jetson
MCP client support with tool streaming and Agenthub for native agentic orchestration
Multi-user platform with OIDC authentication, per-user API keys, and usage attribution
Ollama API drop-in compatibility for ecosystem integrations
P2P and decentralized inference with RDMA support
Backend Gallery with on-the-fly installation and OCI image signing
LocalAGI agent orchestration, LocalRecall memory system, and Cogito Go library as companion projects

See LocalAI pricing details →

Alternative tools

Kong AI Gateway
AI traffic governance built on the Kong gateway
Eden AI
One API for many AI models across providers and modalities
Reka
Multimodal models that reason across text, image, and video
Moonshot AI
Open-weight Kimi models tuned for agentic and coding tasks
xAI Grok
Grok model family with real-time search through one API
LLM Guard
Open-source security toolkit for LLM interactions

Used in Stacks

No saved stacks include this tool yet.

Browse more in LLM