Replicate
Run open-source AI models through a single API.
Description
Replicate is an inference platform built by Ben Firshman, the creator of Docker Compose, and Andreas Jansson, who built ML infrastructure at Spotify. The company was founded in 2019 and is headquartered in San Francisco with a remote-first team. Its platform gives software engineers API access to thousands of open-source models without requiring ML engineering skills. The company also maintains Cog, the open-source packaging tool that turns an ML model into a reproducible container with an HTTP prediction API.
Key Capabilities:
API access to 9,000+ open-source models including Stable Diffusion, Flux, Llama, and Whisper
Cog open-source CLI for packaging custom models with code, weights, and dependencies
Auto-generated REST API endpoints for any uploaded model
Pay-per-prediction billing, metered per second of GPU runtime
Scale-to-zero autoscaling that drops to no charge during idle periods
Fine-tuning API for customizing open-weight models
Deployments API for assigning model versions to dedicated GPU hardware
Multi-GPU support across A100, H100, and other GPU classes
Web playground on every model page for browser-based testing
Python, Node.js, TypeScript, and Go client SDKs
Hugging Face Inference Providers integration
Multi-modal coverage across image, video, audio, and text generation
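To make the per-second billing and scale-to-zero items above concrete, here is a rough cost sketch in Python. The rates, hardware labels, and function names are hypothetical placeholders chosen for illustration, not Replicate's actual prices or API; the comparison against an always-on GPU is likewise an assumption about how a fixed instance might be billed.

```python
# Hypothetical per-second GPU rates in USD (placeholders, NOT real Replicate prices).
ASSUMED_RATE_PER_SECOND = {
    "gpu-small": 0.001,
    "gpu-large": 0.002,
}


def metered_cost(gpu: str, seconds_per_prediction: float, predictions: int) -> float:
    """Cost when only active GPU seconds are billed (idle time costs nothing)."""
    rate = ASSUMED_RATE_PER_SECOND[gpu]
    return round(rate * seconds_per_prediction * predictions, 4)


def always_on_cost(gpu: str, hours: float) -> float:
    """Cost of a hypothetical GPU billed for every second of wall-clock time."""
    rate = ASSUMED_RATE_PER_SECOND[gpu]
    return round(rate * 3600 * hours, 4)


if __name__ == "__main__":
    # 1,000 predictions at 4 s each, spread over a 24-hour day.
    metered = metered_cost("gpu-small", seconds_per_prediction=4.0, predictions=1000)
    fixed = always_on_cost("gpu-small", hours=24)
    print(f"metered: ${metered:.2f}  always-on: ${fixed:.2f}")
```

Under these assumed numbers, a workload that keeps the GPU busy for only a small fraction of the day costs far less when billed per active second than when a GPU runs continuously. The gap narrows as utilization rises.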
