Nomic
Open embedding models with large-scale data visualization
Nomic is profiled here as an LLM tool for engineering teams. Read about features, pricing, and how it compares to related options in the tools directory.
Description
Nomic is an AI company founded in 2022 by Brandon Duderstadt and Andriy Mulyar. Its Nomic Embed models are open-weight, openly trained text and multimodal embeddings, and its Atlas platform maps millions of embeddings into an interactive view for exploring and cleaning unstructured datasets. The company also maintains GPT4All, a project for running language models locally on consumer hardware. Open training data and reproducible methods set its embeddings apart for teams that need transparency into how the models were built.
Key Capabilities:
Open-weight text embeddings with reproducible training data
Multimodal embeddings that place text and images in one space
Atlas platform for visualizing and curating embedding datasets
Local inference through GPT4All for offline use
Variable embedding dimensions through Matryoshka representation
Apache 2.0 model weights with an inference API
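To illustrate the variable-dimension capability listed above, here is a minimal sketch of Matryoshka-style truncation: keep the leading components of an embedding and rescale the result to unit length so that dot products still behave like cosine similarity. The function name `truncate_embedding` and the toy vector are hypothetical and used only for illustration. This is the generic technique, not Nomic's exact procedure; a given model may require an extra normalization step before truncation, so check the model card before relying on it.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components of `vec` and L2-normalize them.

    Hypothetical helper for illustration; not part of any Nomic library.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be rescaled; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy 3-dimensional "embedding" shortened to 2 dimensions.
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)
```

Shorter vectors reduce storage and speed up similarity search, usually at some cost in retrieval quality, so the target dimension is a trade-off to measure on your own data.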
Alternative tools
- txtai
All-in-one embeddings database for semantic search and RAG
- Perplexity Sonar API
Search-grounded language model API with live citations
- AI21 Labs
Hybrid Mamba-Transformer models for enterprise applications
- Instructor
Structured outputs from LLMs with validation built in
- Tecton
Enterprise feature platform for real-time machine learning
- Feast
Open-source feature store for production machine learning
