TruLens
Evaluate LLM applications with programmatic feedback functions
Description
TruLens is an open-source Python evaluation framework originally created by TruEra and now maintained by Snowflake following its acquisition in May 2024. TruEra was founded by ML interpretability researchers, and that lineage is visible in TruLens's core abstraction: feedback functions that attach to any input, output, or intermediate trace step and return explainable float scores rather than opaque pass/fail results. For teams running on Snowflake Cortex, TruLens is the native evaluation layer, with traces logging directly to Snowflake and feedback functions doubling as production guardrails.
Feedback functions and Metric class API: Programmatic evaluators that score inputs, outputs, and intermediate steps on a 0.0–1.0 scale, with chain-of-thought reasoning support for interpretable results
RAG Triad built-in metrics: Context Relevance, Groundedness, and Answer Relevance, a three-metric evaluation framework for RAG pipelines that has been widely adopted beyond TruLens itself
Snowflake Cortex native integration: Traces log directly to Snowflake; feedback functions run against Cortex Search and Cortex LLM Functions without additional configuration
Stack-agnostic instrumentation: Works with LangChain, LlamaIndex, LangGraph, and raw Python across OpenAI, Anthropic, HuggingFace, and Snowflake Cortex as feedback LLM providers
OpenTelemetry-native tracing: OTel span support makes TruLens trace data portable to Datadog, New Relic, and other observability stacks
Built-in local dashboard: Visualizes feedback scores and traces across evaluation runs via session.start_dashboard(), with no external service required
Alternative tools
- Claude Code
Agentic coding tool that runs in your terminal
- OpenAI Codex CLI
Terminal coding agent built on OpenAI reasoning models
- Aider
AI pair programming in your terminal
- Cline
Open-source AI coding agent for any editor
- Braintrust Evals
Trace every step your LLM agent takes, from prompt to response
- Giskard
Scan AI agents for vulnerabilities before and after deployment
