Llama Guard

Open safeguard model for classifying LLM inputs and outputs

Llama Guard is profiled here as a LLM tool for engineering teams. Read about features, pricing, and how it compares to related options in the tools directory.

LLM Evaluation GuardrailsOpen Source

Visit Website GitHub

Description

Llama Guard is an open-weight safeguard model from Meta, released as part of the Llama protections suite. It classifies prompts and model responses against a taxonomy of safety categories, returning whether content is safe and which policy it violates, so it acts as a content filter around a primary model. Because the policy is supplied in the prompt, teams adapt the categories to their own application without retraining. Released with open weights, it runs privately as a filter around any generative model a team already operates. Multimodal variants extend the same screening to image content, and related models cover prompt and code safety.

Key Capabilities:

Input and output classification against a safety taxonomy
Customizable policy categories defined at inference time
Multimodal variants that screen image content
Open weights for self-hosted, private deployment
Integration as a filter in front of any generative model
Part of a suite alongside Prompt Guard and Code Shield

Alternative tools

LLM Guard
Open-source security toolkit for LLM interactions
Martian
Model router that optimizes cost and quality per request
Cloudflare AI Gateway
A gateway for caching, routing, and observing AI requests
Databricks
Lakehouse platform unifying data engineering and AI
BigQuery
Serverless, petabyte-scale cloud data warehouse
Browser Use
Connect AI agents to the browser for web tasks

Used in Stacks

No saved stacks include this tool yet.

Browse more in LLM