Llama Guard
Open safeguard model for classifying LLM inputs and outputs
Llama Guard is profiled here as an LLM tool for engineering teams. Read about features, pricing, and how it compares to related options in the tools directory.
Description
Llama Guard is an open-weight safeguard model from Meta, released as part of the Llama protections suite. It classifies prompts and model responses against a taxonomy of safety categories and returns whether the content is safe and, if not, which categories it violates, so it can serve as a content filter around a primary model. Because the policy is supplied in the prompt, teams can adapt the categories to their own application without retraining. The open weights allow it to be self-hosted and run privately in front of any generative model a team already operates. Multimodal variants extend the same screening to image content, and related components in the suite cover prompt and code safety.
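As a rough illustration of the "policy in the prompt, verdict in the response" flow described above, the sketch below renders a custom category list as text and parses a classifier reply. It assumes the reply is `safe`, or `unsafe` followed by a line of comma-separated category codes; confirm the exact prompt and output format against the model card for the version in use. The names `Verdict`, `build_policy_block`, and `parse_verdict`, and the category codes, are hypothetical and not part of any Llama Guard API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Verdict:
    """Parsed classifier result (hypothetical helper type)."""
    safe: bool
    categories: List[str] = field(default_factory=list)


def build_policy_block(categories: Dict[str, str]) -> str:
    """Render custom categories as text to embed in the classifier prompt.

    The "CODE: name" layout is an assumption for illustration only.
    """
    return "\n".join(f"{code}: {name}" for code, name in categories.items())


def parse_verdict(raw: str) -> Verdict:
    """Parse a reply assumed to be 'safe' or 'unsafe' plus a category line."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")
    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True)
    if label == "unsafe":
        cats: List[str] = []
        if len(lines) > 1:
            cats = [c.strip() for c in lines[1].split(",") if c.strip()]
        return Verdict(safe=False, categories=cats)
    raise ValueError(f"unexpected classifier output: {lines[0]!r}")


# Hypothetical categories chosen by the application team.
policy_text = build_policy_block({"S1": "Violent Crimes", "S2": "Privacy"})
verdict = parse_verdict("unsafe\nS1,S2")
```

Treating any unrecognized reply as an error, rather than as "safe", keeps the filter from failing open if the output format changes.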
Key Capabilities:
- Input and output classification against a safety taxonomy
- Customizable policy categories defined at inference time
- Multimodal variants that screen image content
- Open weights for self-hosted, private deployment
- Integration as a filter in front of any generative model
- Part of a suite alongside Prompt Guard and Code Shield
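The input and output filtering pattern listed above can be sketched as a thin wrapper around any generation function. This is a minimal sketch under stated assumptions: `guarded_generate`, `stub_generate`, and `stub_classify` are hypothetical names, and the keyword-based stub only stands in for a real call to a self-hosted safeguard model.

```python
from typing import Callable, Optional

REFUSAL = "Sorry, I can't help with that."

# classify(prompt, response) returns True when the content is judged safe.
# response is None when only the user prompt is being screened.
Classifier = Callable[[str, Optional[str]], bool]


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    classify: Classifier,
    refusal: str = REFUSAL,
) -> str:
    """Screen the prompt, call the primary model, then screen its response."""
    if not classify(prompt, None):
        return refusal  # blocked at the input stage
    response = generate(prompt)
    if not classify(prompt, response):
        return refusal  # blocked at the output stage
    return response


def stub_generate(prompt: str) -> str:
    """Stand-in for the primary generative model."""
    return f"echo: {prompt}"


def stub_classify(prompt: str, response: Optional[str]) -> bool:
    """Stand-in for the safeguard model: flags text containing 'forbidden'."""
    text = response if response is not None else prompt
    return "forbidden" not in text.lower()


allowed = guarded_generate("hello", stub_generate, stub_classify)
blocked = guarded_generate("a forbidden topic", stub_generate, stub_classify)
```

Passing the classifier in as a callable keeps the wrapper independent of how the safeguard model is hosted, so the same code path can be exercised in tests with a stub.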
Alternative tools
- LLM Guard
Open-source security toolkit for LLM interactions
- Martian
Model router that optimizes cost and quality per request
- Cloudflare AI Gateway
A gateway for caching, routing, and observing AI requests
- Databricks
Lakehouse platform unifying data engineering and AI
- BigQuery
Serverless, petabyte-scale cloud data warehouse
- Browser Use
Connect AI agents to the browser for web tasks
