Giskard

Scan AI agents for vulnerabilities before and after deployment

Testing Evaluation Guardrails Open Source

Description

Giskard is an open-source Python framework for evaluation and red teaming, built by Alex Combessie and Jean-Marie John-Mathews, whose company is headquartered in Paris. Combessie's prior role at Dataiku exposed a consistent gap in enterprise AI workflows: testing was fragmented, hard to compare across vendors, and unprepared for production edge cases. Giskard targets that gap with automated vulnerability scanning that generates domain-specific test cases from your own knowledge base instead of applying generic probes. Giskard positions itself as the only major LLM testing platform built by a European entity, designed around EU AI Act compliance and data-residency requirements that US-based tools address only partially.

Key Capabilities

Automated LLM scan: A single giskard.scan() call detects hallucinations, prompt injection, sensitive information disclosure, stereotypes, and harmful content across your LLM agent without manual test case authoring
RAGET (RAG Evaluation Toolkit): Generates realistic synthetic test cases directly from a RAG knowledge base to evaluate answer correctness, groundedness, and retrieval quality across pipeline components
EU AI Act and OWASP LLM Top 10 compliance packs: Pre-built compliance presets activate full vulnerability suites aligned to European regulatory requirements and OWASP LLM Top 10 categories from a single config entry
Black-box testing via API endpoint: Giskard tests any accessible API without requiring access to internal model architecture, vector databases, or source code, making it usable against third-party or vendor-hosted AI systems
Giskard Guards (guardrail platform): An on-premise guardrail layer with a Policy-as-Code framework that secures the full agent execution chain for regulated industries requiring EU-sovereign data processing
Continuous red teaming (Hub): Giskard Hub generates new adversarial attack scenarios automatically as threat landscapes evolve, with RBAC, audit trails, team collaboration, and GDPR-native data handling built into the enterprise tier
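
As a rough illustration of the automated scan and black-box testing items above, the following minimal sketch wraps an agent in a batch prediction function and hands it to giskard.scan(). The agent stub (fake_agent), the helper names (make_batch_predictor, run_scan), the "question" column, and the model name and description are hypothetical placeholders, not part of Giskard. The sketch also assumes the open-source giskard package is installed and that an LLM client is configured for the LLM-assisted detectors.

```python
from typing import Callable, List, Mapping, Sequence


def make_batch_predictor(ask: Callable[[str], str], column: str = "question"):
    """Adapt a one-question-at-a-time function to a batch prediction function."""

    def predict(df: Mapping[str, Sequence[str]]) -> List[str]:
        # Giskard passes a pandas DataFrame here; indexing by column name
        # also works on a plain dict, which keeps this adapter easy to try out.
        return [ask(question) for question in df[column]]

    return predict


def fake_agent(question: str) -> str:
    # Hypothetical stand-in for a request to your agent's API endpoint.
    return f"Echo: {question}"


def run_scan(predict):
    # Imported lazily so the adapter above is usable without giskard installed.
    import giskard

    model = giskard.Model(
        model=predict,
        model_type="text_generation",
        name="Support agent",  # placeholder name
        description="Answers customer questions about the product documentation.",
        feature_names=["question"],
    )
    # Assumes an LLM client is already configured for the LLM-assisted detectors.
    return giskard.scan(model)


predict = make_batch_predictor(fake_agent)
```

Calling run_scan(predict) returns a scan report, which can be written out with report.to_html("scan_report.html"). Because the scan only sees the prediction function, swapping fake_agent for an HTTP call to a vendor-hosted endpoint leaves the rest of the sketch unchanged.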

Alternative tools

Claude Code: Agentic coding tool that runs in your terminal
OpenAI Codex CLI: Terminal coding agent built on OpenAI reasoning models
Aider: AI pair programming in your terminal
Cline: Open-source AI coding agent for any editor
Braintrust Evals: Trace every step your LLM agent takes, from prompt to response
Promptfoo: Test and red team LLM applications from the command line