Giskard
Scan AI agents for vulnerabilities before and after deployment
Description
Giskard is an open-source Python framework for evaluating and red teaming AI systems, founded by Alex Combessie and Jean-Marie John-Mathews and headquartered in Paris. Combessie's prior role at Dataiku exposed a consistent gap in enterprise AI workflows: testing was fragmented, hard to compare across vendors, and unprepared for production edge cases. Giskard targets that gap with automated vulnerability scanning that generates domain-specific test cases from your own knowledge base rather than applying generic probes. As a European-built LLM testing platform, Giskard is designed around EU AI Act compliance and data-residency requirements, which US-based tools tend to address only partially.
Key Capabilities
Automated LLM scan: A single giskard.scan() call detects hallucinations, prompt injection, sensitive information disclosure, stereotypes, and harmful content across your LLM agent without manual test case authoring
RAGET (RAG Evaluation Toolkit): Generates realistic synthetic test cases directly from a RAG knowledge base to evaluate answer correctness, groundedness, and retrieval quality across pipeline components
EU AI Act and OWASP LLM Top 10 compliance packs: Pre-built compliance presets activate full vulnerability suites aligned to European regulatory requirements and OWASP LLM Top 10 categories from a single config entry
Black-box testing via API endpoint: Giskard tests any accessible API without requiring access to internal model architecture, vector databases, or source code, making it usable against third-party or vendor-hosted AI systems
Giskard Guards (guardrail platform): An on-premises guardrail layer with a Policy-as-Code framework that secures the full agent execution chain, aimed at regulated industries that require EU-sovereign data processing
Continuous red teaming (Hub): Giskard Hub generates new adversarial attack scenarios automatically as threat landscapes evolve, with RBAC, audit trails, team collaboration, and GDPR-native data handling built into the enterprise tier
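To make the automated scan concrete, here is a minimal sketch of wrapping an agent and calling giskard.scan(). It assumes the open-source giskard 2.x Python package is installed and that an LLM client is configured for the scan's LLM-assisted detectors. The answer_question function, the model name, the description, and the report path are hypothetical placeholders, not part of Giskard.

```python
def answer_question(question):
    """Hypothetical stand-in for the agent under test; replace with a real call."""
    return f"(stub answer to: {question})"


def model_predict(df):
    """Return one output per input row.

    Giskard passes a pandas.DataFrame whose columns match feature_names;
    iterating df["question"] yields the question strings.
    """
    return [answer_question(q) for q in df["question"]]


def run_scan(report_path="scan_report.html"):
    """Wrap the agent as a Giskard model, scan it, and save an HTML report.

    Assumption: giskard 2.x is installed and an LLM client is configured.
    """
    import giskard  # imported here so the helpers above work without it

    model = giskard.Model(
        model=model_predict,
        model_type="text_generation",
        name="Support assistant",  # illustrative name
        description="Answers customer questions about a product catalogue",
        feature_names=["question"],
    )
    report = giskard.scan(model)  # runs the available vulnerability detectors
    report.to_html(report_path)
    return report
```

The name and description matter in practice, since the scan uses them to generate domain-specific probes. Calling run_scan() would start the scan; the wrapper only needs a callable that maps inputs to outputs, which is what allows the same pattern to target a remote API endpoint.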
Alternative Tools
- Claude Code
Agentic coding tool that runs in your terminal
- OpenAI Codex CLI
Terminal coding agent built on OpenAI reasoning models
- Aider
AI pair programming in your terminal
- Cline
Open-source AI coding agent for any editor
- Braintrust Evals
Trace every step your LLM agent takes, from prompt to response
- Promptfoo
Test and red team LLM applications from the command line
