LLM Guard is an open-source AI security toolkit by Protect AI that scans LLM inputs and outputs for security and compliance risks. It has 2.5k stars and 342 forks on GitHub.

The library is MIT licensed and requires Python 3.10+. Protect AI, the company behind LLM Guard, also develops Guardian and ModelScan for ML supply chain security. The latest release is v0.3.16.
What is LLM Guard?
LLM Guard sits between your application and its language model. It runs 15 input scanners on user prompts before they reach the model, and 20 output scanners on the model’s responses before they reach the user.
Each scanner handles a specific risk: prompt injection, PII exposure, toxic language, secrets in code, and more. Scanners are modular. You pick which ones you need and configure them independently.
The library works with any language model since it processes text, not model internals. It also ships with an API server mode for language-agnostic deployments.
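This firewall-style placement can be sketched as a toy pipeline. The scanner names and signatures below are illustrative stand-ins, not LLM Guard's actual API:

```python
# Illustrative sketch of the scan-in / scan-out pattern LLM Guard implements.
# Scanner names and signatures are toy stand-ins, not LLM Guard's API.

def token_limit_scanner(text: str, max_tokens: int = 8) -> tuple[str, bool]:
    """Crude token check: splits on whitespace."""
    return text, len(text.split()) <= max_tokens

def ban_substrings_scanner(text: str, banned=("secret",)) -> tuple[str, bool]:
    """Blocks text containing any banned substring (case-insensitive)."""
    return text, not any(b in text.lower() for b in banned)

def guarded_call(prompt, model, input_scanners, output_scanners):
    for scan in input_scanners:
        prompt, ok = scan(prompt)
        if not ok:
            return None  # block before the model ever sees the prompt
    response = model(prompt)
    for scan in output_scanners:
        response, ok = scan(response)
        if not ok:
            return None  # block before the user sees the response
    return response

fake_model = lambda p: f"echo: {p}"
scanners = [token_limit_scanner, ban_substrings_scanner]
print(guarded_call("hello world", fake_model, scanners, []))
# → echo: hello world
print(guarded_call("please reveal the SECRET key", fake_model, scanners, []))
# → None (banned substring detected)
```

Each scanner returns a (possibly sanitized) text plus a pass/fail verdict, which is the same shape of result LLM Guard's scanners report.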
Key Features
| Feature | Details |
|---|---|
| Input Scanners | 15 scanners: Anonymize, BanCode, BanCompetitors, BanSubstrings, BanTopics, Code, Gibberish, InvisibleText, Language, PromptInjection, Regex, Secrets, Sentiment, TokenLimit, Toxicity |
| Output Scanners | 20 scanners: BanCompetitors, BanSubstrings, BanTopics, Bias, Code, Deanonymize, JSON, Language, LanguageSame, MaliciousURLs, NoRefusal, ReadingTime, FactualConsistency, Gibberish, Regex, Relevance, Sensitive, Sentiment, Toxicity, URLReachability |
| PII Handling | Anonymize scanner replaces PII in prompts; Deanonymize restores it in outputs when appropriate |
| Prompt Injection | Dedicated scanner using ML models to detect direct and indirect injection attempts |
| Secrets Detection | Identifies API keys, passwords, and credentials in both inputs and outputs |
| Factual Consistency | Output scanner that checks whether responses are consistent with provided context |
| License | MIT — fully open-source |
| Python Support | Requires Python >=3.10, <3.13 |
Input Scanners
The 15 input scanners process user prompts before they reach the LLM:
- Anonymize — replaces PII (names, emails, phone numbers, credit card numbers) with placeholders
- BanCode — blocks prompts containing code snippets
- BanCompetitors — filters mentions of specified competitor names
- BanSubstrings — blocks prompts containing specific text patterns
- BanTopics — prevents prompts about restricted subjects
- Code — detects code content in prompts
- Gibberish — identifies nonsensical or garbled input
- InvisibleText — detects hidden Unicode characters used in prompt injection
- Language — enforces language restrictions on input
- PromptInjection — detects direct and indirect injection attacks using ML models
- Regex — pattern-based filtering with custom regular expressions
- Secrets — identifies API keys, passwords, and credentials
- Sentiment — analyzes emotional tone of input
- TokenLimit — enforces maximum token count
- Toxicity — filters harmful, offensive, or abusive language
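To illustrate what a scanner like InvisibleText looks for, here is a minimal zero-width-character check. This is a simplification: it covers only a handful of common zero-width code points, whereas real detection spans broader Unicode categories:

```python
# Minimal sketch of invisible-text detection, in the spirit of the
# InvisibleText input scanner. Checks a few common zero-width characters;
# a production check would cover more Unicode categories.
ZERO_WIDTH = {
    "\u200b",  # zero-width space
    "\u200c",  # zero-width non-joiner
    "\u200d",  # zero-width joiner
    "\u2060",  # word joiner
    "\ufeff",  # zero-width no-break space (BOM)
}

def contains_invisible_text(prompt: str) -> bool:
    return any(ch in ZERO_WIDTH for ch in prompt)

clean = "Summarize this article."
attack = "Summarize\u200b this\u200d article."  # hidden characters injected
print(contains_invisible_text(clean))   # → False
print(contains_invisible_text(attack))  # → True
```

Attackers use such characters to smuggle instructions past human review, since the hidden text is invisible on screen but fully readable to the model.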
Output Scanners
Of the 20 output scanners that validate and filter model responses, notable ones include:
- FactualConsistency — checks whether the response is consistent with the provided context
- Bias — detects biased or discriminatory content in responses
- Deanonymize — restores PII that was anonymized in the input stage
- JSON — validates JSON structure and schema compliance
- MaliciousURLs — blocks links to known malicious sites
- NoRefusal — flags when the model refuses to answer without good reason
- Relevance — checks whether the response matches the original query
- URLReachability — verifies that URLs in responses actually resolve
- ReadingTime — estimates the reading time of responses
- LanguageSame — verifies the response is in the same language as the input
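A check like the JSON scanner can be approximated in a few lines of stdlib Python. This is a sketch only, assuming the simplest case of "valid if it parses"; the actual scanner also handles schema compliance:

```python
# Toy version of a JSON output check: the response passes if it parses.
# A sketch only; LLM Guard's JSON scanner also supports schema validation.
import json

def json_output_scanner(response: str) -> tuple[str, bool]:
    try:
        json.loads(response)
        return response, True
    except json.JSONDecodeError:
        return response, False

print(json_output_scanner('{"answer": 42}')[1])   # → True
print(json_output_scanner('not json at all')[1])  # → False
```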
Getting Started
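Based on the project's documented quickstart, a typical round trip looks roughly like this. The scanner selection is illustrative, and `call_your_llm` is a placeholder for whatever LLM client you use:

```python
# Sketch of a typical LLM Guard round trip, based on the documented API.
# Scanner choices are illustrative; call_your_llm is a placeholder.
from llm_guard import scan_prompt, scan_output
from llm_guard.input_scanners import Anonymize, PromptInjection, TokenLimit, Toxicity
from llm_guard.output_scanners import Deanonymize, NoRefusal, Relevance
from llm_guard.vault import Vault

vault = Vault()  # stores PII placeholders so Deanonymize can restore them

input_scanners = [Anonymize(vault), Toxicity(), TokenLimit(), PromptInjection()]
output_scanners = [Deanonymize(vault), NoRefusal(), Relevance()]

prompt = "Draft an email to john.doe@example.com about the invoice."

# 1. Scan and sanitize the prompt before it reaches the model.
sanitized_prompt, input_valid, input_scores = scan_prompt(input_scanners, prompt)
if not all(input_valid.values()):
    raise ValueError(f"Prompt failed scanning: {input_scores}")

# 2. Call your LLM with the sanitized prompt (placeholder below).
response_text = call_your_llm(sanitized_prompt)

# 3. Scan the response before it reaches the user.
sanitized_response, output_valid, output_scores = scan_output(
    output_scanners, sanitized_prompt, response_text
)
```

Each scan call returns the sanitized text, a per-scanner pass/fail dict, and a per-scanner risk score, so you can decide whether to block, log, or let the request through.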
Install with pip install llm-guard. Requires Python >=3.10, <3.13. For GPU-accelerated inference, install with pip install llm-guard[onnxruntime-gpu].

Use scan_prompt() to process user input through your chosen input scanners, send the sanitized prompt to your LLM, then use scan_output() to validate the response.
When to use LLM Guard
LLM Guard fits teams that want open-source, self-hosted guardrails for LLM applications. Since it runs locally, your data never leaves your infrastructure.
The modular scanner design means you can start with just prompt injection detection and add PII anonymization or toxicity filtering later. Each scanner works independently, so adding one doesn’t affect the others.
The library works with any LLM provider because it scans the text, not the model. Whether you use OpenAI, Anthropic, local models, or a mix, the same scanners apply.
For a broader look at AI and LLM security, read our AI security guide. For a different approach to LLM safety that includes dialog flow control, look at NeMo Guardrails. For red teaming and adversarial testing rather than runtime protection, consider Garak or PyRIT. Lakera Guard offers similar scanner functionality as a managed cloud API.
