LLM Guard is an open-source AI security toolkit by Protect AI that scans LLM inputs and outputs for security and compliance risks. It has 2.5k GitHub stars and 342 forks on GitHub.

The library is MIT licensed and requires Python 3.10+. Protect AI, the company behind LLM Guard, also develops Guardian and ModelScan for ML supply chain security. The latest release is v0.3.16.
I use LLM Guard when I need to sit a policy layer in front of a chatbot before it talks to a user. The input scanners catch prompt injection, PII, and banned topics. The output scanners catch refusals, toxic output, and sensitive leakage. It is the most complete free guardrail library I have tried, and it runs offline without calling back to a vendor API.
Quick Pick
- Self-hosted, offline, free? โ LLM Guard (this page)
- Managed cloud API with SLA? โ Lakera Guard
- Dialog flow control with Colang DSL? โ NeMo Guardrails
- Red-team / adversarial testing (not runtime)? โ Garak or PyRIT
What is LLM Guard?
LLM Guard sits between your application and its language model. It runs 15 input scanners on user prompts before they reach the model, and 20 output scanners on the model’s responses before they reach the user.
Each scanner handles a specific risk: prompt injection, PII exposure, toxic language, secrets in code, and more. Scanners are modular.
You pick which ones you need and configure them independently.
The library works with any language model since it processes text, not model internals. It also ships with an API server mode for language-agnostic deployments.
LLM Guard Key Features
| Feature | Details |
|---|---|
| Input Scanners | 15 scanners: Anonymize, BanCode, BanCompetitors, BanSubstrings, BanTopics, Code, Gibberish, InvisibleText, Language, PromptInjection, Regex, Secrets, Sentiment, TokenLimit, Toxicity |
| Output Scanners | 20 scanners: BanCompetitors, BanSubstrings, BanTopics, Bias, Code, Deanonymize, JSON, Language, LanguageSame, MaliciousURLs, NoRefusal, ReadingTime, FactualConsistency, Gibberish, Regex, Relevance, Sensitive, Sentiment, Toxicity, URLReachability |
| PII Handling | Anonymize scanner replaces PII in prompts; Deanonymize restores it in outputs when appropriate |
| Prompt Injection | Dedicated scanner using ML models to detect direct and indirect injection attempts |
| Secrets Detection | Identifies API keys, passwords, and credentials in both inputs and outputs |
| Factual Consistency | Output scanner that checks whether responses are consistent with provided context |
| License | MIT โ fully open-source |
| Python Support | Requires Python >=3.10, <3.13 |
LLM Guard Input Scanners
LLM Guard ships 15 input scanners that process user prompts before they reach the LLM:
- Anonymize โ replaces PII (names, emails, phone numbers, credit card numbers) with placeholders (addresses OWASP LLM02: Sensitive Information Disclosure)
- BanCode โ blocks prompts containing code snippets
- BanCompetitors โ filters mentions of specified competitor names
- BanSubstrings โ blocks prompts containing specific text patterns
- BanTopics โ prevents prompts about restricted subjects
- Code โ detects code content in prompts
- Gibberish โ identifies nonsensical or garbled input
- InvisibleText โ detects hidden Unicode characters used in prompt injection
- Language โ enforces language restrictions on input
- PromptInjection โ detects direct and indirect injection attacks using ML models (addresses OWASP LLM01: Prompt Injection)
- Regex โ pattern-based filtering with custom regular expressions
- Secrets โ identifies API keys, passwords, and credentials
- Sentiment โ analyzes emotional tone of input
- TokenLimit โ enforces maximum token count
- Toxicity โ filters harmful, offensive, or abusive language
LLM Guard Output Scanners
LLM Guard’s 20 output scanners validate and filter model responses:
- FactualConsistency โ checks whether the response is consistent with the provided context
- Bias โ detects biased or discriminatory content in responses
- Deanonymize โ restores PII that was anonymized in the input stage
- JSON โ validates JSON structure and schema compliance
- MaliciousURLs โ blocks links to known malicious sites
- NoRefusal โ flags when the model refuses to answer without good reason
- Relevance โ checks whether the response matches the original query
- URLReachability โ verifies that URLs in responses actually resolve
- ReadingTime โ estimates the reading time of responses
- LanguageSame โ verifies the response is in the same language as the input
Getting Started with LLM Guard
pip install llm-guard. Requires Python 3.10 or higher. For GPU-accelerated inference, install with pip install llm-guard[onnxruntime-gpu].scan_prompt() to process user input through your chosen input scanners, send the sanitized prompt to your LLM, then use scan_output() to validate the response.A minimal end-to-end scan looks like this:
from llm_guard import scan_prompt, scan_output
from llm_guard.input_scanners import Anonymize, PromptInjection, Toxicity
from llm_guard.output_scanners import Deanonymize, Sensitive
from llm_guard.vault import Vault
vault = Vault()
input_scanners = [Anonymize(vault), Toxicity(), PromptInjection()]
output_scanners = [Deanonymize(vault), Sensitive()]
prompt = "Hi, my name is John Doe and my credit card is 4242-4242-4242-4242."
sanitized_prompt, results_valid, results_score = scan_prompt(input_scanners, prompt)
if any(not result for result in results_valid.values()):
raise ValueError(f"Prompt blocked: {results_valid}")
response = your_llm_call(sanitized_prompt) # OpenAI, Anthropic, local, anything
sanitized_response, results_valid, results_score = scan_output(
output_scanners, sanitized_prompt, response
)
Anonymize replaces the PII with placeholders before the prompt hits the LLM. Deanonymize restores the original values in the response when it is safe to do so. The Vault keeps the mapping between placeholders and real values.
LLM Guard vs other LLM security tools
| LLM Guard | NeMo Guardrails | Lakera Guard | Garak | |
|---|---|---|---|---|
| Purpose | Runtime input/output scanning | Runtime dialog + content rails | Runtime scanning (cloud API) | Offline red-team / adversarial testing |
| Deployment | Self-hosted (pip, Docker) | Self-hosted (pip, Docker) | Managed cloud API | CLI, offline |
| License | MIT (free) | Apache-2.0 (free) | Commercial (free tier) | Apache-2.0 (free) |
| Prompt injection scanner | Yes (ML-based) | Yes (via rail config) | Yes (proprietary model) | N/A (red-team, not runtime) |
| PII / anonymization | Yes (15 input scanners) | Via Colang custom rails | Yes | No |
| Dialog flow control | No | Yes (Colang DSL) | No | No |
| Offline / air-gapped | Yes | Yes | No (cloud API) | Yes |
| Any LLM provider | Yes (text-only) | Yes | Yes | Yes |
| Best for | Self-hosted policy layer | Complex dialog rules | Managed SaaS guardrails | Security testing before launch |
Release note โ v0.3.16: The current LLM Guard release (v0.3.16) ships 15 input scanners and 20 output scanners, with expanded PromptInjection model updates and improved performance on Python 3.12. Check the GitHub releases page for the latest version. Protect AI ships multiple minor releases per quarter.

When to use LLM Guard
LLM Guard fits teams that want open-source, self-hosted guardrails for LLM applications. Since it runs locally, your data never leaves your infrastructure.
The modular scanner design means you can start with just prompt injection detection and add PII anonymization or toxicity filtering later. Each scanner works independently, so adding one doesn’t affect the others.
The library works with any LLM provider because it scans the text, not the model. Whether you use OpenAI, Anthropic, local models, or a mix, the same scanners apply.
For a broader look at AI and LLM security, read the AI security guide. For a different approach to LLM safety that includes dialog flow control, look at NeMo Guardrails.
For red teaming and adversarial testing rather than runtime protection, consider Garak or PyRIT. Lakera Guard offers similar scanner functionality as a managed cloud API.