Skip to content
LLM Guard

LLM Guard

Category: AI Security
License: Free (Open-Source)
Suphi Cankurt
Suphi Cankurt
+7 Years in AppSec
Updated April 14, 2026
6 min read
Key Takeaways
  • Open-source LLM security toolkit with 15 input scanners and 20 output scanners
  • Detects prompt injection, PII leaks, toxic outputs, and data leakage
  • MIT licensed, 2.5k GitHub stars, deployable as a standalone API server
  • Works with any LLM provider โ€” not locked to a specific vendor

LLM Guard is an open-source AI security toolkit by Protect AI that scans LLM inputs and outputs for security and compliance risks. It has 2.5k GitHub stars and 342 forks on GitHub.

LLM Guard architecture: application sends prompts through Input Controls (anonymization, prompt injection, PII, toxicity) to the LLM, and responses pass through Output Controls before returning to the app

The library is MIT licensed and requires Python 3.10+. Protect AI, the company behind LLM Guard, also develops Guardian and ModelScan for ML supply chain security. The latest release is v0.3.16.

I use LLM Guard when I need to sit a policy layer in front of a chatbot before it talks to a user. The input scanners catch prompt injection, PII, and banned topics. The output scanners catch refusals, toxic output, and sensitive leakage. It is the most complete free guardrail library I have tried, and it runs offline without calling back to a vendor API.

Quick Pick

  • Self-hosted, offline, free? โ†’ LLM Guard (this page)
  • Managed cloud API with SLA? โ†’ Lakera Guard
  • Dialog flow control with Colang DSL? โ†’ NeMo Guardrails
  • Red-team / adversarial testing (not runtime)? โ†’ Garak or PyRIT

What is LLM Guard?

LLM Guard sits between your application and its language model. It runs 15 input scanners on user prompts before they reach the model, and 20 output scanners on the model’s responses before they reach the user.

Each scanner handles a specific risk: prompt injection, PII exposure, toxic language, secrets in code, and more. Scanners are modular.

You pick which ones you need and configure them independently.

The library works with any language model since it processes text, not model internals. It also ships with an API server mode for language-agnostic deployments.

Input Scanning
15 scanners that filter user prompts before they reach your LLM. Covers prompt injection, PII anonymization, secrets detection, toxicity, banned topics, invisible text, and more.
Output Scanning
20 scanners that validate model responses. Checks for bias, malicious URLs, factual consistency, sensitive data leaks, toxicity, and relevance to the original query.
API Server
Deploy LLM Guard as a standalone HTTP API. Integrates with any language or framework, not just Python. Available via Docker for production deployments.

LLM Guard Key Features

FeatureDetails
Input Scanners15 scanners: Anonymize, BanCode, BanCompetitors, BanSubstrings, BanTopics, Code, Gibberish, InvisibleText, Language, PromptInjection, Regex, Secrets, Sentiment, TokenLimit, Toxicity
Output Scanners20 scanners: BanCompetitors, BanSubstrings, BanTopics, Bias, Code, Deanonymize, JSON, Language, LanguageSame, MaliciousURLs, NoRefusal, ReadingTime, FactualConsistency, Gibberish, Regex, Relevance, Sensitive, Sentiment, Toxicity, URLReachability
PII HandlingAnonymize scanner replaces PII in prompts; Deanonymize restores it in outputs when appropriate
Prompt InjectionDedicated scanner using ML models to detect direct and indirect injection attempts
Secrets DetectionIdentifies API keys, passwords, and credentials in both inputs and outputs
Factual ConsistencyOutput scanner that checks whether responses are consistent with provided context
LicenseMIT โ€” fully open-source
Python SupportRequires Python >=3.10, <3.13

LLM Guard Input Scanners

LLM Guard ships 15 input scanners that process user prompts before they reach the LLM:

  • Anonymize โ€” replaces PII (names, emails, phone numbers, credit card numbers) with placeholders (addresses OWASP LLM02: Sensitive Information Disclosure)
  • BanCode โ€” blocks prompts containing code snippets
  • BanCompetitors โ€” filters mentions of specified competitor names
  • BanSubstrings โ€” blocks prompts containing specific text patterns
  • BanTopics โ€” prevents prompts about restricted subjects
  • Code โ€” detects code content in prompts
  • Gibberish โ€” identifies nonsensical or garbled input
  • InvisibleText โ€” detects hidden Unicode characters used in prompt injection
  • Language โ€” enforces language restrictions on input
  • PromptInjection โ€” detects direct and indirect injection attacks using ML models (addresses OWASP LLM01: Prompt Injection)
  • Regex โ€” pattern-based filtering with custom regular expressions
  • Secrets โ€” identifies API keys, passwords, and credentials
  • Sentiment โ€” analyzes emotional tone of input
  • TokenLimit โ€” enforces maximum token count
  • Toxicity โ€” filters harmful, offensive, or abusive language

LLM Guard Output Scanners

LLM Guard’s 20 output scanners validate and filter model responses:

  • FactualConsistency โ€” checks whether the response is consistent with the provided context
  • Bias โ€” detects biased or discriminatory content in responses
  • Deanonymize โ€” restores PII that was anonymized in the input stage
  • JSON โ€” validates JSON structure and schema compliance
  • MaliciousURLs โ€” blocks links to known malicious sites
  • NoRefusal โ€” flags when the model refuses to answer without good reason
  • Relevance โ€” checks whether the response matches the original query
  • URLReachability โ€” verifies that URLs in responses actually resolve
  • ReadingTime โ€” estimates the reading time of responses
  • LanguageSame โ€” verifies the response is in the same language as the input
Playground available
Protect AI hosts an interactive playground on Hugging Face Spaces where you can test LLM Guard scanners without installing anything. Visit the LLM Guard Playground to try it out.

Getting Started with LLM Guard

1
Install the library โ€” Run pip install llm-guard. Requires Python 3.10 or higher. For GPU-accelerated inference, install with pip install llm-guard[onnxruntime-gpu].
2
Choose your scanners โ€” Import the input and output scanners you need. Each scanner is independent, so you only load what you use.
3
Scan inputs and outputs โ€” Use scan_prompt() to process user input through your chosen input scanners, send the sanitized prompt to your LLM, then use scan_output() to validate the response.
4
Deploy as API (optional) โ€” For non-Python environments, deploy LLM Guard as a standalone API server. The API server wraps all scanner functionality behind HTTP endpoints.

A minimal end-to-end scan looks like this:

from llm_guard import scan_prompt, scan_output
from llm_guard.input_scanners import Anonymize, PromptInjection, Toxicity
from llm_guard.output_scanners import Deanonymize, Sensitive
from llm_guard.vault import Vault

vault = Vault()
input_scanners = [Anonymize(vault), Toxicity(), PromptInjection()]
output_scanners = [Deanonymize(vault), Sensitive()]

prompt = "Hi, my name is John Doe and my credit card is 4242-4242-4242-4242."
sanitized_prompt, results_valid, results_score = scan_prompt(input_scanners, prompt)

if any(not result for result in results_valid.values()):
    raise ValueError(f"Prompt blocked: {results_valid}")

response = your_llm_call(sanitized_prompt)  # OpenAI, Anthropic, local, anything

sanitized_response, results_valid, results_score = scan_output(
    output_scanners, sanitized_prompt, response
)

Anonymize replaces the PII with placeholders before the prompt hits the LLM. Deanonymize restores the original values in the response when it is safe to do so. The Vault keeps the mapping between placeholders and real values.

LLM Guard vs other LLM security tools

LLM GuardNeMo GuardrailsLakera GuardGarak
PurposeRuntime input/output scanningRuntime dialog + content railsRuntime scanning (cloud API)Offline red-team / adversarial testing
DeploymentSelf-hosted (pip, Docker)Self-hosted (pip, Docker)Managed cloud APICLI, offline
LicenseMIT (free)Apache-2.0 (free)Commercial (free tier)Apache-2.0 (free)
Prompt injection scannerYes (ML-based)Yes (via rail config)Yes (proprietary model)N/A (red-team, not runtime)
PII / anonymizationYes (15 input scanners)Via Colang custom railsYesNo
Dialog flow controlNoYes (Colang DSL)NoNo
Offline / air-gappedYesYesNo (cloud API)Yes
Any LLM providerYes (text-only)YesYesYes
Best forSelf-hosted policy layerComplex dialog rulesManaged SaaS guardrailsSecurity testing before launch

Release note โ€” v0.3.16: The current LLM Guard release (v0.3.16) ships 15 input scanners and 20 output scanners, with expanded PromptInjection model updates and improved performance on Python 3.12. Check the GitHub releases page for the latest version. Protect AI ships multiple minor releases per quarter.

LLM Guard Python API: importing PromptInjection, Toxicity, and Anonymize scanners, running scan_prompt(), and seeing scanner results with INVALID score 0.94 blocking the input

When to use LLM Guard

LLM Guard fits teams that want open-source, self-hosted guardrails for LLM applications. Since it runs locally, your data never leaves your infrastructure.

The modular scanner design means you can start with just prompt injection detection and add PII anonymization or toxicity filtering later. Each scanner works independently, so adding one doesn’t affect the others.

The library works with any LLM provider because it scans the text, not the model. Whether you use OpenAI, Anthropic, local models, or a mix, the same scanners apply.

Best for
Teams that need self-hosted, open-source input/output scanning for LLM applications with fine-grained control over which security checks to apply.

For a broader look at AI and LLM security, read the AI security guide. For a different approach to LLM safety that includes dialog flow control, look at NeMo Guardrails.

For red teaming and adversarial testing rather than runtime protection, consider Garak or PyRIT. Lakera Guard offers similar scanner functionality as a managed cloud API.

Frequently Asked Questions

What is LLM Guard?
LLM Guard is an open-source security toolkit by Protect AI that provides input and output scanners for LLM applications. It has 2.5k GitHub stars, offers 15 input scanners and 20 output scanners, and is MIT licensed.
Is LLM Guard free to use?
Yes, LLM Guard is free and open-source under the MIT license. Install it via pip (requires Python 3.10+) and deploy it as a standalone API server or integrate it directly into your Python application.
Does LLM Guard protect against prompt injection?
Yes, LLM Guard includes a dedicated PromptInjection scanner that detects direct and indirect injection attacks. Additional input scanners handle jailbreak detection, invisible text detection, and content filtering.
What LLM vulnerabilities does LLM Guard address?
LLM Guard covers prompt injection, PII anonymization, toxicity filtering, secrets detection, malicious URL blocking, bias detection, factual consistency checking, and data leakage prevention through its modular scanner architecture.