Guardrails AI

Category: AI Security
License: Free (Open-Source) and Commercial
Suphi Cankurt
AppSec Enthusiast
Updated April 3, 2026
5 min read
Key Takeaways
  • Open-source Python framework (Apache 2.0, 6.6k GitHub stars) for building input/output guards that detect, quantify, and mitigate risks in LLM applications.
  • Guardrails Hub provides a library of pre-built validators — covering toxicity, PII, hallucination, profanity, bias, and more — that can be composed into multi-validator guards.
  • Supports structured data generation from LLMs with schema validation, function calling, and prompt optimization for reliable JSON/XML output.
  • Dual model: free open-source core for self-hosting, plus Guardrails Pro managed service with hosted validation, observability dashboards, and enterprise support.

Guardrails AI is an open-source Python framework for validating LLM inputs and outputs using composable validators from Guardrails Hub, covering risks like toxicity, PII leaks, hallucinations, and bias. The project falls in the AI security category and has 6.6k stars and 561 forks on GitHub.

The framework is Apache 2.0 licensed and maintained by the Guardrails AI team. The latest release is v0.9.2 (March 2026). The project also offers a partnership course with Andrew Ng on building production-ready, failure-resistant AI applications.

Guardrails AI follows a dual model: a free open-source core for self-hosting, and Guardrails Pro as a commercial managed service with hosted validation, observability, and enterprise support. The project has recently evolved with a new product called Snowglobe for synthetic data generation and dynamic evaluation datasets.

Snowglobe
Snowglobe is Guardrails AI’s newer product for synthetic data generation. It creates realistic user personas for fine-tuning and evaluation, dynamic evaluation datasets targeting edge cases, and runtime guardrails detecting policy violations and data leakage. Masterclass’s Head of AI, Aman Gupta, describes the synthetic user personas as notably realistic compared to other synthetic data approaches.

What is Guardrails AI?

Guardrails AI intercepts LLM inputs and outputs, running configurable validation checks called “validators” to catch risks before they reach users. The design principle is composability — individual validators handle specific risks, and multiple validators combine into guards that run checks together.
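The composition principle can be sketched in a few lines of plain Python. This is an illustration of the concept only, not the Guardrails AI API; the Validator, Guard, NoPII, and MaxLength names here are hypothetical stand-ins.

```python
# Illustrative sketch of the validator/guard composition pattern.
# Hypothetical classes for illustration; not the Guardrails AI API.

class Validator:
    """A single check that returns a list of failure messages (empty = pass)."""
    def validate(self, text: str) -> list[str]:
        raise NotImplementedError

class NoPII(Validator):
    def validate(self, text: str) -> list[str]:
        # Toy check: flag anything that looks like an email address.
        return ["possible email address"] if "@" in text else []

class MaxLength(Validator):
    def __init__(self, limit: int):
        self.limit = limit
    def validate(self, text: str) -> list[str]:
        return [f"exceeds {self.limit} chars"] if len(text) > self.limit else []

class Guard:
    """Runs every validator on the same text and aggregates their failures."""
    def __init__(self, *validators: Validator):
        self.validators = validators
    def check(self, text: str) -> list[str]:
        failures = []
        for v in self.validators:
            failures.extend(v.validate(text))
        return failures

guard = Guard(NoPII(), MaxLength(80))
print(guard.check("contact me at alice@example.com"))  # flags the email
print(guard.check("hello"))  # -> []
```

Each validator stays independently testable, and the guard is just a list of them run in sequence, which is what makes the pattern easy to extend.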

The Guardrails Hub is the validator library. It contains pre-built validators covering toxicity detection, PII anonymization, hallucination detection, profanity filtering, bias detection, logical consistency checks, and more. Each validator is independently testable and deployable, and the hub continues to grow with community contributions.

Beyond safety, the framework handles structured data generation. When you need an LLM to produce valid JSON, XML, or data matching a specific schema, Guardrails AI validates the output structure and re-prompts the model when validation fails. This cuts down on the brittle parsing code that usually surrounds LLM integrations.

Input/Output Guards
Guards intercept LLM traffic and run validator chains on inputs and outputs. Input guards block risky prompts before they reach the model. Output guards catch problematic responses before they reach users. Guards are composable — stack as many validators as needed.
Guardrails Hub
A growing library of pre-built validators for specific risk types: toxicity, PII, hallucination, profanity, bias, logical consistency, and more. Each validator is independently configurable and can be combined with others into multi-validator guards.
Structured Output
Validates LLM outputs against JSON schemas, data types, and custom constraints. When validation fails, the framework can automatically re-prompt the model with corrective instructions, reducing manual retry logic in application code.

Key Features

  • Core concept: Composable validators organized into input/output guards
  • Validator library: Guardrails Hub with pre-built validators for common risk types
  • Risk detection: Toxicity, PII, hallucination, profanity, bias, logical consistency
  • Structured data: JSON schema validation, function calling, prompt optimization
  • Re-prompting: Automatic re-prompting when output validation fails
  • API server: Standalone Flask-based REST API for language-agnostic deployment
  • Multi-validator: Compose multiple validators into a single guard for comprehensive checks
  • Observability: Guardrails Pro dashboards for monitoring validation metrics
  • Language: Python (JavaScript support available)
  • License: Apache 2.0 (open-source core)
  • GitHub: 6.6k stars, 561 forks, 3,186 commits

Validators and guards

The building blocks are validators and guards. A validator is a single check — does this text contain PII? Is this response toxic? Does this JSON match the expected schema? A guard combines multiple validators into a pipeline that runs on every LLM interaction.

Guards can run on inputs (before the LLM processes the prompt), on outputs (after the LLM generates a response), or both. When a validator flags a violation, the guard can block the response, return a default value, or trigger re-prompting to get a corrected output.
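The three failure actions above can be sketched in plain Python. The OnFail enum and apply_guard function are hypothetical names used for illustration, not the library's API.

```python
# Sketch of per-validator failure actions: block, return a default, or re-prompt.
# Hypothetical names for illustration; not the Guardrails AI API.
from enum import Enum

class OnFail(Enum):
    BLOCK = "block"
    DEFAULT = "default"
    REPROMPT = "reprompt"

def apply_guard(text, validator, on_fail, default="[redacted]", reprompt=None):
    """Run one validator and handle a failure per the configured action."""
    if validator(text):  # validator returns True when the text passes
        return text
    if on_fail is OnFail.BLOCK:
        raise ValueError("guard blocked the response")
    if on_fail is OnFail.DEFAULT:
        return default
    # REPROMPT: ask the model again (here, a caller-supplied function)
    return reprompt(text)

no_profanity = lambda t: "darn" not in t
print(apply_guard("all good", no_profanity, OnFail.BLOCK))   # passes through
print(apply_guard("darn it", no_profanity, OnFail.DEFAULT))  # -> "[redacted]"
print(apply_guard("darn it", no_profanity, OnFail.REPROMPT,
                  reprompt=lambda t: "a cleaner answer"))
```

Configuring the action per validator, rather than globally, is what lets one guard block on PII leaks while merely re-prompting on formatting errors.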

Guardrails Hub

The hub is where the ecosystem lives. Each validator addresses a specific risk category, and the collection covers the most common LLM failure modes: generating toxic content, leaking PII from training data, hallucinating facts, producing biased responses, or failing to match structural requirements.

Validators from the hub are installable via pip and work with the same guard composition pattern. Community-contributed validators extend coverage to domain-specific risks.

Structured data generation

A practical problem Guardrails AI solves: getting LLMs to produce valid structured data. Instead of writing brittle regex parsers or hoping the model follows instructions, the framework validates output against schemas and re-prompts automatically when the structure is wrong.

This cuts the “parsing tax” in LLM applications where developers spend too much time handling malformed model outputs.
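The validate-and-re-prompt loop can be sketched with nothing beyond the standard library. This is a conceptual illustration, not the framework's internals; ask_llm is a stand-in for any model call, and the required-keys check stands in for a real schema.

```python
# Conceptual sketch of schema validation with automatic re-prompting.
# ask_llm and REQUIRED_KEYS are hypothetical stand-ins for illustration.
import json

REQUIRED_KEYS = {"name", "age"}

def validate(raw: str):
    """Return (parsed dict, None) if the output matches, else (None, error)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return None, f"invalid JSON: {e}"
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        return None, f"missing keys: {sorted(missing)}"
    return data, None

def generate_structured(ask_llm, prompt, max_retries=2):
    """Call the model, validate, and re-prompt with the error until it passes."""
    for _ in range(max_retries + 1):
        raw = ask_llm(prompt)
        data, error = validate(raw)
        if error is None:
            return data
        prompt = f"{prompt}\nYour previous output failed validation ({error}). Return valid JSON."
    raise RuntimeError("validation still failing after retries")

# Stub model: fails once (missing "age"), then returns valid JSON.
responses = iter(['{"name": "Ada"}', '{"name": "Ada", "age": 36}'])
result = generate_structured(lambda p: next(responses), "Return a user record")
print(result)  # {'name': 'Ada', 'age': 36}
```

Feeding the concrete validation error back into the prompt is what makes the retry corrective rather than a blind repeat.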

Guardrails Index benchmark
In February 2025, Guardrails AI launched the Guardrails Index — the first benchmark comparing the performance and latency of 24 guardrails across 6 common risk categories. This helps teams evaluate which validators perform best for their specific use cases.

Getting Started

1. Install the framework: run pip install guardrails-ai. The core library requires Python 3.9+. For JavaScript applications, a JS client is also available.
2. Browse Guardrails Hub: visit the Guardrails Hub to find validators for your risk categories. Install validators via pip; each is an independent package.
3. Create a guard: compose selected validators into a guard object. Configure thresholds and failure actions (block, default value, re-prompt) for each validator.
4. Wrap your LLM calls: use the guard to wrap your existing LLM calls. The guard automatically runs input validators before the call and output validators after, handling violations according to your configuration.
5. Deploy as an API (optional): for production deployments, run Guardrails as a standalone API server using the built-in Flask wrapper, or upgrade to Guardrails Pro for managed hosting and observability.
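The wrapping step amounts to running an input pass before the model call and an output pass after it. A minimal plain-Python sketch of that shape, with hypothetical names (wrap_llm and the toy guards are illustrations, not the library's API):

```python
# Sketch of wrapping an LLM call between input and output guards.
# wrap_llm and the toy guard functions are hypothetical, for illustration.

def wrap_llm(llm, input_guard, output_guard):
    """Return a callable that guards every call to the underlying model."""
    def guarded(prompt: str) -> str:
        if not input_guard(prompt):
            raise ValueError("input guard rejected the prompt")
        response = llm(prompt)
        if not output_guard(response):
            raise ValueError("output guard rejected the response")
        return response
    return guarded

# Stub model plus toy guards (real guards would run hub validators).
llm = lambda p: f"echo: {p}"
no_secrets_in = lambda t: "password" not in t.lower()
short_enough_out = lambda t: len(t) < 200

guarded_llm = wrap_llm(llm, no_secrets_in, short_enough_out)
print(guarded_llm("hello"))  # -> "echo: hello"
```

Because the wrapper has the same call signature as the raw model, existing application code can switch to the guarded version without restructuring.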

When to use Guardrails AI

Guardrails AI fits teams that want modular, composable validation for LLM applications without being locked into a specific LLM provider or deployment model. The open-source core means you can self-host everything, inspect the code, and contribute validators back to the community.

The framework is particularly good for applications that need structured output validation. If your LLM needs to produce JSON, fill forms, or generate data matching specific schemas, the built-in re-prompting logic handles the reliability work that would otherwise require custom retry code.

For teams that want to start simple and scale, the path from a single validator to a multi-validator guard to Guardrails Pro managed service is incremental — you don’t need to rearchitect as requirements grow.

Best for
Teams building LLM applications that need composable, open-source input/output validation with a growing validator ecosystem — especially when structured data generation and automatic re-prompting are important.

For a broader overview of AI security tools, see the AI security tools guide. For dialog flow control and multi-turn conversation safety, NeMo Guardrails offers capabilities Guardrails AI doesn’t cover.

For lightweight, self-hosted input/output scanning, LLM Guard takes a scanner-based approach. For OpenAI-specific agent guardrails with tripwire patterns, see OpenAI Guardrails.

Frequently Asked Questions

What is Guardrails AI?
Guardrails AI is an open-source Python framework for adding input and output validation to LLM applications. It uses composable validators from Guardrails Hub to detect risks like toxicity, PII leaks, hallucinations, and bias. The framework has 6.6k GitHub stars and is Apache 2.0 licensed.
Is Guardrails AI free?
The core framework is free and open-source under the Apache 2.0 license. Guardrails also offers Guardrails Pro, a commercial managed service that adds hosted validation, observability dashboards, and enterprise support for teams that don’t want to self-host.
What is Guardrails Hub?
Guardrails Hub is a collection of pre-built validators that each detect a specific type of risk — toxicity, PII, profanity, hallucination, bias, logical consistency, and many more. Validators can be combined into multi-validator guards that run multiple checks on each LLM input or output.
How does Guardrails AI compare to NeMo Guardrails?
Guardrails AI focuses on composable input/output validation with a large validator library. NeMo Guardrails by NVIDIA adds dialog flow control through its Colang language, modeling multi-turn conversations. Guardrails AI is better for request-level validation; NeMo Guardrails is better when you need conversation-level control.