NeMo Guardrails 2026: NVIDIA's LLM Safety Toolkit

NVIDIA NeMo Guardrails is an open-source AI security toolkit for adding programmable guardrails to LLM-based conversational applications. It has 6.5k GitHub stars and 597 forks.

NeMo Guardrails architecture showing how programmable guardrails intercept LLM inputs and outputs

The project is Apache 2.0 licensed and maintained by NVIDIA. The latest release is v0.22.0 (May 2025).

It requires Python 3.10-3.13. The framework was published as a research paper at EMNLP 2023 (arXiv:2310.10501).

What is NeMo Guardrails?

NeMo Guardrails intercepts the traffic between your application and its LLM, applying configurable safety checks that block or modify content based on defined policies. The main difference from other guardrail tools is dialog management.

Most solutions only filter individual inputs and outputs. NeMo Guardrails models entire conversation flows.

The toolkit uses Colang, a domain-specific language purpose-built for defining conversational guardrails.

Colang lets you declaratively specify what conversations should look like, which topics are allowed, and how the system should respond to specific user intents. Both Colang 1.0 and the newer Colang 2.0 are supported.

Five rail types cover different stages of the LLM interaction: input rails process user messages, dialog rails control conversation flow, retrieval rails filter knowledge base results, execution rails gate tool/action calls, and output rails validate responses.

Colang Language

A Python-like domain-specific language for defining dialog flows and guardrail policies. Supports both v1.0 and v2.0 syntax with flows, events, actions, and LLM integration.

Five Rail Types

Input, dialog, retrieval, execution, and output rails cover every stage of the LLM interaction. Dialog rails are unique to NeMo Guardrails, controlling how conversations flow across turns.

Provider Support

Works with OpenAI, Azure, Anthropic, HuggingFace, NVIDIA NIM, and other LLM providers. Integrates with LangChain, LangGraph, and third-party security services.

What are NVIDIA NeMo Guardrails’s key features?

Feature	Details
Rail Types	Input, dialog, retrieval, execution, output
Colang Versions	1.0 (default) and 2.0
Input Rails	Jailbreak detection, prompt injection, content moderation, intent classification
Output Rails	Fact-checking, hallucination detection, sensitive data blocking, response validation
Dialog Rails	Topic boundaries, conversation flows, canonical forms, branching logic
LLM Providers	OpenAI, Azure, Anthropic, HuggingFace, NVIDIA NIM, LLaMA, Falcon, Vicuna, Mosaic
Frameworks	LangChain, LangGraph, custom chains
Deployment	Python API, FastAPI server, Docker, NeMo Microservice
Observability	OpenTelemetry tracing, structured logging, performance evaluation
License	Apache 2.0

Input Rails

Input rails process user messages before they reach the LLM. They handle jailbreak detection, prompt injection filtering, content moderation, and intent classification.

You can use NVIDIA’s built-in safety models or plug in third-party providers.

NeMo Guardrails Python REPL showing an input rail blocking a jailbreak prompt injection attempt before it reaches the LLM

Dialog Rails

Dialog rails are what make NeMo Guardrails different from input/output-only tools. They model the intended flow of conversations using Colang, keeping the LLM on track across multiple turns.

If a user tries to steer the conversation off-topic or into restricted areas, dialog rails redirect it back.

NeMo Guardrails flow diagram showing input, dialog, retrieval, execution, and output rails

Retrieval and Execution Rails

Retrieval rails filter and validate knowledge base results before they reach the LLM, reducing hallucination risk. Execution rails gate tool calls and actions, preventing the LLM from triggering operations it shouldn’t.

Output Rails

Output rails validate LLM responses before they reach the user. Built-in checks include fact-checking against knowledge bases, hallucination detection, sensitive data blocking, and response quality validation.

Note

Third-party integrations

NeMo Guardrails supports external security providers alongside its built-in capabilities. You can also write custom rails for your own requirements.

Colang Language

Colang is purpose-built for defining conversational guardrails. It uses a Python-like syntax where you define user intents, bot responses, and flows that connect them.

Colang 2.0 adds events, actions, and more direct LLM integration.

Example Colang 1.0 flow:

define user express greeting
  "Hello!"
  "Good afternoon!"

define flow
  user express greeting
  bot express greeting
  bot offer to help

How do I get started with NVIDIA NeMo Guardrails?

Install the package — Run pip install nemoguardrails. Requires Python 3.10-3.13 and a C++ compiler (needed for the Annoy library dependency).

Create a config directory — Set up a config/ folder with a config.yml file specifying your LLM provider and which rails to enable (input, output, dialog).

Define Colang flows — Write .co files in your config directory to define user intents, bot responses, and conversation flows that enforce your safety policies.

Run with Python or server — Use the LLMRails Python API for direct integration, or launch the built-in FastAPI server with nemoguardrails server for HTTP access.

NeMo Guardrails pricing

NeMo Guardrails is free and open-source under the Apache 2.0 license, so the toolkit itself has no list price. The cost driver is the LLM provider you point it at — input and output rails make additional model calls (e.g. for jailbreak classification or fact-checking), which incur standard provider charges from OpenAI, Anthropic, Azure, or whichever endpoint you wire in.

For NVIDIA-native deployments, NeMo Guardrails ships as a NeMo Microservice inside the NVIDIA AI Enterprise software suite. NVIDIA AI Enterprise is a paid annual subscription with sales-gated pricing — contact NVIDIA for a quote per GPU and per node. The open-source toolkit remains free regardless of whether you license the broader NVIDIA stack.

The official documentation and GitHub repository cover the open-source distribution. For a paid managed alternative, see Lakera Guard . For the wider open-source landscape, see the AI security tools hub.

When to use NeMo Guardrails

NeMo Guardrails makes sense when you need control over conversational AI behavior beyond input/output filtering. The dialog rails let you model how conversations should flow, which matters for customer service bots, enterprise assistants, or any application where multi-turn conversation control is needed.

The Colang language gives you declarative control that would require complex state management code in other frameworks. If you need to keep conversations on topic, implement branching dialog policies, or integrate fact-checking against knowledge bases, NeMo Guardrails handles these natively.

Organizations using NVIDIA’s AI ecosystem benefit from native compatibility with NVIDIA NIM models and NVIDIA AI Enterprise.

Tip

Best for

Teams building conversational AI applications that need dialog flow control, topic boundaries, and multi-turn safety policies in addition to input/output filtering.

NeMo Guardrails alternatives

NeMo Guardrails is one of several open-source guardrail toolkits, each with a different positioning. The closest substitutes:

LLM Guard — open-source Python library with 15 input scanners and 20 output scanners, MIT licensed, self-hostable. A fit when you want input/output-only scanning without the dialog management overhead.
OpenAI Guardrails — MIT-licensed Python library with a tripwire mechanism, native OpenAI Agents SDK integration, and built-in PII / jailbreak / hallucination checks. A fit when your stack is OpenAI-native.
Lakera Guard — commercial managed API (Check Point company) with sub-50ms latency and 100+ language support. A fit when ops simplicity outweighs the cost of a managed service.
Promptfoo — testing-first framework with 50+ vulnerability types, real-time guardrails, and CI/CD integration. A fit when red teaming and evaluation matter as much as runtime defense.

For programmable conversational dialog flow specifically, NeMo Guardrails remains the strongest open-source option. For the wider AI security landscape, see the AI security tools hub.

For adversarial testing and red teaming, look at Garak or PyRIT .

Visit NVIDIA NeMo Guardrails View source on GitHub (6500★)

Frequently Asked Questions

What is NeMo Guardrails?

NeMo Guardrails is an open-source toolkit by NVIDIA for adding programmable guardrails to LLM-based conversational applications. It has 6.5k GitHub stars and uses the Colang domain-specific language for defining safety policies and dialog flows.

Is NeMo Guardrails free to use?

Yes, NeMo Guardrails is free and open-source under the Apache 2.0 license. It requires Python 3.10-3.13 and integrates with OpenAI, Azure, Anthropic, HuggingFace, and NVIDIA NIM models.

Does NeMo Guardrails protect against prompt injection?

Yes, NeMo Guardrails includes input rails for jailbreak detection and prompt injection protection. It also provides dialog rails that control conversation flow, preventing the LLM from being steered into unwanted territory.

What makes NeMo Guardrails different from other LLM safety tools?

NeMo Guardrails is one of the few open-source toolkits that combines safety controls with dialog management through its Colang language. It offers five rail types (input, dialog, retrieval, execution, output) and supports both Colang 1.0 and 2.0 syntax.