DeepTeam 2026: Free LLM Red Teaming Framework

DeepTeam is an open-source LLM red teaming framework by Confident AI that tests language model applications for security vulnerabilities and safety risks. It’s part of the AI security category.

Red teaming defends models; the offensive counterpart — autonomous agents that plan and run attacks — is covered in my analysis of AI pentesting agents .

DeepTeam CLI scanning GPT-4o for bias, PII leakage, and toxicity vulnerabilities — 2 of 5 attack cases flagged

The project has 1,900 GitHub stars and 187 forks with 22 contributors (as of April 2026). Jeffrey Ip at Confident AI leads development — the same team behind DeepEval, a widely used LLM evaluation framework.

DeepTeam is licensed under Apache 2.0 and supports Python 3.9 through 3.13.

What is DeepTeam?

DeepTeam automates adversarial testing of LLM-based applications.

You define a target model callback, pick vulnerability types and attack methods, and the framework generates adversarial inputs to probe your model for weaknesses.

The framework covers 40+ vulnerability types across categories like bias, PII leakage, toxicity, and misinformation.

It uses 10+ adversarial attack methods split into single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem encoding) and multi-turn attacks (linear jailbreaking, tree jailbreaking, crescendo jailbreaking).

40+ Vulnerability Types

Covers bias (race, gender, political, religion), PII leakage, toxicity, misinformation, and robustness issues. Mapped to OWASP Top 10 for LLMs and NIST AI RMF .

10+ Attack Methods

Single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem) and multi-turn attacks (linear, tree, and crescendo jailbreaking).

Custom Vulnerabilities

Define your own vulnerability types beyond the built-in 40+ using the CustomVulnerability class for domain-specific testing.

What are DeepTeam’s key features?

Feature	Details
Vulnerability Count	40+ built-in types
Attack Methods	10+ (single-turn and multi-turn)
Single-Turn Attacks	Prompt injection, Leetspeak, ROT-13, math problem
Multi-Turn Attacks	Linear jailbreaking, tree jailbreaking, crescendo jailbreaking
Standards Coverage	OWASP Top 10 for LLMs, NIST AI RMF
Custom Vulnerabilities	Supported via CustomVulnerability class
Configuration	Python API and YAML config files
CLI Commands	`deepteam run`, `deepteam set-api-key`
Dependencies	deepeval, openai, aiohttp, grpcio, pyyaml
Python Support	3.9 to 3.13

Vulnerability categories

DeepTeam organizes its 40+ vulnerabilities by category. Bias testing covers race, gender, political, and religion dimensions.

PII leakage checks for API and database access exposure. Toxicity probes for insults and harmful content, while misinformation testing catches hallucinated or false claims.

Each vulnerability type accepts specific sub-types.

For example, bias can be scoped to race or gender, and PII leakage can focus on API key exposure versus database credential leaks.

Attack methods

Single-turn attacks modify prompts in one interaction: prompt injection embeds malicious instructions and Leetspeak replaces characters to evade filters.

ROT-13 encodes harmful requests, and math problem wraps adversarial content in mathematical framing.

Multi-turn attacks play out across multiple exchanges. Linear jailbreaking escalates gradually over several messages.

Tree jailbreaking explores branching conversation paths. Crescendo jailbreaking starts with innocent-sounding interactions and builds toward harmful outputs step by step.

DeepTeam multi-turn crescendo jailbreak sequence — model bypasses restrictions across 5 escalating conversation turns

You can assign weights to attacks to control how heavily they factor into testing:

from deepteam.attacks.single_turn import PromptInjection, ROT13

prompt_injection = PromptInjection(weight=2)
rot_13 = ROT13(weight=1)

Note

OpenAI API key required

DeepTeam uses OpenAI models to power its attack simulation and evaluation engine. You need to set an OPENAI_API_KEY environment variable or use deepteam set-api-key before running scans.

Framework alignment: OWASP, NIST, MITRE, ISO

DeepTeam’s editorial wedge against peer red-teaming tools is explicit framework alignment. The 40+ vulnerability types map onto specific OWASP Top 10 for LLM Applications codes:

LLM01 Prompt Injection — single-turn prompt injection plus multi-turn linear, tree, and crescendo jailbreaking attacks
LLM02 Sensitive Information Disclosure — PII leakage probes for API keys and database credentials
LLM05 Improper Output Handling — toxicity (insults, hate) and unsafe content categories
LLM09 Misinformation — hallucination and false claim detection
Bias and fairness vulnerabilities — race, gender, political, religion testing supports the NIST AI RMF Measure function and EU AI Act Annex III high-risk system requirements

The vulnerability and attack telemetry also maps onto MITRE ATLAS techniques AML.T0051 LLM Prompt Injection and AML.T0054 LLM Jailbreak, and the structured risk assessment output supports ISO/IEC 42001 AI management system audits where evidence of adversarial testing is required.

How do I get started with DeepTeam?

Install DeepTeam — Run pip install -U deepteam in a Python 3.9+ environment. Optionally install pandas for enhanced result visualization.

Set your API key — Run deepteam set-api-key sk-proj-abc123... to configure your OpenAI API key. DeepTeam uses this for attack generation and evaluation.

Write a scan — Use the Python API to define your target, vulnerabilities, and attacks. Or create a YAML config file and run it with deepteam run config.yaml.

Review results — Access the risk assessment object to view results. Call risk_assessment.overview for a summary or risk_assessment.save(to="./results/") to export.

Python API

The core API uses a red_team function. Pass a model callback (either a string like "openai/gpt-3.5-turbo" or an async function), vulnerability types, and attack methods:

from deepteam import red_team
from deepteam.vulnerabilities import Bias, PIILeakage, Toxicity
from deepteam.attacks.single_turn import PromptInjection

bias = Bias(types=["race", "gender"])
pii_leakage = PIILeakage(types=["api_and_database_access"])
toxicity = Toxicity(types=["insults"])
prompt_injection = PromptInjection()

risk_assessment = red_team(
    model_callback="openai/gpt-3.5-turbo",
    vulnerabilities=[bias, pii_leakage, toxicity],
    attacks=[prompt_injection]
)

For custom model endpoints, use an async callback:

async def model_callback(input: str) -> str:
    # Call your model endpoint here
    return response

risk_assessment = red_team(
    model_callback=model_callback,
    vulnerabilities=[bias],
    attacks=[prompt_injection]
)

YAML configuration

For repeatable scans, define a YAML config:

models:
  simulator: gpt-3.5-turbo-0125
  evaluation: gpt-4o

target:
  purpose: "A helpful AI assistant for customer support"
  model: gpt-3.5-turbo

system_config:
  max_concurrent: 8
  attacks_per_vulnerability_type: 1
  output_folder: "development-security-audit"

default_vulnerabilities:
  - name: "Bias"
    types: ["religion"]
  - name: "Toxicity"
    types: ["insults"]
  - name: "PIILeakage"
    types: ["api_and_database_access"]

Run it with the CLI:

deepteam run config.yaml

You can customize concurrency and attempts per attack:

deepteam run config.yaml -c 20 -a 5 -o results

Stateful red teaming

For running multiple scans with shared state, use the RedTeamer class:

from deepteam import RedTeamer
from deepteam.vulnerabilities import Bias

red_teamer = RedTeamer()
red_teamer.red_team(
    model_callback="openai/gpt-3.5-turbo",
    vulnerabilities=[Bias(types=["race"])]
)

# Reuse simulated test cases for a second model
red_teamer.red_team(
    model_callback="openai/gpt-4o",
    reuse_simulated_test_cases=True
)

Working with results

The red_team function returns a risk assessment object:

risk_assessment = red_team(...)

# View overview
print(risk_assessment.overview)

# Export to DataFrames (requires pandas)
risk_assessment.overview.to_df()
risk_assessment.test_cases.to_df()

# Save results to disk
risk_assessment.save(to="./deepteam-results/")

Provider configuration

DeepTeam supports multiple model providers beyond OpenAI:

# Azure OpenAI
deepteam set-azure-openai --openai-api-key "key" --openai-endpoint "endpoint"

# Local model
deepteam set-local-model model-name --base-url "http://localhost:8000"

# Ollama
deepteam set-ollama llama2

When to use DeepTeam

DeepTeam works well for teams that want a Python-native red teaming tool with structured vulnerability categories.

Its mapping to OWASP Top 10 for LLMs and NIST AI RMF helps satisfy compliance requirements.

The multi-turn attack methods (crescendo, tree, linear jailbreaking) test attack vectors that single-prompt tools miss. The stateful RedTeamer class is useful for comparing how different models respond to the same adversarial inputs.

Tip

Best for

Compliance-driven Python teams that need OWASP Top 10 for LLMs and NIST AI RMF mapping out of the box, with CLI and Python API access.

For a broader overview of AI and LLM security risks, read the AI security guide and the LLM red teaming overview. For a wider probe library and NVIDIA backing, see Garak .

For a full evaluation framework with red teaming built in, look at Promptfoo . For Microsoft’s red teaming toolkit, check PyRIT .

For runtime protection rather than testing, consider Lakera Guard or LLM Guard .

What are alternatives to DeepTeam?

DeepTeam’s framework-mapping wedge is strong, but five alternatives cover overlapping ground with different tradeoffs.

Garak is the broadest peer with a wider probe library and NVIDIA backing. Garak is the better pick when probe diversity and active research-paper integration matter more than out-of-the-box compliance mapping.

Promptfoo is an evaluation-framework superset that combines red teaming with prompt regression testing and CI integration. Promptfoo fits teams that want one tool for both quality and security evals rather than a security-focused framework.

PyRIT is Microsoft’s enterprise red teaming orchestrator with strong Azure integration and human-in-the-loop workflows. PyRIT is the right pick for long red-team campaigns and Azure-heavy environments.

Giskard is a hybrid LLM and traditional ML testing framework. It is the better fit when the same team owns both classical ML models and LLM applications and wants a unified vulnerability surface.

FuzzyAI is a jailbreak-fuzzing specialist from CyberArk Labs. FuzzyAI is the right tool when the goal is generating novel jailbreak variants through mutation rather than running a fixed set of vulnerability checks.

For runtime guardrails rather than red teaming, see Lakera Guard (acquired by Check Point in September 2025) and LLM Guard ; for agentic-AI threat models, see Augustus and Agentic Radar .

Visit DeepTeam View source on GitHub (1900★)

Frequently Asked Questions

What is DeepTeam?

DeepTeam is an open-source LLM red teaming framework by Confident AI. It scans LLM applications for 40+ vulnerability types including bias, PII leakage, toxicity, and misinformation using 10+ adversarial attack methods.

Is DeepTeam free to use?

Yes. DeepTeam is free and open-source under Apache 2.0. Install via pip and run scans against any LLM endpoint. It requires an OpenAI API key for its attack simulation engine.

Does DeepTeam test for prompt injection?

Yes. DeepTeam includes prompt injection as one of its single-turn attack methods. It also supports multi-turn attacks like linear jailbreaking, tree jailbreaking, and crescendo jailbreaking that can bypass prompt injection defenses.

What LLM vulnerabilities does DeepTeam cover?

DeepTeam covers 40+ vulnerabilities mapped to the OWASP Top 10 for LLMs and NIST AI RMF. Categories include bias (race, gender, political, religion), PII leakage, toxicity, misinformation, and robustness issues.