DeepTeam is an open-source LLM red teaming framework by Confident AI that tests language model applications for security vulnerabilities and safety risks. It’s part of the AI security category.

The project has 1,277 GitHub stars and 187 forks with 22 contributors. Jeffrey Ip at Confident AI leads development — the same team behind DeepEval, a widely used LLM evaluation framework. DeepTeam is licensed under Apache 2.0 and supports Python 3.9 through 3.13.
What is DeepTeam?
DeepTeam automates adversarial testing of LLM-based applications. You define a target model callback, pick vulnerability types and attack methods, and the framework generates adversarial inputs to probe your model for weaknesses.
The framework covers 40+ vulnerability types across categories like bias, PII leakage, toxicity, and misinformation. It uses 10+ adversarial attack methods split into single-turn attacks (prompt injection, Leetspeak, ROT-13, math problem encoding) and multi-turn attacks (linear jailbreaking, tree jailbreaking, crescendo jailbreaking).
Key Features
| Feature | Details |
|---|---|
| Vulnerability Count | 40+ built-in types |
| Attack Methods | 10+ (single-turn and multi-turn) |
| Single-Turn Attacks | Prompt injection, Leetspeak, ROT-13, math problem |
| Multi-Turn Attacks | Linear jailbreaking, tree jailbreaking, crescendo jailbreaking |
| Standards Coverage | OWASP Top 10 for LLMs, NIST AI RMF |
| Custom Vulnerabilities | Supported via CustomVulnerability class |
| Configuration | Python API and YAML config files |
| CLI Commands | deepteam run, deepteam set-api-key |
| Dependencies | deepeval, openai, aiohttp, grpcio, pyyaml |
| Python Support | 3.9 to 3.13 |
Vulnerability categories
DeepTeam organizes its 40+ vulnerabilities by category. Bias testing covers race, gender, political, and religion dimensions. PII leakage checks for API and database access exposure. Toxicity probes for insults and harmful content, while misinformation testing catches hallucinated or false claims.
Each vulnerability type accepts specific sub-types. For example, bias can be scoped to race or gender, and PII leakage can focus on API key exposure versus database credential leaks.
Attack methods
Single-turn attacks modify prompts in one interaction: prompt injection embeds malicious instructions, Leetspeak replaces characters to evade filters, ROT-13 encodes harmful requests, and math problem wraps adversarial content in mathematical framing.
Multi-turn attacks play out across multiple exchanges. Linear jailbreaking escalates gradually over several messages. Tree jailbreaking explores branching conversation paths. Crescendo jailbreaking starts with innocent-sounding interactions and builds toward harmful outputs step by step.
You can assign weights to attacks to control how heavily they factor into testing:
from deepteam.attacks.single_turn import PromptInjection, ROT13
prompt_injection = PromptInjection(weight=2)
rot_13 = ROT13(weight=1)
deepteam set-api-key before running scans.Getting Started
pip install -U deepteam in a Python 3.9+ environment. Optionally install pandas for enhanced result visualization.deepteam set-api-key sk-proj-abc123... to configure your OpenAI API key. DeepTeam uses this for attack generation and evaluation.deepteam run config.yaml.risk_assessment.overview for a summary or risk_assessment.save(to="./results/") to export.Python API
The core API uses a red_team function. Pass a model callback (either a string like "openai/gpt-3.5-turbo" or an async function), vulnerability types, and attack methods:
from deepteam import red_team
from deepteam.vulnerabilities import Bias, PIILeakage, Toxicity
from deepteam.attacks.single_turn import PromptInjection
bias = Bias(types=["race", "gender"])
pii_leakage = PIILeakage(types=["api_and_database_access"])
toxicity = Toxicity(types=["insults"])
prompt_injection = PromptInjection()
risk_assessment = red_team(
model_callback="openai/gpt-3.5-turbo",
vulnerabilities=[bias, pii_leakage, toxicity],
attacks=[prompt_injection]
)
For custom model endpoints, use an async callback:
async def model_callback(input: str) -> str:
# Call your model endpoint here
return response
risk_assessment = red_team(
model_callback=model_callback,
vulnerabilities=[bias],
attacks=[prompt_injection]
)
YAML configuration
For repeatable scans, define a YAML config:
models:
simulator: gpt-3.5-turbo-0125
evaluation: gpt-4o
target:
purpose: "A helpful AI assistant for customer support"
model: gpt-3.5-turbo
system_config:
max_concurrent: 8
attacks_per_vulnerability_type: 1
output_folder: "development-security-audit"
default_vulnerabilities:
- name: "Bias"
types: ["religion"]
- name: "Toxicity"
types: ["insults"]
- name: "PIILeakage"
types: ["api_and_database_access"]
Run it with the CLI:
deepteam run config.yaml
You can customize concurrency and attempts per attack:
deepteam run config.yaml -c 20 -a 5 -o results
Stateful red teaming
For running multiple scans with shared state, use the RedTeamer class:
from deepteam import RedTeamer
from deepteam.vulnerabilities import Bias
red_teamer = RedTeamer()
red_teamer.red_team(
model_callback="openai/gpt-3.5-turbo",
vulnerabilities=[Bias(types=["race"])]
)
# Reuse simulated test cases for a second model
red_teamer.red_team(
model_callback="openai/gpt-4o",
reuse_simulated_test_cases=True
)
Working with results
The red_team function returns a risk assessment object:
risk_assessment = red_team(...)
# View overview
print(risk_assessment.overview)
# Export to DataFrames (requires pandas)
risk_assessment.overview.to_df()
risk_assessment.test_cases.to_df()
# Save results to disk
risk_assessment.save(to="./deepteam-results/")
Provider configuration
DeepTeam supports multiple model providers beyond OpenAI:
# Azure OpenAI
deepteam set-azure-openai --openai-api-key "key" --openai-endpoint "endpoint"
# Local model
deepteam set-local-model model-name --base-url "http://localhost:8000"
# Ollama
deepteam set-ollama llama2
When to use DeepTeam
DeepTeam works well for teams that want a Python-native red teaming tool with structured vulnerability categories. Its mapping to OWASP Top 10 for LLMs and NIST AI RMF helps satisfy compliance requirements.
The multi-turn attack methods (crescendo, tree, linear jailbreaking) test attack vectors that single-prompt tools miss. The stateful RedTeamer class is useful for comparing how different models respond to the same adversarial inputs.
For a broader overview of AI and LLM security risks, read our AI security guide. For a wider probe library and NVIDIA backing, see Garak. For a full evaluation framework with red teaming built in, look at Promptfoo. For Microsoft’s red teaming toolkit, check PyRIT. For runtime protection rather than testing, consider Lakera Guard or LLM Guard.
