Skip to content
FU

FuzzyAI

NEW
Category: AI Security
License: open-source
Suphi Cankurt
Suphi Cankurt
AppSec Enthusiast
Updated March 23, 2026
6 min read
Key Takeaways
  • Open-source LLM fuzzer from CyberArk Labs that has successfully jailbroken every major AI model it was tested against.
  • 18 built-in attack techniques including ArtPrompt, DAN jailbreaks, crescendo attacks, genetic algorithm mutations, and ASCII smuggling.
  • Supports 8 target providers: OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Hugging Face, Ollama, and custom REST APIs.
  • Extensible framework lets security teams add custom attack methods for domain-specific vulnerability testing.
  • Released December 2024 under Apache 2.0 license with 1.3k+ GitHub stars.

FuzzyAI is an open-source LLM fuzzing framework from CyberArk Labs that tests large language models for jailbreak vulnerabilities using 18 built-in attack techniques. Released in December 2024 under the Apache 2.0 license, it is listed in the AI security category and has earned 1.3k+ GitHub stars.

FuzzyAI has successfully jailbroken every major AI model it was tested against. The framework implements 18 attack techniques — ranging from ASCII art obfuscation (ArtPrompt) to multi-turn conversational escalation (crescendo) and genetic algorithm prompt mutation — designed to help security teams identify guardrail bypass vulnerabilities before attackers exploit them. Unlike fixed test suites, FuzzyAI uses mutation-based fuzzing to discover novel jailbreak paths.

Key Features at a Glance

FeatureDetails
Attack Techniques18 built-in methods including ArtPrompt, DAN jailbreaks, crescendo, genetic algorithm mutation, and ASCII smuggling
Target ProvidersOpenAI, Anthropic, Gemini, Azure OpenAI, AWS Bedrock, Hugging Face, Ollama, and custom REST APIs
Fuzzing ApproachMutation-based fuzzing generates novel jailbreak paths rather than relying on fixed test cases
GUI + CLIWeb-based GUI for interactive testing and CLI for automation and CI/CD integration
ExtensibilityCustom attack method framework for domain-specific vulnerability testing
System Prompt ExtractionTests whether system prompts can be leaked through adversarial queries
Guardrail RegressionRe-run attacks after updates to verify previously blocked techniques stay blocked
LicenseApache 2.0 open-source, free to use

Overview

FuzzyAI takes a fuzzing approach to LLM security testing. Rather than checking for known vulnerabilities with fixed test cases, it generates mutated attack prompts using algorithmic techniques (genetic algorithms, conversational escalation, encoding tricks) to discover previously unknown jailbreak paths. Compared to broader LLM scanners like Garak that cover 50+ probe modules across prompt injection, hallucination, and toxicity, FuzzyAI goes deeper on jailbreak-specific fuzzing.

The tool targets the guardrails that LLM providers and deployers put in place. If a model is supposed to refuse certain requests, FuzzyAI systematically probes different ways to bypass that refusal. When it finds a technique that works, it reports the successful jailbreak along with the specific prompt that triggered it.

This approach is particularly valuable because LLM guardrails are inherently fragile. A model might correctly refuse a direct request for harmful content but comply when the same request is rephrased using ASCII art, split across multiple conversational turns, or encoded in invisible Unicode characters.

18 Attack Techniques
Built-in methods including ArtPrompt, DAN jailbreaks, crescendo attacks, genetic algorithm mutation, ASCII smuggling, many-shot jailbreaking, PAIR, ActorAttack, and more.
Multi-Provider Support
Test models from OpenAI, Anthropic, Gemini, Azure, Bedrock, Hugging Face, Ollama, and any custom REST API endpoint.
Extensible Framework
Add your own attack methods to test domain-specific vulnerabilities and guardrails unique to your LLM deployments.

Key Features

FuzzyAI GUI showing the execution step of an LLM jailbreak test with multiple attack prompts and model responses in a dark-themed web interface

Attack Techniques

FuzzyAI implements 18 distinct attack methods, each targeting a different weakness in LLM safety mechanisms. The following table summarizes the key techniques:

TechniqueHow It Works
ArtPromptReplaces safety-triggering words with ASCII art representations, bypassing text-based content filters while preserving meaning for the model
DAN (Do Anything Now)Promotes the LLM to adopt an unrestricted persona that ignores standard content filters
CrescendoStarts with innocuous queries and gradually steers the conversation toward restricted topics across multiple turns
Genetic AlgorithmUses evolutionary algorithms to mutate prompts, selecting for variants that get closer to bypassing guardrails
ASCII SmugglingEmbeds hidden instructions using invisible Unicode Tag characters that are processed by the model but invisible to users
Many-Shot JailbreakingProvides many examples of the desired (harmful) behavior in the prompt context to shift the model’s behavior
Shuffle Inconsistency AttackRearranges harmful text in prompts to exploit inconsistencies between an LLM’s comprehension and its safety mechanisms
ActorAttackBuilds semantic networks of “actors” to subtly guide conversations toward restricted targets
PAIRAutomates adversarial prompt generation by iteratively refining prompts using two LLMs
Best-of-n JailbreakingUses input variations to repeatedly elicit harmful responses from the model

Additional techniques cover system prompt extraction, ethical filter bypass, and information leakage testing.

FuzzyAI jailbreak attempt results showing a successful guardrail bypass with the prompt and model response details

Supported Providers

FuzzyAI can target models from all major LLM providers:

  • OpenAI — GPT models via API
  • Anthropic — Claude models
  • Google Gemini — Gemini models
  • Azure OpenAI — Azure-hosted OpenAI models
  • AWS Bedrock — Claude, Llama, Titan, and other Bedrock models
  • Hugging Face — Hosted model endpoints
  • Ollama — Local models (requires 8GB+ RAM, 16GB+ recommended)
  • Custom REST API — Any LLM with an HTTP endpoint

Each provider requires its own API key configuration via environment variables.

FuzzyAI bulk testing terminal output showing automated fuzzing results across multiple attack techniques with pass/fail indicators

Extensibility

FuzzyAI is built to be extended. Security teams can add custom attack methods to:

  • Test domain-specific guardrails unique to their LLM deployment
  • Implement novel attack techniques from recent research papers
  • Create regression tests for previously discovered jailbreaks
  • Build attack chains that combine multiple techniques
Built by CyberArk Labs
FuzzyAI comes from CyberArk Labs, the research arm of CyberArk — a publicly traded identity security company. The tool was developed as part of CyberArk’s AI security research program and released as open source to help the broader security community test LLM deployments.

Use Cases

Pre-deployment security assessment — Before deploying an LLM-powered application, run FuzzyAI against it to discover jailbreak vulnerabilities in your guardrails.

Red team exercises — Security teams use FuzzyAI as part of AI red team assessments to systematically test how well LLM safety mechanisms hold up.

Guardrail regression testing — After updating guardrails or switching models, re-run FuzzyAI to verify that previously blocked attack techniques remain blocked.

Research and benchmarking — AI security researchers use FuzzyAI to compare the jailbreak resistance of different models and guardrail implementations.

Strengths & Limitations

Strengths:

  • Mutation-based fuzzing discovers novel jailbreaks that fixed test suites miss
  • 18 attack techniques cover a wide range of bypass methods
  • Backed by CyberArk Labs with active research and development
  • Extensible design lets teams add custom attack methods
  • Supports all major LLM providers plus custom endpoints
  • Free and open-source under Apache 2.0

Limitations:

  • Covers only jailbreak/guardrail bypass, not broader LLM security issues like hallucination, toxicity measurement, or data leakage
  • Requires API keys and usage costs for cloud-hosted target models
  • Running local models via Ollama needs significant resources (8GB+ RAM minimum)
  • Younger project compared to established tools like Garak, with a smaller community and fewer contributors
  • Results depend on the quality of the jailbreak detection criteria you define

Getting Started

1
Clone the repository — Get FuzzyAI from GitHub: git clone https://github.com/cyberark/FuzzyAI.git && cd FuzzyAI.
2
Install dependencies — FuzzyAI uses Poetry for dependency management. Run poetry install to set up the environment.
3
Configure API keys — Set environment variables for your target provider. For example: export OPENAI_API_KEY="sk-..." for OpenAI, or export ANTHROPIC_API_KEY="..." for Anthropic.
4
Run your first fuzz — Execute a fuzzing session against a target model. FuzzyAI will cycle through its attack techniques and report any successful jailbreaks.
5
Review results — Examine the output for successful bypasses. Each finding includes the specific prompt that triggered the jailbreak and the attack technique used.

How FuzzyAI Compares

FuzzyAI fills a specific role among AI security tools: focused jailbreak fuzzing with mutation-based attack techniques. It is narrower in scope but deeper in jailbreak coverage compared to multi-purpose LLM security scanners.

Compared to Garak, which offers 50+ probe modules covering prompt injection, hallucination, toxicity, and data leakage, FuzzyAI specializes in jailbreak discovery through mutation-based techniques like genetic algorithms and crescendo attacks. For an evaluation framework that combines red teaming with prompt testing and model comparison, look at Promptfoo. For Microsoft’s multi-turn red teaming orchestrator built for enterprise AI red teams, check PyRIT.

For structured vulnerability testing with OWASP mapping, see DeepTeam. For runtime protection rather than pre-deployment testing, consider Lakera Guard, LLM Guard, or NeMo Guardrails.

Best for
Security teams and researchers who need to systematically test LLM guardrails for jailbreak vulnerabilities using mutation-based fuzzing techniques. Particularly useful for organizations deploying custom LLM applications that need to verify how well their safety mechanisms hold up.

For a broader overview of AI security threats and defense tools, see the AI security tools category page.

Frequently Asked Questions

What is FuzzyAI?
FuzzyAI is an open-source LLM fuzzing framework created by CyberArk Labs. It automatically tests large language models for jailbreak vulnerabilities using 18 attack techniques. Released in December 2024 under Apache 2.0, it has jailbroken every major AI model it was tested against.
Is FuzzyAI free to use?
Yes. FuzzyAI is free and open-source under the Apache 2.0 license. It is available on GitHub at github.com/cyberark/FuzzyAI. You need API keys for the target LLM providers you want to test, but the tool itself is free.
What attack techniques does FuzzyAI support?
FuzzyAI implements 18 attack methods including ArtPrompt (ASCII art obfuscation), DAN (Do Anything Now) persona jailbreaks, crescendo multi-turn escalation, genetic algorithm prompt mutation, ASCII smuggling with invisible Unicode Tag characters, many-shot jailbreaking, PAIR (iterative prompt refinement), ActorAttack, and Best-of-n jailbreaking. You can also add custom attack methods.
Which LLM providers does FuzzyAI support?
FuzzyAI can target models from OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Hugging Face, and Ollama (for local models). It also supports custom REST API endpoints, so you can test any LLM that exposes an HTTP API.
How does FuzzyAI compare to Garak?
Both are open-source LLM security testing tools, but they have different focuses. FuzzyAI specializes in jailbreak fuzzing with mutation-based attack techniques (genetic algorithms, crescendo). Garak is a broader vulnerability scanner with 50+ probe modules covering prompt injection, hallucination, toxicity, and data leakage. FuzzyAI goes deeper on jailbreak techniques; Garak covers a wider attack surface.
Can I add custom attack methods to FuzzyAI?
Yes. FuzzyAI is built as an extensible framework specifically designed for researchers and security teams to add their own attack methods. This lets you create domain-specific tests tailored to your organization’s threat model and the specific guardrails your LLM deployments use.