Betterleaks is an open-source secrets scanner that detects and validates hardcoded credentials in git repositories, directories, and archives using BPE tokenization instead of traditional entropy-based filtering. Built by Zachary Rice (zricethezav) — the same developer who created Gitleaks (25,000+ GitHub stars) — Betterleaks achieves 98.6% recall on the CredData benchmark, compared to 70.4% with entropy-based detection. Rice is currently Head of Secrets Scanning at Aikido Security.

The project was created on February 3, 2026 and reached v1.1.1 by March 17, 2026. It is written in Go, licensed under MIT, and designed as a drop-in replacement for Gitleaks with backwards-compatible configuration files and CLI flags.
Overview
Betterleaks targets the same problem as Gitleaks — finding hardcoded secrets in git repositories — but improves detection accuracy, secrets validation, and scanning performance.
The main difference between Betterleaks and Gitleaks is the detection engine. Where Gitleaks relies on Shannon entropy to separate likely secrets from ordinary strings, Betterleaks uses BPE tokenization with the cl100k_base encoding (the tokenizer used by GPT-4). On the CredData benchmark dataset, Betterleaks achieves 98.6% recall compared to 70.4% for entropy-based filtering. On large codebases, that gap translates to significantly fewer missed secrets.
Betterleaks also adds CEL-based secrets validation. When Betterleaks finds a potential credential, it can fire an HTTP request to the target service and check whether the credential is actually live. A finding goes from “possible leak” to “confirmed active secret,” which changes how teams prioritize remediation.
Because it is backwards-compatible with Gitleaks configuration files and CLI flags, migrating is straightforward. Existing .gitleaks.toml files work without modification.

A benchmark from the Betterleaks repository compares scan times on three real-world repositories. With RE2 and 8 git workers enabled, Betterleaks scans the Rails repo in 5.8s vs Gitleaks’ 24.5s (4.2x faster), the Ruby repo in 10.3s vs 55.2s (5.4x faster), and the GitLab repo in 2m13s vs 11m28s (5.2x faster).
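The speedup figures follow directly from the reported wall-clock times. As a quick check, the short sketch below recomputes them; the function and variable names are illustrative only, and the timings are the ones quoted above.

```python
def to_seconds(minutes: int, seconds: float) -> float:
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# (Betterleaks time, Gitleaks time) in seconds, as quoted in the benchmark.
SCANS = {
    "Rails": (to_seconds(0, 5.8), to_seconds(0, 24.5)),
    "Ruby": (to_seconds(0, 10.3), to_seconds(0, 55.2)),
    "GitLab": (to_seconds(2, 13), to_seconds(11, 28)),
}


def speedup(betterleaks_s: float, gitleaks_s: float) -> float:
    """Return how many times faster the first time is, to one decimal."""
    return round(gitleaks_s / betterleaks_s, 1)


SPEEDUPS = {repo: speedup(*times) for repo, times in SCANS.items()}

if __name__ == "__main__":
    for repo, factor in SPEEDUPS.items():
        print(f"{repo}: {factor}x faster")
```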
Key Features
| Feature | Details |
|---|---|
| CLI commands | git (scan repos), dir (scan directories), stdin (pipe input) |
| Configuration | TOML format (.betterleaks.toml or .gitleaks.toml), backwards-compatible with Gitleaks |
| Detection engine | BPE tokenization (cl100k_base) + regex rules; 98.6% recall on CredData |
| Secrets validation | CEL expressions fire HTTP requests to verify if leaked credentials are still active |
| Output formats | JSON, CSV, JUnit, SARIF, custom Go templates |
| Installation | Homebrew, Docker, DNF (Fedora), from source |
| Regex engines | Go stdlib or RE2 (switchable); RE2 guarantees linear-time matching |
| Recursive decoding | base64, hex, percent-encoding, unicode escapes; configurable depth (default 5) |
| Archive support | zip, tar, and nested archives via --max-archive-depth |
| Git scanning | Parallelized via --git-workers; scans GitLab repo 5.2x faster than Gitleaks |
| Composite rules | Multi-part patterns with proximity matching to reduce false positives |
| Redaction | --redact flag with configurable percentage (0-100%) for logs and stdout |
| Baseline support | --baseline-path to ignore known findings and track only new secrets |
| Language | Pure Go (no CGO) — deploys anywhere without native library dependencies |
| License | MIT (no commercial restrictions) |
Token Efficiency Filter
Traditional entropy-based detection measures the randomness of a string to decide whether it might be a secret. The problem: many real secrets do not have high enough entropy to pass the threshold, and many non-secrets (like UUIDs or hashes) have high entropy but are not credentials.
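To make the weakness concrete, here is a minimal sketch of an entropy gate. The 3.5 bits-per-character threshold and the function names are assumptions chosen for illustration, not values taken from Gitleaks or Betterleaks.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_random(s: str, threshold: float = 3.5) -> bool:
    """Hypothetical entropy gate: keep strings at or above the threshold."""
    return shannon_entropy(s) >= threshold


if __name__ == "__main__":
    for candidate in ("hunter2hunter2", "123e4567-e89b-12d3-a456-426614174000"):
        print(candidate, round(shannon_entropy(candidate), 2), looks_random(candidate))
```

With this threshold, a weak but real password such as hunter2hunter2 scores about 2.8 bits per character and is dropped, while a harmless UUID scores about 3.7 and is kept. Both outcomes are wrong, which is the gap the next paragraph addresses.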
Betterleaks replaces entropy with BPE tokenization. It uses the cl100k_base tokenizer, the same encoding used by GPT-4, to evaluate how efficiently a string compresses into tokens. Real secrets tend to tokenize inefficiently (few characters per token) because they are genuinely random, while structured strings (variable names, UUIDs, file paths) tokenize efficiently.
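The idea can be illustrated without the real tokenizer. The sketch below is a toy: it uses a greedy longest-match split over a tiny made-up vocabulary as a stand-in for BPE, and it is not how Betterleaks or cl100k_base is implemented. The vocabulary and function names are assumptions. What it shows is the metric itself, characters per token, which is high for structured strings and low for random ones.

```python
# Toy vocabulary (an assumption for illustration; cl100k_base has roughly 100k entries).
TOY_VOCAB = {
    "get", "user", "name", "config", "path", "file",
    "src", "py", "api", "key", "token", "_", "/", ".",
}


def tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split; unknown characters become one-char tokens."""
    max_len = max(len(v) for v in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, falling back to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def chars_per_token(text):
    """Higher values mean the string compresses well into known tokens."""
    tokens = tokenize(text)
    return len(text) / len(tokens) if tokens else 0.0


if __name__ == "__main__":
    for candidate in ("get_user_name", "src/config.py", "x9Qz7Lp2"):
        print(candidate, tokenize(candidate), chars_per_token(candidate))
```

In this toy, get_user_name splits into five tokens (2.6 characters per token), while x9Qz7Lp2 falls back to one token per character. A real filter would compute the same ratio with an actual BPE tokenizer and flag strings whose ratio falls below some tuned cutoff.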
On the CredData benchmark, Betterleaks’ Token Efficiency Filter produces 98.6% recall versus 70.4% with Shannon entropy. In practice, I found this means fewer missed secrets without a proportional increase in false positives.
Secrets validation via CEL
Detecting a secret is useful, but knowing whether it still works is what drives remediation urgency.
Betterleaks uses CEL (Common Expression Language) expressions to define validation logic per rule. When a rule matches, the CEL expression can fire an HTTP request to the target API and check the response. If the credential returns a valid response, the finding is marked as confirmed-active.
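The logic such a validation expresses is simple: send the candidate credential to the service and classify the response. The sketch below shows that flow in Python rather than CEL, since the exact CEL schema is not covered here. The endpoint URL, the bearer-token header, and the "inactive" and "unknown" labels are assumptions for illustration; only "confirmed-active" comes from the description above.

```python
import urllib.error
import urllib.request


def http_status(url, token, timeout=5.0):
    """Send a GET with a bearer token and return the HTTP status code."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; the code is still the answer


def validate(token, url, fetch=http_status):
    """Classify a candidate secret by the service's response (hypothetical labels)."""
    try:
        status = fetch(url, token)
    except urllib.error.URLError:
        return "unknown"  # a network failure says nothing about the secret
    if 200 <= status < 300:
        return "confirmed-active"
    if status in (401, 403):
        return "inactive"
    return "unknown"
```

The fetch parameter is injectable so the classification can be exercised without network access, for example validate("tok", "https://api.example.invalid/v1/me", fetch=lambda url, token: 401). Only run live validation against services you are authorized to query.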
This is similar to what TruffleHog, an open-source secrets detection tool, does with its built-in verifiers. The main difference between Betterleaks and TruffleHog validation is configurability: Betterleaks makes the validation logic user-configurable via CEL expressions, while TruffleHog’s verifiers are hardcoded per detector.
Composite and multi-part rules
Like Gitleaks, Betterleaks supports composite rules: a primary pattern combined with auxiliary patterns that must appear within a specified proximity. This reduces false positives for patterns that only matter when they appear near related identifiers (e.g., an API key near a specific service name).
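A rough sketch of proximity matching follows. It is a simplification: both patterns are hypothetical, and distance is measured in characters on a single string, which may differ from how the actual rule engine measures proximity.

```python
import re

# Hypothetical patterns for illustration only.
PRIMARY = re.compile(r"\b[A-Za-z0-9]{32}\b")                    # generic 32-char key shape
AUXILIARY = re.compile(r"\bacme[_-]?api\b", re.IGNORECASE)      # made-up service identifier


def composite_matches(text, primary=PRIMARY, auxiliary=AUXILIARY, within=40):
    """Return primary matches with an auxiliary match at most `within` chars away."""
    aux_spans = [m.span() for m in auxiliary.finditer(text)]
    hits = []
    for m in primary.finditer(text):
        start, end = m.span()
        for a_start, a_end in aux_spans:
            # Distance between the two spans; 0 when they touch or overlap.
            gap = max(a_start - end, start - a_end, 0)
            if gap <= within:
                hits.append(m.group())
                break
    return hits
```

A bare 32-character string (a checksum, say) is ignored by this sketch, while the same string next to ACME_API is reported. That is the false-positive reduction the composite rule is meant to deliver.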
Recursive decoding
Secrets are not always stored in plaintext. Betterleaks recursively decodes base64, hex, percent-encoding, and unicode escape sequences before applying detection rules. This catches secrets that developers have obfuscated or that build tools have encoded along the way.
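One way to picture this is as a bounded breadth-first search over decodings. The sketch below is an illustration under stated assumptions, not the Betterleaks implementation: the segment-finding regexes and function names are made up, unicode escapes are left out for brevity, and only the default depth of 5 comes from the feature table above.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Hypothetical heuristics for spotting encoded segments.
B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_RE = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")


def _decode_once(text):
    """Return candidate decodings of encoded segments found in text."""
    out = []
    for m in B64_RE.finditer(text):
        try:
            out.append(base64.b64decode(m.group(), validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            pass  # not valid base64, or not text once decoded
    for m in HEX_RE.finditer(text):
        try:
            out.append(bytes.fromhex(m.group()).decode("utf-8"))
        except (ValueError, UnicodeDecodeError):
            pass
    if "%" in text:
        pct = unquote(text)
        if pct != text:
            out.append(pct)
    return out


def decoded_views(text, max_depth=5):
    """Return text plus every string reachable within max_depth decoding layers."""
    seen = {text}
    frontier = [text]
    for _ in range(max_depth):
        next_frontier = []
        for item in frontier:
            for decoded in _decode_once(item):
                if decoded not in seen:
                    seen.add(decoded)
                    next_frontier.append(decoded)
        if not next_frontier:
            break
        frontier = next_frontier
    return seen
```

Detection rules would then run against every string in the returned set, so a secret that was base64-encoded twice is still visible to a plain regex. The depth limit keeps pathological inputs from expanding indefinitely.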
Archive scanning
Betterleaks scans inside compressed archives (zip, tar, etc.), so secrets buried in vendored dependencies or bundled artifacts do not slip through.
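A minimal sketch of nested archive scanning for zip files is shown below. It is not the Betterleaks implementation. The max_depth parameter here is a hypothetical counterpart to --max-archive-depth (its exact semantics in the CLI may differ), and the single rule matches the well-known AWS access key ID shape.

```python
import io
import re
import zipfile

SECRET_RE = re.compile(rb"AKIA[0-9A-Z]{16}")  # AWS access key ID shape


def scan_zip(data, max_depth=2, prefix=""):
    """Return (path, match) pairs for secrets in a zip, descending into nested zips.

    max_depth is the number of nested archive levels to open below this one.
    """
    findings = []
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            name = prefix + info.filename
            content = zf.read(info)
            if zipfile.is_zipfile(io.BytesIO(content)):
                # Nested archive: recurse while depth remains, otherwise skip it.
                if max_depth > 0:
                    findings += scan_zip(content, max_depth - 1, name + "!")
                continue
            for m in SECRET_RE.finditer(content):
                findings.append((name, m.group().decode()))
    return findings
```

Paths of nested entries are joined with "!" (an arbitrary choice for this sketch) so a finding can be traced back through each archive layer. A production scanner would also cap decompressed size to guard against zip bombs.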
Regex engine switching
You can switch between Go’s standard library regex engine and RE2. RE2 provides guaranteed linear-time matching, which matters when scanning large files with complex patterns.
Use Cases
CI/CD pipeline scanning. Run Betterleaks in your CI pipeline to block pull requests that introduce secrets. The --git-workers flag keeps scan times manageable even on large repositories. SARIF output feeds directly into GitHub Advanced Security.
Pre-commit hook. Install Betterleaks as a pre-commit hook to catch secrets before they reach version control. Same workflow as Gitleaks — existing pre-commit configurations work with minimal changes.
Incident response. When you discover a leaked credential, use CEL-based validation to quickly determine whether the secret is still active. This tells you whether rotation is urgent or can be scheduled.
Legacy codebase audits. Recursive decoding and archive scanning help surface secrets that are base64-encoded, hex-encoded, or buried inside zip files — common patterns in older codebases.
Getting Started

Install Betterleaks with brew install betterleaks on macOS, or pull the Docker image with docker pull ghcr.io/betterleaks/betterleaks:latest. On Fedora, use dnf install betterleaks. You can also build from source with Go.
Run betterleaks git /path/to/repo to scan git history for secrets. Use betterleaks dir /path/to/dir for non-git directories. Add --git-workers 4 for parallelized scanning and -v for verbose output.
To migrate from Gitleaks, place your existing .gitleaks.toml in the repository root. Betterleaks reads it natively. CLI flags are backwards-compatible, so you can swap gitleaks for betterleaks in your scripts.
Add --report-path results.json --report-format json to save findings. Validated secrets are marked as confirmed-active. For GitHub Advanced Security, generate SARIF output with --report-format sarif.
Strengths & Limitations
Strengths:
- BPE tokenization is a measurably better approach to secret detection than Shannon entropy (98.6% vs 70.4% recall on CredData).
- CEL-based validation is user-configurable, unlike hardcoded verification in other tools.
- Drop-in Gitleaks replacement — no migration friction.
- Parallelized git scanning reduces wall-clock time on large repos.
- Recursive decoding catches encoded and obfuscated secrets.
- MIT license with no commercial restrictions.
Limitations:
- Very new project (created February 2026). Betterleaks’ rule library is smaller than those of mature secrets scanners like Gitleaks or TruffleHog.
- 468 GitHub stars — small community compared to Gitleaks (25k+) or TruffleHog (25k+). Ecosystem integrations (GitHub Actions, pre-commit hooks) are still catching up.
- No managed cloud platform — this is a CLI tool. Teams wanting dashboards, team management, or hosted scanning should look at GitGuardian or TruffleHog’s commercial offering.
- CEL validation requires writing expressions per rule. Out-of-the-box coverage for common services is still growing.
Comparison with alternatives

| Feature | Betterleaks | Gitleaks | TruffleHog | GitGuardian |
|---|---|---|---|---|
| Detection method | BPE tokenization + regex | Entropy + regex | 800+ detectors | Pattern matching + ML |
| Secrets validation | CEL expressions (configurable) | No | Built-in verifiers (hardcoded) | Yes (commercial) |
| License | MIT | MIT | AGPL-3.0 | Freemium |
| Scan targets | Git, directories, stdin, archives | Git, directories, stdin | Git, Slack, S3, Docker, etc. | Git, CI/CD (commercial) |
| Parallelized scanning | Yes (--git-workers) | No | Yes | Yes |
| Recursive decoding | Yes (base64, hex, etc.) | Yes (v8.26+) | Limited | Yes |
| GitHub Stars | 468 | 25,500 | 25,100 | N/A |
Betterleaks is better for teams that prioritize detection accuracy and configurable validation logic, especially those already using Gitleaks who want a zero-friction upgrade. TruffleHog is better for organizations that need scanning beyond git repositories — including Slack, S3, and Docker images — thanks to its broader scan target support. GitGuardian, a commercial secrets detection platform, is better for enterprises that need dashboards, team management, and hosted scanning out of the box.