Betterleaks 2026: Open-Source Gitleaks Successor

Betterleaks is an open-source secrets scanner built by Zachary Rice, the original creator of Gitleaks (25,000+ GitHub stars).

It detects and validates hardcoded credentials in git repositories, directories, and archives using BPE tokenization instead of entropy-based filtering, achieving 98.6% recall on the CredData benchmark compared to 70.4% with traditional entropy detection. Rice is currently Head of Secrets Scanning at Aikido Security.

Betterleaks CLI scan output showing detected Slack webhook and Stripe secret key findings with redacted values

The project was created on February 3, 2026 and reached v1.1.1 by March 17, 2026. It is written in Go, licensed under MIT, and designed as a drop-in replacement for Gitleaks with backwards-compatible configuration files and CLI flags.

What is Betterleaks?

Betterleaks is a free, open-source secrets detection tool that scans git repositories, directories, stdin, and compressed archives for hardcoded credentials such as API keys, tokens, and passwords. It replaces Shannon entropy with BPE (Byte Pair Encoding) tokenization using the cl100k_base model to determine whether a string is likely a real secret.

On the CredData benchmark, this approach achieves 98.6% recall versus 70.4% for entropy-based scanning. Betterleaks is licensed under MIT and can be installed via Homebrew, Docker, DNF, or built from source.

Why secret detection matters

Hardcoded credentials in source code are one of the most common — and most damaging — failure modes I see in application security. GitGuardian’s State of Secrets Sprawl reports have tracked tens of millions of new secrets leaked to public GitHub each year, and once a credential lands in a repository it tends to stay reachable in commit history long after rotation.

The cost shows up in two ways. First, attackers actively scrape public commits for valid keys (AWS, Stripe, OpenAI, internal SSO tokens) and can exploit them within minutes. Second, pre-commit and pre-push secret scanners shrink that exposure window only when the false-positive rate is low enough that developers do not tune out the findings.

That is the gap a Gitleaks-style tool tries to close, and the gap Betterleaks tries to close better. Token efficiency filtering cuts entropy false positives, and the CEL-driven validation step distinguishes a string that looks like a key from a credential that is still live against the issuing API. Both moves push the scanner toward the regime where developers actually act on findings instead of muting the rule.

How does Betterleaks improve on Gitleaks?

Betterleaks targets the same problem as Gitleaks – finding hardcoded secrets in git repositories – but improves detection accuracy, validation capability, and scanning speed.

The core difference is the detection engine. Gitleaks relies on Shannon entropy to distinguish random strings from real secrets. Betterleaks uses BPE tokenization with the cl100k_base model (the same tokenizer GPT-4 uses).

On the CredData benchmark, Betterleaks hits 98.6% recall compared to Gitleaks’ 70.4% with entropy-based filtering. On large codebases, that gap can translate into a significant number of real secrets that entropy-based filtering misses.

Betterleaks also adds CEL-based secrets validation. When it finds a potential credential, it can fire an HTTP request to the target service and check whether the credential is still live. A finding goes from “possible leak” to “confirmed active secret,” which changes how you prioritize remediation.

Since it is backwards-compatible with Gitleaks configuration files and CLI flags, migrating takes minimal effort. Existing .gitleaks.toml files work without modification.

Betterleaks scan time comparison benchmark showing 4.2–5.4x faster scanning than Gitleaks on Rails, Ruby, and GitLab repositories

The benchmark above (from the Betterleaks repository) compares scan times on three real-world repositories. With RE2 and 8 git workers enabled, Betterleaks scans the Rails repo in 5.8s vs Gitleaks’ 24.5s (4.2x faster), the Ruby repo in 10.3s vs 55.2s (5.4x faster), and the GitLab repo in 2m13s vs 11m28s (5.2x faster).
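The speedup factors follow directly from the quoted timings. As a quick check, this short sketch recomputes them; the only inputs are the numbers stated above, and the helper names are illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds

# (Betterleaks seconds, Gitleaks seconds), copied from the figures quoted above.
timings = {
    "Rails": (5.8, 24.5),
    "Ruby": (10.3, 55.2),
    "GitLab": (to_seconds(2, 13), to_seconds(11, 28)),
}

def speedup(betterleaks_s, gitleaks_s):
    """How many times faster Betterleaks is, rounded to one decimal."""
    return round(gitleaks_s / betterleaks_s, 1)

speedups = {repo: speedup(*pair) for repo, pair in timings.items()}
print(speedups)  # {'Rails': 4.2, 'Ruby': 5.4, 'GitLab': 5.2}
```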

Token Efficiency Filter

Uses BPE tokenization (cl100k_base) instead of Shannon entropy for secret detection. Achieves 98.6% recall on CredData, compared to 70.4% with entropy-based filtering.

CEL Secrets Validation

Fires HTTP requests against detected credentials using CEL expressions to verify whether leaked secrets are still active and exploitable.

Parallelized Git Scanning

Distributes git history scanning across multiple workers via the --git-workers flag, reducing scan times on large repositories.

What are Betterleaks’ key features?

Feature	Details
CLI commands	`git` (scan repos), `dir` (scan directories), `stdin` (pipe input)
Configuration	TOML format (`.betterleaks.toml` or `.gitleaks.toml`), backwards-compatible with Gitleaks
Detection engine	BPE tokenization (cl100k_base) + regex rules; 98.6% recall on CredData
Secrets validation	CEL expressions fire HTTP requests to verify if leaked credentials are still active
Output formats	JSON, CSV, JUnit, SARIF, custom Go templates
Installation	Homebrew, Docker, DNF (Fedora), from source
Regex engines	Go stdlib or RE2 (switchable); RE2 guarantees linear-time matching
Recursive decoding	base64, hex, percent-encoding, unicode escapes; configurable depth (default 5)
Archive support	zip, tar, and nested archives via `--max-archive-depth`
Git scanning	Parallelized via `--git-workers`; scans GitLab repo 5.2x faster than Gitleaks
Composite rules	Multi-part patterns with proximity matching to reduce false positives
Redaction	`--redact` flag with configurable percentage (0-100%) for logs and stdout
Baseline support	`--baseline-path` to ignore known findings and track only new secrets
Language	Pure Go (no CGO) — deploys anywhere without native library dependencies
License	MIT (no commercial restrictions)

What is the Token Efficiency Filter?

The Token Efficiency Filter is Betterleaks’ core detection innovation that replaces Shannon entropy with BPE (Byte Pair Encoding) tokenization for identifying secrets. Entropy-based detection measures the randomness of a string to decide whether it might be a secret, but many real secrets don’t have high enough entropy to pass the threshold, and many non-secrets (like UUIDs or hashes) score high entropy but aren’t credentials.
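To make the entropy limitation concrete, here is a minimal sketch of Shannon entropy measured in bits per character. It is an illustration only, not code from Gitleaks or Betterleaks, and the two sample strings are made up for the comparison.

```python
import math
from collections import Counter

def shannon_entropy(s):
    """Shannon entropy of a string, in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    counts = Counter(s)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Illustrative samples (assumptions, not benchmark data):
weak_secret = "hunter2hunter2"                     # a real password, but repetitive
uuid_like = "123e4567-e89b-12d3-a456-426614174000" # not a credential

# The non-secret UUID scores higher than the actual password, so any single
# entropy threshold will either miss the password or flag the UUID.
print(round(shannon_entropy(weak_secret), 2))
print(round(shannon_entropy(uuid_like), 2))
```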

Betterleaks uses the cl100k_base tokenizer (the same tokenizer GPT-4 uses) to evaluate how efficiently a string compresses into tokens. Real secrets tokenize inefficiently because they are random, while structured strings (variable names, UUIDs, file paths) compress well.

On the CredData benchmark, the Token Efficiency Filter produces 98.6% recall versus 70.4% with Shannon entropy — fewer missed secrets without an obvious false-positive penalty in the published evaluation.
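The intuition behind token efficiency can be sketched without the real tokenizer. In the example below, a toy greedy longest-match tokenizer with a tiny hand-picked vocabulary stands in for cl100k_base. The vocabulary, function names, and the characters-per-token metric are all assumptions for illustration, and this is not Betterleaks' implementation or a real BPE merge procedure.

```python
# Toy vocabulary standing in for a real BPE vocabulary (assumption).
TOY_VOCAB = {"get", "user", "name", "config", "path", "file", "_", "/", "."}

def toy_tokenize(s, vocab=TOY_VOCAB):
    """Greedy longest-match split; characters not covered become 1-char tokens."""
    max_len = max(len(v) for v in vocab)
    tokens, i = [], 0
    while i < len(s):
        for size in range(min(max_len, len(s) - i), 0, -1):
            piece = s[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

def chars_per_token(s):
    """Higher means the string compresses well into tokens."""
    if not s:
        return 0.0
    return len(s) / len(toy_tokenize(s))

print(toy_tokenize("get_user_name"))    # ['get', '_', 'user', '_', 'name']
print(chars_per_token("get_user_name")) # 2.6
print(chars_per_token("xK9qLm2ZpR7t"))  # 1.0
```

A structured identifier collapses into a few multi-character tokens, while the random-looking string falls back to one token per character. A filter built on this idea would treat low characters-per-token as a sign of randomness.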

Betterleaks scan output showing a low-entropy Stripe secret key caught by the Token Efficiency Filter that entropy-based scanning would have missed

How does CEL-based secrets validation work?

CEL-based secrets validation is Betterleaks’ mechanism for determining whether a detected credential is still active and exploitable. Finding a secret is useful, but knowing whether it still works is what decides how fast you need to act.

Betterleaks uses CEL (Common Expression Language) expressions to define validation logic per rule. When a rule matches, the CEL expression can fire an HTTP request to the target API and check the response. If the credential returns a valid response, the finding is marked as confirmed-active rather than just a potential leak.

This is similar to what TruffleHog does with its built-in verifiers. The key difference: Betterleaks makes the validation logic user-configurable via CEL expressions, so security teams can write custom verification for internal APIs and services. TruffleHog’s verifiers are hardcoded per detector.
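The validation step described above boils down to "send the credential, inspect the response". The sketch below shows that flow in plain Python rather than CEL. The endpoint URL, the bearer-token header, and the status-code mapping are all hypothetical, and Betterleaks' actual CEL syntax is not shown here.

```python
import urllib.error
import urllib.request

def http_status(url, token, timeout=5.0):
    """Send a GET with a bearer token and return the HTTP status code."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def classify(status):
    """Map a status code to a finding state (illustrative mapping, an assumption)."""
    if 200 <= status < 300:
        return "confirmed-active"
    if status == 401:
        return "rejected"
    return "unknown"

def validate(token, url="https://api.example.com/v1/me", send=http_status):
    """Check one credential; `send` is injectable so the logic can be tested offline."""
    return classify(send(url, token))

# Offline demonstration with a fake transport instead of a real HTTP call.
print(validate("example-token", send=lambda url, token: 200))  # confirmed-active
print(validate("example-token", send=lambda url, token: 401))  # rejected
```

Only run checks like this against services you are authorized to test, since every validation attempt is a real authenticated request.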

Betterleaks CEL validation output showing confirmed-active Stripe and AWS secrets versus a revoked GitHub PAT, with HTTP response codes

What are composite and multi-part rules?

Composite rules in Betterleaks combine a primary regex pattern with auxiliary patterns that must appear within a specified proximity in the source code. This approach reduces false positives for patterns that only matter near related identifiers – for example, a random-looking string is only flagged as an API key if a service name like STRIPE_KEY or aws_secret appears nearby. Betterleaks inherited this capability from Gitleaks and extended it with proximity matching configuration.
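A minimal sketch of the proximity idea, assuming a line-based distance: a candidate string counts as a finding only if an auxiliary keyword appears within a few lines of it. The two regexes, the `within_lines` parameter, and the sample text are hypothetical, and Betterleaks' own rule format and distance semantics may differ.

```python
import re

# Hypothetical patterns for illustration only.
PRIMARY = re.compile(r"\b[A-Za-z0-9]{24,}\b")                    # random-looking string
AUXILIARY = re.compile(r"stripe_key|aws_secret", re.IGNORECASE)  # nearby identifier

def composite_matches(text, within_lines=2):
    """Return (line_number, match) pairs where a keyword sits within N lines."""
    lines = text.splitlines()
    keyword_lines = [i for i, line in enumerate(lines) if AUXILIARY.search(line)]
    findings = []
    for i, line in enumerate(lines):
        for m in PRIMARY.finditer(line):
            if any(abs(i - k) <= within_lines for k in keyword_lines):
                findings.append((i + 1, m.group()))
    return findings

flagged = 'STRIPE_KEY = "abcdEFGH1234ijklMNOP5678"'
ignored = 'checksum = "abcdEFGH1234ijklMNOP5678"'
print(composite_matches(flagged))  # [(1, 'abcdEFGH1234ijklMNOP5678')]
print(composite_matches(ignored))  # []
```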

Does Betterleaks support recursive decoding?

Yes. Betterleaks recursively decodes base64, hex, percent-encoding, and unicode escape sequences before applying detection rules. The decoding depth is configurable (default 5 levels).

This catches secrets that developers have obfuscated or that build tools have encoded during packaging – a common pattern in older codebases where credentials end up base64-encoded in configuration files or environment variable exports.
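The layered decoding described above can be sketched as a bounded loop that keeps decoding until nothing new appears or the depth limit is hit. This is a simplified illustration, not Betterleaks' implementation: it treats the whole input as one encoded value, covers only percent-encoding, hex, and base64, and all function names are assumptions.

```python
import base64
import binascii
import re
from urllib.parse import unquote

HEX_RE = re.compile(r"(?:[0-9a-fA-F]{2})+")
B64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")

def decode_once(s):
    """Return every successful single-layer decoding of s."""
    results = []
    if "%" in s:
        decoded = unquote(s)
        if decoded != s:
            results.append(decoded)
    if HEX_RE.fullmatch(s):
        try:
            results.append(bytes.fromhex(s).decode("utf-8"))
        except UnicodeDecodeError:
            pass
    if len(s) >= 8 and len(s) % 4 == 0 and B64_RE.fullmatch(s):
        try:
            results.append(base64.b64decode(s, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            pass
    return results

def decode_layers(s, max_depth=5):
    """Collect the input plus everything reachable within max_depth decodings."""
    seen, frontier = [s], [s]
    for _ in range(max_depth):
        next_frontier = []
        for item in frontier:
            for decoded in decode_once(item):
                if decoded not in seen:
                    seen.append(decoded)
                    next_frontier.append(decoded)
        if not next_frontier:
            break
        frontier = next_frontier
    return seen

# A made-up value wrapped in base64, then hex: two layers deep.
secret = "sk_live_example"
wrapped = base64.b64encode(secret.encode()).decode().encode().hex()
print(secret in decode_layers(wrapped))  # True
```

Detection rules would then run against every string in the returned list, so a secret is found regardless of which layer it surfaces in.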

Does Betterleaks scan inside archives?

Betterleaks scans inside compressed archives including zip, tar, and nested archive formats via the --max-archive-depth flag. This ensures secrets hiding in vendored dependencies, bundled artifacts, or release packages don’t get missed during audits.

Can you switch regex engines in Betterleaks?

Betterleaks supports two regex engines: Go’s standard library regex engine and RE2. RE2 provides guaranteed linear-time matching, which matters when scanning large files with complex patterns. You can switch between them based on your performance and compatibility needs.

Who created Betterleaks?

Betterleaks was created by Zachary Rice (GitHub handle: zricethezav), the original author of Gitleaks, one of the most popular open-source secrets scanners with over 25,000 GitHub stars. Rice is currently Head of Secrets Scanning at Aikido Security.

He started the Betterleaks project on February 3, 2026, building on the lessons learned from years of maintaining Gitleaks. The project is hosted on GitHub under the MIT license and accepts community contributions.

When should you use Betterleaks?

Best for

Teams already using Gitleaks that want better detection accuracy and live secrets validation without changing their workflow. Also works well for new secret scanning setups where you want verified findings from day one.

CI/CD pipeline scanning. Run Betterleaks in your CI pipeline to block pull requests that introduce secrets. The --git-workers flag keeps scan times reasonable even on large repositories. SARIF output feeds directly into GitHub Advanced Security.

Pre-commit hook. Install Betterleaks as a pre-commit hook to catch secrets before they reach version control. Same workflow as Gitleaks; existing pre-commit configurations work with minimal changes.

Incident response. When you discover a leaked credential, use CEL-based validation to check whether the secret is still active. That tells you whether rotation is urgent or can wait.

Legacy codebase audits. Recursive decoding and archive scanning help find secrets that are base64-encoded, hex-encoded, or tucked inside zip files, which is common in older codebases.

How do I get started with Betterleaks?

Betterleaks CLI help output showing available commands and key flags

Install. Run brew install betterleaks on macOS, or pull the Docker image with docker pull ghcr.io/betterleaks/betterleaks:latest. On Fedora, use dnf install betterleaks. You can also build from source with Go.

Scan a repository. Run betterleaks git /path/to/repo to scan git history for secrets. Use betterleaks dir /path/to/dir for non-git directories. Add --git-workers 4 for parallelized scanning and -v for verbose output.

Migrate from Gitleaks. Drop your existing .gitleaks.toml into the repository root. Betterleaks reads it natively. CLI flags are backwards-compatible, so just swap gitleaks for betterleaks in your scripts.

Review findings. Use --report-path results.json --report-format json to save findings. Validated secrets are marked as confirmed-active. Upload SARIF output to GitHub Advanced Security with --report-format sarif.
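For scripted or CI use, the steps above can be wrapped in a small helper that assembles the command line. This sketch uses only the subcommand and flags mentioned in this article. The function name and defaults are illustrative, and it builds the argument list without running anything.

```python
import shlex

def build_scan_command(repo_path, report_path="results.json",
                       report_format="json", git_workers=4, verbose=False):
    """Assemble a `betterleaks git` invocation as an argv list."""
    cmd = [
        "betterleaks", "git", repo_path,
        "--git-workers", str(git_workers),
        "--report-path", report_path,
        "--report-format", report_format,
    ]
    if verbose:
        cmd.append("-v")
    return cmd

print(shlex.join(build_scan_command("/path/to/repo")))
# betterleaks git /path/to/repo --git-workers 4 --report-path results.json --report-format json
```

Pass the returned list to `subprocess.run` in a pipeline step. How Betterleaks signals findings through its exit code is not covered here, so check the tool's help output before gating a build on it.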

Strengths & Limitations

Strengths:

BPE tokenization measurably outperforms Shannon entropy for secret detection (98.6% vs 70.4% recall on CredData).
CEL-based validation is user-configurable, unlike hardcoded verification in other tools.
Drop-in Gitleaks replacement. No migration pain.
Parallelized git scanning cuts wall-clock time on large repos.
Recursive decoding catches encoded and obfuscated secrets.
MIT license, no commercial restrictions.

Limitations:

Very new project (created February 2026). The rule library is smaller than mature tools like Gitleaks or TruffleHog.
1,254 GitHub stars. Small community compared to Gitleaks (25k+) or TruffleHog (25k+). Ecosystem integrations (GitHub Actions, pre-commit hooks) are still catching up.
No managed cloud platform. This is a CLI tool. Teams that want dashboards, team management, or hosted scanning should look at GitGuardian or TruffleHog’s commercial offering.
CEL validation requires writing expressions per rule. Out-of-the-box coverage for common services is still limited.

How does Betterleaks compare to other secrets scanners?

GitHub star history comparing secrets scanners — Gitleaks, TruffleHog, Kingfisher, Betterleaks, Nosey Parker, and detect-secrets over time

Feature	Betterleaks	Gitleaks	TruffleHog	GitGuardian
Detection method	BPE tokenization + regex	Entropy + regex	800+ detectors	Pattern matching + ML
Secrets validation	CEL expressions (configurable)	No	Built-in verifiers (hardcoded)	Yes (commercial)
License	MIT	MIT	AGPL-3.0	Freemium
Scan targets	Git, directories, stdin, archives	Git, directories, stdin	Git, Slack, S3, Docker, etc.	Git, CI/CD (commercial)
Parallelized scanning	Yes (--git-workers)	No	Yes	Yes
Recursive decoding	Yes (base64, hex, etc.)	Yes (v8.26+)	Limited	Yes
GitHub Stars	1,254	25,500	25,100	N/A

Betterleaks fits best if you care about detection accuracy and configurable validation, especially if you’re already on Gitleaks and want a painless upgrade. TruffleHog is a better pick for teams that need scanning beyond git repos (Slack, S3, Docker images). GitGuardian is the way to go for enterprises that need dashboards, team management, and hosted scanning.

Frequently Asked Questions

What is Betterleaks?

Betterleaks is a free, open-source secrets scanner created by Zachary Rice (zricethezav), the original author of Gitleaks (25k+ GitHub stars). It detects hardcoded credentials in git repositories, directories, and archives using BPE tokenization instead of Shannon entropy, achieving 98.6% recall on the CredData benchmark. Betterleaks is a drop-in replacement for Gitleaks with backwards-compatible configuration files and CLI flags, and is licensed under MIT.

How does Betterleaks compare to Gitleaks?

Betterleaks is backwards-compatible with Gitleaks configurations and CLI flags, so migration takes minimal effort. The main improvements over Gitleaks are: token efficiency filtering using BPE tokenization (98.6% recall vs 70.4% with entropy), live secrets validation via CEL expressions, parallelized git scanning with the --git-workers flag (4-5x faster on large repos), and recursive decoding for base64, hex, percent-encoding, and unicode-encoded secrets.

What is the Token Efficiency Filter in Betterleaks?

The Token Efficiency Filter is Betterleaks’ core detection innovation that replaces Shannon entropy with BPE (Byte Pair Encoding) tokenization using the cl100k_base model (the same tokenizer GPT-4 uses). Real secrets tokenize inefficiently because they are random, while structured strings like variable names and UUIDs compress well. On the CredData benchmark, this approach achieves 98.6% recall compared to 70.4% with entropy-based filtering, meaning far fewer missed secrets.

Can Betterleaks validate if leaked secrets are still active?

Yes. Betterleaks uses CEL (Common Expression Language) expressions to fire HTTP requests against detected credentials and verify whether they are still active. If a credential returns a valid response, the finding is marked as confirmed-active rather than just a potential leak. Unlike TruffleHog’s hardcoded verifiers, Betterleaks’ validation logic is fully user-configurable, so security teams can write custom verification for internal APIs and services.

Is Betterleaks free?

Yes. Betterleaks is completely free and open-source under the MIT license with no commercial restrictions. You can use it in personal projects, commercial codebases, and CI/CD pipelines without licensing fees. It can be installed via Homebrew, Docker, DNF (Fedora), or built from source.

How fast is Betterleaks compared to Gitleaks?

Betterleaks is significantly faster than Gitleaks on large repositories. With RE2 and 8 git workers enabled, benchmarks from the Betterleaks repository show 4.2x faster scanning on Rails (5.8s vs 24.5s), 5.4x faster on Ruby (10.3s vs 55.2s), and 5.2x faster on GitLab (2m13s vs 11m28s). The speed improvement comes from parallelized git scanning via the --git-workers flag and the optional RE2 regex engine.