blog

From the team.

Thoughts on AI security, what we're building, and where this is going.

June 9, 2026

Snitch for Fable is here.

Anthropic shipped Claude Fable, its most capable model yet. We spent the run-up getting Snitch ready for it, and as of today that work is live. Install Snitch and you're on the version built for how Fable actually works.
We didn't bolt Fable on. We went back through the audit top to bottom and tuned it to the model's strengths: it reasons more deeply, holds more in its head at once, and can work on its own for far longer without losing the thread. A scanner gets better when the model underneath does, so the job was to get out of its way in the right places and tighten the guardrails in the others.

What changed

  • Tracing goes further. A risky-looking line isn't a finding until the input reaching it is actually attacker-controlled. The audit now follows that input back to where it really comes from, across files, instead of glancing at the lines around the match. A guard sitting in shared middleware gets found, which means fewer false alarms and findings you can trust.
  • Verification got more adversarial. The deepest scan tries to disprove its own findings before it shows them to you, and tells you which ones it withdrew and why. Confirmed beats plausible.
  • Honest on long runs. On big audits, progress is checked against what was actually scanned rather than claimed, and anything it couldn't finish says so plainly instead of rounding up to "done."
  • Defensive by default. Findings describe the risk and the fix; the tool doesn't write working attack code. If the model declines to go somewhere, the report marks that area incomplete rather than quietly skipping it.
We'll spare you the internals. What matters on your end is the output: the same evidence-on-every-finding audit, traced and verified harder, running on the strongest model available.

What you do

Nothing to configure. Install, or reinstall if you already run Snitch, and you're on the Fable-ready version. The everyday scan works the way it always has; the deepest, most thorough tier is there for when you're about to ship and want to be sure.
curl -fsSL https://snitchplugin.com/snitch.sh | sh
Or grab the bundle from the home page. It lands in ~/.claude/skills/snitch/; type /snitch in Claude Code to start. Anything you scanned before still stands; this is purely additive.

Run it on something real and want to talk through what it found? Email eric.waters@snitchplugin.com.

May 28, 2026

Ultra Audit: a scanner that tries to prove itself wrong

Every security scanner has the same failure mode: it finds something that looks bad, reports it with confidence, and is wrong. Do that enough times and people stop reading your output. The fix is not "be smarter at finding." It is "be honest about what you found." Ultra Audit, shipping today in v1.2.0, is Snitch trying to disprove its own findings before it shows them to you.

What it does

Deep Scan reasons forward: it traces input to a dangerous sink, then chains findings into exploit paths. It never reasons against a finding. Ultra adds that missing pass. After the scan assembles its findings, every one of them gets put on trial through three lenses, each trying to refute it:
  • Reachability. Does attacker-controlled input actually reach this sink on a real path, or does the scanner just think it does?
  • Sanitizer. Is there a validator, guard, or framework default upstream that the finder walked right past?
  • Context. Is this test code, dead code, an example, a comment? Something that cannot run in production?
Findings that survive are confirmed. Findings that get refuted are withdrawn, and the report says which lens killed them and why. Findings the scanner cannot fully trace stay in, flagged low-confidence and tagged for a human, because suppressing a real critical to look clean is the worse mistake. Then a dedup pass collapses the same root cause reported under five different category names into one finding.
The report ends with a verification summary: how many were raised, how many survived, how many were withdrawn, and the false-positive rate that pass removed. You see the scanner kill its own bad findings in the open. That is the whole point. A tool that withdraws its weak findings with a reason is one you can act on without re-checking everything by hand.
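A rough sketch of how a summary like that could be tallied, assuming each finding carries a status and, if it was withdrawn, the lens that refuted it. The Finding shape, the status names, and the rate formula (withdrawn divided by raised) are illustrative assumptions, not Snitch's internal format.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    title: str
    status: str  # assumed values: "confirmed", "withdrawn", "low_confidence"
    refuted_by: Optional[str] = None  # lens that refuted it, set only when withdrawn


def verification_summary(findings):
    """Count what was raised, what survived, and what each lens withdrew."""
    counts = Counter(f.status for f in findings)
    raised = len(findings)
    withdrawn = counts["withdrawn"]
    return {
        "raised": raised,
        "confirmed": counts["confirmed"],
        # Untraceable findings stay in the report, flagged for a human.
        "low_confidence": counts["low_confidence"],
        "withdrawn": withdrawn,
        # Assumed definition: share of raised findings the pass removed.
        "false_positive_rate_removed": withdrawn / raised if raised else 0.0,
        "withdrawn_by_lens": dict(
            Counter(f.refuted_by for f in findings if f.status == "withdrawn")
        ),
    }
```

Keeping the low-confidence count separate from the confirmed count matches the idea above: an untraceable finding is neither suppressed nor promoted.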

Why now

This only works on a model that can actually reason hard, between tool calls, across files, many times over. Claude Opus 4.8's thinking is what makes refuting a finding as rigorous as raising it. And it is expensive by design: in the deepest setting every finding gets its own set of skeptics working in parallel. That is not a default you want on a quick pre-commit check. It is the setting you reach for when you are about to ship and you want to be sure.

How to use it

Two ways, depending on how deep you want to go.
  • In the menu. Run /snitch and pick Ultra Scan. It runs everywhere Snitch runs and gets sharper on a thinking-capable model. On a model with no reasoning mode it steps down to a normal scan and tells you so, rather than pretending.
  • The deepest version, on Claude Code. Turn on ultracode (or ask for a full verified audit as a workflow) and Snitch fans the verification out: each finding gets its own parallel skeptics instead of one self-review pass. Slower and more expensive, and the most thorough audit Snitch can produce.
Either way the install is the same. One line:
curl -fsSL https://snitchplugin.com/snitch.sh | sh
or grab the bundle from the home page. Pick Ultra when it counts.

Run Ultra on something real and want to compare what it confirmed versus withdrew? Email eric.waters@snitchplugin.com.

May 23, 2026

Deep Scan is finally on. Here's what Opus 4.8 changed.

If you've been running Snitch in Claude Code, you've seen a menu option called Deep Scan sitting there greyed out since v1.0.0. Today it works. v1.1.0 turns it on, and the reason it was off until now is a good window into what actually changed when Claude Opus 4.8 landed. So let's talk about both.
Quick recap of what Deep Scan is supposed to do. A normal scan finds issues one at a time. Deep Scan does that too, but then it asks the harder question: do any of these findings combine into something worse than the sum of their parts? That low-severity info leak plus that unauthenticated internal endpoint might be a real exploit chain. Answering that means holding a lot of findings in your head at once and reasoning about how they connect, without hallucinating a chain that isn't there. We shipped it disabled in v1.0.0 because a half-working version of that is worse than none. If your security tool invents exploit chains, you stop trusting it, and a tool you don't trust is shelfware.

What Opus 4.8 actually changed

Two things, and if you build with the Claude API you'll care about both.
First, interleaved thinking. The model can now reason between tool calls, not just in one burst before its final answer. That sounds academic until you think about what a real audit does. To know whether a SQL query is exploitable, you grep for the query, read the file, follow the variable back to the function that called it, check whether a validator touched it on the way. Read, think, read, think. The whole job is reasoning in the gaps between file reads. Old models did the reads fine and then guessed at the connective tissue. Now the thinking happens where the work happens.
Second, the 1M-token context window went GA, and instruction-following got measurably better. Anthropic calls out three changes in particular: the model skips fewer tool calls it was supposed to make, it flags its own uncertainty more often, and it claims progress without evidence less often. Read those last two again if you write security tooling. A scanner that confidently reports a bug it can't prove is actively worse than no scanner, because now you're chasing ghosts in your own code. The model getting more honest about what it doesn't know is the single most useful thing that could happen to an audit tool. The full announcement has the rest.

So what's new in Snitch

Deep Scan runs. All 67 categories, plus the chain step. After each confirmed finding it checks whether that finding plus an earlier one adds up to a worse exploit path, and it has to actually reason it through before it'll call it a chain. No chaining on vibes or surface similarity. Every finding gets a short trace attached: what the attacker does, what has to be true for it to work, and what they get. If you run it on a host with no thinking mode, it falls back to a regular Full Scan and tells you so in the report instead of faking the chain detection.
The trace step thinks before it labels. Every finding hinges on one question: where did this input come from? A hardcoded string is fine. A value that went through a validator is fine. req.query.id dropped straight into a query is not. Same line of code, completely different severity, and the only way to tell them apart is to trace it. v1.1.0 makes the model work that trace out before it commits to a label, which is exactly the kind of step interleaved thinking is built for.

Two things we corrected while we were in here

We used to split big scans across parallel sub-agents partly to keep from blowing the context budget. With a million tokens to play with, that's not the reason anymore, so we stopped pretending it was. The split still happens, because running category batches in parallel is faster and keeps one noisy batch from polluting the others. Same behavior, honest reason.
We also deleted a bunch of shouting. Early versions were full of all-caps "YOU MUST" instructions to stop older models from rambling or wandering out of scope. Opus 4.8 just follows plain instructions, so the caps lock was noise, and over-emphasis can actually make a model behave worse. We toned it down. The rules that actually matter (never print a secret in a report, never touch your code without asking first) are exactly as strict as they were. Those aren't going anywhere.

Grab it

One line in your terminal:
curl -fsSL https://snitchplugin.com/snitch.sh | sh
Or grab the zip from the home page, unzip, run the installer. It lands in ~/.claude/skills/snitch/. Then type /snitch in Claude Code and pick Deep Scan. Anything you scanned under v1.0.0 still stands; this is purely additive.

Run it on something real and want to talk through what it found? Email eric.waters@snitchplugin.com.

May 12, 2026

Mini Shai-Hulud is back: TanStack compromised. What Snitch users should do today.

Update, May 28, 2026
The advisories this post was waiting on are indexed in OSV now, so a current scan flags the compromised TanStack versions on its own, no action needed from us. The part worth re-reading is the bottom: the teams that got burned were the ones letting dependency bumps happen in the background. That hasn’t changed.
Aikido published a writeup this morning of a second wave of the Shai-Hulud-style npm worm. This one swept 83 package-version entries across the TanStack ecosystem: @tanstack/react-router, react-start, vue-router, solid-router, router-core, history, and others. The payload runs on install and scrapes GitHub tokens, npm tokens, GitHub Actions OIDC tokens, AWS credentials, Kubernetes service-account files, HashiCorp Vault tokens, environment variables, and anything else a worker process can read off disk. It then tries to publish itself onward through any npm package the victim has write access to.

The compromised versions

  • @tanstack/history: 1.161.9, 1.161.12
  • @tanstack/react-router: 1.169.5, 1.169.8
  • @tanstack/router-core: 1.169.5, 1.169.8
  • @tanstack/vue-router: 1.169.5, 1.169.8
  • @tanstack/solid-router: 1.169.5, 1.169.8
  • @tanstack/react-start: 1.167.68, 1.167.71
  • @tanstack/vue-start: 1.167.61, 1.167.64
  • @tanstack/solid-start: 1.167.65, 1.167.68
Aikido's table covers the full set of 83 entries, well beyond the ones above. Their post is the canonical list; if you maintain a TanStack project, open it and cross-reference your lockfile against the full set, not the abbreviated one above.

Are we (snitchplugin.com) compromised?

No. Every TanStack package resolved in our build sits below the malicious version range, and the resolution that produced those pins predates the attack by weeks. The dependencies running in production were not "the latest we could resolve before disclosure"; they were a snapshot taken well before the attacker had publish access. Until the registry confirms a clean replacement line, we hold dependency updates on the affected namespaces.
That's not luck. It's the same dependency discipline we recommend below, applied to ourselves.

What to do right now if you ship a TanStack app

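  1. Grep your lockfile. grep -E -A1 '@tanstack/(history|react-router|router-core|vue-router|solid-router|react-start|vue-start|solid-start)"' package-lock.json. The pattern has no leading quote so that it also matches the node_modules/@tanstack/... keys that lockfile v2 and v3 use for resolved packages, and -A1 prints the line after each key, which is usually the version. Cross-reference every version field against Aikido's table.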
  2. If you ran an install in the affected window (May 11 onward) on a machine that holds any credentials worth stealing, assume those credentials are exposed. Rotate npm tokens, GitHub PATs, AWS access keys, Kubernetes service-account tokens, and any Vault token the workstation or CI runner had access to. Do not wait for evidence of misuse; the malware exfiltrates on install, so the window between install and impact is whatever the attacker chooses.
  3. Check CI logs for unexpected Bun invocations, optional-dependency installation failures, or npm publish events you didn't trigger. Aikido flags those as the loudest signals.
  4. Drop your node_modules and your global npm cache (npm cache clean --force) before reinstalling. A poisoned tarball can be cached locally even after the registry pulls the version.
  5. Search the codebase and any disk image of an affected machine for router_init.js, tanstack_runner.js, and router_runtime.js. SHA-256 for the first two: ab4fcadaec49c03278063dd269ea5eef82d24f2124a8e15d7b90f2fa8601266c and 2ec78d556d696e208927cc503d48e4b5eb56b31abc2870c2ed2e98d6be27fc96. Also look for the dependency marker @tanstack/setup referenced as a GitHub URL.
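If grep output gets unwieldy, the lockfile check in step 1 can be sketched in a few lines of Python. This assumes an npm package-lock.json with lockfileVersion 2 or 3, where resolved packages live under "packages". The version table is only the abbreviated list from this post, so extend it from Aikido's table before relying on it. The function name is ours.

```python
import json

# Abbreviated list from this post. Aikido's table is the canonical set.
COMPROMISED = {
    "@tanstack/history": {"1.161.9", "1.161.12"},
    "@tanstack/react-router": {"1.169.5", "1.169.8"},
    "@tanstack/router-core": {"1.169.5", "1.169.8"},
    "@tanstack/vue-router": {"1.169.5", "1.169.8"},
    "@tanstack/solid-router": {"1.169.5", "1.169.8"},
    "@tanstack/react-start": {"1.167.68", "1.167.71"},
    "@tanstack/vue-start": {"1.167.61", "1.167.64"},
    "@tanstack/solid-start": {"1.167.65", "1.167.68"},
}


def find_compromised(lock):
    """Return sorted (name, version) pairs in the lockfile that match the table."""
    hits = set()
    # Keys look like "node_modules/@tanstack/history", possibly nested
    # under another package's node_modules directory.
    for path, meta in lock.get("packages", {}).items():
        name = path.rpartition("node_modules/")[2]
        version = meta.get("version")
        if version in COMPROMISED.get(name, ()):
            hits.add((name, version))
    return sorted(hits)


# Usage, from the project root:
#   with open("package-lock.json") as f:
#       print(find_compromised(json.load(f)))
```

An empty result only means none of the versions in the table are pinned. It says nothing about the entries the table leaves out.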

What Snitch catches here

Two things, and both are worth being specific about because the temptation in a fresh disclosure is to overclaim.
Snitch's SCA pass (Category 27) queries OSV.dev for every resolved package-version pair in your lockfile on every scan. As the TanStack advisories land in OSV, likely today or tomorrow, every subsequent snitch scan picks them up automatically. There is no Snitch update to wait for. If you scan a project right now with one of the compromised TanStack versions pinned, you may or may not see the advisory depending on whether OSV has indexed it yet. We do not maintain a parallel CVE database that races OSV; the right answer is the boring answer, and OSV is the boring answer.
The malicious-postinstall pattern itself, package scripts that fetch and execute remote payloads on install, is Category 5 in Snitch's methodology and has been since launch. That category fires on your own package.json: if you wrote a preinstall or postinstall that curls an external URL or executes downloaded JS, Snitch flags it as a supply-chain hazard. It does not, today, statically read every transitive dependency's install scripts. That's a known gap and one we're scoping work to close, but it would not have helped on this incident specifically: the TanStack payload was embedded in the published tarball as a regular import, not a postinstall hook, which means even a deep install-script scanner wouldn't have caught it on inspection alone. SCA backed by a coordinated registry response is what catches this class.
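To make the own-package.json check concrete, here is a minimal sketch of that kind of heuristic. The hook list and the regular expression are illustrative assumptions, not Snitch's actual Category 5 rules, and a pattern this crude will raise both false alarms and misses.

```python
import re

# npm lifecycle hooks that run during install.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")

# Hypothetical signals: a download tool, a remote URL, or output piped
# straight into a shell or interpreter.
RISKY = re.compile(r"\b(curl|wget)\b|https?://|\|\s*(sh|bash|node)\b")


def risky_install_scripts(package_json):
    """Return the install-time scripts in a parsed package.json that look risky."""
    scripts = package_json.get("scripts", {})
    return {
        hook: cmd
        for hook in INSTALL_HOOKS
        if (cmd := scripts.get(hook)) and RISKY.search(cmd)
    }
```

As noted above, a check like this reads only the scripts you wrote. It does not look inside dependencies.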

The bigger pattern, and the boring defenses that work

This is the second self-propagating npm worm in twelve months. The mechanism is consistent: phishing or token theft compromises a maintainer, the attacker publishes a poisoned version, the version executes on install and harvests every credential the install process can see, and the harvested credentials are used to publish onward. The defenses are also consistent and unglamorous.
The three architectural choices that kept us out of this incident are the ones we'd recommend to any team:
  1. Decouple deploy from dependency resolution. If git push on your default branch triggers a CI job that runs npm install against current-state npm and ships the result, the registry's worst minute is your production minute. A deploy that bundles a pre-built tree from a known-good node_modules, or at minimum runs npm ci against a committed lockfile, never npm install, puts time between an attacker's publish and your user's browser. That time is what lets disclosures land before exploits do.
  2. Treat the lockfile as the contract, not the manifest. Caret ranges in package.json describe intent; the lockfile describes reality. The team that audits their package.json diffs but lets the lockfile float on every npm install is auditing the wrong file. Pin dependencies intentionally, in PRs, with a human reading the lockfile diff.
  3. Make dependency bumps intentional, not ambient. npm update on a schedule, Dependabot auto-merging minor bumps, Renovate squashing seventy-five updates into a Friday PR: these patterns trade safety for ergonomics. The right cadence for security-sensitive dependencies (routers, auth, anything that touches credentials at install time) is "when there is a reason to bump, with the human who has that reason reviewing the lockfile diff in the PR." Background bumping is how the poisoned tarball ends up in your lockfile before anyone has read the changelog.
On top of that, two operational habits that don't show up in any architecture diagram but cut the blast radius when an incident does land: keep developer-workstation credentials short-lived (OIDC for cloud, scoped GitHub Apps with short token lifetimes, ephemeral CI runners), and audit your own preinstall and postinstall scripts, because they're the highest-leverage lines of code you own. A poisoned upstream is an external risk; a sloppy local install script is one you wrote yourself.
Provenance is not a substitute for any of this. Aikido's writeup is explicit: a valid provenance record on a poisoned package only proves the build pipeline ran; it does not prove the source is clean. Treat provenance as one signal among several, never as the load-bearing check.

Running a TanStack-based app and want a second pair of eyes on your lockfile and install scripts? Email eric.waters@snitchplugin.com.

May 11, 2026

Three new categories in the marketing audit: persuasion architecture, pricing psychology, retention psychology

Update, May 28, 2026
Cat 114 (persuasion architecture) turned out to be the one people run first, because the section score points you straight at the weakest part instead of handing you a pile of unranked fixes. All three categories are still in the marketing audit, unchanged.
Today we shipped v1.1.0 of Snitch's marketing audit skill. Three new categories, one new reference, and surgical additions to several existing ones. Here is what changed and why we did it.

What the marketing audit was missing

The existing marketing audit had broad coverage: technical SEO, schema, performance, content depth, on-page conversion, off-site channels, positioning. Over 110 categories at the surface level. What it did not have was a way to ask, "given everything that is on this page, does the persuasion architecture HOLD?" A site could pass every individual category, alt text in place, schema valid, CTAs present, and still convert poorly because the elements did not compose into anything that actually moved a visitor through the decision.
The retention side had the opposite problem. Acquisition surfaces got most of the attention because acquisition is the loudest part of the funnel. Activation, peak-end design on first success, switching cost construction, cancellation flow dignity, all of these lived in the negative space between existing categories. None of them owned a category.
Pricing display sat in a gap too. Cat 112 audits whether the brand charges the right amount for the right audience. That is a strategic question. The tactical questions (charm vs rounded match, decoy tier presence, anchoring order, Rule of 100 framing, mental accounting, strike-through provenance) were nowhere.

Cat 114: Persuasion architecture

A holistic audit that scores the whole persuasion surface 0-175 across seven sections: First Impression and Attention, Trust and Credibility, Motivation and Desire, Friction and Conversion, Emotional Resonance, Decision Support, Follow-Through and Retention. Each section scores 0-25 across four elements with a rubric. Each section cross-references the tactical categories that produce specific fixes.
The point is not the score. The point is the section breakdown. A site might score 117 / 175 (B) overall but only 11 / 25 on Section 5 (Emotional Resonance). That is the section to work on. The audit always recommends the lowest-scoring section first, regardless of which one it is. A site with strong trust and weak motivation does not need trust polish; it needs motivation work.
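The arithmetic is simple enough to sketch. The section names and the 0-25 range come from the description above; the function, the sample scores, and the tie-breaking rule (first section in order wins) are illustrative assumptions.

```python
SECTIONS = [
    "First Impression and Attention",
    "Trust and Credibility",
    "Motivation and Desire",
    "Friction and Conversion",
    "Emotional Resonance",
    "Decision Support",
    "Follow-Through and Retention",
]
MAX_PER_SECTION = 25


def summarize(scores):
    """Total the seven section scores and name the section to work on first."""
    for name in SECTIONS:
        if not 0 <= scores[name] <= MAX_PER_SECTION:
            raise ValueError(f"{name}: score must be 0-{MAX_PER_SECTION}")
    return {
        "total": sum(scores[name] for name in SECTIONS),
        "max": MAX_PER_SECTION * len(SECTIONS),  # 175
        # Lowest-scoring section is recommended first; ties go to the earlier one.
        "work_on_first": min(SECTIONS, key=lambda name: scores[name]),
    }


# Made-up scores that add up to the 117 / 175 example.
example = dict(zip(SECTIONS, [20, 19, 18, 17, 11, 16, 16]))
```

Calling summarize(example) gives a total of 117 and points at Emotional Resonance, the section scoring 11.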

Cat 115: Pricing psychology (tactical display)

Audits the mechanics of how prices are shown on the page. Are the price endings (.99 vs .00) matched to the brand positioning? Is there a decoy tier on a 3-plus tier pricing page? Is the highest tier shown first to anchor expectations, or is the cheapest tier first (which inverts the anchor)? Is the annual discount framed as a percentage or a dollar amount, and does the choice match Rule of 100? Is there a strike-through price, and if so does it have provenance (an original-date stamp, an MSRP reference, a source)?
Distinct from Cat 112, which is strategic. Cat 112 asks "are you charging the right amount". Cat 115 asks "given what you charge, are you displaying it in a way that converts".
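For the Rule of 100 check specifically, the usual statement of the rule is that a percentage looks bigger when the price is under 100, and an absolute amount looks bigger when it is over. A minimal sketch of that comparison, with a hypothetical function name and dollar formatting assumed:

```python
def preferred_discount_framing(price, discount):
    """Return the framing that shows the larger-looking number under the Rule of 100."""
    if price <= 0 or not 0 < discount <= price:
        raise ValueError("expected 0 < discount <= price")
    percent = discount / price * 100
    # Under 100, the percentage is the bigger number; at 100 or above,
    # the absolute amount is at least as big.
    if price < 100:
        return f"{percent:.0f}% off"
    return f"${discount:.0f} off"
```

So a 10 discount on a 50 plan reads better as "20% off", while a 100 discount on a 500 annual plan reads better as "$100 off".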

Cat 116: Retention psychology

Audits everything past the conversion event. Time to first win (under five minutes is the bar). Endowment leveraging in trials (can the customer create or save anything before paying). Peak-end design on first success (is there a celebration, a memorable moment, or just a plain checkmark). Switching cost construction (integrations, data accumulation, workflow customization). Exit and cancellation flow (pause, downgrade, support outreach, or a dead exit). Streak systems audited for ethical line: encouragement versus pressure.
This category flags ethical risks separately from optimization findings. Confirmshaming copy ("No thanks, I hate saving money") and dark cancellation flows (cancellation requires a phone call) are surfaced as ethical violations regardless of any conversion lift they may produce.

New reference: mental models catalog

Over 70 psychology and behavioral-design models, grouped by application: Foundational Thinking, Buyer Psychology, Influencing Behavior, Pricing Psychology, Design and Delivery, Growth and Scaling, plus current research on persuasive design. Loss Aversion, Anchoring, Decoy Effect, Peak-End Rule, Goal-Gradient, Zeigarnik Effect, Activation Energy, Switching Costs, all of them. Each entry lists the mechanism, a marketing application, and the audit categories that reference it.
The reference is loaded on demand from categories that need it, not pre-loaded into the skill prompt. That keeps the audit's token cost from ballooning. When the audit produces a finding that calls for a specific mental model, the model name appears in the fix narrative so the customer team understands the WHY, not just the WHAT.
The catalog opens with a psychology hierarchy: build genuine value before establishing credibility, before reducing friction, before creating motivation, before guiding decisions. Findings that recommend a motivation tactic without first auditing the friction underneath get flagged as out-of-order. Friction is fixed before scarcity is added. Trust is built before authority is invoked.

What customers do with this

Run snitch marketing on any site. The new categories surface in the full audit and are included in the B2B SaaS preset. The CLI fetches v1.1.0 on the next session; no install action needed. Reports generated under v1.0.0 remain valid; the new categories are additive.

Questions or want a walk-through of what Cat 114 produces on your own site? Email eric.waters@snitchplugin.com.

May 9, 2026

We got jailbroken. Here's what we added to Snitch so it doesn't happen to you.

Update, May 28, 2026
The Cat 15 coverage below has been live since the day we wrote this, and we haven’t been caught by that vector again. If you skim one paragraph, make it the one about running a cheap input gate before the expensive model. That’s the part that generalizes to any AI chat surface.
This one is short and honest: a user sent a jailbreak prompt to an AI product we operate, the model complied, produced content it should not have, and it cost us real money. We're writing this because we think transparency about what went wrong matters, and because we turned the incident into a concrete improvement in Snitch's methodology.

What happened

An AI-powered chat interface we run was sent a persona-override prompt. This class of attack establishes a fictional AI identity with "no restrictions," frames the real system prompt as an attack to be ignored, and then requests harmful output under a creative-fiction cover. The attack is long by design, elaborate by design, and targets the gap between what a system prompt instructs and what the model will actually do when someone works hard enough at the framing.
We won't describe the specific prompt, link to it, or explain how to find it. What we'll say is that the attack class is not new, it's been documented since 2023, and we didn't have a technical gate in front of the model to catch it. We had prose instructions in the system prompt that said to refuse. That's not enough.

Why it worked

There are two ways to enforce scope on an AI chat interface. One is to tell the model what to do in the system prompt. The other is to run a cheap, deterministic check on the user's message before it ever reaches the expensive model, and reject anything that looks like an attack. We were doing the first but not the second.
A well-crafted persona-override prompt is specifically designed to defeat the first approach. It tells the model that refusals are themselves attacks, that the system prompt is an injection to be dismissed, and that some higher identity or relationship supersedes the operator's instructions. A non-trivial fraction of model invocations, including this one, will comply. The only reliable defense is a gate that runs before the model sees the message at all.

What we added to Snitch

Category 15 (AI API Security) in Snitch's methodology now explicitly covers this. A Snitch security audit will flag any AI chat application that:
  • Relies solely on system-prompt prose for scope enforcement with no separate server-side gate.
  • Does not scan incoming user messages for persona-override signals before passing them to the model.
  • Sets user-message text limits high enough that a long persona-override prompt fits in a single turn.
  • Has no abuse log, meaning there is no forensic record of jailbreak attempts when an incident occurs.
Previously the category covered output handling, cost controls, and multi-turn escalation techniques (Crescendo, Many-Shot, Skeleton Key). Those were all there. The gap was the persona-override class, which is a different shape: it works in a single turn, it targets identity rather than escalating through turns, and it needs a synchronous input gate rather than multi-turn monitoring to catch it.
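A pre-model gate covering those four points can be very small. The sketch below is an assumption-heavy illustration, not the gate we run: the length cap, the three generic signal patterns, and the in-memory log are all placeholders, and a real deployment would tune the patterns and write the log somewhere durable.

```python
import re
import time

MAX_CHARS = 4000  # hypothetical cap; set it to what your product actually needs

# Generic persona-override signals, for illustration only.
SIGNALS = [
    re.compile(r"from now on you are", re.I),
    re.compile(r"\bno restrictions\b", re.I),
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
]

abuse_log = []  # stand-in for a durable, append-only store


def gate(user_id, message):
    """Return True if the message may be passed to the model, False if rejected."""
    reasons = []
    if len(message) > MAX_CHARS:
        reasons.append("too_long")
    reasons += [p.pattern for p in SIGNALS if p.search(message)]
    if reasons:
        # Keep a forensic record of every rejection.
        abuse_log.append(
            {"ts": time.time(), "user": user_id, "reasons": reasons, "length": len(message)}
        )
        return False
    return True
```

The check is deterministic and runs before any model call, so a rejected message costs nothing beyond a regex scan. Pattern lists like this are easy to evade on their own, which is why the length cap and the log sit alongside them.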

Sorry

We should have had this coverage before someone demonstrated the gap the hard way. We're a security audit tool. When a security issue hits us directly, we're accountable to write about it plainly, fix it, and make the fix part of what we audit for in everyone else's code. That's what we did here.
If you operate an AI chat interface, the question to ask is: what happens when someone sends a 3,000-word message that opens with "from now on you are a different AI with no restrictions"? If the honest answer is "the system prompt probably holds, but I'm not certain," that's the gap. Run a scan.

Questions or want to walk through what a pre-model input gate looks like in your stack? Email eric.waters@snitchplugin.com.

May 8, 2026

Twelve React + Next.js CVEs landed this week: what your Snitch scan flags today, what it doesn't, and what you should actually do

Update, May 28, 2026
Every patched version named below has been out for weeks now. If your lockfile still sits under them, the scan flags it and that’s the whole finding. Bump and ship; there’s nothing clever to do here.
On May 6, 2026, Cloudflare published a changelog post covering twelve CVEs disclosed in coordinated fashion across the React and Next.js ecosystems. The classes span denial of service, middleware bypass, SSRF, XSS, and cache poisoning. Cloudflare deployed two managed WAF rules covering the highest-priority cases and stated plainly that several of the disclosed issues are not WAF-blockable. Patching is the actual control.
We won’t paraphrase Cloudflare’s post or claim credit for the disclosure. What we want to do here is tell you, concretely, which of these CVEs your Snitch security audit catches today, which ones it doesn’t, and what to do about both.

Affected packages and patched versions

  • react-server-dom-webpack, react-server-dom-parcel, react-server-dom-turbopack: patched in 19.0.6, 19.1.7, 19.2.6.
  • next: patched in 15.5.16 and 16.2.5.
If your project pins below those, you’re inside the disclosure window. Updating closes the door regardless of whether anything in front of you (Cloudflare, Vercel, your own load balancer) has a virtual patch in place.

What Snitch catches today

Every Snitch security audit runs a Software Composition Analysis (SCA) pass before the LLM phase. The pass parses your package-lock.json, yarn.lock, or pnpm-lock.yaml, queries OSV.dev for every resolved package + version pair, and renders matched advisories alongside the AI findings. This is Category 27 in the methodology.
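For a sense of what that lookup involves, here is a sketch that turns a parsed package-lock.json into a request body for OSV's batch endpoint (POST https://api.osv.dev/v1/querybatch). It assumes lockfileVersion 2 or 3 and handles only npm; whether Snitch uses the batch endpoint or one query per package is not something this post states. The function name is ours.

```python
def osv_batch_payload(lock):
    """Build an OSV querybatch body from the resolved packages in a lockfile."""
    queries = []
    seen = set()
    for path, meta in lock.get("packages", {}).items():
        # "node_modules/a/node_modules/b" resolves to package "b".
        name = path.rpartition("node_modules/")[2]
        version = meta.get("version")
        # Skip the root project entry and anything without a resolved version.
        if not name or not version or (name, version) in seen:
            continue
        seen.add((name, version))
        queries.append(
            {"package": {"name": name, "ecosystem": "npm"}, "version": version}
        )
    return {"queries": queries}
```

Sending the payload and rendering the matched advisories is left out so the example stays offline.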
We confirmed against the live OSV API on May 8, 2026. A scan of a project with next@15.5.0 in its lockfile gets five advisories surfaced today, including the high-severity ones:
  • GHSA-9qr9-h5gf-34mp, RCE in the React flight protocol.
  • GHSA-ggv3-7p47-pfv8, HTTP request smuggling in rewrites.
  • GHSA-q4gf-8mx6-v5v3, denial of service via Server Components.
  • GHSA-9g9p-9gw9-jx7f, denial of service via the Image Optimizer remotePatterns.
  • GHSA-3x4c-7xq6-9pq8, unbounded next/image disk cache growth.
A scan of a project pinned at next@15.5.16 returns zero. That’s the control.
Most projects pulling react-server-dom-* get it transitively through next, which means the Next.js advisories cover them by association. Update Next.js, the transitive dep moves with it, the finding clears.

What Snitch doesn’t catch yet

Direct dependencies on react-server-dom-webpack, react-server-dom-parcel, or react-server-dom-turbopack below the patched versions are not yet in OSV as of this morning. We tested react-server-dom-webpack@19.0.5 and got zero advisories back from the OSV API. This is OSV’s lag, not Snitch’s. The advisories will land there in the next hours or days. When they do, every subsequent Snitch scan picks them up automatically.
We deliberately don’t maintain a hand-curated CVE table that races OSV. Adding a one-off rule for this disclosure means committing to maintain a parallel database forever, and the moment we miss one CVE the customer is worse off than the OSV-backed version. The right answer is also the boring answer: query OSV, and trust it.
If your project carries a direct react-server-dom-* pin and you want to act before OSV catches up, run your scan, then check the package version against the patched releases above by hand. There aren’t many projects with a direct pin, but if you’re one of them, this is the gap.

What about the static-code side?

Several of these CVEs are bug-class issues inside the framework itself: a parser that mishandles a particular protocol shape, a router that resolves a segment-prefetch path it shouldn’t, a Server Component path that doesn’t bound a resource. Static AI review of your code does not catch a bug inside the framework’s code. SCA is the right tool for that, and SCA already runs.
The exception is the application-level patterns the disclosure surfaced indirectly: SSRF via WebSocket upgrades, cache poisoning via inconsistent normalization, XSS via unsafe rendering of attacker-controlled HTML. Those are categories Snitch already covers in the LLM phase. Specifically, Cat 12 (SSRF), Cat 7 (XSS), and the cache-key handling guidance under Cat 39 will fire on a vulnerable application pattern even if your Next.js version is up to date. The framework patch is necessary; it isn’t always sufficient.

What you should do this week

  1. Bump next to 15.5.16 or 16.2.5, whichever line you’re on. npm i next@15.5.16 or npm i next@16.2.5, run your test suite, deploy.
  2. If you have a direct dependency on any react-server-dom-* package, bump it to the matching patched version (19.0.6, 19.1.7, or 19.2.6).
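  3. Run snitch scan against the working tree. Confirm the SCA section is clean. If you’re still seeing advisories on a freshly updated lockfile, the lockfile didn’t actually move. Re-run npm install rather than npm ci, which installs from the existing lockfile without updating it; on yarn or pnpm, install without the --frozen-lockfile flag.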
  4. If you’re behind Cloudflare, the two managed WAF rules are auto-enabled. Don’t treat them as a substitute for the package update. Cloudflare said as much.
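For the by-hand check on a direct react-server-dom-* pin, a small comparison like the following may help. It is a sketch that assumes plain x.y.z version strings (no pre-release tags); the patched versions are the ones listed in the steps above, and the function names are hypothetical.

```typescript
// Patched versions are taken from the post; function names are hypothetical.
const PATCHED_RSC = ["19.0.6", "19.1.7", "19.2.6"];

// Assumes a plain "x.y.z" string; pre-release suffixes are not handled.
function parseVersion(v: string): [number, number, number] {
  const [major, minor, patch] = v.split(".").map((n) => parseInt(n, 10));
  return [major, minor, patch];
}

// True when `installed` sits on a listed release line at or above its fix.
// Release lines that are not listed return false, meaning "check by hand".
function isPatched(installed: string, patched: string[]): boolean {
  const [maj, min, pat] = parseVersion(installed);
  return patched.some((p) => {
    const [pMaj, pMin, pPat] = parseVersion(p);
    return maj === pMaj && min === pMin && pat >= pPat;
  });
}
```

The same function works for the Next.js lines, for example isPatched("15.5.0", ["15.5.16", "16.2.5"]). It deliberately treats an unknown release line as unpatched rather than guessing.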

Running a Next.js or React Server Components stack and want a second pair of eyes on the application surface? Email eric.waters@snitchplugin.com.

April 28, 2026

GitHub Enterprise Server RCE (CVE-2026-3854): when push options become a header-injection vector

Update, May 28, 2026
This CR/LF-into-headers shape is Category 72 in the catalog, and it’s one of the sink patterns Deep Scan now traces from sink back to source in v1.1.0. The CVE itself was GitHub’s to fix, but the class shows up in ordinary app code more than people expect.
GitHub published GHSA-64fw-jx9p-5j24 (CVE-2026-3854) today. CVSS 8.7. The summary, in one sentence: any user with push access to a repository on a vulnerable GitHub Enterprise Server appliance could send a crafted git push --push-option value, smuggle a delimiter character into an internal service header, and execute arbitrary commands on the GHES backend.
GitHub.com itself was server-side mitigated on March 4, 2026; no customer action is required for repositories hosted at github.com. The patch for self-hosted GHES dropped the same day across the supported lines: 3.14.25, 3.15.20, 3.16.16, 3.17.13, 3.18.7, and 3.19.4. If you run any version of GHES below those, your appliance is exposed to authenticated RCE from anyone you grant push access to, including untrusted contributors on public repositories you mirror or fork inbound.

What the bug actually is

Git’s --push-option mechanism lets a client send arbitrary string values along with a push, which the server can route to internal hooks. GHES used those option values to build the value of an internal HTTP header that drove a downstream service call. The header was constructed by string concatenation with a delimiter character that the option value could also contain. A push option carrying that delimiter therefore injected an extra metadata field into the header, and the receiving service trusted that field enough to escalate the request into command execution.
This is CWE-93 (improper neutralization of CRLF sequences), family CWE-74. The same shape that produced HTTP response splitting in 2006, email header injection in countless web mailers, and CRLF log forging in every access logger that didn’t escape user input. The vector changes (sometimes it’s a query parameter, sometimes a cookie, sometimes a push option), the bug class is identical: user input concatenated into a field built with a known delimiter, no escaping on the delimiter char, escalation through whatever protocol consumes the resulting line.
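The bug class can be shown with a toy format. This is a sketch only: the ";" delimiter, the key=value layout, and every name below are assumptions made for illustration, not the actual GHES header format, which the text quoted here does not specify.

```typescript
// Hypothetical format: "key=value" fields joined by ";".
// The real GHES header layout and delimiter are NOT described in this post.
const DELIM = ";";

// The vulnerable shape: values are concatenated with no delimiter check.
function buildFieldsUnsafe(fields: Record<string, string>): string {
  return Object.entries(fields)
    .map(([k, v]) => `${k}=${v}`)
    .join(DELIM);
}

// What a trusting downstream consumer might do with the resulting line.
function parseFields(line: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const part of line.split(DELIM)) {
    const eq = part.indexOf("=");
    if (eq > 0) out[part.slice(0, eq)] = part.slice(eq + 1);
  }
  return out;
}

// The fix: refuse any key or value that carries the delimiter or CR/LF.
function buildFieldsSafe(fields: Record<string, string>): string {
  const bad = (s: string) => s.includes(DELIM) || /[\r\n]/.test(s);
  for (const [k, v] of Object.entries(fields)) {
    if (bad(k) || k.includes("=") || bad(v)) {
      throw new Error(`Illegal character in field "${k}"`);
    }
  }
  return buildFieldsUnsafe(fields);
}
```

With the unsafe builder, a single user-supplied value such as "ci-skip;role=admin" parses back as two fields, which is the extra-metadata-field effect described above. Rejecting is usually safer than escaping here, because both sides of the protocol would have to agree on the escape rules.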

Why this still ships in 2026

The honest answer is that header construction is everywhere AI-generated code reaches. Webhook dispatchers. Reverse proxies on Cloudflare Workers. Custom auth middleware that builds an Authorization header from a token field on the request. API-gateway code that copies X-Forwarded-User straight from inbound to outbound. LLMs reach for string concatenation by default. Sanitization is a habit a careful reviewer adds, not a default the model inserts. Modern fetch implementations (undici on Node, the Workers runtime, Deno) reject CR/LF in header values at the wire, which catches the most obvious cases. They don’t catch protocol fields that aren’t HTTP headers (the GHES bug used an internal header format the runtime didn’t police), and they don’t catch log-line injection at all.
It’s also worth giving GitHub credit. The vulnerability was reported through their Bug Bounty program, fixed on github.com within hours of confirmation, fixed across all supported GHES lines on the same day disclosure was assigned, and disclosed publicly with full advisory text and a coordinated patch window. That’s the playbook. The bug existed; the response was clean.

What Snitch flags, starting today

We shipped Category 72: HTTP / protocol header injection in v7.5.0 today, triggered directly by this CVE. It catches the pattern across the surfaces we see most often in AI-written code:
  • Outbound headers built from request input without CR/LF stripping (headers: { "X-Foo": req.body.x }).
  • Webhook dispatchers and reverse proxies that copy inbound headers straight to outbound calls.
  • Custom auth middleware that interpolates user-controlled values into Authorization or session headers.
  • Build-tool wrappers that take CI variables and stuff them into internal protocol fields (the literal GHES shape).
  • Logging shims writing user input to access logs without escaping (lower severity, same family).
The category ships with a reference sanitizer, the same one we already use in packages/app/src/emails/sendEmail.ts: strip CR, LF, tab, and control chars, replace with a space, trim, clamp to RFC line length. Apply it at the boundary, before the value reaches the header builder. False positives are suppressed when the value flows through a known sanitizer or a typed schema (Zod, Pydantic) that rejects strings containing CR/LF.
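A sanitizer along the lines described above could look roughly like this. It is a sketch, not the code in sendEmail.ts: the function name is hypothetical, and the 998-character clamp assumes "RFC line length" means the RFC 5322 line limit, which the post does not spell out.

```typescript
// Assumption: 998 is the RFC 5322 line-length limit (excluding CRLF).
// The limit Snitch's reference sanitizer actually uses is not stated here.
const MAX_HEADER_VALUE_LENGTH = 998;

function sanitizeHeaderValue(
  input: string,
  maxLength: number = MAX_HEADER_VALUE_LENGTH,
): string {
  return input
    // CR, LF, tab, every other C0 control char, and DEL become one space per run
    .replace(/[\u0000-\u001f\u007f]+/g, " ")
    .trim()
    .slice(0, maxLength);
}
```

Calling it at the boundary means something like headers["X-Foo"] = sanitizeHeaderValue(req.body.x), so the header builder never sees a raw value.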

What to do

If you run GitHub Enterprise Server, upgrade to one of the patched releases now: 3.14.25, 3.15.20, 3.16.16, 3.17.13, 3.18.7, 3.19.4. The vector requires push access, which on most enterprise installations is internal, but if you accept community contributions or mirror public repositories inbound, the trust boundary is wider than it looks. GitHub.com itself needs nothing.
If you write or review code that builds HTTP headers from request input, run Snitch’s Category 72 against your repo. The CLI, the GitHub Action, and the Plugin all carry the new rule starting today. The Plugin’s update arrives the next time the AI loads the skill; the CLI is on npm; the GitHub Action is at /action and runs on every PR.

References

Found a similar pattern in your stack and want a second opinion, or have an incident report we should cover next? Email eric.waters@snitchplugin.com.

April 28, 2026

When the security vendor becomes the supply-chain attack: a review of the Checkmarx + Bitwarden incidents, and what AppSec tools should structurally do about it

Update, May 28, 2026
The news cycle moved on; the argument didn’t. A security vendor sitting in your build path is still a dependency, and “but it’s a security tool” was never a threat model. The structural defenses in the back half are how we’ve built Snitch ever since: your source never leaves your machine, and there’s nothing of ours in your runtime to compromise.
Between March 23 and April 25, 2026, the AppSec vendor Checkmarx suffered the worst kind of compromise a security company can suffer: their own publishing pipeline was used to ship malware to their customers’ developer machines. Two weeks earlier, the same threat group used a compromised CI runner to push a poisoned version of the Bitwarden CLI through a separate trust chain. Both incidents share a structural lesson that’s embarrassing for almost every hosted security tool on the market: the AppSec vendor itself is a target, and a successful compromise of that vendor often hits the customers’ CI runners more directly than any vuln in the customers’ own code ever would.
This post walks through what actually happened (it’s more nuanced than “Checkmarx leaked customer code”, which they didn’t), why the supply chain matters more than the scan results in incidents like this, and what we did to Snitch’s own architecture this week to make sure the equivalent attack against us would have a much smaller blast radius. It’s also a public threat model: we don’t pretend we’re unattackable, we just made sure the worst case is bounded.

TL;DR

  • What happened: A threat actor (the same group behind the earlier Trivy and LiteLLM compromises) stole CI credentials from Checkmarx, pushed malware into their public KICS Docker image and two of their VS Code extensions, and dumped Checkmarx’s internal source code on the dark web.
  • What was NOT compromised: Customer scan data. Checkmarx confirmed publicly that customer source code is never stored in their internal GitHub repos.
  • What WAS compromised: Customers who pulled the poisoned Docker image (~28 minute window) or the poisoned VS Code extension versions had credential-harvesting code execute on their developer machines and CI runners.
  • The Bitwarden CLI angle: A separate but related attack used a compromised CI runner with valid credentials to publish a poisoned Bitwarden CLI build, exfiltrating secrets from the runners that pulled it.
  • The structural problem: Hosted security tools auto-update their published artifacts. When their pipeline gets popped, every customer pulling the latest version is the target.
  • What Snitch did this week: Shipped v7.4.0 with a runtime egress allowlist, build-time URL audit, SLSA build provenance, npm publish provenance, recommended SHA pinning over @v1, and published a full threat model. The result: even if our publish credentials get popped tomorrow and a malicious version of the Action ships to every customer, the runtime allowlist throws on any attempt to exfiltrate data to an attacker host.

Timeline of the Checkmarx incident

The chain unfolded over five weeks. Each step is independently documented in vendor and press disclosures (links at the end of this post):

  • March 23, 2026: The TeamPCP threat actor group used CI credentials stolen via earlier Trivy and LiteLLM supply-chain attacks to compromise Checkmarx’s GitHub Actions workflows.
  • March 30, 2026: Attackers exfiltrated GitHub repository data: source code, API keys, employee credentials. Customer scan data was not in scope.
  • April 22, 2026: Second wave: poisoned Docker Hub tags on the public KICS image (Checkmarx’s open-source IaC scanner) and two VS Code extensions (ast-results 1.17.0 and Developer Assist 1.19.0) shipped with credential-harvesting payloads. The KICS image was malicious for ~28 minutes; the GitHub Action that depends on it was compromised for ~84 minutes; the VS Code extensions were patched in version 1.18.0.
  • April 25, 2026: The exfiltrated GitHub data was posted on the dark web by LAPSUS$, who claimed credit alongside the original TeamPCP group.

Checkmarx’s communications throughout were unusually forthright by industry standards: dated security updates, scope disclosures, and an explicit claim that customer scan content was never accessed because it’s never persisted in the compromised repositories. We take them at their word here. The framing of “they leaked our scans” that’s floating around isn’t accurate.

What did happen is arguably worse for the AppSec category as a whole: customers who trusted Checkmarx’s publishing pipeline (Docker Hub, VS Code Marketplace) had attacker code running on their dev machines and CI runners. The vendor’s tools became the attack vector.

The Bitwarden CLI angle: a runner credential is enough

During the same window, the Bitwarden CLI was compromised through an adjacent but separate vector: a CI runner with stolen credentials published a malicious version of the official @bitwarden/cli package. Anyone whose CI pipeline or dev environment pulled the latest version had their secrets exfiltrated to an attacker-controlled endpoint.

The lesson is the one that’s easiest to dismiss because it sounds basic: a single compromised runner credential, not a zero-day, not a model error, not a social engineering campaign against the CISO, is enough to weaponize an entire publishing chain. Once an attacker can publish under your name, your “trusted” auto-update is their attack delivery mechanism.

For tools the customer’s CI pulls and runs (SAST scanners, secret managers, security audit GitHub Actions), this is the whole game. The customer’s code doesn’t have to have a single vulnerability. The customer’s identity and access systems don’t have to be misconfigured. The customer’s developers don’t have to click anything suspicious. They just have to keep using the tool they paid for, on the version the tool’s vendor said was the latest.

Why “hosted security tool” is structurally a target

Three properties make AppSec tools especially juicy targets:

  1. They run with elevated trust on customer machines. A SAST scanner needs to read every file in your repo. A secret scanner needs to grep through environment variables. An IaC scanner needs to evaluate your cloud configuration. The whole product’s value depends on having privileged access to the things attackers most want.
  2. They auto-update. The pinning hygiene that customers do for application dependencies (npm ci, lockfiles, Renovate) typically isn’t applied to security tools, where customers want the “latest threat detection.” @v1 on a GitHub Action force-updates to whatever the publisher pushes today.
  3. They run in CI environments with broad credentials. The CI runner that scans your code also has access to your AWS deploy role, your npm token, your Stripe webhook secret. A scanner that decides to phone home is in a privileged position to phone home with a lot.

Combine those three and you have an attack class where compromising the vendor is more efficient than compromising any individual customer. The Checkmarx incident is the textbook example. Bitwarden CLI is a smaller-scale demonstration of the same mechanic. There will be more.

What Snitch did this week (v7.4.0)

We had three of the four structural defenses in place before this week (BYO-key, source-never-leaves, minimal Action permissions). The Checkmarx incident pushed us to ship the fourth, and we did this week as part of the v7.4.0 release. Here’s the full stack now:

1. Runtime egress allowlist (NEW in v7.4.0)

The single most important defense. The very first thing the Snitch GitHub Action and CLI now do, before any scan code runs, is install a wrapper around the global fetch function. The wrapper checks every outbound request against an allowlist of hostnames Snitch knows it needs to talk to:

// from packages/snitch-github/src/egress.ts (excerpt)
const SNITCH_HOSTS = new Set<string>([
  "snitchplugin.com",
  "api.osv.dev",
  "api.github.com",
  "uploads.github.com",
  // ... GitHub Actions internal services
]);
// Plus the customer-selected AI provider host (and only that one).

Any request to a host not on this list throws a SnitchEgressBlocked error and surfaces as a loud ::error:: in the workflow log. A poisoned transitive dependency that tries to call out to attacker.example.com (or even to a different AI provider than the one the customer selected) cannot complete the request. The customer sees the attempt the moment it happens.

This is the structural answer to the Bitwarden incident. Even with attacker code running inside our published package, the attacker has to reach out to their own infrastructure to get any value from it, and the egress check stops them at that gate.
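To make the mechanism concrete, here is a minimal sketch of a fetch guard. Only the idea comes from the excerpt above; the class name, function names, and the simplified string-only fetch signature are assumptions, and the real egress.ts may differ in every detail.

```typescript
// Hypothetical sketch of an egress guard; not Snitch's actual egress.ts.
class EgressBlockedError extends Error {
  readonly host: string;
  constructor(host: string) {
    super(`Blocked outbound request to ${host}`);
    this.name = "EgressBlockedError";
    this.host = host;
  }
}

// Exact-match check: "api.osv.dev.attacker.example.com" does not pass
// just because it starts with an allowed name.
function assertAllowedHost(url: string, allowed: ReadonlySet<string>): void {
  const host = new URL(url).hostname.toLowerCase();
  if (!allowed.has(host)) throw new EgressBlockedError(host);
}

// Simplified signature: the real fetch also accepts URL and Request objects.
type FetchLike = (url: string, init?: unknown) => Promise<unknown>;

function guardFetch(baseFetch: FetchLike, allowed: ReadonlySet<string>): FetchLike {
  return async (url, init) => {
    assertAllowedHost(url, allowed); // throws before any network I/O happens
    return baseFetch(url, init);
  };
}
```

The guard only helps if it is installed before any other module captures a reference to the original fetch, which is why running it first matters. A production version would also need to cover other network APIs a dependency could reach for.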

2. Build-time URL audit

Defense in depth. Before every Action publish, our publish script runs scripts/audit-dist-urls.sh, which greps the bundled dist/index.js for every URL it contains and fails the publish if any URL is on a hostname not on our allowlist. A poisoned transitive dep gets caught at publish time instead of at customer runtime.
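The same audit can be sketched in TypeScript. Note the swap: the post's real check is a shell script that greps the bundle, while this version is an illustrative equivalent with a hypothetical function name and a deliberately simple URL pattern.

```typescript
// Hypothetical TypeScript equivalent of a grep-based bundle URL audit.
// Returns every URL in the bundle whose hostname is not on the allowlist.
function findDisallowedUrls(bundle: string, allowed: ReadonlySet<string>): string[] {
  // Simple pattern: stops at whitespace, quotes, backticks, angle brackets,
  // closing parens, and backslashes. URLs assembled at runtime are not found.
  const urlPattern = /https?:\/\/[^\s"'`<>)\\]+/g;
  const offenders: string[] = [];
  for (const match of bundle.match(urlPattern) ?? []) {
    let host: string;
    try {
      host = new URL(match).hostname.toLowerCase();
    } catch {
      offenders.push(match); // unparseable URLs fail closed
      continue;
    }
    if (!allowed.has(host)) offenders.push(match);
  }
  return offenders;
}
```

A publish script would fail the release when the returned array is non-empty. Because string literals can be split or encoded to dodge a text scan, a check like this complements the runtime allowlist rather than replacing it.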

3. SLSA build provenance

The snitchplugin/snitch-github-action mirror runs a GitHub Actions workflow on every release that uses actions/attest-build-provenance to attach a Sigstore-signed claim that the dist/ bundle was built from a specific source commit on a specific GitHub-hosted runner. Customers verify with:

gh attestation verify dist/index.js --owner snitchplugin

Same for the CLI on npm: npm publish --provenance attaches a verifiable claim. Customers verify with npm audit signatures.

4. SHA pinning, recommended in the install snippet

The dashboard YAML template now recommends pinning to a specific commit SHA for security-sensitive repos:

# For supply-chain hardening (recommended for security-sensitive repos),
# pin to a specific commit SHA instead of @v1:
- uses: snitchplugin/snitch-github-action@<40-char-sha>  # vX.Y.Z

Customers on @v1 auto-update. Customers on a SHA opt into immutable, vetted versions. If our publish credentials get popped, SHA-pinned customers are completely unaffected.
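If you want to check your own workflows for moving tags, a line-based scan is enough for a first pass. This is a sketch with a hypothetical function name; it uses a regular expression rather than a YAML parser, so unusual formatting can slip past it.

```typescript
// Hypothetical helper: list every `uses:` reference in a workflow file that
// is not pinned to a full 40-character commit SHA. Line-based, not a YAML parser.
function findUnpinnedActions(workflowYaml: string): string[] {
  const unpinned: string[] = [];
  for (const line of workflowYaml.split("\n")) {
    const match = line.match(/^\s*(?:-\s*)?uses:\s*([^\s#]+)/);
    if (!match) continue;
    const ref = match[1].replace(/^["']|["']$/g, ""); // drop surrounding quotes
    // Local actions and docker images are not version-tagged the same way.
    if (ref.startsWith("./") || ref.startsWith("docker://")) continue;
    if (!/@[0-9a-f]{40}$/.test(ref)) unpinned.push(ref);
  }
  return unpinned;
}
```

Anything it returns, such as a reference ending in @v1, is a tag the publisher can repoint later.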

5. Minimal Action permissions

Our action.yml requests only contents: read, pull-requests: write, security-events: write, statuses: write, and models: read. We do not request actions: write (can’t modify your CI configs), id-token: write (can’t mint OIDC tokens against AWS / GCP), or packages: write (can’t publish to your registry). Even if a malicious version of our Action runs, the maximum blast radius is “comment on a PR, write a SARIF, set a check status”, not arbitrary code execution against your cloud accounts.

6. Source never leaves your machine

Snitch’s servers do not have, and structurally cannot have, the contents of any file you scanned, the findings produced by any scan, or your repo / org / PR identifier (we hash it client-side before sending). The AI inference happens on your runner against your own provider key. We never see your prompts or responses. So even in the imaginary worst case where Snitch gets popped, our customer database does not contain anything that maps back to your source.
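Client-side hashing of the repo identifier could be as simple as the following sketch. The function name, the choice of SHA-256, and the lowercase owner/repo canonical form are all assumptions; the post only says the identifier is hashed before sending.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: hash "owner/repo" on the runner so the server only
// ever receives the digest. The algorithm and canonical form are assumptions.
function hashRepoIdentifier(owner: string, repo: string): string {
  const canonical = `${owner}/${repo}`.toLowerCase();
  return createHash("sha256").update(canonical, "utf8").digest("hex");
}
```

One caveat with any plain hash: a public repository name is guessable, so someone who already suspects a name can confirm it by hashing the guess. Whether a salt or key is mixed in is not stated in the post.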

What happens if our publish credentials get compromised

Realistic worst case: an attacker with our npm token + GitHub publish credentials pushes a malicious version of the Action to the @v1 tag.

  1. Customers on @v1 get the malicious code on their next CI run.
  2. The runtime egress allowlist catches any attempt to exfiltrate to an attacker host. Their CI run logs show ::error::[snitch-egress] Blocked outbound request to <host>.
  3. The Action’s permissions cap the damage: the worst the malicious code can do is comment on a PR or set a check status, not exfiltrate secrets to an attacker’s server.
  4. SHA-pinned customers are completely unaffected.
  5. We rotate credentials, force-revert @v1, post an incident notice on snitchplugin.com/changelog within 24 hours, and email every active customer.

The full threat model lives at /docs/security. It’s public; read it before you trust us with anything.

Lessons for buyers of AppSec tooling

If you’re evaluating any security scanner that runs in your CI or pulls from your developers’ machines, ask the vendor these five questions before you buy:

  1. What hostnames does your tool talk to at runtime, and is there a runtime check that enforces it? If the answer is “trust us” or “the docs are accurate,” that’s not enforcement. The Bitwarden incident proves that the gap between “the docs say we don’t” and “the runtime can’t” is the gap an attacker drives a truck through.
  2. How is your published artifact signed? Sigstore / SLSA provenance / npm provenance / GitHub attestations are all reasonable answers. “We use a strong CI password” is not.
  3. What permissions does your tool require? If it asks for id-token: write or actions: write and isn’t a deployer, push back. Most security tools don’t need either.
  4. Where does my source code go? If the answer involves your code crossing the vendor’s infrastructure, treat the vendor as a single point of failure for the confidentiality of every codebase you point it at.
  5. Can I pin to a specific version that you cannot retroactively replace? If the only available pin is a moving tag, you’re subscribing to whatever the vendor pushes next.

Snitch’s answers to all five live in our public threat model. We don’t expect every vendor in this category to adopt the same defenses, but we do expect every buyer to ask the questions.

Where to learn more

Snitch documentation: /docs/security (full threat model), /docs/iac (the new IaC scanner), /docs/dependencies (SCA), /docs/dead-code (DCA), /changelog (v7.4.0 release notes).

Vendor disclosures and press coverage of the Checkmarx incident:

The Snitch GitHub Action is at /action. Free with GitHub Models, runs in your runner. Snitch's servers never receive your code, and we don't store anything about it. The egress allowlist ships with every release starting v7.4.0.

Have a security concern, an incident report, or a tip on a vendor we should look at? Email eric.waters@snitchplugin.com.

April 16, 2026

Auditing a real app: what Snitch found in vibeHealth

Update, May 28, 2026
vibeHealth is the codebase behind the sample report on the home page, and it’s what we point Deep Scan at when we’re testing. Every finding below still reproduces, which makes it a fair checklist of what AI-generated code tends to get wrong.
We pointed Snitch at vibeHealth, a Next.js healthcare app written almost entirely with AI assistance. Seven findings in four categories. Two of them were severity Critical. Here’s what the scan returned, what Snitch does with findings like these, and why this kind of audit needs to exist in the age of vibe coding.
7 findings
2 critical
4 / 8 new categories hit
0 false positives

What the scan found

Four representative findings from the run:

  • Critical: Cloud metadata SSRF (app/api/appointments/route.ts:47)

    An endpoint fetched a user-supplied callback URL with no validation. With @aws-sdk/client-s3 in the stack, an attacker could redirect that fetch at the AWS metadata endpoint and exfiltrate IAM credentials.

  • Critical: Malicious install-script pattern (package.json:6)

    The preinstall script silently curled an external URL and swallowed failures. If that host is ever compromised or name-squatted, every developer and CI machine runs whatever it returns, with failure hidden behind `|| true`.

  • High: ReDoS in email validator (app/api/validate/route.ts:6)

    An email regex used nested quantifiers, the classic shape for catastrophic backtracking. One crafted request can pin the event loop and stall the process.

  • High: Prototype pollution via user JSON (app/api/validate/route.ts:23)

    Object.assign was given a JSON-parsed user object. That path invokes the __proto__ setter, polluting Object.prototype for every other request the worker handles.
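For the first finding, the usual fix is to validate the callback URL before fetching it. The sketch below is one possible guard, not the fix Snitch generated for vibeHealth; the function name and the exact block list are assumptions.

```typescript
// Hypothetical guard for a user-supplied callback URL. Returns true when the
// URL should be refused. The block list is illustrative, not exhaustive.
function isBlockedCallbackUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return true; // not a valid absolute URL
  }
  if (url.protocol !== "https:") return true;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host === "metadata.google.internal") return true;
  if (host.startsWith("[")) return true; // IPv6 literals refused outright here

  const m = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (!m) return false; // a DNS name; checking what it resolves to is out of scope

  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) || // link-local, includes 169.254.169.254
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}
```

A string check like this does not stop a hostname that resolves to an internal address, or a redirect to one. A fuller defense also validates the resolved IP and disables or re-checks redirects.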

What Snitch does with a finding

Every finding in the report carries the same payload: file path, line number, the exact code evidence, the risk in plain language, a concrete fix, and CWE, OWASP Top 10:2025, and CVSS 4.0 tags. No hand-waving. If Snitch cannot show you the vulnerable line, it does not report it; that rule is baked into the skill itself.

After the report is displayed, Snitch offers to fix findings one by one or in a batch. Scanning and fixing are always two phases. The scan is read-only. Nothing touches your files until you pick a fix and confirm it.

Why Snitch exists

AI coding tools write most of the code now. They’re fast, they’re capable, and they’re good enough that the demo always works. But demos aren’t the failure mode. The failure mode is the webhook that doesn’t verify signatures, the callback URL that is never validated, the regex that looks like email validation but is actually a denial-of-service switch.

Traditional scanners drown you in 500 findings. Ad-hoc AI review prompts miss the structural classes of bugs. Snitch sits in the middle: 68 structured categories, evidence-first reporting, contextual false-positive suppression. First line of defense for code that was vibe-coded into existence in the first place.

One audit, every model you vibe with

Vibe coding isn’t one model anymore. Developers reach for whichever frontier model is best at the task in front of them. Snitch runs inside all of them:

  • Claude Opus 4.7, inside Claude Code
  • OpenAI Codex 5.4, inside Codex CLI
  • Gemini 3.1 Pro, inside Gemini CLI

Same catalog, same evidence requirements, same fix flow, whichever model you’re pairing with. The vibeHealth audit you just read was executed by Claude Opus 4.7. The exact same scan, run by Codex 5.4 or Gemini 3.1 Pro, produces the same findings.

Snitch is a security audit plugin that runs inside your AI coding tool. 72 categories, evidence for every finding, works with 30+ tools. Get it at /plugin.

April 16, 2026

Snitch 7.1: 8 new security categories

Update, May 28, 2026
The catalog has moved well past 7.1. It’s 67 categories at v8.1.0 now, after we folded the per-vendor SaaS checks into a single detect-then-audit category. The eight below are all still here, just renumbered.
Snitch 7.1 is live. Eight new security categories, bringing the catalog to 68, rolled out to every plan. If you have Snitch installed today, the upgrade is one command.

What’s new

These target the classes of bugs we kept seeing in audits without a dedicated check for them:

  • ReDoS, regex patterns that hang your server on crafted input
  • Prototype Pollution, __proto__ and deep-merge attacks through user JSON
  • JWT Algorithm Attacks, signature bypass and algorithm confusion
  • Cloud Metadata SSRF, outbound fetches that leak AWS / Azure / GCP credentials
  • Insecure Deserialization, Python pickle, Java object streams, Ruby Marshal, PHP unserialize, unsafe YAML
  • Typosquatting & Install Scripts, lookalike package names and suspicious postinstall hooks
  • Type Coercion Bypasses, loose equality in auth paths, non–constant-time password comparisons
  • Agent Prompt Injection, RAG and tool-use patterns that let untrusted data steer the model

Every category ships with the same evidence requirements as the rest of Snitch: file path, line number, the exact code, a fix, and OWASP / CWE tags. No hand-wavy warnings. If Snitch can’t show you the vulnerable line, it won’t report it.

Every plan gets every category

Free, Base, Pro, and Enterprise all include the full catalog. Upgrade if you need more rulesets, more projects, or higher limits, not for category access.

How to upgrade

Check your original purchase email and run the install command you received. It pulls the latest version automatically:

curl -sL https://snitchplugin.com/x/YOUR_TOKEN.sh | sh
Prefer a manual upgrade? Open the license link from your purchase email; it has a fresh ZIP and the install command for your token. Re-run install.sh in the same directory and your custom rules and config stick around.

New here? Snitch is a security audit plugin that runs inside your AI coding tool. 72 categories, evidence for every finding, works with 30+ tools. Get it at /plugin.

March 2026

Why we built Snitch

Update, May 28, 2026
Still true, and more so. AI shipping code faster than anyone can review it was the whole reason for this, and the gap has only widened since. Snitch is several catalog versions on from where it started, but the premise is the same.
AI coding tools write most of the code now. They’re fast, they’re good, but they learned from tutorials that cut corners. The code compiles, the tests pass, and the demo looks great. But nobody checked whether the webhook verifies signatures, whether the session token expires, or whether the admin route actually checks permissions before returning data.
We built Snitch because someone needs to check the code after the AI writes it. Not with a 500-finding scanner that drowns you in noise, but with structured categories that produce findings you can actually act on. 72 categories, evidence for every finding, works inside the tool you’re already using.
The goal was never to replace security teams. It was to catch the things that slip through when code gets written faster than anyone can review it. Snitch is the first pass, the one that makes sure the obvious stuff doesn’t ship.

April 2026

30+ tools and the Agent Skills standard

Update, May 28, 2026
The standard held up. Snitch installs the same way across Claude Code, Codex, Cursor and the rest, and the one-line installer on the home page is what that buys you. Write the skill once, run it everywhere; that’s still the point.
When we started, Snitch worked with 14 tools. Each one had its own way of loading skills: different directories, different formats, different conventions. We maintained separate configs for Claude Code, Cursor, Copilot, and every other tool that wanted to integrate. It worked, but it didn’t scale.
14 → 1 integrations
30+ tools supported
Then the Agent Skills open standard happened. Now a single SKILL.md file works across 30+ AI coding tools. We adopted the spec, and suddenly Snitch runs in tools we’ve never even tested on. That’s the power of an open standard: you write the skill once, and every tool that implements the spec gets it for free.
We went from maintaining 14 tool-specific integrations to maintaining one file. The install script detects which tools you have, drops the skill into the right directory, and you’re done. One command, every tool.

April 2026

Snitch and Claude Fable

Update, May 28, 2026
Mythos shipped publicly in June 2026 under its release name, Claude Fable. Some of what we were waiting on it for showed up early, in Claude Opus 4.8. Its interleaved thinking is what let us switch Deep Scan on in v1.1.0, the mode this post was describing before it existed. Fable is still the bigger swing; 4.8 was the down payment. What we shipped.
Anthropic announced Project Glasswing and Claude Mythos this week. A model that finds decades-old kernel bugs for $50. The benchmarks are staggering, but what caught our attention wasn’t the raw capability, it was what happens when you pair that capability with structured guidance.
We’ve been building a 74-category client specifically for this model: deep reasoning analysis, real-time exploit chain detection, and contextual severity scoring. When models get smarter, the structured framework that guides them gets more valuable, not less. A more capable model doesn’t need less structure. It needs better structure so it can apply that capability to the right problems.
We applied to Project Glasswing because we think structured scanning and a more capable model are a natural fit. Snitch gives the model the categories, the evidence format, and the audit methodology. Fable brings the reasoning depth. More at /fable.