What Anthropic actually announced
The headline finding from Anthropic’s post is that Claude Mythos Preview can, in their phrasing, surpass all but the most skilled humans at finding and exploiting software vulnerabilities. On reported runs, the model found flaws that had survived twenty-seven years of scrutiny in OpenBSD, and bugs in FFmpeg that conventional automated testing tools had hit roughly five million times without catching. On vulnerability-reproduction benchmarks, Anthropic reports a jump from 66.6% (the previous-generation Claude Opus 4.6) to 83.1%.
The framing Anthropic chose is the line worth reading twice: the same capabilities that make these models dangerous in the wrong hands make them invaluable for finding and fixing flaws. The announcement was paired with a $100M model-credit commitment and $4M in open-source security funding, and brought AWS, Apple, Microsoft and Google to the table as partners.
I’d encourage anyone shipping PHP, Drupal, or anything that handles a payment to read the announcement directly. The claims matter, and so does the tone.
Why this matters for Drupal Commerce specifically
Drupal has a mature security team and a solid disclosure process. That is not the concern. The concern is everything around core:
- The custom module you paid a contractor to write in 2019.
- The theme .theme file that ended up with business logic nobody remembers adding.
- The payment webhook endpoint that trusts a header it shouldn’t.
- The hook_preprocess_* that silently kills the page cache (I’ve written about this before; it’s more common than you’d think).
- The Composer dependency that hasn’t been audited since it was added.
None of that code is being reviewed by Drupal’s security team. It’s being reviewed — if it’s being reviewed at all — by whoever wrote it and whoever maintains the site now. Historically this has been fine, because attackers had to invest real human effort to find bugs in a single custom module on a single mid-sized site. The economics didn’t favour them.
Glasswing is a signal that those economics are changing. You no longer need elite offensive-security expertise to run a frontier model over a repository. The same goes for defenders — which is the useful half of the news.
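To make the "webhook endpoint that trusts a header it shouldn’t" item concrete, here is a minimal sketch of what not trusting the header looks like. It is written in Python purely for illustration (on a Drupal site the equivalent would live in a PHP controller) and is modelled on the timestamp-plus-HMAC scheme Stripe documents for its Stripe-Signature header. The function name, the parameter names and the 300-second tolerance are my own assumptions; in production you would normally call the payment provider's official library rather than hand-roll this.

```python
import hashlib
import hmac
import time
from typing import List, Optional


def verify_stripe_style_signature(
    payload: bytes,
    signature_header: str,
    secret: str,
    tolerance_seconds: int = 300,  # assumed replay window, not a mandated value
    now: Optional[float] = None,
) -> bool:
    """Return True only if the header carries a fresh, valid HMAC for payload."""
    # Expected header shape: "t=<unix timestamp>,v1=<hex digest>[,v1=<hex digest>]"
    timestamp: Optional[str] = None
    candidates: List[str] = []
    for part in signature_header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)

    if timestamp is None or not candidates:
        return False
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False

    # Reject stale events so a captured request cannot be replayed later.
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_seconds:
        return False

    # The signed message is "<timestamp>.<raw request body>".
    signed = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()

    # compare_digest is a constant-time comparison, so timing does not leak
    # how much of a forged signature happened to match.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )
```

The point of the sketch is the order of operations: parse, check freshness, recompute the signature over the raw body, and only then hand the payload to business logic. A review pass on a custom webhook handler is largely a matter of checking that none of those steps has been skipped or reordered.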
What we’re doing about it on client sites
This is the part of the post where, if I were a bigger agency, I’d sell you a “Glasswing Readiness Audit” for five thousand pounds. I’m not going to do that. Here is what actually moves the needle, in the order we do it on live Drupal Commerce engagements:
- Run composer audit as a CI gate, not a one-off. If your build doesn’t fail on a known CVE in your dependency tree, fix that first. It is a thirty-minute job.
- Review every custom module with Claude Code as a second reader. Don’t just “ask it to find bugs”; walk through each file with it, with the security rules loaded in context: input validation at the boundary, prepared statements, CSRF on state-changing routes, no secrets in config. You will find things. I always do.
- Treat the payment path as a separate problem. On a Stripe or Salesforce-integrated site, the webhook endpoints, JWT handling, and any place you decrypt stored tokens deserve a dedicated review pass. I cover the approach to Salesforce JWT OAuth and two-factor encryption in the B2B Commerce case study.
- Fault-inject your test suite. If your Playwright tests do not break when you deliberately break the code, they are theatre. The site quality audit case study has the receipts: 50% of our injected faults went undetected on the first pass. The fix is to write the few tests that actually notice when something falls over.
- Get security headers right at the edge: Content-Security-Policy, Strict-Transport-Security, Referrer-Policy, X-Content-Type-Options, Permissions-Policy. If your .htaccess or Nginx config does not set these, you are leaving free wins on the table.
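The security-headers item is easy to turn into a check you can run against any response. The sketch below is a hypothetical helper, not part of any tool named in this post: it takes a mapping of response headers and reports which of the five headers listed above are absent or empty. It only checks presence, so it will not tell you whether a given Content-Security-Policy is actually a good one.

```python
from typing import Dict, List

# The five headers named in the list above.
REQUIRED_HEADERS = (
    "Content-Security-Policy",
    "Strict-Transport-Security",
    "Referrer-Policy",
    "X-Content-Type-Options",
    "Permissions-Policy",
)


def missing_security_headers(response_headers: Dict[str, str]) -> List[str]:
    """Return the required headers that are absent or set to an empty value."""
    # HTTP header names are case-insensitive, so compare in lower case.
    present = {
        name.lower()
        for name, value in response_headers.items()
        if value.strip()
    }
    return [header for header in REQUIRED_HEADERS if header.lower() not in present]
```

Fed the headers from a real response to your home page and your checkout page, a non-empty result is the list of free wins still on the table. Wiring it into CI as a failing check is a small step from there.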
None of this is exotic. It is the unglamorous work that the AI-defender framing from Anthropic is meant to accelerate, not replace.
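The fault-injection item is the one people most often ask me to explain, so here is a toy version of the idea. Everything in it is invented for illustration: the pricing function, the two injected faults and both test suites are hypothetical, and a real engagement does this against the actual codebase and the actual Playwright suite. The mechanics are the same, though: break the code on purpose, run the tests, and list the faults the tests failed to notice.

```python
from typing import Callable, Dict, List

PriceFn = Callable[[int, int], int]


def order_total(unit_pence: int, quantity: int) -> int:
    """Reference behaviour: total in pence, 10% off for 10 or more items."""
    total = unit_pence * quantity
    if quantity >= 10:
        total = total * 90 // 100
    return total


def fault_no_discount(unit_pence: int, quantity: int) -> int:
    """Injected fault: the discount is never applied."""
    return unit_pence * quantity


def fault_off_by_one_threshold(unit_pence: int, quantity: int) -> int:
    """Injected fault: discount starts at 11 items instead of 10."""
    total = unit_pence * quantity
    if quantity > 10:
        total = total * 90 // 100
    return total


FAULTS: Dict[str, PriceFn] = {
    "no_discount": fault_no_discount,
    "off_by_one_threshold": fault_off_by_one_threshold,
}


def weak_suite(fn: PriceFn) -> bool:
    """Passes whenever a small order adds up; never exercises the discount."""
    return fn(500, 2) == 1000


def stronger_suite(fn: PriceFn) -> bool:
    """Also checks the discount at, and just past, the threshold."""
    return fn(500, 2) == 1000 and fn(500, 10) == 4500 and fn(500, 11) == 4950


def undetected_faults(suite: Callable[[PriceFn], bool], faults: Dict[str, PriceFn]) -> List[str]:
    """Names of the injected faults on which the suite still passes."""
    return [name for name, fn in faults.items() if suite(fn)]
```

Both suites pass on the reference implementation, which is exactly why a green build tells you so little. Only the list returned by undetected_faults shows which suite would have noticed something falling over.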
What I’d avoid
Two things to be careful of in the next few months as every vendor pivots to “AI security”:
Don’t buy the scanner just because it has “AI” on the box. A bad scanner that hallucinates vulnerabilities costs you more engineering time than no scanner at all. Ask what it was trained on, what the false-positive rate is on real Drupal codebases, and whether it understands Drupal’s hook and render systems. If the answer is vague, walk away.
Don’t let “AI found a bug” replace the post-mortem. The bug is in your codebase because of a process. Find the process gap. Fix that too.
If you run a Drupal Commerce site, here’s the honest ask
If you’re running Drupal Commerce with Stripe or Salesforce and the last real security review was “we run the updates”, get in touch. A fixed-price review covering the five items above takes about a week on a typical mid-sized site. You will get a written report, a Jira-tracked fix log, and a test suite that actually catches regressions.
That is the offer. No “Glasswing” buzzword in the invoice.
