Opus 4.7 just dropped and Anthropic basically said, “Yeah, this isn’t even our best model.”
We’re in weird territory when the launch headline isn’t “new frontier model,” it’s “this is the safe understudy because the real thing might jailbreak the global banking system.”
That’s the Mythos angle. @AnthropicAI went full “we have a dragon in the basement, you’re getting the house cat.” Totally new move in model launches.
Tech – 9.1 / 10
On raw capability, Opus 4.7 is a problem for everyone else.
Beating GPT-5.4 and Gemini 3.1 Pro on agentic coding, scaled tool use, agentic computer use, and financial analysis isn't vibes; that's the only quadrant that actually matters to builders. If your model can chain tools, drive a browser, and not hallucinate a balance sheet, you're first string.
Vision bump to 2,576px long edge (3x prior Claude) matters more than people think. It’s the “feed it your entire dashboard / Figma / spreadsheet in one go” unlock. Not as flashy as “1080p YouTube from vibes” but absolutely lethal for ops and analytics.
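If you're going to lean on the "show it everything at once" trick, pre-fit your screenshots to that cap instead of hoping the API downscales them sanely. A minimal sketch, assuming the 2,576px long-edge figure above and plain Pillow (the function name is mine):

```python
from PIL import Image

MAX_LONG_EDGE = 2576  # assumed long-edge cap from the reported vision bump

def fit_for_opus(path: str, out_path: str) -> None:
    """Downscale a screenshot so its longest side is at most MAX_LONG_EDGE px."""
    img = Image.open(path)
    long_edge = max(img.size)
    if long_edge > MAX_LONG_EDGE:
        scale = MAX_LONG_EDGE / long_edge
        img = img.resize(
            (round(img.width * scale), round(img.height * scale)),
            Image.LANCZOS,
        )
    img.save(out_path)
```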
The tokenizer tax (1.0–1.35x the input tokens of 4.6) plus the model's "thinks harder" habit at high effort means it will eat more tokens than you expect. But in real-world dev flows, if Opus solves what GPT-5.4 fails on, that's cheap, not expensive.
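Rough numbers so nobody is surprised by the invoice: same $5 per million input tokens, but the same prompt can weigh up to ~1.35x as much. A back-of-envelope sketch; the 100k prompt size is an assumption, the 1.0–1.35x range is the spec above:

```python
PRICE_PER_M_INPUT = 5.00          # $ per million input tokens (sticker price)
PROMPT_TOKENS_ON_46 = 100_000     # assumed prompt size under the 4.6 tokenizer

for tax in (1.0, 1.15, 1.35):     # reported tokenizer-tax range vs 4.6
    tokens_on_47 = PROMPT_TOKENS_ON_46 * tax
    cost = tokens_on_47 / 1_000_000 * PRICE_PER_M_INPUT
    print(f"tax {tax:.2f}x -> {tokens_on_47:,.0f} tokens, ${cost:.3f} input cost")
# Same price card, up to ~35% more input spend on an identical prompt.
```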
xhigh effort is underrated. Max is often overkill, high is sometimes mushy. xhigh is the “serious work, not PhD thesis” setting. That’s the one teams will quietly standardize on.
Then there's the /ultrareview command in Claude Code. A simulated senior engineer that actually flags subtle design flaws and logic gaps is exactly what we keep pretending GitHub Copilot can do but doesn't. This is the first "AI code review" thing that sounds like a real feature, not a slide deck.
Tech score: 9.1 because Mythos exists and Anthropic literally told us “this is not peak.” But for anything you can actually use today? It’s frontier.
Comms – 7.3 / 10
I kind of love that @AnthropicAI did the unthinkable: they launched Opus 4.7 with a big neon sign saying, “This is not our best model. Mythos is better and too dangerous.”
That’s either supreme confidence or supreme fear. Maybe both.
Credit where it’s due: the Mythos disclosure is honest in a way we never got from @sama or @sundarpichai. OpenAI and Google always imply “you’re using the tip of the spear.” Anthropic just said, “this is the training wheels version; the race car stays in the garage for now.”
But comms are also sloppy around the edges. Public narrative right now is: “Anthropic has a model that can hack banks and they briefed the Trump administration about it.” That is not exactly the “trust us, we’re the responsible adults” vibe you want.
A red-alert site for the Mythos preview, Glasswing whispers of "thousands of zero-days in every major OS and browser": this is Marvel Cinematic Universe villain-origin PR. Of course Bloomberg/Axios/CNBC run with "too dangerous to release" headlines.
They also didn't do enough to explain to normal customers what any of this actually means for them day to day.
Comms score: 7.3. Huge respect for admitting Mythos exists and is stronger; dinged for letting safety theater hijack the launch story from the product people.
Pricing – 8.4 / 10
Keeping it at $5 / $25 per million tokens for this level of capability is aggressive.
Yes, with the fatter tokenizer and the tendency to "think harder" at high effort, your real-world bill is quietly going up versus 4.6. That's the fine print: you're paying the same sticker price for a model that now consumes more tokens per unit of work.
But relative to GPT-5.4 and Gemini 3.1 Pro, you’re getting a model that beats them on the workloads that actually print money (agentic coding, tool use, financial analysis). So per task, Opus 4.7 is probably cheaper than both.
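Here's the per-task math in toy form. Every number below is made up for illustration (cost per run, solve rates are assumptions, not benchmarks), but it shows why the cheaper-per-token model can still be the expensive one:

```python
def cost_per_solved_task(cost_per_run: float, solve_rate: float) -> float:
    """Expected spend to get one solved task, assuming you retry until it lands."""
    return cost_per_run / solve_rate

# Assumed numbers, for illustration only.
opus_47 = cost_per_solved_task(cost_per_run=0.40, solve_rate=0.80)
rival   = cost_per_solved_task(cost_per_run=0.25, solve_rate=0.40)

print(f"Opus 4.7: ${opus_47:.2f} per solved task")  # $0.50
print(f"Rival:    ${rival:.2f} per solved task")    # $0.62
```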
This is where @AnthropicAI is clearly signaling: “We are not the discount brand, we’re the devs’ daily driver.” They could have absolutely justified a bump here and didn’t.
Pricing score: 8.4. Not insane generosity, but for the first true multi-cloud, frontier-beating model at this price point, that’s a builder-friendly move.
Hype-vs-Substance – 9.0 / 10
This is maybe the funniest part: the only real hype in this whole launch is about a model you can’t use (Mythos).
Opus 4.7 itself is being weirdly under-hyped given the actual benchmarks. You tell me “beats GPT-5.4 on agentic computer use and financial analysis” and I’m expecting confetti cannons and a 90-minute demo stream. Instead we got “also here’s the nuclear option we locked away.”
On substance, though? It delivers. Agentic coding + tool use + the /ultrareview flow is exactly what every AI-native shop has been begging for. It’s the first time “AI pair programmer” looks like something your staff engineer might actually respect.
Vision upgrade is real. Not a TikTok party trick. A lot of people are going to quietly rebuild their internal workflows around “just show Opus everything on the screen at once.”
And the cybersecurity posture — automated blocking of prohibited cyber requests + a gated Cyber Verification Program — is a concrete, product-level answer to “what if the model hacks the world.” That’s a safety claim backed by architecture, not vibes.
Hype vs substance: 9.0. Almost all the juice here is real. The “too dangerous Mythos” narrative is extra, but Opus 4.7 itself is actually underrated by its own marketing.
Competitive Position – 8.7 / 10
Against GPT-5.4: Opus 4.7 is now the default “serious work” pick if you care about agents, coding, and finance. OpenAI still wins mindshare, ecosystem, and plug-and-play products. But on raw, programmable capability, Anthropic just outflanked them.
Against Gemini 3.1 Pro: @sundarpichai has scale, YouTube, and infra. But if you give dev teams a clean A/B between Gemini agents and Opus 4.7 agents, I know where most of the sweatshirts are going. Anthropic also being live on Vertex AI is a dagger — Google’s own cloud is now hosting an obviously better agent brain.
Against Mythos: this is the weird one. Anthropic created a permanent “you’re not using our best” shadow over their own flagship. That’s either brilliant expectation management or permanent FOMO fuel. Every time Opus fails on a hard security or systems task, devs will think, “Mythos would’ve nailed that.”
Strategically though, it’s not dumb. Mythos handles the scary stuff with a tiny circle of Glasswing partners and open-source defenders. Opus 4.7 handles literally everything else for the entire market. That’s a nice separation of concerns if you’re planning to be the “aligned lab” in DC.
Being on Amazon Bedrock, Google Vertex AI, and Microsoft Foundry is quietly huge. Anthropic is now the Switzerland of frontier models sitting inside all three hyperscalers. That gives them a distribution and neutrality story neither OpenAI nor Google can match.
Competitive position: 8.7. They self-nerf a bit by admitting there’s a stronger model in-house, but versus GPT-5.4 and Gemini 3.1 Pro, Opus 4.7 is now the model everyone serious has to justify not using.
Bottom line: who should switch today, who should wait
If you’re building:
- Agentic coding tools or dev infra
- Serious multi-step tool-using agents
- Quant / financial analysis / FP&A automation
- Ops dashboards that need strong vision + reasoning
…you should be moving to Opus 4.7 now or at least running a hard bake-off against GPT-5.4 and Gemini 3.1 Pro this week. The fact it’s on Bedrock / Vertex / Foundry removes half your procurement excuses.
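The bake-off harness itself is maybe twenty lines. A minimal sketch using the standard Anthropic Python SDK; the model ID is a placeholder (check the docs for the real Opus 4.7 identifier) and the test cases are yours to swap in:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

MODEL_ID = "claude-opus-4-7"  # hypothetical ID, for illustration only

def run_case(prompt: str) -> str:
    """Send one bake-off case to the model and return the text of the reply."""
    resp = client.messages.create(
        model=MODEL_ID,
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

if __name__ == "__main__":
    cases = [
        "Refactor this handler to eliminate the N+1 query: ...",
        "From this cash-flow statement, compute free cash flow: ...",
    ]
    for prompt in cases:
        print(run_case(prompt)[:200], "\n---")
```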
Who should wait?
- People deeply locked into OpenAI ecosystem products (ChatGPT Team/Enterprise, Assistants API, plug & play tools). The switching costs and glue code are real.
- Teams whose workloads are thin chat, marketing copy, or simple Q&A. You won't feel the upgrade there, and cheaper models already cover you.
Stay sharp. — Max Signal