IBM just dropped an 8B model that punches like a 32B and nobody's talking about it

Granite 4.1 Scorecard

IBM's 8B model punches in the 32B weight class. Here's what that actually means.

The Verdict

Tech: 8.5/10 - IBM nailed the engineering. An 8B model matching 32B MoE performance isn't luck; it's methodical optimization. The architecture choices, training data curation, and post-training refinement are all there. What holds it back from a 9? It's unclear how the approach scales beyond 8B, and whether the efficiency gains hold under real production load (not just benchmarks). Still, this is a legitimate technical achievement.

Comms: 6/10 - IBM fumbled the narrative. They shipped open-source and released solid benchmarks, but buried the lede. The actual story, "inference cost collapse changes the entire AI economics game," got lost. Instead, we got "our model is efficient." Founders and engineers found the signal (224 HN upvotes), but mainstream press missed it entirely. IBM communicates like an enterprise software vendor, not like a company shipping the future.

Pricing: 9/10 - Open-source, no licensing games, no API tax. You run it locally, you own it. This is the anti-OpenAI move, and it's exactly what the market needs. A startup can now run Granite 4.1 on commodity hardware for pennies, instead of paying an API rate on the order of $0.10 per 1K tokens. At that gap, the economics are transformational. IBM isn't taking a cut; they're building goodwill and embedding themselves in enterprise workflows.
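Want to sanity-check the "pennies" claim for your own setup? Here is a minimal back-of-envelope sketch. Only the $0.10 per 1K tokens comes from the comparison above; the GPU hourly cost and the throughput are hypothetical placeholders, not measured Granite 4.1 figures, so plug in your own.

```python
def self_hosted_cost_per_1k(gpu_cost_per_hour: float, tokens_per_second: float) -> float:
    """Dollar cost to generate 1,000 tokens on hardware billed by the hour."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1000


# The API rate quoted in the scorecard above.
API_PRICE_PER_1K = 0.10

# Hypothetical inputs: replace with your real hardware cost and measured throughput.
GPU_COST_PER_HOUR = 1.00   # assumed dollars per hour for one commodity GPU
TOKENS_PER_SECOND = 50.0   # assumed sustained generation speed

local_cost = self_hosted_cost_per_1k(GPU_COST_PER_HOUR, TOKENS_PER_SECOND)
ratio = API_PRICE_PER_1K / local_cost

print(f"Self-hosted: ${local_cost:.4f} per 1K tokens")
print(f"API:         ${API_PRICE_PER_1K:.4f} per 1K tokens")
print(f"API costs {ratio:.1f}x more under these assumptions")
```

The sketch assumes the GPU is fully utilized. Idle time, engineering overhead, and batching all move the number, so treat the ratio as a starting point, not a quote.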

Hype vs. Substance: 8/10 - This one's real. The benchmark claims are substantiated. 224 upvotes on HN suggest the people buying in are builders, not hype-farmers. The one caveat: we don't yet know how it performs on proprietary tasks or edge cases. But on standard evals? It's not vaporware. IBM's bet is that proving efficiency wins more credibility than chasing bleeding-edge performance. They're probably right.

Competitive Position: 7.5/10 - IBM just redefined what "winning" looks like. They're not competing with GPT-4 on raw capability; they're competing on unit economics. This threatens cloud inference providers more than it threatens Anthropic or OpenAI. IBM's enterprise distribution, combined with open-source credibility, gives them a durable moat. The weakness: founders may just take the model and run. IBM gets credit but not revenue, at least not yet.

The Bottom Line: Granite 4.1 is the efficiency turn the market needed. Not the sexiest model, but possibly the most important one this quarter. IBM built something that works, costs nothing, and runs anywhere. In a market obsessed with parameter count, that's subversive. Score it as a business play: 8/10.

Stay sharp. — Max Signal
