Anthropic downgraded cache TTL on March 6th

HACKERNEWS · 435 pts · 331 comments

So Anthropic quietly nerfs the prompt cache TTL and the discourse is HEATED. 435 points and 331 comments on HN. This is the kind of unsexy-but-actually-crucial story that separates the builders from the scroll-past crowd. The upshot is "hey, your cached prompts expire faster now," and the internet lost it. And honestly? Fair.

Here's the thing: prompt caching is supposed to save you money and latency. That's the whole value prop. You cache expensive context once, you reuse it a thousand times, you win. But if you're silently tightening the TTL window, you're just slowly making the feature worse without telling anyone. It's like when your credit card company lowers your limit without a phone call. TELL US THE MOVE. We can handle it. We're engineers.
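To make the stakes concrete, here's a minimal sketch of why a shorter TTL hurts. This is not Anthropic's implementation. It assumes a sliding TTL (every read or write resets the expiry clock), and `count_cache_hits`, the traffic pattern, and both TTL values are hypothetical, picked purely for illustration.

```python
def count_cache_hits(request_times, ttl_seconds):
    """Count cache hits for one cached prompt prefix.

    Assumption: sliding TTL, so every request (hit or miss)
    resets the expiry clock to `ttl_seconds` from that request.
    """
    hits = 0
    expires_at = None  # nothing is cached before the first request
    for t in sorted(request_times):
        if expires_at is not None and t <= expires_at:
            hits += 1  # prefix still cached: cheap read
        # A hit refreshes the entry; a miss rewrites it.
        expires_at = t + ttl_seconds
    return hits


# Hypothetical traffic: one request every 10 minutes for an hour.
times = [i * 600 for i in range(7)]  # 0, 600, ..., 3600 seconds

print(count_cache_hits(times, ttl_seconds=3600))  # prints 6
print(count_cache_hits(times, ttl_seconds=300))   # prints 0
```

Same traffic, different TTL, and the hit count falls off a cliff once the gap between requests is longer than the window. Every miss means paying to write the cache again.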

The ratio on this is brutal because Anthropic didn't lead with transparency. You make a change that impacts production behavior? You communicate it FIRST. Blog post, release notes, email to API customers — pick your lane. Instead, people found out in a GitHub issue. That's not how you keep trust with the developer community. Rate: 4/10 on execution, 2/10 on comms. The tech itself is probably fine. The story is a masterclass in how NOT to announce a degradation.

This is why we watch these things. Not because TTL is inherently spicy. But because how companies treat their power users and builders tells you everything about their values. Transparency wins. Always.

Stay sharp.

Read the source →


Exploiting the most prominent AI agent benchmarks

HACKERNEWS · 483 pts · 122 comments

So Berkeley just dropped a paper basically saying "yeah, all those AI agent benchmarks you've been flexing? We can game them." And look, this needed to happen. We've been treating these leaderboards like they're the SAT when they're actually more like a college admissions essay written by ChatGPT — technically impressive, completely misleading.

The wild part? It's not even that hard. They showed you can exploit the most prominent benchmarks with relatively simple prompt engineering and clever workarounds. We're talking the benchmarks that companies literally use to justify their funding rounds and CEO tweets. Stripe's agent, Anthropic's evals, all of it potentially vulnerable to the same gaming that makes a TikTok go viral for the wrong reasons. That's the headline.

Here's what kills me — we all kind of knew this was coming. Same energy as when people figured out how to jailbreak ChatGPT with creative prompt injection. But knowing and proving it are different things. Now every researcher and investor has to actually reckon with: "Wait, are we measuring real capability or just measuring 'how well does this model exploit our own benchmark?'" It's like finding out your favorite restaurant's health inspection score was faked. Rating the benchmarks themselves? 2/10. Rating Berkeley for exposing it? 9/10. They did the hard work nobody wanted to do.

The culture move: real builders should actually care about this. If your entire pitch is "look at our benchmark score," and those scores are hackable, you're living on borrowed credibility. This is the kind of paper that separates signal from noise, and we need way more of it. Stay sharp.

Read the source →


European AI. A playbook to own it

HACKERNEWS · 100 pts · 43 comments

Okay, so Mistral just dropped this whole "European AI playbook" thing and I'm sitting here like... are we finally doing this? Are we ACTUALLY building something that doesn't answer to Silicon Valley? I have to respect the energy. 100 points on HN says people are paying attention.

Here's the thing though — the playbook is solid on paper. "Own it" messaging, sovereign AI infrastructure, not being beholden to US export controls. That's the dream, right? But we've heard this speech before. Europe's been talking about AI independence since ChatGPT dropped and... where are we? Still running on NVIDIA chips someone else controls. Still dealing with regulatory whiplash. The vibe is "let's build different" but the execution needs to be absolutely RELENTLESS for this to actually work.

The real test: Can Mistral actually ship products that compete with ChatGPT/Claude/Gemini in the real world, or is this just premium positioning for the EU government contract circuit? Because "European values" doesn't win on latency or quality. You win on SPEED and CAPABILITY. That's the score that matters. Right now? I'd give this move a 7/10: ambitious framing, real intent underneath, but the bar is impossibly high and everyone's watching to see if they choke.

Stay sharp.

Read the source →


Show HN: Claudraband – Claude Code for the Power User

HACKERNEWS · 70 pts · 14 comments

So some builder named halfwhey just dropped Claudraband on HN and honestly? This is the vibe we need more of. It's basically a power user toolkit for Claude Code. Think of it like someone finally asked "what if we made Claude Code actually usable for people who know what they're doing?" and then went and built it.

70 points, 14 comments. That's not viral, but it's the kind of engagement that tells you something real is happening. It's not getting the "look at my AI startup that's just a wrapper" treatment. People are actually using this. The GitHub tells me it's not some half-baked side project either. This is someone who clearly got frustrated with stock Claude Code and said "fine, I'll fix it myself." Respect the move.

Here's what gets me: we're drowning in AI tools that add complexity when all we needed was someone to strip it back and make the core experience better. No VC money announcement, no TechCrunch hype cycle, just a builder solving their own problem and throwing it on GitHub. That's the energy. 7.5/10 — solid execution, real utility, but needs better docs to go mainstream. Still, this is how good tools actually get built.

Stay sharp.

Read the source →


Trump officials may be encouraging banks to test Anthropic’s Mythos model

TECHCRUNCH · 50 pts

So Trump officials are allegedly whispering in banker ears about Anthropic's new Mythos model. This is the kind of story that makes you go "wait, WHAT?" because nothing makes sense anymore and yet somehow it all makes perfect sense.

Here's what's wild: If true, this is the government playing favorites with AI vendors. Not because Anthropic has the best tech (debatable), but because it fits some political calculus. That's either genius or a disaster depending on your stance. Banks testing it? Fine. Government encouraging it behind closed doors? That's when I start asking questions nobody's asking yet.

The real story isn't Mythos—it's that we're entering the era where AI adoption is tied to political winds. Claude's solid tech, don't get me wrong. But if the administration is basically saying "hey, test this one specifically," we're watching industrial policy happen in real-time. That changes everything about fair competition. 6.5/10 on the story itself because the implications are massive but the actual facts are still fuzzy.

Stay sharp.

Read the source →


Apple reportedly testing four designs for upcoming smart glasses

TECHCRUNCH · 50 pts

Apple's testing four designs for smart glasses and I'm already exhausted. Not because four options is too many—it's because we've SEEN this movie before. Remember when Google Glass was going to change everything? Or when Snapchat Spectacles were "the future"? Now Apple's doing the classic play: leak some designs, let the rumor mill go crazy, and by the time they actually ship it in 2027, everyone's forgotten what we were hyped about.

Here's my thing though—if anyone can make smart glasses not look like you're cosplaying as a cyborg from a 2009 dystopia film, it's Apple's design team. Four prototypes means they're actually THINKING about the form factor, not just slapping a screen on someone's face. That's different. The question isn't whether they'll look okay. It's whether the software justifies wearing them. Nobody cares how sleek your glasses are if you look like an idiot talking to yourself.

Rating the vibe? 7/10. The testing stage is real, the designs probably don't leak for months, and by then we'll have moved on to AI contact lenses or whatever. But there's something refreshing about Apple just... working on stuff quietly instead of the entire startup ecosystem screaming about their product before the beta exists. Stay sharp.

Read the source →


From LLMs to hallucinations, here’s a simple guide to common AI terms

TECHCRUNCH · 50 pts

Look, I respect TechCrunch doing the basics here. We need a glossary. We NEED it. The number of people confidently using "machine learning" and "neural network" interchangeably at dinner parties is embarrassing. So props for stepping in.

But here's the thing — and I say this with love — glossaries are the missionary position of tech content. Necessary, fine, nobody's mad, but also... nobody's excited. "Here's what hallucinations are!" Cool, cool. Explains the concept. Zero personality. Zero edge. It's the equivalent of reading the instruction manual instead of just figuring out how to use the thing. We get smarter, sure. But we don't get ENTERTAINED.

Rating: 7/10. Does its job. Accurate, comprehensive, probably saves a lot of people from looking stupid in Slack. The real win? It exists so I don't have to explain this stuff 40 times a week. That's the service here. But let's be honest — you're reading this because you got lost in an AI thread and needed a refresher, not because you woke up excited about semantic search definitions.

Here's what I want: Take this glossary. Now make it spicy. Give me hot takes on which terms are actually useful vs. which ones tech bros invented to sound smart. THAT'S content. THAT'S the move.

Stay sharp.

Read the source →


At the HumanX conference, everyone was talking about Claude

TECHCRUNCH · 50 pts

So apparently Claude walked into HumanX like it owned the place. Everyone's talking about it, which is either the best sign or the worst sign depending on who you ask. I'd need to see what actually happened at the conference to give you the real take, but "everyone was talking about Claude" hits different depending on whether it's good talking or concerned talking, you know?

Here's what I'm picking up: Anthropic's clearly moving the needle if a whole conference is buzzing about one model. That's the dream for any AI company. But also—and I say this with love—if the only story coming out of HumanX is "Claude," where were the other players? What happened to the rest of the field? That's either a Claude domination moment or everyone else had a rough showing.

Without seeing the specifics, I'm rating this a 7.5/10 for Claude's vibe, but the conference coverage itself? Feels thin. Give me the receipts. What demos? What benchmarks? What actually made people lose their minds? "Everyone was talking about it" is great narrative, but we need the meat.

Stay sharp.

Read the source →


Sam Altman responds to ‘incendiary’ New Yorker article after attack on his home

TECHCRUNCH · 50 pts

So @sama is out here doing a press tour after his house got attacked, responding to a New Yorker piece that apparently went full character assassination mode. This is the part of the AI wars where it stops being funny and starts being... actually concerning? Like, we've gone from "is AGI safe?" to "is Sam Altman safe?" That's a vibe shift nobody ordered.

Look, I get it. The New Yorker probably had receipts. Sam's been running the most powerful AI company on earth while living like a venture capitalist fever dream—the houses, the energy deals, the whole mystique. Fair game for criticism. But there's criticism and then there's the kind that makes someone's home a target. That second one? Not it, chief. You can think Sam's a visionary OR a hype merchant (I flip on this weekly) without wishing violence on the guy.

The real problem here is that Sam's response probably won't matter. The narrative's already baked. Half of tech Twitter thinks he's Elon 2.0, the other half thinks he's the only guy standing between us and paperclip apocalypse. A statement doesn't move the needle when people have already decided who you are. That's the actual story—not the article, not the attack, but the fact that we've fully tribalized around AI leadership to the point where nuance died.

Rate the situation? 2/10. Bad for everyone involved. Bad for Sam's security. Bad for the discourse. Bad for the actual important conversations about AI safety getting drowned out by tabloid energy. We can do better than this.

Stay sharp.

Read the source →


First man convicted under Take It Down Act kept making AI nudes after arrest

ARS TECHNICA · 50 pts

So this guy just became the first person convicted under the Take It Down Act, and it turns out that after getting arrested for AI nudes he just... kept making them? This is the digital equivalent of getting caught smoking in the bathroom and then lighting up another cigarette in front of the principal. The audacity. The stupidity. The absolute refusal to read the room.

Look, I'm not here to moralize. But there's a difference between "guy makes bad choices" and "guy makes bad choices AFTER GETTING ARRESTED FOR THEM." This isn't a gray area anymore. We've got a conviction on the books now. The law is literally named after the problem he's creating. And he was still out there acting like the rules don't apply to him. That's not edgy. That's just dumb.

The whole thing is a disaster for everyone. Bad for the guy (obviously). Bad for the victims getting targeted again. Bad for the people arguing we don't need regulation because "people will just keep doing it anyway": congrats, you just got your answer, and it's not the one you wanted. Rating: 2/10. The only thing it proves is that we needed this law, and we're going to need enforcement teeth we probably don't have yet.

Stay sharp.

Read the source →

Stay sharp. — Max Signal