
I've been building with AI long enough now that new model releases barely register anymore. Every week there's something. A point release here, a minor update there. Most of the time it's incremental stuff that doesn't really change how I work.
But Sonnet 4.6 is different, and I think it matters more than the usual hype cycle would suggest.
Here's the thing: Anthropic just pushed Opus-level performance down into the Sonnet tier. That's not a small deal. Performance that cost $15 per million input tokens on older Opus models (and $5 on Opus 4.5) is now available at $3. That's the same pricing as Sonnet 4.5, yet developers with early access say they often prefer it to Opus 4.5 from November on complex coding tasks, on agent planning, on the kind of work that actually costs money when you get it wrong.
The economics just shifted. If you've been reaching for Opus because Sonnet wasn't quite good enough for your use case, or if you've been avoiding certain automation projects because the per-token math didn't work, that calculation probably changed yesterday.
I'm going to walk through what actually improved, where it still falls short (because it does), and what this means if you're trying to decide whether to migrate existing workflows or spin up something new.
The coding improvements are probably the biggest deal here. Developers with early access are choosing Sonnet 4.6 over Sonnet 4.5 by a wide margin, which is expected. But they're also picking it over Opus 4.5 from just a few months ago, which is kind of wild when you think about it. That's the flagship model getting outperformed by the mid-tier option.
What that means in practice: tasks that used to need Opus-class performance (think complex app builds, deep codebase work, multi-step reasoning) now run on Sonnet. Pricing is the same as Sonnet 4.5: $3 per million input tokens and $15 per million output tokens. You're getting more capability without paying more.
The 1M token context window is in beta, but it's there. That's a lot of room for long documents, entire codebases, or extended back-and-forth sessions where you don't want the model forgetting what you talked about three hours ago.
Computer use skills got a major bump too. I think this is one of those features that sounds abstract until you actually use it, then it clicks. The model can interact with interfaces, navigate tools, do things that feel more like an assistant and less like a chatbot.
And maybe the most telling thing? It's now the default for free tier users. Anthropic doesn't usually hand out their best stuff to the free plan. That they're comfortable doing it here says something about where the model sits in their lineup.
Here's the thing that actually matters: you're getting Opus-class performance at $3/$15 per million tokens. That's Sonnet pricing.
This isn't just "oh, the model got a bit better." This is a tier collapse. Tasks that used to mean reaching for Opus 4.5 (and paying Opus rates of $5/$25 per million tokens) now run on Sonnet 4.6. Anthropic even says it directly in their release: "Performance that would have previously required reaching for an Opus-class model...is now available with Sonnet 4.6."
I've seen this change project economics before, and it's probably going to again. When you're running a feature that makes 10,000 API calls a day, the gap between Sonnet's $3 per million input tokens and the $15 that older Opus models charged adds up fast. You can suddenly afford to be more aggressive with context, run more iterations, or just... not worry as much about the bill.
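To put rough numbers on that, here's a back-of-the-envelope sketch. The 10,000 calls a day and the $3 and $15 input prices are the figures quoted above. The 2,000 input tokens per call is a number I made up for illustration, and `daily_cost` is just a helper name for this example, so swap in your own workload and whatever your comparison tier actually charges.

```python
def daily_cost(calls_per_day, input_tokens_per_call, output_tokens_per_call,
               input_price_per_mtok, output_price_per_mtok):
    """Return the estimated daily API spend in dollars.

    Prices are in dollars per million tokens.
    """
    input_tokens = calls_per_day * input_tokens_per_call
    output_tokens = calls_per_day * output_tokens_per_call
    total = (input_tokens * input_price_per_mtok
             + output_tokens * output_price_per_mtok)
    return total / 1_000_000


# Hypothetical workload: 10,000 calls a day, 2,000 input tokens per call.
CALLS = 10_000
INPUT_TOKENS = 2_000  # assumption, not a measured figure

# Input-side cost only, so output tokens and output price are set to zero.
cost_at_3 = daily_cost(CALLS, INPUT_TOKENS, 0, 3.0, 0.0)
cost_at_15 = daily_cost(CALLS, INPUT_TOKENS, 0, 15.0, 0.0)

print(f"$3 input tier:  ${cost_at_3:,.2f}/day")
print(f"$15 input tier: ${cost_at_15:,.2f}/day")
print(f"difference:     ${(cost_at_15 - cost_at_3) * 30:,.2f} over 30 days")
```

With those made-up numbers, the input side alone comes to $60 a day versus $300 a day. Output tokens are billed separately, so pass real output counts and prices if you want the full picture.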
And for a lot of teams, this removes the whole "should we upgrade to Opus for this?" conversation. You just don't need to have it as often. The mid-tier model is handling the complex stuff now. Codebase work, multi-step reasoning, the kind of tasks where you used to hold your breath and hope Sonnet could pull it off.
It's still worth keeping Opus around for the really gnarly problems. But your default just got way more capable without touching your budget. That's rare.
I've been testing Sonnet 4.6 on actual projects for the past week, and the coding improvements are real. Not just "slightly better" real. We're talking about developers in early access preferring it to Sonnet 4.5 by a wide margin, and in a lot of cases even choosing it over Opus 4.5 from November.
The big deal here is consistency and instruction following. I threw a complex app build at it that would've normally needed Opus, and Sonnet 4.6 handled it without the usual back-and-forth corrections. Anthropic calls this "frontier-level results on complex app builds," which sounds like marketing speak until you actually use it.
What that means for your workflow: tasks that used to require the expensive model now run on the mid-tier one. Bug fixes are cleaner. Architecture decisions make more sense. One developer at Rakuten said it reached for modern tooling they didn't even ask for.
The computer use upgrades are maybe even more interesting, though I haven't pushed those as hard yet. Sonnet 4.6 shows "major improvement" compared to prior Sonnet models in actually controlling interfaces and navigating UIs. If you've been building agents or automation tools, this probably matters more than the coding stuff.
And pricing stayed the same. $3/$15 per million tokens. So you're getting Opus-class performance at Sonnet prices, which changes the math on a lot of projects.
The 1M token context window is probably the thing that sounds most abstract but actually changes how you can use Claude day to day. I think of it this way: you can now dump an entire codebase into a conversation and have Claude remember all of it while you work through multiple problems.
Before this, you'd hit context limits pretty fast. Maybe you're debugging something, then you want to refactor a related function, then you notice a pattern that needs fixing across three files. You'd constantly be starting fresh conversations or carefully managing what stayed in context. It was like trying to have a conversation with someone who kept forgetting what you said ten minutes ago.
Now you can keep going. One session, multiple tasks, and Claude still remembers the architecture decisions you made at the start. I've been using it for client projects where we're iterating on features over a few hours, and not having to re-explain the codebase every time is... honestly just a relief.
The practical stuff: you can upload bigger files directly, paste in more documentation, or even include your entire API spec alongside the code you're working on. It's in beta, so I'd still keep an eye on how it handles really long sessions, but it's been solid so far.
For you, this probably means fewer interrupted workflows. Less context-switching. You can actually have a back-and-forth that feels more like working with a person who's been on your project for a while, not someone who just showed up.
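If you want a quick sanity check on whether a codebase is even in the right ballpark for a 1M token window, something like this works as a first pass. It's a rough sketch, not a real tokenizer: the 4 characters per token ratio is a common rule of thumb I'm assuming here, and `estimate_tokens`, `fits_in_context`, and the default file suffixes are names and choices I picked for the example.

```python
from pathlib import Path

CONTEXT_LIMIT_TOKENS = 1_000_000  # the beta window size discussed above
CHARS_PER_TOKEN = 4  # assumption: rough average, real tokenizers will differ


def estimate_tokens(root, suffixes=(".py", ".md")):
    """Estimate the token count of all matching files under root."""
    total_chars = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            # Ignore undecodable bytes so one odd file doesn't stop the scan.
            total_chars += len(path.read_text(encoding="utf-8", errors="ignore"))
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(root, reserve_tokens=50_000, suffixes=(".py", ".md")):
    """Check the estimate against the window, leaving room for the conversation."""
    return estimate_tokens(root, suffixes) + reserve_tokens <= CONTEXT_LIMIT_TOKENS
```

Calling `fits_in_context("path/to/project")` gives you a yes or no. The `reserve_tokens` cushion is there because the window also has to hold your prompts and the model's replies, not just the files.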
The free tier just got a serious upgrade. Sonnet 4.6 is now the default model for everyone, including users who aren't paying a dime. That's not just a version bump. It's the kind of performance you needed Opus for a few months ago, now available without a credit card.
Here's what changed. Free tier users now get file creation, connectors, skills, and compaction built in. Before, you'd hit limits pretty fast or need to upgrade just to do basic workflow stuff. Now you can actually build with it.
I think the bigger deal is what this does for prototyping. You can spin up a connector to pull data from an API, have Claude write the code to process it, create files as output, and chain skills together without paying for Pro. That's the whole builder loop right there.
The 1M token context window (still in beta, but available) means you can throw entire codebases or long documents at it and it'll actually remember what you're working on. Free tier used to feel like you were working through a keyhole. This opens it up.
What this really democratizes is experimentation. You don't need to justify a Pro account to your boss or eat the cost yourself just to see if an idea works. You can test automations, build internal tools, or prototype client work on the free tier and only upgrade when you hit usage limits. That's a different game entirely.
Here's how I think about it. Sonnet 4.6 just became my default for almost everything. The performance jump is real enough that I'm reaching for Opus way less than I used to.
Use Sonnet 4.6 when you're building features, fixing bugs, or handling multi-step workflows that need solid reasoning but don't require absolute perfection. Think: building out a new API endpoint, refactoring a component, generating documentation, or routing logic between services. Anthropic says developers with early access often prefer it to Opus 4.5, and I get why. It's fast, it follows instructions better, and the coding output is genuinely good.
I'm also using it for anything involving the 1M token context window. Long document analysis, pulling insights from big codebases, that kind of thing. It handles it without getting confused or dropping details halfway through.
Opus 4.6 is still worth the extra cost when you need the deepest reasoning available. Anthropic specifically calls out codebase refactoring, coordinating multiple agents, and problems where getting it exactly right matters more than speed. If you're architecting something complex or the cost of being wrong is high, Opus is probably the move.
But honestly? Start with Sonnet 4.6. You'll know pretty quickly if you need to step up to Opus, and most of the time you won't.
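If you're wiring this into an app, that rule of thumb fits in a few lines. This is a minimal sketch of my own heuristic, not anything Anthropic publishes: the `Task` fields and the `"sonnet"` / `"opus"` labels are placeholders I made up, and you'd map those labels to real model IDs in your own config.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    large_refactor: bool = False       # sweeping codebase refactoring
    coordinates_agents: bool = False   # orchestrating multiple agents
    high_cost_of_error: bool = False   # being exactly right matters more than speed


def choose_tier(task: Task) -> str:
    """Default to Sonnet; step up to Opus only for the cases called out above."""
    needs_opus = (task.large_refactor
                  or task.coordinates_agents
                  or task.high_cost_of_error)
    return "opus" if needs_opus else "sonnet"


print(choose_tier(Task("add a new API endpoint")))
print(choose_tier(Task("restructure the billing module", large_refactor=True)))
```

The first call comes back as `sonnet` and the second as `opus`. The point of keeping it this blunt is that the default stays cheap and you have to opt in to the expensive tier on purpose.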
Look, the big story here is pretty simple. Sonnet 4.6 just closed a gap that used to cost you real money. Tasks that would've made you reach for Opus a few months ago? You can run them on Sonnet now, at a fraction of the price. That's not a minor upgrade. It's a shift in how you should be thinking about which model to use for what.
I've been testing it against some of the workflows I usually reserve for heavier models, and honestly, the consistency is what stands out. It's not just that it can handle complex coding tasks now. It's that it does it reliably, without the weird refusals or the need to rephrase things three times.
Here's what I'd do if I were you: pick one thing you currently use Opus for (or avoid doing because Opus feels too expensive to run at scale) and try it with Sonnet 4.6 instead. Maybe it's a multi-step automation. Maybe it's refactoring a gnarly section of code. Just one real task from your actual backlog.
Run it. See if the output holds up.
If it does, you just found a way to cut costs without cutting quality. And if it doesn't quite match Opus on your specific use case, at least you know where the line is now. Either way, you'll have a better sense of what this thing can actually do for your business instead of just reading about benchmarks.
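Here's roughly how I'd structure that test so it's repeatable. It's a sketch under a few assumptions: `compare_models` is a name I made up, each runner is any function that takes a prompt and returns text (in practice you'd wrap your own API call in it), and `passes` is whatever check tells you the output holds up. The two runners below are stand-ins that return canned strings so the example runs without network access.

```python
import time
from typing import Callable, Dict


def compare_models(prompt: str,
                   runners: Dict[str, Callable[[str], str]],
                   passes: Callable[[str], bool]) -> Dict[str, dict]:
    """Run the same prompt through each runner and record pass/fail and timing."""
    results = {}
    for name, run in runners.items():
        start = time.perf_counter()
        output = run(prompt)
        elapsed = time.perf_counter() - start
        results[name] = {
            "passed": passes(output),
            "seconds": elapsed,
            "output": output,
        }
    return results


# Stand-in runners: replace these with functions that call your real models.
def fake_sonnet(prompt: str) -> str:
    return "def add(a, b):\n    return a + b"


def fake_opus(prompt: str) -> str:
    return "def add(a, b):\n    return a + b  # sums two numbers"


results = compare_models(
    "Write an add function",
    {"sonnet": fake_sonnet, "opus": fake_opus},
    passes=lambda output: "return a + b" in output,
)
for name, result in results.items():
    print(name, "passed" if result["passed"] else "failed")
```

The check is the part worth spending time on. A string match is fine for a demo, but for a real backlog task you'd want it to run your tests or validate the output some other way you actually trust.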
The pricing didn't change, the context window got bigger, and it's already the default for most users. There's not much reason to wait.
I build custom websites and web apps for small businesses and solopreneurs. Let's talk about your project.