extr's comments | Hacker News

Seems like a continuation of the current meta where GPT models are better in GPT-like ways and Claude models are better in Claude-like ways, with the differences between each slightly narrowing with each generation. 5.5 is noticeably better to talk to, 4.7 is noticeably more precise. Etc etc.

I'm not sure they have "officially" said anything but they do allow Codex OAuth login for 3rd party coding agents: pi, opencode, etc. Employees on twitter have explicitly approved this.

That matches what I have seen, but I think I remember reading a tweet that mentioned those "developing in the open" (not an exact citation, just based on what I remember). That made me wonder whether they consider this allowed only for open-source software, or whether they intend to be much more permissive - essentially letting users spend their quotas wherever they want - or maybe the rules are something else entirely. Again, I feel there could be more transparency regarding all of that.

I mean surely you can understand the difficulty of their position, right? It's as if Waymo offered a subsidized, subscription-based plan that models a certain type of ridership as typical, but then people start scheduling rides on a timer with no one in them, far outside the original use case of "get me from point A to point B". And of course the line between what is acceptable is quite fuzzy. You could imagine it being seen as okay to send a rider-less Waymo to pick up groceries occasionally - but not to schedule one every single day at 4:30PM to pick up a single ice cream cone.

You can argue that this is unfair and they should provide clearer guidance. Well - as soon as they do people find ways to skirt the letter of the rules to once again take advantage of the economics of the subscription model. So should they just scrap the entire plan? Ruin it for people who are using it as it was intended (coding agent, light experimentation/headless use outside of that)? That doesn't seem right either.


I don't think anyone would want the type of user that OpenClaw users are as customers...

There will be a time for OpenClaw, but in the current world with limited compute, that time is not now.


I think HN needs a regular reminder that most things sold are commodities - without limits or re-use. Coal and wheat have no DRM.

This kind of thing is the exception. Subsidized subscriptions work to distort the power of the market. The more successful they are (in destroying competition), the worse it leaves consumers.

While I get the individual steps that lead them to this "difficult position", I think I'll just keep telling everybody to cancel their sub and make sure not to get locked in.


> Most things are sold as commodities without limits or re-use.

This is somehow doubly wrong. Not only are most economic goods NOT commodities, there are plenty of economic analogs to AI subscriptions (streaming, telecom, gyms, buffets) and none of them operate as "unlimited with no restrictions on re-use". Really just terribly misinformed way of thinking here.


In most parts of the world, telecom and gyms are commodities - America is 'further ahead' in letting companies distort markets without regulation.

But I think you misunderstood the scope of my claim. We can argue whether it's 30% or 70% of an average paycheck that is spent on fungible things, and per line item how much of it is fungible or not - but I was also including all the B2B sales.

Companies that let themselves become entirely dependent on specific suppliers do worse.


Yes same here. I use CC almost constantly every day for months across personal and work max/team accounts, as well as directly via API on google vertex. I have hardly ever noticed an issue (aside from occasional outages/capacity issues, for which I switch to API billing on Vertex). If anything it works better than ever.

I think you are kidding yourself if you think you are going to get anything remotely approximating the quantity/quality of output of a $100 Max sub with Zed/OpenRouter. I easily get $1K+ of usage out of my $100 Max sub. And that's with Opus 4.6 on high thinking.


For personal use I've noticed Claude (via the web-based chat UI) making really bizarre mistakes lately like ignoring input or making completely random assumptions. At work Claude Code has turned into an absolute dog. It fails to follow instructions and builds stuff like a lazy junior developer without any architecture, tests, or verification. This is even with max effort, Opus 4.6, multiple agents, early compaction, etc. I don't know what they did but Anthropic's quality lead has basically evaporated for me. I hope they fix it because I've since adapted my project's Claude artifacts for use with Codex and started using it instead - it feels like Claude Code did earlier this year.

I'd like to give the new GLM models a try for personal stuff.


Same, I'm looking hard for an alternative to what I had.

And I'm seeing the same thing in my sphere - everyone has been bailing on Anthropic the past few weeks. I figure that's why we're seeing more posts like this.

I hope they're paying attention.


I've noticed the same thing, and even done side by side tests where I compare Claude Code with Cursor both running Opus 4.6.

It seems Cursor somehow builds a better contextual description of the workspace, so the model knows what I'm actually trying to achieve.

The problem is that with Cursor I'm paying per-token, so as GP suggested you can easily spend $100+ per month vs $20 on Claude Code.


I saw this immediately with 4.6 and dumped back to 4.5, because I actually asked it wtf it was doing and its response was "being lazy".

> At work Claude Code has turned into an absolute dog.

Could it be related to this?: https://news.ycombinator.com/item?id=47660925


Some of the newer models available on OpenRouter are good, but I agree that none of them are a replacement for Opus 4.6 for coding.

If you're trying to minimize cost then having one of the inexpensive models do exploratory work and simple tasks while going back to Opus for the serious thinking and review is a good hybrid model. Having the $20/month Claude plan available is a good idea even if you're primarily using OpenRouter available models.

I think trying to use anything other than the best available SOTA model for important work is not a good tradeoff, though.


I've been thinking of doing this — using one of the "pretty good but not Opus 4.6-good, YET very cheap" models for the implementation part of more basic code features, AFTER first using Opus 4.6 high for the planning stage.

Do you think this would be a decent approach?

Also, which client would I use for this? OpenCode? I don't think Claude Code supports using other models. Thoughts?


I have been doing this and the results have been fairly good.

I use claude to build requirements.md -> implementation.md -> todo.md. Then I tell opencode + openrouter to read those files and follow the todo using a cheap (many times free) model.

It works 90% of the time. The other 10% it will get stuck, in which case I revert to claude.

That has allowed me to stay on the $20/month claude subscription as opposed to the $100.


I appreciate your guidance here. Thank you very much. I will start doing this.

> I easily get $1K+ of usage out of my $100 max sub. And that's with Opus 4.6 on high thinking.

And people keep claiming the token providers are running inference at a profit.


>And people keep claiming the token providers are running inference at a profit.

Not everyone gets $1K of usage, and you don't know how fat the per-token margins are. It's like saying the local buffet place is losing money because you eat $100 worth of takeout for $30.


> Not everyone gets $1K of usage, and you don't know how fat the per-token margins are.

Well, we're going to find out sooner rather than later. Right now you don't know how thin (or negative) the margins are, either, after all.

All we know for certain is how much VC cash they got. Revenue, spend, profit, etc calculated according to GAAP are still a secret.


In addition to the usage distribution aspects others called out:

$1K is not actual cost, just API pricing being compared to subscription pricing. It is quite possible that the API has a large operating margin, and that it costs, say, only $100 to deliver $1K worth of API credits.
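To put that in numbers - a back-of-the-envelope sketch where the 10% cost-to-serve ratio is purely an assumption for illustration, not a disclosed figure:

```python
# Hypothetical: "API value consumed" is list price, not cost to serve.
api_list_value = 1000.0      # usage valued at API rates (the "$1K")
assumed_cogs_ratio = 0.10    # ASSUMPTION: serving costs 10 cents per list-price dollar
subscription_price = 100.0   # the $100 Max plan

delivery_cost = api_list_value * assumed_cogs_ratio
net = subscription_price - delivery_cost
print(f"Cost to serve: ${delivery_cost:.0f}, subscription net: ${net:.0f}")
```

Under that made-up ratio, a $100 plan consuming $1K of list-price usage roughly breaks even; the real ratio is unknown, which is the point.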


The model developers across the board maintain that most/all models are profitable by EOL, and that the losses come from R&D/training.


Yes, and when we say things like that we are not talking about plans. Running inference at a profit means API token use is run profitably. It's a huge unknown what's happening at the plan level; we know there is subsidy happening, but in aggregate it's impossible to know if it's profitable or not.


Yeah — I just created an anthropic API key to experiment with pi, and managed to spend $1 in about 30 minutes doing some basic work with Sonnet.

Extrapolating that out, the subscription pricing is HEAVILY subsidized. For similar work in Claude Code, I use a Pro plan for $20/month, and rarely bang up against the limits.
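Extrapolating that burn rate - only the $1 per 30 minutes and the $20 plan price come from my experience above; the daily-usage figures are illustrative assumptions:

```python
# Observed: ~$1 of Sonnet API usage per 30 minutes of basic work.
cost_per_half_hour = 1.00
hours_per_day = 4            # ASSUMPTION: moderate daily coding time
workdays_per_month = 22      # ASSUMPTION

implied_api_cost = cost_per_half_hour * 2 * hours_per_day * workdays_per_month
plan_price = 20.00           # Claude Pro subscription
print(f"Implied API cost: ${implied_api_cost:.0f}/month, "
      f"or {implied_api_cost / plan_price:.1f}x the plan price")
```

Even at a casual 4 hours a day, the API-rate equivalent lands several multiples above the subscription price.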


And it scales up - the $200 plan gets you something like 20x what the Pro plan gets you. I've never come close to hitting that limit.

It's obviously capital-subsidized and so I have zero expectation of that lasting, but it's pretty anti-competitive to Cursor and others that rely on API keys.


Ignoring the training costs, the marginal cost for inference is pretty low for providers. They are estimated to break even or better with their $20/month subscriptions.

That being said, they can't stop launching new models, so training is not a one time task. Therefore one might argue that it is part of the marginal cost.


I ran ccusage on my work Max account and I spend what would cost $300 a week if it was billed at API rates.


Out of curiosity, how many tokens are people using? I checked my openrouter activity - I used about 550 million tokens in the last month, 320M with Gemini and 240M with Opus. This cost me $600 in the past 30 days. $200 on Gemini, $400 on Opus.
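For comparison, those figures imply the following effective per-million-token rates (a quick sanity check only - the input/output/cached token mix behind the blended price is unknown):

```python
# Figures from my OpenRouter activity above.
gemini_tokens, gemini_cost = 320e6, 200.0
opus_tokens, opus_cost = 240e6, 400.0

gemini_rate = gemini_cost / (gemini_tokens / 1e6)   # USD per million tokens
opus_rate = opus_cost / (opus_tokens / 1e6)
blended = (gemini_cost + opus_cost) / ((gemini_tokens + opus_tokens) / 1e6)
print(f"Gemini ${gemini_rate:.2f}/M, Opus ${opus_rate:.2f}/M, blended ${blended:.2f}/M")
```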


  My Claude Code usage stats after ~3 months of heavy use:

    Favorite model: Opus 4.6          Total tokens: 42.6m
    Sessions: 420                     Longest session: 10d 2h 13m
    Active days: 53/95                Longest streak: 16 days
    Most active day: Feb 9            Current streak: 4 days

    ~158x more tokens than Moby-Dick

  Monthly breakdown via claude-code-monitor (not sure how accurate this is):

    Month     Total Tokens     Cost (USD)
    2026-01     96,166,569       $112.66
    2026-02    340,158,917       $393.44
    2026-03  2,183,154,148     $3,794.51
    2026-04  1,832,917,712     $3,412.72
    ─────────────────────────────────────
    Total    4,452,397,346     $7,713.34


According to the meter, I used $15k in tokens with my Max plan (along with $5k of Codex tokens) in the last 30 days. That built an entire working and (lightly) optimized language, parser, compiler, runtime toolchain among other things.


Not everyone is just vibecoding everything and relying on agents running SOTA models to do everything, though.


I actually find Zed pretty reasonable in terms of memory usage. But yeah, like you say, there are lots of small UX/DX papercuts that are just unfortunate. In some cases I'm not sure it's even Zed's fault, it's just years and years of expecting things to work a certain way because of VS Code and they work differently in Zed.

E.g.: Ctrl+P "Open Fol..." in Zed does not surface "Open Folder". Zed doesn't call them folders; you have to know it's called a "Workspace". And even then, if you type "Open Work..." it doesn't surface! You have to purposefully start with "work..."


The issues you described show a critical lack of awareness from the Zed developers that people migrate to their IDE mainly from VS Code.

They are blowing their "weirdness budget" on nonsense.


I don't think it's conscious or even a result of not caring about UX/DX. But I do think you're right - I've noticed the loudest voices in their Issue queue are people wanting things like better vim support, helix keybind support (super niche terminal modal editor), etc. Fine if they want to make that their niche but if you are migrating from VS Code like 99% of people you can't have these kinds of papercuts, people will just uninstall.


I think explicit post-training is going to be needed to make this kind of approach effective.

As this repo notes, "The secret to good memory isn't remembering more. It's knowing what to forget." But knowing what is likely to be important in the future implies a working model of the future and your place in it. It's a fully AGI-complete problem: "Given my current state and goals, what am I going to find important, conditioned on the likelihood of any particular future..." Anyone working with these agents knows they are hopelessly bad at modeling their own capabilities, much less projecting that forward.


What is Cursor doing? They need to relax a little bit. Recently I saw they released "Glass" which WAS here: https://cursor.com/glass, now just redirects to /download.

Is "Cursor 3" == Glass? I get they feel like their identity means they need to constantly be pushing the envelope in terms of agent UX. But they could stand to have like an "experimental" track and a "This is VS Code but with better AI integration" track.


Glass was a codename while the UI was in early alpha with testers. It redirects to download now because there is no special link anymore. It's just part of Cursor 3 itself.


Just emailed him. Ridiculous issue.


Disagree completely. Works great for me.

