muyuu · 2026-04-19T11:00:01 1776596401

uhmmm so that is the p they are hacking? it would actually explain a lot

hereme888 · 2026-04-19T14:26:58 1776608818

It's an incomplete cost model, but it's not p-hacking. Could be cherry-picking.

muyuu · 2026-04-19T10:47:20 1776595640

that will only increase the demand for RAM as models will now be usable in scenarios that weren't feasible prior, and the ceiling for model and context size is not even visible at this point

I hate to mention Jevons paradox as it has become cliche by now, but this is a textbook case of it

muyuu · 2026-04-19T10:36:39 1776594999

they also never experience violent death by orcas, foxes, sea lions or leopard seals

muyuu · 2026-04-17T08:57:01 1776416221

Very plausible that they would outlaw this if these bills pass and consolidate. Would be seen as a loophole.

nicce · 2026-04-17T10:45:07 1776422707

Probably works as well as "forbidding" adults to sell or give beer to minors.

muyuu · 2026-04-16T17:07:25 1776359245

Currently GPT just works much better, and so does Gemini but it's more expensive right now. Going through Opencode stats, their claim is that Gemini is the current best model followed by GPT 5.4 on their benchmarks, but the difference is slim.

My personal experience is best with GPT but it could be the specific kind of work I use it for which is heavy on maths and cpp (and some LISP).

muyuu · 2026-04-16T12:08:13 1776341293

sounds interesting and i don't want to sound negative, but i'm not going to install software of this nature that is closed source

kirby88 · 2026-04-16T12:42:07 1776343327

Claude code is closed source as well (at least before the leak). But point taken thank you.

muyuu · 2026-04-16T16:47:01 1776358021

I don't use Claude Code either in fairness, but when I buy tokens from them that's an exposure I already have. These days I use opencode and I work local first with the models, then remote models as fallback. GPT and Gemini have been giving me better results than Opus for a while.

Would you install someone else's binary blob and give it that many permissions and your API keys? The measures required to run that somewhat trustlessly complicate things a lot.

muyuu · 2026-04-13T06:31:41 1776061901

I don't know about users on reddit and discord, but the open models are essentially at SotA with a 3-4 months delay. That puts a hard backstop at what OpenAI and Anthropic can do before I personally can cut them off entirely without losing too much.

Granted the experience can be worse, esp. if you're using it very hands-off and not like a junior assistant who's extremely fast but doesn't know what he's doing at the architecture and strategy level. But even for that I'm relatively confident the Chinese will be competitive pretty soon, and they won't be too expensive. And we know this because we can see their current models and we know what it takes to run them.

Currently my Strix Halo computer that cost me under £3k can do a lot of LLM stuff that is perfectly useful. In some ways, it's better than "cloud" models: I have models that essentially don't say "no" and I have relatively predictable setups. If you want to get fancy, you can right now rent compute to run models that are extremely capable like the latest ones from Kimi, GLM, Qwen, Minimax at full size from providers that are not operating at a loss and it won't be too expensive. You can pool resources to do the same locally. You can do stuff that cloud providers are unlikely to market, like distillation and abliteration to serve your specific needs.

I'm very optimistic about open weights models just the way they are right now.

But I agree with you that OpenAI will likely play similar games to Anthropic and it could be soon.

muyuu · 2026-04-12T15:33:08 1776007988

it may also be local/timezone effects

it has been reported that it behaves very differently depending on those factors, presumably because people are placed in best-effort buckets, who knows

muyuu · 2026-04-12T02:14:38 1775960078

I think the "Mythos" name is genius. The people at Anthropic make a bunch of claims and the public is expected to just believe them without any possibility of testing those claims or reproducing those results, and since so many people are invested in this saviour for the Global economy, or in the industry in general, or in hype to feed their engagement-based income sources, then there is faith to spare.

Meanwhile this mythical beast wasn't able to prevent the Bun vulnerability that exposed their code, let alone preclude the need to acquire that IP in the first place for presumably hundreds of millions of $$$, instead of coding a better replacement or a solution of its own.

What is real and measurable is that subscription plan users are getting a much degraded service for the same money through both open and hidden policies, while Anthropic moves compute to serve off-the-counter customers. The same people who come with the most obvious and brazen lies to dismiss the clear degradation of their service also come with this "security" justification for a move that looks just like good old market segmentation which would perfectly fit the strong symptoms that they cannot afford to offer tokens at a competitive price in this market.

JSR_FDED · 2026-04-12T06:04:30 1775973870

One very clever consequence of Anthropic's guarded release of the Mythos model is that they've kind of claimed the position of best in class here, and also positioned themselves as the responsible vendor in this space in one fell swoop.

muyuu · 2026-04-12T06:37:16 1775975836

OpenAI pulled the same trick with GPT3. It's amazing how well it's working judging by the comments I'm hearing from people I know exist. Because out there on social media, who knows.

kilroy123 · 2026-04-12T11:27:09 1775993229

Well said. I really hope the Chinese models keep getting better. Competition is good.

tokioyoyo · 2026-04-12T02:52:39 1775962359

There are two possibilities:

a) Anthropic is lying, and every company that is collaborating on the vulnerability-squishing project is an accomplice in this big lie

b) Anthropic has the goldest gold of the shovels to sell to people, which is actually useful for enterprises

Everyone, including Ant, understands that other companies will catch up in terms of model strength. So it’s a damned if you do, damned if you don’t position wrt releasing it to the public.

phire · 2026-04-12T05:08:37 1775970517

The model is probably legitimately better. But it might not be enough better to justify the extra cost of inference.

They know if they released it publicly, people would be able to see exactly how smart it is, and adjust their demand correspondingly. Anthropic would either need to price it high enough that nobody uses it (and the hardware is sitting mostly idle, servicing a few customers), or lower their profit margins (potentially below cost) to price it fairly.

So instead, they bundle it with this fancy new exploit-finding scaffold, and sell the combined package to enterprise customers. I bet the scaffold works fine with smaller models, but gets notably improved results with Mythos.

The two products support each other, and with the exclusive bundle Anthropic can get more profit selling both together than they would get selling them individually.

And as an added bonus, people overestimate the capability of this unreleased model, providing hype for Anthropic.

muyuu · 2026-04-09T23:03:52 1775775832

Have you tried Kilo? I'd like to hear from someone who has tried both to know how they compare.

wiether · 2026-04-11T17:43:39 1775929419

I went on Kilo's website and it seems to be closing doors on what I'm already doing.

My coding is done with OpenCode with an OpenRouter API Key.

Going with KiloCode would be doing the same, but with some more layers.

And given that I can't see the on-demand API pricing, I'm really not convinced on how it would be an improvement.