Hacker News | helloplanets's comments

If the model uses a new tokenizer, it's very likely a completely new base model. Changing the tokenizer changes the whole foundation a model is built on. It would be more straightforward to add reasoning to a model architecture than to swap in a new tokenizer.

Usually a ground up rebuild is related to a bigger announcement. So, it's weird that they'd be naming it 4.7.

Swapping out the tokenizer is a massive change. Not an incremental one.


> Usually a ground up rebuild is related to a bigger announcement. So, it's weird that they'd be naming it 4.7.

The benchmarks say it all: the gains over the previous model are too small to announce as a major release. That would be embarrassing for Anthropic, and it might scare investors into thinking the curve has flattened and there are only diminishing returns left.


It doesn't need to be. Text can be tokenized in many different ways even if the token set is the same.

For example, there is usually one token for every string from "0" to "999" (including ones like "001" separately).

This means there are lots of ways you can choose to tokenize a number like 27693921. The best way to handle numbers tends to be a bit context dependent, but splitting digits into groups of three from right to left tends to work pretty well.

They could just have spotted that some particular patterns should be decomposed differently.
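To make the grouping concrete, here's a minimal sketch (my own illustration, not any lab's actual tokenizer) assuming a vocabulary that contains every string from "0" to "999", so each group of up to three digits maps to one token:

```python
def tokenize_number(digits: str, group: int = 3) -> list[str]:
    """Split a digit string into groups of `group` digits, right to left.

    Assumes a vocabulary containing every string from "0" to "999",
    so each group corresponds to a single token.
    """
    tokens = []
    # Walk from the right end, so only the leftmost group may be shorter.
    for end in range(len(digits), 0, -group):
        start = max(0, end - group)
        tokens.append(digits[start:end])
    return tokens[::-1]  # restore left-to-right order

print(tokenize_number("27693921"))  # ['27', '693', '921']
```

Note that right-to-left grouping matches how humans chunk numbers (thousands separators), which is presumably why it helps with arithmetic-style tasks.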


Mm, don't you just need to retrain the embedding layer for the new tokenizer? I agree it seems likely this is a stopgap release, or a distillation of mythos or something, while they get a better mythos release in place. But some things in the model card look really different from mythos, e.g. the number of tokens it uses at different effort levels.

Maybe it's an abandoned candidate "5.0" model that mythos beat out.
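One reason retraining "just" the embedding layer isn't a total restart: tokens shared between the old and new vocabularies can keep their learned vectors, while only genuinely new tokens need fresh initialization and training. A toy sketch in plain Python (the vocabularies and the `transfer_embeddings` helper are made up for illustration; real models do this on tensors):

```python
import random

def transfer_embeddings(old_vocab, old_vectors, new_vocab, dim):
    """Build an embedding table for new_vocab, reusing vectors for
    tokens that also existed in old_vocab and randomly initializing
    the rest. Toy illustration only."""
    old_index = {tok: i for i, tok in enumerate(old_vocab)}
    new_vectors = []
    for tok in new_vocab:
        if tok in old_index:
            # Shared token: keep the learned vector.
            new_vectors.append(old_vectors[old_index[tok]])
        else:
            # New token: small random init, to be trained further.
            new_vectors.append([random.gauss(0, 0.02) for _ in range(dim)])
    return new_vectors

old_vocab = ["the", "cat", "00", "##ing"]
old_vecs = [[float(i)] * 4 for i in range(len(old_vocab))]
new_vocab = ["the", "cat", "000", "##ing", "927"]
new_vecs = transfer_embeddings(old_vocab, old_vecs, new_vocab, dim=4)
# "the", "cat", and "##ing" keep their old vectors; "000" and "927" start fresh.
```

The catch, and why a tokenizer swap is still a big deal, is that every layer downstream was trained on the old token boundaries, so in practice you end up doing substantial continued pretraining, not just an embedding swap.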


Major numbers are just for marketing: if it doesn't feel like a jump similar to the one from 3.7 to 4, they're not going to give it a new number.

I wonder why computer use has taken a back seat. It seemed like a hot topic in 2024, but then sort of faded into obscurity after CLI agents fully took over.

It would be interesting to see a company try to train a computer-use-specific model, with an actually meaningful amount of compute directed at it. So far there have just been experiments built on models trained for completely different things, rather than any of the companies putting out SotA models taking a real shot at it.


On the other hand, I never understood the focus on computer use.

While more general, and perhaps the "ideal" end state once models run cheaply enough, you're always going to suffer much higher latency and reduced model performance compared to API/programmatically driven workflows. And it's strictly more expensive for the same result.

Why not update software to use API first workflows instead?


The industry can probably move a lot faster by adding APIs and the like than by teaching models to use a generic computer with generic tools.

I also think it's a huge barrier to allow an LLM access to your desktop.

Managed agents seem a lot more beneficial.


The trillion dollar "Computer Use" model could not figure out how to configure audio outputs in Microsoft Teams. It then model-collapsed when trying to configure an HP printer. AGI was postponed; we'll get back to this after next week's retrospective.

Sadly it does. Most of those people have to spend a lot of money on security. But usually it's not the Forbes list that specifically outs them as being wealthy. You can't really build a billion dollar company under the radar.

This is just a strange situation where someone has made billions without their identity being known, without being a criminal.


Whatever you're using as your visual templating instructions, I like it. Mind sharing?

Been using a slightly modified Tufte template for my vibed small apps, but this is much better.


I added some notes above on the tiling technology. As for the base map itself I posted a link to the original file. I hope that helps but happy to answer any other questions you might have.

AI is one of the core parts of cyberpunk, through androids / humanoid robots. Blade Runner is completely built on the protagonist having to interact with rogue artificial intelligence.

> We are in fact very disposable and replaceable.

To your friends and family as well? Or just your employer?

You're describing things that may well be true for a lot of employers, but fall apart outside of that context.


For some rare once-in-a-lifetime friendships, you are not disposable, and if anything were to happen to you, you would be missed. I can count those on one hand.

For most casual acquaintances (that some people incorrectly label as friends), it's certainly true.

On the family side: only parents, siblings, children, maybe an aunt or a grandparent. A distant second cousin you've seen three times in your life?


> For most casual acquaintances

You may feel this way, but it feels a lot different when you learn that one of your acquaintances has died.

I enjoyed a brief intellectual conversation with a professor at the end of a semester. When I returned the next academic year, I stopped by his office for a quick chat, but his name was no longer on the door. The department administrator told me "Oh, he's no longer with us."

My heart sank. I didn't know him well, and he may not have remembered my name, but I wanted to thank him, and now he was gone. Cut down in his prime? He was just an acquaintance to me, not my friend. But I still felt that shock and grief deeply.

I asked the administrator how he'd died, and she quickly clarified: he was still alive! He had just been a guest lecturer visiting for one semester from a Scandinavian university and had now returned home. This has taught me not to delay expressing my gratitude for the acquaintances in my life.


There have been a few similar instances in my life that have led me to take up the personal practice of "always say hi or wave to a friend when the chance comes around, because there may not be a next time." It came about because I tend to see a lot of close friends and looser acquaintances day to day out in the world, and there used to be more times than not when I wouldn't bother crossing the street or stopping for a minute to chat. Later I realized this costs me almost nothing, and even for less-close relationships, I'd rather have put in the tiny amount of effort to walk up and show them they're worth even that much before they overdosed, or moved away, or committed suicide. It's not always opportune, but what else is life for?

Granted, in retrospect, there's not really ever a sufficient amount of interaction you could have had, but if I see someone inside a cafe that I'm walking past, it's worth popping in and at least saying hi or waving from outside.


There still is value with the casual acquaintances. Just because a person is replaceable doesn't mean they are not valuable when present. My neighbor who I barely talk to has helped me out when I am in a bind. Even if a new neighbor moves in and replaces him, the original neighbor was valuable and gave me a sense of security, peace, and community while he was present.


> I can count those on one hand.

What's the problem about that?

I'd rather have my family and 1-2 close friends, and literally no one else, instead of 100 close friends that will vanish as soon as I am not able to bring anything to the table anymore, which will inevitably happen for everyone.


That’s not the point. The point is that the number is small. They are not making a judgment on the value of such relationships, but rather saying that the number is, and always will be, so small that in the grand scheme of things it’s insignificant; it only matters in a person’s immediate sphere.

People on this thread seriously need to stop reacting so emotionally to things. Damn. Grow up people.


The number is small in comparison to the whole of humanity, yes, but that's not at all what this post is about. Did you read the article?

Instead, it actually is literally about each individual's immediate sphere, which, as you correctly point out, is where it matters. Having 5 true friends in a world of 100 people or in a world of 1 billion people doesn't change anything.


Is that not enough?


It is (and even if it is not, it's just the way it is...).

What I'm arguing is that it's not only in the workplace that we're all disposable and replaceable. It happens with friends and family, too.

What to do with this information... I'm not sure. But usually it's a good first step to see things clearly.


That may be true for a lot of families and friends as well. They may not dispose of you outright, but they will try to cut you off every chance they get.


But it may not be true for family, or for employers… so we are back to square one, I suppose.


I think modern healthcare really put the focus back on people as individuals. Mortality rates used to be quite high, even from what we now see as trivial illnesses or injuries, and in the past people had a lot of offspring to account for this.

By the time I reached adulthood, I had only experienced a handful of deaths of people close to me, all from old age.


Apart from a few, friends and family who care about you can be counted on one hand as well. When automation replaces our jobs or diminishes our economic worth many fold, not many friends and family remain unchanged: your parents, your siblings, and maybe 1-2 of your closest friends. Others will drift apart because our lifestyles diverge. Heck, even without any change, parents and siblings drift apart for many people. It is tough, then, not to correlate our value with our economic value.


> Apart from a few, friends and family who care about you can be counted on one hand as well.

It's not about the quantity but about the quality of friendships and human connection. I couldn't care less about the number of my friends. I do care a lot though about the connection to them.


> Moat seems to be shrinking fast.

It's been a moving target for years at this point.

Both open and closed source models have been getting better, but I'm not sure the open source models have really been closing the gap since DeepSeek R1.

But yes: If the top closed source models were to stop getting better today, it wouldn't take long for open source to catch up.


This is also what's called the beginner's mind, Shoshin. [0] One of the core concepts of Zen Buddhism. Tangentially related would be the concept of no-mind, Mushin. [1]

[0]: https://en.wikipedia.org/wiki/Shoshin [1]: https://en.wikipedia.org/wiki/No-mind


Pretty sure the majority of people who are part of the community work a 9-5. But yes, spending after-work hours making something that's just completely out there, with no monetary purpose at all, is much more nourishing for the soul than attempting to create a passive-income machine.


> copy pasting it's training data

This is a total misrepresentation of how any modern LLM works, and your argument largely hinges upon this definition.

