
"That's a higher level of abstraction"

No, it's not, because it has already seen 'anatomy' for pelicans and for animals generally - even how it's represented in animals.

If you try to get the AI to actually decompose it and start to 'draw pelicans' in very obscure ways, it will immediately fail.

Try to get the AI to draw the pelican from a very odd angle - like underneath, to the right, one wing extended, one wing not ... 0% chance.

Precisely because it does not understand those things.

FYI it's a slightly unfair case because it does not have a 'world model' yet, which would actually solve that problem, but even then not through much abstraction.

We're a long way away - but in the meantime, there's lots to unpack.



> Try to get the AI to draw the pelican from a very odd angle - like underneath, to the right, one wing extended, one wing not ... 0% chance.

Proof by existence?

https://gist.github.com/nlothian/50241d34a654fcf0caa280d4475...

Looks pretty good to me. ChatGPT in "Thinking" mode.

Edit: I've added the Opus version on the same link.


? That's evidence that it does not work.

Neither of those is from 'under'; they both look like either front or top views?

Imagine yourself under the duck's feet, looking up at an oblique angle - wings as I suggested. The AI won't do that, it has no reference for dimensionality.


What on earth do you mean?

I live near an area with lots of pelicans. If you look up at one flying overhead this is what they look like.

Here is a photo for comparison: https://commons.wikimedia.org/wiki/File:American_white_pelic...


Sure, something like that. Not like the examples you posted.


I'm very confused. The SVGs show the beak, wing, tail, feet and body as though viewed from directly underneath.

They look similar to the photo, but meet the instructions better ("from underneath").

What are you expecting exactly?


They don't look anything like the photo.

They're not 'oblique' - they're 'squared' views and none of the anatomy looks appropriately adjusted.

The model has no ability to 'rotate a figure in 3d space' and conceptualize how all of the elements work together.

It's 'pattern matching'.

This is the 'great intuition' for how LLMs work - it's not perfect, because obviously a lot of 'synthetic reasoning' can be done.

And they probably never will have that ability; LLMs are not the right thing for this kind of task.

Think about how they can investigate massive code-bases and find arcane bugs - but can't draw a duck from arbitrary oblique angles etc.

That said, with enough examples they probably could.


Those are just awful compared to the side view of a pelican on a bike.


Have you seen a pelican from underneath? There's not much to show!



Link doesn't work - maybe not public?



