
There wasn't a study or analysis. It was just lazy speculation that felt good because it could be bound up in an "evil white countries exploiting the developing world" narrative. Where exploiting was "paying to do a job".

It was submitted as https://news.ycombinator.com/item?id=40623629

Again, there is effectively zero real data showing this. Further, RLHF isn't likely to reinforce such word selection regardless.

A more logical, likely scenario is that the training data is heavily biased towards higher-grade-level material, so word selection veers towards the kind of writing you find in those realms.



> It was just lazy speculation that felt good because it could be bound up in an "evil white countries exploiting the developing world" narrative. Where exploiting was "paying to do a job".

Exploitation like that is in fact happening (see pretty much everything having to do with social media content moderation and RLHF to avoid disturbing content).

Also "paying to do a job" is not the moral panacea you seem to think it is.


Tinfoil hat theory: they have already implanted watermarks, so that AI-generated text can be flagged for future training runs or as a service, with some phrases coaxed into becoming statistical beacons.


That's not really a tinfoil hat theory. That's been possible for some years, and OpenAI reportedly does watermark its outputs and can detect the watermark. They just haven't released detection as a service because it'd annoy all the users who are using it for cheating :)


I believe that if that was possible to do on purpose, they wouldn’t have so much trouble preventing the LLMs from talking about things they shouldn’t.


Yeah I would like to see some evidence of this too. It's just asserted as truth in the article. Delve doesn't seem like a particularly unusual word to me, especially in the context of scientific abstracts, and LLMs could totally learn random weird things. How common is "it's important to remember" in Nigeria?


Wait, why wouldn’t RLHF influence word choices?


I didn't say it wouldn't (or rather couldn't), I said it was unlikely for the selected hypothesis given standard training data vs RLHF iterations.


then again, most history consists of whitewashing back when northern countries were exploiting everywhere else in various ways: imperialism, colonialism, neocolonialism, capitalism, financialization,...

typical people prefer to pretend this is simply "order" and "progress"; seemingly blind to their own ideological baggage like fish in water



