Hacker News
greenavocado
32 days ago
on: MegaTrain: Full Precision Training of 100B+ Parame...
So: distribute copies of the model in RAM across multiple machines, have each machine update a different part of the model weights, and sync the updates over the network.
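A minimal sketch of that scheme, simulated in-process: each worker holds a full copy of the weights, applies its gradient only to its assigned shard, and then the shards are gathered and broadcast so every copy converges again. All names here (`copies`, `synced`, the shard layout) are illustrative, not from any real framework; a real system would do the gather/broadcast over the network rather than in a loop.

```python
import random

random.seed(0)
N_WORKERS = 4
SHARD = 3                      # parameters per worker's shard
N = N_WORKERS * SHARD          # total parameter count
LR = 0.1

# One "model" and one local gradient per worker.
weights = [random.gauss(0, 1) for _ in range(N)]
grads = [[random.gauss(0, 1) for _ in range(N)] for _ in range(N_WORKERS)]

# Every machine starts from the same full copy of the model in RAM...
copies = [list(weights) for _ in range(N_WORKERS)]

# ...but updates only its own contiguous slice of the weights.
for w in range(N_WORKERS):
    for i in range(w * SHARD, (w + 1) * SHARD):
        copies[w][i] -= LR * grads[w][i]

# "Sync updates over the network": gather each worker's shard and
# rebuild the full weight vector (stand-in for an all-gather).
synced = []
for w in range(N_WORKERS):
    synced.extend(copies[w][w * SHARD:(w + 1) * SHARD])

# Broadcast: all workers replace their copy with the synced weights.
copies = [list(synced) for _ in range(N_WORKERS)]
```

This is the partitioned-update variant the comment describes; frameworks like PyTorch's DistributedDataParallel instead average full gradients across workers, which trades more network traffic for simpler bookkeeping.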
olliepro
32 days ago
decentralized training makes a lot more sense when the required hardware isn't a $40K GPU...