
> "wanted to run glm-4.7-flash:q8_0" > q8_0

a well-made smaller quant (as in, one of unsloth's) will help a good amount here, without a notable reduction in performance or increase in perplexity
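for a rough sense of the memory side, here is a back-of-the-envelope sketch. the block layouts for q8_0 and q4_0 are the standard GGUF ones (32 weights per block plus an fp16 scale); everything else is an assumption: `N_PARAMS` is a hypothetical parameter count, not the real size of glm-4.7-flash, and q4_0 is only a stand-in for "smaller quant" (unsloth's quants mix tensor types, so their actual sizes will differ). it counts weights only, no KV cache or runtime overhead.

```python
# Rough weight-memory estimate for two GGUF quant types.
# Each block stores 32 weights:
#   q8_0: 32 int8 values + one fp16 scale          = 34 bytes
#   q4_0: 16 bytes of packed 4-bit values + fp16 scale = 18 bytes
BLOCK_WEIGHTS = 32
BLOCK_BYTES = {"q8_0": 34, "q4_0": 18}


def bits_per_weight(quant: str) -> float:
    """Effective bits per weight, including the per-block scale."""
    return BLOCK_BYTES[quant] * 8 / BLOCK_WEIGHTS


def weight_gib(n_params: float, quant: str) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * bits_per_weight(quant) / 8 / 2**30


if __name__ == "__main__":
    N_PARAMS = 30e9  # hypothetical parameter count, for illustration only
    for quant in BLOCK_BYTES:
        print(f"{quant}: {bits_per_weight(quant)} bpw, "
              f"~{weight_gib(N_PARAMS, quant):.1f} GiB")
```

the point is just the ratio: at these layouts q4_0 needs roughly half the weight memory of q8_0 (4.5 vs 8.5 bits per weight), whatever the real parameter count is.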


