
I think you nailed it.

For us it’s classifiers that we train for very specific domains.

You’d think it’d be better to just fine-tune a smaller non-LLM model, but empirically we find that fine-tuned LLMs (around 7B parameters) perform better.
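For concreteness, here's a minimal sketch of what that kind of fine-tune can look like, using Hugging Face transformers + peft with LoRA on a 7B base model. The model name, label count, and CSV files are illustrative assumptions, not the commenter's actual setup:

    # Sketch: fine-tune a ~7B LLM as a domain-specific classifier with LoRA.
    # Assumptions: any 7B-class base model, a binary label set, and a CSV
    # dataset with "text" and "label" columns for the target domain.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    MODEL = "mistralai/Mistral-7B-v0.1"  # assumption: any 7B base model
    NUM_LABELS = 2                        # assumption: binary classifier

    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # decoder-only models usually lack a pad token

    model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=NUM_LABELS)
    model.config.pad_token_id = tokenizer.pad_token_id

    # LoRA keeps the fine-tune cheap: only small adapter matrices (plus the
    # new classification head) are trained.
    peft_config = LoraConfig(task_type="SEQ_CLS", r=16, lora_alpha=32, lora_dropout=0.05)
    model = get_peft_model(model, peft_config)

    dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    dataset = dataset.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="domain-classifier",
            per_device_train_batch_size=4,
            num_train_epochs=3,
            learning_rate=2e-4,
            bf16=True,  # assumes a GPU with bfloat16 support
        ),
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
        tokenizer=tokenizer,  # Trainer then pads batches automatically
    )
    trainer.train()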



I think it's no surprise that a model with a more general understanding of text performs better than some tiny ad-hoc classifier that blindly learns a couple of patterns and has no clue what it's looking at. The small model is going to fail in much weirder ways that make no sense, like old CNN-based vision models.



