
I think you nailed it.

For us it’s classifiers that we train for very specific domains.

You’d think it’d be better to just fine-tune a smaller non-LLM model, but empirically we find that fine-tuned LLMs (around 7B parameters) perform better.
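For concreteness, here's a minimal sketch of what that kind of fine-tune can look like, using Hugging Face transformers + peft with LoRA on a 7B base model. The model name, label count, and CSV files are illustrative assumptions, not the commenter's actual setup:

    # Sketch: fine-tune a ~7B LLM as a domain-specific classifier with LoRA.
    # Assumptions: any 7B-class base model, a binary label set, and a CSV
    # dataset with "text" and "label" columns for the target domain.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    MODEL = "mistralai/Mistral-7B-v0.1"  # assumption: any 7B base model
    NUM_LABELS = 2                        # assumption: binary classifier

    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    tokenizer.pad_token = tokenizer.eos_token  # decoder-only models usually lack a pad token

    model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=NUM_LABELS)
    model.config.pad_token_id = tokenizer.pad_token_id

    # LoRA keeps the fine-tune cheap: only small adapter matrices (plus the
    # new classification head) are trained.
    peft_config = LoraConfig(task_type="SEQ_CLS", r=16, lora_alpha=32, lora_dropout=0.05)
    model = get_peft_model(model, peft_config)

    dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    dataset = dataset.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir="domain-classifier",
            per_device_train_batch_size=4,
            num_train_epochs=3,
            learning_rate=2e-4,
            bf16=True,  # assumes a GPU with bfloat16 support
        ),
        train_dataset=dataset["train"],
        eval_dataset=dataset["test"],
        tokenizer=tokenizer,  # Trainer then pads batches automatically
    )
    trainer.train()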



I think it's no surprise that a model with a more general understanding of text performs better than some tiny ad-hoc classifier that blindly learns a couple of patterns and has no clue what it's looking at. The small model is going to fail in much weirder ways that make no sense, like old CNN-based vision models.



