
  "Inference starts at a comfortable 30 t/s
Is this including the context? With a 1000-token context and a 20-token instruction, does that take 1020/30 s, or 20/30 s?
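The two readings of the question differ by a lot. A back-of-envelope sketch, using the hypothetical numbers from the comment (1000-token context, 20-token instruction) plus an assumed prefill speed, since prompt processing (prefill) is usually much faster per token than generation (decode):

```python
# Hypothetical numbers from the comment above.
context_tokens = 1000
instruction_tokens = 20
decode_rate = 30.0  # tokens/s, the quoted "30 t/s"

# Reading A: 30 t/s applies to every token, prompt included.
naive_seconds = (context_tokens + instruction_tokens) / decode_rate
print(naive_seconds)  # 34.0 s

# Reading B: the prompt is processed in a fast, compute-bound prefill pass,
# and 30 t/s only measures decode. The prefill rate here is an assumption
# for illustration, not a figure from the article.
prefill_rate = 500.0  # tokens/s (assumed)
prefill_seconds = (context_tokens + instruction_tokens) / prefill_rate
print(prefill_seconds)  # 2.04 s before the first generated token
```

Under reading B the quoted 30 t/s would describe only the generated output, with the 1020 prompt tokens absorbed much faster during prefill.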

  "Second, LLMs have goldfish-sized working memory. ... In practice, an LLM can hold several book chapters worth of comprehension “in its head” at a time. For code it’s 2k or 3k lines (code is token-dense).
That's not exactly goldfish-sized, and in fact already very useful.
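The "2k or 3k lines" figure checks out roughly. A quick sketch, assuming a 32k-token context window and ~12 tokens per line of source code (both illustrative numbers, not from the article; real tokenizers and models vary):

```python
# Rough estimate of how many lines of code fit in an LLM's context window.
context_window = 32_000   # tokens (assumed window size)
tokens_per_line = 12      # rough average for tokenized source code (assumed)

lines_that_fit = context_window // tokens_per_line
print(lines_that_fit)  # 2666, squarely in the quoted 2k-3k range
```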

  "Third, LLMs are poor programmers. At best they write code at maybe an undergraduate student level who’s read a lot of documentation.
Exactly what I want for local code generation.

I think he's anti-hyping a little by insisting LLMs are in fact _not_ super-intelligent and whatnot. Sure, some people believe that, but come on ... we're not at a McKinsey workshop here.

---

Any good German language models out there?


