
Yes, the concept of "syntactic complexity" applied to LLMs can be very different from what we expect, and I think it depends on the tokenizer. Perhaps LLMs could be fine-tuned using a grammar for programming languages, with special tokens for that grammar, in order to reduce syntactic complexity. For example, in Lisp, a left or right parenthesis could be tokenized in a special way (as a dedicated left-lisp-parenthesis or right-lisp-parenthesis token), so that the LLM could learn faster and make fewer syntax errors.
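To make the Lisp example concrete, here is a rough sketch in Python of what such a grammar-aware pre-tokenization step might look like. It is a toy, not how any real LLM tokenizer works: the token names `<lisp-lparen>` and `<lisp-rparen>` and the helpers `tokenize_lisp` and `is_balanced` are made up for illustration, and a real setup would register the special tokens in the model's vocabulary rather than pass strings around.

```python
import re

# Hypothetical special-token names (assumptions for this sketch only).
LPAREN = "<lisp-lparen>"
RPAREN = "<lisp-rparen>"

# String literals are matched first so parentheses inside them stay untouched;
# then single parentheses; then any other run of non-space characters.
_TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|[()]|[^\s()"]+')


def tokenize_lisp(source):
    """Split Lisp source into tokens, mapping each parenthesis to a special token."""
    tokens = []
    for match in _TOKEN_RE.finditer(source):
        text = match.group(0)
        if text == "(":
            tokens.append(LPAREN)
        elif text == ")":
            tokens.append(RPAREN)
        else:
            tokens.append(text)
    return tokens


def is_balanced(tokens):
    """Return True if the special parenthesis tokens nest correctly."""
    depth = 0
    for tok in tokens:
        if tok == LPAREN:
            depth += 1
        elif tok == RPAREN:
            depth -= 1
            if depth < 0:  # a close appeared before its matching open
                return False
    return depth == 0


if __name__ == "__main__":
    toks = tokenize_lisp("(defun f (x) (+ x 1))")
    print(toks)
    print(is_balanced(toks))
```

With the structural tokens kept separate from ordinary text, a check like `is_balanced` could in principle also be used to filter or score model output, though whether this actually speeds up learning is the open question.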

