
Given that this was published in 2010, and Word2vec (https://en.wikipedia.org/wiki/Word2vec) was published in 2013, perhaps this was an early precursor?

From the article linked from this blog post: "Enabling computers to understand language remains one of the hardest problems in artificial intelligence."



I worked on this task for a year, and it doesn't work very well: in embedding space, relatedness, synonymy, and antonymy are mixed up and require pairwise thresholding. You can probably get to 90% this way, but not 99%. Better to use a cross-entropy approach.
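
To illustrate the thresholding problem, here is a minimal sketch in Python. The 3-d vectors are made up for illustration (they don't come from any real model), and `is_related` and the 0.8 cutoff are hypothetical names and values, not anything from the article.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy vectors (assumption): "hot" and "cold" share most of their
# direction because they appear in similar contexts, despite being antonyms.
toy_vectors = {
    "hot": [0.9, 0.1, 0.3],
    "warm": [0.8, 0.2, 0.3],
    "cold": [0.85, 0.15, -0.2],
    "banana": [0.0, 1.0, 0.1],
}

def is_related(a, b, threshold=0.8):
    """Pairwise thresholding: treat a pair as a match if similarity >= threshold."""
    return cosine(toy_vectors[a], toy_vectors[b]) >= threshold

for pair in [("hot", "warm"), ("hot", "cold"), ("hot", "banana")]:
    print(pair, round(cosine(toy_vectors[pair[0]], toy_vectors[pair[1]]), 3), is_related(*pair))
```

With these toy numbers, both the near-synonym pair and the antonym pair clear the cutoff, so one threshold can't tell them apart; the unrelated word is the only one rejected.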

In modern RAG applications we return top-k results for this reason: retrieval can't simply give the correct snippet in one result, which leaves the hard part, making sense of what is useful and what is not, to the LLM.
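
A rough sketch of what top-k retrieval looks like, assuming snippets are already embedded as plain lists of floats. The `top_k` helper, the snippet names, and the 2-d vectors are all hypothetical; real systems typically use a vector index rather than a linear scan.

```python
import heapq
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, snippets, k=3):
    """Return the k highest-scoring (score, text) pairs, best first."""
    scored = ((cosine(query_vec, vec), text) for text, vec in snippets)
    return heapq.nlargest(k, scored)

# Made-up embeddings (assumption), just to exercise the function.
snippets = [
    ("snippet A", [1.0, 0.0]),
    ("snippet B", [0.7, 0.7]),
    ("snippet C", [0.0, 1.0]),
]
query = [1.0, 0.1]

# Several candidates are handed to the LLM instead of betting on the single best hit.
for score, text in top_k(query, snippets, k=2):
    print(f"{score:.3f}  {text}")
```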



