
You shouldn't need to allocate every possible combination if you dynamically add new pair/distance entries as you find them. I'm talking simple for loops.
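A rough sketch of what that could look like, assuming "pairs/distance" means (word, later word, gap) triples counted in a window. The function name `count_pairs` and the `max_distance` parameter are made up for illustration. Only combinations that actually occur ever get a dictionary entry:

```python
from collections import defaultdict

def count_pairs(tokens, max_distance=2):
    """Count (first, second, distance) triples that actually occur.

    Nothing is preallocated: a key is created the first time a pair is seen.
    """
    counts = defaultdict(int)
    for i in range(len(tokens)):
        for d in range(1, max_distance + 1):
            j = i + d
            if j >= len(tokens):
                break  # ran off the end of the text
            counts[(tokens[i], tokens[j], d)] += 1
    return counts

# Hypothetical toy input, just to show the shape of the result.
pair_counts = count_pairs("the cat sat on the mat".split())
```

With 5 distinct words and 2 distances there are 50 possible triples, but the dictionary only holds the ones seen in the text.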


You might enjoy this read, an up-to-date document from this year laying out what was the state of the art 20 years ago:

https://web.stanford.edu/~jurafsky/slp3/3.pdf

Essentially you just count every n-gram that's actually in the corpus, and "fill in the blanks" for all the 0s with some simple rules for smoothing out the probability.
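As a minimal sketch of that idea, here is a bigram model with add-k (Laplace) smoothing, which is one of the simplest rules of this kind. The helper names `train_bigrams` and `prob` are hypothetical, and the toy corpus is only there to make it runnable:

```python
from collections import Counter

def train_bigrams(tokens):
    """Count only the bigrams that actually appear in the corpus."""
    contexts = Counter(tokens[:-1])             # how often each word starts a bigram
    bigrams = Counter(zip(tokens, tokens[1:]))  # observed (prev, word) pairs
    vocab = set(tokens)
    return contexts, bigrams, vocab

def prob(word, prev, contexts, bigrams, vocab, k=1.0):
    """P(word | prev) with add-k smoothing.

    Unseen bigrams have a count of 0 (Counter returns 0 for missing keys),
    so they end up with a small nonzero probability instead of 0.
    """
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * len(vocab))

contexts, bigrams, vocab = train_bigrams("the cat sat on the mat".split())
p_seen = prob("cat", "the", contexts, bigrams, vocab)   # bigram occurs in the corpus
p_unseen = prob("on", "the", contexts, bigrams, vocab)  # bigram never occurs
```

Here `p_seen` is larger than `p_unseen`, but `p_unseen` is still above zero, and the probabilities over the whole vocabulary for a given `prev` still sum to 1.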



