
Training is performed in parallel with batching and is more FLOPs-heavy. I don't have an intuition for how memory-bandwidth-intensive updating the parameters is. It shouldn't be much worse than doing a single forward pass, though.
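To put a rough number on that guess, here is a back-of-the-envelope sketch in Python. It is not a measurement. It assumes every tensor is streamed through memory exactly once, values are stored in fp32, and activation traffic is ignored. The function name `weight_traffic_bytes` and the 7B parameter count are made up for illustration.

```python
def weight_traffic_bytes(num_params, bytes_per_value=4):
    """Rough bytes moved through memory, assuming each tensor is streamed once."""
    # Forward pass: every weight is read once (activations ignored).
    forward = num_params * bytes_per_value
    # Plain SGD step: read weight, read gradient, write weight.
    sgd_update = 3 * num_params * bytes_per_value
    # Adam step: read weight, gradient, and two moment buffers;
    # write weight and both moment buffers.
    adam_update = 7 * num_params * bytes_per_value
    return {"forward": forward, "sgd_update": sgd_update, "adam_update": adam_update}


# Hypothetical 7B-parameter model, fp32 everywhere.
traffic = weight_traffic_bytes(7_000_000_000)
for name, nbytes in traffic.items():
    ratio = nbytes / traffic["forward"]
    print(f"{name}: {nbytes / 1e9:.0f} GB ({ratio:.0f}x forward weight reads)")
```

Under these assumptions the update costs a small constant multiple of one forward pass's weight reads (3x for plain SGD, 7x for Adam), and it happens once per batch rather than once per sample, so batching amortizes it. Mixed precision, fused kernels, or sharded optimizer state would change the constants.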

