Yes. Pretraining and fine-tuning use standard Adam-style optimizers (usually with weight decay, i.e. AdamW). Reinforcement learning has historically been the odd one out, but these days almost all RL algorithms also use backprop and gradient descent.
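
For concreteness, here's a minimal PyTorch sketch (my own illustration, not something from the original comment) of the shared training loop: AdamW does the weight update via backprop, and only the loss term changes between supervised fine-tuning and an RL objective.

    import torch
    import torch.nn as nn

    model = nn.Linear(16, 4)                      # stand-in for a real model
    opt = torch.optim.AdamW(model.parameters(),   # Adam with decoupled weight decay
                            lr=3e-4, weight_decay=0.01)

    x = torch.randn(8, 16)                        # dummy batch
    target = torch.randint(0, 4, (8,))

    logits = model(x)
    # Supervised loss shown here; an RL objective (e.g. -advantage * log-prob
    # of the taken action) would slot in at this line instead.
    loss = nn.functional.cross_entropy(logits, target)

    opt.zero_grad()
    loss.backward()                               # backprop computes gradients
    opt.step()                                    # gradient-descent update via AdamW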

