Yes. Pretraining and fine-tuning use standard Adam-style optimizers (usually with weight decay, i.e. AdamW). Reinforcement learning has historically been the odd one out, but these days almost all RL algorithms also use backprop and gradient descent.
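
For concreteness, here's a minimal PyTorch sketch (my own illustration, not something from the original comment) of the shared training loop: AdamW does the weight update via backprop, and only the loss term changes between supervised fine-tuning and an RL objective.

    import torch
    import torch.nn as nn

    model = nn.Linear(16, 4)                      # stand-in for a real model
    opt = torch.optim.AdamW(model.parameters(),   # Adam with decoupled weight decay
                            lr=3e-4, weight_decay=0.01)

    x = torch.randn(8, 16)                        # dummy batch
    target = torch.randint(0, 4, (8,))

    logits = model(x)
    # Supervised loss shown here; an RL objective (e.g. -advantage * log-prob
    # of the taken action) would slot in at this line instead.
    loss = nn.functional.cross_entropy(logits, target)

    opt.zero_grad()
    loss.backward()                               # backprop computes gradients
    opt.step()                                    # gradient-descent update via AdamW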

