This operation’s output is estimated by simulating trajectories in the REINFORCE algorithm, one of a class of methods named for “policy” and this operation. Pseudo-residuals are fit by “weak learners” such as decision trees in a method, contrasted with random forests, that is named for this operation’s “boosting.” Since the sigmoid activation function saturates at extreme values, networks that use it are susceptible to a problem in which this operation “vanishes.” The negated learning rate scales the result of this operation applied to the loss of a single data point in the update step of a stochastic algorithm named for this operation. An optimization algorithm that takes steps in the direction opposite to this operation is named for this operation’s “descent.” For 10 points, name this operation that outputs a vector of partial derivatives.
ANSWER: gradient [accept policy gradient; accept gradient boosting or gradient-boosted trees; accept vanishing gradient; accept gradient descent or stochastic gradient descent; prompt on del or nabla; prompt on partial derivative until read]
<Editors, Other Science>
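The stochastic gradient descent update described in the question can be sketched as follows. This is a minimal illustration on a toy one-dimensional least-squares loss; the function name, learning rate, and data are invented for the example, not taken from the question.

```python
def sgd_step(w, x, y, lr=0.1):
    """One SGD update for the single-point loss L(w) = 0.5 * (w*x - y)**2."""
    grad = (w * x - y) * x   # dL/dw: the gradient of this data point's loss
    return w - lr * grad     # step in the direction opposite the gradient

# Fit y = 2x from noiseless pairs, one data point at a time.
w = 0.0
data = [(1.0, 2.0), (2.0, 4.0), (0.5, 1.0)]
for _ in range(100):
    for x, y in data:
        w = sgd_step(w, x, y)
print(round(w, 3))  # converges toward the true slope 2.0
```

Each update uses only one data point's gradient, which is what makes the method "stochastic"; full gradient descent would instead average the gradient over the whole dataset before stepping.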