Kingma and Welling’s paper introducing variational autoencoders provides an analytical formula for this quantity in the case of a standard normal prior and a Gaussian approximate posterior. For 10 points each:
[10m] Name this measure of difference between two probability distributions, P and Q, that equals the cross entropy of P and Q, minus the entropy of P.
ANSWER: Kullback–Leibler divergence [or KL divergence, or relative entropy, or I-divergence]
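For reference, a brief sketch of the relations in the clues above (discrete case shown; the continuous case replaces the sum with an integral), together with the Gaussian closed form given in Appendix B of the paper:
\[
D_{\mathrm{KL}}(P \,\|\, Q) \;=\; \sum_x P(x)\,\log\frac{P(x)}{Q(x)} \;=\; H(P, Q) - H(P)
\]
\[
-D_{\mathrm{KL}}\!\big(\mathcal{N}(\mu, \sigma^2 I)\,\big\|\,\mathcal{N}(0, I)\big) \;=\; \tfrac{1}{2}\sum_{j=1}^{J}\big(1 + \log\sigma_j^2 - \mu_j^2 - \sigma_j^2\big)
\]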
[10h] The paper introduces the AEVB algorithm, which optimizes the SGVB estimator of this quantity, which equals the expected conditional log-likelihood minus the KL divergence. Variational Bayesian methods seek to maximize this quantity when the exact posterior, and hence the marginal likelihood, is intractable.
ANSWER: evidence lower bound [or ELBO; accept variational lower bound or negative variational free energy; prompt on lower bound]
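For reference, a sketch of the bound described above in the paper’s notation, where q_phi is the approximate posterior and p_theta the generative model:
\[
\log p_\theta(x) \;\ge\; \mathcal{L}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\big[\log p_\theta(x \mid z)\big] \;-\; D_{\mathrm{KL}}\!\big(q_\phi(z \mid x)\,\big\|\,p_\theta(z)\big)
\]
The gap between the two sides is \(D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p_\theta(z \mid x)\big) \ge 0\), so maximizing the bound tightens the approximation to the intractable exact posterior while increasing the evidence.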
[10e] Gibbs’ inequality gives this value as a lower bound on the KL divergence. The Cramér–Rao lower bound on the variance of an unbiased estimator approaches this value as the Fisher information approaches infinity.
ANSWER: zero
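For reference, the two bounds referenced in the clues (the Cramér–Rao form shown assumes an unbiased estimator):
\[
D_{\mathrm{KL}}(P \,\|\, Q) \;\ge\; 0 \quad\text{(Gibbs’ inequality, with equality iff } P = Q\text{)}
\]
\[
\operatorname{Var}(\hat\theta) \;\ge\; \frac{1}{\mathcal{I}(\theta)} \;\longrightarrow\; 0 \quad\text{as } \mathcal{I}(\theta) \to \infty
\]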
<Adam Fine and David Bass, Other Science - Mathematics>