Last year I covered some modern Variational Inference theory. These methods are often used in conjunction with Deep Neural Networks to form deep generative models (VAE, for example) or to enrich deterministic models with stochastic control, which leads to better exploration. Or you might be interested in amortized inference.
All these cases turn your computation graph into a stochastic one – previously deterministic nodes now become random. And it's not obvious how to do backpropagation through these nodes. In this series I'd like to outline possible approaches. This time we're going to see why general approach works poorly, and see what we can do in a continuous case.