Random notes on Computer Science, Mathematics and Software Engineering

Stochastic Computation Graphs: Discrete Relaxations

This is the second post of the stochastic computation graphs series. Last time we discussed models with continuous stochastic nodes, for which there are powerful reparametrization technics.

Unfortunately, these methods don’t work for discrete random variables. Moreover, it looks like there’s no way to backpropagate through discrete stochastic nodes, as there’s no infinitesimal change of random values when you infinitesimally perturb their parameters.

In this post I’ll talk about continuous relaxations of discrete random variables.

Stochastic Computation Graphs: Continuous Case

Last year I covered some modern Variational Inference theory. These methods are often used in conjunction with Deep Neural Networks to form deep generative models (VAE, for example) or to enrich deterministic models with stochastic control, which leads to better exploration. Or you might be interested in amortized inference.

All these cases turn your computation graph into a stochastic one – previously deterministic nodes now become random. And it’s not obvious how to do backpropagation through these nodes. In this series I’d like to outline possible approaches. This time we’re going to see why general approach works poorly, and see what we can do in a continuous case.