Random notes mostly on Machine Learning

Reciprocal Convexity to reverse the Jensen Inequality

Jensen’s inequality is a powerful tool often used in mathematical derivations and analyses. It states that for a convex function \(f(x)\) and an arbitrary random variable \(X\) we have the following upper bound: \[ f\left(\mathbb{E}\left[X\right]\right) \le \mathbb{E}\left[f(X)\right] \]
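As a quick sanity check (my own illustration, not part of the original post), take the convex function \(f(x) = x^2\):

```latex
\left(\mathbb{E}\left[X\right]\right)^2 \;\le\; \mathbb{E}\left[X^2\right],
```

which is just the familiar statement \(\operatorname{Var}(X) \ge 0\).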

However, oftentimes we want the inequality to go in the other direction, bounding the expectation from the other side. In this post I’ll outline one possible approach to this, based on reciprocal convexity.

Matrix and Vector Calculus via Differentials

Many tasks in machine learning can be posed as optimization problems. One comes up with a parametric model, defines a loss function, and then minimizes it to learn the optimal parameters. One very powerful tool of optimization theory is the use of smooth (differentiable) functions: those that can be locally approximated by linear functions. We all surely know how to differentiate a function, but it’s often more convenient to perform all the derivations in matrix form, since many computational packages like numpy or matlab are optimized for vectorized expressions.

In this post I want to outline the general idea of how one can calculate derivatives in vector and matrix spaces (but the idea is general enough to be applied to other algebraic structures).
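As a small taste of the technique (a sketch of my own, not taken from the post), consider \(f(x) = \|Ax - b\|^2\). Its differential is \(df = 2(Ax - b)^\top A\,dx\), from which the gradient \(\nabla f = 2A^\top(Ax - b)\) can be read off directly. The names `A`, `b`, `x` below are illustrative, and the result is checked numerically against central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

# f(x) = ||Ax - b||^2; the differential gives grad f = 2 A^T (Ax - b)
f = lambda v: np.sum((A @ v - b) ** 2)
grad = 2 * A.T @ (A @ x - b)

# Numerical check via central finite differences along each basis vector
eps = 1e-6
num = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                for e in np.eye(3)])
print(np.allclose(grad, num, atol=1e-4))  # True
```

The payoff of the differential approach is that no index bookkeeping is needed: the gradient pops out of a one-line matrix manipulation.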

Resizing Policy of std::vector

Some time ago, when Facebook open-sourced their Folly library, I was reading the docs and found something interesting. In the section “Memory Handling” they state:
“In fact it can be mathematically proven that a growth factor of 2 is rigorously the worst possible because it never allows the vector to reuse any of its previously-allocated memory”

I didn’t get it the first time. Recently I recalled that article and decided to work through it. After reading and googling for a while I finally understood the idea, so I’d like to say a few words about it.
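The core of the argument can be simulated in a few lines. In this deliberately simplified model (my own sketch, not from the Folly docs), buffers are laid out one after another in memory; when the vector grows, the old buffer must stay live during the copy, so only the buffers freed before it form a coalesced hole that a new allocation could reuse:

```python
def first_reuse_step(factor, start=16, steps=50):
    """Return the first growth step whose new allocation fits into the
    hole left by all earlier (already freed) buffers, or None if it
    never fits within the given number of steps."""
    sizes = [start]
    for step in range(1, steps):
        new_size = int(sizes[-1] * factor)
        hole = sum(sizes[:-1])  # current buffer stays live during the copy
        if hole >= new_size:
            return step
        sizes.append(new_size)
    return None

print(first_reuse_step(2.0))  # -> None: 2^k always exceeds 2^k - 1
print(first_reuse_step(1.5))  # -> 5: the hole eventually fits the request
```

With a factor of 2 the new request of \(2^k\) elements always exceeds the \(2^0 + 2^1 + \dots + 2^{k-1} = 2^k - 1\) elements freed so far, so the hole is never large enough; with a factor of 1.5 the freed space overtakes the request after a handful of steps.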