Little fact used in the IWAE paper

There’s a little lemma used in the Importance Weighted Autoencoder paper [1]

Let $I \in {1,\cdots, k}$ with $|I|=m$ be a uniformly distributed subset of distinct indices. Then

proof:

The probability of a random subset $S$ with $m$ distinct indices chosen from $I$ is ${m \choose k}^{-1}$. Therefore

The part that needs explaining is line 3, the number of times $a_j$ appears in the sum in line 2 is exacly ${k-1\choose m-1}$. This is because once any first index fixed, there are ${k-1\choose m-1}$ combinations such an index appears in.

Reference

[1]

Yuri Burda, Roger Grosse & Ruslan Salakhutdinov. 2016. Importance Weighted Autoencoder. arXiv:1509.00519