11 September 2014

Beta Function

How to compute $\int_0^1 dp\, p^k(1-p)^{n-k}$ without symbol manipulation.

The symbols I used to write it might be a giveaway. We'll imagine throwing a coin and obtaining sequences of head and tail like $$HTTTTHTHHH$$ This one has probability $p^5(1-p)^5$. If $p$ is the probability of obtaining H and $k$ is the number of Hs, then the general form is $p^k(1-p)^{n-k}$. The chances of obtaining some sequence with $k$ Hs is $$\binom{n}{k} p^k(1-p)^{n-k}$$

But, what if we don't know the value of $p$? Instead, we just have some guess about where its value is. (For example, perhaps we just know it's very likely in $[0.4,0.6]$.) In general, we'd model this with a probability density: We say that value $p$ happens with probability $h(p)\,dp$, where the function $h(p)$ is the probability density. Then, to get an expectation for the probability of seeing $k$ Hs, we average over the possible values of $p$: $$\int_0^1 dp\, h(p) \binom{n}{k} p^k(1-p)^{n-k}$$ The integral is from 0 to 1, because $h(p)$ must be $0$ elsewhere; after all, we know that $p$ is a probability.

If we have no idea what value $p$ has, then we should expect to see all values of $k$ with equal probability. The possible values are $0,1,\ldots,n$, so the probability is $1/(n+1)$. How do we express in terms of $h(p)$ that we have no idea what value $p$ has? That's right, we simply choose a uniform distribution $h(p)=1$, to say that we don't prefer one guess over another. Hence, $$\int_0^1 dp\, \binom{n}{k} p^k(1-p)^{n-k} = \frac{1}{n+1}$$ and $$\int_0^1 dp\, p^k(1-p)^{n-k} = \frac{1}{(n+1)\binom{n}{k}}$$

Sure, this reasoning is fast and loose. Nevertheless, this is what I did when I had to compute the integral, and, because I thought it's kinda cute, I'll probably not forget the trick. If you want a more rigouros derivation, Wikipedia has one.