13 September 2014

The Effect of Discouragement

How many books will one write? (Or blog posts, for that matter.)

Terry Tao recently explained why most bad books are written by good authors. The basic idea is that bad authors are discouraged and give up writing. The model he used describes each author by the probability $p$ of producing a good book. But there's no model for discouragement. Instead, it just sneaks in through an example. Let's try to rectify that. We add a probability $q_0$ of giving up after writing a good book, and a probability $q_1$ of giving up after writing a bad book.

How many books will one write on average, in this model? There's a picture below, for $q_0=1\%$ and various values of $q_1$. The $x$-axis is $p$. So, for example, we see that if all written books are good ($p=1$), then we expect 100 books. But, if all books are bad ($p=0$) and $q_1=2\%$, then we expect 50 books. [Edit: I should perhaps state explicitly what I see in this graph: The number of books you'll write doesn't depend too much on how good you are. Unless you are very good ($p>0.8$), in which case you'll write a lot of books.]

In case you are curious, here's how I drew this. One writes $n\ge1$ books if one writes $n-1$ books without giving up and gives up after the $n$th. Thus, the probability of writing $n$ books is $\alpha^{n-1}\beta$, where
\begin{align}
\alpha &= p(1-q_0) + (1-p)(1-q_1) \\
\beta &= pq_0 + (1-p)q_1
\end{align}
If we define
$$G(x) = \sum_{n\ge 1} \alpha^{n-1} \beta x^n$$
then the expected value we search for is $G'(1)$. Since $G$ is a geometric series, it's easy to compute the sum for $|\alpha x|<1$:
\begin{align}
G(x) &= \beta x \sum_{n\ge 0} (\alpha x)^n \\
&= \beta x \lim_{N\to\infty}\frac{1-(\alpha x)^N}{1-\alpha x} \\
&= \frac{\beta x}{1-\alpha x}
\end{align}
Differentiate:
\begin{align}
G'(x) &=\frac{\beta(1-\alpha x)+\alpha\beta x}{(1-\alpha x)^2} \\
&=\frac{\beta}{(1-\alpha x)^2}
\end{align}
So, the plot you see above is for the function
$$G'(1) = \frac{\beta}{(1-\alpha)^2} = \frac{1}{\beta}$$
where the last step uses $\alpha+\beta=1$. This matches the examples: $p=1$ gives $1/q_0=100$ books, and $p=0$ with $q_1=2\%$ gives $1/q_1=50$.
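
If you want to check the formula numerically, here is a minimal Python sketch. It evaluates $\beta/(1-\alpha)^2$ directly and compares it with a brute-force simulation of the give-up process. The function names (`expected_books`, `simulate_books`, `simulated_mean`) and the trial count are made up for illustration, and this is not necessarily how the original plot was produced.

```python
import random


def expected_books(p, q0, q1):
    """Closed form beta / (1 - alpha)^2 from the derivation above."""
    alpha = p * (1 - q0) + (1 - p) * (1 - q1)  # keep writing after a book
    beta = p * q0 + (1 - p) * q1               # give up after a book
    if beta <= 0:
        raise ValueError("the author never gives up; the expectation is infinite")
    return beta / (1 - alpha) ** 2


def simulate_books(p, q0, q1, rng):
    """Simulate one author and return how many books they write."""
    n = 0
    while True:
        n += 1
        good = rng.random() < p           # is this book good?
        quit_prob = q0 if good else q1    # discouragement depends on quality
        if rng.random() < quit_prob:
            return n


def simulated_mean(p, q0, q1, trials=20000, seed=0):
    """Average book count over many simulated authors (assumed trial count)."""
    rng = random.Random(seed)
    total = sum(simulate_books(p, q0, q1, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    q0 = 0.01
    for q1 in (0.02, 0.05, 0.10):
        for p in (0.0, 0.5, 0.8, 1.0):
            exact = expected_books(p, q0, q1)
            approx = simulated_mean(p, q0, q1)
            print(f"q1={q1:.2f} p={p:.1f} formula={exact:8.2f} simulated={approx:8.2f}")
```

The simulated averages should land close to the formula, with some noise, because the number of books varies a lot from one author to the next.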

Edit 2: There are two things that are kinda misleading above. Let me try to rectify, if only briefly. One: Tao shows that most bad books could be written by good authors, not that they are. Two: In the model from above with $q_0$ and $q_1$ it is not true that most bad books are written by bad authors.