(by Michael Lavine)
In the May 2010 issue of Statistical Science, Bradley Efron published an article titled “The Future of Indirect Evidence.” His point is that indirect evidence is all around us, in increasing amounts, and that statistics is adapting, and may have to adapt further, to handle it. In this post, I’d like to use some of his examples to ask whether the distinction between direct and indirect evidence is real. All examples in this post come from Efron’s article.
Example 1. A couple is expecting twins. From a sonogram, they know the twins are both boys. What is the probability they are identical? Efron says that among all twins, Pr[Identical] = 1/3 and Pr[Fraternal] = 2/3. Further, Pr[both boys | Identical]/Pr[both boys | Fraternal] = 2. Multiplying the prior odds of 1:2 by this likelihood ratio of 2 gives posterior odds of 1:1, so Pr[Identical | both boys] = Pr[Fraternal | both boys] = 1/2. So far, so good. But Efron goes on to remark that the couple “are learning [directly] from their own experience (the sonogram), but also indirectly from the experience of others” [the 1:2 prior odds of Identical to Fraternal].
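For anyone who wants to check the arithmetic in Example 1, here is a minimal sketch in Python. It uses only the two quantities quoted from Efron (the 1/3 vs. 2/3 prior and the likelihood ratio of 2). The variable names are illustrative and not taken from his article.

```python
from fractions import Fraction

# Prior quoted in the post: Pr[Identical] = 1/3, Pr[Fraternal] = 2/3.
prior_identical = Fraction(1, 3)
prior_fraternal = Fraction(2, 3)

# Likelihood ratio quoted in the post:
# Pr[both boys | Identical] / Pr[both boys | Fraternal] = 2.
likelihood_ratio = Fraction(2)

# Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
prior_odds = prior_identical / prior_fraternal      # 1/2, i.e. 1:2
posterior_odds = prior_odds * likelihood_ratio      # 1, i.e. 1:1

# Convert odds back to probabilities.
posterior_identical = posterior_odds / (1 + posterior_odds)
posterior_fraternal = 1 - posterior_identical

print(posterior_identical, posterior_fraternal)  # prints "1/2 1/2"
```

Working in odds form means only the ratio of the two likelihoods is needed, which is all the example supplies.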
Example 2. We know from experience that kidney function decreases with age. A new kidney donor, aged 55, appears. How good are his kidneys? One previous donor in our database was 55 years old. Efron says that this other 55-year-old donor provides direct evidence about the new donor’s kidney function, while the remaining donors provide indirect evidence, through the regression function.
Here’s what puzzles me. In Example 1, the experience of everyone else — millions of couples who had twins — similarly situated to the couple of interest, is called indirect. But in Example 2, the experience of everyone else — just one person this time — similarly situated to the person of interest, is called direct.
I don’t see the difference. Can anyone enlighten me?