This post is in two parts. First, Christian Robert reviews a new book by two linguists, Shravan Vasishth and Michael Broe; then Vasishth responds.

**Review by Christian Robert:**

“We have seen that a perfect correlation is perfectly linear, so an imperfect correlation will be ‘imperfectly linear’.” (page 128)

**T**his book has been written by two linguists, Shravan Vasishth and Michael Broe, in order to teach statistics “in areas that are traditionally not mathematically demanding” at a deeper level than traditional textbooks “without using too much mathematics”, towards building “the confidence necessary for carrying more sophisticated analyses” through R simulation. This is a praiseworthy goal, bound to produce a great book. However, and most sadly, I find the book does not live up to expectations. As in Radford Neal’s recent coverage of introductory probability books, there are statements that show a deep misunderstanding of the topic…

“The least that you need to know about is LaTeX, Emacs, and Emacs Speaks Statistics. Other tools that will further enhance your working experience with LaTeX are AucTeX, RefTeX, preview-latex, and Python.” (page 1)

**T**he above recommendation is cool (along with the point that these tools are “already pre-installed in Linux”, while, for Windows or Macintosh, users “will need to read the manual”!) but ultimately rather daunting when considering the intended audience. While I use LaTeX and only LaTeX in my everyday work, the recommendation to learn LaTeX before coming to “understand the principles behind inferential statistics” sounds inappropriate. The book clearly does not require an understanding of LaTeX to be read, understood, and practiced. (The same goes for Python!)

**T**he authors advertise a blog about the book, but it contains very little information. (The last entry is from December 2010: “The book is out”.) This would have been a neat idea, had it been implemented.

“Let us convince ourselves of the observation that the sum of the deviations from the mean always equals zero.” (page 5)

**W**hat I dislike the most about this book is the space wasted on expository developments that aim at bypassing mathematical formulae, only to deliver that very formula at the end of the argument. And then there is the permanent confusion between the distribution and the sample, between the true parameters and their estimates. (Plus the many foundational mistakes, such as those reported below.) A reader with some earlier exposure to statistics is likely to be unsettled, even infuriated, by the style and pace. A reader without it will be left with gaping holes in her statistical basics: no proper definition of unbiasedness (hence a murky justification of the degrees of freedom whenever they appear), of the Central Limit Theorem, or of the t distribution, and no mention of the Law of Large Numbers (although a connection is made in the summary, page 63). This does not seem sufficient to engage in reading Gelman and Hill (2007), as suggested at the end of the book… Having the normal density introduced as the “somewhat intimidating-looking function” (page 39) certainly does not help! (Nor does the call to *integrate* rather than *pnorm* to compute normal tail probabilities, pages 69-70.)
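The *pnorm* versus *integrate* point is easy to illustrate. A minimal sketch (in Python with only the standard library rather than the book’s R; the functions `pnorm` and `dnorm` are merely named after their R counterparts) compares the direct CDF call with a brute-force numerical integration of the density:

```python
import math

def pnorm(x):
    # Standard normal CDF, named after its R counterpart:
    # Phi(x) = (1 + erf(x / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def dnorm(x):
    # Standard normal density, named after its R counterpart
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Tail probability P(X > 1.96): the direct route via the CDF...
tail_cdf = 1.0 - pnorm(1.96)

# ...versus numerically integrating the density over [1.96, 10]
# with a crude midpoint rule (the spirit of calling R's integrate()).
a, b, steps = 1.96, 10.0, 100_000
h = (b - a) / steps
tail_int = h * sum(dnorm(a + (i + 0.5) * h) for i in range(steps))

print(round(tail_cdf, 6), round(tail_int, 6))  # both about 0.025
```

The two routes agree to many decimal places, which is precisely why the one-line CDF call is the natural tool for tail probabilities.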

“The key idea for inferential statistics is as follows: If we know what a ‘random’ distribution looks like, we can tell random variation from non-random variation.” (page 9)

**T**he above quote gives a rather obscure and confusing entry to statistical inference, especially when it appears at the beginning of a chapter (Chapter 2) centred on the binomial distribution. As the authors seem reluctant to introduce the binomial probability function from the start, they resort to an intuitive discourse based on (rather repetitive) graphs (with an additional potential confusion induced by the choice of a binomial probability of *p=0.5*, since *p^k(1-p)^{n-k}* is then constant in *k*…). In Section 2.3, the distinction between binomial and hypergeometric sampling is not mentioned, i.e. the binomial approximation is used without any warning. The mean *np* of the binomial distribution *B(n,p)* is not established, and the variance *np(1-p)* is not stated at all. (However, the book spends four pages [36-39] showing through an R experiment that “the sum of squared deviations from the mean are [sic!] smaller than from any other number”.)
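Both missing facts, as well as the *p=0.5* remark above, are easy to verify numerically. A sketch (in Python rather than the book’s R; *n=10* and *p=0.3* are arbitrary illustration values) computes the mean and variance directly from the probability mass function:

```python
from math import comb

# Mean and variance of B(n, p), computed directly from the probability
# mass function; n = 10 and p = 0.3 are arbitrary illustration values.
n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

mean = sum(k * pmf[k] for k in range(n + 1))
var = sum((k - mean) ** 2 * pmf[k] for k in range(n + 1))

print(round(mean, 10), n * p)                      # 3.0 3.0
print(round(var, 10), round(n * p * (1 - p), 10))  # 2.1 2.1

# The p = 0.5 remark: p^k (1-p)^(n-k) no longer depends on k, so the
# pmf is then proportional to the binomial coefficients alone.
q = 0.5
assert len({q**k * (1 - q) ** (n - k) for k in range(n + 1)}) == 1
```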

“The mean of a sample is more likely to be close to the population mean than not.” (page 49)

**T**he above is the conclusive summary about the Central Limit Theorem, after a histogram with 8 bins showing that “the distribution of the means is normal!”… It is then followed by a section on “*s* is an Unbiased Estimator of *σ*”, nothing less!!! This completely false result (*s* is the standard estimator of the standard deviation *σ*, but it is biased: E[*s*] < *σ*) is again based on the “fact” that it is “more likely than not to get close to the right value”. The introduction of the t distribution is motivated by the “fact that the sampling distribution of the sample mean is no longer be modeled [sic] by the normal distribution” (page 55). With such flaws in the presentation, it is difficult to recommend the book at any level, especially the most introductory level.
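A short simulation makes the bias of *s* plain. This sketch (in Python rather than the book’s R; the sample size *n=5* is an arbitrary choice) shows that *s²* averages to *σ²* while *s* falls systematically short of *σ*:

```python
import random
import statistics

# Draw many small samples from N(0, sigma^2) and average the sample
# standard deviation s: it falls systematically short of sigma,
# even though s^2 (with the n-1 divisor) is unbiased for sigma^2.
random.seed(1)
sigma, n, reps = 1.0, 5, 100_000

s_vals = []
for _ in range(reps):
    sample = [random.gauss(0.0, sigma) for _ in range(n)]
    s_vals.append(statistics.stdev(sample))  # s, with the n-1 divisor

mean_s = sum(s_vals) / reps
mean_s2 = sum(s * s for s in s_vals) / reps

print(round(mean_s, 3))   # about 0.94, noticeably below sigma = 1
print(round(mean_s2, 2))  # close to 1: s^2 is the unbiased one
# (for n = 5, the exact factor E[s]/sigma is about 0.940)
```

The gap is not a small-sample curiosity of the simulation: by Jensen’s inequality, E[*s*] < *σ* for every finite *n*.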

“We know that the value is within 6 of 20, 95% of the time.” (page 27)

**I** am also dissatisfied with the way confidence and testing are handled (and not only because of my Bayesian inclinations!). The above quote, which replicates the usual fallacy about the interpretation of confidence intervals, is found a few lines away from a warning about the inversion of confidence statements! A warning only repeated later: *“it’s a statement about the probability that the hypothetical confidence intervals (that would be computed from the hypothetical repeated samples) will contain the population mean”* (page 59). The book spends a large number of pages on hypothesis testing, presumably because of the authors’ own interests; however, it is unclear that a neophyte could gain enough expertise from those pages to conduct his own tests. Worse, statements like *H₀: x̄ = μ* (page 75) show a deep misunderstanding of the nature of both testing and random variables. How can one test a property about the *observed* sample mean?! A similar confusion appears in the ANOVA chapter (e.g., (5.51) on page 112).

“The research goal is to find out if the treatment is effective or not; if it is not, the difference between the means should be ‘essentially’ equivalent.” (page 92)

**T**he following chapters are about analysis of variance (5), linear models (6), and linear mixed models (7), all of which suffer from fatal deficiencies similar to the ones noted above. The book would have greatly benefited from a statistician’s review before being published. (I cannot judge whether or not the book belongs to a particular series.) As is, it cannot deliver the expected outcome for its readers and train them towards more sophisticated statistical analyses. As a non-expert on linguistics, I cannot judge the requirements of the field nor the complexity of the statistical models it involves. However, even the most standard models and procedures should be treated with the appropriate statistical rigour. While the goals of the book were quite commendable, it seems to me that it cannot endow its intended readers with the proper perspective on statistics…

**Response by Shravan Vasishth:**

I speak only for myself, not my co-author.

1. About the outright mistakes in the book:

These I will definitely correct on the blog (acknowledging this reviewer and the review) and in the next edition (which I’ll try to get out as soon as possible). I’m grateful for that feedback. There are two kinds of errors: statements that the reviewer would like to see presented differently, and outright errors (such as H_0: \bar{x}=\mu; this one is truly absurd, I agree).

I am certainly not a statistician, so it’s not surprising that I end up annoying real statisticians with erroneous statements (this is a general problem whenever someone from field X writes about material from field Y, e.g., psychologists working on language).

It would be perfect if a statistician would write a book like this, instead of a bunch of linguists who have only an imperfect understanding of statistical theory. So why hasn’t it happened? I don’t know. My theory is that statisticians can’t see the world through a non-statistician’s eyes.

Perhaps the reviewer would like to write a better book; I would be the first to adopt it in my classes. Perhaps such a book already exists; I would appreciate a pointer to a book that he considers a more competent version of ours, with the same goals.

I should add that I tried to recruit several statisticians (at different points) to join us as co-authors for this book, in order to save us from ourselves, so to speak. Nobody was willing to join in; the most frequent response was that all the information is in the standard books already (which is true; it’s just not accessible to everyone).

Would the reviewer be willing to join us as co-author on a second edition? That would be a very useful outcome of this review.

2. About LaTeX recommendations being beside the point:

I do take the reviewer’s point about LaTeX etc. not being necessary for the book. But when criticizing the recommendations regarding LaTeX etc., he left out this line from the book:

“In order to use the code that comes with this book, you only need to install R…. None of these [LaTeX, emacs, etc.] are required but are highly recommended for typesetting and other sub-tasks necessary for data analysis.” (p. 1)

The reviewer considers the task of using/installing LaTeX/emacs etc. daunting/distracting, but these are standard tools for linguists at the universities I have attended (Graduate Program in Language and Culture, Osaka University; Linguistics, Ohio State, where I did my PhD; Computational Linguistics, Saarbruecken; Linguistics, Potsdam). Many among my intended audience need these tools anyway, so I don’t consider it overkill to suggest them to those who don’t have them yet.

Many linguists (especially linguists doing theoretical work involving, e.g., the lambda calculus or formal logic) only need to know that Sweave exists; they know the rest.

3. About the blog not being updated:

This was the first indication to me that the reviewer has let his rage get the better of him (the rage is no doubt there for good reason). When I am this enraged with an article or book, and have to review it, I generally write my venomous version of the review first, let it sit for a while, and then rewrite it, focusing just on the real weaknesses. I think the reviewer may benefit from this strategy in the future.

Turning now to the critique about the non-active blog: the book is 6 months old. Finding time to add useful entries to the blog is hard in my current situation (8 hours of teaching per week; lots of PhD students; a 5-year-old child; health problems; and so on). I expect to add material to the blog over this summer, when my sabbatical begins.

4. The criticism about the methods presented (“The book spends a large amount of pages on hypothesis testing, presumably because of the own interests of the authors, however it is unclear a neophyte could gain enough expertise from those pages to conduct his own tests.”):

Hypothesis testing is unfortunately the standard in linguistics; our focus has nothing to do with our personal interests. I am very keen to use Bayesian methods in my work, and I know that a new generation of linguists is already working hard in that direction. I know the material in Gelman and Hill (2007) well enough to fit models using WinBUGS, and I am working through Kruschke’s book. However, I don’t yet see how I can convince a journal reviewer to abandon hypothesis testing.

We generally fit linear mixed models, where we have some dependent measure, like reading time or event-related potentials, and we usually have a 2×2 or 2×3 within-subjects counterbalanced design, with participants and items as crossed random factors (see the book by Baayen, 2008, for more examples). If I use uninformative priors in WinBUGS to fit such a model, I will get results comparable to:

## condition with some appropriate contrast coding:
lmer(dv ~ condition + (1|subject) + (1|item), data = data)

What’s the payoff of switching to Bayesian models, for the reviewer? (I understand it myself; I follow Andrew Gelman’s blog. ;) If I could use informative priors, I could make a strong case for abandoning hypothesis testing in a paper, but I haven’t seen any clear strategy for doing that kind of analysis in books yet.

So, what did I learn from this review? I learnt that I should spend more time studying statistics. Even without the review, I knew I needed to understand this material better, and I am working on it. One central problem for an outsider trying to get insider knowledge of statistics is that one cannot engage a professional statistician long enough to get enough detail on obscure points (of which there are many). Nobody has the time. Andrew Gelman has patiently answered a lot of my questions, by email and in person, but there’s a limit to how much one can impose. The ideal way seems to be to become a full-time student again in a statistics department. Maybe that’s the way I will go.

In closing, I should add that although the reviewer doesn’t recommend this book at all, the reality is that we have received a lot of positive reviews from users. It has been in use at several universities (mostly in the US) as a freely available draft. I only published it after a lot of people asked me to.

We have excellent textbooks out there, particularly Baayen (2008) (focused on language research) and Gelman and Hill (2007) (focused on linear mixed models and Bayesian data analysis). These are in wide use in linguistics, but grad students are unable to get into them as their first textbook. Our book provides them with a way into those books (they need a lot of help with Gelman and Hill, but our grad students can make some headway with it after reading ours).

Given all this, perhaps the reviewer will dislike it a bit less once we fix the errors in the book?
