Suspicious pattern of too-strong replications of medical research

(by Howard Wainer)

This article from 2005 (by Zhenglun Pan, Thomas Trikalinos, Fotini Kavvoura, Joseph Lau, and John Ioannidis) is brilliant.

It is well established that the size of many experimental effects diminishes over time. We often see an initial investigation of some new treatment show a large effect, while subsequent replications show a much smaller one. The blame for this is often laid at the door of publication bias: the sampling distribution of the effect might be Gaussian with a mean just slightly above zero, so many studies of the treatment can't get published because they have small or null results. Eventually a study draws a result from the high tail of the distribution and is published in an A-list journal with fanfare. Once it is in the literature, subsequent attempts at replication can also get published, and so out of the file drawers come the other studies, often done before the alpha study, but in B-list journals.
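This mechanism is easy to see in a small simulation. The sketch below assumes an illustrative true effect of 0.1 and a per-study standard error of 0.5 (both invented for the example, not taken from the paper): when only results beyond a publication threshold see print first, the "alpha" study's effect is inflated far above the truth, and honest replications then look like a decline.

```python
import random
import statistics

random.seed(1)

TRUE_EFFECT = 0.1   # true mean effect, just above zero (assumed for illustration)
SE = 0.5            # per-study standard error (assumed)
THRESHOLD = 1.0     # only dramatic observed effects get published first (assumed)

# Simulate many independent studies of the same treatment.
studies = [random.gauss(TRUE_EFFECT, SE) for _ in range(10_000)]

# The "alpha" studies: results from the high tail that clear the bar for
# a splashy first publication.
alpha_pool = [e for e in studies if e > THRESHOLD]

# An unselected replication averages close to the true effect,
# so it looks like the effect has "shrunk" relative to the alpha study.
print(f"true effect:                 {TRUE_EFFECT}")
print(f"mean of all studies:         {statistics.mean(studies):.3f}")
print(f"mean of 'published' studies: {statistics.mean(alpha_pool):.3f}")
```

The point of the sketch is only that selection on the high tail guarantees the first published estimate exceeds the truth; no misconduct is required to produce the decline.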

Enter the attached paper. The Chinese scientific literature is rarely read or cited outside of China, but the authors of this work are usually knowledgeable about the non-Chinese literature, at least the A-list journals. And so they too try to replicate the alpha finding. But do they? One would expect them to find the same diminished effect size, but they don't! Instead they replicate the original result, only larger. Here's one of the graphs:

How did this happen? Is it because the Chinese researchers 'know' what answer they should get, and so arrange their results accordingly? Or is there some sort of publication bias in which studies with the 'wrong' (too-small) result are not accepted? I don't know, but my conclusion is that we need to view medical research from China (and perhaps elsewhere) with caution until we better understand the mechanisms driving it.


5 Responses to “Suspicious pattern of too-strong replications of medical research”


  1. Wayne May 9, 2011 at 5:43 pm

    When doing searches in computer science literature, I have stopped reading papers with all Chinese authors. Too many seem like they’ve been put out by someone with a quota to meet and a random concept combiner: take two or three well-known techniques and an application area and mix it all together without any real motivation, then write it up.

    Wasn’t there an article recently that Gelman linked to about Chinese doctoral students creating theses by quoting authority figures?

  2. Michael J. McFadden May 11, 2011 at 1:27 am

    It might be interesting to view some of the studies on secondary smoke exposure and either LC or CHD with the same jaundiced eye. Since the 1980s and particularly since the 1990s researchers producing such studies have known quite well which side their bread is more likely to be buttered on, both in terms of initial and repeat funding and in terms of publication likelihood.

    When you throw idealism into the mix (the feeling likely shared by *many* researchers in the field that results condemning secondary smoke exposure will be contributing to the “greater good” of smoking bans that will reduce primary smoking activity) you have a pudding that is far more ripe for corruption than that which exists in almost *any* other sphere of scientific research.

    While the ETS/LC/CHD long-term epidemiological research is difficult for a non-professionally-trained researcher to evaluate and critique properly (particularly without access to primary data), other antismoking research is more accessible. And the findings from evaluating such research are *far* from reassuring.

    See for example, my critique of the Klein study on economic impacts of smoking bans in the comments area at:

    http://www.jacobgrier.com/blog/archives/2210.html

    or my critique of the warnings about “thirdhand smoke” in the comments at:

    http://globalhealthlaw.wordpress.com/2009/01/11/third-hand-smoke/#comment-52

    or my several criticisms of the Helena heart attack research in the BMJ’s Rapid Response area at:

    http://www.bmj.com/content/328/7446/977/reply

    There are many other examples I could provide, but those should be enough for starters in order to make the point. I have seen analyses *very* similar to those in the graphs at the top of this story used for the long term ETS effect studies and, given the chicanery I have seen in so many studies that are more friendly to independent analysis, I believe they may be even more deeply flawed than the Chinese example given above.

    Michael J. McFadden
    Author of “Dissecting Antismokers’ Brains”


  1. Suspicious pattern of too-strong replications of medical research « Epanechnikov's Blog Trackback on May 5, 2011 at 4:24 pm
  2. Too good to be true? Exercise skepticism. | Process Revolution Trackback on May 7, 2011 at 1:38 pm
  3. How often should statistical significance occur? « IREvalEtAl Trackback on May 8, 2011 at 11:22 am





About

The Statistics Forum, brought to you by the American Statistical Association and CHANCE magazine, provides everyone the opportunity to participate in discussions about probability and statistics and their role in important and interesting topics.

The views expressed here are those of the individual authors and not necessarily those of the ASA, its officers, or its staff. The Statistics Forum is edited by Andrew Gelman.

A Magazine for People Interested in the Analysis of Data
