We’re looking for stories of everyday statistical life. What’s it like to do statistics?
See here for further details.
(by Andrew Gelman)
Following our discussion of Avastin, a drug that some have argued doesn’t work for one of its Medicare-approved uses, an anonymous source who has worked with a foreign regulatory review agency speculated that, at least there, in similar situations, the lack of an observed overall survival (OS) advantage would not have been a large concern (for the statistical reasons that Don Berry raised in his thoughtful post). However, there is uncertainty about whether an advantage in progression-free survival (PFS) always leads to an OS advantage. This can be fairly well contained from the context in many cancers, as Don Berry also indicates, but it remains extra uncertainty above what is usual in the evaluation of RCT-based evidence.
The larger concerns more likely would have been:
1. Is the benefit of the uncertain increase in PFS worth the risk of uncertain harms observed in the trials?
2. How does one properly discount what was “reported” about the trials/research program by the sponsor, given, as Sander Greenland nicely put it in his short post, “the influences of vested interests” that almost surely exaggerated those reports?
For 1, it needs to be kept in mind that the consideration of costs, and of which patients should receive the approved products (i.e., cost-effectiveness), might be best left for others to decide later. That is, an initial focus strictly on deciding whether benefits exceed (justify) the harms might be better. In this initial focus, note that the increases in PFS are often small, harms often include fatal ones, and the harms observed in the trials usually considerably underestimate the harms in less well-controlled practice settings.
For 2, one should not blame the drug companies or expect otherwise from their employees/consultants. It is partly human nature and also what business is about (when within reason, such as when the “answer” is not obvious to most and there is enough uncertainty for judgments to vary widely). See this paper by Sander Greenland.
(by Andrew Gelman)
Last week I posted a link to New York Times columnist Joe Nocera’s claim that Medicare is a corrupt system that is paying for a drug, Avastin, that doesn’t work for breast cancer patients. I sent the pointer to several colleagues, along with the following note:
I have three goals here:
1. To get to the bottom of the Avastin story.
2. If Nocera is correct, to add the voice of the statistics profession to his push to have more rational rules for drug approval.
3. To promote the Statistics Forum, as part of a larger goal of engaging the statistics community in frequent public discussions.
In the old days, professional discussions were centered on journal articles. We still do this, but blogs and other online forums are becoming increasingly important. By contributing a couple paragraphs here (including links to further reading), you will be helping in all three goals above. Just send to me and I will post at the Forum (under your name).
(I myself am doing this in the spirit of service to the profession in my duties as editor of the Statistics Forum.)
I received several responses. The quick summary is:
1. Nocera claimed that Avastin doesn’t work for breast cancer. Actually, though, the data seem to show that it does work.
2. It’s not clear whether Medicare should be paying for the drug. In any case, the drug reimbursement system is a mess.
Now here are the comments from the experts.
From biostatistician John Carlin:
I asked a colleague of mine about it – he sits on Australia’s Pharmaceutical Benefits Advisory Committee, which rather uniquely in the world (I believe) is charged with examining cost-effectiveness of new drugs in order to advise government as to whether they should be subsidised under the national Pharmaceutical Benefits Scheme (which pays the bulk of drug costs in this country). As I understand it, Avastin (bevacizumab) is still approved here as a treatment for certain indications but it is not approved for listing on the PBS, and without receiving a subsidy via that mechanism its cost is generally prohibitive.
See also this paper by John Ioannidis, which highlights issues about selective early evidence and multiplicity of comparisons for this drug: http://www.bmj.com/content/341/bmj.c4875.
From statistician Howard Wainer:
Data and evidence seem to have only a minor role in today’s political climate: witness the support of policies for cutting taxes to reduce the deficit. In healthcare, note the reaction to the evidence-based decision to cut back massively on such diagnostic tests as PSAs and mammograms, in which their huge number of false positives makes the probability of a true positive tiny, and to clinical results that confirm that early detection of breast cancer seems to have no effect on survival (with modern treatment) and the lack of a clear advantage for treatment over not for prostate cancer.
None-the-less the size of the diagnostic industry presses against rationality.
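To make the false-positive arithmetic concrete, here is a minimal Python sketch of Bayes' rule for a screening test. The prevalence, sensitivity, and specificity values are hypothetical, chosen only for illustration; they are not estimates for PSA tests or mammograms.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of disease given a positive test, by Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical inputs: a rare condition and a fairly accurate test.
    ppv = positive_predictive_value(
        prevalence=0.005, sensitivity=0.9, specificity=0.9
    )
    print(f"Share of positive results that are true positives: {ppv:.1%}")
```

With a rare condition, even a reasonably accurate test yields mostly false positives, because the healthy group that can produce them is so much larger than the diseased group.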
From a health-care economist:
From what I can understand, there are scientific, technical, and political forces at work here. The scientific issue is that Avastin is believed to work for some women, but not all. No one knows which women will benefit, and so the FDA ordered Roche/Genentech to figure this out. The technical issue is that by law, CMS has to cover any approved cancer drug for an off-label indication if the indication is included in one of several named drug compendia. Because CMS does not like making these decisions, this generally suits CMS fine. In this case, the National Comprehensive Cancer Network (NCCN), “a non-profit group of oncologists whose guidance is closely followed by leading treatment centers, has voted overwhelmingly in favor of maintaining its recommendation that Avastin should be used to treat breast cancer.” The NCCN vote was 24-0, with one abstention. That brings up the politics here. The interesting dimension of politics is not the death panel charge (though that’s obviously there) but how organizations like the NCCN make their decisions. There is an undercurrent of sleaziness that one picks up. E.g., a bunch of members of the NCCN have ties to Roche; they make their money off this stuff; etc. All of these play together to create outcomes like this.
From epidemiologist Sander Greenland:
I laud Andrew’s call to think about what might be done in the Avastin case in particular and the general basis for Medicare reimbursement. From the reported information the latter seems not to have much in the way of conflict-of-interest safeguards (unlike FDA panels). That problem, however, is not one of technical statistics or evidence evaluation – it’s a far more fundamental and touchy issue of protecting decision-making from the influences of vested interests.
From psychologist Jon Baron:
I have long advocated the use of cost-effectiveness analysis in health care. (See, for example, the chapter on utility measurement in my textbook Thinking and Deciding. The section on the Oregon Health Plan has been there since the 3d edition in 2000.) Of course, the method has problems. Measurement of utility is not rocket science. (I compare it to the measurement of time in ancient Egypt.) Nor is measurement of cost, for that matter. But the issue is “compared to what?” All other methods of allocating health resources seem more likely to waste resources that could be put to better use, saving lives and increasing quality of life, elsewhere.
A move in this direction in Obama’s health-care proposal was derided as “death panels”. My understanding is that it is still there and could grow in the future.
I have not read the literature on Avastin. But, from Nocera’s article and other things I’ve read about it, it seems to me that it would not pass a cost-effectiveness test with some reasonable limit on dollars per Quality-adjusted Life Year. Whether it has a statistically significant benefit is beside the point. The issue is, I think, our best estimate of its effectiveness given the data available. If the FDA thinks it does no good at all, then it probably does not do very much good, and it is very expensive. Some newer treatments for prostate cancer seem to fall in the same category, and it is possible that the same is true of the HPV vaccine (in the U.S., where Pap smears are routine).
Of course, any use of cost-effectiveness analysis needs to be careful about research, including “Phase 3” research on extensions of drugs to off-label uses and other issues not addressed before approval.
But the main problem is public opposition. People have trouble with trade-offs and tend to think in terms of an opposition between “cover everything that might help” and “put limits on the amount of money that is spent and let someone else figure out how to allocate resources within those limits”. The latter approach might work, but only if the “someone else” were willing to use cost-effectiveness and if the limits were sufficiently generous.
On the latter point, it seems to me that health care is very important and that its cost, relative to other ways of spending money, is not that high for what it does. The idea that the total budget for health care should not increase seems wrong. Health care is getting better, and thus it is reasonable for people to want more of it. The idea of keeping the cost in line with inflation is like saying that, between 1945 and 1960, the amount of money spent on home entertainment increased too much and must be limited.
I saved the best for last. Here’s statistician Don Berry, who is an expert on medical decision making in general and cancer treatment in particular:
The Avastin story is a very long one, with many turns. The FDA approval question for metastatic breast cancer, which is the issue here, has divided oncologists. There are rational arguments on both sides.
There is no question that Avastin “works” in the sense that it has an anti-tumor effect. Everyone agrees with that, well, with the possible exception of Joe Nocera. In particular, it clearly delays progression of metastatic breast cancer, which is the reason it was approved for treating that disease in the first place. The FDA reversed itself (for this disease but not for other cancers such as lung and colon that will remain on Avastin’s label) because Avastin has not been shown to statistically significantly prolong overall survival (OS). Some oncologists–actually, most oncologists–argue that progression-free survival (PFS) is clinically meaningful and should be a registration end point. The FDA’s position–and that of the Oncology Drug Advisory Committee (ODAC)–is that improved PFS is not usually enough to approve drugs without empirical evidence of improved OS to go along with it.
Genentech/Roche appealed to the FDA Commissioner after the FDA’s first decision (based on ODAC’s recommendation) to remove metastatic breast cancer from Avastin’s label. This kind of appeal is legal but is almost never used. Genentech mustered many arguments. Several were statistical. For example, they argued that none of the clinical trials in question were powered to show a benefit in OS. The following article addresses this question:
Broglio KR, Berry DA (2009). Detecting an overall survival benefit that is derived from progression-free survival. Journal of the National Cancer Institute 101:1642-1649.
This article demonstrates that it’s very difficult to power a study to show an OS benefit when survival post-progression (SPP=OS-PFS) is long, which it is in metastatic breast cancer, about 2 years in some of the Avastin trials. The article argues that even if an advantage in PFS translates perfectly into the same advantage in OS, the variability after progression so dilutes the OS effect that it’s likely to be lost.
The assumptions in the article about SPP are realistic. I know many example clinical trials in many types of cancer (and I know no counterexamples) where SPP is essentially the same in both treatment groups, even when the experimental drug showed better PFS than control. This is despite crossovers (to the drug when the control patient progresses) and potentially greater efforts by the clinicians in one treatment group to keep their patients alive. (The main reason that SPP is similar in the two treatment groups is that metastatic cancer is almost uniformly fatal and it’s hard to slow the disease after it’s set up housekeeping throughout the body.) It makes sense that a drug that was effective in delaying progression is not effective after progression because the drug is almost always stopped when the patient progresses, and the patient usually goes onto another drug.
For the full Avastin story check out The Cancer Letter http://www.cancerletter.com/downloads/20111118/download and links provided therein.
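Berry's dilution argument can be illustrated with a small simulation. This is a rough sketch under simplifying assumptions that are not taken from the Broglio and Berry article: exponential PFS with hypothetical means of 6 and 9 months, exponential SPP with the same hypothetical mean of 24 months in both arms, no censoring, and a simple z-test on mean times standing in for the log-rank test a real trial would use. The function names are made up for this example.

```python
import random
import statistics
from math import sqrt


def z_stat(a, b):
    """Two-sample z statistic for a difference in means (unequal variances)."""
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def simulate_power(n_per_arm=200, mean_pfs_control=6.0, mean_pfs_treated=9.0,
                   mean_spp=24.0, n_sims=500, seed=1):
    """Estimate how often a PFS gain is detected on PFS and on OS.

    OS is built as PFS + SPP, with SPP drawn from the same distribution
    in both arms, so the whole PFS gain carries over to OS on average.
    All default values are hypothetical.
    """
    rng = random.Random(seed)
    z_crit = 1.96  # two-sided 5% level
    hits_pfs = 0
    hits_os = 0
    for _ in range(n_sims):
        pfs_c = [rng.expovariate(1.0 / mean_pfs_control) for _ in range(n_per_arm)]
        pfs_t = [rng.expovariate(1.0 / mean_pfs_treated) for _ in range(n_per_arm)]
        # Survival post-progression adds noise but no treatment effect.
        os_c = [p + rng.expovariate(1.0 / mean_spp) for p in pfs_c]
        os_t = [p + rng.expovariate(1.0 / mean_spp) for p in pfs_t]
        if abs(z_stat(pfs_t, pfs_c)) > z_crit:
            hits_pfs += 1
        if abs(z_stat(os_t, os_c)) > z_crit:
            hits_os += 1
    return hits_pfs / n_sims, hits_os / n_sims


if __name__ == "__main__":
    pfs_power, os_power = simulate_power()
    print(f"Estimated power on PFS: {pfs_power:.2f}")
    print(f"Estimated power on OS:  {os_power:.2f}")
```

Under these assumptions the same average gain is detected far more often on PFS than on OS, because the variability of SPP is added to both arms while the size of the treatment difference stays the same.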
(by Andrew Gelman)
New York Times columnist Joe Nocera tells the story of Avastin, a cancer drug produced by Genentech that Medicare pays for in breast cancer treatment even though the Food and Drug Administration says it doesn’t work. Nocera writes:
For breast cancer patients, Avastin neither suppresses tumor growth to any significant degree nor extends life. Although a 2007 study showed Avastin adding 5.5 months to progression-free survival, subsequent studies have failed to replicate that result.
As a result of that first, optimistic study, the Food and Drug Administration gave the drug “accelerated approval,” meaning it could be marketed as a breast cancer therapy while further studies were conducted. Those follow-up studies are what caused a panel of F.D.A. experts to then withdraw that approval . . . After weighing the evidence, the F.D.A. panel voted 6 to 0 against Avastin.
After Genentech appealed, Dr. Margaret Hamburg, the F.D.A. commissioner, affirmed the decision on Friday in a ruling that would seem, on its face, unassailable. She essentially said that F.D.A. decisions had to be driven by science, and the science wasn’t there to support Genentech’s desire to market Avastin as a breast cancer drug.
And here’s the punch line. After describing the political pressure coming from cancer support groups and political hacks such as Sarah Palin and the Wall Street Journal editorial board, Nocera continues:
The strangest reaction, though, has come from the nation’s health insurers and the administrators of Medicare. Despite the clear evidence of Avastin’s lack of efficacy in treating breast cancer, they have mostly agreed to continue paying whenever doctors prescribe it “off label” for breast cancer patients. Avastin, by the way, costs nearly $90,000 a year. . . .
Medicare . . . is, by statute, guided in such decisions not by the F.D.A. but by various compendia of drug use put together by such groups as the National Comprehensive Cancer Network. The network’s 32-member breast cancer panel is made up almost entirely of breast cancer specialists, nine of whom have financial ties to Genentech. The last time the panel voted on Avastin, it voted unanimously in favor of continuing to recommend it as a breast cancer therapy.
Here is an enormously expensive drug that largely doesn’t work, has serious side effects and can no longer be marketed as a breast cancer therapy. Yet insurers, including Medicare, will continue to cover it.
If we’re not willing to say no to a drug like Avastin, then what drug will we say no to?
Based on Nocera’s description, this does seem pretty scandalous. Perhaps not quite on the scale of a financial and public health disaster such as the pouring of antibiotics into animal feed in our subsidized farms, but still a bit disturbing.
And I know people who work at Genentech, which makes it seem that much worse.
On the other hand, I don’t know anything about this case. I’m curious what experts on medical decision making would say. Is Nocera right on this one? Should we as statisticians be raising our voices and making a fuss about Medicare’s apparent disregard of the principles of evidence-based medicine?
(Contributed by Bill Bolstad and Christian Robert)
Bill Bolstad wrote a reply to my review of his book Understanding computational Bayesian statistics last week and here it is, unedited except for the first paragraph where Bill thanks me for the opportunity to respond, “so readers will see that the book has some good features beyond having a “nice cover”.” (!) I simply processed his Word document into an html output and put a Read More bar in the middle as it is fairly detailed. (As indicated at the beginning of my review, I am obviously biased on the topic: thus, I will not comment on the reply, lest we get into an infinite regress!) The following is from Bill:
The target audience for this book are upper division undergraduate students and first year graduate students in statistics whose prior statistical education has been mostly frequentist based. Many will have knowledge of Bayesian statistics at an introductory level similar to that in my first book, but some will have no previous Bayesian statistics course. Being self-contained, it will also be suitable for statistical practitioners without a background in Bayesian statistics.
The book aims to show that:
I am satisfied that the book has achieved the goals that I set out above. The title “Understanding Computational Bayesian Statistics” explains what this book is about. I want the reader (who has background in frequentist statistics) to understand how computational Bayesian statistics can be applied to models he/she is familiar with. I keep an up-to-date errata on the book website. The website also contains the computer software used in the book. This includes Minitab macros and R-functions. These were used because they had good data analysis capabilities that could be used in conjunction with the simulations. The website also contains Fortran executables that are much faster for models containing more parameters, and WinBUGS code for the examples in the book. Continue reading ‘Understanding computational Bayesian statistics: a reply from Bill Bolstad’
(Post contributed by Christian Robert)
“Statistical significance is not a scientific test. It is a philosophical, qualitative test. It asks “whether”. Existence, the question of whether, is interesting. But it is not scientific.” S. Ziliak and D. McCloskey, p.5
The book, written by economists Stephen Ziliak and Deirdre McCloskey, has a theme bound to attract Bayesians and all those puzzled by the absolute and automatised faith in significance tests. The main argument of the authors is indeed that an overwhelming majority of papers stop at rejecting variables (“coefficients”) on the sole and unsupported basis of non-significance at the 5% level. Hence the subtitle “How the standard error costs us jobs, justice, and lives”… This is an argument I completely agree with; however, the aggressive style of the book truly put me off! As with Error and Inference, which also addresses a non-Bayesian issue, I could have let the matter go; however, I feel the book may in the end be counter-productive and thus endeavour to explain why through this review. (I wrote the following review in batches, before and during my trip to Dublin, so the going is rather broken, I am afraid…) Continue reading ‘The Cult of Statistical Significance [book review]’
(This post is contributed by Christian Robert.)
“Bayes Theorem is a simple consequence of the axioms of probability, and is therefore accepted by all as valid. However, some who challenge the use of personal probability reject certain applications of Bayes Theorem.” J. Kadane, p.44
Principles of uncertainty by Joseph (“Jay”) Kadane (Carnegie Mellon University, Pittsburgh) is a profound and mesmerising book on the foundations and principles of subjectivist or behaviouristic Bayesian analysis. Jay Kadane wrote Principles of uncertainty over a period of several years and, more or less in his own words, it represents the legacy he wants to leave for the future. The book starts with a large section on Jay’s definition of a probability model, with rigorous mathematical derivations all the way to Lebesgue measure (or more exactly the McShane-Stieltjes measure). This section contains many side derivations that pertain to mathematical analysis, in order to explain the subtleties of infinite countable and uncountable sets, and the distinction between finitely additive and countably additive (probability) measures. Unsurprisingly, the role of utility is emphasized in this book that keeps stressing the personalistic entry to Bayesian statistics. Principles of uncertainty also contains a formal development on the validity of Markov chain Monte Carlo methods that is superb and missing in most equivalent textbooks. Overall, the book is a pleasure to read, and it is highly recommended for teaching, as it can be used at many different levels. Continue reading ‘Principles of Uncertainty’