(by Andrew Gelman)
My political science colleague Macartan Humphreys writes:
You might be interested in this since I think you have recommended that people do this, but people generally don’t. We just wrote an entire report with fake data to get our ducks lined up before working with the real data. In fact we had some of the real data, but we scrambled the treatment variable so that we could make use of information on the distribution of the outcome variables and so on without using the actual treatment data.
Interesting to do but it was hard to keep up the “as you see from this table…” texts.
Oooh . . . I love love love love love this! For any project costing $X, I think it would be a good idea to start off spending $0.01X or more to produce a report based on simulated data. This is what I call a serious power analysis.
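To make the idea concrete, here is a minimal sketch of the two steps mentioned above: scrambling a treatment variable so the real assignment is never used, and estimating power by analyzing many fake datasets. Everything in it is an assumption for illustration and not Macartan's actual workflow: the function names, the two-arm design, the normally distributed outcome, the difference-in-means estimate, and the 1.96 standard error cutoff are all hypothetical choices.

```python
import random
import statistics


def scramble_treatment(treatment, seed=0):
    """Return a shuffled copy of the treatment labels; the input list is left untouched."""
    rng = random.Random(seed)
    shuffled = list(treatment)
    rng.shuffle(shuffled)
    return shuffled


def diff_in_means(outcome, treatment):
    """Difference in means (treated minus control) and its unpooled standard error."""
    treated = [y for y, t in zip(outcome, treatment) if t == 1]
    control = [y for y, t in zip(outcome, treatment) if t == 0]
    estimate = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return estimate, se


def simulated_power(n, effect, sd, n_sims=1000, seed=0):
    """Share of fake datasets in which |estimate| exceeds 1.96 standard errors.

    Assumed data-generating process (hypothetical): half the units are treated
    at random, and outcomes are normal with mean effect * treatment and
    standard deviation sd.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        treatment = [i % 2 for i in range(n)]
        rng.shuffle(treatment)
        outcome = [rng.gauss(effect * t, sd) for t in treatment]
        estimate, se = diff_in_means(outcome, treatment)
        if abs(estimate) > 1.96 * se:
            hits += 1
    return hits / n_sims
```

A call such as `simulated_power(n=200, effect=0.2, sd=1.0)` returns the fraction of simulated studies that would have detected the assumed effect. In a real project the fake-data step would be followed by the full analysis and report, with the simple difference in means replaced by whatever model is actually planned.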
My team has been trying to do this for a clinical trial for some time, and we are finally getting to a place where we can make it happen. As a result, we can review the data collection with a more specific view of the study, rather than relying only on what has gone wrong in previous cases (which is also helpful, but yields different insights).
The downside is that it requires a lot of work up front, and trying to transition to this while working on other projects is a challenge.