The Over-Estimation of Sampling Errors

Fairly obvious, but something I haven’t previously given much consideration to:

Sampling errors mean that initial figures are equally as likely to be under-estimates as over-estimates but [in media stories where figures for a disease or condition are quoted] we only ever seem to be told that the condition is under-detected.

That’s from a short post on Mind Hacks about the proliferation of the phrase “the true number may be higher”.

For any individual study you can validly say that you think the estimate is too low, or indeed, too high, and give reasons for that. For instance, you might say that your sample was mainly young people who tend to be healthier than the general public, or maybe that the diagnostic tools are known to miss some true cases.

But when we look at reporting as a whole, it almost always says the condition is likely to be much more common than the estimate.

For example, have a look at the results of this Google search:

“the true number may be higher” 20,300 hits
“the true number may be lower” 3 hits
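
The quoted claim that sampling error alone is about as likely to push an estimate down as up can be checked with a quick simulation. What follows is only a rough sketch under assumptions of my own, none of which come from the Mind Hacks post: a made-up true prevalence of 10%, simple random samples of 1,000 people, and a perfectly accurate diagnostic test. The function name and its parameters are hypothetical, chosen just for illustration.

```python
import random


def simulate_estimates(true_rate=0.10, sample_size=1000, trials=2000, seed=0):
    """Count how often a survey under- or over-estimates a known prevalence.

    All defaults are illustrative assumptions, not figures from any study.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = {"under": 0, "over": 0, "exact": 0}

    for _ in range(trials):
        # Draw one simple random sample: each person independently has the
        # condition with probability true_rate (no bias, no missed cases).
        cases = sum(1 for _ in range(sample_size) if rng.random() < true_rate)
        estimate = cases / sample_size

        if estimate < true_rate:
            counts["under"] += 1
        elif estimate > true_rate:
            counts["over"] += 1
        else:
            counts["exact"] += 1

    return counts


if __name__ == "__main__":
    result = simulate_estimates()
    total = sum(result.values())
    for label, count in result.items():
        print(f"{label:>5}: {count / total:.1%} of simulated surveys")
```

With unbiased sampling like this, the "under" and "over" shares should come out roughly equal. A real skew towards under-detection needs a specific cause, such as the unrepresentative sample or the imperfect diagnostic tool mentioned in the quote, and cannot be blamed on sampling error alone.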



One response to “The Over-Estimation of Sampling Errors”

  1. John

    I think Google uses sampling and estimation to produce its hit count. 🙂