Statistical Significance Explained

If you haven’t read the House of Commons Library’s statistical literacy guides recently (or you need a refresher on what, exactly, statistical significance means), you could do much worse than student Warren Davies’ short rundown on the meaning of statistical significance:

In science we’re always testing hypotheses. We never conduct a study to ‘see what happens’, because there’s always at least one way to make any useless set of data look important. We take a risk; we put our idea on the line and expose it to potential refutation. Therefore, all statistical tests in psychology test the probability of obtaining your given set of results (and all those that are even more extreme) if the hypothesis were incorrect – i.e. the null hypothesis were true. […]

This is what statistical significance testing tells you – the probability that the result (and all those that are even more extreme) would have come about if the null hypothesis were true. […] It’s given as a value between 0 and 1, and labelled p. So p = .01 means a 1% chance of getting the results if the null hypothesis were true; p = .5 means 50% chance, p = .99 means 99%, and so on.

In psychology we usually look for p values lower than .05, or 5%. That’s what you should look out for when reading journal papers. If there’s less than a 5% chance of getting the result if the null hypothesis were true, a psychologist will be happy with that, and the result is more likely to get published.

Significance testing is not perfect, though. Remember this: ‘Statistical significance is not psychological significance.’ You must look at other things too; the effect size, the power, the theoretical underpinnings. Combined, they tell a story about how important the results are, and with time you’ll get better and better at interpreting this story.

To get a real feel for this, Davies provides a simple-to-follow example (a loaded die) in the post.
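
For readers who would rather see the arithmetic, here is a minimal sketch of the same idea in Python. It is not Davies' worked example: the numbers (60 rolls, 17 sixes) and the function name are made up for illustration, and it assumes a one-sided test, i.e. we only ask whether the die shows too many sixes. The null hypothesis is that the die is fair, so each roll has a 1-in-6 chance of a six, and the p value is the probability of seeing 17 or more sixes in 60 rolls under that assumption.

```python
from math import comb


def p_value_at_least(k, n, p0):
    """Probability of k or more successes in n trials if each trial
    succeeds with probability p0 (the null hypothesis).

    This is the upper tail of a binomial distribution: the observed
    result plus all results that are even more extreme.
    """
    return sum(
        comb(n, i) * p0**i * (1 - p0) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical data, not taken from Davies' post:
rolls = 60   # number of times the die was rolled
sixes = 17   # number of sixes observed (a fair die would average 10)

p = p_value_at_least(sixes, rolls, 1 / 6)

print(f"p = {p:.4f}")
if p < 0.05:
    print("Below .05: unlikely under a fair die, so 'statistically significant'")
else:
    print("Not below .05: the data are compatible with a fair die")
```

Note that a small p here only says the result would be unusual for a fair die. It says nothing about how heavily the die is loaded, which is the effect-size question the excerpt warns about.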

via @sandygautam