# Statistical Significance and Effect Size

## The overused and the underused

(last updated on 2011-02-19)

Statistical significance is one of the most widely used statistical terms and, likely, the most overused or abused one. No matter how small the p-value for a treatment is, it means nothing more than that the treatment probably makes a difference. Statistical significance does not tell how big the difference is. In some sense, "significance" is an unfortunate choice of words for this statistical term because it conveys a different meaning than it has in daily speech. For example, suppose a cholesterol-lowering drug reduces the cholesterol level by 1 mg/dL, or less than 1%. Assuming the standard deviations of the treatment and control arms are both 10 mg/dL, it can be shown with Biyee's sampling and t-test simulator that significance at the α = 0.05 level (p < 0.05) can be achieved with only 500 samples in each arm. Clearly this drug would be essentially useless despite the statistical significance it may show in lowering cholesterol. If it had any side effects, it might do more harm than good.
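The cholesterol example can be explored with a rough normal-approximation sketch. This is not Biyee's simulator; the function names `expected_z` and `approx_power` are hypothetical, and the calculation assumes equal arm sizes, equal standard deviations, and a two-sided test. It estimates how often a trial of a given size would report p < 0.05 for a true difference of 1 mg/dL.

```python
from math import sqrt
from statistics import NormalDist

def expected_z(diff, sd, n_per_arm):
    """Expected z statistic for a true mean difference `diff`,
    assuming both arms share the same SD and sample size."""
    standard_error = sd * sqrt(2.0 / n_per_arm)
    return diff / standard_error

def approx_power(diff, sd, n_per_arm, alpha=0.05):
    """Approximate probability that a two-sided test at level `alpha`
    rejects the null hypothesis (normal approximation to the t-test)."""
    unit_normal = NormalDist()
    z_crit = unit_normal.inv_cdf(1.0 - alpha / 2.0)
    z = expected_z(diff, sd, n_per_arm)
    # Probability of landing in either rejection tail.
    return unit_normal.cdf(z - z_crit) + unit_normal.cdf(-z - z_crit)

# Values from the text: 1 mg/dL difference, SD of 10 mg/dL in each arm.
for n in (500, 2000, 10000):
    print(n, round(expected_z(1.0, 10.0, n), 2), round(approx_power(1.0, 10.0, n), 2))
```

Under these assumptions, a trial with 500 subjects per arm reaches p < 0.05 in only a minority of runs, while very large arms make it almost certain. The effect itself stays at 1 mg/dL throughout; only the ability to detect it changes.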

The effect size is usually our ultimate concern. For the above example, the effect size is 1 mg/dL (or, in standardized form, 1 mg/dL divided by the standard deviation of 10 mg/dL, i.e. 0.1). Of course, an effect size is meaningful only if it is backed by an acceptable significance level.
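The standardized form mentioned above is commonly called Cohen's d. A minimal sketch follows, assuming equal arm sizes so that the pooled standard deviation is the root mean square of the two arm SDs. The function name `cohens_d` and the arm means of 199 and 200 mg/dL are hypothetical values chosen only to give a 1 mg/dL difference.

```python
from math import sqrt

def cohens_d(mean_treatment, mean_control, sd_treatment, sd_control):
    """Standardized mean difference, assuming equal arm sizes."""
    pooled_sd = sqrt((sd_treatment ** 2 + sd_control ** 2) / 2.0)
    return (mean_treatment - mean_control) / pooled_sd

# Hypothetical arm means (mg/dL); the SDs of 10 mg/dL come from the text.
d = cohens_d(199.0, 200.0, 10.0, 10.0)
print(d)  # negative sign: the treatment arm is lower
```

The magnitude of the result is 0.1, which matches the ratio given in the text and is independent of how many subjects were enrolled.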

Statistical significance is most likely to be used in lieu of the effect size when both the treatment and the effect are binary. Vaccines are good examples in this category. The treatment variable of a vaccine is whether a subject has received the vaccine; the effect variable is whether the subject has contracted the corresponding disease after the vaccination. In this case, the best effect size is the odds ratio, which shows how much difference the vaccine makes in terms of the likelihood of contracting the disease.
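As an illustration, the odds ratio and an approximate 95% confidence interval can be computed from a 2×2 table of counts. The sketch below uses made-up counts, not data from any real vaccine trial. The function names are hypothetical, and the interval uses the common log-odds (Woolf) approximation, which assumes none of the four cells is zero.

```python
from math import exp, log, sqrt

def odds_ratio(cases_vacc, noncases_vacc, cases_unvacc, noncases_unvacc):
    """Odds of disease among the vaccinated divided by odds among the unvaccinated."""
    return (cases_vacc / noncases_vacc) / (cases_unvacc / noncases_unvacc)

def odds_ratio_ci95(cases_vacc, noncases_vacc, cases_unvacc, noncases_unvacc):
    """Approximate 95% confidence interval on the log-odds scale (Woolf method)."""
    log_or = log(odds_ratio(cases_vacc, noncases_vacc, cases_unvacc, noncases_unvacc))
    se = sqrt(1 / cases_vacc + 1 / noncases_vacc + 1 / cases_unvacc + 1 / noncases_unvacc)
    return exp(log_or - 1.96 * se), exp(log_or + 1.96 * se)

# Hypothetical counts: 10 of 1000 vaccinated and 50 of 1000 unvaccinated
# subjects contract the disease.
estimate = odds_ratio(10, 990, 50, 950)
low, high = odds_ratio_ci95(10, 990, 50, 950)
print(round(estimate, 3), round(low, 3), round(high, 3))
```

An odds ratio well below 1 whose interval excludes 1 conveys both pieces of information at once: the size of the vaccine's effect and the evidence that the effect is not due to chance.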