Today I want to talk about effect sizes such as Cohen’s d, Hedges’s g, Glass’s Δ, η², and ω². Effect sizes rescale parameter estimates to make them easier to interpret, especially in terms of practical significance.
Many researchers in psychology and education advocate reporting effect sizes. Professional organizations such as the American Psychological Association (APA) and the American Educational Research Association (AERA) strongly recommend reporting them, and professional journals such as Educational and Psychological Measurement and the Journal of Experimental Psychology: Applied require that they be reported.
Anyway, today I want to show you:
- What effect sizes are.
- How to calculate effect sizes and their confidence intervals in Stata.
- How to calculate bootstrap confidence intervals for those effect sizes.
- How to use Stata’s effect-size calculator.
1. What are effect sizes?
The importance of research results is often assessed by statistical significance, usually meaning that the p-value is less than 0.05. P-values and statistical significance, however, don’t tell us anything about practical significance.
What if I told you that I had developed a new weight-loss pill and that the difference between the average weight loss for people who took the pill and those who took a placebo was statistically significant? Would you buy my new pill? If you were overweight, you might reply, “Of course! I’ll take two bottles and a large order of french fries to go!” Now let me add that the average difference in weight loss was only one pound over the year. Still interested? My results may be statistically significant, but they are not practically significant.
Or what if I told you that the difference in weight loss was not statistically significant — the p-value was “only” 0.06 — but the average difference over the year was 20 pounds? You might very well be interested in that pill.
The size of the effect tells us about the practical significance. P-values do not assess practical significance.
All of which is to say, one should report parameter estimates along with statistical significance.
In my examples above, you knew that 1 pound over the year is small and 20 pounds is large because you are familiar with human weights.
In another context, 1 pound might be large, and in yet another, 20 pounds small.
Formal measures of effect sizes are thus usually presented in unit-free but easy-to-interpret form, such as standardized differences and proportions of variability explained.
The “d” family
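Effect sizes that measure the scaled difference between means belong to the “d” family. The generic formula is

$$\delta = \frac{\mu_1 - \mu_2}{\sigma}$$

where μ1 and μ2 are the two group means and σ is a standard deviation used for scaling.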
The estimators differ in terms of how sigma is calculated.
Cohen’s d, for instance, uses the pooled sample standard deviation.
Hedges’s g incorporates an adjustment which removes the bias of Cohen’s d.
Glass’s Δ was originally developed in the context of …
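To make the differences among the “d” family estimators concrete, here is a small Python sketch of the arithmetic. It is not Stata code and not necessarily how Stata computes these quantities. The function names and the toy data are made up for illustration. It assumes the common approximation 1 - 3/(4*df - 1) for the Hedges correction factor (an exact version based on the gamma function also exists), and it assumes that Glass’s Δ scales by the standard deviation of the group treated as the control.

```python
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance (n - 1)


def cohens_d(x, y):
    """Mean difference scaled by the pooled sample standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)


def hedges_g(x, y):
    """Cohen's d times a small-sample correction factor.

    Assumption: uses the approximate factor 1 - 3 / (4 * df - 1),
    with df = n1 + n2 - 2, rather than the exact gamma-function form.
    """
    df = len(x) + len(y) - 2
    return cohens_d(x, y) * (1 - 3 / (4 * df - 1))


def glass_delta(x, control):
    """Mean difference scaled by the control group's standard deviation only."""
    return (mean(x) - mean(control)) / sqrt(variance(control))


# Hypothetical toy data: yearly weight loss in pounds for two small groups.
pill = [2, 4, 6, 8]
placebo = [1, 3, 5, 7]

print("Cohen's d:    ", round(cohens_d(pill, placebo), 4))
print("Hedges's g:   ", round(hedges_g(pill, placebo), 4))
print("Glass's Delta:", round(glass_delta(pill, placebo), 4))
```

All three functions share the same numerator, the difference in means. Only the denominator (and, for Hedges’s g, the correction factor) changes, which is the sense in which the estimators differ in how sigma is calculated.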