Imagine you run an A/B test on your website and get a “statistically significant” result (p < 0.05). You celebrate, roll out the new design, and... nothing noticeably changes. What went wrong? You found a real difference, but it was so tiny that it didn't matter in practice. This is where effect size comes in.
Effect size measures how big a difference or relationship actually is, not just whether it exists. While p-values tell you “is there an effect?”, effect size tells you “how large is the effect?”. In this tutorial, we'll explore what effect size means, the most common measures, and why every researcher should report it.
Effect size is a quantitative measure of the magnitude of a phenomenon. Unlike a p-value, which simply tells you whether an observed result is unlikely under the null hypothesis, effect size tells you how much two groups differ, or how strongly two variables are related.
Think of it this way: if a doctor tells you a new medication “works” (p < 0.05), your next question should be “how well does it work?”. Does it reduce pain by 1% or by 50%? That's the question effect size answers.
Statistical Significance vs. Practical Significance
A result can be statistically significant but practically meaningless (especially with large samples), or practically important but not statistically significant (especially with small samples).
Different research designs call for different effect size measures. Here are the most widely used ones:
Cohen's d measures the difference between two group means in terms of standard deviations. It's the most common effect size for comparing two groups (e.g., treatment vs. control).
$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}$$

Where $\bar{X}_1$ and $\bar{X}_2$ are the group means, and $s_p$ is the pooled standard deviation.
A Cohen's d of 0.5 means the two group means are half a standard deviation apart. The larger the d, the more the groups differ.
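As a minimal sketch, here's how you might compute Cohen's d in Python with NumPy. The scores below are simulated purely for illustration:

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    # Pooled SD weights each group's variance by its degrees of freedom.
    s_pooled = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (np.mean(group1) - np.mean(group2)) / s_pooled

# Hypothetical test scores: treatment shifted up by about half an SD
rng = np.random.default_rng(42)
treatment = rng.normal(loc=105, scale=15, size=50)
control = rng.normal(loc=100, scale=15, size=50)
print(f"Cohen's d = {cohens_d(treatment, control):.2f}")
```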
Pearson's r measures the strength and direction of the linear relationship between two continuous variables. It ranges from -1 to +1.
Values close to +1 or -1 indicate a strong relationship; values close to 0 indicate a weak relationship.
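A quick sketch, assuming SciPy is available; the variables (hours_studied, exam_score) are made up for illustration:

```python
import numpy as np
from scipy import stats

# Simulate a noisy positive linear relationship
rng = np.random.default_rng(0)
hours_studied = rng.uniform(0, 10, size=100)
exam_score = 60 + 3 * hours_studied + rng.normal(0, 8, size=100)

r, p_value = stats.pearsonr(hours_studied, exam_score)
print(f"Pearson's r = {r:.2f}, p = {p_value:.4f}")
```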
Eta-squared ($\eta^2$) measures the proportion of total variance in the outcome that is explained by the independent variable. It's commonly used with ANOVA and related tests.
An $\eta^2$ of 0.06 means 6% of the variance in the outcome is accounted for by the grouping variable.
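Here is a small sketch of computing eta-squared directly from its definition, the between-group sum of squares divided by the total sum of squares, with simulated data for three hypothetical groups:

```python
import numpy as np

def eta_squared(*groups):
    """Eta-squared: proportion of total variance explained by group membership."""
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    # Between-group variability: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    ss_total = ((all_values - grand_mean) ** 2).sum()
    return ss_between / ss_total

# Three hypothetical teaching methods, 30 students each
rng = np.random.default_rng(1)
a = rng.normal(70, 10, 30)
b = rng.normal(74, 10, 30)
c = rng.normal(78, 10, 30)
print(f"eta-squared = {eta_squared(a, b, c):.3f}")
```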
The odds ratio compares the odds of an event occurring in one group to the odds in another. It's widely used in medical research and logistic regression.
An OR of 1 means no difference; OR > 1 means higher odds in the treatment group; OR < 1 means lower odds.
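A minimal example, using a made-up 2×2 table of recovery counts:

```python
# Hypothetical counts:        (recovered, not recovered)
treatment = (60, 40)
control = (45, 55)

odds_treatment = treatment[0] / treatment[1]  # 60/40 = 1.50
odds_control = control[0] / control[1]        # 45/55 ≈ 0.82
odds_ratio = odds_treatment / odds_control
print(f"OR = {odds_ratio:.2f}")  # ≈ 1.83: higher odds of recovery with treatment
```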
Jacob Cohen proposed the following benchmarks as general guidelines for interpreting common effect size measures. While these are widely used, remember that what counts as a “small” or “large” effect depends on the context of your research.
| Interpretation | Cohen's d | Pearson's r | Eta-squared (η²) |
|---|---|---|---|
| Small | 0.2 | 0.10 | 0.01 |
| Medium | 0.5 | 0.30 | 0.06 |
| Large | 0.8 | 0.50 | 0.14 |
Context Matters
Cohen himself cautioned that these are rough guidelines. In some fields, a “small” effect can be hugely important. For example, a medication that reduces heart attack risk by just 1% (small effect) could save thousands of lives when applied across millions of people. Always interpret effect sizes within the context of your specific domain.
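If you want a quick label for a computed d, a helper like the one below encodes Cohen's benchmarks. This is only a sketch of one common convention; as noted above, domain context should always override these labels:

```python
def interpret_cohens_d(d):
    """Map |d| onto Cohen's rough benchmarks. Context should override these labels."""
    d = abs(d)
    if d < 0.2:
        return "negligible"
    elif d < 0.5:
        return "small"
    elif d < 0.8:
        return "medium"
    return "large"

print(interpret_cohens_d(0.65))  # "medium"
```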
The best way to understand effect size is to see it. In this visualization, two bell curves represent a control group and a treatment group. As you increase Cohen's d, the treatment group's distribution shifts further to the right, meaning the groups become more distinct. Notice how the overlap between the two distributions decreases as the effect gets larger.
*[Interactive visualization: a slider controls Cohen's d, currently set to 0.5 (medium effect), showing a distribution overlap of 80.3%.]*
The overlap percentage shows how much the two distributions share in common. A smaller overlap means the effect is easier to distinguish from no effect.
Try experimenting with the slider and notice how the overlap shrinks as d grows: at d = 0.2 the two distributions are nearly indistinguishable (about 92% overlap), while at d = 0.8 they are clearly separated (about 69% overlap).
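You can reproduce these overlap numbers yourself. For two normal distributions with equal standard deviations whose means differ by d standard deviations, the overlapping coefficient is 2Φ(−d/2); a short sketch using SciPy:

```python
from scipy.stats import norm

def overlap(d):
    """Overlapping coefficient of two equal-SD normal distributions separated by d SDs."""
    return 2 * norm.cdf(-abs(d) / 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: overlap = {overlap(d):.1%}")
# d = 0.2: overlap = 92.0%
# d = 0.5: overlap = 80.3%
# d = 0.8: overlap = 68.9%
```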
One of the most common mistakes in statistics is equating a small p-value with a large effect. In reality, the p-value is driven by both the effect size and the sample size. For a two-sample t-test with n observations per group, the test statistic is

$$t = d \sqrt{\frac{n}{2}}$$

This means that with a large enough sample, even a trivially small effect will produce a tiny p-value. Consider these two scenarios (the simulation sketch after them makes this concrete):
- A small study (say, 20 participants per group) that finds a large difference but p = .10: a meaningful effect that the test doesn't have enough power to detect.
- A huge study (say, 100,000 users per group) that finds a negligible difference but p < .001: a trivial effect that looks impressive only because of the massive sample.
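Here is a small simulation sketch of the second phenomenon: the true effect is held fixed at a trivial d = 0.05 while the sample size grows, and the p-value collapses anyway. The data are simulated, so exact p-values will vary with the seed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
d = 0.05  # a trivially small true effect, in standard deviation units

for n in (50, 1_000, 100_000):
    group1 = rng.normal(d, 1, n)  # mean shifted up by d SDs
    group2 = rng.normal(0, 1, n)
    t, p = stats.ttest_ind(group1, group2)
    print(f"n = {n:>7,} per group: p = {p:.4f}")
# The effect never changes, but p shrinks steadily as n grows.
```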
This is why the American Psychological Association (APA) and many journals now require reporting effect sizes alongside p-values. Together, they give the full picture: the p-value tells you whether an effect is unlikely to be chance, and the effect size tells you whether it matters.
Best Practice
Always report effect sizes in your research. A complete finding reads like: “The treatment group scored significantly higher than the control group, t(98) = 2.45, p = .016, d = 0.49 (medium effect).”
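As a sketch, here is how you might compute and print such a report from raw data. The sample data and the cohens_d helper are illustrative, not a fixed API:

```python
import numpy as np
from scipy import stats

def cohens_d(g1, g2):
    """Cohen's d with pooled standard deviation (same as the earlier sketch)."""
    n1, n2 = len(g1), len(g2)
    s_pooled = np.sqrt(((n1 - 1) * np.var(g1, ddof=1) + (n2 - 1) * np.var(g2, ddof=1))
                       / (n1 + n2 - 2))
    return (np.mean(g1) - np.mean(g2)) / s_pooled

# Hypothetical scores for two groups of 50
rng = np.random.default_rng(3)
treatment = rng.normal(0.5, 1, 50)
control = rng.normal(0.0, 1, 50)

t, p = stats.ttest_ind(treatment, control)
df = len(treatment) + len(control) - 2
print(f"t({df}) = {t:.2f}, p = {p:.3f}, d = {cohens_d(treatment, control):.2f}")
```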
Let's test your understanding of effect size with some practice problems. These will help you interpret effect sizes in real-world contexts and understand the relationship between effect size, p-values, and sample size.