StatsCalculators.com

Scheffe's Test

Created: September 15, 2024
Last Updated: September 15, 2025

Scheffe's test is a post-hoc multiple comparison test used after a significant One-Way ANOVA result. It helps you determine which specific group means differ from each other while maintaining strong control over the family-wise error rate. Scheffe's test is the most conservative post-hoc test, making it ideal when you want to be extra cautious about Type I errors.

What You'll Get:

  • All Pairwise Comparisons: Test all possible group pairs
  • Mean Differences: Exact differences between each group pair
  • Confidence Intervals: Simultaneous confidence intervals for all comparisons
  • Significance Results: Clear indication of which pairs differ significantly
  • Visual Comparison: Charts showing group differences
  • APA-Ready Report: Publication-quality results

Pro Tip: Only use Scheffe's test after obtaining a significant ANOVA result. If your ANOVA is not significant, there's no need for post-hoc testing. Run our One-Way ANOVA Calculator first if you haven't already.

Ready to find which groups differ? Try the example data to see how it works, or upload your own data to discover the specific differences between your groups.

Learn More

Scheffe's Test

Definition

Scheffe's test is a post-hoc multiple comparison procedure used to identify which specific group means differ after obtaining a significant ANOVA result. Named after Henry Scheffé, it is the most conservative of all post-hoc tests, providing the strongest control over Type I error rates across all possible comparisons, including complex contrasts.

When to Use Scheffe's Test

Scheffe's test is particularly useful in the following situations:

  • After significant ANOVA: You've obtained a significant F-statistic and want to explore which groups differ
  • Unequal sample sizes: Your groups have different numbers of observations
  • Complex comparisons: You want to test not just pairwise comparisons but also complex contrasts
  • Maximum conservatism: You want the strongest protection against Type I errors
  • Data snooping: You want to look at all possible comparisons without inflating error rates

How Does Scheffe's Test Work?

Scheffe's test uses the F-distribution to create simultaneous confidence intervals for all possible contrasts. The test statistic for comparing two groups i and j is:

S = \sqrt{(k-1)F_{\alpha, k-1, N-k}} \cdot \sqrt{MSE \left(\frac{1}{n_i} + \frac{1}{n_j}\right)}

Where:

  • k = number of groups
  • F_{\alpha, k-1, N-k} = critical value from the F-distribution
  • MSE = mean square error from the ANOVA
  • n_i, n_j = sample sizes of groups i and j

Two groups are significantly different if their mean difference exceeds this critical value.
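As a sketch of this computation (assuming SciPy is available; the group counts and MSE below are illustrative values matching the worked example later on this page):

```python
import math
from scipy import stats

# Illustrative values: 3 groups of 5 observations each, MSE from an ANOVA table
k, N = 3, 15          # number of groups, total observations
n_i = n_j = 5         # sizes of the two groups being compared
mse = 4.73            # mean square error from the ANOVA
alpha = 0.05

f_crit = stats.f.ppf(1 - alpha, k - 1, N - k)  # F critical value
s = math.sqrt((k - 1) * f_crit) * math.sqrt(mse * (1 / n_i + 1 / n_j))
print(f"Scheffe critical difference: {s:.2f}")  # about 3.83
```

Any mean difference larger than this value is significant at the 5% level across all comparisons simultaneously.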

Test Statistic Formula

For comparing groups i and j:

F_{Scheffe} = \frac{(\bar{x}_i - \bar{x}_j)^2}{MSE(\frac{1}{n_i} + \frac{1}{n_j})(k-1)}

If F_{Scheffe} > F_{\alpha, k-1, N-k}, the difference is significant.

Confidence Interval Formula

Simultaneous confidence interval for the difference between groups i and j:

\bar{x}_i - \bar{x}_j \pm \sqrt{(k-1)F_{\alpha, k-1, N-k}} \cdot \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}
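A minimal sketch of this interval for one pair, assuming SciPy; the means, group sizes, and MSE are plugged-in illustrative values (groups B and A from the example below):

```python
import math
from scipy import stats

# Illustrative inputs: two group means plus ANOVA quantities
xbar_i, xbar_j = 85.0, 76.2   # means of groups B and A
n_i, n_j = 5, 5               # group sizes
k, N = 3, 15                  # number of groups, total observations
mse = 4.73                    # mean square error from the ANOVA

margin = math.sqrt((k - 1) * stats.f.ppf(0.95, k - 1, N - k)) \
    * math.sqrt(mse * (1 / n_i + 1 / n_j))
diff = xbar_i - xbar_j
print(f"95% simultaneous CI: ({diff - margin:.2f}, {diff + margin:.2f})")
```

Because the interval excludes zero, this pair differs significantly.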

Practical Example

Scenario

Three teaching methods (A, B, C) were compared on student test scores. ANOVA showed a significant difference (F = 51.56, p < 0.001). Now we use Scheffe's test to determine which methods differ.

Group Means
  • Group A: 76.2
  • Group B: 85.0
  • Group C: 90.0

ANOVA Parameters (from previous analysis)
  • MSE = 4.73
  • k = 3 groups
  • N = 15 total observations
  • F_{0.05, 2, 12} = 3.89

Scheffe's Critical Value

S = \sqrt{(3-1) \times 3.89} \times \sqrt{4.73 \times \frac{2}{5}} = 3.84
Pairwise Comparisons

| Comparison | Mean Diff | Critical Value | Significant? |
|------------|-----------|----------------|--------------|
| B vs A     | 8.8       | 3.84           | Yes          |
| C vs A     | 13.8      | 3.84           | Yes          |
| C vs B     | 5.0       | 3.84           | Yes          |
Conclusion

All three teaching methods produce significantly different results. Method C produces the highest scores, followed by Method B, then Method A.
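These pairwise results can be reproduced programmatically; a sketch assuming SciPy, using the summary statistics from the example above:

```python
import math
from scipy import stats

# Summary statistics from the worked example
means = {"A": 76.2, "B": 85.0, "C": 90.0}
mse, k, N, n = 4.73, 3, 15, 5   # MSE, groups, total N, per-group size

# Scheffe critical difference for equal group sizes
s = math.sqrt((k - 1) * stats.f.ppf(0.95, k - 1, N - k)) \
    * math.sqrt(mse * (2 / n))

for a, b in [("B", "A"), ("C", "A"), ("C", "B")]:
    diff = means[a] - means[b]
    print(f"{a} vs {b}: diff = {diff:.1f}, significant = {diff > s}")
```

All three mean differences exceed the critical value, matching the table above.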

Scheffe vs Other Post-Hoc Tests

| Test       | Conservatism      | Best For                         |
|------------|-------------------|----------------------------------|
| Scheffe    | Most conservative | Complex contrasts, data snooping |
| Tukey HSD  | Moderate          | Pairwise comparisons, equal n    |
| Bonferroni | Very conservative | Few planned comparisons          |

Note: Scheffe's test has lower power than Tukey's HSD for pairwise comparisons but is more versatile for complex contrasts.
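To see the difference in practice, Tukey's HSD is available directly in SciPy (version 1.8 or later); a sketch using the same sample data as the code examples below:

```python
from scipy import stats

# Sample data (same as the code examples in this article)
group_A = [75, 72, 80, 78, 76]
group_B = [85, 86, 83, 87, 84]
group_C = [90, 92, 88, 91, 89]

# Tukey's HSD: pairwise p-values and simultaneous confidence intervals
res = stats.tukey_hsd(group_A, group_B, group_C)
print(res)
```

For purely pairwise questions with groups this well separated, both tests agree; Tukey's intervals are narrower, reflecting its higher power for pairwise comparisons.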

Key Assumptions

  • Significant ANOVA: Should only be used after obtaining a significant F-test
  • Same assumptions as ANOVA: Independence, normality, and homogeneity of variance
  • Random sampling: Data should be randomly sampled from populations

Code Examples

R
library(tidyverse)
library(DescTools)

# Sample data
data <- tibble(
  Group = c("A", "A", "A", "A", "A",
            "B", "B", "B", "B", "B",
            "C", "C", "C", "C", "C"),
  Score = c(75, 72, 80, 78, 76,
            85, 86, 83, 87, 84,
            90, 92, 88, 91, 89)
)

# Perform one-way ANOVA first
anova_result <- aov(Score ~ Group, data = data)

# Perform Scheffe's test
scheffe_result <- ScheffeTest(anova_result)
print(scheffe_result)

Python
import numpy as np
from scipy import stats
import itertools

# Sample data
group_A = [75, 72, 80, 78, 76]
group_B = [85, 86, 83, 87, 84]
group_C = [90, 92, 88, 91, 89]

groups = [group_A, group_B, group_C]
group_names = ['A', 'B', 'C']

# Perform one-way ANOVA first
f_stat, p_value = stats.f_oneway(*groups)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Calculate pooled variance (MSE)
all_data = np.concatenate(groups)
group_means = [np.mean(g) for g in groups]

# Within-group sum of squares
ss_within = sum(sum((x - np.mean(g))**2 for x in g) for g in groups)
df_within = len(all_data) - len(groups)
mse = ss_within / df_within

# Perform Scheffe's test for all pairs
k = len(groups)
n = [len(g) for g in groups]

for i, j in itertools.combinations(range(k), 2):
    mean_diff = abs(group_means[i] - group_means[j])
    se = np.sqrt(mse * (1/n[i] + 1/n[j]))
    scheffe_stat = mean_diff / se
    critical_value = np.sqrt((k - 1) * stats.f.ppf(0.95, k - 1, df_within))

    print(f"{group_names[i]} vs {group_names[j]}:")
    print(f"  Mean difference: {mean_diff:.4f}")
    print(f"  Scheffe statistic: {scheffe_stat:.4f}")
    print(f"  Critical value: {critical_value:.4f}")
    print(f"  Significant: {scheffe_stat > critical_value}")
    print()
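Because Scheffe's method covers arbitrary contrasts, the same machinery extends beyond pairs. A sketch testing the hypothetical complex contrast 'A versus the average of B and C' with the same data:

```python
import numpy as np
from scipy import stats

# Same sample data as above
groups = [
    [75, 72, 80, 78, 76],   # A
    [85, 86, 83, 87, 84],   # B
    [90, 92, 88, 91, 89],   # C
]

c = np.array([1.0, -0.5, -0.5])   # contrast: A vs average of B and C
means = np.array([np.mean(g) for g in groups])
n = np.array([len(g) for g in groups])
k, N = len(groups), int(n.sum())

# MSE from the within-group sum of squares
ss_within = sum(((np.asarray(g) - np.mean(g)) ** 2).sum() for g in groups)
mse = ss_within / (N - k)

estimate = c @ means                          # value of the contrast
se2 = mse * (c ** 2 / n).sum()                # squared standard error
f_scheffe = estimate ** 2 / (se2 * (k - 1))   # Scheffe F for the contrast
f_crit = stats.f.ppf(0.95, k - 1, N - k)
print(f"contrast = {estimate:.1f}, F = {f_scheffe:.1f}, "
      f"significant = {f_scheffe > f_crit}")
```

The contrast coefficients must sum to zero; any such contrast can be tested this way without inflating the family-wise error rate.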

Advantages and Limitations

Advantages
  • Strongest control over family-wise error rate
  • Works with unequal sample sizes
  • Can test complex contrasts
  • Allows data snooping without penalty
Limitations
  • Lower statistical power than Tukey's HSD
  • More conservative, harder to detect real differences
  • Only appropriate after significant ANOVA result
