This Permutation Test Calculator helps you determine if there's a significant difference between groups without assuming a specific data distribution. It's ideal for small sample sizes or when parametric assumptions are violated. The test works by randomly shuffling (permuting) the data between groups many times to create a null distribution, allowing you to assess how likely your observed result would be by chance alone.
Permutation Test
Definition
A permutation test is a non-parametric statistical method for determining whether groups differ significantly. It randomly shuffles (permutes) the observations between groups many times to build a null distribution, then compares the observed difference to that distribution.
When to Use Permutation Tests
Permutation tests are particularly useful in these situations:
- When your sample size is small
- When your data doesn't meet parametric test assumptions (like normality)
- When you want to make minimal assumptions about underlying distributions
- For complex test statistics without known sampling distributions
- When you need a test that maintains good statistical power with non-normal data
- When testing for independence between variables
How Permutation Tests Work (Step by Step)
- Calculate the observed test statistic. This could be a difference in means, medians, or any other statistic of interest.
- Combine the data from all groups into a single pool.
- Repeat many times (e.g., 10,000 iterations):
  - Randomly shuffle the combined data
  - Reassign data points to groups, keeping the original group sizes
  - Calculate the test statistic for this permutation
  - Store this permuted test statistic
- Calculate the p-value. For a two-sided test, count how many permuted statistics are as or more extreme (in absolute value) than the observed statistic, then divide by the number of permutations.
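The steps above can be sketched as a small function. This is a minimal illustration using only Python's standard library; the function name `permutation_test` and its parameters (`statistic`, `n_perm`, `seed`) are assumptions made for this sketch and are not part of the calculator.

```python
import random
from statistics import mean, median


def permutation_test(group_a, group_b, statistic=mean, n_perm=10_000, seed=42):
    """Two-sided permutation test for a difference in `statistic` (hypothetical helper)."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    observed = statistic(group_b) - statistic(group_a)

    combined = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(combined)  # shuffle the pooled data in place
        perm_a, perm_b = combined[:n_a], combined[n_a:]
        diff = statistic(perm_b) - statistic(perm_a)
        # small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    return observed, extreme / n_perm


group_a = [75, 72, 80, 78, 76]
group_b = [85, 86, 83, 87, 84]

obs_mean, p_mean = permutation_test(group_a, group_b, statistic=mean)
obs_median, p_median = permutation_test(group_a, group_b, statistic=median)
print(f"mean difference:   {obs_mean:.1f}, p = {p_mean:.4f}")
print(f"median difference: {obs_median:.1f}, p = {p_median:.4f}")
```

Passing the statistic as an argument reflects the point that any statistic of interest can be used; swapping `mean` for `median` changes nothing else in the procedure.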
Key Advantages
Practical Example
Step 1: State the Data
Group A | Group B |
---|---|
75 | 85 |
72 | 86 |
80 | 83 |
78 | 87 |
76 | 84 |
Step 2: Calculate Observed Difference
- Mean of Group A: (75 + 72 + 80 + 78 + 76) / 5 = 76.2
- Mean of Group B: (85 + 86 + 83 + 87 + 84) / 5 = 85.0
- Observed difference: 85.0 - 76.2 = 8.8
Step 3: Perform Permutation Test
Combine all data: 75, 72, 80, 78, 76, 85, 86, 83, 87, 84
Randomly shuffle and split into groups many times (10,000 permutations)
Calculate the difference for each permutation
Step 4: Calculate p-value
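In this example the two groups do not overlap (every value in Group B exceeds every value in Group A), so only 2 of the 252 possible ways to split the 10 values into two groups of 5 give an absolute difference of at least 8.8: the observed split and its mirror image. The exact p-value is therefore:
p-value = 2/252 ≈ 0.0079
With 10,000 random permutations, the estimated p-value will be close to this, i.e. roughly 79 out of 10,000.
Step 5: Draw Conclusion
Since the p-value (≈ 0.0079) is less than our significance level (0.05), we reject the null hypothesis. There is statistically significant evidence to conclude that the groups differ.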
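Because the example has only 10 observations, every possible split into two groups of 5 can be enumerated instead of sampled. The following sketch (the function name `exact_permutation_p_value` is an assumption for illustration) counts the splits whose absolute mean difference is at least as large as the observed one, which gives an exact p-value for the example data rather than a Monte Carlo estimate.

```python
from itertools import combinations
from math import comb


def exact_permutation_p_value(group_a, group_b):
    """Enumerate every split of the pooled data (hypothetical helper)."""
    combined = list(group_a) + list(group_b)
    n_a, n = len(group_a), len(combined)
    total = sum(combined)
    observed = sum(group_b) / len(group_b) - sum(group_a) / n_a

    extreme = 0
    for idx in combinations(range(n), n_a):
        sum_a = sum(combined[i] for i in idx)
        diff = (total - sum_a) / (n - n_a) - sum_a / n_a
        # small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1

    return extreme, comb(n, n_a)


group_a = [75, 72, 80, 78, 76]
group_b = [85, 86, 83, 87, 84]

extreme, n_splits = exact_permutation_p_value(group_a, group_b)
print(f"{extreme} of {n_splits} splits, exact p = {extreme / n_splits:.4f}")
# prints "2 of 252 splits, exact p = 0.0079"
```

Full enumeration is only practical for small samples, since the number of splits grows very quickly with the sample size; random permutations are the usual fallback for larger data.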
Effect Size
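Cohen's d can be used to measure effect size:
$$d = \frac{\bar{x}_B - \bar{x}_A}{s_{\text{pooled}}}$$
where
$$s_{\text{pooled}} = \sqrt{\frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}}$$
Guidelines:
- Small effect: $|d| \approx 0.2$
- Medium effect: $|d| \approx 0.5$
- Large effect: $|d| \approx 0.8$
For our example, $s_A^2 = 9.2$ and $s_B^2 = 2.5$, so $s_{\text{pooled}} = \sqrt{5.85} \approx 2.42$ and $d = 8.8 / 2.42 \approx 3.64$, which indicates a very large effect.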
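As a quick check of the effect size for the example data, here is a minimal sketch that assumes the common pooled-standard-deviation form of Cohen's d (difference in means divided by the pooled sample standard deviation). The helper name `cohens_d` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (hypothetical helper)."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (divides by n - 1)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_var)


group_a = [75, 72, 80, 78, 76]
group_b = [85, 86, 83, 87, 84]

d = cohens_d(group_a, group_b)
print(f"Cohen's d = {d:.2f}")  # prints "Cohen's d = 3.64"
```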
Code Examples
# R: permutation test for a difference in means
library(tidyverse)

set.seed(42)  # fix the seed so results are reproducible

group1 <- c(75, 72, 80, 78, 76)
group2 <- c(85, 86, 83, 87, 84)

# observed difference in means
observed_diff <- mean(group2) - mean(group1)
print(str_glue("Observed difference: {observed_diff}"))

combined <- c(group1, group2)
n1 <- length(group1)
n <- length(combined)
n_perm <- 10000
perm_diffs <- numeric(n_perm)

for (i in seq_len(n_perm)) {
  # shuffle the pooled data and split it into groups of the original sizes
  perm <- sample(combined, n, replace = FALSE)
  perm_group1 <- perm[1:n1]
  perm_group2 <- perm[(n1 + 1):n]
  perm_diffs[i] <- mean(perm_group2) - mean(perm_group1)
}

# two-sided p-value: share of permutations at least as extreme as observed
p_value <- mean(abs(perm_diffs) >= abs(observed_diff))
print(str_glue("Permutation test p-value: {p_value}"))

# plot permuted differences with observed difference
ggplot(data.frame(perm_diffs), aes(x = perm_diffs)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "lightblue", color = "black") +
  geom_vline(aes(xintercept = observed_diff), color = "red", linetype = "dashed", linewidth = 1) +
  geom_vline(aes(xintercept = -observed_diff), color = "red", linetype = "dashed", linewidth = 1) +
  labs(title = "Permutation Test: Distribution of Permuted Differences",
       x = "Difference in Means",
       y = "Density") +
  theme_minimal()
# Python: permutation test for a difference in means
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(42)  # fix the seed so results are reproducible

group1 = np.array([75, 72, 80, 78, 76])
group2 = np.array([85, 86, 83, 87, 84])

# observed difference
observed_diff = np.mean(group2) - np.mean(group1)
print(f"Observed difference: {observed_diff}")

combined = np.concatenate([group1, group2])
n1 = len(group1)
n_perm = 10000
perm_diffs = np.zeros(n_perm)

for i in range(n_perm):
    # Randomly permute the combined data
    perm = np.random.permutation(combined)
    # Split into two groups of original sizes
    perm_group1 = perm[:n1]
    perm_group2 = perm[n1:]
    # Calculate and store the difference in means
    perm_diffs[i] = np.mean(perm_group2) - np.mean(perm_group1)

# two-sided p-value: share of permutations at least as extreme as observed
p_value = np.mean(np.abs(perm_diffs) >= np.abs(observed_diff))
print(f"Permutation test p-value: {p_value}")

# plot the permuted differences with observed difference
plt.figure(figsize=(10, 6))
# stat='density' so the y-axis matches the 'Density' label
sns.histplot(perm_diffs, kde=True, stat='density', color='lightblue', edgecolor='black')
plt.axvline(x=observed_diff, color='red', linestyle='dashed', linewidth=2, label='Observed difference')
plt.axvline(x=-observed_diff, color='red', linestyle='dashed', linewidth=2)
plt.title('Permutation Test: Distribution of Permuted Differences')
plt.xlabel('Difference in Means')
plt.ylabel('Density')
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()
Comparison to Other Tests
How permutation tests compare to other common statistical methods:
- t-test: Permutation tests are more flexible and don't require normality assumptions, but t-tests are simpler to compute and have closed-form solutions.
- Mann-Whitney U Test: Both are non-parametric, but permutation tests can use any test statistic, while Mann-Whitney focuses on ranks.
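- Bootstrap Tests: Bootstrap tests resample with replacement, while permutation tests shuffle the existing data without replacement. Bootstrapping is typically used to estimate sampling variability and confidence intervals, whereas permutation tests directly test the null hypothesis that the group labels are exchangeable.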
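The difference between the two resampling schemes can be seen in a few lines. This is a minimal sketch on the example data, using only the standard library. The helper `bootstrap_diff`, the 5,000 resamples, and the simple percentile interval are choices made for illustration, not a recommended procedure.

```python
import random
from statistics import mean

group_a = [75, 72, 80, 78, 76]
group_b = [85, 86, 83, 87, 84]
rng = random.Random(42)

# Permutation: pool the data and reshuffle the group labels (no replacement),
# so every observation appears exactly once across the two new groups.
pooled = group_a + group_b
shuffled = rng.sample(pooled, k=len(pooled))
perm_a, perm_b = shuffled[:len(group_a)], shuffled[len(group_a):]


# Bootstrap: resample within each group WITH replacement,
# so a value can appear several times or not at all.
def bootstrap_diff(a, b):
    boot_a = rng.choices(a, k=len(a))
    boot_b = rng.choices(b, k=len(b))
    return mean(boot_b) - mean(boot_a)


boot_diffs = sorted(bootstrap_diff(group_a, group_b) for _ in range(5_000))
lower = boot_diffs[int(0.025 * len(boot_diffs))]
upper = boot_diffs[int(0.975 * len(boot_diffs)) - 1]

print(f"One permuted split: {perm_a} vs {perm_b}")
print(f"95% bootstrap percentile interval for the difference: [{lower:.1f}, {upper:.1f}]")
```

The permuted split answers "what would the difference look like if group labels did not matter?", while the bootstrap interval describes the uncertainty around the observed difference itself.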