StatsCalculators.com

Permutation Test

Created: April 16, 2025

This Permutation Test Calculator helps you determine if there's a significant difference between groups without assuming a specific data distribution. It's ideal for small sample sizes or when parametric assumptions are violated. The test works by randomly shuffling (permuting) the data between groups many times to create a null distribution, allowing you to assess how likely your observed result would be by chance alone. To learn about the data format required and test this calculator, click here to populate the sample data.

Calculator

1. Load Your Data

Note: Column names will be converted to snake_case (e.g., "Product ID" → "product_id") for processing.

2. Select Columns & Options

Setting a seed ensures reproducible results across multiple runs

Related Calculators

Learn More

Permutation Test

Definition

The permutation test is a non-parametric statistical method used to determine whether there is a significant difference between groups: the data are randomly shuffled (permuted) between groups many times to build a null distribution, and the observed difference is then compared against that distribution.

When to Use Permutation Tests

Permutation tests are particularly useful in these situations:

  • When your sample size is small
  • When your data doesn't meet parametric test assumptions (like normality)
  • When you want to make minimal assumptions about underlying distributions
  • For complex test statistics without known sampling distributions
  • When you need a test that maintains good statistical power with non-normal data
  • When testing for independence between variables

How Permutation Tests Work (Step by Step)

  1. Calculate the observed test statistic:
    T_{obs} = \text{statistic}(\text{group}_1, \text{group}_2, \ldots)

    This could be a difference in means, medians, or any other statistic of interest.

  2. Combine all data from all groups:
    \text{combined} = \text{group}_1 \cup \text{group}_2 \cup \ldots
  3. Repeat many times (e.g., 10,000 iterations):
    1. Randomly shuffle the combined data
    2. Reassign data points to groups with original group sizes
    3. Calculate the test statistic for this permutation
    4. Store this permuted test statistic
  4. Calculate the p-value:
    p = \frac{\text{Number of permuted statistics} \geq |T_{obs}|}{\text{Number of permutations}}

    For a two-sided test, we count how many permuted statistics are as or more extreme than the observed statistic.
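The steps above can be sketched as a small, generic Python function. The statistic is passed in as a callable, so the same routine works for a difference in means, medians, or any other statistic; the function and variable names here are illustrative, not part of the calculator.

```python
import numpy as np

def permutation_test(group1, group2, statistic, n_perm=10_000, seed=42):
    """Two-sided permutation test for any two-sample statistic."""
    rng = np.random.default_rng(seed)
    group1, group2 = np.asarray(group1), np.asarray(group2)
    t_obs = statistic(group1, group2)            # Step 1: observed statistic
    combined = np.concatenate([group1, group2])  # Step 2: pool all the data
    n1 = len(group1)
    perm_stats = np.empty(n_perm)
    for i in range(n_perm):                      # Step 3: shuffle, split, recompute
        perm = rng.permutation(combined)
        perm_stats[i] = statistic(perm[:n1], perm[n1:])
    # Step 4: two-sided p-value — share of permuted statistics at least as extreme
    return np.mean(np.abs(perm_stats) >= abs(t_obs))

# Any statistic works; here, the difference in means:
diff_means = lambda a, b: np.mean(b) - np.mean(a)
p = permutation_test([75, 72, 80, 78, 76], [85, 86, 83, 87, 84], diff_means)
print(f"p-value: {p:.4f}")
```

Swapping `diff_means` for, say, a difference in medians requires no other changes, which is exactly the flexibility described above.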

Key Advantages

Distribution-Free: No assumptions about underlying data distributions
Flexible: Can be applied to many different test statistics
Small Sample Size: Works well when sample sizes are too small for parametric tests
Exact p-values: Provides exact p-values (limited only by number of permutations)

Practical Example

Step 1: State the Data
| Group A | Group B |
|---------|---------|
| 75      | 85      |
| 72      | 86      |
| 80      | 83      |
| 78      | 87      |
| 76      | 84      |
Step 2: Calculate Observed Difference
  • Mean of Group A: (75 + 72 + 80 + 78 + 76) / 5 = 76.2
  • Mean of Group B: (85 + 86 + 83 + 87 + 84) / 5 = 85.0
  • Observed difference: 85.0 - 76.2 = 8.8
Step 3: Perform Permutation Test

Combine all data: 75, 72, 80, 78, 76, 85, 86, 83, 87, 84

Randomly shuffle and split into groups many times (10,000 permutations)

Calculate the difference for each permutation

Step 4: Calculate p-value

Suppose that 19 out of the 10,000 permutations produced an absolute difference greater than or equal to the observed 8.8. The p-value would then be:

p-value = 19/10,000 = 0.0019

Step 5: Draw Conclusion

Since the p-value (0.0019) is less than our significance level (0.05), we reject the null hypothesis. There is statistically significant evidence to conclude that the groups differ.

Effect Size

Cohen's d can be used to measure effect size:

d = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}}

where s_{pooled} = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}

Guidelines:

  • Small effect: d \approx 0.2
  • Medium effect: d \approx 0.5
  • Large effect: d \approx 0.8

For our example: d = \frac{|85.0 - 76.2|}{2.419} \approx 3.638, which indicates a very large effect.
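As a quick arithmetic check on this example, a few lines of Python reproduce the pooled standard deviation and Cohen's d (using sample variances, i.e. `ddof=1`):

```python
import numpy as np

a = np.array([75, 72, 80, 78, 76])
b = np.array([85, 86, 83, 87, 84])

# Pooled standard deviation from the two sample variances
s_pooled = np.sqrt(((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                   / (len(a) + len(b) - 2))
d = abs(a.mean() - b.mean()) / s_pooled
print(round(s_pooled, 3), round(d, 3))  # → 2.419 3.638
```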

Code Examples

R
library(tidyverse)

set.seed(42)
group1 <- c(75, 72, 80, 78, 76)
group2 <- c(85, 86, 83, 87, 84)

observed_diff <- mean(group2) - mean(group1)
print(str_glue("Observed difference: {observed_diff}"))

combined <- c(group1, group2)
n1 <- length(group1)
n <- length(combined)

n_perm <- 10000
perm_diffs <- numeric(n_perm)

for (i in 1:n_perm) {
  perm <- sample(combined, n, replace = FALSE)
  perm_group1 <- perm[1:n1]
  perm_group2 <- perm[(n1+1):n]
  perm_diffs[i] <- mean(perm_group2) - mean(perm_group1)
}

p_value <- mean(abs(perm_diffs) >= abs(observed_diff))
print(str_glue("Permutation test p-value: {p_value}"))


# plot permuted differences with observed difference

ggplot(data.frame(perm_diffs), aes(x = perm_diffs)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30, fill = "lightblue", color = "black") +
  geom_vline(aes(xintercept = observed_diff), color = "red", linetype = "dashed", linewidth = 1) +
  geom_vline(aes(xintercept = -observed_diff), color = "red", linetype = "dashed", linewidth = 1) +
  labs(title = "Permutation Test: Distribution of Permuted Differences",
       x = "Difference in Means",
       y = "Density") +
  theme_minimal()
Python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(42)

group1 = np.array([75, 72, 80, 78, 76])
group2 = np.array([85, 86, 83, 87, 84])

# observed difference
observed_diff = np.mean(group2) - np.mean(group1)
print(f"Observed difference: {observed_diff}")

combined = np.concatenate([group1, group2])
n1 = len(group1)
n = len(combined)

n_perm = 10000
perm_diffs = np.zeros(n_perm)

for i in range(n_perm):
    # Randomly permute the combined data
    perm = np.random.permutation(combined)
    # Split into two groups of original sizes
    perm_group1 = perm[:n1]
    perm_group2 = perm[n1:]
    # Calculate and store the difference in means
    perm_diffs[i] = np.mean(perm_group2) - np.mean(perm_group1)

# p-value
p_value = np.mean(np.abs(perm_diffs) >= np.abs(observed_diff))
print(f"Permutation test p-value: {p_value}")

# plot the permuted differences with observed difference
plt.figure(figsize=(10, 6))
sns.histplot(perm_diffs, kde=True, stat='density', color='lightblue', edgecolor='black')
plt.axvline(x=observed_diff, color='red', linestyle='dashed', linewidth=2, label='Observed difference')
plt.axvline(x=-observed_diff, color='red', linestyle='dashed', linewidth=2)
plt.title('Permutation Test: Distribution of Permuted Differences')
plt.xlabel('Difference in Means')
plt.ylabel('Density')
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

Comparison to Other Tests

How permutation tests compare to other common statistical methods:

  • t-test: Permutation tests are more flexible and don't require normality assumptions, but t-tests are simpler to compute and have closed-form solutions.
  • Mann-Whitney U Test: Both are non-parametric, but permutation tests can use any test statistic, while Mann-Whitney focuses on ranks.
  • Bootstrap Tests: Bootstrap tests resample with replacement, while permutation tests shuffle existing data. Permutation tests are better for testing differences between groups.
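For data like the worked example above, these alternatives can be compared side by side. The sketch below assumes SciPy ≥ 1.7, which ships its own `scipy.stats.permutation_test`; with only C(10,5) = 252 distinct group assignments here, SciPy enumerates them all and returns an exact p-value.

```python
import numpy as np
from scipy import stats

a = np.array([75, 72, 80, 78, 76])
b = np.array([85, 86, 83, 87, 84])

def diff_means(x, y):
    return np.mean(y) - np.mean(x)

# Permutation test (SciPy >= 1.7); exact here, since all 252 splits fit in n_resamples
perm = stats.permutation_test((a, b), diff_means,
                              permutation_type='independent',
                              n_resamples=10_000, alternative='two-sided')
t_p = stats.ttest_ind(a, b).pvalue                              # parametric t-test
u_p = stats.mannwhitneyu(a, b, alternative='two-sided').pvalue  # rank-based
print(f"permutation: {perm.pvalue:.4f}, t-test: {t_p:.4f}, Mann-Whitney: {u_p:.4f}")
```

All three agree that the difference is significant for these data; they diverge mainly when normality fails or when the statistic of interest is something other than a mean or rank comparison.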

Verification