Dunn's Test

Created: December 15, 2024

Last Updated: March 30, 2025

Dunn's test is a post-hoc test used after a Kruskal-Wallis test to perform multiple pairwise comparisons between groups. It helps identify which specific groups differ significantly from each other when comparing three or more independent groups. This calculator performs Dunn's test on your data, providing p-values for each pairwise comparison and indicating which differences are statistically significant based on your chosen significance level (alpha). The test requires at least 3 groups with at least 2 observations in each group.

Note about R Implementation

There are multiple R packages available for performing Dunn's test (e.g., dunn.test, rstatix, FSA). These packages may produce slightly different test statistics and p-values because their implementations differ. The same applies to this calculator, which uses the multipletests function from the statsmodels package for p-value adjustment. However, the final conclusions about which groups differ significantly typically remain consistent across packages. You can see examples using different packages in the verification section below.


Calculator

1. Load Your Data

2. Select Columns & Options

Select column for groups:

Select column for values:

Significance Level (α):

P-value Adjustment Method:

Bonferroni is recommended for strict control of Type I errors (the family-wise error rate). FDR methods (fdr_bh, fdr_by) control the false discovery rate instead and are better suited to exploratory analyses.
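To make the difference between the two families of adjustment concrete, here is a minimal pure-Python sketch of Bonferroni and Benjamini-Hochberg (the method behind fdr_bh) adjusted p-values. The function names and the three raw p-values are hypothetical and chosen only for illustration; the calculator itself relies on statsmodels, not on this code.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.02, 0.03]  # hypothetical unadjusted p-values
alpha = 0.05
n_bonf = sum(p < alpha for p in bonferroni(raw))
n_bh = sum(p < alpha for p in benjamini_hochberg(raw))
print(n_bonf, n_bh)  # 1 3
```

With these inputs Bonferroni keeps only the smallest p-value below 0.05, while Benjamini-Hochberg keeps all three, which is why the stricter method is usually preferred for confirmatory work.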

Related Calculators

Kruskal-Wallis Test Calculator

Mann-Whitney U Test Calculator

Friedman Test Calculator

One-Way ANOVA Calculator

Learn More

Dunn's Test

Definition

Dunn's Test is a non-parametric post-hoc test used after a significant Kruskal-Wallis test to identify which specific groups differ from each other. For each pair of groups, it compares the difference in mean ranks with the standard error of that difference expected under the null hypothesis. It is often used when the assumptions of ANOVA, such as normality or homogeneity of variances, are violated.

Test Statistics

For comparing groups $i$ and $j$ :

$$z_{ij} = \frac{\bar{R}_i - \bar{R}_j}{\sqrt{\frac{N(N+1)}{12}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$

Where:

$\bar{R}_i, \bar{R}_j$ = mean ranks for groups $i$ and $j$
$n_i, n_j$ = sample sizes for groups $i$ and $j$
$N$ = total sample size
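The statistic above assumes that no values are tied. When ties are present, Dunn's formulation subtracts a tie-correction term from the variance, and implementations such as the R dunn.test package apply it. As a sketch, writing $t_s$ for the number of observations that share the $s$-th tied value (a symbol introduced here for illustration, not taken from the page), the tie-corrected statistic is:

```latex
z_{ij} = \frac{\bar{R}_i - \bar{R}_j}
{\sqrt{\left[\frac{N(N+1)}{12} - \frac{\sum_{s}\left(t_s^3 - t_s\right)}{12(N-1)}\right]
\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}
```

With no ties every $t_s$ equals 1, the sum vanishes, and the expression reduces to the formula above.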

Key Features

Non-parametric: No assumption of normality

Multiple Comparisons: Controls the family-wise error rate (or the false discovery rate) once a p-value adjustment is applied

Unequal Sample Sizes: Can handle different group sizes

Practical Example

Step 1: State the Data

Test scores for three different teaching methods:

Method A	Method B	Method C
23, 25, 21, 22, 20	18, 19, 17, 20, 21	15, 14, 16, 18, 19

Step 2: Rank All Data

Combined ranking of all values (lowest to highest):

Value	Rank	Group
14	1	C
15	2	C
16	3	C
17	4	B
18	5.5	B,C
19	7.5	B,C
20	9.5	A,B
21	11.5	A,B
22	13	A
23	14	A
25	15	A
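The ranking step can be reproduced with a short pure-Python sketch. The helper name average_ranks is hypothetical; it assigns tied values the mean of the ranks they occupy, which is the same convention used in the table.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of 1-based positions start+1..end+1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks


# Scores in the order Method A, Method B, Method C (five each).
scores = [23, 25, 21, 22, 20, 18, 19, 17, 20, 21, 15, 14, 16, 18, 19]
ranks = average_ranks(scores)
print(ranks)
# [14.0, 15.0, 11.5, 13.0, 9.5, 5.5, 7.5, 4.0, 9.5, 11.5, 2.0, 1.0, 3.0, 5.5, 7.5]
```

The ranks always sum to $N(N+1)/2 = 120$ for $N = 15$, which is a quick sanity check on any hand ranking.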

Step 3: Calculate Mean Ranks

Method A: $\bar{R}_A = 12.6$
Method B: $\bar{R}_B = 7.6$
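Method C: $\bar{R}_C = 3.8$

Step 4: Calculate Test Statistics

The values 18, 19, 20, and 21 each occur twice, so the tie correction applies: $\frac{N(N+1)}{12} = 20$ is reduced by $\frac{4 \times (2^3 - 2)}{12 \times (15 - 1)} \approx 0.143$ to $19.857$. For each pair:

A vs B: $z_{AB} = \frac{12.6-7.6}{\sqrt{19.857 \times (\frac{1}{5} + \frac{1}{5})}} = 1.774$
A vs C: $z_{AC} = \frac{12.6-3.8}{\sqrt{19.857 \times (\frac{1}{5} + \frac{1}{5})}} = 3.122$
B vs C: $z_{BC} = \frac{7.6-3.8}{\sqrt{19.857 \times (\frac{1}{5} + \frac{1}{5})}} = 1.348$

Without the tie correction, the denominator is $\sqrt{20 \times 0.4} \approx 2.828$ and the statistics are 1.768, 3.111, and 1.344.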
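As a check on Steps 3 and 4, the following sketch recomputes the mean ranks and the pairwise z statistics from the ranks in Step 2, both with and without the tie correction. The variable names are illustrative only, and the tie sizes are entered by hand from the ranking table.

```python
import math
from itertools import combinations

# Ranks taken from the ranking table in Step 2 (ties share their average rank).
ranks = {
    "A": [14, 15, 11.5, 13, 9.5],
    "B": [5.5, 7.5, 4, 9.5, 11.5],
    "C": [2, 1, 3, 5.5, 7.5],
}
n_total = sum(len(r) for r in ranks.values())  # N = 15
mean_rank = {g: sum(r) / len(r) for g, r in ranks.items()}

# Four tied values (18, 19, 20, 21), each shared by two observations.
tie_term = sum(t**3 - t for t in [2, 2, 2, 2]) / (12 * (n_total - 1))
base = n_total * (n_total + 1) / 12

z_stats = {}
for g1, g2 in combinations(ranks, 2):
    pair = 1 / len(ranks[g1]) + 1 / len(ranks[g2])
    diff = mean_rank[g1] - mean_rank[g2]
    z_plain = diff / math.sqrt(base * pair)              # no tie correction
    z_tied = diff / math.sqrt((base - tie_term) * pair)  # tie-corrected
    z_stats[(g1, g2)] = (z_plain, z_tied)
    print(f"{g1} vs {g2}: z = {z_plain:.3f} (no tie correction), {z_tied:.3f} (tie-corrected)")
```

The two versions differ only in the third decimal place here, which is the kind of small discrepancy to expect between implementations that do and do not correct for ties.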

Step 5: Apply Bonferroni Correction

For 3 groups, we have 3 comparisons:

Adjusted $\alpha = 0.05/3 = 0.0167$
Critical value $z = \pm2.39$
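The adjusted significance level and the two-sided critical value can be reproduced with the Python standard library. This is a small sketch; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
k = 3                     # number of groups
m = k * (k - 1) // 2      # number of pairwise comparisons
alpha_adj = alpha / m     # Bonferroni-adjusted significance level

# Two-sided test: put alpha_adj / 2 in each tail of the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha_adj / 2)
print(m, round(alpha_adj, 4), round(z_crit, 2))  # 3 0.0167 2.39
```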

Step 6: Draw Conclusions

A vs B: Not significant ( $1.77 \lt 2.39$ )
A vs C: Significant ( $3.12 \gt 2.39$ )
B vs C: Not significant ( $1.35 \lt 2.39$ )

Method A scores significantly differ from C. However, there is no evidence to suggest a significant difference between the scores of Method A and Method B or Method B and Method C.
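Comparing $|z|$ with the critical value is equivalent to comparing Bonferroni-adjusted p-values with the original $\alpha$. The sketch below takes the z statistics from Step 4 as given and uses only the standard library; the dictionary layout is an illustrative choice.

```python
from statistics import NormalDist

alpha = 0.05
z_values = {"A vs B": 1.774, "A vs C": 3.122, "B vs C": 1.348}  # from Step 4
m = len(z_values)

results = {}
for pair, z in z_values.items():
    p_raw = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided unadjusted p-value
    p_adj = min(1.0, p_raw * m)                 # Bonferroni adjustment
    results[pair] = (p_adj, p_adj < alpha)
    print(f"{pair}: adjusted p = {p_adj:.4f}, significant = {p_adj < alpha}")
```

Only the A vs C comparison has an adjusted p-value below 0.05, matching the conclusion drawn from the critical value.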

How to perform Dunn's Test in R and Python

R

library(tidyverse)
library(FSA)

# Data preparation
data_long <- tibble(
  Method = rep(c("Method A", "Method B", "Method C"), each = 5),
  Value = c(23, 25, 21, 22, 20, 18, 19, 17, 20, 21, 15, 14, 16, 18, 19)
)

# Perform Kruskal-Wallis test
kruskal_test <- kruskal.test(Value ~ Method, data = data_long)
print(kruskal_test)

# Perform Dunn's test with Bonferroni correction
dunn_test <- dunnTest(Value ~ Method, data = data_long, method = "bonferroni")
print(dunn_test)

Python

import numpy as np
from scipy import stats
from itertools import combinations
from statsmodels.stats.multitest import multipletests

def dunn_test(groups, values, alpha, adjustment_method):
    # Perform Kruskal-Wallis H-test
    h_statistic, overall_p_value = stats.kruskal(
        *[values[groups == g] for g in np.unique(groups)]
    )

    # Calculate ranks
    ranks = stats.rankdata(values)

    # Get unique groups and their sizes
    unique_groups = np.unique(groups)
    group_sizes = [np.sum(groups == g) for g in unique_groups]

    # Calculate pairwise comparisons
    comparisons = []
    for (i, group1), (j, group2) in combinations(enumerate(unique_groups), 2):
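        # Calculate z-statistic (no tie correction is applied here, so values
        # can differ slightly from tie-corrected implementations)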
        z = (
            np.mean(ranks[groups == group1]) - np.mean(ranks[groups == group2])
        ) / np.sqrt(
            (len(values) * (len(values) + 1) / 12)
            * (1 / group_sizes[i] + 1 / group_sizes[j])
        )

        # Calculate the two-sided unadjusted p-value; the survival function
        # is more precise than 1 - cdf for large |z|
        p_unadjusted = 2 * stats.norm.sf(abs(z))

        comparisons.append(
            {
                "group1": group1,
                "group2": group2,
                "z_statistic": z,
                "unadjusted_p_value": p_unadjusted,
            }
        )

    # Adjust p-values
    p_values = [comp["unadjusted_p_value"] for comp in comparisons]
    if adjustment_method in ["bonferroni", "sidak", "holm", "fdr_bh", "fdr_by"]:
        rejected, adjusted_p_values, _, _ = multipletests(
            p_values, alpha=alpha, method=adjustment_method
        )
    else:
        raise ValueError(f"Unsupported adjustment method: {adjustment_method}")

    # Add adjusted p-values and significance to comparisons
    for comp, adj_p, rej in zip(comparisons, adjusted_p_values, rejected):
        comp["adjusted_p_value"] = adj_p
        comp["significant"] = bool(rej)

    # H statistic, Kruskal-Wallis p-value, degrees of freedom (k - 1), pairwise results
    return h_statistic, overall_p_value, len(unique_groups) - 1, comparisons

# Example data
group1 = [23, 25, 21, 22, 20]
group2 = [18, 19, 17, 20, 21]
group3 = [15, 14, 16, 18, 19]
values = np.array(group1 + group2 + group3)
groups = np.array(['A'] * 5 + ['B'] * 5 + ['C'] * 5)

# Perform Dunn's test
dunn = dunn_test(groups, values, 0.05, 'bonferroni')
print(dunn)

Alternative Tests

Consider these alternatives:

Tukey's HSD: Parametric post-hoc test after one-way ANOVA, when data are approximately normal with equal variances
Games-Howell: Parametric post-hoc test when variances are unequal
Nemenyi Test: Another non-parametric alternative

Verification


