Sample Size & Power Analysis

Created: October 19, 2024
Last Updated: December 26, 2025

This sample size calculator helps you determine the optimal sample size needed for your statistical tests. It provides comprehensive power analysis with visual charts showing the relationship between sample size, effect size, and statistical power. Whether you're comparing means (paired/unpaired), proportions, multiple groups (ANOVA), categorical associations (chi-square), or building predictive models (multiple regression), this calculator will help ensure your research has adequate statistical power to detect meaningful effects.

If you are looking to calculate the sample size based on a desired margin of error for confidence intervals, try our Margin of Error Sample Size Calculator.

Calculator

Parameters

Effect size range: 0.1 to 2 (standard deviations)

Notes:

  • Effect size interpretations vary by field and context
  • For mean difference tests, effect size is in standard deviation units
  • Power of 80% (0.8) is typically considered adequate
  • Significance level of 5% (0.05) is conventional in many fields
  • The power curve shows how the statistical power changes with different effect sizes
  • Larger sample sizes can detect smaller effect sizes with the same power

Sample Size Calculation Results

[Interactive charts: Power vs Sample Size and Power vs Effect Size]


Learn More: Sample Size Formulas, Examples, and R Code

Why Sample Size Matters in Research

What is Statistical Power?

Statistical power is the probability that your study will correctly detect an effect when one truly exists. Failing to detect a real effect is a Type II error (false negative).

A power of 0.8 (or 80%) is typically considered adequate, indicating there is a 20% chance of overlooking a real effect.
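
To make this concrete, here is a minimal Monte Carlo sketch (an illustration, not part of the calculator): it simulates many two-group experiments with a true effect of d = 0.5 and n = 64 per group, the sample size derived later on this page, and counts how often a t-test reaches p < 0.05. Roughly 80% of runs do, which is exactly what a power of 0.8 means.

Python
from scipy import stats
import numpy as np

rng = np.random.default_rng(42)
d, n, alpha, n_sims = 0.5, 64, 0.05, 10_000

rejections = 0
for _ in range(n_sims):
    control = rng.normal(0.0, 1.0, n)    # true mean 0, sd 1
    treatment = rng.normal(d, 1.0, n)    # true mean shifted by d = 0.5
    if stats.ttest_ind(control, treatment).pvalue < alpha:
        rejections += 1

# Expected to print roughly 0.80 for d = 0.5 and n = 64 per group
print(f"Empirical power: {rejections / n_sims:.3f}")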

The Importance of Sample Size

Sample size calculation is a crucial step in research design and hypothesis testing. It helps you:

  • Ensure your study has adequate statistical power to detect meaningful effects
  • Avoid wasting resources on studies that are too large
  • Maintain ethical standards by not using too few or too many participants
  • Make informed decisions about resource allocation

Warning: Conducting a study with inadequate sample size can lead to:

  • False negatives (Type II errors) - failing to detect real effects
  • Unreliable results and wasted resources
  • Inability to draw meaningful conclusions

A/B Testing Example

Scenario: Website Conversion Rate

You're testing a new button design and want to detect a 2% increase in conversion rate (from 10% to 12%).

Compare an underpowered test with a properly sized one:

Too Small (100 visitors/group)
  • Control: 10 conversions (10%)
  • Test: 12 conversions (12%)
  • Result: Not statistically significant despite a real effect
Proper Size (4,000 visitors/group)
  • Control: 400 conversions (10%)
  • Test: 480 conversions (12%)
  • Result: Adequate power to detect the real difference

Required Calculations

For this example, we need:

  • Significance level: α = 0.05
  • Power: 1-β = 0.80
  • Baseline rate: p₁ = 0.10
  • Expected rate: p₂ = 0.12
  • Effect size: |p₂ - p₁| = 0.02

Enter these values into the calculator and it will return 3,841 visitors per group.
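
As a cross-check, a short statsmodels sketch (using the same functions as the proportion-test section below) reproduces this figure; because different normal approximations to the two-proportion test differ slightly, expect a result within about 1% of the calculator's 3,841.

Python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize
import numpy as np

# Cohen's h for the 10% -> 12% conversion-rate lift
h = proportion_effectsize(0.12, 0.10)

# Solve for the per-group sample size at alpha = 0.05 and 80% power
n = NormalIndPower().solve_power(effect_size=h, alpha=0.05, power=0.8,
                                 ratio=1.0, alternative='two-sided')

print(f"Cohen's h: {h:.4f}")
print(f"Visitors needed per group: {np.ceil(n):.0f}")  # close to 3,841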

Common Mistakes to Avoid

Underpowered Studies

  • Unable to detect meaningful effects
  • Waste of time and resources
  • Inconclusive results
  • Potential ethical issues

Overpowered Studies

  • Excessive resource usage
  • Detection of trivial effects
  • Unnecessary participant burden
  • Inflated costs

Best Practices

  • Always calculate sample size before starting data collection
  • Consider practical significance, not just statistical significance
  • Account for potential dropout or missing data
  • Document your sample size calculations and assumptions
  • Consider conducting a pilot study if parameters are unknown

Sequential Testing and Early Stopping

While traditional sample size calculation is crucial, modern A/B testing platforms often use sequential testing approaches:

Sequential Analysis

  • Continuously monitor results
  • Stop early if effect is clear
  • Adjust for multiple looks
  • More efficient use of resources

Required Adjustments

  • Use adjusted significance levels
  • Account for "peeking" at interim results (see the simulation sketch below)
  • Consider false discovery rate
  • Monitor effect size stability
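
Why these adjustments matter is easy to demonstrate. The sketch below (illustrative, with arbitrary batch sizes) runs A/A experiments in which there is no true effect, peeking with an unadjusted t-test after every batch and stopping at the first p < 0.05. The false-positive rate climbs far above the nominal 5%, which is precisely why sequential methods adjust their significance thresholds.

Python
from scipy import stats
import numpy as np

rng = np.random.default_rng(0)
alpha, batch, n_looks, n_sims = 0.05, 20, 10, 2_000

false_positives = 0
for _ in range(n_sims):
    # A/A test: both arms drawn from the same distribution (no real effect)
    a = rng.normal(size=batch * n_looks)
    b = rng.normal(size=batch * n_looks)
    for look in range(1, n_looks + 1):
        n = look * batch
        if stats.ttest_ind(a[:n], b[:n]).pvalue < alpha:  # unadjusted peek
            false_positives += 1
            break

print(f"Nominal alpha: {alpha:.2f}")
print(f"False-positive rate with 10 unadjusted peeks: {false_positives / n_sims:.3f}")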

Key Takeaway

Whether using traditional fixed-sample approaches or modern sequential methods, proper planning of sample size and monitoring procedures is essential for valid and reliable results.

Power Analysis Calculations: Sample Size, Power, and Effect Size

1. Two-Sample Mean Difference (t-test)

For comparing two independent means, the sample size per group is:

n = \frac{2(z_{\alpha/2} + z_{\beta})^2}{d^2}

where:

  • zα/2: Critical value for Type I error rate (1.96 for α = 0.05)
  • zβ: Critical value for Type II error rate (0.84 for power = 0.80)
  • d: Cohen's d (standardized effect size) = (μ₁ - μ₂)/σ

Note: For unequal group sizes with allocation ratio r = n₂/n₁:

n_1 = \frac{(z_{\alpha/2} + z_{\beta})^2(1 + 1/r)}{d^2}, \quad n_2 = r \times n_1

Sample Size Calculation (Given Power & Effect Size)

Let's calculate the sample size needed to detect a medium effect size (d = 0.5) with 80% power at α = 0.05:

Step-by-step calculation:

  1. α = 0.05, so zα/2 = 1.96
  2. Power = 0.80, so zβ = 0.84
  3. Effect size d = 0.5
  4. Apply the formula:
    n = \frac{2(1.96 + 0.84)^2}{(0.5)^2} = \frac{2(2.8)^2}{0.25} = \frac{2 \times 7.84}{0.25} = \frac{15.68}{0.25} = 63.76
  5. Round up to n = 64 subjects per group

Python implementation

Python
from statsmodels.stats.power import TTestIndPower
import numpy as np

# Parameters
effect_size = 0.5    # Cohen's d (medium effect)
alpha = 0.05         # Significance level
power = 0.8          # Desired power

# Create power analysis object
analysis = TTestIndPower()

# Calculate sample size (per group)
sample_size = analysis.solve_power(
    effect_size=effect_size,
    alpha=alpha,
    power=power,
    ratio=1.0,  # Equal group sizes
    alternative='two-sided'
)

print(f"Sample size per group: {np.ceil(sample_size):.0f}")

# For unequal group sizes with ratio 2:1
ratio = 2.0
sample_size_unequal = analysis.solve_power(
    effect_size=effect_size,
    alpha=alpha,
    power=power,
    ratio=ratio,
    alternative='two-sided'
)

n1 = np.ceil(sample_size_unequal)
n2 = np.ceil(sample_size_unequal * ratio)
print(f"Unequal groups (ratio 1:{ratio:.0f}):")
print(f"- Group 1 sample size: {n1:.0f}")
print(f"- Group 2 sample size: {n2:.0f}")
print(f"- Total sample size: {n1 + n2:.0f}")

Output:

Sample size per group: 64

Unequal groups (ratio 1:2):
- Group 1 sample size: 48
- Group 2 sample size: 96
- Total sample size: 144

R implementation

R
library(tidyverse)
library(pwr)

# Parameters
effect_size <- 0.5   # Cohen's d (medium effect)
sig_level <- 0.05    # Significance level (alpha)
power <- 0.8         # Desired power
type <- "two.sample" # Two-sample t-test

# Calculate sample size (equal group sizes)
result <- pwr.t.test(d = effect_size,
                     sig.level = sig_level,
                     power = power,
                     type = type,
                     alternative = "two.sided")

# Print results
print(str_glue("Sample size per group: {ceiling(result$n)}"))

# For unequal group sizes with allocation ratio r = 2
r <- 2
n1 <- pwr.t.test(d = effect_size,
                 sig.level = sig_level,
                 power = power,
                 type = type)$n * (1 + r) / (2 * r)
n2 <- r * n1
print(str_glue("Unequal groups (ratio 1:{r}):"))
print(str_glue("- Group 1 sample size: {ceiling(n1)}"))
print(str_glue("- Group 2 sample size: {ceiling(n2)}"))
print(str_glue("- Total sample size: {ceiling(n1) + ceiling(n2)}"))

Output:

     Two-sample t test power calculation 

              n = 63.76561
              d = 0.5
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

Unequal groups (ratio 1:2):
- Group 1 sample size: 48
- Group 2 sample size: 96
- Total sample size: 144

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power you'll achieve with a given sample size and effect size:

Python implementation
Python
from statsmodels.stats.power import TTestIndPower

# Given parameters
effect_size = 0.5    # Cohen's d
alpha = 0.05         # Significance level
sample_size = 50     # Per group

analysis = TTestIndPower()
power = analysis.power(
    effect_size=effect_size,
    nobs1=sample_size,
    alpha=alpha,
    ratio=1.0,
    alternative='two-sided'
)

print(f"With n={sample_size} per group and d={effect_size}:")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=50 per group and d=0.5:
Statistical power: 0.6969 (69.69%)
R implementation
R
library(tidyverse)
library(pwr)

# Given parameters
effect_size <- 0.5    # Cohen's d
sig_level <- 0.05     # Significance level
sample_size <- 50     # Per group

# Calculate power
result <- pwr.t.test(
    d = effect_size,
    n = sample_size,
    sig.level = sig_level,
    type = "two.sample",
    alternative = "two.sided"
)

print(str_glue("With n={sample_size} per group and d={effect_size}:"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=50 per group and d=0.5:
Statistical power: 0.6969 (69.69%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the smallest effect size you can detect with a given sample size and desired power:

Python implementation
Python
from statsmodels.stats.power import TTestIndPower

# Given parameters
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 64     # Per group

analysis = TTestIndPower()
effect_size = analysis.solve_power(
    effect_size=None,
    nobs1=sample_size,
    alpha=alpha,
    power=power,
    ratio=1.0,
    alternative='two-sided'
)

print(f"With n={sample_size} per group and power={power}:")
print(f"Minimum detectable effect size: {effect_size:.4f}")

Output:

With n=64 per group and power=0.8:
Minimum detectable effect size: 0.4991
R implementation
R
library(tidyverse)
library(pwr)

# Given parameters
sig_level <- 0.05     # Significance level
power <- 0.8          # Desired power
sample_size <- 64     # Per group

# Calculate minimum detectable effect size
result <- pwr.t.test(
    n = sample_size,
    sig.level = sig_level,
    power = power,
    type = "two.sample",
    alternative = "two.sided"
)

print(str_glue("With n={sample_size} per group and power={power}:"))
print(str_glue("Minimum detectable effect size: {round(result$d, 4)}"))

Output:

With n=64 per group and power=0.8:
Minimum detectable effect size: 0.4991

2. Paired Difference Test

For paired samples, the required number of pairs is:

n = \frac{2(z_{\alpha/2} + z_{\beta})^2(1-\rho)}{d^2}

where:

  • ρ: Correlation between paired measurements
  • d: Effect size = (μ₁ - μ₂)/σ

Note: Higher correlation between pairs reduces the required sample size, making paired designs more efficient when correlation is strong.

Sample Size Calculation (Given Power & Effect Size)

For a paired t-test with expected effect size d = 0.5, correlation ρ = 0.6, significance level α = 0.05, and power = 0.8:

n = \frac{2(1.96 + 0.84)^2(1-0.6)}{0.5^2} = \frac{2(2.8)^2(0.4)}{0.25} = \frac{6.272}{0.25} = 25.09 \approx 26 \text{ pairs}

The normal-approximation formula suggests 26 pairs; the exact t-based calculations below arrive at 28 pairs, because the t distribution requires slightly larger samples than the normal approximation.

Python Implementation
Python
from statsmodels.stats.power import TTestPower, TTestIndPower
import numpy as np

# Parameters
d = 0.5            # Effect size (Cohen's d)
alpha = 0.05       # Significance level
power = 0.8        # Desired power
rho = 0.6          # Correlation between pairs

# Adjusted effect size for paired design
# For paired t-test, effective effect size is larger due to correlation
d_adj = d / np.sqrt(2 * (1 - rho))

# Create power analysis object
analysis = TTestPower()

# Calculate sample size (number of pairs)
sample_size = analysis.solve_power(
    effect_size=d_adj,
    alpha=alpha,
    power=power,
    alternative='two-sided'
)

print(f"Sample size (number of pairs): {np.ceil(sample_size):.0f}")
print(f"Correlation reduces required sample size from")
independent_n = analysis.solve_power(effect_size=d, nobs=None, alpha=alpha, power=power, alternative='two-sided')
print(f"{np.ceil(independent_n):.0f} (independent) to {np.ceil(sample_size):.0f} (paired)")

Output:

Sample size (number of pairs): 28

Correlation reduces the required sample size from 64 per group (independent) to 28 pairs (paired)
R Implementation
R
library(tidyverse)
library(pwr)

# Parameters
d <- 0.5           # Effect size (Cohen's d)
sig.level <- 0.05  # Significance level
power <- 0.8       # Desired power
rho <- 0.6         # Correlation between pairs

# Adjusted effect size for paired design
d_adj <- d / sqrt(2 * (1 - rho))

# Calculate sample size
result <- pwr.t.test(
  d = d_adj,
  sig.level = sig.level,
  power = power,
  type = "paired"
)
print(result)
print(str_glue("Sample size per group: {ceiling(result$n)}"))

Output:

     Paired t test power calculation 

              n = 27.0998
              d = 0.559017
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number of *pairs*

Sample size (number of pairs): 28

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power for a paired t-test:

Python implementation
Python
from statsmodels.stats.power import TTestPower
import numpy as np

# Given parameters
d = 0.5              # Effect size (Cohen's d)
alpha = 0.05         # Significance level
sample_size = 30     # Number of pairs
rho = 0.6            # Correlation between pairs

# Adjusted effect size
d_adj = d / np.sqrt(2 * (1 - rho))

analysis = TTestPower()
power = analysis.power(
    effect_size=d_adj,
    nobs=sample_size,
    alpha=alpha,
    alternative='two-sided'
)

print(f"With n={sample_size} pairs, d={d}, and ρ={rho}:")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=30 pairs, d=0.5, and ρ=0.6:
Statistical power: 0.8411 (84.11%)
R implementation
R
library(pwr)
library(stringr)

# Given parameters
d <- 0.5              # Effect size (Cohen's d)
sig_level <- 0.05     # Significance level
sample_size <- 30     # Number of pairs
rho <- 0.6            # Correlation between pairs

# Adjusted effect size
d_adj <- d / sqrt(2 * (1 - rho))

# Calculate power
result <- pwr.t.test(
    d = d_adj,
    n = sample_size,
    sig.level = sig_level,
    type = "paired",
    alternative = "two.sided"
)

print(str_glue("With n={sample_size} pairs, d={d}, and ρ={rho}:"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=30 pairs, d=0.5, and ρ=0.6:
Statistical power: 0.8411 (84.11%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the minimum detectable effect size for a paired t-test:

Python implementation
Python
from statsmodels.stats.power import TTestPower
import numpy as np

# Given parameters
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 28     # Number of pairs
rho = 0.6            # Correlation between pairs

analysis = TTestPower()
# Solve for adjusted effect size
d_adj = analysis.solve_power(
    effect_size=None,
    nobs=sample_size,
    alpha=alpha,
    power=power,
    alternative='two-sided'
)

# Convert back to original effect size
d = d_adj * np.sqrt(2 * (1 - rho))

print(f"With n={sample_size} pairs, power={power}, and ρ={rho}:")
print(f"Minimum detectable effect size: {d:.4f}")

Output:

With n=28 pairs, power=0.8, and ρ=0.6:
Minimum detectable effect size: 0.4913
R implementation
R
library(pwr)
library(stringr)

# Given parameters
sig_level <- 0.05     # Significance level
power <- 0.8          # Desired power
sample_size <- 28     # Number of pairs
rho <- 0.6            # Correlation between pairs

# Calculate minimum detectable adjusted effect size
result <- pwr.t.test(
    n = sample_size,
    sig.level = sig_level,
    power = power,
    type = "paired",
    alternative = "two.sided"
)

# Convert back to original effect size
d <- result$d * sqrt(2 * (1 - rho))

print(str_glue("With n={sample_size} pairs, power={power}, and ρ={rho}:"))
print(str_glue("Minimum detectable effect size: {round(d, 4)}"))

Output:

With n=28 pairs, power=0.8, and ρ=0.6:
Minimum detectable effect size: 0.4912

3. Proportion Test

For comparing two proportions, the required sample size per group is:

n = \frac{2(z_{\alpha/2} + z_{\beta})^2}{h^2}

where:

  • h: Cohen's h = 2 arcsin(√p₁) − 2 arcsin(√p₂)
  • p₁, p₂: Expected proportions in each group

Cohen's h Effect Size Guidelines:

  • Small: h = 0.2
  • Medium: h = 0.5
  • Large: h = 0.8

Sample Size Calculation (Given Power & Effect Size)

Let's calculate the sample size needed to detect a difference between proportions p₁ = 0.6 and p₂ = 0.4 with 80% power at α = 0.05:

h = 2\arcsin(\sqrt{0.6}) - 2\arcsin(\sqrt{0.4}) = 2(0.8861) - 2(0.6847) = 1.7722 - 1.3694 = 0.4027

n = \frac{2(1.96 + 0.84)^2}{(0.4027)^2} = \frac{2(2.8)^2}{0.1622} = \frac{15.68}{0.1622} = 96.69 \approx 97

Python implementation

Python
from statsmodels.stats.power import zt_ind_solve_power
from statsmodels.stats.proportion import proportion_effectsize
import numpy as np

# Parameters
p1 = 0.6           # Proportion in group 1
p2 = 0.4           # Proportion in group 2
alpha = 0.05       # Significance level
power = 0.8        # Desired power

# Calculate effect size (Cohen's h)
# This uses the arcsine transformation
h = proportion_effectsize(p1, p2)
print(f"Cohen's h = {h:.4f}")

# Calculate sample size per group
sample_size = zt_ind_solve_power(
    effect_size=h,
    alpha=alpha,
    power=power,
    ratio=1.0,  # Equal group sizes
    alternative='two-sided'
)

print(f"Sample size per group: {np.ceil(sample_size):.0f}")

# Alternative: Using NormalIndPower (another approach)
from statsmodels.stats.power import NormalIndPower
analysis = NormalIndPower()
sample_size_alt = analysis.solve_power(
    effect_size=h,
    alpha=alpha,
    power=power,
    ratio=1.0,
    alternative='two-sided'
)
print(f"Sample size (alternative method): {np.ceil(sample_size_alt):.0f}")

Output:

Cohen's h = 0.4027
Sample size per group: 97
Sample size (alternative method): 97

R implementation

R
library(tidyverse)
library(pwr)

# Parameters
p1 <- 0.6          # Proportion in group 1
p2 <- 0.4          # Proportion in group 2
sig_level <- 0.05  # Significance level (alpha)
power <- 0.8       # Desired power

# Calculate effect size (Cohen's h)
h <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
print(str_glue("Cohen's h = {round(h, 4)}"))

# Calculate sample size
result <- pwr.2p.test(h = h,
                     sig.level = sig_level,
                     power = power)

# Print results
print(result)
print(str_glue("Sample size per group: {ceiling(result$n)}"))

Output:

Cohen's h = 0.4027

     Difference of proportion power calculation for binomial distribution (arcsine transformation) 

              h = 0.4027158
              n = 96.79194
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: same sample sizes

Sample size per group: 97

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power for a proportion test:

Python implementation
Python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Given parameters
p1 = 0.3             # Baseline proportion
p2 = 0.4             # Alternative proportion
alpha = 0.05         # Significance level
sample_size = 100    # Per group

# Calculate Cohen's h
h = proportion_effectsize(p1, p2)

analysis = NormalIndPower()
power = analysis.power(
    effect_size=h,
    nobs1=sample_size,
    alpha=alpha,
    alternative='two-sided'
)

print(f"With n={sample_size} per group, p1={p1}, and p2={p2}:")
print(f"Cohen's h: {h:.4f}")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=100 per group, p1=0.3, and p2=0.4:
Cohen's h: -0.2102
Statistical power: 0.3181 (31.81%)
R implementation
R
library(pwr)
library(stringr)

# Given parameters
p1 <- 0.3             # Baseline proportion
p2 <- 0.4             # Alternative proportion
sig_level <- 0.05     # Significance level
sample_size <- 100    # Per group

# Calculate Cohen's h
h <- ES.h(p1, p2)

# Calculate power
result <- pwr.2p.test(
    h = h,
    n = sample_size,
    sig.level = sig_level,
    alternative = "two.sided"
)

print(str_glue("With n={sample_size} per group, p1={p1}, and p2={p2}:"))
print(str_glue("Cohen's h: {round(h, 4)}"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=100 per group, p1=0.3, and p2=0.4:
Cohen's h: -0.2102
Statistical power: 0.3181 (31.81%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the minimum detectable proportion difference:

Python implementation
Python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize
from scipy.optimize import fsolve
import numpy as np

# Given parameters
p1 = 0.3             # Baseline proportion
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 194    # Per group

# Solve for Cohen's h
analysis = NormalIndPower()
h = analysis.solve_power(
    effect_size=None,
    nobs1=sample_size,
    alpha=alpha,
    power=power,
    alternative='two-sided'
)

# Convert h back to proportions (approximate)
p2 = np.sin((h + 2*np.arcsin(np.sqrt(p1)))/2)**2
effect_size = p2 - p1

print(f"With n={sample_size} per group, p1={p1}, and power={power}:")
print(f"Minimum detectable proportion difference: {effect_size:.4f}")
print(f"This means p2 = {p2:.4f}")

Output:

With n=194 per group, p1=0.3, and power=0.8:
Minimum detectable proportion difference: 0.1366
This means p2 = 0.4366
R implementation
R
library(pwr)
library(stringr)

# Given parameters
p1 <- 0.3             # Baseline proportion
sig_level <- 0.05     # Significance level
power <- 0.8          # Desired power
sample_size <- 194    # Per group

# Calculate minimum detectable Cohen's h
result <- pwr.2p.test(
    n = sample_size,
    sig.level = sig_level,
    power = power,
    alternative = "two.sided"
)

# Convert h back to proportions (approximate)
# h = 2*arcsin(sqrt(p2)) - 2*arcsin(sqrt(p1))
# Solving for p2
h <- result$h
p2 <- (sin(asin(sqrt(p1)) + h/2))^2
effect_size <- p2 - p1

print(str_glue("With n={sample_size} per group, p1={p1}, and power={power}:"))
print(str_glue("Minimum detectable proportion difference: {round(effect_size, 4)}"))
print(str_glue("This means p2 = {round(p2, 4)}"))

Output:

With n=194 per group, p1=0.3, and power=0.8:
Minimum detectable proportion difference: 0.1366
This means p2 = 0.4366

4. One-Way ANOVA

For one-way ANOVA with k groups, the sample size per group is:

n = \frac{(z_{\alpha/2} + z_{\beta})^2 \cdot 2}{f^2 \cdot k}

where:

  • f: Cohen's f effect size = √(η²/(1−η²)) (see the conversion sketch after the guidelines below)
  • k: Number of groups
  • η²: Proportion of variance explained
  • zα/2: Critical value for significance level α (two-tailed)
  • zβ: Critical value for power 1 − β

Note:

This formula provides an approximation of the sample size for a one-way ANOVA. For more accurate results, especially when dealing with small effect sizes or complex designs, it is recommended to use specialized software (e.g., G*Power, R, Python, or our calculator above).

Cohen's f Effect Size Guidelines:

  • Small: f = 0.10
  • Medium: f = 0.25
  • Large: f = 0.40
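
If a pilot study reports η² rather than f, the bullet above gives the conversion directly; a one-line sketch with an illustrative η² value:

Python
import numpy as np

eta_squared = 0.059  # illustrative: variance explained reported by a pilot ANOVA
f = np.sqrt(eta_squared / (1 - eta_squared))  # Cohen's f = sqrt(eta² / (1 - eta²))
print(f"Cohen's f: {f:.3f}")  # about 0.25, a medium effect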

Sample Size Calculation (Given Power & Effect Size)

Let's calculate the sample size needed for a one-way ANOVA with 3 groups, a medium effect size (f = 0.25), 80% power, and α = 0.05:

n = \frac{(1.96 + 0.84)^2 \cdot 2}{(0.25)^2 \cdot 3} = \frac{7.84 \cdot 2}{0.0625 \cdot 3} = \frac{15.68}{0.1875} = 83.63 \approx 84

The approximation therefore suggests about 84 subjects per group (252 in total). As the note above warns, this formula overshoots here: the exact F-test calculations below require only 53 per group.

Python implementation

Python
from statsmodels.stats.power import FTestAnovaPower
import numpy as np

# Parameters
k = 3              # Number of groups
f = 0.25           # Cohen's f (medium effect size)
alpha = 0.05       # Significance level
power = 0.8        # Desired power

# Create power analysis object
analysis = FTestAnovaPower()

# Calculate TOTAL sample size first (statsmodels returns total, not per-group)
total_sample_size = analysis.solve_power(
    effect_size=f,
    nobs=None,
    alpha=alpha,
    power=power,
    k_groups=k
)

# Handle array conversion (statsmodels may return array)
if isinstance(total_sample_size, np.ndarray):
    total_sample_size = float(total_sample_size.item())
else:
    total_sample_size = float(total_sample_size)

# Divide by k to get per-group size
per_group = total_sample_size / k

print(f"Sample size per group: {np.ceil(per_group):.0f}")
print(f"Total sample size: {np.ceil(total_sample_size):.0f}")

# Also calculate what power we'd have with different sample sizes
for n_per_group in [30, 50, 70, 100]:
    # Note: nobs must be TOTAL sample size (per-group × k)
    pwr = analysis.power(
        effect_size=f,
        nobs=n_per_group * k,
        alpha=alpha,
        k_groups=k
    )
    print(f"Power with n={n_per_group} per group: {pwr:.3f} ({pwr*100:.1f}%)")

Output:

Sample size per group: 53
Total sample size: 158

Power with n=30 per group: 0.540 (54.0%)
Power with n=50 per group: 0.780 (78.0%)
Power with n=70 per group: 0.907 (90.7%)
Power with n=100 per group: 0.978 (97.8%)

R implementation

R
library(pwr)

# Parameters
groups <- 3         # Number of groups
f <- 0.25           # Cohen's f (medium effect size)
sig_level <- 0.05   # Significance level (alpha)
power <- 0.8        # Desired power

# Calculate sample size
result <- pwr.anova.test(k = groups,
                         f = f,
                         sig.level = sig_level,
                         power = power)

# Print results
print(result)
print(paste("Total sample size:", ceiling(result$n) * groups))

Output:

     Balanced one-way analysis of variance power calculation 

              k = 3
              n = 52.3966
              f = 0.25
      sig.level = 0.05
          power = 0.8

NOTE: n is number in each group

Total sample size: 159

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power for one-way ANOVA:

Python implementation
Python
from statsmodels.stats.power import FTestAnovaPower

# Given parameters
k = 3                # Number of groups
f = 0.25             # Cohen's f effect size
alpha = 0.05         # Significance level
sample_size = 60     # Per group

analysis = FTestAnovaPower()
# Note: nobs must be TOTAL sample size
power = analysis.power(
    effect_size=f,
    nobs=sample_size * k,
    alpha=alpha,
    k_groups=k
)

print(f"With n={sample_size} per group (total={sample_size*k}), k={k}, and f={f}:")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=60 per group (total=180), k=3, and f=0.25:
Statistical power: 0.8546 (85.46%)
R implementation
R
library(pwr)

# Given parameters
k <- 3                # Number of groups
f <- 0.25             # Cohen's f effect size
sig_level <- 0.05     # Significance level
sample_size <- 60     # Per group

# Calculate power
result <- pwr.anova.test(
    k = k,
    n = sample_size,
    f = f,
    sig.level = sig_level
)

print(str_glue("With n={sample_size} per group (total={sample_size*k}), k={k}, and f={f}:"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=60 per group (total=180), k=3, and f=0.25:
Statistical power: 0.8546 (85.46%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the minimum detectable Cohen's f for ANOVA:

Python implementation
Python
from statsmodels.stats.power import FTestAnovaPower

# Given parameters
k = 3                # Number of groups
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 53     # Per group

analysis = FTestAnovaPower()
# Solve for effect size (note: nobs is TOTAL sample size)
f = analysis.solve_power(
    effect_size=None,
    nobs=sample_size * k,
    alpha=alpha,
    power=power,
    k_groups=k
)

print(f"With n={sample_size} per group (total={sample_size*k}), k={k}, and power={power}:")
print(f"Minimum detectable effect size (Cohen's f): {f:.4f}")

Output:

With n=53 per group (total=159), k=3, and power=0.8:
Minimum detectable effect size (Cohen's f): 0.2485
R implementation
R
library(pwr)

# Given parameters
k <- 3                # Number of groups
sig_level <- 0.05     # Significance level
power <- 0.8          # Desired power
sample_size <- 53     # Per group

# Calculate minimum detectable effect size
result <- pwr.anova.test(
    k = k,
    n = sample_size,
    sig.level = sig_level,
    power = power
)

print(str_glue("With n={sample_size} per group (total={sample_size*k}), k={k}, and power={power}:"))
print(str_glue("Minimum detectable effect size (Cohen's f): {round(result$f, 4)}"))

Output:

With n=53 per group (total=159), k=3, and power=0.8:
Minimum detectable effect size (Cohen's f): 0.2485

5. Chi-Square Test

The chi-square test is used for categorical data to test independence in contingency tables or goodness-of-fit. Sample size calculations use Cohen's w effect size.

For a chi-square test, the sample size is calculated using:

n = \frac{(z_{\alpha/2} + z_{\beta})^2}{w^2}

where:

  • w: Cohen's w effect size (computed from cell proportions; see the sketch after the guidelines below)
  • zα/2: Critical value for significance level α (two-tailed)
  • zβ: Critical value for power 1 − β

Cohen's w Effect Size Guidelines:

  • Small effect: w = 0.1
  • Medium effect: w = 0.3
  • Large effect: w = 0.5
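
Cohen's w can be computed directly from the cell proportions you expect under the null and alternative hypotheses: w = √(Σ (p₁ᵢ − p₀ᵢ)² / p₀ᵢ). A short sketch with illustrative proportions for a 2×2 table:

Python
import numpy as np

# Illustrative 2x2 table flattened to four cells
p0 = np.array([0.25, 0.25, 0.25, 0.25])  # proportions expected under H0 (independence)
p1 = np.array([0.35, 0.15, 0.15, 0.35])  # proportions hypothesized under H1

w = np.sqrt(np.sum((p1 - p0) ** 2 / p0))  # Cohen's w
print(f"Cohen's w: {w:.3f}")  # 0.400 here: a medium-to-large effect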

Sample Size Calculation (Given Power & Effect Size)

Calculate the required sample size for a chi-square test with 1 degree of freedom (2x2 contingency table), medium effect size (w = 0.3), 80% power, and α = 0.05:

Python implementation
Python
from statsmodels.stats.power import GofChisquarePower
import numpy as np

# Parameters
df = 1               # Degrees of freedom
w = 0.3              # Cohen's w (medium effect size)
alpha = 0.05         # Significance level
power = 0.8          # Desired power

# Create power analysis object
analysis = GofChisquarePower()

# Calculate sample size
n = analysis.solve_power(
    effect_size=w,
    nobs=None,
    alpha=alpha,
    power=power,
    n_bins=df + 1    # number of categories = df + 1
)

print(f"Sample size needed: {np.ceil(n):.0f}")

# Verify the power with this sample size
power_check = analysis.power(
    effect_size=w,
    nobs=np.ceil(n),
    alpha=alpha,
    n_bins=df + 1
)
print(f"Achieved power: {power_check:.4f} ({power_check*100:.2f}%)")

Output:

Sample size needed: 88
Achieved power: 0.8013 (80.13%)
R implementation
R
library(pwr)
library(stringr)

# Parameters
df <- 1              # Degrees of freedom
w <- 0.3             # Cohen's w (medium effect size)
sig_level <- 0.05    # Significance level
power <- 0.8         # Desired power

# Calculate sample size
result <- pwr.chisq.test(
    w = w,
    df = df,
    sig.level = sig_level,
    power = power
)

print(result)
print(str_glue("Sample size needed: {ceiling(result$N)}"))

Output:

     Chi square power calculation

              w = 0.3
              N = 87.20854
             df = 1
      sig.level = 0.05
          power = 0.8

NOTE: N is the number of observations

Sample size needed: 88

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power with n = 100, df = 1, and w = 0.3:

Python implementation
Python
from statsmodels.stats.power import GofChisquarePower

# Given parameters
df = 1               # Degrees of freedom
w = 0.3              # Cohen's w effect size
alpha = 0.05         # Significance level
sample_size = 100    # Total sample size

analysis = GofChisquarePower()
power = analysis.power(
    effect_size=w,
    nobs=sample_size,
    alpha=alpha,
    n_bins=df + 1
)

print(f"With n={sample_size}, df={df}, and w={w}:")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=100, df=1, and w=0.3:
Statistical power: 0.8508 (85.08%)
R implementation
R
library(pwr)
library(stringr)

# Given parameters
df <- 1              # Degrees of freedom
w <- 0.3             # Cohen's w effect size
sig_level <- 0.05    # Significance level
sample_size <- 100   # Total sample size

# Calculate power
result <- pwr.chisq.test(
    w = w,
    N = sample_size,
    df = df,
    sig.level = sig_level
)

print(str_glue("With n={sample_size}, df={df}, and w={w}:"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=100, df=1, and w=0.3:
Statistical power: 0.8508 (85.08%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the minimum detectable Cohen's w with n = 88, df = 1, and 80% power:

Python implementation
Python
from statsmodels.stats.power import GofChisquarePower

# Given parameters
df = 1               # Degrees of freedom
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 88     # Total sample size

analysis = GofChisquarePower()
w = analysis.solve_power(
    effect_size=None,
    nobs=sample_size,
    alpha=alpha,
    power=power,
    n_bins=df + 1
)

print(f"With n={sample_size}, df={df}, and power={power}:")
print(f"Minimum detectable effect size (Cohen's w): {w:.4f}")

Output:

With n=88, df=1, and power=0.8:
Minimum detectable effect size (Cohen's w): 0.2994
R implementation
R
library(pwr)
library(stringr)

# Given parameters
df <- 1              # Degrees of freedom
sig_level <- 0.05    # Significance level
power <- 0.8         # Desired power
sample_size <- 88    # Total sample size

# Calculate minimum detectable effect size
result <- pwr.chisq.test(
    N = sample_size,
    df = df,
    sig.level = sig_level,
    power = power
)

print(str_glue("With n={sample_size}, df={df}, and power={power}:"))
print(str_glue("Minimum detectable effect size (Cohen's w): {round(result$w, 4)}"))

Output:

With n=88, df=1, and power=0.8:
Minimum detectable effect size (Cohen's w): 0.2987

Note: The slight difference between Python (0.2994) and R (0.2987) results is normal and expected. Different packages use different numerical algorithms and convergence criteria. Both values are practically equivalent.

6. Multiple Regression

Multiple regression models the relationship between a dependent variable and multiple independent variables. Sample size calculations use Cohen's f² effect size (proportion of variance explained).

For multiple regression, the required sample size is:

n = \frac{(z_{\alpha/2} + z_{\beta})^2}{f^2} + k + 1

where:

  • f²: Cohen's f² effect size (or use Cohen's f = √f²)
  • k: Number of predictors
  • zα/2: Critical value for significance level α
  • zβ: Critical value for power 1 − β

Cohen's f² Effect Size Guidelines:

  • Small effect: f² = 0.02 (2% variance explained)
  • Medium effect: f² = 0.15 (15% variance explained)
  • Large effect: f² = 0.35 (35% variance explained)

Note: Relationship to R²: f² = R² / (1 − R²)
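
For instance, a model expected to explain about 13% of the variance corresponds to a medium effect; a quick sketch of the conversion:

Python
r_squared = 0.13  # illustrative: expected proportion of variance explained
f_squared = r_squared / (1 - r_squared)  # Cohen's f² = R² / (1 - R²)
print(f"Cohen's f²: {f_squared:.3f}")  # about 0.149, a medium effect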

Sample Size Calculation (Given Power & Effect Size)

Calculate the required sample size for a multiple regression with 5 predictors, medium effect size (f² = 0.15), 80% power, and α = 0.05:

Python implementation
Python
from statsmodels.stats.power import FTestAnovaPower
import numpy as np

# Parameters
num_predictors = 5   # Number of independent variables
f_squared = 0.15     # Cohen's f² (medium effect size)
alpha = 0.05         # Significance level
power = 0.8          # Desired power

# Convert f² to f
cohens_f = np.sqrt(f_squared)

# Create power analysis object
analysis = FTestAnovaPower()

# For multiple regression: k_groups = num_predictors + 1
k_groups = num_predictors + 1

# Solve for required sample size
nobs_result = analysis.solve_power(
    effect_size=cohens_f,
    nobs=None,  # This is what we want to solve for
    alpha=alpha,
    power=power,
    k_groups=k_groups
)

# Round up to get total sample size
total_n = int(np.ceil(nobs_result))

print(f"Cohen's f: {cohens_f:.4f}")
print(f"Sample size needed: {total_n}")
print(f"(Minimum 10-15 observations per predictor recommended)")

# Verify the power with this sample size
power_check = analysis.power(
    effect_size=cohens_f,
    nobs=total_n,
    alpha=alpha,
    k_groups=k_groups
)
print(f"Achieved power: {power_check:.4f} ({power_check*100:.2f}%)")

Output:

Cohen's f: 0.3873
Sample size needed: 92
(Minimum 10-15 observations per predictor recommended)
Achieved power: 0.8042 (80.42%)
R implementation
R
library(pwr)
library(stringr)

# Parameters
num_predictors <- 5  # Number of independent variables
f_squared <- 0.15    # Cohen's f² (medium effect size)
sig_level <- 0.05    # Significance level
power <- 0.8         # Desired power

# Convert f² to f
cohens_f <- sqrt(f_squared)

# Calculate sample size using pwr.f2.test
# u = numerator df = number of predictors
# v = denominator df = n - u - 1
result <- pwr.f2.test(
    u = num_predictors,
    f2 = f_squared,
    sig.level = sig_level,
    power = power
)

# Total sample size = v + u + 1
total_n <- ceiling(result$v + num_predictors + 1)

print(result)
print(str_glue("Cohen's f: {round(cohens_f, 4)}"))
print(str_glue("Sample size needed: {total_n}"))

Output:

     Multiple regression power calculation 

              u = 5
              v = 85.21369
             f2 = 0.15
      sig.level = 0.05
          power = 0.8
          
Cohen's f: 0.3873
Sample size needed: 92

Power Calculation (Given Sample Size & Effect Size)

Calculate the statistical power with n = 100, 5 predictors, and f² = 0.15:

Python implementation
Python
from statsmodels.stats.power import FTestAnovaPower
import numpy as np

# Given parameters
num_predictors = 5   # Number of predictors
f_squared = 0.15     # Cohen's f² effect size
alpha = 0.05         # Significance level
sample_size = 100    # Total sample size

# Convert f² to f
cohens_f = np.sqrt(f_squared)

# Create power analysis object
analysis = FTestAnovaPower()

# For multiple regression: k_groups = num_predictors + 1
k_groups = num_predictors + 1

# Calculate power
power = analysis.power(
    effect_size=cohens_f,
    nobs=sample_size,
    alpha=alpha,
    k_groups=k_groups
)

print(f"With n={sample_size}, k={num_predictors}, and f²={f_squared}:")
print(f"Statistical power: {power:.4f} ({power*100:.2f}%)")

Output:

With n=100, k=5, and f²=0.15:
Statistical power: 0.8430 (84.30%)
R implementation
R
library(pwr)
library(stringr)

# Given parameters
num_predictors <- 5  # Number of predictors
f_squared <- 0.15    # Cohen's f² effect size
sig_level <- 0.05    # Significance level
sample_size <- 100   # Total sample size

# Calculate v (denominator df)
v <- sample_size - num_predictors - 1

# Calculate power
result <- pwr.f2.test(
    u = num_predictors,
    v = v,
    f2 = f_squared,
    sig.level = sig_level
)

print(str_glue("With n={sample_size}, k={num_predictors}, and f²={f_squared}:"))
print(str_glue("Statistical power: {round(result$power, 4)} ({round(result$power*100, 2)}%)"))

Output:

With n=100, k=5, and f²=0.15:
Statistical power: 0.8489 (84.89%)

Minimum Detectable Effect Size (Given Sample Size & Power)

Calculate the minimum detectable Cohen's f² with n = 92, 5 predictors, and 80% power:

Python implementation
Python
from statsmodels.stats.power import FTestAnovaPower
import numpy as np

# Given parameters
num_predictors = 5   # Number of predictors
alpha = 0.05         # Significance level
power = 0.8          # Desired power
sample_size = 92     # Total sample size

# Create power analysis object
analysis = FTestAnovaPower()

# For multiple regression: k_groups = num_predictors + 1
k_groups = num_predictors + 1

# Solve for Cohen's f (effect size)
cohens_f = analysis.solve_power(
    effect_size=None,  # This is what we want to solve for
    nobs=sample_size,
    alpha=alpha,
    power=power,
    k_groups=k_groups
)

# Handle array conversion (statsmodels may return array)
if isinstance(cohens_f, np.ndarray):
    cohens_f = float(cohens_f.item())
else:
    cohens_f = float(cohens_f)

# Convert to f²
f_squared = cohens_f ** 2

print(f"With n={sample_size}, k={num_predictors}, and power={power}:")
print(f"Minimum detectable effect size (Cohen's f²): {f_squared:.4f}")
print(f"Minimum detectable effect size (Cohen's f): {cohens_f:.4f}")

Output:

With n=92, k=5, and power=0.8:
Minimum detectable effect size (Cohen's f²): 0.1486
Minimum detectable effect size (Cohen's f): 0.3855
R implementation
R
library(pwr)
library(stringr)

# Given parameters
num_predictors <- 5  # Number of predictors
sig_level <- 0.05    # Significance level
power <- 0.8         # Desired power
sample_size <- 92    # Total sample size

# Calculate v (denominator df)
v <- sample_size - num_predictors - 1

# Calculate minimum detectable effect size
result <- pwr.f2.test(
    u = num_predictors,
    v = v,
    sig.level = sig_level,
    power = power
)

# Convert f to f²
cohens_f <- sqrt(result$f2)

print(str_glue("With n={sample_size}, k={num_predictors}, and power={power}:"))
print(str_glue("Minimum detectable effect size (Cohen's f²): {round(result$f2, 4)}"))
print(str_glue("Minimum detectable effect size (Cohen's f): {round(cohens_f, 4)}"))

Output:

With n=92, k=5, and power=0.8:
Minimum detectable effect size (Cohen's f²): 0.1486
Minimum detectable effect size (Cohen's f): 0.3855