StatsCalculators.com

Partial Correlation

Created: December 28, 2025
Last Updated: December 28, 2025

This calculator helps you measure the relationship between two variables while controlling for the effect of one or more additional variables. Partial correlation is essential when you want to isolate the direct relationship between two variables by removing the influence of confounding factors. Perfect for research studies, statistical analysis, and understanding complex multivariate relationships.

What You'll Get:

  • Comprehensive Step-by-Step Calculations: Understand how partial correlation removes confounding effects with detailed formulas
  • Multiple Control Variables: Control for one or more variables to isolate the true relationship
  • Comparison with Simple Correlation: See how controlling for variables changes the correlation coefficient
  • Statistical Significance Testing: P-values and confidence assessments for partial correlation
  • Visualization: Scatter plots showing the relationship before and after controlling for confounders
  • APA-Style Report: Professional, publication-ready results you can copy directly into papers or reports

For simple correlation between two variables without controls, use our Correlation Coefficient Calculator. Not sure when to use partial correlation? Try the sample dataset to see how controlling for prior GPA reveals the true relationship between study hours and exam scores: high-GPA students can score well with less studying, so the raw correlation may overstate how much studying alone predicts scores.


Understanding Partial Correlation

Definition

Partial Correlation measures the relationship between two variables while controlling for the effect of one or more additional variables. It helps isolate the direct relationship between variables by removing the influence of confounding factors. The partial correlation coefficient ranges from -1 to +1, just like regular correlation.

Formula

Partial Correlation (controlling for one variable Z):

$$r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ} \cdot r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}$$

where rXY is the correlation between X and Y, rXZ is the correlation between X and Z, and rYZ is the correlation between Y and Z.
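As a quick sanity check, the formula can be applied directly to a set of pairwise correlations. The values below are hypothetical, chosen only for illustration:

```python
import math

# Hypothetical pairwise correlations (illustrative values, not from real data)
r_xy = 0.85  # correlation between X and Y
r_xz = 0.92  # correlation between X and Z
r_yz = 0.90  # correlation between Y and Z

# Apply the partial correlation formula
numerator = r_xy - r_xz * r_yz
denominator = math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
r_xy_given_z = numerator / denominator

print(f"Partial correlation r_XY.Z = {r_xy_given_z:.4f}")
```

Even though the raw correlation is 0.85, the partial correlation here drops to about 0.13, because most of the X–Y association is carried by Z.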

When to Use Partial Correlation

  • When you want to isolate the direct relationship between two variables
  • To control for confounding variables that might influence both variables
  • In research studies to understand pure associations
  • When analyzing complex multivariate relationships
  • To check whether a simple correlation holds up after accounting for a third variable

Interpretation

A partial correlation coefficient represents the strength and direction of the relationship between two variables after removing the linear effect of the control variable(s):

  • If partial correlation ≈ simple correlation: the control variables have little effect
  • If |partial correlation| < |simple correlation|: the control variables explain part of the relationship
  • If partial correlation ≈ 0 but simple correlation ≠ 0: the relationship is spurious (driven by confounders)
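An equivalent way to think about "removing the linear effect" of Z: the partial correlation equals the simple correlation between the residuals of X and Y after each has been regressed on Z. A minimal NumPy sketch (using made-up simulated data) shows the two methods agree:

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up data: z drives both x and y, plus independent noise
z = rng.normal(size=200)
x = 2.0 * z + rng.normal(size=200)
y = 1.5 * z + rng.normal(size=200)

def residuals(a, b):
    """Residuals of a after least-squares regression on b (with intercept)."""
    slope, intercept = np.polyfit(b, a, 1)
    return a - (slope * b + intercept)

# Method A: correlate the residuals
r_partial = np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]

# Method B: plug pairwise correlations into the formula
r_xy = np.corrcoef(x, y)[0, 1]
r_xz = np.corrcoef(x, z)[0, 1]
r_yz = np.corrcoef(y, z)[0, 1]
r_formula = (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"residual method: {r_partial:.4f}, formula: {r_formula:.4f}")
```

Since x and y are linked only through z in this simulation, both methods return a value near zero, and they match each other exactly up to floating-point error.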

Important Considerations

  • Partial correlation still does not imply causation
  • Only controls for linear effects of control variables
  • Requires sufficient sample size (n > number of variables + 20 recommended)
  • Assumes linear relationships between all variables

Practical Example

Let's say we want to examine the relationship between ice cream sales (X) and drowning deaths (Y), controlling for temperature (Z):

Simple Correlation:

rXY = 0.85 (strong positive correlation)

This suggests ice cream sales and drowning deaths are strongly related!

Partial Correlation (controlling for temperature):

rXY·Z = 0.05 (very weak correlation)

After controlling for temperature, the relationship nearly disappears!

Interpretation: The apparent relationship between ice cream sales and drowning deaths is spurious - it's actually driven by temperature. Hot weather increases both ice cream sales (people want cold treats) and drowning deaths (more people swimming). Once we control for temperature, there's no direct relationship between the two variables.
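The ice cream scenario can be reproduced with simulated data. The numbers below are synthetic, chosen only to mimic the pattern described above (temperature drives both variables, with no direct link between them):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Temperature (Z) drives both variables; no direct X-Y link exists
temperature = rng.normal(25, 5, n)
ice_cream_sales = 10 * temperature + rng.normal(0, 20, n)
drowning_deaths = 0.3 * temperature + rng.normal(0, 1.5, n)

def partial_corr(x, y, z):
    """Partial correlation of x and y controlling for z (formula method)."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

r_simple = np.corrcoef(ice_cream_sales, drowning_deaths)[0, 1]
r_partial = partial_corr(ice_cream_sales, drowning_deaths, temperature)

print(f"simple r = {r_simple:.2f}, partial r = {r_partial:.2f}")
```

The simple correlation comes out strongly positive, while the partial correlation collapses toward zero, exactly the signature of a spurious relationship driven by a confounder.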

How to Calculate Partial Correlation in R

Use the pcor.test() function from the ppcor package:

R
# Install and load the ppcor package
install.packages("ppcor")
library(ppcor)

# Load example data (same as sample dataset above)
data <- data.frame(
  study_hours = c(5, 8, 3, 10, 6, 4, 9, 5, 7, 11, 6, 9, 4, 12, 7, 5, 10, 6, 8, 11),
  exam_score = c(72, 85, 68, 92, 75, 70, 88, 73, 80, 95, 78, 90, 71, 96, 82, 74, 91, 77, 86, 94),
  prior_gpa = c(2.8, 3.4, 2.5, 3.8, 3.0, 2.7, 3.5, 2.9, 3.2, 3.9, 3.1, 3.6, 2.6, 3.9, 3.3, 2.8, 3.7, 3.0, 3.4, 3.8),
  sleep_hours = c(6.5, 7.0, 6.0, 7.5, 6.5, 6.0, 7.5, 6.5, 7.0, 8.0, 7.0, 7.5, 6.0, 8.0, 7.0, 6.5, 7.5, 6.5, 7.0, 8.0)
)

# Calculate partial correlation between study_hours and exam_score,
# controlling for prior_gpa
result <- pcor.test(data$study_hours, data$exam_score, data$prior_gpa)

print(result)
# Shows: estimate (partial correlation), p.value, statistic, n

# For multiple control variables:
# pcor(data[, c("study_hours", "exam_score", "prior_gpa", "sleep_hours")])$estimate[1,2]

How to Calculate Partial Correlation in Python

Use pingouin.partial_corr() or calculate manually using correlation matrices:

Python
import pandas as pd
import numpy as np
import pingouin as pg
from scipy import stats

# Load example data (same as sample dataset above)
data = pd.DataFrame({
    'study_hours': [5, 8, 3, 10, 6, 4, 9, 5, 7, 11, 6, 9, 4, 12, 7, 5, 10, 6, 8, 11],
    'exam_score': [72, 85, 68, 92, 75, 70, 88, 73, 80, 95, 78, 90, 71, 96, 82, 74, 91, 77, 86, 94],
    'prior_gpa': [2.8, 3.4, 2.5, 3.8, 3.0, 2.7, 3.5, 2.9, 3.2, 3.9, 3.1, 3.6, 2.6, 3.9, 3.3, 2.8, 3.7, 3.0, 3.4, 3.8],
    'sleep_hours': [6.5, 7.0, 6.0, 7.5, 6.5, 6.0, 7.5, 6.5, 7.0, 8.0, 7.0, 7.5, 6.0, 8.0, 7.0, 6.5, 7.5, 6.5, 7.0, 8.0]
})

# Method 1: Using pingouin
partial_corr = pg.partial_corr(
    data=data,
    x='study_hours',
    y='exam_score',
    covar='prior_gpa'  # or covar=['prior_gpa', 'sleep_hours'] for multiple
)
print(partial_corr)

# Method 2: Manual calculation for one control variable
def partial_correlation(df, x, y, z):
    """Calculate partial correlation between x and y controlling for z"""
    # Calculate simple correlations
    r_xy = df[[x, y]].corr().iloc[0, 1]
    r_xz = df[[x, z]].corr().iloc[0, 1]
    r_yz = df[[y, z]].corr().iloc[0, 1]

    # Calculate partial correlation
    numerator = r_xy - (r_xz * r_yz)
    denominator = np.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    partial_r = numerator / denominator

    # Calculate significance
    n = len(df)
    df_resid = n - 3  # degrees of freedom
    t_stat = partial_r * np.sqrt(df_resid) / np.sqrt(1 - partial_r**2)
    p_value = 2 * (1 - stats.t.cdf(abs(t_stat), df_resid))

    return partial_r, p_value

r, p = partial_correlation(data, 'study_hours', 'exam_score', 'prior_gpa')
print(f"Partial correlation: {r:.4f}, p-value: {p:.4f}")

How to Calculate Partial Correlation in SPSS

Use the PARTIAL CORR command:

* GUI Method:
1. Click Analyze > Correlate > Partial...
2. Move variables X and Y to "Variables:" box
3. Move control variable(s) to "Controlling for:" box
4. Click Options to select significance tests and display options
5. Click OK

* Syntax Method:
PARTIAL CORR
  /VARIABLES=ice_cream_sales drowning_deaths BY temperature
  /SIGNIFICANCE=TWOTAIL
  /MISSING=LISTWISE.
