This calculator performs comprehensive Exploratory Factor Analysis (EFA), a statistical method used to identify underlying latent factors that explain patterns of correlations among observed variables. EFA is widely used in psychology, social sciences, and market research to understand construct validity and reduce data complexity.
💡 Pro Tip: Use KMO (> 0.5) and Bartlett's test (p < .05) to verify your data is suitable for factor analysis. Choose varimax rotation for independent factors or promax for correlated factors. For dimensionality reduction, consider Principal Component Analysis.
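In code, this pre-check can be wrapped into a single gate. A minimal sketch, assuming the Python factor_analyzer package and a numeric pandas DataFrame; the function name and thresholds are illustrative, not a library API:

```python
from factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

def suitable_for_efa(df, kmo_min=0.5, alpha=0.05):
    """Illustrative gate: passes if the overall KMO exceeds kmo_min and
    Bartlett's test rejects the hypothesis that the correlation matrix
    is an identity matrix."""
    _, kmo_overall = calculate_kmo(df)  # per-item KMOs are also returned
    _, p_value = calculate_bartlett_sphericity(df)
    return kmo_overall > kmo_min and p_value < alpha
```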
While factor analysis follows established statistical theory, different software packages may produce slightly different results due to implementation details:
For example, R (psych::fa()) and Python (factor_analyzer) typically differ by less than 0.01 in factor loadings.

Ready to explore latent factors in your data? Load the example dataset (psychological test scores) to see EFA in action, or upload your own data to discover the underlying structure in your variables.
Exploratory Factor Analysis (EFA) is a multivariate statistical technique used to identify underlying latent factors that explain the pattern of correlations among a set of observed variables. Unlike confirmatory factor analysis, EFA does not require a priori hypotheses about the factor structure and is used for theory development and scale construction.
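Underlying this is the common factor model: each standardized observed variable is written as a weighted sum of the common factors plus a unique term. In conventional notation (the symbols here are the standard ones, not notation taken from this page):

```latex
x_i = \sum_{j=1}^{m} \lambda_{ij} F_j + \varepsilon_i,
\qquad
h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2 \quad \text{(orthogonal factors)}
```

where \(\lambda_{ij}\) is the loading of variable \(x_i\) on factor \(F_j\), \(\varepsilon_i\) is the unique factor, and the communality \(h_i^2\) is the share of the variable's variance explained by the common factors.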
Use Factor Analysis when you want to identify the latent constructs underlying a set of observed variables, develop or refine measurement scales, assess construct validity, or reduce many correlated variables to a smaller number of interpretable factors. The table below contrasts EFA with its two closest relatives:
| Aspect | EFA | CFA | PCA |
|---|---|---|---|
| Purpose | Seeks to explain correlations among variables using underlying latent factors. Separates shared variance from unique variance. | Tests a pre-specified factor structure based on theory. Confirms hypotheses about relationships between observed variables and latent factors. | Focuses on explaining total variance and creating orthogonal components. Does not distinguish shared from unique variance. |
| Approach | Exploratory - discovers underlying structure without prior hypotheses | Confirmatory - tests specific hypothesized factor structures | Descriptive - reduces dimensionality for data simplification |
| When to Use | Theory building, scale development, understanding construct validity | Theory testing, validating measurement models, assessing model fit to data | Data reduction, feature extraction, eliminating multicollinearity |
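The shared-versus-total variance distinction in the table can be made concrete. Here is a sketch, assuming scikit-learn and factor_analyzer are installed; the toy data is generated purely for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from factor_analyzer import FactorAnalyzer

# Toy data: two blocks of variables driven by two latent factors plus noise.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 200))
data = pd.DataFrame({
    'a1': f1 + rng.normal(scale=0.5, size=200),
    'a2': f1 + rng.normal(scale=0.5, size=200),
    'b1': f2 + rng.normal(scale=0.5, size=200),
    'b2': f2 + rng.normal(scale=0.5, size=200),
})

# PCA: proportion of *total* variance captured by each component.
pca = PCA(n_components=2).fit(data)
print(pca.explained_variance_ratio_)

# EFA: communalities report only the *shared* variance of each variable;
# the unique (noise) variance is left out of the factor solution.
efa = FactorAnalyzer(n_factors=2, rotation='varimax').fit(data)
print(efa.get_communalities())
```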
Factor loadings represent the correlation between each variable and the factor (for orthogonal rotations; with oblique rotations, pattern loadings are partial regression weights instead). As a general guideline, loadings of 0.70 or above indicate a strong relationship, loadings between roughly 0.40 and 0.70 a moderate one, and loadings below about 0.30 are usually suppressed when interpreting the factor structure (the examples below use a cutoff of 0.3).
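A small, hypothetical helper (not part of any library) that mirrors what psych's print(..., cutoff = 0.3) does, blanking out sub-threshold loadings for readability:

```python
import pandas as pd

def trim_loadings(loadings: pd.DataFrame, cutoff: float = 0.3) -> pd.DataFrame:
    """Blank out loadings whose absolute value falls below `cutoff` (illustrative)."""
    return loadings.where(loadings.abs() >= cutoff).round(3)
```

Applied to the loadings table produced in the Python example below, this leaves only the salient variable-factor relationships visible.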
Choose an oblique rotation (such as promax) when you expect the factors to be correlated, which is common in the social sciences; choose an orthogonal rotation (such as varimax) when theoretical independence of the factors is important.
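To check whether an oblique solution is warranted, you can inspect the factor correlations it produces. A sketch using the same psychological test data as the full examples below (factor_analyzer stores the factor correlation matrix in phi_ after an oblique rotation):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Same psychological test scores as in the full Python example below.
data = pd.DataFrame({
    'verbal':    [65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73],
    'numerical': [62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70],
    'logical':   [68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77],
    'spatial':   [58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69],
    'memory':    [70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81],
})

fa_oblique = FactorAnalyzer(n_factors=2, rotation='promax')
fa_oblique.fit(data)

# Factor correlation matrix; entries near zero suggest an orthogonal
# rotation (varimax) would describe the data just as well.
print(fa_oblique.phi_)
```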
library(psych)
library(tidyverse)
# Psychological test scores data
data <- tibble(
  verbal    = c(65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73),
  numerical = c(62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70),
  logical   = c(68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77),
  spatial   = c(58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69),
  memory    = c(70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81)
)
# Check sampling adequacy
KMO(data)
# Bartlett's test
cortest.bartlett(cor(data), n = nrow(data))
# Perform EFA with varimax rotation
fa_result <- fa(data, nfactors = 2, rotate = "varimax", fm = "ml")
# View results
print(fa_result, digits = 3)
# Loadings
print(fa_result$loadings, cutoff = 0.3)
# Communalities
fa_result$communality
# Scree plot
scree(data, main = "Scree Plot")
# Factor scores
head(fa_result$scores)

The same analysis in Python:

import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer, calculate_kmo, calculate_bartlett_sphericity
import matplotlib.pyplot as plt
# Psychological test scores data
data = pd.DataFrame({
    'verbal':    [65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73],
    'numerical': [62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70],
    'logical':   [68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77],
    'spatial':   [58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69],
    'memory':    [70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81]
})
# KMO test
kmo_all, kmo_model = calculate_kmo(data)
print(f"KMO: {kmo_model:.3f}")
# Bartlett's test
chi_square_value, p_value = calculate_bartlett_sphericity(data)
print(f"Bartlett's test: χ² = {chi_square_value:.2f}, p = {p_value:.4f}")
# Determine number of factors (scree plot)
fa = FactorAnalyzer(n_factors=len(data.columns), rotation=None)
fa.fit(data)
eigenvalues, _ = fa.get_eigenvalues()
plt.figure(figsize=(8, 5))
plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, 'bo-')
plt.axhline(y=1, color='r', linestyle='--', label='Kaiser criterion')
plt.xlabel('Factor')
plt.ylabel('Eigenvalue')
plt.title('Scree Plot')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Perform factor analysis with varimax rotation
fa = FactorAnalyzer(n_factors=2, rotation='varimax', method='ml')
fa.fit(data)
# Loadings
loadings = pd.DataFrame(
    fa.loadings_,
    index=data.columns,
    columns=['Factor1', 'Factor2']
)
print("\nFactor Loadings:")
print(loadings)
# Communalities
communalities = fa.get_communalities()
print("\nCommunalities:")
print(pd.Series(communalities, index=data.columns))
# Variance explained
variance = fa.get_factor_variance()
print("\nVariance Explained:")
print(pd.DataFrame(variance,
                   index=['SS Loadings', 'Proportion Var', 'Cumulative Var'],
                   columns=['Factor1', 'Factor2']))
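The R example ends by extracting factor scores; the factor_analyzer equivalent is transform(), continuing from the fitted fa object above:

```python
# Factor scores for each observation (continues the Python example above).
scores = pd.DataFrame(fa.transform(data), columns=['Factor1', 'Factor2'])
print("\nFactor Scores (first rows):")
print(scores.head())
```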