This calculator performs comprehensive Exploratory Factor Analysis (EFA), a statistical method used to identify underlying latent factors that explain patterns of correlations among observed variables. EFA is widely used in psychology, social sciences, and market research to understand construct validity and reduce data complexity.
💡 Pro Tip: Use KMO (> 0.5) and Bartlett's test (p < .05) to verify your data is suitable for factor analysis. Choose varimax rotation for independent factors or promax for correlated factors. For dimensionality reduction, consider Principal Component Analysis.
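In code, this pre-check can be wrapped into a single gate. A minimal sketch, assuming the Python factor_analyzer package and a numeric pandas DataFrame; the function name and thresholds are illustrative, not a library API:

```python
from factor_analyzer import calculate_kmo, calculate_bartlett_sphericity

def suitable_for_efa(df, kmo_min=0.5, alpha=0.05):
    """Illustrative gate: passes if the overall KMO exceeds kmo_min and
    Bartlett's test rejects the hypothesis that the correlation matrix
    is an identity matrix."""
    _, kmo_overall = calculate_kmo(df)  # per-item KMOs are also returned
    _, p_value = calculate_bartlett_sphericity(df)
    return kmo_overall > kmo_min and p_value < alpha
```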
While factor analysis follows established statistical theory, different software packages may produce slightly different results due to implementation details:
For example, R (psych::fa()) and Python (factor_analyzer) typically differ by less than 0.01 in factor loadings.

Ready to explore latent factors in your data? Load the example dataset (psychological test scores) to see EFA in action, or upload your own data to discover the underlying structure in your variables.
Exploratory Factor Analysis (EFA) is a multivariate statistical technique used to identify underlying latent factors that explain the pattern of correlations among a set of observed variables. Unlike confirmatory factor analysis, EFA does not require a priori hypotheses about the factor structure and is used for theory development and scale construction.
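Underlying this is the common factor model: each standardized observed variable is written as a weighted sum of the common factors plus a unique term. In conventional notation (the symbols here are the standard ones, not notation taken from this page):

```latex
x_i = \sum_{j=1}^{m} \lambda_{ij} F_j + \varepsilon_i,
\qquad
h_i^2 = \sum_{j=1}^{m} \lambda_{ij}^2 \quad \text{(orthogonal factors)}
```

where \(\lambda_{ij}\) is the loading of variable \(x_i\) on factor \(F_j\), \(\varepsilon_i\) is the unique factor, and the communality \(h_i^2\) is the share of the variable's variance explained by the common factors.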
Use Factor Analysis when you want to identify the latent constructs underlying a set of observed variables, develop or refine measurement scales, assess construct validity, or reduce many correlated variables to a smaller number of interpretable factors. The table below contrasts EFA with its two closest relatives:
| Aspect | EFA | CFA | PCA |
|---|---|---|---|
| Purpose | Seeks to explain correlations among variables using underlying latent factors. Separates shared variance from unique variance. | Tests a pre-specified factor structure based on theory. Confirms hypotheses about relationships between observed variables and latent factors. | Focuses on explaining total variance and creating orthogonal components. Does not distinguish shared from unique variance. |
| Approach | Exploratory - discovers underlying structure without prior hypotheses | Confirmatory - tests specific hypothesized factor structures | Descriptive - reduces dimensionality for data simplification |
| When to Use | Theory building, scale development, understanding construct validity | Theory testing, validating measurement models, assessing model fit to data | Data reduction, feature extraction, eliminating multicollinearity |
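The shared-versus-total variance distinction in the table can be made concrete. Here is a sketch, assuming scikit-learn and factor_analyzer are installed; the toy data is generated purely for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from factor_analyzer import FactorAnalyzer

# Toy data: two blocks of variables driven by two latent factors plus noise.
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 200))
data = pd.DataFrame({
    'a1': f1 + rng.normal(scale=0.5, size=200),
    'a2': f1 + rng.normal(scale=0.5, size=200),
    'b1': f2 + rng.normal(scale=0.5, size=200),
    'b2': f2 + rng.normal(scale=0.5, size=200),
})

# PCA: proportion of *total* variance captured by each component.
pca = PCA(n_components=2).fit(data)
print(pca.explained_variance_ratio_)

# EFA: communalities report only the *shared* variance of each variable;
# the unique (noise) variance is left out of the factor solution.
efa = FactorAnalyzer(n_factors=2, rotation='varimax').fit(data)
print(efa.get_communalities())
```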
Factor loadings represent the correlation between each variable and the factor (for orthogonal rotations; with oblique rotations, pattern loadings are partial regression weights instead). As a general guideline, loadings of 0.70 or above indicate a strong relationship, loadings between roughly 0.40 and 0.70 a moderate one, and loadings below about 0.30 are usually suppressed when interpreting the factor structure (the examples below use a cutoff of 0.3).
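A small, hypothetical helper (not part of any library) that mirrors what psych's print(..., cutoff = 0.3) does, blanking out sub-threshold loadings for readability:

```python
import pandas as pd

def trim_loadings(loadings: pd.DataFrame, cutoff: float = 0.3) -> pd.DataFrame:
    """Blank out loadings whose absolute value falls below `cutoff` (illustrative)."""
    return loadings.where(loadings.abs() >= cutoff).round(3)
```

Applied to the loadings table produced in the Python example below, this leaves only the salient variable-factor relationships visible.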
Choose an oblique rotation (such as promax) when you expect the factors to be correlated, which is common in the social sciences; choose an orthogonal rotation (such as varimax) when theoretical independence of the factors is important.
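To check whether an oblique solution is warranted, you can inspect the factor correlations it produces. A sketch using the same psychological test data as the full examples below (factor_analyzer stores the factor correlation matrix in phi_ after an oblique rotation):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Same psychological test scores as in the full Python example below.
data = pd.DataFrame({
    'verbal':    [65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73],
    'numerical': [62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70],
    'logical':   [68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77],
    'spatial':   [58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69],
    'memory':    [70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81],
})

fa_oblique = FactorAnalyzer(n_factors=2, rotation='promax')
fa_oblique.fit(data)

# Factor correlation matrix; entries near zero suggest an orthogonal
# rotation (varimax) would describe the data just as well.
print(fa_oblique.phi_)
```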
library(psych)
library(tidyverse)
# Psychological test scores data
data <- tibble(
  verbal    = c(65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73),
  numerical = c(62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70),
  logical   = c(68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77),
  spatial   = c(58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69),
  memory    = c(70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81)
)
# Check sampling adequacy
KMO(data)
# Bartlett's test
cortest.bartlett(cor(data), n = nrow(data))
# Perform EFA with varimax rotation
fa_result <- fa(data, nfactors = 2, rotate = "varimax", fm = "ml")
# View results
print(fa_result, digits = 3)
# Loadings
print(fa_result$loadings, cutoff = 0.3)
# Communalities
fa_result$communality
# Scree plot
scree(data, main = "Scree Plot")
# Factor scores
head(fa_result$scores)

The same analysis in Python:

import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer, calculate_kmo, calculate_bartlett_sphericity
import matplotlib.pyplot as plt
# Psychological test scores data
data = pd.DataFrame({
    'verbal':    [65, 72, 58, 68, 75, 62, 70, 66, 71, 69, 63, 74, 60, 67, 73],
    'numerical': [62, 68, 55, 65, 71, 60, 68, 63, 70, 66, 61, 72, 58, 64, 70],
    'logical':   [68, 74, 60, 70, 78, 65, 73, 69, 75, 71, 66, 76, 62, 69, 77],
    'spatial':   [58, 65, 52, 62, 68, 56, 64, 60, 66, 63, 58, 67, 54, 61, 69],
    'memory':    [70, 76, 63, 72, 80, 68, 75, 71, 77, 73, 69, 79, 65, 72, 81]
})
# KMO test
kmo_all, kmo_model = calculate_kmo(data)
print(f"KMO: {kmo_model:.3f}")
# Bartlett's test
chi_square_value, p_value = calculate_bartlett_sphericity(data)
print(f"Bartlett's test: χ² = {chi_square_value:.2f}, p = {p_value:.4f}")
# Determine number of factors (scree plot)
fa = FactorAnalyzer(n_factors=len(data.columns), rotation=None)
fa.fit(data)
eigenvalues, _ = fa.get_eigenvalues()
plt.figure(figsize=(8, 5))
plt.plot(range(1, len(eigenvalues) + 1), eigenvalues, 'bo-')
plt.axhline(y=1, color='r', linestyle='--', label='Kaiser criterion')
plt.xlabel('Factor')
plt.ylabel('Eigenvalue')
plt.title('Scree Plot')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
# Perform factor analysis with varimax rotation
fa = FactorAnalyzer(n_factors=2, rotation='varimax', method='ml')
fa.fit(data)
# Loadings
loadings = pd.DataFrame(
    fa.loadings_,
    index=data.columns,
    columns=['Factor1', 'Factor2']
)
print("\nFactor Loadings:")
print(loadings)
# Communalities
communalities = fa.get_communalities()
print("\nCommunalities:")
print(pd.Series(communalities, index=data.columns))
# Variance explained
variance = fa.get_factor_variance()
print("\nVariance Explained:")
print(pd.DataFrame(variance,
                   index=['SS Loadings', 'Proportion Var', 'Cumulative Var'],
                   columns=['Factor1', 'Factor2']))
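The R example ends by extracting factor scores; the factor_analyzer equivalent is transform(), continuing from the fitted fa object above:

```python
# Factor scores for each observation (continues the Python example above).
scores = pd.DataFrame(fa.transform(data), columns=['Factor1', 'Factor2'])
print("\nFactor Scores (first rows):")
print(scores.head())
```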