This calculator performs truncated regression analysis using Maximum Likelihood Estimation (MLE). Truncated regression is used when the sample is drawn from a restricted part of the population — for example, when you only observe wages for employed individuals (left truncation) or test scores below a ceiling (right truncation).
Truncation vs. Censoring: Truncated data means observations outside the truncation point are completely excluded from the sample. If observations are recorded but capped at a limit value, that is censoring — use a Censored Regression (Tobit) Calculator instead.
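The difference can be seen directly in a small numpy sketch (the limit value of 10 and the distribution parameters are illustrative, not from the calculator's data):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=20, scale=10, size=1000)  # latent outcome
limit = 10

# Truncation: observations at or below the limit are dropped entirely
y_truncated = y[y > limit]

# Censoring: every observation is kept, but values are capped at the limit
y_censored = np.maximum(y, limit)

print(len(y_truncated))   # fewer than 1000 rows survive
print(len(y_censored))    # still 1000 rows, but piled up at the limit
```

A truncated sample does not even record how many observations were lost; a censored sample at least records that the capped observations exist.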
Ready to analyze truncated data? Load the sample dataset (wages truncated from below) to see the analysis in action, or upload your own data.
Truncated Regression is a statistical model for data where the dependent variable is only observed within a certain range. Unlike censored data (where values are capped), truncated data means observations outside the range are entirely missing from the sample. Standard OLS regression on truncated data produces biased and inconsistent estimates; truncated regression corrects this using Maximum Likelihood Estimation.
The log-likelihood for left-truncated data (observed when Y > a):

\ln L(\beta, \sigma) = \sum_{i=1}^{n} \left[ \ln \phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \ln \sigma - \ln\!\left(1 - \Phi\!\left(\frac{a - x_i'\beta}{\sigma}\right)\right) \right]

Where:
- \phi is the standard normal density and \Phi its cumulative distribution function
- x_i'\beta is the linear predictor for observation i
- \sigma is the standard deviation of the error term
- a is the left truncation point
# Truncated Regression in R
library(truncreg)
library(tidyverse)
# Example sample data (same structure as this calculator)
data <- tibble(
  wage = c(12.5, 15.3, 18.7, 22.1, 25.4, 28.9, 31.2, 35.6, 38.4, 42.1,
           45.7, 48.3, 52.8, 55.1, 58.9, 62.3, 65.7, 70.2, 74.8, 80.1),
  education = c(10, 12, 12, 14, 14, 16, 16, 16, 18, 18,
                18, 20, 20, 20, 20, 22, 22, 22, 22, 24),
  experience = c(2, 3, 5, 4, 8, 6, 10, 12, 8, 15,
                 18, 10, 14, 20, 22, 12, 16, 25, 28, 15)
)
# Left-truncated at 10
model <- truncreg(wage ~ education + experience,
                  data = data,
                  point = 10,
                  direction = "left")
summary(model)

# Truncated Regression in Python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from scipy.optimize import minimize
# Example sample data (same structure as this calculator)
wage = np.array([
    12.5, 15.3, 18.7, 22.1, 25.4, 28.9, 31.2, 35.6, 38.4, 42.1,
    45.7, 48.3, 52.8, 55.1, 58.9, 62.3, 65.7, 70.2, 74.8, 80.1
])
education = np.array([
    10, 12, 12, 14, 14, 16, 16, 16, 18, 18,
    18, 20, 20, 20, 20, 22, 22, 22, 22, 24
])
experience = np.array([
    2, 3, 5, 4, 8, 6, 10, 12, 8, 15,
    18, 10, 14, 20, 22, 12, 16, 25, 28, 15
])
# Left truncation point
a = 10
X = sm.add_constant(np.column_stack([education, experience]))
y = wage
def neg_log_likelihood(params, X, y, a):
    beta = params[:-1]
    sigma = np.exp(params[-1])  # ensures sigma > 0
    mu = X @ beta
    z = (y - mu) / sigma
    alpha = (a - mu) / sigma
    eps = 1e-12
    log_pdf = stats.norm.logpdf(z) - np.log(sigma)
    log_survival = np.log(np.maximum(1 - stats.norm.cdf(alpha), eps))
    ll = np.sum(log_pdf - log_survival)
    return -ll
# OLS starting values
ols = sm.OLS(y, X).fit()
beta0 = ols.params
sigma0 = np.std(ols.resid, ddof=X.shape[1])
init_params = np.concatenate([beta0, [np.log(max(sigma0, 1e-6))]])
result = minimize(
    neg_log_likelihood,
    init_params,
    args=(X, y, a),
    method='BFGS'
)
beta_hat = result.x[:-1]
sigma_hat = np.exp(result.x[-1])
print('Converged:', result.success)
print('Log-likelihood:', -result.fun)
print('Intercept:', beta_hat[0])
print('Education coef:', beta_hat[1])
print('Experience coef:', beta_hat[2])
print('Sigma:', sigma_hat)