Logistic Regression

Created: April 17, 2025

Last Updated: September 19, 2025

This Logistic Regression Calculator helps you analyze binary outcome data and make classifications or predictions. It fits the model $P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p)}}$ to your data and reports model coefficients, odds ratios, and performance metrics. Logistic regression is widely used in fields such as medicine (disease diagnosis), marketing (customer conversion), and finance (credit scoring). The calculator supports both simple and multiple logistic regression, with one or more predictor variables.

Learn More

Definition

Logistic regression is a statistical method for modeling the probability of a binary outcome from one or more predictor variables. Unlike linear regression, which models the outcome directly, logistic regression models the log-odds of the event as a linear combination of the predictors, which keeps the predicted probabilities between 0 and 1.

Key Formulas for One Predictor

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x

Odds Ratio:

\text{OR} = e^{\beta_1}

Decision Boundary (for classification):

x_{\text{cutoff}} = \frac{-\beta_0 - \log\left(\frac{1-c}{c}\right)}{\beta_1}

where c is the probability cutoff (typically 0.5)
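The one-predictor formulas above can be checked numerically. The following is a minimal Python sketch, not the calculator's own implementation; the function names and the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def predict_probability(x, beta0, beta1):
    """P(y=1) for a single predictor x under the logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def log_odds(p):
    """Logit transformation: log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def cutoff_x(beta0, beta1, c=0.5):
    """Value of x at which the predicted probability equals the cutoff c."""
    return (-beta0 - math.log((1.0 - c) / c)) / beta1

# Hypothetical coefficients, chosen only for illustration
beta0, beta1 = -3.0, 0.5

# The logit of the predicted probability recovers beta0 + beta1 * x
p = predict_probability(4.0, beta0, beta1)
print("log-odds at x = 4:", round(log_odds(p), 6))

# The odds ratio for a one-unit increase in x
print("odds ratio:", round(math.exp(beta1), 4))

# At the cutoff value of x, the predicted probability equals c
x_star = cutoff_x(beta0, beta1, c=0.7)
print("P at cutoff x:", round(predict_probability(x_star, beta0, beta1), 6))
```

With c = 0.5 the logarithm term vanishes, so the cutoff reduces to -β₀/β₁.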

Key Formulas for Multiple Predictors

Logistic Model (Probability):

P(y=1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p)}}

Logit Transformation (Log-odds):

\log\left(\frac{P(y=1)}{1 - P(y=1)}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p

Odds Ratio:

\text{OR}_i = e^{\beta_i}

For the i-th predictor, $\text{OR}_i$ is the factor by which the odds are multiplied when $x_i$ increases by one unit, holding the other predictors constant

Decision Boundary (for classification):

\beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_p x_p + \log\left(\frac{1-c}{c}\right) = 0

where c is the probability cutoff (typically 0.5)

\beta_0 + \sum_{i=1}^{p} \beta_i x_i = 0

(simplified for c = 0.5)
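The decision boundary says that comparing the predicted probability with the cutoff c is the same as comparing the linear predictor with log(c / (1 - c)). The sketch below illustrates that equivalence in plain Python; the coefficients and data points are hypothetical, and the function names are not part of any library.

```python
import math

def linear_predictor(x, beta0, betas):
    """beta0 + beta1*x1 + ... + betap*xp for one observation x."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def predict_probability(x, beta0, betas):
    """P(y=1) under the multiple logistic regression model."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(x, beta0, betas)))

def classify(x, beta0, betas, c=0.5):
    """Predict 1 when the linear predictor exceeds log(c / (1 - c))."""
    threshold = math.log(c / (1.0 - c))
    return 1 if linear_predictor(x, beta0, betas) > threshold else 0

# Hypothetical coefficients and observations, for illustration only
beta0, betas = -4.0, [0.8, -0.5]
points = [(2, 1), (6, 1), (5, 3), (8, 0)]

for c in (0.3, 0.5, 0.8):
    by_boundary = [classify(x, beta0, betas, c) for x in points]
    by_probability = [1 if predict_probability(x, beta0, betas) > c else 0
                      for x in points]
    print(c, by_boundary, by_boundary == by_probability)
```

Raising the cutoff moves the boundary so that fewer observations are classified as 1.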

Key Assumptions

Binary outcome: The dependent variable is binary (0/1, success/failure, yes/no)

Independence: Observations are independent

No multicollinearity: Predictor variables are not highly correlated (for multiple logistic regression)

Linearity in the logit: The log-odds has a linear relationship with the predictor variables

Large sample size: Sufficient data to provide reliable estimates
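As a rough screen for the multicollinearity assumption, one can look at the correlation between pairs of predictors. The sketch below computes a Pearson correlation in plain Python for the exam_score and gpa values from the sample data used in the code sections of this page. The helper name `pearson_r` is hypothetical, and the variance inflation factor shown is the special case for a model with exactly two predictors; it is an illustration, not a complete diagnostic.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Predictor values copied from the sample data on this page
exam_score = [42, 48, 51, 55, 58, 60, 62, 65, 67, 69, 71, 73, 75, 77, 79,
              81, 83, 85, 87, 89, 91, 93, 38, 45, 52, 64, 70, 76, 82, 88]
gpa = [2.3, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8,
       3.9, 3.7, 3.8, 3.9, 4.0, 3.9, 4.0, 2.1, 2.4, 2.8, 3.2, 3.5, 3.6, 3.8, 3.9]

r = pearson_r(exam_score, gpa)

# With exactly two predictors, the variance inflation factor is 1 / (1 - r^2)
vif_two_predictors = 1.0 / (1.0 - r ** 2)

print(f"correlation(exam_score, gpa) = {r:.3f}")
print(f"two-predictor VIF            = {vif_two_predictors:.1f}")
```

A correlation close to 1 suggests that the two predictors carry largely the same information, so their individual coefficients and odds ratios should be interpreted with caution.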

Practical Example of Logistic Regression with One Predictor

Step 1: Data

Consider a dataset of student exam scores and admission outcomes (1 = admitted, 0 = rejected):

Exam Score (X)	Admitted (Y)
35	0
42	0
57	0
⋮	⋮
78	1
93	1

Step 2: Fit Logistic Regression Model

After fitting a logistic regression model, we get:

\log\left(\frac{P(\text{admitted})}{1 - P(\text{admitted})}\right) = -10.68 + 0.15 \times \text{exam\_score}

Step 3: Interpret the Coefficients

The coefficient β₁ = 0.15 means that for each one-point increase in exam score, the log-odds of admission increase by 0.15.

Converting to an odds ratio: OR = e^0.15 ≈ 1.16

This means that for each one-point increase in exam score, the odds of admission are multiplied by about 1.16, an increase of roughly 16%.
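The interpretation above can be reproduced with a few lines of Python. This is a small sketch using the coefficient from Step 2; the ten-point comparison is an added illustration that follows from the same formula, since a k-unit increase multiplies the odds by e^(k·β₁).

```python
import math

beta1 = 0.15  # slope from the fitted model in Step 2

# Odds ratio for a one-point increase in exam score
odds_ratio = math.exp(beta1)
percent_increase = (odds_ratio - 1.0) * 100.0

# Odds ratio for a ten-point increase: exp(10 * beta1), not 10 * exp(beta1)
odds_ratio_10 = math.exp(10 * beta1)

print(f"OR per point     = {odds_ratio:.2f} ({percent_increase:.0f}% higher odds)")
print(f"OR per 10 points = {odds_ratio_10:.2f}")
```

Note that the odds ratio describes a change in odds, not in probability; the change in probability for one extra point depends on the starting score.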

Step 4: Calculate Probability for a New Student

For a student with an exam score of 70:

P(\text{admitted}) = \frac{1}{1 + e^{-(-10.68 + 0.15 \times 70)}} = \frac{1}{1 + e^{-(-10.68 + 10.5)}} = \frac{1}{1 + e^{0.18}} \approx 0.455

This student has about a 45.5% probability of being admitted.
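The same calculation can be written as a short Python sketch, going from log-odds to odds to probability. It uses only the coefficients from Step 2 and prints the probability to four decimals.

```python
import math

# Coefficients from the fitted model in Step 2
beta0, beta1 = -10.68, 0.15
exam_score = 70

log_odds = beta0 + beta1 * exam_score   # linear predictor (log-odds)
odds = math.exp(log_odds)               # odds of admission
probability = odds / (1.0 + odds)       # equals 1 / (1 + exp(-log_odds))

print(f"log-odds    = {log_odds:.2f}")
print(f"odds        = {odds:.4f}")
print(f"probability = {probability:.4f}")
```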

Step 5: Find the Decision Boundary

At what exam score is the probability of admission exactly 0.5?

\text{exam\_score} = \frac{-(-10.68)}{0.15} = 71.2

Students scoring above 71.2 are more likely to be admitted than rejected.
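To see how the boundary moves when the probability cutoff changes, the one-predictor cutoff formula can be evaluated for several values of c. The sketch below assumes the Step 2 coefficients; the function names are hypothetical, and the cutoffs 0.3 and 0.7 are arbitrary illustrative choices.

```python
import math

# Coefficients from the fitted model in Step 2
beta0, beta1 = -10.68, 0.15

def boundary_score(c=0.5):
    """Exam score at which the predicted probability equals the cutoff c."""
    return (-beta0 - math.log((1.0 - c) / c)) / beta1

def admission_probability(score):
    """Predicted probability of admission for a given exam score."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * score)))

for c in (0.3, 0.5, 0.7):
    score = boundary_score(c)
    print(f"cutoff {c:.1f}: boundary score {score:.1f}, "
          f"P at boundary = {admission_probability(score):.3f}")
```

A stricter cutoff (larger c) requires a higher exam score before a student is classified as admitted.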

Performance Metrics

Confusion Matrix

A table comparing actual vs. predicted classifications:

	Actual Positive	Actual Negative
Predicted Positive	True Positive (TP)	False Positive (FP)
Predicted Negative	False Negative (FN)	True Negative (TN)

Accuracy

Proportion of correct predictions: (TP + TN) / (TP + FP + FN + TN)

Sensitivity (Recall)

Proportion of actual positives correctly identified: TP / (TP + FN)

Specificity

Proportion of actual negatives correctly identified: TN / (TN + FP)

AUC (Area Under ROC Curve)

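Measures the model's ability to distinguish between classes. AUC ranges from 0 to 1: a value of 0.5 indicates no discrimination (equivalent to random guessing), 1 indicates perfect discrimination, and values below 0.5 indicate predictions that are worse than chance.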
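The metrics above can be computed directly from actual labels and predicted probabilities. The following plain-Python sketch uses a small hypothetical set of labels and probabilities (not output from any fitted model). AUC is computed here from its ranking interpretation: the proportion of positive/negative pairs in which the positive case receives the higher predicted probability, with ties counted as one half.

```python
def confusion_counts(actual, predicted):
    """Return (TP, FP, FN, TN) for binary labels coded as 0/1."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    return tp, fp, fn, tn

def auc_by_ranking(actual, probs):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for a, s in zip(actual, probs) if a == 1]
    neg = [s for a, s in zip(actual, probs) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities, for illustration only
actual = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
predicted = [1 if p > 0.5 else 0 for p in probs]   # cutoff c = 0.5

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = auc_by_ranking(actual, probs)

print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("accuracy:", accuracy, "sensitivity:", sensitivity,
      "specificity:", specificity, "AUC:", auc)
```

Accuracy, sensitivity, and specificity depend on the chosen cutoff, whereas AUC is computed from the predicted probabilities and does not depend on it.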

How to Perform Logistic Regression with R

# Load required libraries
library(tidyverse)
library(pROC)

# Sample data - Predicting admission based on multiple factors
data <- tibble(
  exam_score = c(42, 48, 51, 55, 58, 60, 62, 65, 67, 69, 71, 73, 75, 77, 79,
                 81, 83, 85, 87, 89, 91, 93, 38, 45, 52, 64, 70, 76, 82, 88),
  gpa = c(2.3, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8,
          3.9, 3.7, 3.8, 3.9, 4.0, 3.9, 4.0, 2.1, 2.4, 2.8, 3.2, 3.5, 3.6, 3.8, 3.9),
  study_hours = c(5, 8, 10, 12, 14, 15, 16, 18, 20, 22, 24, 25, 26, 28, 30,
                  32, 28, 30, 32, 35, 33, 36, 3, 6, 11, 17, 23, 27, 31, 34),
  admitted = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1,
               1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1)
)

# Fit logistic regression model
model <- glm(admitted ~ exam_score + gpa + study_hours,
             data = data, family = binomial)

# Display model summary
summary(model)

# Calculate and display odds ratios
odds_ratios <- exp(coef(model))
print("Odds Ratios:")
print(round(odds_ratios, 3))

# Make predictions
data$predicted_prob <- predict(model, type = "response")
data$predicted_class <- ifelse(data$predicted_prob > 0.5, 1, 0)

# Calculate ROC and AUC
roc_obj <- roc(data$admitted, data$predicted_prob)
print(paste("AUC:", round(auc(roc_obj), 3)))

# Confusion matrix
conf_matrix <- table(Predicted = data$predicted_class, Actual = data$admitted)
print("Confusion Matrix:")
print(conf_matrix)

# Accuracy
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
print(paste("Accuracy:", round(accuracy, 3)))

# Create visualization
ggplot(data, aes(x = exam_score, y = admitted)) +
  geom_point(aes(color = factor(admitted)), size = 3) +
  stat_smooth(method = "glm", method.args = list(family = "binomial"),
              se = TRUE, color = "blue") +
  labs(title = "Logistic Regression: Exam Score vs Admission",
       x = "Exam Score",
       y = "Probability of Admission",
       color = "Admitted") +
  scale_color_manual(values = c("0" = "red", "1" = "green")) +
  theme_minimal()

How to Perform Logistic Regression with Python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.discrete.discrete_model import Logit
from sklearn.metrics import roc_curve, roc_auc_score, confusion_matrix, accuracy_score
import seaborn as sns

# Sample data - Predicting admission based on multiple factors
data = pd.DataFrame({
    'exam_score': [42, 48, 51, 55, 58, 60, 62, 65, 67, 69, 71, 73, 75, 77, 79,
                   81, 83, 85, 87, 89, 91, 93, 38, 45, 52, 64, 70, 76, 82, 88],
    'gpa': [2.3, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8,
            3.9, 3.7, 3.8, 3.9, 4.0, 3.9, 4.0, 2.1, 2.4, 2.8, 3.2, 3.5, 3.6, 3.8, 3.9],
    'study_hours': [5, 8, 10, 12, 14, 15, 16, 18, 20, 22, 24, 25, 26, 28, 30,
                    32, 28, 30, 32, 35, 33, 36, 3, 6, 11, 17, 23, 27, 31, 34],
    'admitted': [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1,
                 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
})

# Prepare data for model
X = sm.add_constant(data[['exam_score', 'gpa', 'study_hours']])
y = data['admitted']

# Fit logistic regression model
model = Logit(y, X)
results = model.fit()

# Display model summary
print(results.summary())

# Calculate odds ratios
odds_ratios = np.exp(results.params)
print("Odds Ratios:")
print(odds_ratios.round(3))

# Make predictions
data['predicted_prob'] = results.predict(X)
data['predicted_class'] = (data['predicted_prob'] > 0.5).astype(int)

# Calculate ROC and AUC
fpr, tpr, _ = roc_curve(y, data['predicted_prob'])
auc_score = roc_auc_score(y, data['predicted_prob'])
print(f"AUC: {auc_score:.3f}")

# Confusion matrix (scikit-learn convention: rows = actual class, columns = predicted class)
conf_matrix = confusion_matrix(y, data['predicted_class'])
print("Confusion Matrix:")
print(conf_matrix)

# Accuracy
accuracy = accuracy_score(y, data['predicted_class'])
print(f"Accuracy: {accuracy:.3f}")

# Create visualization
plt.figure(figsize=(8, 6))
colors = ['red' if x == 0 else 'green' for x in data['admitted']]
plt.scatter(data['exam_score'], data['admitted'], c=colors, s=50, alpha=0.7)
sns.regplot(x='exam_score', y='admitted', data=data, logistic=True,
            scatter=False, color='blue', ci=95)
plt.title('Logistic Regression: Exam Score vs Admission')
plt.xlabel('Exam Score')
plt.ylabel('Probability of Admission')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
