Generate ROC (Receiver Operating Characteristic) curves to evaluate binary classification model performance. Calculate AUC, find optimal thresholds using Youden's index, and view confusion matrix metrics including sensitivity, specificity, PPV, and NPV.
Not sure how to format your data? Try the sample dataset to see how it works, or upload your own data to get started!
A Receiver Operating Characteristic (ROC) curve is a graphical tool used to evaluate the performance of a binary classification model. It plots the True Positive Rate (Sensitivity) against the False Positive Rate (1 - Specificity) at various classification thresholds.
The curve shows the trade-off between correctly identifying positive cases and incorrectly classifying negative cases as positive. A perfect classifier would have a curve that passes through the top-left corner (100% sensitivity, 0% false positive rate).
The Area Under the ROC Curve (AUC) summarizes the overall diagnostic accuracy of a classification model in a single number between 0 and 1:
| AUC Range | Interpretation |
|---|---|
| 0.90 – 1.00 | Excellent discrimination |
| 0.80 – 0.90 | Good discrimination |
| 0.70 – 0.80 | Fair discrimination |
| 0.60 – 0.70 | Poor discrimination |
| 0.50 | No discrimination (random chance) |
AUC represents the probability that a randomly chosen positive case will have a higher predicted probability than a randomly chosen negative case.
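This rank-based interpretation can be verified directly. The sketch below (using a small hypothetical dataset, not this page's sample data) counts the fraction of positive–negative pairs where the positive case outranks the negative one, which equals the trapezoidal AUC when ties are counted as half:

```python
import numpy as np

# Hypothetical toy data: 3 positives, 4 negatives
y_true = np.array([1, 1, 1, 0, 0, 0, 0])
y_scores = np.array([0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1])

pos = y_scores[y_true == 1]
neg = y_scores[y_true == 0]

# Pairwise comparison: fraction of (positive, negative) pairs where the
# positive case scores higher; ties contribute half a pair.
pairs = pos[:, None] - neg[None, :]
auc_rank = (np.sum(pairs > 0) + 0.5 * np.sum(pairs == 0)) / pairs.size
print(auc_rank)
```

Here 11 of the 12 positive–negative pairs are correctly ordered, so the rank AUC is 11/12 ≈ 0.917.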
This tool uses Youden's J statistic (J = Sensitivity + Specificity - 1) to find the optimal classification threshold. This point maximizes the vertical distance between the ROC curve and the diagonal reference line, balancing sensitivity and specificity equally.
In practice, the optimal threshold depends on the relative costs of false positives vs. false negatives for your specific application. For example:
- Prioritize sensitivity (e.g., medical screening) — use a lower threshold to minimize false negatives (missed cases), even if it increases false positives.
- Prioritize specificity (e.g., fraud detection) — use a higher threshold to reduce false positives (costly investigations), even if it means accepting some missed fraud.
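Once a threshold is chosen, the confusion matrix metrics mentioned above (sensitivity, specificity, PPV, NPV) follow directly from the four cell counts. A minimal sketch using the same label/score column format as this page (the data here is a small hypothetical example):

```python
import numpy as np

def confusion_metrics(y_true, y_scores, threshold):
    """Sensitivity, specificity, PPV, and NPV at a given threshold."""
    pred = y_scores >= threshold
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    tn = np.sum(~pred & (y_true == 0))
    return {
        "sensitivity": tp / (tp + fn),  # TPR: positives correctly flagged
        "specificity": tn / (tn + fp),  # TNR: negatives correctly passed
        "ppv": tp / (tp + fp),          # precision of a positive call
        "npv": tn / (tn + fn),          # reliability of a negative call
    }

y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_scores = np.array([0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1])
print(confusion_metrics(y_true, y_scores, threshold=0.5))
```

Lowering the threshold moves cases from FN to TP (raising sensitivity) and from TN to FP (lowering specificity), which is exactly the trade-off the bullet points describe.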
Using tidyverse and ggplot2 to create a ROC curve with the optimal threshold highlighted.
library(tidyverse)
# Sample data (same as this page's sampleData)
df <- tibble(
actual = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
predicted = c(0.95, 0.92, 0.88, 0.85, 0.82, 0.79, 0.76, 0.73, 0.71, 0.68, 0.65, 0.62, 0.58, 0.55, 0.52, 0.48, 0.45, 0.42, 0.38, 0.35, 0.32, 0.28, 0.25, 0.22, 0.18, 0.12, 0.15, 0.18, 0.22, 0.25, 0.28, 0.32, 0.35, 0.38, 0.08, 0.05, 0.42, 0.45, 0.1, 0.02, 0.48, 0.15, 0.2, 0.06, 0.3, 0.12, 0.08, 0.25, 0.04, 0.01)
)
# Calculate ROC curve exactly like backend
thresholds <- sort(unique(df$predicted), decreasing = TRUE)
tpr <- c(0)
fpr <- c(0)
threshold_values <- c(if (length(thresholds) > 0) thresholds[1] + 0.01 else 1.01)
total_positive <- sum(df$actual == 1)
total_negative <- sum(df$actual == 0)
for (thresh in thresholds) {
  pred_positive <- df$predicted >= thresh
  tp <- sum(pred_positive & (df$actual == 1))
  fp <- sum(pred_positive & (df$actual == 0))
  tpr <- c(tpr, tp / total_positive)
  fpr <- c(fpr, fp / total_negative)
  threshold_values <- c(threshold_values, thresh)
}
if (tail(tpr, 1) != 1 || tail(fpr, 1) != 1) {
  tpr <- c(tpr, 1)
  fpr <- c(fpr, 1)
  threshold_values <- c(threshold_values, 0)
}
# AUC (trapezoidal rule)
roc_auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
cat("AUC:", roc_auc, "\n")
# Optimal threshold (Youden's index) - same rule as backend
youden <- tpr - fpr
best_idx <- which.max(youden)
best_threshold <- threshold_values[best_idx]
best_sensitivity <- tpr[best_idx]
best_specificity <- 1 - fpr[best_idx]
cat("Optimal Threshold:", best_threshold, "\n")
cat("Sensitivity:", best_sensitivity, "\n")
cat("Specificity:", best_specificity, "\n")
best_point <- tibble(
fpr = 1 - best_specificity,
tpr = best_sensitivity,
threshold = best_threshold
)
# Plot with ggplot2
roc_df <- tibble(fpr = fpr, tpr = tpr)
ggplot(roc_df, aes(x = fpr, y = tpr)) +
geom_line(color = "#1565C0", linewidth = 1.2) +
geom_area(fill = "#1565C0", alpha = 0.1) +
geom_abline(intercept = 0, slope = 1, linetype = "dashed", color = "gray50") +
geom_point(
data = best_point,
aes(x = fpr, y = tpr),
color = "red",
size = 3,
shape = 8,
inherit.aes = FALSE
) +
labs(
title = "ROC Curve",
subtitle = paste0("AUC = ", round(roc_auc, 4),
" | Optimal threshold = ", round(best_point$threshold, 3)),
x = "False Positive Rate (1 - Specificity)",
y = "True Positive Rate (Sensitivity)"
) +
coord_equal(xlim = c(0, 1), ylim = c(0, 1), expand = FALSE) +
theme_minimal(base_size = 12)

Using NumPy, Matplotlib, and seaborn to create a ROC curve with the optimal threshold highlighted.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Sample data (same as this page's sampleData)
y_true = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
y_scores = np.array([0.95, 0.92, 0.88, 0.85, 0.82, 0.79, 0.76, 0.73, 0.71, 0.68, 0.65, 0.62, 0.58, 0.55, 0.52, 0.48, 0.45, 0.42, 0.38, 0.35, 0.32, 0.28, 0.25, 0.22, 0.18, 0.12, 0.15, 0.18, 0.22, 0.25, 0.28, 0.32, 0.35, 0.38, 0.08, 0.05, 0.42, 0.45, 0.1, 0.02, 0.48, 0.15, 0.2, 0.06, 0.3, 0.12, 0.08, 0.25, 0.04, 0.01])
# Calculate ROC curve exactly like backend
thresholds = np.sort(np.unique(y_scores))[::-1]
tpr_list = [0.0]
fpr_list = [0.0]
threshold_list = [thresholds[0] + 0.01 if len(thresholds) > 0 else 1.01]
total_positive = np.sum(y_true == 1)
total_negative = np.sum(y_true == 0)
for thresh in thresholds:
    pred_positive = y_scores >= thresh
    tp = np.sum(pred_positive & (y_true == 1))
    fp = np.sum(pred_positive & (y_true == 0))
    tpr_list.append(tp / total_positive)
    fpr_list.append(fp / total_negative)
    threshold_list.append(thresh)
if tpr_list[-1] != 1.0 or fpr_list[-1] != 1.0:
    tpr_list.append(1.0)
    fpr_list.append(1.0)
    threshold_list.append(0.0)
fpr = np.array(fpr_list)
tpr = np.array(tpr_list)
thresholds = np.array(threshold_list)
# Trapezoidal AUC; np.trapezoid requires NumPy >= 2.0 (use np.trapz on older versions)
roc_auc = np.trapezoid(tpr, fpr)
# Optimal threshold (Youden's index) - same rule as backend
youden = tpr - fpr
best_idx = np.argmax(youden)
optimal_threshold = thresholds[best_idx]
# Plot with seaborn/matplotlib
sns.set_theme(style='whitegrid')
fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(
fpr,
tpr,
color='#1565C0',
linewidth=2.5,
label=f'ROC Curve (AUC = {roc_auc:.4f})'
)
ax.fill_between(fpr, tpr, alpha=0.1, color='#1565C0')
ax.plot([0, 1], [0, 1], linestyle='--', color='gray', linewidth=1, label='Random Classifier')
ax.scatter(
fpr[best_idx],
tpr[best_idx],
color='red',
s=120,
marker='*',
label=f'Optimal (threshold={optimal_threshold:.3f})',
zorder=5
)
ax.set_title('ROC Curve')
ax.set_xlabel('False Positive Rate (1 - Specificity)')
ax.set_ylabel('True Positive Rate (Sensitivity)')
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.legend(loc='lower right')
plt.tight_layout()
plt.show()
print(f"AUC: {roc_auc:.4f}")
print(f"Optimal Threshold: {optimal_threshold:.4f}")
print(f"Sensitivity at optimal: {tpr[best_idx]:.4f}")
print(f"Specificity at optimal: {1 - fpr[best_idx]:.4f}")

You need two columns: (1) actual binary labels (0/1 or two categories like Yes/No) and (2) predicted probabilities (numeric values between 0 and 1). The predicted probabilities typically come from a logistic regression or other classification model.
AUC represents the probability that a randomly chosen positive case will have a higher predicted probability than a randomly chosen negative case. An AUC of 0.5 means the model is no better than random guessing, while an AUC of 1.0 indicates perfect discrimination.
ROC curves are ideal for evaluating binary classifiers, comparing multiple models, selecting classification thresholds, and assessing diagnostic test accuracy. They are particularly useful when class distributions are relatively balanced. For imbalanced datasets, consider also using Precision-Recall curves.
ROC curves plot TPR vs FPR and are robust to class imbalance in the evaluation metric. Precision-Recall curves plot precision vs recall and are more informative when the positive class is rare. Both are valuable diagnostic tools for classification models.
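For a rare positive class, a precision-recall curve can be built from the same label/score columns with the same threshold sweep used for the ROC curve. A minimal sketch on a small hypothetical imbalanced dataset:

```python
import numpy as np

# Hypothetical imbalanced data: 2 positives among 8 cases
y_true = np.array([1, 1, 0, 0, 0, 0, 0, 0])
y_scores = np.array([0.9, 0.4, 0.8, 0.3, 0.2, 0.15, 0.1, 0.05])

# Sweep thresholds from high to low, recording precision and recall
precision, recall = [], []
for thresh in np.sort(np.unique(y_scores))[::-1]:
    pred = y_scores >= thresh
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    precision.append(tp / (tp + fp))
    recall.append(tp / np.sum(y_true == 1))

print(list(zip(recall, precision)))
```

Unlike FPR, precision depends on the class balance, which is why the PR curve degrades visibly when positives are rare while the ROC curve can still look optimistic.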
Yes, overlaying ROC curves for different models on the same plot is a standard way to compare classifier performance. The model with the highest AUC (curve closest to the top-left corner) generally has the best discriminative ability.
An AUC below 0.5 suggests the model is performing worse than random chance, which typically means the positive and negative labels are swapped. Try inverting the positive label or checking your data for labeling errors.
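A quick sanity check for this situation: reversing the scores (e.g., using 1 minus the predicted probability) reverses the ranking, which maps an AUC of a to 1 − a. A hypothetical sketch using the rank-based AUC:

```python
import numpy as np

def rank_auc(y_true, y_scores):
    """AUC via the rank interpretation (ties count as half)."""
    pos = y_scores[y_true == 1]
    neg = y_scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Labels accidentally anti-correlated with the scores
y_true = np.array([1, 1, 0, 0, 0])
y_scores = np.array([0.1, 0.2, 0.7, 0.8, 0.9])

print(rank_auc(y_true, y_scores))      # far below 0.5
print(rank_auc(y_true, 1 - y_scores))  # flipping the scores fixes it
```

If flipping the scores produces a sensible AUC, the model's ranking is informative and only the label convention was inverted.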