Create publication-ready forest plots for meta-analysis and systematic reviews. Visualize effect sizes with confidence intervals, pooled estimates, and heterogeneity statistics across multiple studies.
Null value: 1 for ratio measures (OR/RR/HR), 0 for differences (MD/SMD).
A forest plot is a graphical display of estimated results from multiple studies in a meta-analysis. Each study is represented by a square (effect size) with a horizontal line extending on both sides (confidence interval). A diamond at the bottom typically shows the pooled (combined) effect across all studies.
The name "forest plot" comes from the visual appearance of the plot, which resembles a forest of lines. It was first used in medical research to synthesize findings from randomized controlled trials but is now widely used across social sciences, education, ecology, and business research.
Heterogeneity measures how much variation exists between study results beyond what would be expected by chance.
Cochran's Q test: Tests whether the observed differences between study results are compatible with chance alone. A significant p-value (a threshold of 0.10 is typical, because the test has low power when there are few studies) suggests that true heterogeneity exists.
When heterogeneity is high, consider subgroup analysis, meta-regression, a random-effects model, or examining potential sources of variation (study design, population, intervention).
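As a rough illustration of how these heterogeneity statistics can be computed, the sketch below derives Cochran's Q and I² from odds ratios and their 95% confidence intervals. It reuses the sample data from the code examples on this page; the function name `heterogeneity` is made up for this example, and the calculation assumes the intervals are symmetric on the log scale.

```python
import math

# Sample data reused from the code examples on this page (ORs with 95% CIs).
odds_ratios = [1.52, 0.85, 1.23, 1.78, 1.05, 1.45, 0.92, 1.67, 1.31, 1.12]
lower_ci = [1.12, 0.62, 0.89, 1.21, 0.76, 1.05, 0.68, 1.15, 0.95, 0.82]
upper_ci = [2.06, 1.16, 1.70, 2.62, 1.45, 2.00, 1.25, 2.42, 1.81, 1.53]


def heterogeneity(effects, lowers, uppers, z=1.96):
    """Return (Q, df, I2 in percent) for ratio measures with 95% CIs."""
    # Work on the log scale, where ratio measures are approximately normal.
    y = [math.log(e) for e in effects]
    se = [(math.log(u) - math.log(l)) / (2 * z) for l, u in zip(lowers, uppers)]
    w = [1 / s ** 2 for s in se]  # inverse-variance weights
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    # I2: share of the total variation that exceeds what chance would produce.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


q, df, i2 = heterogeneity(odds_ratios, lower_ci, upper_ci)
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
# The Q-test p-value is the upper tail of a chi-square distribution with df
# degrees of freedom, e.g. scipy.stats.chi2.sf(q, df) if SciPy is installed.
```

The p-value is left as a comment so that the sketch runs with the standard library only.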
Odds Ratio (OR): Ratio of the odds of the outcome in the treatment vs. control group. Commonly used in case-control studies. Null value = 1.
Risk Ratio (RR): Ratio of the risk of the outcome in the treatment vs. control group. Used in cohort studies and RCTs. Null value = 1.
Hazard Ratio (HR): Ratio of hazard rates between groups from survival analysis. Used in time-to-event studies. Null value = 1.
Mean Difference (MD): Difference in means between groups on the original scale. Used when studies use the same outcome measure. Null value = 0.
Standardized Mean Difference (SMD): Difference in means divided by the pooled standard deviation. Used when studies measure the same outcome on different scales. Common estimators are Cohen's d and Hedges' g (a small-sample correction of d). Null value = 0.
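To make the standardized mean difference concrete, here is a minimal sketch that computes Cohen's d from group summaries and then applies the usual approximate small-sample correction to obtain Hedges' g. The function names and the group summaries in the usage lines are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def cohens_d(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Difference in means divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)


def hedges_g(d, n_t, n_c):
    """Cohen's d shrunk by the approximate correction J = 1 - 3 / (4*df - 1)."""
    df = n_t + n_c - 2
    return d * (1 - 3 / (4 * df - 1))


# Hypothetical summaries: treatment mean 10, control mean 8, both SDs 4, n = 20 each.
d = cohens_d(10, 8, 4, 4, 20, 20)
g = hedges_g(d, 20, 20)
print(f"Cohen's d = {d:.3f}, Hedges' g = {g:.3f}")
# Cohen's d = 0.500, Hedges' g = 0.490
```

With equal standard deviations the pooled SD is simply 4, so d = 2 / 4 = 0.5; the correction then shrinks it slightly toward the null.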
Using the metafor package to create a forest plot.
```r
library(metafor)

# Sample meta-analysis data
dat <- data.frame(
  study = c("Smith 2018", "Johnson 2019", "Williams 2019",
            "Brown 2020", "Davis 2020", "Miller 2021",
            "Wilson 2021", "Moore 2022", "Taylor 2022", "Anderson 2023"),
  yi    = log(c(1.52, 0.85, 1.23, 1.78, 1.05, 1.45, 0.92, 1.67, 1.31, 1.12)),  # log OR
  lower = log(c(1.12, 0.62, 0.89, 1.21, 0.76, 1.05, 0.68, 1.15, 0.95, 0.82)),
  upper = log(c(2.06, 1.16, 1.70, 2.62, 1.45, 2.00, 1.25, 2.42, 1.81, 1.53))
)

# Calculate SE from the 95% CI (on the log scale)
dat$sei <- (dat$upper - dat$lower) / (2 * 1.96)

# Fit random-effects model
res <- rma(yi = yi, sei = sei, data = dat, method = "REML",
           slab = study)

# Create forest plot; forest() invisibly returns the plot settings it chose
fp <- forest(res,
             atransf = exp,   # Back-transform to OR scale
             xlab = "Odds Ratio",
             refline = 0,     # null line at log(1) = 0
             header = c("Study", "OR [95% CI]"),
             mlab = "",       # summary label is drawn by text() below
             cex = 0.9)

# Add heterogeneity info to the summary row, aligned with the study labels
# (uses the returned x-axis limits instead of a hard-coded x position)
text(fp$xlim[1], -1, pos = 4, cex = 0.9,
     bquote(paste("RE Model (", I^2, " = ",
                  .(formatC(res$I2, digits = 1, format = "f")), "%",
                  ", p = ", .(formatC(res$QEp, digits = 3, format = "f")), ")")))
```

Using Matplotlib and Seaborn to create a forest plot.
```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Sample meta-analysis data
studies = [
    "Smith et al. (2018)", "Johnson et al. (2019)", "Williams et al. (2019)",
    "Brown et al. (2020)", "Davis et al. (2020)", "Miller et al. (2021)",
    "Wilson et al. (2021)", "Moore et al. (2022)", "Taylor et al. (2022)",
    "Anderson et al. (2023)"
]
odds_ratios = [1.52, 0.85, 1.23, 1.78, 1.05, 1.45, 0.92, 1.67, 1.31, 1.12]
lower_ci = [1.12, 0.62, 0.89, 1.21, 0.76, 1.05, 0.68, 1.15, 0.95, 0.82]
upper_ci = [2.06, 1.16, 1.70, 2.62, 1.45, 2.00, 1.25, 2.42, 1.81, 1.53]

n = len(studies)
y_pos = np.arange(n)

sns.set_theme(style="whitegrid")
fig, ax = plt.subplots(figsize=(10, max(5, 0.6 * n)))

# CI lines
for i in range(n):
    ax.plot([lower_ci[i], upper_ci[i]], [y_pos[i], y_pos[i]],
            color="#333", linewidth=1.5)

# Effect size points
ax.scatter(odds_ratios, y_pos, s=80, color="#1565C0", marker="s",
           label="Study OR", zorder=3)

# Null line (OR = 1)
ax.axvline(x=1, linestyle="--", color="gray")

ax.set_xscale("log")
ax.set_xlabel("Odds Ratio")
ax.set_title("Forest Plot - Odds Ratios")
ax.set_yticks(y_pos)
ax.set_yticklabels(studies)
ax.invert_yaxis()
ax.grid(axis="y", visible=False)

plt.tight_layout()
plt.show()
```

At minimum, you need the study name, effect size (OR, RR, HR, MD, or SMD), and 95% confidence interval bounds for each study. Optionally, you can provide study weights.
Ratio measures (OR, RR, HR) are asymmetric around 1. For example, OR=2 and OR=0.5 are equally "far" from the null in opposite directions, but on a linear scale they appear at different distances from 1. A log scale makes them symmetric.
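A quick numerical check of this point: the snippet below compares how far OR = 0.5 and OR = 2 sit from the null on a linear axis and on a log axis. The helper name `distance_from_null` is made up for this example.

```python
import math


def distance_from_null(ratio):
    """Return (linear distance from 1, log-scale distance from log(1) = 0)."""
    return abs(ratio - 1), abs(math.log(ratio))


for ratio in (0.5, 2.0):
    linear, logged = distance_from_null(ratio)
    print(f"OR = {ratio}: linear = {linear:.3f}, log = {logged:.3f}")
# OR = 0.5: linear = 0.500, log = 0.693
# OR = 2.0: linear = 1.000, log = 0.693
```

On the linear axis a doubling looks twice as far from the null as a halving, while on the log axis both are the same distance away.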
This tool uses a fixed-effect (inverse-variance weighted) model. If heterogeneity is high (I² > 50%), a random-effects model may be more appropriate. For formal meta-analysis, consider using R's metafor package or Python's statsmodels.
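For readers curious what a random-effects alternative involves, the following is a minimal sketch of the DerSimonian-Laird estimator, one common random-effects method. It is not the method this tool uses, and it differs from the REML fit in the metafor example; the function name `dersimonian_laird` is made up, and the data are the sample odds ratios from the code examples on this page.

```python
import math

# Sample data reused from the code examples on this page (ORs with 95% CIs).
odds_ratios = [1.52, 0.85, 1.23, 1.78, 1.05, 1.45, 0.92, 1.67, 1.31, 1.12]
lower_ci = [1.12, 0.62, 0.89, 1.21, 0.76, 1.05, 0.68, 1.15, 0.95, 0.82]
upper_ci = [2.06, 1.16, 1.70, 2.62, 1.45, 2.00, 1.25, 2.42, 1.81, 1.53]


def dersimonian_laird(effects, lowers, uppers, z=1.96):
    """Return (tau2, pooled ratio, CI lower, CI upper) for ratio measures."""
    y = [math.log(e) for e in effects]
    # Within-study variances recovered from the 95% CIs on the log scale.
    v = [((math.log(u) - math.log(l)) / (2 * z)) ** 2
         for l, u in zip(lowers, uppers)]
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    # Method-of-moments estimate of the between-study variance, floored at 0.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return tau2, math.exp(pooled), math.exp(pooled - z * se), math.exp(pooled + z * se)


tau2, est, lo, hi = dersimonian_laird(odds_ratios, lower_ci, upper_ci)
print(f"tau2 = {tau2:.4f}, pooled OR = {est:.2f} [{lo:.2f}, {hi:.2f}]")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and both models give the same pooled estimate.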
The diamond shows the pooled (combined) effect estimate across all studies. The center of the diamond is the pooled point estimate, and its horizontal extent shows the 95% confidence interval for the pooled effect.
If no weight column is provided, weights are calculated using the inverse-variance method: w = 1/SE², where SE is derived from the confidence interval. Studies with narrower CIs (more precision) receive higher weights.
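The weighting and pooling described above can be sketched as follows. This is an illustration under stated assumptions rather than the tool's actual implementation: the standard error is recovered from a 95% CI as (upper − lower) / (2 × 1.96) on the log scale, the function name `pool_fixed_effect` is made up, and the data are the sample odds ratios from the code examples on this page.

```python
import math

# Sample data reused from the code examples on this page (ORs with 95% CIs).
odds_ratios = [1.52, 0.85, 1.23, 1.78, 1.05, 1.45, 0.92, 1.67, 1.31, 1.12]
lower_ci = [1.12, 0.62, 0.89, 1.21, 0.76, 1.05, 0.68, 1.15, 0.95, 0.82]
upper_ci = [2.06, 1.16, 1.70, 2.62, 1.45, 2.00, 1.25, 2.42, 1.81, 1.53]


def pool_fixed_effect(effects, lowers, uppers, z=1.96):
    """Inverse-variance fixed-effect pooling of ratio measures."""
    y = [math.log(e) for e in effects]
    # SE from the CI width on the log scale (assumes a symmetric 95% CI).
    se = [(math.log(u) - math.log(l)) / (2 * z) for l, u in zip(lowers, uppers)]
    w = [1 / s ** 2 for s in se]  # w = 1 / SE^2
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    pooled_se = math.sqrt(1 / sum(w))
    return {
        "estimate": math.exp(pooled),
        "lower": math.exp(pooled - z * pooled_se),
        "upper": math.exp(pooled + z * pooled_se),
        # Relative weights in percent, as often shown next to each study.
        "weights_pct": [100 * wi / sum(w) for wi in w],
    }


result = pool_fixed_effect(odds_ratios, lower_ci, upper_ci)
print(f"Pooled OR = {result['estimate']:.2f} "
      f"[{result['lower']:.2f}, {result['upper']:.2f}]")
for or_value, weight in zip(odds_ratios, result["weights_pct"]):
    print(f"OR {or_value:.2f}: weight {weight:.1f}%")
```

The pooled estimate and its interval correspond to the center and horizontal extent of the diamond, and the percentage weights are what the square sizes would be scaled to.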
Square sizes are proportional to study weights. Larger squares represent studies that contribute more to the pooled estimate, typically because they have larger sample sizes or more precise estimates.