StatsCalculators.com

Scatter Plot Maker

Created: October 10, 2024
Last Updated: September 6, 2025

A scatter plot is a powerful visualization tool that displays the relationship between two numerical variables. Each point on the plot represents an observation in your dataset, with its position determined by the values of the selected X and Y columns. Scatter plots are ideal for spotting trends, patterns, clusters, and outliers, and are commonly used to assess correlation and linearity between variables.

To get started, upload your data or use our sample datasets, then select the columns you want to visualize. You can color points by categories, add trend lines, or create small multiples (facets) for deeper insights. For visualizations that require a third numerical dimension displayed as size, use our Bubble Chart Maker. Not sure how to begin? See our step-by-step tutorial below.

Learn More About Scatter Plots

What is a Scatter Plot?

A scatter plot (also called a scattergram) is a graph that shows the relationship between two continuous variables. Each point represents one observation, with its position determined by that observation's x and y values. Scatter plots are ideal for visualizing correlation between variables and identifying patterns in your data.

[Figure: Scatter plot anatomy showing axes, points, trend line, and correlation]

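The diagram above illustrates the components of a scatter plot. Each point represents a data pair, while the trend line shows the general direction of the relationship. The spread of points around that line indicates the strength of the correlation: the more tightly the points cluster around the line, the stronger the correlation.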

How to Read a Scatter Plot

Understanding scatter plots becomes intuitive once you know what to look for:

Key Components

  • X-axis: The horizontal axis representing one variable
  • Y-axis: The vertical axis representing another variable
  • Data Points: Each point represents a pair of values
  • Trend Line: Optional line showing the general relationship

What to Look For

  • Direction: Is the relationship positive, negative, or neutral?
  • Strength: How closely do the points follow a pattern?
  • Outliers: Are there any points far from the general pattern?
  • Clusters: Do points group together in certain areas?

How to Make a Scatter Plot with Our Calculator

  1. Click Sample Data and select Restaurant Tips
  2. For X-Axis column, select total_bill
  3. For Y-Axis column, select tip
  4. For Color By column, select day or leave it as None
  5. For Facet Column, select time to create small multiples showing lunch vs dinner patterns (or leave as None)
  6. Toggle Show Trend Line to add a regression line
  7. Click Generate Scatter Plot to visualize the data

Tip: If you need to visualize a third numerical dimension as point size (like party size affecting tips), use our Bubble Chart Maker instead.

Correlation in Scatter Plots

One of the main purposes of scatter plots is to visualize correlation between variables. Here's how to interpret different correlation patterns:

Positive Correlation

As X increases, Y tends to increase

Negative Correlation

As X increases, Y tends to decrease

No Correlation

No clear relationship between X and Y

Correlation Coefficient (r)

The correlation coefficient quantifies the strength and direction of a linear relationship:

  • r = 1: Perfect positive correlation
  • r = 0: No linear correlation (a non-linear relationship may still exist)
  • r = -1: Perfect negative correlation
  • 0.7 ≤ |r| ≤ 1.0: Strong correlation
  • 0.3 ≤ |r| < 0.7: Moderate correlation
  • 0.0 ≤ |r| < 0.3: Weak correlation
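
As a minimal sketch of how these thresholds might be applied, the snippet below computes r directly from its definition and maps |r| to the labels above. The function names pearson_r and describe_strength and the sample values are illustrative assumptions, not part of the calculator.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe_strength(r):
    """Map |r| to the rule-of-thumb labels listed above."""
    size = abs(r)
    if size >= 0.7:
        return "strong"
    if size >= 0.3:
        return "moderate"
    return "weak"

# Hypothetical sample values (not taken from the tips dataset)
bills = [10, 20, 30, 40, 50]
tips = [2, 3, 5, 4, 6]

r = pearson_r(bills, tips)
print(f"r = {r:.3f} ({describe_strength(r)})")  # r = 0.900 (strong)
```

The sign of r gives the direction of the relationship, while the label depends only on its absolute value.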

Advanced Scatter Plot Techniques

Multiple Groups

Use different colors to represent different groups or categories within your data, making it easier to spot group-specific patterns.

Small Multiples (Facets)

Create a series of scatter plots side-by-side, each showing data for a different category. This helps compare patterns across groups.

Trend Lines

Add regression lines or curves (linear, logarithmic, exponential) to visualize the general relationship between variables and make predictions.

Matrix Scatter Plots

Create a grid of scatter plots showing relationships between multiple variables simultaneously for comprehensive multivariate analysis.
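
One possible way to build such a grid in Python is pandas' scatter_matrix helper. The sketch below uses synthetic data, and the column names total_bill, tip, and party_size are made up for illustration. It assumes pandas, NumPy, and matplotlib are installed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic, hypothetical data: tips loosely proportional to the bill
rng = np.random.default_rng(0)
n = 100
total_bill = rng.uniform(5, 50, n)
df = pd.DataFrame({
    "total_bill": total_bill,
    "tip": 0.15 * total_bill + rng.normal(0, 1, n),
    "party_size": rng.integers(1, 7, n),
})

# One panel per pair of columns, with histograms on the diagonal
axes = scatter_matrix(df, alpha=0.6, figsize=(8, 8), diagonal="hist")
axes[0, 0].figure.savefig("scatter_matrix.png")
print(axes.shape)  # (3, 3): one row and one column per variable
```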

Scatter Plot Quick Reference

Key Formulas

Correlation (r) = Σ[(xi - x̄)(yi - ȳ)] / √[Σ(xi - x̄)² × Σ(yi - ȳ)²]

Linear Regression: y = mx + b

where m = Σ[(xi - x̄)(yi - ȳ)] / Σ(xi - x̄)² and b = ȳ - m·x̄

Coefficient of Determination (R²) = r²

Standard Error of the Regression = √[Σ(yi - ŷi)² / (n-2)], where ŷi = m·xi + b is the predicted value
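
To make these formulas concrete, here is a small worked sketch in plain Python. The helper name fit_line and the five data points are assumptions chosen for easy arithmetic. The intercept uses the standard least-squares identity b = ȳ - m·x̄.

```python
import math

def fit_line(xs, ys):
    """Return slope m, intercept b, R^2, and the residual standard error."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)

    m = sxy / sxx                      # slope
    b = y_bar - m * x_bar              # intercept: the line passes through (x̄, ȳ)
    r_squared = sxy ** 2 / (sxx * syy) # equals r squared for simple linear regression
    sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 2)) # n - 2 degrees of freedom
    return m, b, r_squared, std_err

# Hypothetical data points
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b, r2, se = fit_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}, R^2 = {r2:.2f}, SE = {se:.3f}")
# y = 0.60x + 2.20, R^2 = 0.60, SE = 0.894
```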

Frequently Asked Questions

When should I use a scatter plot vs a bubble chart?

Use a scatter plot when you want to examine the relationship between two continuous variables. Use a bubble chart when you have a third numerical variable that you want to represent as the size of the points. Scatter plots are simpler and clearer for basic correlation analysis, while bubble charts add an extra dimension of information.

What's the difference between correlation and causation?

Correlation shows that two variables change together, but it doesn't prove that one causes the other. Causation means one variable directly affects the other. A scatter plot can reveal correlation, but establishing causation typically requires controlled experiments or additional analysis that rules out confounding variables.

How do I interpret outliers in a scatter plot?

Outliers are points that deviate significantly from the overall pattern. They may represent data errors, special cases, or important insights. Investigate outliers to determine if they should be removed (if they're errors) or highlighted (if they reveal something important about your data).
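
One simple way to flag candidate outliers, sketched below, is to fit a line and mark points whose residual exceeds twice the residual standard error. The cutoff of 2 and the synthetic data are assumptions for illustration. The flagged points still need the manual investigation described above.

```python
import numpy as np

# Synthetic data on the line y = 2x + 1, with one injected outlier
x = np.arange(1, 11, dtype=float)
y = 2 * x + 1
y[6] = 30.0  # hypothetical outlier at x = 7 (the line would give 15)

# Fit a straight line and measure each point's distance from it
m, b = np.polyfit(x, y, 1)
residuals = y - (m * x + b)
std_err = np.sqrt(np.sum(residuals ** 2) / (len(x) - 2))

# Flag points more than 2 standard errors from the fitted line (assumed cutoff)
flagged = np.where(np.abs(residuals) > 2 * std_err)[0]
print("Flagged indices:", flagged.tolist())  # Flagged indices: [6]
```

Because an extreme point also pulls the fitted line toward itself and inflates the standard error, this check can miss outliers in small datasets, so treat it as a first pass rather than a definitive test.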

Can scatter plots show non-linear relationships?

Yes. A simple linear trend line won't capture non-linear patterns, but the points themselves will still reveal curved or complex relationships. You can add non-linear trend lines (polynomial, logarithmic, exponential) to fit such patterns better.
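
The following sketch illustrates the point with synthetic data: a perfect quadratic relationship yields a linear correlation of essentially zero, while a degree-2 polynomial fit recovers it. The data and the choice of np.polyfit are assumptions for illustration.

```python
import numpy as np

# Synthetic U-shaped data: y depends entirely on x, but not linearly
x = np.arange(-3, 4, dtype=float)
y = x ** 2

# The linear correlation coefficient misses the relationship
r = np.corrcoef(x, y)[0, 1]

# A quadratic trend line (degree-2 polynomial) captures it
coeffs = np.polyfit(x, y, 2)
fitted = np.polyval(coeffs, x)

print(f"linear r is close to zero: {abs(r) < 1e-9}")
print(f"quadratic fit matches the data: {np.allclose(fitted, y)}")
```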

Creating Scatter Plots in Excel

Microsoft Excel is one of the most popular tools for creating scatter plots. Here's how to make a scatter plot in Excel:

  1. Select your data (two columns: X-values and Y-values)
  2. Go to the Insert tab
  3. In the Charts group, click Scatter
  4. Select your preferred scatter plot style
  5. To add a trend line, right-click on any data point and select Add Trendline
  6. In Excel 365 or Excel 2019, click the chart, then use the Chart Design and Format tabs to customize it further

Excel Tips:

  • Use CORREL(array1,array2) to calculate the correlation coefficient
  • Use LINEST(y_values,x_values) for detailed regression statistics
  • To handle dates on axes, ensure they're formatted as dates in your data
  • Add a secondary axis by right-clicking a data series → Format Data Series → Series Options → Secondary Axis

How to Create a Scatter Plot in Python

Use libraries like matplotlib and seaborn to create scatter plots in Python. Here's an example using the popular tips dataset:

Python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

# Load the data
tips = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Set style for better-looking plots
# (the 'seaborn-v0_8' style name requires matplotlib 3.6 or later)
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Calculate correlation
correlation = np.corrcoef(tips['total_bill'], tips['tip'])[0, 1]
print(f"Correlation coefficient: {correlation:.4f}")

# Scatter plot with regression line using seaborn
plt.figure(figsize=(10, 6))
sns.regplot(x='total_bill', y='tip', data=tips, scatter_kws={'alpha':0.5}, line_kws={'color':'red'})
plt.title(f'Scatter Plot with Regression Line (r = {correlation:.4f})')
plt.xlabel('Total Bill ($)')
plt.ylabel('Tip Amount ($)')
plt.show()
Python
# Create a scatter plot colored by time, with marker style by day
# (reuses tips, plt, and sns from the previous example)
plt.figure(figsize=(12, 8))
sns.scatterplot(data=tips, x='total_bill', y='tip', 
                hue='time', style='day',
                palette='viridis', alpha=0.7)
plt.title('Restaurant Tips by Total Bill, Time, and Day')
plt.xlabel('Total Bill ($)')
plt.ylabel('Tip Amount ($)')
plt.legend(title='Legend')
plt.show()

How to Create Scatter Plots in R

Use ggplot2 to create scatter plots in R. Here's how to do it:

R
library(tidyverse)

# Load tips dataset
tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# basic scatter plot with a regression line
ggplot(tips, aes(x = total_bill, y = tip)) +
  geom_point(size = 3, alpha = 0.7) +
  geom_smooth(method = "lm", color = "blue", fill = "lightblue") +
  labs(title = "Tips vs Total Bill with Linear Regression",
       x = "Total Bill ($)",
       y = "Tip Amount ($)") +
  theme_minimal()
R
# scatter plot colored by day with facets for time of day
# (reuses the tips data frame loaded in the previous example)
ggplot(tips, aes(x = total_bill, y = tip, color = day)) +
  geom_point(size = 2, alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 0.5) +
  facet_wrap(~time) +
  scale_color_brewer(palette = "Set1", name = "Day") +
  labs(title = "Restaurant Tips Analysis by Time of Day",
       subtitle = "Colored by day of week",
       x = "Total Bill ($)",
       y = "Tip Amount ($)") +
  theme_minimal() +
  theme(legend.position = "bottom")