StatsCalculators.com

Mean, Median, Mode

Created:August 24, 2024

This calculator helps you analyze the central tendency of your data distribution. It calculates the arithmetic mean (average), median (middle value), and mode (most frequent value), helping you understand the typical values and patterns in your dataset. It also provides a histogram with the mean, median, and mode marked for easy visualization if a numeric column is selected. A bar chart is also provided to visualize the frequency distribution of a categorical column.

Quick Calculator

Need a quick calculation? Enter your numbers below:

Calculator

1. Load Your Data

Note: Column names will be converted to snake_case (e.g., "Product ID" → "product_id") for processing.

2. Select Columns & Options

Related Calculators

Learn More

Mean (Arithmetic Mean)

Definition

The arithmetic average of all values, representing the central tendency of the data.

Formula

Mean=1ni=1nxi=x1+x2+...+xnn\text{Mean} = \frac{1}{n} \sum_{i = 1}^{n} x_i = \frac{x_1 + x_2 + ... + x_n}{n}

Example

For the numbers: 2,4,6,82, 4, 6, 8

Mean=2+4+6+84=204=5\text{Mean} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5

Key Points

  • Sensitive to extreme values (outliers)
  • Best used when data is symmetrically distributed

Comparison of Different Means

TypeFormulaExample
Arithmetic Mean1ni=1nxi=x1+x2+...+xnn\frac{1}{n}\sum_{i=1}^n x_i = \frac{x_1 + x_2 + ... + x_n}{n}2+4+6+84=5\frac{2 + 4 + 6 + 8}{4} = 5
Geometric Meani=1nxin=x1×x2×...×xnn\sqrt[n]{\prod_{i=1}^n x_i} = \sqrt[n]{x_1 \times x_2 \times ... \times x_n}2×4×6×844.4\sqrt[4]{2 \times 4 \times 6 \times 8} \approx 4.4
Harmonic Meanni=1n1xi=n1x1+1x2+...+1xn\frac{n}{\sum_{i=1}^n \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n}}412+14+16+183.8\frac{4}{\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8}} \approx 3.8

Arithmetic Mean(Current)

Best For:
  • Simple averages of quantities
  • Calculating central tendency
  • Equal weight to all values
  • Linear relationships
Common Uses:
  • Test scores
  • Heights/weights
  • Daily temperatures
  • Income levels
Limitations:
  • Sensitive to outliers
  • Not ideal for ratios or rates

Geometric Mean

Best For:
  • Growth rates
  • Percentage changes
  • Ratios and proportions
  • Exponential relationships
Common Uses:
  • Investment returns
  • Population growth
  • Interest rates
  • Price indices
Limitations:
  • Only works with positive numbers
  • More complex calculation

Harmonic Mean

Best For:
  • Rates and speeds
  • Per-unit quantities
  • Reciprocal relationships
  • Density measurements
Common Uses:
  • Average speed over different distances
  • Price per unit calculations
  • Density measurements
  • Circuit calculations (parallel resistors)
Limitations:
  • Only works with positive numbers
  • Gives more weight to smaller values

When to Use Each Mean

Use Arithmetic Mean when you need a simple average and all values should have equal weight

Use Geometric Mean when dealing with growth rates, returns, or multiplicative changes

Use Harmonic Mean when working with rates, speeds, or other measures where using reciprocals makes sense

Median

Definition

The middle value in a sorted set of numbers. For an even number of values, it's the average of the two middle numbers.

Formula

Median={x(n+1)/2if n is oddxn/2+xn/2+12if n is even\text{Median} = \begin{cases} x_{(n+1)/2} & \text{if n is odd} \\ \frac{x_{n/2} + x_{n/2 + 1}}{2} & \text{if n is even} \end{cases}

Example

For odd number of values: 1,3,5,7,91, 3, 5, 7, 9

Median=x(n+1)/2=x(5+1)/2=x3=5\text{Median} = x_{(n+1)/2} = x_{(5+1)/2} = x_3 = 5

For even number of values: 1,3,5,71, 3, 5, 7

Median=xn/2+xn/2+12=x4/2+x4/2+12=x2+x32=3+52=4\text{Median} = \frac{x_{n/2} + x_{n/2 + 1}}{2} = \frac{x_{4/2} + x_{4/2 + 1}}{2} = \frac{x_2 + x_3}{2} = \frac{3 + 5}{2} = 4

Key Points

  • Not affected by extreme values (outliers)
  • Best used when data is skewed or contains outliers

Mode

Definition

The most frequently occurring value(s) in a set of numbers. A dataset can have no mode, one mode, or multiple modes.

Formula

Mode=value(s) with the highest frequency\text{Mode} = \text{value(s) with the highest frequency}

Example

For the numbers with multiple modes: 1,2,2,3,4,4,51, 2, 2, 3, 4, 4, 5

Mode=2,4\text{Mode} = 2, 4

Because 2 and 4 each appear twice, more than any other number

For numbers 1,2,3,41, 2, 3, 4, there is no mode because all numbers appear exactly once.

Key Points

  • Can have multiple modes (bimodal, trimodal, etc.) or no mode
  • Particularly useful for categorical data and discrete numerical data

Mean vs Median: When to Use Which?

Mean (Average)

Best for:Symmetric, normally distributed data
Sensitive to:Outliers and extreme values
Uses:All data points in calculation

Median (Middle Value)

Best for:Skewed data or data with outliers
Resistant to:Outliers and extreme values
Uses:Only middle value(s) position

Example: Impact of Outliers

Normal data: Annual salaries in thousands:40,42,45,48,5040, 42, 45, 48, 50

Mean:\ rac{40+42+45+48+50}{5} = 45
Median:4545 (middle value)

With outlier: Adding CEO salary:40,42,45,48,50040, 42, 45, 48, 500

Mean:\ rac{40+42+45+48+500}{5} = 135↑ Dramatically increased!
Median:4545 (still middle value)→ Unchanged

Decision Guide

Use Mean when:
  • • Data is approximately normally distributed
  • • No significant outliers present
  • • You want to include all data points in analysis
Use Median when:
  • • Data is skewed (left or right)
  • • Outliers are present
  • • You want a robust measure of central tendency

Quick Reference

Symmetric Data
Mean ≈ Median
Right Skewed
Mean > Median
Left Skewed
Mean < Median

Calculating Mean, Median, and Mode from a Bar Chart

Mean Calculation

1. Multiply each value by its frequency:

2 × 1 = 2
4 × 2 = 8
6 × 3 = 18
8 × 2 = 16
10 × 1 = 10

2. Sum all products: 2 + 8 + 18 + 16 + 10 = 54

3. Sum all frequencies: 1 + 2 + 3 + 2 + 1 = 9

4. Divide: 54 ÷ 9 = 6

Mean = 6

Median Calculation

1. List all values (including duplicates):

2, 4, 4, 6, 6, 6, 8, 8, 10

2. Find middle position: (n+1) ÷ 2 = (9+1) ÷ 2 = 5th

3. Find 5th value in ordered list

Median = 6

Mode Calculation

1. Count frequency of each value:

2: occurs 1 time
4: occurs 2 times
6: occurs 3 times
8: occurs 2 times
10: occurs 1 time

2. Find highest frequency (3)

3. Identify value(s) with that frequency

Mode = 6

Calculating Mean, Median, and Mode in R

Here is an example of how to calculate the mean, median, and mode of a dataset in R using the tidyverse package.

R
library(tidyverse)

tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

central_measures <- tips |>
  summarise(
    mean_tip = mean(tip), # 2.9983
    median_tip = median(tip) # 2.9
  )

day_counts <- table(tips$day)
mode_days <- names(day_counts[day_counts == max(day_counts)]) # Sat

# histogram of tip with mean, median vertical lines
ggplot(tips, aes(x = tip)) +
  geom_histogram(binwidth = 0.5, fill = "darkblue", alpha = 0.7) +
  geom_vline(aes(xintercept = central_measures$mean_tip, color = "Mean"), 
             linetype = "dashed", linewidth = 1) +
  geom_vline(aes(xintercept = central_measures$median_tip, color = "Median"), 
             linetype = "dotted", linewidth = 1) +
  scale_color_manual(name = "Statistics", 
                     values = c("Mean" = "red", "Median" = "green")) +
  labs(title = "Distribution of Tip Amounts",
       x = "Tip Amount ($)",
       y = "Frequency") +
  theme_minimal() +
  theme(legend.position = "bottom")

# bar chart of day frequencies
day_freq <- tips |>
  count(day) |>
  arrange(desc(n))

# bar chart with calculated frequencies
ggplot(day_freq, aes(x = reorder(day, -n), y = n)) +
  geom_bar(stat = "identity", fill = "steelblue", width = 0.6) +
  geom_text(aes(label = n), vjust = -0.5, size = 3.5) +
  labs(title = "Frequency of Restaurant Visits by Day",
       x = "Day of Week",
       y = "Number of Visits") +
  theme_minimal()

Calculating Mean, Median, and Mode in Python

Here is an example of how to calculate the mean, median, and mode of a dataset in Python using the pandas, numpy, and scipy libraries.

Python
import pandas as pd
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Load the tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Calculate central measures for tip amounts
central_measures = {
    'mean_tip': tips['tip'].mean(),      # 2.9983
    'median_tip': tips['tip'].median(),  # 2.9
}

# Calculate mode for categorical data (day)
mode_day = tips['day'].mode()[0]  # Sat

print(f"Mean tip: {central_measures['mean_tip']:.2f}")
print(f"Median tip: {central_measures['median_tip']:.2f}")
print(f"Most common day: {mode_day}")

# Create histogram with mean and median lines
plt.figure(figsize=(10, 6))
plt.hist(tips['tip'], bins=20, alpha=0.7, color='darkblue', edgecolor='black')
plt.axvline(central_measures['mean_tip'], color='red', linestyle='--', 
           linewidth=2, label=f"Mean: {central_measures['mean_tip']:.2f}")
plt.axvline(central_measures['median_tip'], color='green', linestyle=':', 
           linewidth=2, label=f"Median: {central_measures['median_tip']:.2f}")
plt.xlabel('Tip Amount ($)')
plt.ylabel('Frequency')
plt.title('Distribution of Tip Amounts')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# Bar chart of day frequencies
day_counts = tips['day'].value_counts().sort_values(ascending=False)

plt.figure(figsize=(8, 6))
bars = plt.bar(day_counts.index, day_counts.values, color='steelblue', width=0.6)
plt.xlabel('Day of Week')
plt.ylabel('Number of Visits')
plt.title('Frequency of Restaurant Visits by Day')

# Add value labels on bars
for bar, value in zip(bars, day_counts.values):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.5, 
             str(value), ha='center', va='bottom')

plt.show()

# Summary statistics
print("\nSummary Statistics:")
print(tips['tip'].describe())

Calculating Mean, Median, and Mode in SQL

Here is an example of how to calculate the mean, median, and mode of a dataset in SQL using aggregate functions and window functions.

Csv
employee_id,department,salary
1,Sales,45000
2,Sales,52000
3,Engineering,75000
4,Marketing,48000
5,Engineering,72000
6,Sales,45000
7,Marketing,51000
8,Engineering,78000
9,Sales,49000
10,Marketing,48000
SQL
-- Create and populate the employee table
CREATE TABLE employees (
    employee_id INT,
    department VARCHAR(50),
    salary INT
);

INSERT INTO employees VALUES
(1, 'Sales', 45000),
(2, 'Sales', 52000),
(3, 'Engineering', 75000),
(4, 'Marketing', 48000),
(5, 'Engineering', 72000),
(6, 'Sales', 45000),
(7, 'Marketing', 51000),
(8, 'Engineering', 78000),
(9, 'Sales', 49000),
(10, 'Marketing', 48000);

-- Calculate mean, median, and mode for salaries
WITH salary_statistics AS (
  SELECT 
    -- Mean calculation
    AVG(salary) as mean_salary,
    
    -- Median calculation (PostgreSQL/SQL Server)
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) as median_salary,
    
    -- Count total records
    COUNT(*) as total_employees
  FROM employees
),

-- Mode calculation for categorical data (most frequent department)
dept_mode AS (
  SELECT 
    department as mode_department,
    COUNT(*) as dept_frequency
  FROM employees
  GROUP BY department
  ORDER BY COUNT(*) DESC
  LIMIT 1
),

-- Mode calculation for numerical data (most frequent salary)
salary_mode AS (
  SELECT 
    salary as mode_salary,
    COUNT(*) as salary_frequency
  FROM employees
  GROUP BY salary
  ORDER BY COUNT(*) DESC
  LIMIT 1
)

-- Combine all statistics
SELECT 
  ROUND(ss.mean_salary, 2) as mean_salary,      -- 54300.00
  ss.median_salary,                             -- 50500.00
  sm.mode_salary,                               -- 45000 (appears twice)
  dm.mode_department,                           -- Sales (appears 4 times)
  ss.total_employees                            -- 10
FROM salary_statistics ss
CROSS JOIN dept_mode dm
CROSS JOIN salary_mode sm;

-- Alternative median calculation (works in most SQL databases)
WITH ordered_salaries AS (
  SELECT 
    salary,
    ROW_NUMBER() OVER (ORDER BY salary) as row_num,
    COUNT(*) OVER () as total_count
  FROM employees
)
SELECT AVG(salary) as median_salary
FROM ordered_salaries
WHERE row_num IN (
  FLOOR((total_count + 1) / 2.0),
  CEIL((total_count + 1) / 2.0)
);

-- Department analysis
SELECT 
  department,
  COUNT(*) as employee_count,
  ROUND(AVG(salary), 0) as avg_salary,
  MIN(salary) as min_salary,
  MAX(salary) as max_salary
FROM employees
GROUP BY department
ORDER BY avg_salary DESC;

-- Results:
-- Engineering: 3 employees, avg $75,000, min $72,000, max $78,000
-- Sales: 4 employees, avg $47,750, min $45,000, max $52,000  
-- Marketing: 3 employees, avg $49,000, min $48,000, max $51,000

Calculating Mean, Median, and Mode in Excel

Here is an example of how to calculate the mean, median, and mode of a dataset in Excel using built-in functions with product sales data.

Csv
Product,Category,Units_Sold
Widget A,Electronics,15
Widget B,Electronics,22
Gadget X,Home,18
Tool Pro,Tools,15
Smart TV,Electronics,12
Lamp Z,Home,25
Hammer,Tools,20
Phone Y,Electronics,18
Chair,Home,15
Drill,Tools,22
Excel
// Excel worksheet layout:
//    A         B           C
// 1  Product   Category    Units_Sold
// 2  Widget A  Electronics 15
// 3  Widget B  Electronics 22
// 4  Gadget X  Home        18
// 5  Tool Pro  Tools       15
// 6  Smart TV  Electronics 12
// 7  Lamp Z    Home        25
// 8  Hammer    Tools       20
// 9  Phone Y   Electronics 18
// 10 Chair     Home        15
// 11 Drill     Tools       22

// Basic statistics for Units_Sold column (column C)
=AVERAGE(C2:C11)    // Mean: 18.2
=MEDIAN(C2:C11)     // Median: 18
=MODE.SNGL(C2:C11)  // Mode: 15 (appears 3 times)

// Category frequency analysis in columns E and F:
E1: "Category"      F1: "Count"
E2: "Electronics"   F2: =COUNTIF(B2:B11,"Electronics")  // 4
E3: "Home"          F3: =COUNTIF(B2:B11,"Home")         // 3  
E4: "Tools"         F4: =COUNTIF(B2:B11,"Tools")        // 3

// Find mode category:
=INDEX(E2:E4, MATCH(MAX(F2:F4), F2:F4, 0))  // Electronics

// Summary statistics table in columns H and I:
H1: "Statistic"           I1: "Value"
H2: "Mean Units"          I2: =ROUND(AVERAGE(C2:C11), 1)      // 18.2
H3: "Median Units"        I3: =MEDIAN(C2:C11)                 // 18
H4: "Mode Units"          I4: =MODE.SNGL(C2:C11)              // 15
H5: "Mode Category"       I5: =INDEX(E2:E4,MATCH(MAX(F2:F4),F2:F4,0))  // Electronics
H6: "Total Products"      I6: =COUNT(C2:C11)                  // 10
H7: "Min Units"           I7: =MIN(C2:C11)                    // 12
H8: "Max Units"           I8: =MAX(C2:C11)                    // 25
H9: "Standard Deviation"  I9: =ROUND(STDEV.S(C2:C11), 1)     // 4.3

// Category analysis in columns K, L, M:
K1: "Category"      L1: "Avg Units"    M1: "Total Units"
K2: "Electronics"   L2: =AVERAGEIF(B2:B11,"Electronics",C2:C11)  // 16.8
K3: "Home"          L3: =AVERAGEIF(B2:B11,"Home",C2:C11)         // 19.3
K4: "Tools"         L4: =AVERAGEIF(B2:B11,"Tools",C2:C11)        // 19.0

M2: =SUMIF(B2:B11,"Electronics",C2:C11)  // 67
M3: =SUMIF(B2:B11,"Home",C2:C11)         // 58  
M4: =SUMIF(B2:B11,"Tools",C2:C11)        // 57

// Performance classification:
// In column D, add performance labels
D1: "Performance"
D2: =IF(C2>=20,"High",IF(C2>=15,"Medium","Low"))
// Copy formula down to D11

// Count performance levels:
O1: "Performance"   P1: "Count"
O2: "High"          P2: =COUNTIF(D2:D11,"High")    // 4 products
O3: "Medium"        P3: =COUNTIF(D2:D11,"Medium")  // 3 products  
O4: "Low"           P4: =COUNTIF(D2:D11,"Low")     // 3 products

// Conditional formatting for units sold:
// Select C2:C11, then Home > Conditional Formatting > Color Scales
// Or create custom rules:
// Green: =C2>=20    (High performers)
// Yellow: =AND(C2>=15,C2<20)  (Medium performers)  
// Red: =C2<15       (Low performers)

// Chart creation:
// 1. Select category data (E2:F4) for category distribution
// 2. Insert > Pie Chart or Column Chart
// 3. Title: "Product Distribution by Category"

// For units sold distribution:
// 1. Select performance data (O2:P4)  
// 2. Insert > Column Chart
// 3. Title: "Products by Performance Level"

// Advanced formulas:
// Multiple modes (if they exist):
=MODE.MULT(C2:C11)

// Products above average:
=COUNTIF(C2:C11,">"&AVERAGE(C2:C11))  // 4 products

// Quartile analysis:
=QUARTILE(C2:C11,1)  // Q1: 15
=QUARTILE(C2:C11,3)  // Q3: 22

// Identify outliers (values beyond 1.5 * IQR):
// IQR = Q3 - Q1 = 22 - 15 = 7
// Lower bound: Q1 - 1.5*IQR = 15 - 10.5 = 4.5
// Upper bound: Q3 + 1.5*IQR = 22 + 10.5 = 32.5
// No outliers in this dataset