This calculator helps you analyze the relationship between two categorical variables by creating a cross-tabulation (contingency table) of their frequencies. It reveals patterns and associations by organizing your data into a matrix that shows how different categories of one variable relate to categories of another variable.
What You'll Get:
- Complete Contingency Table: Cross-tabulated frequencies with row and column totals
- Chi-Square Test: Statistical significance testing with p-values and degrees of freedom
- Association Measures: Cramér's V for strength of association, plus the Phi coefficient and Odds Ratios for 2×2 tables
- Rich Visualizations: Interactive heatmap, grouped bar charts, and proportional stacked bars
💡 Pro Tip: For best results, each variable should have fewer than 10 unique categories. For continuous variables, consider creating meaningful groups (like age ranges) with our Histogram Calculator before analysis.
Perfect for analyzing relationships like gender × voting preferences, education level × employment status, or treatment × outcome. Try our sample dataset to see how age groups relate to communication preferences, or upload your own data to uncover hidden associations.
Calculator
1. Load Your Data
2. Select Two Columns
Learn More
Understanding Contingency Tables
What is a Contingency Table?
A contingency table, also known as a cross-tabulation or crosstab, is a matrix-format table that displays the joint frequency distribution of two or more categorical variables. It is used to record and analyze the relationship between those variables.
Count Analysis
Shows the raw frequency of each combination of categories. This is the most basic form of contingency table analysis.
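As a minimal sketch of count analysis, the snippet below tallies each combination of categories from two parallel lists using only the Python standard library. The function name `cross_tabulate` and the six observations are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter


def cross_tabulate(rows, cols):
    """Count how often each (row category, column category) pair occurs."""
    if len(rows) != len(cols):
        raise ValueError("both variables need one value per observation")
    pair_counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Combinations that never occur get a count of 0 (Counter's default).
    return {r: {c: pair_counts[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical observations, for illustration only.
sex = ["F", "M", "F", "M", "F", "M"]
smoker = ["No", "No", "Yes", "No", "No", "Yes"]

counts = cross_tabulate(sex, smoker)
print(counts)
```

Each outer key is a row category and each inner key a column category, so `counts["F"]["No"]` is the number of observations that are both `F` and `No`.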
Percentage Analysis
Converts frequencies to percentages, showing proportional relationships:
- Row percentages (within each row)
- Column percentages (within each column)
- Total percentages (of grand total)
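The three percentage types differ only in the denominator: the row total, the column total, or the grand total. The sketch below makes that arithmetic explicit in plain Python. The helper `percentages` and the counts are hypothetical, and the code assumes no row or column total is zero.

```python
def percentages(table, kind="total"):
    """Convert a nested dict of counts into percentages.

    kind="row":    each cell divided by its row total
    kind="column": each cell divided by its column total
    kind="total":  each cell divided by the grand total
    """
    if kind not in ("row", "column", "total"):
        raise ValueError("kind must be 'row', 'column', or 'total'")

    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for c, n in cells.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())

    def denominator(r, c):
        if kind == "row":
            return row_totals[r]
        if kind == "column":
            return col_totals[c]
        return grand_total

    return {
        r: {c: round(100 * n / denominator(r, c), 1) for c, n in cells.items()}
        for r, cells in table.items()
    }


# Hypothetical counts, for illustration only.
counts = {"F": {"No": 30, "Yes": 10}, "M": {"No": 20, "Yes": 40}}

row_pct = percentages(counts, "row")
col_pct = percentages(counts, "column")
total_pct = percentages(counts, "total")
print(row_pct)
print(col_pct)
print(total_pct)
```

Row percentages sum to 100 across each row and column percentages sum to 100 down each column, apart from rounding. Total percentages sum to 100 over the whole table.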
Tips for Interpretation
Look for:
- Patterns in the data
- Unexpected high/low values
- Row/column trends
- Overall distributions
Consider:
- Sample size adequacy
- Missing data patterns
- Practical significance
- Context of the data
Market Research Example: Smartphone Brand Preference
A smartphone company surveyed 1000 participants about their age group and preferred brand.
Frequency Distribution
Age Group | BrandA | BrandB | BrandC | Other |
---|---|---|---|---|
18-24 | 80 | 100 | 60 | 10 |
25-34 | 120 | 90 | 70 | 20 |
35-44 | 100 | 70 | 80 | 50 |
45+ | 60 | 40 | 30 | 20 |
[Chart: Brand Preferences by Age Group (Absolute Numbers)]
[Chart: Brand Preferences by Age Group (Percentage)]
Key Insights:
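- BrandA has its highest count in the 25-34 age group (120 of 300 respondents, 40%).
- BrandB is the top choice in the 18-24 age group (100 of 250 respondents, 40%).
- Older age groups (35+) show more diverse preferences: about 17% of the 35-44 group and 13% of the 45+ group chose Other, compared with 4% to 7% of respondents under 35.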
This contingency table helps the company understand how brand preferences vary across age groups, informing targeted marketing strategies and product development.
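To check whether the pattern in the smartphone table is more than chance variation, the Pearson chi-square statistic and Cramér's V can be computed by hand from the counts. The sketch below uses plain Python and the frequencies from the table above. The function names are hypothetical, and the p-value is left out because it needs the chi-square distribution, which the standard library does not provide.

```python
import math


def chi_square(observed):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


def cramers_v(observed):
    """Cramér's V: sqrt(chi2 / (n * min(rows - 1, cols - 1)))."""
    statistic, _ = chi_square(observed)
    n = sum(sum(row) for row in observed)
    k = min(len(observed), len(observed[0])) - 1
    return math.sqrt(statistic / (n * k))


# Rows: 18-24, 25-34, 35-44, 45+; columns: BrandA, BrandB, BrandC, Other.
observed = [
    [80, 100, 60, 10],
    [120, 90, 70, 20],
    [100, 70, 80, 50],
    [60, 40, 30, 20],
]

chi2, dof = chi_square(observed)
v = cramers_v(observed)
print(f"chi-square = {chi2:.2f}, dof = {dof}, Cramér's V = {v:.3f}")
```

If SciPy is available, `scipy.stats.chi2_contingency(observed)` returns the same statistic along with a p-value and the expected counts. That function applies a continuity correction only to 2×2 tables, so it should agree with this sketch for the 4×4 table here.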
How to Create a Contingency Table in R
library(tidyverse)

# Load the example tips dataset
tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Base R: cross-tabulate sex by smoker
table(tips$sex, tips$smoker)

# dplyr: count each sex/smoker combination and add its share of the grand total
tips |>
  group_by(sex, smoker) |>
  summarise(
    count = n(),
    .groups = "drop"
  ) |>
  mutate(percentage = str_glue("{round(count / sum(count) * 100, 2)}%"))
How to Create a Contingency Table in Python
import pandas as pd

# Load the example tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Cross-tabulate sex by smoker
crosstab = pd.crosstab(tips['sex'], tips['smoker'])
print(crosstab)

# Long format: one row per sex/smoker combination with its count
result = (
    tips
    .groupby(['sex', 'smoker'])
    .size()
    .reset_index(name='count')
)

# Each combination's share of the grand total
total = result['count'].sum()
result['percentage'] = result['count'].apply(lambda x: f"{round(x / total * 100, 2)}%")
print(result)