This calculator helps you analyze the relationship between two categorical variables by creating a cross-tabulation (contingency table) of their frequencies. It reveals patterns and associations by organizing your data into a matrix that shows how different categories of one variable relate to categories of another variable.
💡 Pro Tip: Each variable must have fewer than 10 unique categories for optimal analysis. For continuous variables, consider creating meaningful groups (like age ranges) with our Histogram Calculator before analysis.
Perfect for analyzing relationships like gender × voting preferences, education level × employment status, or treatment × outcome. Try the sample data to see how age groups relate to communication preferences, or upload your own data to uncover hidden associations.
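The tip about grouping a continuous variable can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own method: the bin edges, the labels, and the sample ages are assumptions chosen to mirror the age ranges used later on this page.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels, chosen only for illustration.
# EDGES holds the lower bound of the 2nd, 3rd and 4th groups.
EDGES = [25, 35, 45]
LABELS = ["18-24", "25-34", "35-44", "45+"]


def age_group(age):
    """Return the label of the range that contains `age`.

    Ages below 25 (including any below 18) fall into the first label,
    so filter out-of-range values beforehand if that matters.
    """
    return LABELS[bisect_right(EDGES, age)]


ages = [19, 24, 25, 31, 44, 45, 62]  # made-up sample values
groups = [age_group(a) for a in ages]
print(groups)
```

Once every value has a group label, the labels can be cross-tabulated like any other categorical variable.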
A contingency table, also known as a cross tabulation or crosstab, is a type of table in a matrix format that displays the frequency distribution of variables. It's used to record and analyze the relationship between two or more categorical variables.
The basic table shows the raw frequency (count) of each combination of categories. This is the most basic form of contingency table analysis.
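To make the counting step concrete, here is a small sketch that tallies raw observations into a table using only the Python standard library. The observations are made up for illustration, and the nested-dictionary layout is just one convenient way to hold the result.

```python
from collections import Counter

# Made-up observations: one (sex, smoker) pair per respondent.
observations = [
    ("Female", "No"), ("Male", "No"), ("Male", "Yes"),
    ("Female", "Yes"), ("Male", "No"), ("Female", "No"),
]

# Count how often each combination of categories occurs.
counts = Counter(observations)

# Collect the distinct categories of each variable, in sorted order.
rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# Arrange the counts as rows x columns; a missing combination counts as 0.
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```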
A percentage view converts the frequencies to percentages, showing proportional relationships.
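Percentages can be taken against different totals, and the choice changes what the numbers mean. The sketch below shows three common options (share of the row, share of the column, share of the grand total) on made-up counts. The helper `pct` and the variable names are hypothetical, introduced only for this example.

```python
# Made-up counts for illustration: rows are sex, columns are smoker status.
counts = {
    "Female": {"No": 30, "Yes": 10},
    "Male": {"No": 40, "Yes": 20},
}


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Totals needed for the three kinds of percentage.
row_totals = {r: sum(row.values()) for r, row in counts.items()}
col_totals = {}
for row in counts.values():
    for c, n in row.items():
        col_totals[c] = col_totals.get(c, 0) + n
grand_total = sum(row_totals.values())

# Row percentages: each row sums to about 100.
row_pct = {r: {c: pct(n, row_totals[r]) for c, n in row.items()}
           for r, row in counts.items()}
# Column percentages: each column sums to about 100.
col_pct = {r: {c: pct(n, col_totals[c]) for c, n in row.items()}
           for r, row in counts.items()}
# Total percentages: the whole table sums to about 100.
total_pct = {r: {c: pct(n, grand_total) for c, n in row.items()}
             for r, row in counts.items()}

print(row_pct)
print(col_pct)
print(total_pct)
```

Row percentages answer "within this row category, how are the column categories distributed?", while column percentages answer the reverse question.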
A smartphone company surveyed 1000 participants about their age group and preferred brand.
| Age Group | BrandA | BrandB | BrandC | Other |
|---|---|---|---|---|
| 18-24 | 80 | 100 | 60 | 10 |
| 25-34 | 120 | 90 | 70 | 20 |
| 35-44 | 100 | 70 | 80 | 50 |
| 45+ | 60 | 40 | 30 | 20 |
This contingency table helps the company understand how brand preferences vary across age groups, informing targeted marketing strategies and product development.
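One way to read the table above is to convert each age group's counts into row percentages and note the most popular brand per group. The sketch below does this with plain Python; the counts are copied from the table, while the variable names are assumptions made for the example.

```python
brands = ["BrandA", "BrandB", "BrandC", "Other"]

# Counts copied from the table above, one list per age group.
survey = {
    "18-24": [80, 100, 60, 10],
    "25-34": [120, 90, 70, 20],
    "35-44": [100, 70, 80, 50],
    "45+": [60, 40, 30, 20],
}

# Sanity check against the stated sample size of 1000 participants.
grand_total = sum(sum(counts) for counts in survey.values())

row_pct = {}
top_brand = {}
for group, counts in survey.items():
    row_total = sum(counts)
    # Share of each brand within this age group, in percent.
    row_pct[group] = [round(100 * n / row_total, 1) for n in counts]
    # Brand with the highest count in this age group.
    top_brand[group] = brands[counts.index(max(counts))]
    print(group, row_total, row_pct[group], top_brand[group])

print("Total participants:", grand_total)
```

Row percentages make groups of different sizes comparable: the 45+ group has the fewest respondents, yet its share for BrandA matches that of the 25-34 group.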
R:
library(tidyverse)

# Load the example tips dataset
tips <- read.csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# Base R: cross-tabulate sex by smoker
table(tips$sex, tips$smoker)

# dplyr: count each combination and add its share of the grand total
tips |>
  group_by(sex, smoker) |>
  summarise(
    count = n(),
    .groups = "drop"
  ) |>
  mutate(percentage = str_glue("{round(count / sum(count) * 100, 2)}%"))

Python:
import pandas as pd

# Load the example tips dataset
tips = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

# pandas: cross-tabulate sex by smoker
crosstab = pd.crosstab(tips['sex'], tips['smoker'])
print(crosstab)

# Group by: count each combination in long format
result = (
    tips
    .groupby(['sex', 'smoker'])
    .size()
    .reset_index(name='count')
)

# Percentages: each combination's share of the grand total
total = result['count'].sum()
result['percentage'] = result['count'].apply(lambda x: f"{round(x / total * 100, 2)}%")
print(result)