Mean, Median & Mode
What Are Measures of Central Tendency?
Imagine you have a dataset: exam scores of 100 students, daily temperatures for the past month, or salaries of every employee at a company. These raw numbers could be dozens or millions of values — you need a way to summarize where the "center" of the dataset lies with a single representative value. That is the core goal of measures of central tendency.
Why do we need three different measures? Because no single number can perfectly capture every aspect of a data distribution. Each measure defines "center" from a different mathematical perspective:
- Mean — minimizes the sum of squared distances from all data points to the center. It uses every value in the dataset but is sensitive to extreme values.
- Median — minimizes the sum of absolute distances from all data points to the center. It is robust to outliers and more representative in skewed distributions.
- Mode — the most frequently occurring value. It is the only measure of central tendency applicable to categorical data (e.g., colors, brands).
Understanding the differences between these three — and when to use each — is one of the foundational skills in data analysis, statistics, and machine learning.
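As a quick illustration, the three measures can be computed with Python's standard `statistics` module. The score list below is made up for this sketch and does not come from the article.

```python
import statistics

# Hypothetical exam scores, chosen so the three measures differ.
scores = [70, 85, 85, 88, 90, 91, 95]

mean_value = statistics.mean(scores)      # sum of the values divided by their count
median_value = statistics.median(scores)  # middle value of the sorted data
mode_value = statistics.mode(scores)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Here the single low score of 70 pulls the mean below the median, while the mode simply reports the repeated value.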
The Mean (Arithmetic Average)
The mean is the most commonly used measure of central tendency. Its calculation is straightforward: add all values together, then divide by the count.
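Historical origin
The concept of averaging can be traced back to Babylonian times (around 300 BC), when astronomers used the average of multiple observations to improve the precision of their predictions of celestial positions. In the modern era, Carl Friedrich Gauss and Adrien-Marie Legendre formalized it in the early 19th century as the core concept of the method of least squares, laying the foundation for modern statistics.
Why the mean works: mathematical proof
The mean is not an arbitrary definition: it is the unique value that minimizes the sum of squared deviations between the data points and the "center". This can be proven with calculus by setting the derivative of that sum with respect to the center to zero.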
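The calculus argument for the mean as the minimizer of squared deviations can be sketched as follows. The symbols $S$ and $c$ are notation introduced here for the sum of squared deviations and the candidate center; they are not taken from the original text.

```latex
\begin{aligned}
S(c) &= \sum_{i=1}^{n} (x_i - c)^2 \\
\frac{dS}{dc} &= -2 \sum_{i=1}^{n} (x_i - c) = 0
\quad\Longrightarrow\quad
c = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \\
\frac{d^2 S}{dc^2} &= 2n > 0
\end{aligned}
```

The positive second derivative shows that this single stationary point is a minimum, so the arithmetic mean is the only value that minimizes the sum of squared deviations.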
This is why the mean is so important in regression analysis and least-squares fitting: it is naturally tied to the goal of minimizing squared error.
Variants of the mean
Weighted Mean
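A minimal sketch of a weighted mean, assuming the usual definition (sum of weight times value, divided by the sum of the weights). The function name `weighted_mean` and the grade data are hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical course grades weighted by credit hours.
grades = [90, 80, 70]
credit_hours = [4, 3, 1]

weighted = weighted_mean(grades, credit_hours)  # (360 + 240 + 70) / 8
unweighted = sum(grades) / len(grades)

print(weighted, unweighted)
```

The weighted result is higher than the plain average because the best grade carries the most credit hours.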
Geometric Mean
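A short sketch of why the geometric mean suits compounding quantities, using `statistics.geometric_mean` from the Python standard library (available from Python 3.8). The growth factors are invented for illustration.

```python
import statistics

# Hypothetical yearly growth factors: +50% in year one, -50% in year two.
growth_factors = [1.5, 0.5]

arithmetic = statistics.mean(growth_factors)            # 1.0, which suggests "no change"
geometric = statistics.geometric_mean(growth_factors)   # square root of 1.5 * 0.5

# Compounding the geometric mean over two years reproduces the real outcome.
final_value = geometric ** 2

print(arithmetic, geometric, final_value)
```

An investment that gains 50% and then loses 50% ends at 75% of its starting value. The geometric mean reflects that loss, while the arithmetic mean of the factors hides it.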
Harmonic Mean
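A small sketch of the harmonic mean applied to speeds, using `statistics.harmonic_mean` from the Python standard library. The trip below is a made-up example and assumes both legs cover the same distance.

```python
import statistics

# Hypothetical round trip: 60 km/h out, 40 km/h back, same distance each way.
speeds = [60, 40]

harmonic = statistics.harmonic_mean(speeds)  # 2 / (1/60 + 1/40)
arithmetic = statistics.mean(speeds)

# Cross-check with total distance / total time for a 120 km leg.
leg_km = 120
total_time_h = leg_km / 60 + leg_km / 40
average_speed = 2 * leg_km / total_time_h

print(harmonic, arithmetic, average_speed)
```

The true average speed matches the harmonic mean rather than the arithmetic mean, because more time is spent on the slower leg.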
Limitation of the Mean
The mean is highly sensitive to outliers. The reason is straightforward: the mean uses the exact numerical value of every data point — a single extreme value directly shifts the sum and pulls the mean away from where most data cluster.
Practical example: exam scores
Suppose 5 students score: 82, 85, 88, 90, 91. Mean = 87.2 — a good representation of the center.
But add one student who scored 20: the set becomes 20, 82, 85, 88, 90, 91. The mean drops to 76.0 — a value that does not represent most students' performance. Here the median of 86.5 is far more representative.
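The arithmetic in this example can be reproduced with the standard `statistics` module. This is only a check of the figures quoted above.

```python
import statistics

scores = [82, 85, 88, 90, 91]
with_outlier = scores + [20]  # one student scores far below the rest

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # the mean drops by more than 11 points
print(median_before, median_after)  # the median moves by 1.5 points
```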
The Median
The median is the value that splits a sorted dataset into two equal halves. If the number of data points is odd, the median is the middle value; if even, it is the average of the two middle values.
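Odd n: Median = $x_{(n+1)/2}$
Even n: Median = $\left(x_{n/2} + x_{n/2+1}\right) / 2$
Here $x_k$ denotes the k-th value after sorting the data in ascending order.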
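The odd and even cases can be sketched in a few lines of Python. The function name `median_of` is hypothetical; the standard library's `statistics.median` follows the same rule.

```python
def median_of(values):
    """Return the middle value of the sorted data (average of two middles if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle element (0-based index n // 2).
        return ordered[mid]
    # Even n: average of the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_of([5, 1, 3])       # sorted: 1, 3, 5
even_case = median_of([4, 1, 3, 2])   # sorted: 1, 2, 3, 4

print(odd_case, even_case)
```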
Historical origin
Francis Galton introduced and popularized the median in statistics in 1881. While the intuitive concept of a "middle value" is much older, Galton formally demonstrated its theoretical value as a measure of central tendency, especially its superiority when dealing with skewed data.
Why the median exists: mathematical intuition
Just as the mean minimizes the sum of squared deviations, the median minimizes the sum of absolute deviations: it is a value $c$ that makes $\sum_{i=1}^{n} |x_i - c|$ as small as possible.
This is the fundamental reason the median is insensitive to outliers: the absolute value function (unlike squaring) does not amplify the influence of extreme values.
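A brute-force numerical check can make this contrast concrete. The sketch below scans a grid of candidate centers for a small made-up dataset with one outlier. The grid search is only for illustration and is not how the mean or median are computed in practice.

```python
import statistics

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 100]

def sum_abs_dev(center):
    """Sum of absolute deviations from a candidate center."""
    return sum(abs(x - center) for x in data)

def sum_sq_dev(center):
    """Sum of squared deviations from a candidate center."""
    return sum((x - center) ** 2 for x in data)

# Candidate centers 0.0, 0.1, ..., 100.0.
candidates = [step / 10 for step in range(0, 1001)]

best_abs = min(candidates, key=sum_abs_dev)  # lands on the median
best_sq = min(candidates, key=sum_sq_dev)    # lands on the mean

print(best_abs, statistics.median(data))
print(best_sq, statistics.mean(data))
```

The outlier of 100 drags the squared-error minimizer far from the bulk of the data, while the absolute-error minimizer stays at the middle value.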
Why income and house prices use the median
Income and house price distributions are typically right-skewed: most values cluster at the low-to-middle range while a few extremely high values pull the mean far upward. This is why economists and government statistics agencies report median income rather than mean income — it more accurately reflects the economic situation of the "typical" citizen.
Bill Gates walks into a bar
A bar has 10 people, each earning about $50,000/year. Mean = Median ≈ $50,000.
Now Bill Gates walks in. For the sake of illustration, treat his net worth of roughly $100 billion as if it were his annual income. The mean of the 11 incomes jumps to roughly $9.1 billion, about 180,000x the original. But the median remains around $50,000, barely changed.
This classic example perfectly illustrates why the median should be used in the presence of extreme outliers.
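The figures in the bar example can be checked with a few lines of Python. The sketch assumes exactly $50,000 for each of the ten patrons and treats the $100 billion figure as a single income value.

```python
import statistics

incomes = [50_000] * 10                    # ten patrons at $50,000 each
with_gates = incomes + [100_000_000_000]   # add one value of $100 billion

mean_before = statistics.mean(incomes)
mean_after = statistics.mean(with_gates)
median_after = statistics.median(with_gates)
ratio = mean_after / mean_before           # how many times larger the mean became

print(mean_after, median_after, ratio)
```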
The Mode
The mode is the value that occurs most frequently in a dataset. Unlike the mean and median, the mode does not depend on numerical magnitude — only on frequency of occurrence.
Historical origin
Karl Pearson coined the term "mode" in 1895 (from the French la mode, meaning "fashion" or "trend" — i.e., "the most fashionable value"). Pearson was one of the founders of modern statistics, also responsible for the chi-squared test, correlation coefficient, and many other core concepts.
Why the mode exists
The mode is the only measure of central tendency applicable to categorical (nominal) data. You cannot calculate the "mean" or "median" of colors, but you can say "the most common color is blue" — that is the mode.
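For nominal data, finding the mode is just a matter of counting labels. A minimal sketch using `collections.Counter` from the standard library, with a made-up list of colors:

```python
from collections import Counter

# Hypothetical survey answers; the labels have no numeric meaning or order.
colors = ["blue", "red", "blue", "green", "blue", "red"]

counts = Counter(colors)
mode_color, mode_count = counts.most_common(1)[0]  # most frequent label and its count

print(mode_color, mode_count)
```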
Unimodal, bimodal, and multimodal distributions
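- Unimodal: one mode, which is consistent with data drawn from a single population. Example: heights of adult males.
- Bimodal: two modes, which often indicates that the data is a mixture of two distinct populations. A commonly cited example is height data for all adults, where males and females each contribute a peak, although in real data the two peaks can overlap enough to blur together.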
- Multimodal: more than two modes — may indicate multiple subgroups or discrete preference categories.
Discovering that data is multimodal is often more informative than the mode value itself — it signals that distinct subpopulations may exist and should be analyzed separately.
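For discrete data, `statistics.multimode` (Python 3.8+) returns every value tied for the highest frequency, which gives a simple way to spot more than one mode. The datasets below are invented. For continuous measurements, exact ties are rare, so a histogram or density estimate would be needed instead, and this sketch does not cover that case.

```python
import statistics

# Hypothetical discrete datasets.
one_peak = [1, 2, 2, 3]
two_peaks = [1, 1, 2, 5, 5]
no_repeats = [1, 2, 3]

modes_one = statistics.multimode(one_peak)     # a single most frequent value
modes_two = statistics.multimode(two_peaks)    # two values tie for the top count
modes_none = statistics.multimode(no_repeats)  # every value ties, so all are returned

print(modes_one, modes_two, modes_none)
```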
When to Use Which Measure?
Choosing the right measure of central tendency depends on the type and distribution shape of your data. The following decision table can guide your choice:
| Situation | Best Measure | Why |
|---|---|---|
| Symmetric numerical data | Mean | Mean = Median = Mode here; the mean uses the most information |
| Skewed data (income, house prices) | Median | Robust to outliers, reflects the "typical" value |
| Categorical data (colors, brands) | Mode | The only option for non-numerical data |
| Growth rates, investment returns | Geometric mean | Correctly handles compounding; does not overestimate annual returns |
| Rates and ratios (speed, P/E ratio) | Harmonic mean | Correctly averages "per-unit" quantities |
| Has outliers but you do not want to discard them entirely | Trimmed mean | Remove top and bottom percentages, then average; balances robustness and information |
| Need to identify subgroups in data | Mode | Multimodal distributions reveal mixture populations |
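The trimmed mean from the table can be sketched as follows. The function name `trimmed_mean`, the 10% trimming proportion, and the data are all assumptions made for this example; libraries differ in how they round the number of values to drop.

```python
def trimmed_mean(values, proportion):
    """Drop `proportion` of the values from each end of the sorted data, then average."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    if not values:
        raise ValueError("trimmed mean of empty data is undefined")
    ordered = sorted(values)
    drop = int(len(ordered) * proportion)  # values removed from each tail (rounded down)
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)

# Hypothetical scores with one very low and one very high value.
data = [20, 82, 85, 88, 90, 91, 93, 94, 95, 400]

plain = sum(data) / len(data)
trimmed = trimmed_mean(data, 0.10)  # drops the single lowest and highest value

print(plain, trimmed)
```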
Quick decision rule
Step 1: Is the data numerical or categorical? If categorical → use the mode.
Step 2: Is the distribution symmetric? If yes → use the mean.
Step 3: Is there obvious skewness or outliers? → use the median.
Relationship Between Mean, Median, and Mode
The relative positions of the three measures depend on the skewness of the distribution. Understanding this relationship lets you quickly infer the shape of a distribution just by comparing the three values.
Symmetric distribution
The normal distribution is the classic example. All three measures coincide at the center.
Right-skewed (positive skew)
The long right tail pulls the mean to the right, so typically Mode < Median < Mean. Typical examples: income distribution, house prices.
Left-skewed (negative skew)
The long left tail pulls the mean to the left, so typically Mean < Median < Mode. Typical examples: retirement age, exam scores on an easy test.
Pearson's empirical rule
Karl Pearson proposed an approximate relationship linking all three:
Mean − Mode ≈ 3 × (Mean − Median)
Equivalently: Mode ≈ 3 × Median − 2 × Mean
This is an approximation that holds for moderately skewed unimodal distributions. It may be inaccurate for heavily skewed or multimodal distributions, but it is remarkably useful as a quick estimation tool — if you know the mean and median, you can roughly estimate where the mode lies.
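As a rough sketch, the estimate Mode ≈ 3 × Median − 2 × Mean is a one-line calculation. The function name `estimate_mode` and the income figures are hypothetical, and the result is only as good as the approximation itself.

```python
def estimate_mode(mean, median):
    """Approximate the mode via Pearson's empirical rule: 3 * median - 2 * mean."""
    return 3 * median - 2 * mean

# Hypothetical right-skewed income data: mean above median.
estimated = estimate_mode(mean=60_000, median=50_000)

print(estimated)
```

With the mean above the median, the estimated mode falls below both, which matches the ordering expected for a right-skewed distribution.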
Frequently Asked Questions
In everyday language, "average" and "mean" usually refer to the same thing — the arithmetic mean. However, in strict statistical terminology, "average" is a broader concept that can include the arithmetic mean, geometric mean, harmonic mean, median, or even the mode. "Mean" typically refers specifically to the arithmetic mean x̄ = ∑xi / n.
If every value in the dataset occurs the same number of times (e.g., 1, 2, 3, 4, 5 each appearing once), then there is no mode (the distribution is called "amodal"). Some textbooks say "all values are modes," but this is not practically meaningful. This calculator lists all values that share the highest frequency.
The mean and median are equal when the distribution is perfectly symmetric. The normal (Gaussian) distribution is the most common example. The uniform distribution also satisfies this. If you find a large gap between mean and median, it typically indicates skewness or outliers — in such cases, the median is the better choice for describing a "typical" value.
Comparing the mean, median, and mode can serve as a rough preliminary check for normality, but not as a formal test. If mean ≈ median ≈ mode, the data may be symmetric (but not necessarily normal; a uniform distribution is also symmetric). Formal checks use tests such as Shapiro-Wilk or Kolmogorov-Smirnov, typically alongside a graphical check such as a Q-Q plot. The differences among the three primarily indicate the direction and degree of skewness.
Measures of central tendency (mean, median, mode) describe the center of the data, while variance and standard deviation describe how spread out the data is around that center. The two are complementary: knowing only that the mean is 50 does not tell you whether the data cluster between 48 and 52 or are spread across 0 to 100. Standard deviation is the square root of variance and has the same units as the original data, making it more practical. This calculator computes both sets of measures.
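The point about identical means with different spread can be illustrated with the standard `statistics` module. Both datasets are made up, and `pstdev` computes the population standard deviation (use `stdev` for a sample).

```python
import statistics

# Two hypothetical datasets that share a mean of 50.
tight = [48, 49, 50, 51, 52]
wide = [0, 25, 50, 75, 100]

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)

spread_tight = statistics.pstdev(tight)  # population standard deviation
spread_wide = statistics.pstdev(wide)

print(mean_tight, mean_wide)
print(spread_tight, spread_wide)
```

The means are identical, but the standard deviation of the second dataset is roughly 25 times larger, which is exactly the information the mean alone cannot convey.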