Standard Deviation
What is Standard Deviation?
Standard Deviation (SD) is a foundational concept in psychometrics, the science of measuring mental capacities. In straightforward terms, it is a mathematical measure of how “spread out” a set of numbers is around their average. A small standard deviation means scores cluster tightly together; a large one means they are spread widely.
In IQ testing, the standard deviation transforms a raw score into a meaningful ranking: it tells you not just how a person performed, but how their performance compares to the entire population. Without understanding standard deviation, an IQ score is just an arbitrary number. Because different tests use different scales, standard deviation is the universal conversion key that allows meaningful comparison across instruments, eras, and populations.
The Formula Behind the Concept
Standard deviation is calculated by:
- Computing the mean (average) of all scores
- Calculating how far each score deviates from the mean
- Squaring those deviations (to eliminate negative signs)
- Averaging the squared deviations — this is the variance
- Taking the square root of the variance — this is the standard deviation
In formal notation: SD = √[ Σ(x − μ)² / N ], where x is each individual score, μ is the mean of all scores, and N is the number of scores.
For IQ testing, this process is applied to a large, representative standardization sample — typically thousands of people stratified by age, sex, education level, and geographic region — to ensure the resulting distribution accurately reflects the true population.
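The calculation steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not a norming procedure: the function name `population_sd` and the raw scores are hypothetical, and the result is cross-checked against the standard library's `statistics.pstdev`.

```python
import math
import statistics

def population_sd(scores):
    """Population standard deviation, following the five steps literally."""
    mean = sum(scores) / len(scores)            # step 1: mean of all scores
    deviations = [x - mean for x in scores]     # step 2: distance of each score from the mean
    squared = [d ** 2 for d in deviations]      # step 3: square the deviations
    variance = sum(squared) / len(squared)      # step 4: average of the squares (variance)
    return math.sqrt(variance)                  # step 5: square root (standard deviation)

# Hypothetical raw scores, for illustration only (not real test data).
raw_scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd = population_sd(raw_scores)
print(sd)                              # 2.0
print(statistics.pstdev(raw_scores))   # same value from the standard library
```

The sketch divides by N, matching the formula above; `statistics.stdev` would divide by N − 1 instead, which is the usual choice when estimating from a sample.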
The Three Standard Scales
The same underlying intelligence can be reported at very different numbers depending on which test was administered — a source of enormous confusion in popular discussions of IQ:
| Test | Mean | Standard Deviation | Score equivalent to IQ 130 (SD 15) |
|---|---|---|---|
| WAIS-IV, WISC-V, SB5 | 100 | 15 | 130 |
| Cattell Culture Fair (CFIT) | 100 | 24 | 148 |
| Stanford-Binet (old editions) | 100 | 16 | 132 |
The critical insight: An IQ of 130 (SD 15), 148 (SD 24), and 132 (SD 16) are identical in terms of population rarity. All three sit exactly two standard deviations above the mean, at roughly the 98th percentile (97.7th, strictly), which is the threshold for Mensa. The different numbers reflect different scale conventions, not different levels of intelligence.
This is precisely why Mensa and other high-IQ societies publish qualifying scores by test rather than by a single cutoff number — without knowing which test was used, the number alone is meaningless.
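As a rough sketch of how the equivalents in the table are obtained, a score can be converted by measuring its distance from the mean in SD units on the source scale and re-expressing that distance on the target scale. The helper name `convert_score` is hypothetical, and a mean of 100 is assumed for every scale, as in the table.

```python
def convert_score(score, from_sd, to_sd, mean=100):
    """Re-express a score from one SD scale on another (both assumed to share the same mean)."""
    sd_units = (score - mean) / from_sd   # distance from the mean in standard deviations
    return mean + sd_units * to_sd        # same distance, expressed on the target scale

print(convert_score(130, from_sd=15, to_sd=24))   # 148.0 (Cattell scale)
print(convert_score(130, from_sd=15, to_sd=16))   # 132.0 (old Stanford-Binet scale)
print(convert_score(148, from_sd=24, to_sd=15))   # 130.0 (back to the SD 15 scale)
```

This only rescales the number; it does not account for differences in norms or content between the tests themselves.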
Reading the Bell Curve: Rarity by Standard Deviation
Because IQ scores follow a normal distribution (bell curve), standard deviations provide exact mathematical predictions of population rarity:
±1 SD: IQ 85–115 (SD 15 scale)
- 68.27% of the population
- The “average” range: roughly 2 in 3 people
- Can complete secondary education; functions effectively across most everyday demands
±2 SD: IQ 70–130
- 95.45% of the population
- Below 70 (−2 SD): Intellectual disability range (~2.3% of population); difficulty with many functional daily tasks without support
- Above 130 (+2 SD): Gifted/Superior range (~2.3%); Mensa qualification threshold; common cutoff for gifted education programs
±3 SD: IQ 55–145
- 99.73% of the population
- Above 145 (+3 SD): Highly gifted range, approximately 1 in 740 people; this comfortably exceeds the Intertel cutoff (top 1%, about IQ 135) and falls just short of the Triple Nine Society cutoff (top 0.1%, about IQ 146)
±4 SD: IQ 40–160
- 99.994% of the population
- Above 160 (+4 SD): Approximately 1 in 31,600 people, the range associated historically with the most cognitively exceptional documented individuals
±5 SD: IQ 25–175
- 99.99994% of the population
- Above 175: Approximately 1 in 3.5 million; statistically, only about 2,300 such individuals would exist in a global population of 8 billion if IQ continued to follow a perfect normal distribution
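The percentages in this section can be reproduced with Python's standard library, assuming an idealized normal distribution with mean 100 and SD 15. The function names `share_within` and `share_above` are hypothetical, and real test norms deviate from the ideal curve, especially in the tails.

```python
from statistics import NormalDist

# Idealized IQ distribution on the SD 15 scale (an assumption, not real norm data).
iq_scale = NormalDist(mu=100, sigma=15)

def share_within(k):
    """Fraction of the population within ±k standard deviations of the mean."""
    return iq_scale.cdf(100 + 15 * k) - iq_scale.cdf(100 - 15 * k)

def share_above(k):
    """Fraction above +k SD, computed from the lower tail by symmetry."""
    return iq_scale.cdf(100 - 15 * k)

for k in range(1, 6):
    low, high = 100 - 15 * k, 100 + 15 * k
    print(f"±{k} SD (IQ {low}-{high}): {share_within(k):.5%} within, "
          f"about 1 in {1 / share_above(k):,.0f} above")
```

Using the lower tail for `share_above` avoids subtracting a value very close to 1, which keeps the extreme rows numerically stable.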
The Z-Score: The Universal Converter
The standard deviation gives rise to the Z-score — a universal measure that expresses any score as a signed number of standard deviations from the mean:
Z = (Score − Mean) / Standard Deviation
Z-scores allow direct comparison across tests with different scales. Example conversions:
- WAIS IQ 145: Z = (145 − 100) / 15 = +3.0
- Cattell IQ 172: Z = (172 − 100) / 24 = +3.0
- Old Stanford-Binet 148: Z = (148 − 100) / 16 = +3.0
All three describe the same population rarity: approximately 1 in 740, or the 99.87th percentile.
Converting a Z-score to a percentile requires the normal distribution’s cumulative distribution function, typically looked up in statistical tables or calculated by software. Key Z-score benchmarks:
| Z-Score | Percentile | Rarity (1 in N at or above) |
|---|---|---|
| +1.0 | 84.1st | 1 in 6 |
| +1.28 | 90th | 1 in 10 |
| +2.0 | 97.7th | 1 in 44 |
| +2.05 | 98th | 1 in 50 (Mensa) |
| +2.33 | 99th | 1 in 100 |
| +3.0 | 99.87th | 1 in 741 |
| +3.72 | 99.99th | 1 in 10,000 |
| +4.0 | 99.9968th | 1 in 31,574 |
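A table like the one above can be generated, and also inverted, with the standard normal distribution in Python's `statistics` module. This is a sketch under the assumption of a perfectly normal distribution; the helper names `z_to_percentile`, `percentile_to_z`, and `one_in_n` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, SD 1

def z_to_percentile(z):
    """Percentage of the population scoring below a given Z-score."""
    return std_normal.cdf(z) * 100

def percentile_to_z(percentile):
    """Z-score that cuts off the given percentile (inverse lookup)."""
    return std_normal.inv_cdf(percentile / 100)

def one_in_n(z):
    """Rarity of scoring at or above z, expressed as '1 in N'."""
    return 1 / std_normal.cdf(-z)   # upper tail, by symmetry

for z in (1.0, 1.28, 2.0, 2.05, 2.33, 3.0, 3.72, 4.0):
    print(f"Z = +{z:<4}  percentile = {z_to_percentile(z):.4f}  1 in {one_in_n(z):,.0f}")

# Inverse direction: which Z-score marks the 98th percentile?
print(round(percentile_to_z(98), 2))   # 2.05
```

The inverse lookup is the useful direction when a society or program states its cutoff as a percentile and the corresponding score on a particular scale is needed.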
Standard Error of Measurement: The Uncertainty Band
A concept closely related to standard deviation is the Standard Error of Measurement (SEM) — a measure not of population spread but of test-score uncertainty. No test is perfectly reliable; random factors (fatigue, test anxiety, question sampling) introduce noise into any individual’s score.
The SEM quantifies this noise: on the WAIS-IV, the SEM for Full Scale IQ is approximately 2.16 points. This means a reported FSIQ of 130 corresponds to a 95% confidence interval of approximately 126–134 — the true score almost certainly lies in that range but could be at either end.
SEM has important clinical implications:
- With an SEM of about 3 points, a score of 130 (Mensa threshold) has a ±1 SEM band of 127 to 133, so the “true” score could plausibly lie anywhere in that range
- Diagnostic decisions near threshold cutoffs (intellectual disability at IQ 70, giftedness at 130) must account for SEM
- Re-testing effects (practice effects, regression to the mean) further complicate point-estimate interpretation
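The confidence interval quoted above follows from multiplying the SEM by the critical value for the chosen confidence level (1.96 for 95% under a normal error model). A minimal sketch, using the SEM of 2.16 cited above; the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(score, sem, z_crit=1.96):
    """Symmetric confidence interval around an observed score.

    z_crit=1.96 corresponds to a 95% interval, assuming normally distributed error.
    """
    margin = z_crit * sem
    return score - margin, score + margin

low, high = confidence_interval(130, sem=2.16)
print(round(low), round(high))   # 126 134

# A ±1 SEM band (about 68% confidence) is narrower:
low68, high68 = confidence_interval(130, sem=2.16, z_crit=1.0)
print(round(low68, 2), round(high68, 2))   # 127.84 132.16
```

Clinical score reports sometimes center the interval on an estimated true score rather than the observed score; this sketch uses the simpler observed-score form.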
Why High IQ Claims Often Fall Apart
Understanding standard deviation enables quick, rigorous evaluation of extraordinary IQ claims — which are frequent in popular media and self-reporting:
- IQ 200 claims: On the SD 15 scale, this is Z = +6.67, a probability of approximately 1 in 76 billion. No standard test has sufficient ceiling to measure this reliably. The entire human population would yield essentially zero individuals at this level on a properly normed instrument.
- IQ 160+ from online tests: Internet IQ tests are typically unnormed and tend to inflate scores by 15–25 points. A “score” of 160 on such a test corresponds to perhaps IQ 135–145 on a validated instrument.
- IQ 145+ from outdated norms: The Flynn Effect, a rise in average test performance commonly estimated at about 3 points per decade, means that tests normed in earlier decades systematically overstate scores. An IQ of 145 on a 1970s-normed test may correspond to approximately 130–135 on modern norms.
The standard deviation framework provides the tools to evaluate these claims precisely: if a score implies a frequency rarer than 1 in several hundred thousand, ordinary IQ tests simply cannot reliably measure it.
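One way to run this kind of sanity check is to convert a claimed score to a Z-score, compute the upper-tail probability, and compare it with the size of the population. The sketch below assumes a perfect normal distribution and a world population of 8 billion; both are simplifying assumptions, and the function name `claim_rarity` is hypothetical.

```python
import math

def claim_rarity(iq, sd=15, mean=100, population=8_000_000_000):
    """Return (z, one_in_n, expected_count) for scores at or above `iq`.

    Assumes an idealized normal distribution; the population size is an assumption.
    """
    z = (iq - mean) / sd
    tail = 0.5 * math.erfc(z / math.sqrt(2))   # upper-tail probability P(Z > z)
    return z, 1 / tail, population * tail

for claimed in (145, 160, 200):
    z, one_in_n, expected = claim_rarity(claimed)
    print(f"IQ {claimed}: Z = {z:+.2f}, about 1 in {one_in_n:,.0f}, "
          f"expected worldwide: {expected:,.1f}")
```

`math.erfc` is used instead of `1 - cdf` because it stays accurate in the far tail, where the probability is many orders of magnitude smaller than 1.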
Standard Deviation Across Subtests: The Profile
Modern IQ tests report standard scores (mean 100, SD 15) not just for the Full Scale IQ but for each index score. This creates a cognitive profile, a fingerprint of relative strengths and weaknesses within an individual:
- High VCI + low PSI: Strong crystallized verbal abilities with slow processing — common in dyslexia, giftedness with processing differences
- High FRI + low WMI: Strong abstract reasoning with working memory limitations — associated with ADHD profiles
- Flat profile (all indices within 10 points of each other): Full Scale IQ is an accurate summary statistic; no significant intraindividual discrepancy
The concept of scatter (the spread of subtest scores around an individual’s own mean) uses the same logic as population-level standard deviation — measuring how much a person’s profile varies internally. High scatter often indicates learning disabilities, neurological differences, or twice-exceptionality even when the Full Scale IQ appears average.
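To make the idea of scatter concrete, the sketch below applies the same standard deviation calculation to one person's index scores instead of to a population. The profile values are hypothetical, and the flat-profile check simply encodes the "all indices within 10 points of each other" rule of thumb mentioned above; it is not a clinical criterion.

```python
import statistics

# Hypothetical index scores for one person (illustrative only, not real data).
profile = {"VCI": 128, "VSI": 112, "FRI": 120, "WMI": 98, "PSI": 92}
scores = list(profile.values())

personal_mean = statistics.fmean(scores)    # the individual's own average
scatter = statistics.pstdev(scores)         # spread of the indices around that average
index_range = max(scores) - min(scores)     # gap between strongest and weakest index
is_flat = index_range <= 10                 # rule of thumb from the list above

print(f"personal mean: {personal_mean:.1f}")
print(f"scatter (SD of indices): {scatter:.1f}")
print(f"range: {index_range} points, flat profile: {is_flat}")
```

For this made-up profile the 36-point gap between the highest and lowest index shows why a single Full Scale IQ would summarize the person poorly.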
Conclusion: The Yardstick of the Mind
Standard deviation is not just a technical concept; it is the foundational tool that makes IQ scores meaningful. Without it, a score of 130 is opaque; with it, 130 becomes “higher than about 97.7% of the norming population, with a confidence interval of approximately 126–134 on the WAIS-IV.” That precision, grounded in statistical theory, calibrated to a representative population, and interpretable against known error ranges, is what distinguishes scientific psychometrics from intuitive guesswork. Understanding standard deviation means understanding why IQ scores mean what they mean, and why extraordinary claims require extraordinary evidence.