Regression toward the mean

Galton's experimental setup (Fig.8)

In statistics, regression toward the mean (also called reversion to the mean or reversion to mediocrity) is the phenomenon in which, if one sample of a random variable is extreme, the next sample of the same random variable is likely to be closer to its mean.[1][2][3] The term also covers the case where many random variables are sampled and the most extreme results are deliberately picked out: a second sampling of those picked-out variables will, in many cases, give less extreme results that lie closer to the initial mean of all of the variables.
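The selection effect described above can be illustrated with a small simulation. The following is a minimal sketch, not a definitive treatment: the function name `resample_extremes`, the standard normal distribution, the sample size, the 5% cutoff, and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def resample_extremes(n_vars=10_000, top_fraction=0.05, seed=0):
    """Sample many identically distributed variables, pick the most extreme
    ones, and sample the picked variables a second time."""
    rng = random.Random(seed)

    # Assumption: every variable follows the same standard normal
    # distribution, so the mean of all variables is 0.
    first = [rng.gauss(0.0, 1.0) for _ in range(n_vars)]

    # Intentionally pick out the variables with the largest first results.
    k = int(n_vars * top_fraction)
    picked = sorted(range(n_vars), key=lambda i: first[i], reverse=True)[:k]

    # Second, independent sampling of the picked-out variables only.
    second = [rng.gauss(0.0, 1.0) for _ in picked]

    first_mean = statistics.mean([first[i] for i in picked])
    return first_mean, statistics.mean(second)


first_mean, second_mean = resample_extremes()
print(f"picked variables, first sampling:  {first_mean:.2f}")
print(f"picked variables, second sampling: {second_mean:.2f}")
```

Under these assumptions, the picked variables average well above the overall mean of 0 on the first sampling and close to 0 on the second, because nothing except chance distinguished them in the first place.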

Mathematically, the strength of this regression effect depends on whether all of the random variables are drawn from the same distribution or whether there are genuine differences in the underlying distributions. In the first case, the regression effect is statistically likely to occur; in the second, it may occur less strongly or not at all.
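The contrast between the two cases can be sketched with a simple additive model, which is an assumption introduced here and not part of the cited sources: each variable has a fixed underlying level (called `ability` below) plus independent noise on every sampling. Setting `ability_sd` to 0 gives the first case (one shared distribution); a positive `ability_sd` gives the second case (genuine differences). All names and parameter values are hypothetical.

```python
import random
import statistics


def retained_fraction(ability_sd, noise_sd=1.0, n_vars=20_000,
                      top_fraction=0.05, seed=1):
    """Return the fraction of the picked group's first-sampling excess
    (relative to the population mean of 0) that survives a second sampling."""
    rng = random.Random(seed)

    # Assumed model: result = fixed underlying level + fresh noise per sampling.
    ability = [rng.gauss(0.0, ability_sd) for _ in range(n_vars)]
    first = [a + rng.gauss(0.0, noise_sd) for a in ability]
    second = [a + rng.gauss(0.0, noise_sd) for a in ability]

    # Pick the variables whose first result was most extreme (largest).
    k = int(n_vars * top_fraction)
    picked = sorted(range(n_vars), key=lambda i: first[i], reverse=True)[:k]

    first_mean = statistics.mean([first[i] for i in picked])
    second_mean = statistics.mean([second[i] for i in picked])
    return second_mean / first_mean


same_dist = retained_fraction(ability_sd=0.0)  # no genuine differences
diff_dist = retained_fraction(ability_sd=2.0)  # large genuine differences
print(f"same distribution:   {same_dist:.2f} of the excess retained")
print(f"genuine differences: {diff_dist:.2f} of the excess retained")
```

In this assumed model with normally distributed terms, the retained fraction is approximately `ability_sd**2 / (ability_sd**2 + noise_sd**2)`: about 0 when all variables share one distribution (full regression) and about 0.8 for the second call (weaker regression).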

Regression toward the mean is thus a useful concept to consider when designing any scientific experiment, data analysis, or test that intentionally selects the most extreme events. It indicates that follow-up checks may be useful to avoid jumping to false conclusions about these events: they may be genuinely extreme, a meaningless selection produced by statistical noise, or a mix of the two.[4]

  1. Everitt, B. S. (12 August 2002). The Cambridge Dictionary of Statistics (2nd ed.). Cambridge University Press. ISBN 978-0521810999.
  2. Upton, Graham; Cook, Ian (21 August 2008). Oxford Dictionary of Statistics. Oxford University Press. ISBN 978-0-19-954145-4.
  3. Stigler, Stephen M. (1997). "Regression toward the mean, historically considered". Statistical Methods in Medical Research. 6 (2): 103–114. doi:10.1191/096228097676361431. PMID 9261910.
  4. Chiolero, A.; Paradis, G.; Rich, B.; Hanley, J. A. (2013). "Assessing the Relationship between the Baseline Value of a Continuous Variable and Subsequent Change Over Time". Frontiers in Public Health. 1: 29. doi:10.3389/fpubh.2013.00029. PMC 3854983. PMID 24350198.
