Portal:Mathematics/Selected picture/22
Credit: Schutz
Simpson's paradox (also known as the Yule–Simpson effect) is the phenomenon in which an observed association between two variables reverses when it is examined at separate levels of a third variable (or, conversely, reverses when separate groups are combined). Shown here is an illustration of the paradox for quantitative data. In the graph, the overall association between X and Y is negative: when all of the data is considered, Y tends to decrease as X increases, as indicated by the negative slope of the dashed line. When the blue and red points are considered separately (two levels of a third variable, color), however, the association between X and Y is positive in each subgroup, as shown by the positive slopes of the blue and red lines. The effect in real-world data is rarely this extreme.

The paradox is named after British statistician Edward H. Simpson, who described it in 1951 (in the context of qualitative data), although similar effects had been mentioned by Karl Pearson (and coauthors) in 1899 and by Udny Yule in 1903.

One famous real-life instance of Simpson's paradox occurred in the UC Berkeley gender-bias case of the 1970s, in which the university was sued for gender discrimination because it had a higher admission rate for male applicants to its graduate schools than for female applicants (and the difference was statistically significant). The difference was reversed, however, when the data was split by department: most departments showed a small but significant bias in favor of women. The explanation was that women tended to apply to competitive departments with low rates of admission even among qualified applicants, whereas men tended to apply to less competitive departments with high rates of admission among qualified applicants. (Splitting by department was the more appropriate way of looking at the data, since it is individual departments, not the university as a whole, that admit graduate students.)
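
Both forms of the reversal described above can be reproduced with a few lines of Python. The following is a minimal sketch, not a reconstruction of the pictured data or of the actual Berkeley figures: the points in `blue` and `red`, the department labels "A" and "B", and all counts in `admissions` are hypothetical numbers chosen only so that the reversal is easy to see.

```python
def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Quantitative case (hypothetical points, not the data in the picture):
# within each color Y rises with X, but the red cluster sits lower and
# further to the right than the blue one.
blue = [(1, 6), (2, 7), (3, 8), (4, 9)]
red = [(6, 1), (7, 2), (8, 3), (9, 4)]

slope_blue = slope(blue)
slope_red = slope(red)
slope_all = slope(blue + red)

# Qualitative case (hypothetical counts, not the real Berkeley data):
# each entry is (admitted, applied). Department "A" admits most applicants,
# department "B" admits few, and most women apply to "B".
admissions = {
    "A": {"men": (48, 80), "women": (14, 20)},
    "B": {"men": (4, 20), "women": (20, 80)},
}


def department_rate(dept, group):
    """Admission rate for one group within one department."""
    admitted, applied = admissions[dept][group]
    return admitted / applied


def pooled_rate(group):
    """Admission rate for one group with all departments combined."""
    admitted = sum(dept[group][0] for dept in admissions.values())
    applied = sum(dept[group][1] for dept in admissions.values())
    return admitted / applied


if __name__ == "__main__":
    print("slopes (blue, red, pooled):", slope_blue, slope_red, slope_all)
    for dept in admissions:
        print(dept, "men:", department_rate(dept, "men"),
              "women:", department_rate(dept, "women"))
    print("pooled men:", pooled_rate("men"), "women:", pooled_rate("women"))
```

With these made-up numbers, each color has a positive slope while the pooled slope is negative, and women have the higher admission rate in each department but the lower rate once the departments are combined. In both cases the third variable (color, or department) is unevenly distributed across the groups being compared, which is what makes the reversal possible.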