Two researchers at Autodesk were inspired by this man's work to create the “Datasaurus Dozen,” which includes graphs shaped like an ellipse, a star, and a dinosaur. For 10 points each:
[10h] Name this statistician, who created four data sets that have nearly identical means, variances, correlations, and linear regression lines, but look very different when graphed.
ANSWER: Francis Anscombe
[10m] The batting averages of Derek Jeter and David Justice are often used to show this phenomenon, in which a trend that appears in several groups of data is reversed when the groups are combined.
ANSWER: Simpson's paradox [or the Yule-Simpson effect or amalgamation paradox or reversal paradox; prompt on the ecological fallacy]
[10e] This mathematician described an idea similar to Simpson’s paradox in 1899, 52 years before Simpson. The most common measure of the strength of a linear correlation is his namesake “r.”
ANSWER: Karl Pearson [accept Pearson’s r]
<BB>