Devlin and Roeder’s 1999 paper “Genomic control for association studies” introduces a method to account for this feature of SNP (“snip”) data, which is a common source of spurious associations in case–control studies. For 10 points each:
[10h] Name this confounding feature of SNP data that is present when the inflation factor lambda is greater than 1.02. This feature is estimated by a namesake algorithm that Pritchard, Stephens, and Donnelly published in 2000.
ANSWER: population structure [or population substructure, genetic structure, population stratification, or population heterogeneity; prompt on STRUCTURE by asking “of what?”] (The threshold of 1.02 is given in Sul et al.’s review “Population structure in genetic studies: Confounding factors and mixed models.”)
[10m] In 2000, Bacanu, Devlin, and Roeder compared genomic control to a method based on the TDT, which tests for this contributor to population structure. This phenomenon is the primary driver of the clustering of trait-associated SNPs.
ANSWER: linkage disequilibrium [or LD or genetic linkage; prompt on transmission disequilibrium test]
[10e] Genomic control reverses inflation of the Cochran–Armitage test statistic, which has this distribution under the null hypothesis. The presence of Hardy–Weinberg equilibrium is often tested with a “goodness of fit” test named for Pearson and this distribution.
ANSWER: chi-squared distribution
<David Bass, Biology>