Interpreting principal component analyses of spatial population genetic variation
- PMID: 18425127
- PMCID: PMC3989108
- DOI: 10.1038/ng.139
Abstract
Nearly 30 years ago, Cavalli-Sforza et al. pioneered the use of principal component analysis (PCA) in population genetics and used PCA to produce maps summarizing human genetic variation across continental regions. They interpreted gradient and wave patterns in these maps as signatures of specific migration events. These interpretations have been controversial, but influential, and the use of PCA has become widespread in analysis of population genetics data. However, the behavior of PCA for genetic data showing continuous spatial variation, such as might exist within human continental groups, has been less well characterized. Here, we find that gradients and waves observed in Cavalli-Sforza et al.'s maps resemble sinusoidal mathematical artifacts that arise generally when PCA is applied to spatial data, implying that the patterns do not necessarily reflect specific migration events. Our findings aid interpretation of PCA results and suggest how PCA can help correct for continuous population structure in association studies.
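The abstract's central claim, that PCA applied to spatially structured data tends to yield sinusoid-like components, can be illustrated with a small numerical sketch. This is not the authors' simulation: it assumes a hypothetical one-dimensional habitat of equally spaced sites whose covariance decays geometrically with distance (`rho ** |i - j|`), and the names `spatial_pcs` and `sign_changes` are introduced here for illustration only. Under those assumptions, the leading components of the centered covariance should behave like a gradient (one sign change) followed by a wave (two sign changes), even though no migration event is modeled.

```python
import numpy as np


def spatial_pcs(n_sites=50, rho=0.9, n_pcs=3):
    """Leading principal components of a distance-decay covariance.

    Assumption (illustrative, not from the paper): sites lie on a line
    and the covariance between sites i and j is rho ** |i - j|.
    """
    idx = np.arange(n_sites)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])

    # Double-center the covariance, mirroring the mean removal done in PCA.
    centering = np.eye(n_sites) - np.ones((n_sites, n_sites)) / n_sites
    centered = centering @ cov @ centering

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(centered)
    order = np.argsort(eigvals)[::-1][:n_pcs]
    return eigvals[order], eigvecs[:, order]


def sign_changes(v):
    """Count how many times consecutive entries of v change sign."""
    signs = np.sign(v)
    return int(np.sum(signs[:-1] != signs[1:]))
```

Calling `spatial_pcs()` and passing each returned column to `sign_changes` gives a quick check of the gradient-then-wave pattern; plotting the columns against site index makes the sinusoidal shape visible. The decay rate `rho` and the number of sites are arbitrary choices, and real genotype data would add sampling noise on top of this idealized covariance.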
Comment in
- Principal component analysis of genetic data. Nat Genet. 2008 May;40(5):491-2. doi: 10.1038/ng0508-491. PMID: 18443580. No abstract available.
Similar articles
- Comparing spatial maps of human population-genetic variation using Procrustes analysis. Stat Appl Genet Mol Biol. 2010;9(1):Article 13. doi: 10.2202/1544-6115.1493. Epub 2010 Jan 27. PMID: 20196748. Free PMC article.
- Influence of admixture and paleolithic range contractions on current European diversity gradients. Mol Biol Evol. 2013 Jan;30(1):57-61. doi: 10.1093/molbev/mss203. Epub 2012 Aug 25. PMID: 22923464.
- Correcting principal component maps for effects of spatial autocorrelation in population genetic data. Front Genet. 2012 Nov 20;3:254. doi: 10.3389/fgene.2012.00254. eCollection 2012. PMID: 23181073. Free PMC article.
- Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification. In: Kobeissy FH, editor. Brain Neurotrauma: Molecular, Neuropsychological, and Rehabilitation Aspects. Boca Raton (FL): CRC Press/Taylor & Francis; 2015. Chapter 25. PMID: 26269925. Free Books & Documents. Review.
- Extracting functional networks with spatial independent component analysis: the role of dimensionality, reliability and aggregation scheme. Curr Opin Neurol. 2011 Aug;24(4):378-85. doi: 10.1097/WCO.0b013e32834897a5. PMID: 21734575. Review.
Cited by
- VCF2PCACluster: a simple, fast and memory-efficient tool for principal component analysis of tens of millions of SNPs. BMC Bioinformatics. 2024 May 1;25(1):173. doi: 10.1186/s12859-024-05770-1. PMID: 38693489. Free PMC article.
- Cross-ancestry genetic architecture and prediction for cholesterol traits. Hum Genet. 2024 May;143(5):635-648. doi: 10.1007/s00439-024-02660-7. Epub 2024 Mar 27. PMID: 38536467.
- The power of representation: Statistical analysis of diversity in US Alzheimer's disease genetics data. Alzheimers Dement (N Y). 2024 Mar 18;10(1):e12462. doi: 10.1002/trc2.12462. eCollection 2024 Jan-Mar. PMID: 38500778. Free PMC article.
- Highly parameterized polygenic scores tend to overfit to population stratification via random effects. bioRxiv [Preprint]. 2024 Jan 29:2024.01.27.577589. doi: 10.1101/2024.01.27.577589. PMID: 38352303. Free PMC article. Preprint.
- Phantom oscillations in principal component analysis. Proc Natl Acad Sci U S A. 2023 Nov 28;120(48):e2311420120. doi: 10.1073/pnas.2311420120. Epub 2023 Nov 21. PMID: 37988465. Free PMC article.
References
- Menozzi P, Piazza A, Cavalli-Sforza L. Science. 1978;201:786–792.
- Cavalli-Sforza L, Menozzi P, Piazza A. Science. 1993;259(5095):639–646.
- Cavalli-Sforza LL, Menozzi P, Piazza A. The History and Geography of Human Genes. Princeton University Press; 1994.
- Jobling M, Hurles M, Tyler-Smith C. Human evolutionary genetics. Garland Science; 2004.
- Rendine S, Piazza A, Cavalli-Sforza LL. The American Naturalist. 1986;128:681–706.