PLoS Comput Biol. 2012;8(12):e1002806.
doi: 10.1371/journal.pcbi.1002806. Epub 2012 Dec 6.

SnIPRE: selection inference using a Poisson random effects model

Kirsten E Eilertson et al. PLoS Comput Biol. 2012.

Abstract

We present an approach for identifying genes under natural selection using polymorphism and divergence data from synonymous and non-synonymous sites within genes. A generalized linear mixed model is used to model the genome-wide variability among categories of mutations and estimate its functional consequence. We demonstrate how the model's estimated fixed and random effects can be used to identify genes under selection. The parameter estimates from our generalized linear model can be transformed to yield population genetic parameter estimates for quantities including the average selection coefficient for new mutations at a locus, the synonymous and non-synonymous mutation rates, and species divergence times. Furthermore, our approach incorporates stochastic variation due to the evolutionary process and can be fit using standard statistical software. The model is fit in both the empirical Bayes and Bayesian settings, using the lme4 package in R and Markov chain Monte Carlo methods in WinBUGS. Using simulated data, we compare our method to existing approaches for detecting genes under selection: the McDonald-Kreitman test and two versions of the Poisson random field based method MKprf. Overall, we find our method universally outperforms existing methods for detecting genes subject to selection using polymorphism and divergence data.
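
For orientation, the McDonald-Kreitman test named above as a baseline can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `mk_test`, the argument names, and the counts in the usage example are assumptions made for this sketch. It builds the 2x2 table of non-synonymous and synonymous counts for polymorphic and divergent sites, computes the neutrality index, and applies a two-sided Fisher's exact test with fixed margins.

```python
from math import comb


def mk_test(pn, ps, dn, ds):
    """McDonald-Kreitman test on a single gene (illustrative sketch).

    pn, ps: non-synonymous / synonymous polymorphisms within a species
    dn, ds: non-synonymous / synonymous fixed differences between species
    Returns (neutrality_index, two_sided_fisher_p_value).
    """
    # Neutrality index: odds ratio of the 2x2 table, (pn/ps) / (dn/ds).
    # It is undefined when ps or dn is zero; report infinity in that case.
    ni = (pn * ds) / (ps * dn) if ps > 0 and dn > 0 else float("inf")

    row_nonsyn = pn + dn   # all non-synonymous sites
    row_syn = ps + ds      # all synonymous sites
    col_poly = pn + ps     # all polymorphic sites
    total = row_nonsyn + row_syn

    def prob(k):
        # Hypergeometric probability that k of the polymorphic sites
        # are non-synonymous, given the table margins.
        return comb(row_nonsyn, k) * comb(row_syn, col_poly - k) / comb(total, col_poly)

    k_min = max(0, col_poly - row_syn)
    k_max = min(row_nonsyn, col_poly)
    p_obs = prob(pn)

    # Two-sided p-value: sum over all tables at least as unlikely as the
    # observed one (small tolerance guards against floating-point ties).
    p_value = sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )
    return ni, min(1.0, p_value)


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not data from the paper.
    ni, p = mk_test(pn=3, ps=40, dn=10, ds=15)
    print(f"neutrality index = {ni:.3f}, Fisher p-value = {p:.4f}")
```

The test treats each gene in isolation, which is the contrast with the mixed-model approach described in the abstract, where information is shared across genes through genome-wide effects.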

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Example joint distribution of the estimated selection effect and the constraint effect for a particular gene.
Data simulated using PRFREQ. The blue asterisk denotes the true location of parameters.
Figure 2. Classification of constraint.
Top: Distribution 1, 2, and 3 of formula image used in the coalescent simulations for Table 7. Bottom: Proportion of constraint effects classified as significant by SnIPRE; x-axis is true proportion of non-lethal mutations, formula image.
Figure 3. Comparison of estimates of constraint when (no constraint).
A: The distribution of constraint estimates. B: Constraint estimates versus the selection strength.
Figure 4. Classification of selection effect for Drosophila-like simulations.
Shaded regions of histogram represent the proportion of genes under selection classified as under selection; x-axis is true selection coefficient; formula image.
Figure 5. Classification of selection effect for human-like simulations.
Shaded regions of histogram represent the proportion of genes under selection classified as under selection; x-axis is true selection coefficient; formula image.
Figure 6. True positive rate versus false discovery rate.
Results for data set of 2,000 genes, 550 of the genes are under selection with formula image or formula image.
Figure 7. Distribution of residuals for selection coefficient estimates by method.
The top row displays the distribution of constraint, the middle row displays residuals for simulations using formula image; the bottom row displays residuals for simulations using formula image. Residuals grouped by true selection strength.
Figure 8. D. simulans estimated selection effects and non-synonymous effects for 8,887 genes.
Plots A and B show the estimated selection effects using SnIPRE and B SnIPRE, respectively.
Figure 9. Human estimated selection effects and non-synonymous effects for 11,624 genes.
Plots A and B show the estimated selection effects using SnIPRE and B SnIPRE, respectively. B SnIPRE classifies far more genes as having a negative average selection effect; this difference can be explained in part by the construction of the 95% confidence interval versus the credible interval.

Grants and funding

This work has been funded by the National Science Foundation under grant NSF0516310. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.