Comparative Study
PLoS Genet. 2008 May 30;4(5):e1000083.
doi: 10.1371/journal.pgen.1000083.

Assessing the evolutionary impact of amino acid mutations in the human genome

Adam R Boyko et al. PLoS Genet. 2008.

Abstract

Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27-29% of amino acid changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30-42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10-20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.


Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Simulation of demographic and selective parameter estimates with and without linkage.
Simulation results for ML estimate of demographic and selective parameters assuming African American demography (τ = 0.1328, ω = 0.3034) and gamma distribution of fitness effects (α = 0.184, β = 8200). Sample sizes and mutation rates are the same as those in the African American data projected down to N = 24 chromosomes. Each panel represents 100 replicates; actual values shown with black dashed lines. (A) Simulations without linkage; each entry of the site-frequency spectrum is a Poisson variate drawn with the mean being that expected under the demographic model (synonymous sfs) or demography+selection model (nonsynonymous sfs). (B) Simulations with linkage; each entry calculated from a simulation of 11,404 genes, each with 7 linked exons (see Methods). (C) Distribution of inferred values for unlinked (blue) and linked (red) simulations.
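The unlinked scheme in panel (A) can be illustrated with a minimal sketch: each entry of the site-frequency spectrum is drawn as an independent Poisson variate around its expected value. The sketch below is not the authors' code. It assumes the standard neutral expectation theta/i as a stand-in for the paper's demographic model, and the value of `theta`, the seed, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def expected_neutral_sfs(theta, n_chromosomes):
    """Expected counts theta/i for i = 1..n-1 under the standard neutral model.

    This is a stand-in for the demographic-model expectation used in the paper.
    """
    return [theta / i for i in range(1, n_chromosomes)]


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method.

    The mean is processed in chunks of at most 50 so that exp(-chunk) never
    underflows; a sum of independent Poisson draws is Poisson with the summed mean.
    """
    total = 0
    remaining = mean
    while remaining > 0:
        chunk = min(remaining, 50.0)
        remaining -= chunk
        limit = math.exp(-chunk)
        prod = rng.random()
        while prod > limit:
            total += 1
            prod *= rng.random()
    return total


def simulate_unlinked_sfs(expected, rng):
    """One replicate: every SFS entry is an independent Poisson draw."""
    return [poisson_draw(rng, m) for m in expected]


rng = random.Random(1)  # arbitrary seed, for reproducibility only
expected = expected_neutral_sfs(theta=1000.0, n_chromosomes=24)  # theta is hypothetical
replicates = [simulate_unlinked_sfs(expected, rng) for _ in range(100)]
```

Each of the 100 replicates is a list of 23 counts (frequency classes 1 to 23 in a sample of 24 chromosomes). Fitting the model to every replicate would give a spread of estimates of the kind the caption describes for panel (C).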
Figure 2. Observed and expected nonsynonymous site-frequency spectra after demographic correction.
Expected site-frequency spectra under best-fit selection models after demographic correction. Note the logarithmic scale of the y-axis. (A) African American replacement SNPs versus expectation under neutrality, fixed selective effects, and gamma distribution of fitness effects. (B) European American replacement SNPs versus expectation under neutrality, fixed selective effects, and gamma distribution of fitness effects.
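One way to score how well an expected spectrum matches an observed one is a Poisson log-likelihood summed over frequency classes, which is in keeping with the Poisson treatment of spectrum entries in the Figure 1 caption. The sketch below is only an illustration under that assumption and may differ from the likelihood the authors actually used. The function name and the toy counts are made up for the example and are not data from the paper.

```python
import math


def poisson_log_likelihood(observed, expected):
    """Sum of log Poisson probabilities of observed counts given expected means.

    Every expected mean must be strictly positive.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected spectra must have the same length")
    total = 0.0
    for count, mean in zip(observed, expected):
        total += count * math.log(mean) - mean - math.lgamma(count + 1)
    return total


# Toy numbers, invented purely for illustration.
observed = [120, 55, 30, 22]
model_a = [118.0, 60.0, 35.0, 20.0]
model_b = [200.0, 100.0, 66.7, 50.0]

# A positive difference means model_a fits the toy counts better than model_b.
ll_difference = poisson_log_likelihood(observed, model_a) - poisson_log_likelihood(observed, model_b)
```

Differences of this kind, expressed in log-likelihood (LL) units, are how competing selection models are typically compared.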
Figure 3. Cumulative proportion of nonsynonymous mutations with a selection coefficient less than s.
Gamma and lognormal curves represent the best-fit gamma and lognormal models to the African American polymorphism data (Table 1). Gamma+pos and wnorm are the best-fit gamma distribution with positive selection at 2Nes = 5 and best-fit weighted normal model to the African American polymorphism+divergence data. All four distributions predict nearly identical site-frequency spectra that closely match the observed data. Left side are deleterious selection coefficients; right side are advantageous selection coefficients.
Figure 4. Inferred fitness effects of new, segregating, and fixed mutations in African-Americans.
Estimated proportion of new nonsynonymous mutations (left column), SNPs (middle columns), and human-chimp fixed differences (right column) which are strongly deleterious (s < −10^−2; red), moderately deleterious (−10^−2 < s < −10^−3; orange), weakly deleterious (−10^−3 < s < −10^−4; yellow), nearly neutral (−10^−4 < s < −10^−5; green), neutral (−10^−5 < s < 0; blue), and positively selected (white) in a sample of 100 chromosomes from a population under the best-fit expansion model of African American demography. (A–C) Proportions estimated by assuming all positively selected mutations have an effect of (A) γ+ = 5, (B) γ+ = 25, (C) γ+ = 100 and finding the MLE of the resulting three-parameter selection model (gamma distribution of deleterious fitness effects and a proportion (p+) of sites positively selected) to the African American polymorphism and divergence data. (D) Proportions estimated from the best-fit gamma distribution selection model (Table 1) in African Americans (equivalent to assuming positive selection is strong enough that positively selected mutants are never observed in the site-frequency spectrum). The resulting MLEs are (A) α = 0.228, β = 3100, p+ = 0.0186; (B) α = 0.200, β = 5400, p+ = 0.0023; (C) α = 0.196, β = 5850, p+ = 0.0005; (D) α = 0.184, β = 8200, p+ → 0. Models (A–C) provided equally good fits to the polymorphism data, but they outperformed the best-fit gamma model and best-fit gamma+neutral model by 4.1, 3.5, 3.1 and 3.4, 2.8, and 2.4 LL units, respectively.
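The "new mutations" column for model (D) can be approximated by integrating a gamma distribution over the selection-coefficient bins named in the caption. The sketch below makes several assumptions that this excerpt does not confirm: that α is the shape and β the scale of a gamma distribution on the scaled magnitude 2N|s|, and that N takes the hypothetical value `pop_size = 10_000`. The function names and the series-based CDF are illustrative, not the authors' implementation.

```python
import math


def gamma_cdf(x, shape, scale):
    """Regularized lower incomplete gamma P(shape, x/scale) via its power series.

    Suitable for the small-to-moderate x/scale values used here; the series
    converges slowly when x/scale is large.
    """
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= z / (shape + n)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


# Bin edges on |s|, taken from the caption; labels ordered from weakest to strongest.
BIN_EDGES = [1e-5, 1e-4, 1e-3, 1e-2]
LABELS = [
    "neutral",
    "nearly neutral",
    "weakly deleterious",
    "moderately deleterious",
    "strongly deleterious",
]


def dfe_bin_proportions(shape, scale, pop_size):
    """Proportion of new mutations per bin, assuming gamma-distributed 2N|s|."""
    cdf_at_edges = [gamma_cdf(2.0 * pop_size * s, shape, scale) for s in BIN_EDGES]
    bounds = [0.0] + cdf_at_edges + [1.0]
    return {label: bounds[i + 1] - bounds[i] for i, label in enumerate(LABELS)}


# shape and scale are the model (D) values from the caption; pop_size is hypothetical.
proportions = dfe_bin_proportions(shape=0.184, scale=8200.0, pop_size=10_000)
```

The resulting proportions depend strongly on the assumed population size, so they should be read as a demonstration of the calculation rather than a reproduction of the figure.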

