<- chapter_18|Chapter 18^table_of_contents|Table of Contents^chapter_20|Chapter 20 ->

%%Chapter 19. Mutation, allele frequency, and selection%%

In [[chapter_18|Chapter 18]] we saw that in a population, allele frequency does not change from generation to generation unless:

  * Mating is not random;
  * There are mutations;
  * There is selection;
  * There is genetic drift or a bottleneck effect;
  * There is migration of individuals in and out of the population.

In this chapter we will consider how allele frequencies can change under the influence of mutation and selection.

===== Example of the effect of selection on recessive mutations: PKU =====

We first consider the conversion of a wildtype allele $A$ to an altered allele $a$ by mutation:

$$ A \xrightarrow{\mu} a $$

Naturally occurring mutations will convert wildtype allele $A$ to mutant allele $a$ with frequency μ, the mutation rate. Typical mutation rates range from $\mu = 10^{-4}$ to $10^{-8}$ per gene per generation. Thus, in the absence of any other effects, such as selection, for any given gene the frequency of mutant alleles will increase a little each generation because of new mutations.

Consider the disease phenylketonuria (PKU), which is caused by an autosomal recessive defect in the gene coding for the enzyme phenylalanine hydroxylase. The absence of the enzyme prevents phenylalanine from being metabolized, causing unusually high levels of phenylalanine in the body and leading to severe intellectual disability.

Let's say that the allele frequency for mutant PKU alleles is $q$ and the allele frequency for wildtype alleles is $p$. Let's also say that the mutation rate for PKU is $\mu = 10^{-4}$. As $q$ increases, the frequency of individuals with PKU ($q^2$) will also slowly increase with each generation. When $q$ gets high enough, selection against homozygotes will counterbalance the formation of new mutations and $q$ will stay constant. In order to treat selection quantitatively, we need to introduce two additional concepts called selective disadvantage and fitness.

$S$ = selective disadvantage\\ $1-S$ = fitness

If a genotype has $S = 0.75$, then fitness = 0.25, meaning that individuals with this genotype will reproduce at a rate of only 25% relative to an average individual. Fitness can be thought of as a combination of survival and fertility. Let's put this into the context of the Hardy-Weinberg equilibrium. Recall from [[chapter_18|Chap. 18]] that $f(A/A)=p^2$, $f(A/a)=2pq$, and $f(a/a)=q^2$.

^ genotype ^ frequency ^ after selection ^ change in frequency (Δ) ^
| $A/A$ | $p^2$ | $p^2$ | 0 |
| $A/a$ | $2pq$ | $2pq$ | 0 |
| $a/a$ | $q^2$ | $q^2(1-S)$ | $q^2(1-S)-q^2 = -Sq^2$ |
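The entries of this table can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: the function name ''selection_on_recessive'' and the example values $q = 0.01$ and $S = 0.75$ are assumptions chosen only for illustration. As in the table, the frequencies after selection are not renormalized to sum to 1.

```python
def selection_on_recessive(q, S):
    """Genotype frequencies before and after selection against a/a.

    q is the frequency of the recessive allele a and S is the selective
    disadvantage of a/a homozygotes (fitness = 1 - S). Frequencies after
    selection are left unnormalized, mirroring the table in the text.
    """
    p = 1.0 - q
    before = {"A/A": p**2, "A/a": 2 * p * q, "a/a": q**2}
    after = dict(before)
    after["a/a"] = q**2 * (1.0 - S)  # only a/a homozygotes are selected against
    delta = {g: after[g] - before[g] for g in before}
    return before, after, delta


# Illustrative values only (assumed, not taken from the text).
before, after, delta = selection_on_recessive(q=0.01, S=0.75)
for genotype in before:
    print(f"{genotype}: before={before[genotype]:.6f} "
          f"after={after[genotype]:.6f} change={delta[genotype]:+.6f}")
```

The only nonzero change is in the $a/a$ row, and it equals $-Sq^2$ for the chosen values.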

We use the symbol $\Delta q_{sel}$ to mean "the change in $q$ due to selection"; from Table {{ref>Tab1}} we see that $\Delta q_{sel}=-Sq^2$. When the change in allele frequency caused by mutation μ is balanced out by the change in allele frequency by selection ($-Sq^2$), we say the allele frequency is in a steady state. Mathematically, we say that $\Delta q_{sel}+\mu=-Sq^2+\mu=0$. From this, we can solve for $q$:

$$q=\sqrt{\frac{\mu}{S}}$$

Allele frequency modified by selective disadvantage.

For PKU, $q^2 = 10^{-4}$, so $q=10^{-2}$. Also, since PKU is fairly severe, in the pre-modern medicine age of human evolution $S \approx 1$ (that is, just about everyone who had PKU died before they could reproduce). Therefore, based on Fig. {{ref>Fig2}} the estimated value of μ is about $Sq^2 = 10^{-4}$. The actual mutation rate is probably not this high; the relatively high $q$ for PKU is probably due to a founder effect in the European population or to a balanced polymorphism (see below).

In modern times PKU can be treated by a low-phenylalanine diet; this means that now $S \ll 1$ (or, you could say, $S \approx 0$). In this case, $\Delta q_{sel} = -Sq^2 \approx 0$ as well, and the main thing that will alter allele frequency is the mutation rate μ. This suggests that the frequency of PKU mutant alleles should start to rise by $\mu = 10^{-4}$ per generation. Starting from $q = 10^{-2}$, that is an absolute increase of only 0.01 percentage points per generation (about 1% of the current value of $q$), so it will take a long time for this change in environment to have a significant effect on disease frequency.

===== Example of the effect of selection on dominant mutations: Huntington's disease =====

Now let's determine the steady state allele frequency for a dominant disease with allele frequency $q = f(A)$. In contrast to the situation for recessive alleles, selection will operate against heterozygotes for dominant alleles. For rare dominant traits, almost all affected individuals are heterozygotes; that is, $f(A/A)$ is very small. Therefore, while formally $q=f(A/A)+\frac{1}{2}f(A/a)$, we can approximate $q$ by saying that $q \approx \frac{1}{2}f(A/a)$. Let's look at how $S$ and $(1-S)$ affect $q$:

^ genotype ^ frequency ^ after selection ^ change in frequency (Δ) ^
| $A/A$ | - | - | - |
| $A/a$ | $2pq \approx 2q$ | $(1-S)2q$ | $(1-S)2q-2q=-2Sq$ |
| $a/a$ | $p^2$ | $p^2$ | 0 |

After selection, $2Sq$ heterozygotes are lost each generation, but only half of their alleles are the dominant disease allele $A$. Therefore:

$$\begin{aligned}\Delta q_{sel}=\frac{1}{2}\Delta f(A/a)&=\frac{1}{2}(-2Sq)\\&=-Sq\end{aligned}$$

Similar to what we discussed above for recessive mutations, in the steady state $\Delta q_{sel}+\mu=0$, except that here $\Delta q_{sel}=-Sq$ for dominant mutations. As before, we can solve for $q$:

$$\begin{aligned}-Sq+\mu&=0\\ \mu&=Sq\\ q&=\frac{\mu}{S}\end{aligned}$$

For $S=1$, $q=\mu$. In other words, for dominant mutations with fitness=0, the only instances of the disease will be due to new mutations. This makes sense, because dominant mutant alleles with 0 fitness (cannot survive or reproduce) cannot be passed from one generation to the next. In this case, the frequency of affected individuals will be $2pq \approx 2q = 2\mu$. Any dominant mutation that results in embryonic or early postnatal lethality would be an example of this. Another example would be a dominant mutation that results in sterility.

When $S<1$, the frequency of mutant alleles $q$ can get quite high (this makes sense mathematically; look at Fig. xx). A good example of this is Huntington's disease, which is caused by dominant mutations in $HTT$, the gene that codes for the huntingtin protein. This devastating disease causes late-onset neuromuscular degeneration starting at around 36 years of age, eventually leading to death. This is obviously terrible for anyone who is unfortunate enough to be a carrier, but since the disease doesn't manifest until later in life it doesn't decrease reproductive fitness much.

===== Example of the effect of selection on sex-linked mutations: hemophilia A and DMD =====

For the final example of a balance between mutation and selection, consider an X-linked recessive disease allele with frequency $q = f(a)$. For rare alleles, the vast majority of affected individuals, and therefore of the individuals on whom selection acts, are males, and new mutations will increase the allele frequency (i.e., $\Delta q_{mut} \approx \mu$).

^ genotype ^ frequency ^ after selection ^ change in frequency (Δ) ^
| $X^AY$ | $p$ | $p$ | 0 |
| $X^aY$ | $q$ | $(1-S)q$ | $(1-S)q-q=-Sq$ |

In a population with an equal number of males and females, $\frac{1}{3}$ of the X chromosomes will be in males. Therefore:

$$\begin{aligned}\Delta q_{sel}&=\frac{1}{3}[\Delta f(X^aY)]\\&=\frac{1}{3}(-Sq)\\&=\frac{-Sq}{3}\end{aligned}$$

As before, in the steady state $\Delta q_{sel}+\mu = \frac{-Sq}{3}+\mu=0$. We can therefore solve for $q$:

$$\begin{aligned}\mu&=\frac{Sq}{3}\\ q&=\frac{3\mu}{S}\end{aligned}$$

When $S=1$ (i.e., when fitness=0), $q=3\mu$. In other words, exactly one-third of the disease alleles in a population will be new mutations. This relationship has been demonstrated for at least two debilitating X-linked diseases: hemophilia A and Duchenne muscular dystrophy.

===== Balanced polymorphisms =====

Finally, we will consider a situation in which an allele is deleterious in the homozygous state but is beneficial in the heterozygous state. The steady state value of $q$ will be set by a balance between selection for the heterozygote and selection against the homozygote. We will need a new parameter, $h$ (the heterozygote advantage), that represents the increased reproductive fitness of the heterozygote relative to an average individual.

^ genotype ^ frequency ^ after selection ^ change in frequency (Δ) ^
| $A/A$ | $p^2$ | $p^2$ | 0 |
| $A/a$ | $2pq \approx 2q$ | $(1+h)2q$ | $(1+h)2q-2q=2hq$ |
| $a/a$ | $q^2$ | $(1-S)q^2$ | $(1-S)q^2-q^2=-Sq^2$ |

As with dominant mutations, when we calculate $\Delta q_{sel}$ we have to halve the change in frequency for heterozygotes, since only half the alleles of heterozygotes are mutant.

$$\begin{aligned}\Delta q_{sel} &=\Delta f(a/a)+\frac{1}{2}\Delta f(A/a)\\ &=-Sq^2+\frac{1}{2}(2hq)\\ &=-Sq^2+hq \end{aligned}$$

When $S=1$, $\Delta q_{sel}=0$ when $q^2=hq$, or in other words, when $h=q$.

The possibility of a subtle selection for (or against) the heterozygote for an allele that appears to be recessive means that in practice the estimates of μ from allele frequencies are quite unreliable. For example, let's assume that $q= 10^{-2}$. This could mean $\mu = 10^{-4}$ and $h = 0$, or it could mean $\mu < 10^{-4}$ and $h = 10^{-2}$. Since a 1% increase in heterozygote advantage would be essentially unmeasurable, we wouldn't be able to distinguish these possibilities.

==== Sickle-cell anemia ====

The best understood case of balanced polymorphism is sickle-cell anemia. The allele of hemoglobin known as β_S^H is recessive for the disease but is dominant for malarial resistance. β_S^H is most prevalent in a number of different equatorial populations where malaria is common: sub-Saharan Africa, the Mediterranean, and Southeast Asia. In parts of Africa the frequency of the disease can be as high as ~2.6%, which means that in these populations $q = \sqrt{0.026} \approx 0.16$. During human history sickle-cell disease would almost certainly have been fatal ($S \approx 1$), and therefore $h$ must have been about 0.16. This indicates that during evolution the reproductive advantage for an β_S^H heterozygote was 16%. Many of the most prevalent genetic diseases are suspected to be at a relatively high frequency because of balanced polymorphism.

==== Cystic fibrosis ====

A second example of balanced polymorphism is cystic fibrosis, a disease caused by autosomal recessive mutations in the $CFTR$ gene (__c__ystic __f__ibrosis __t__ransmembrane conductance __r__egulator).
Mutations in $CFTR$ disrupt $\mathrm{Cl}^-$ transport, leading to disturbed osmotic balance across epithelial cell layers of the lungs and intestine. The incidence in European populations is approx. 0.0025; therefore, $q=\sqrt{0.0025}=0.05$. This is a pretty high frequency! This is probably not due to either high mutation frequency or founder effect (many different $CFTR$ disease alleles have been found, although 70% are the ΔF508 allele). Scientists believe that heterozygotes may be more resistant to bacterial infections that cause diarrhea, such as typhoid or cholera, and that this selection was imposed in densely populated European cities.

==== Lysosomal storage disorders ====

A third example of balanced polymorphism involves lysosomal storage disorders caused by several different autosomal recessive mutations:

^ disease ^ enzyme affected by mutation ^ allele frequency (maximum) ^
| Gaucher's disease | glucocerebrosidase | 0.03 |
| Tay-Sachs disease | hexosaminidase A | 0.017 |
| Niemann-Pick disease | sphingomyelinase | 0.01 |

All three enzymes are involved in the breakdown of glycolipids in the lysosome. When these enzymes are defective in individuals homozygous for the disease allele, excessive quantities of glycolipids build up in cells and can have pathological effects. In particular, all three diseases are characterized by intellectual disability because of excess glycolipids accumulating in neurons. All three diseases are ~100x more common in Ashkenazi Jewish populations than in the general population. This group arrived in central Europe in the 9th century AD and is currently distributed among the United States, Israel, and the former Soviet Union. One hypothesis that may partially explain why the Tay-Sachs allele frequency is so high is that heterozygotes may have resistance to tuberculosis, and that Jewish people living in European ghettos around the time of World War II may have been under selection, although other explanations (such as founder effect) may also be relevant.
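As a numerical check on the steady-state results derived in this chapter ($q=\sqrt{\mu/S}$ for autosomal recessive, $q=\mu/S$ for autosomal dominant, $q=3\mu/S$ for X-linked recessive, and $h=q$ for a balanced polymorphism with $S=1$), the per-generation changes can simply be iterated until $q$ stops changing. The following is a minimal Python sketch under the same rare-allele approximations used in the text. The function names, the default values of $\mu$, $S$, $h$, the starting frequency and the number of generations are all assumptions chosen for illustration. For the balanced case mutation is ignored and the general steady state $q=h/S$ is used, which reduces to $h=q$ when $S=1$.

```python
import math


def iterate_q(delta_sel, mu, q0=0.0, generations=5000):
    """Apply q <- q + mu + delta_sel(q) once per generation; return final q."""
    q = q0
    for _ in range(generations):
        q = q + mu + delta_sel(q)
    return q


def mutation_selection_cases(mu=1e-5, S=1.0):
    """Return {case: (simulated q, predicted steady-state q)}.

    Each case pairs the per-generation change due to selection with the
    steady-state formula from the text.
    """
    cases = {
        "autosomal recessive": (lambda q: -S * q**2, math.sqrt(mu / S)),
        "autosomal dominant": (lambda q: -S * q, mu / S),
        "X-linked recessive": (lambda q: -S * q / 3, 3 * mu / S),
    }
    return {name: (iterate_q(sel, mu), predicted)
            for name, (sel, predicted) in cases.items()}


def balanced_polymorphism(h=0.16, S=1.0, q0=1e-3):
    """Heterozygote advantage h against homozygote disadvantage S.

    Mutation is ignored (mu = 0, an assumption), so the allele must start
    at a small nonzero frequency q0. Returns (simulated q, predicted h/S).
    """
    simulated = iterate_q(lambda q: -S * q**2 + h * q, mu=0.0, q0=q0)
    return simulated, h / S


for name, (simulated, predicted) in mutation_selection_cases().items():
    print(f"{name}: simulated q = {simulated:.3e}, predicted q = {predicted:.3e}")

simulated, predicted = balanced_polymorphism()
print(f"balanced polymorphism: simulated q = {simulated:.3f}, "
      f"predicted q = {predicted:.3f}")
```

With these assumed parameters the simulated and predicted values agree in every case, and the recessive case approaches its steady state far more slowly than the dominant or X-linked cases because selection acts only on the rare $a/a$ homozygotes.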