Chapter 18. The Hardy-Weinberg equilibrium

Until now, we have been carrying out genetic analysis of individuals. This approach works well when you are studying a model organism, such as Drosophila or mice. It doesn't work well when you are interested in the genetics of organisms that cannot be studied in this way. For the next several chapters, we will consider genetics from the point of view of groups of individuals, or populations. We will treat this subject entirely from the perspective of human population studies, where population genetics is used to get the type of information that would ordinarily be obtained by breeding experiments in model organisms.

The Hardy-Weinberg equilibrium

At the heart of population genetics is the concept of allele frequency. Consider a human gene with two alleles: $A$ and $a$. The frequency of $A$ is $f(A)$; the frequency of $a$ is $f(a)$. We define the following symbols:

$$p = f(A)\\ q = f(a)$$
Figure 1: Defining the symbols $p$ and $q$ as allele frequencies. This assumes that there are only two alleles of a gene in a population. In more genetically diverse populations, there may be more than two alleles for any given gene.

$p$ and $q$ can be thought of as probabilities of selecting the given alleles by random sampling. For example, $p$ for a given population of humans is the probability of finding allele $A$ by selecting an individual from that population at random and then selecting one of their two alleles at random.

Since $p$ and $q$ are probabilities and in this example there are only two possible alleles, we know from probability theory that the probabilities of all possibilities must add up to one, or:

$$p + q = 1$$
Figure 2: The frequencies (probabilities) of all possibilities must add up to one.

Correspondingly, there are three possible genotypes, and therefore three possible genotype frequencies:

$$f(\frac{A}{A}) + f(\frac{A}{a}) + f(\frac{a}{a}) = 1$$
Figure 3: Three possible genotypes with two alleles, and how we represent their genotype frequencies. All possible genotype frequencies must add up to one. Note that the “fractions” here inside the $f()$ are not math symbols but genotypes.

We usually can't measure allele frequencies directly. Rather, we can derive them from the frequencies of the different genotypes that are present in a population:

$$p = f(\frac{A}{A})+\frac{1}{2} f(\frac{A}{a})\\ q = f(\frac{a}{a})+\frac{1}{2} f(\frac{A}{a})$$
Figure 4: Derivation of allele frequencies based on genotype frequencies.

Here is an example: $M$ and $N$ are alleles of a gene that specifies different blood antigens. The alleles are codominant, so a simple blood test can distinguish the three possible genotypes of $\frac{M}{M}$, $\frac{M}{N}$, and $\frac{N}{N}$. A survey of the population reveals that 83% of the population has only the M antigen, 1% of the population has only the N antigen, and 16% of the population has both the M and N antigens. In other words:

$$f(\frac{M}{M}) = 0.83, f(\frac{M}{N}) = 0.16, f(\frac{N}{N}) = 0.01$$
Figure 5: Observed phenotype frequencies (and therefore genotype frequencies) of hypothetical blood antigens $M$ and $N$.

Based on this information, we can calculate $p$ and $q$:

$$p = f(M) = 0.83 + \frac{1}{2}\cdot0.16 = 0.91\\ q = f(N) = 0.01 + \frac{1}{2}\cdot0.16 = 0.09$$
Figure 6: Derivation of allele frequencies for $M$ and $N$ based on the genotype frequencies given in Fig. 5.
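The calculation in Fig. 6 can be sketched in a few lines of Python. This is only an illustration of the formula in Fig. 4; the function name `allele_frequencies` and the tolerance used to check that the inputs sum to one are assumptions made for this example, not part of the chapter.

```python
def allele_frequencies(f_mm, f_mn, f_nn):
    """Return (p, q) from the three genotype frequencies, as in Fig. 4.

    f_mm, f_mn, f_nn are the frequencies of M/M, M/N and N/N.
    The name and the 1e-6 tolerance are illustrative assumptions.
    """
    total = f_mm + f_mn + f_nn
    if abs(total - 1.0) > 1e-6:
        raise ValueError(f"genotype frequencies sum to {total}, not 1")
    # Each homozygote carries two copies of its allele;
    # each heterozygote contributes one copy of each allele.
    p = f_mm + 0.5 * f_mn
    q = f_nn + 0.5 * f_mn
    return p, q


# Genotype frequencies from Fig. 5
p, q = allele_frequencies(0.83, 0.16, 0.01)
print(f"p = {p:.2f}, q = {q:.2f}")  # p = 0.91, q = 0.09
```

Because every heterozygote contributes one $M$ and one $N$, the two results always add up to one whenever the three inputs do.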

We can get both $p$ and $q$ with just two of the genotype frequencies because the three genotype frequencies must sum to 1:

$$f(\frac{M}{M}) + f(\frac{M}{N}) + f(\frac{N}{N}) = 1$$
Figure 7: Three possible genotypes with two alleles in a population, $M$ and $N$, with their genotype frequencies adding up to one. Note that as before, the “fractions” here inside the $f()$ are not math symbols but genotypes. This figure is essentially the same as Fig. 3 except that we are using the codominant alleles $M$ and $N$ in our example here instead of the generic $A$ and $a$ in Fig. 3.

Now let's think about how the inverse calculation would be performed. That is, how would we derive the genotype frequencies for the “F1” generation from the allele frequencies of the “P” generation1)? To do this we must make an assumption about the frequency of mating of parents with different genotypes. If we assume that the gametes mix at random, we can calculate the compound probabilities of obtaining each possible combination of alleles in the offspring:

| | egg: $M$ ($p$) | egg: $N$ ($q$) |
| sperm: $M$ ($p$) | $\frac{M}{M}$ ($p^2$) | $\frac{M}{N}$ ($p \cdot q$) |
| sperm: $N$ ($q$) | $\frac{M}{N}$ ($p \cdot q$) | $\frac{N}{N}$ ($q^2$) |

Table 1: Calculating genotype frequencies of offspring based on allele frequencies. Does this table format look familiar to you?

Thus the genotype frequencies for the next generation are:

$f(\frac{M}{M}) = p^2$, $f(\frac{M}{N}) = 2pq$, and $f(\frac{N}{N}) = q^2$

Figure 8: Calculating genotype frequencies based on allele frequencies, continued from Table 1. Note that since $p+q=1$ (Fig. 2), it follows that $(p+q)^2=1$ and therefore $p^2+2pq+q^2=1$, which is equivalent to saying that $f(\frac{M}{M}) + f(\frac{M}{N}) + f(\frac{N}{N}) = 1$. In other words, the genotype frequencies of all possible genotypes must add up to one. Does this math look familiar to you?

We can now calculate the new $p_1$ for the new generation using the formula for deriving allele frequencies from genotype frequencies seen in Fig. 4:

$$\begin{aligned} p_1 &= f(\frac{M}{M}) + \frac{1}{2} f(\frac{M}{N})\\ &= p^2 + pq\\ &=p (p +q)\\ &=p \end{aligned}$$
Figure 9: If random mating occurs, then the frequency of alleles does not change from one generation ($p$) to another ($p_1$).
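The result in Fig. 9 can also be checked numerically. The sketch below applies Table 1 and the formula in Fig. 4 repeatedly, starting from the $p = 0.91$ of the $M$/$N$ example. The function name `next_generation` is a hypothetical helper introduced for this illustration, and the model assumes an idealized population with random mixing of gametes and nothing else acting on allele frequencies.

```python
def next_generation(p):
    """One round of random gamete mixing.

    Returns the offspring genotype frequencies (Table 1 / Fig. 8)
    and the allele frequency p_1 recomputed from them (Fig. 4).
    Hypothetical helper for illustration only.
    """
    q = 1.0 - p
    genotypes = {"M/M": p * p, "M/N": 2 * p * q, "N/N": q * q}
    p_next = genotypes["M/M"] + 0.5 * genotypes["M/N"]
    return genotypes, p_next


p = 0.91
for generation in range(1, 4):
    genotypes, p = next_generation(p)
    summary = ", ".join(f"{name} = {freq:.4f}" for name, freq in genotypes.items())
    print(f"generation {generation}: p = {p:.4f}; {summary}")
```

Each pass prints the same allele frequency and the same genotype frequencies, which is the numerical counterpart of $p_1 = p$.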

From Fig. 9 we obtain the simple but very important result that when mixing of gametes occurs at random, the allele frequencies do not change from one generation to the next (i.e., $p_1=p$). We say that any gene with this property is in genetic equilibrium. When applied to all genes and alleles in a population, this condition is known as a Hardy-Weinberg equilibrium, named after British mathematician G.H. Hardy and German physician Wilhelm Weinberg. Sometimes geneticists use “Hardy-Weinberg equilibrium” as a synonym for genetic equilibrium.

Examples of populations in (and not in) Hardy-Weinberg equilibria

If we know the genotype frequencies and allele frequencies of a gene in a population, then we can ask whether that gene is in a Hardy-Weinberg equilibrium in that population by determining whether the genotype frequencies reflect random mixing of alleles. Consider two different populations that have different genotype frequencies and different allele frequencies (Table 2):

| Population | $\frac{M}{M}$ | $\frac{M}{N}$ | $\frac{N}{N}$ | $p$ | $q$ |
| U.S. Caucasians | 0.29 | 0.5 | 0.21 | 0.54 | 0.46 |
| American Inuit | 0.84 | 0.16 | 0.008 | 0.92 | 0.08 |

Table 2: Hypothetical dataset showing observed genotype frequencies of codominant blood antigens $M$ and $N$, and calculated allele frequencies $p$ and $q$ in U.S. Caucasians vs American Inuit.

Although the allele frequencies are quite different, both populations have genotype frequencies and allele frequencies that fit the Hardy-Weinberg equilibrium. For instance, in the U.S. Caucasian population, the frequencies of $\frac{M}{M}$, $\frac{M}{N}$, and $\frac{N}{N}$ are obtained by surveying the population (this is your primary data). From these frequencies, we can calculate $p$ and $q$ using the formula in Fig. 4. The values of $p^2$, $2pq$, and $q^2$ computed from these allele frequencies then closely match the observed genotype frequencies.
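As a worked check for the U.S. Caucasian row of Table 2, using only the formulas in Fig. 4 and Fig. 8 and rounding to two decimal places:

$$\begin{aligned} p &= 0.29 + \frac{1}{2}\cdot 0.5 = 0.54\\ q &= 0.21 + \frac{1}{2}\cdot 0.5 = 0.46\\ p^2 &= 0.54^2 = 0.2916 \approx 0.29\\ 2pq &= 2\cdot 0.54\cdot 0.46 = 0.4968 \approx 0.50\\ q^2 &= 0.46^2 = 0.2116 \approx 0.21 \end{aligned}$$

The predicted values agree with the observed frequencies of 0.29, 0.5, and 0.21 to within rounding.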

Now let's consider two sample populations that have the same allele frequencies but have different genotype frequencies.

| Population | $\frac{M}{M}$ | $\frac{M}{N}$ | $\frac{N}{N}$ | $p$ | $q$ |
| population 1 | 0.2 | 0.2 | 0.6 | 0.3 | 0.7 |
| population 2 | 0.09 | 0.42 | 0.49 | 0.3 | 0.7 |

Table 3: Dataset showing observed genotype frequencies of codominant blood antigens $M$ and $N$, and calculated allele frequencies $p$ and $q$ in two different hypothetical populations. Compare to Table 2.

Based on the observed genotype frequencies of population 1, we can calculate $p$ and $q$, and we can further derive $p^2=0.09$, $2pq=0.42$, and $q^2=0.49$. These are the genotype frequencies we would predict from the allele frequencies under random mixing of gametes, and they clearly do not match the observed genotype frequencies of $\frac{M}{M}=0.2$, $\frac{M}{N}=0.2$, and $\frac{N}{N}=0.6$. For example, the predicted $p^2=0.09$ is less than the observed genotype frequency $f(\frac{M}{M})=0.2$, and the predicted $2pq=0.42$ is greater than the observed $f(\frac{M}{N})=0.2$, suggesting that something is causing heterozygotes to be underrepresented relative to random mixing. Population 1 is not in a Hardy-Weinberg equilibrium.

We can do the same thing for population 2. Even though the observed genotype frequencies are different, the allele frequencies $p$ and $q$ wind up being the same as in population 1. And when we now use the allele frequencies to derive $p^2$, $2pq$, and $q^2$, we find they match the observed genotype frequencies. Therefore, population 2 is in a Hardy-Weinberg equilibrium.
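The comparison carried out for the two populations in Table 3 can be sketched as follows. The function name `hardy_weinberg_check` and the tolerance of 0.01 are assumptions chosen for this illustration. With real survey data one would normally work from genotype counts and apply a statistical test (for example a chi-square test) rather than a fixed tolerance.

```python
def hardy_weinberg_check(f_mm, f_mn, f_nn, tolerance=0.01):
    """Compare observed genotype frequencies with p^2, 2pq, q^2.

    Returns (p, q, expected, fits). The tolerance is an arbitrary
    illustrative cutoff, not a statistical test.
    """
    # Allele frequencies from genotype frequencies (Fig. 4)
    p = f_mm + 0.5 * f_mn
    q = f_nn + 0.5 * f_mn
    # Genotype frequencies predicted by random mixing (Fig. 8)
    expected = (p * p, 2 * p * q, q * q)
    observed = (f_mm, f_mn, f_nn)
    fits = all(abs(o - e) <= tolerance for o, e in zip(observed, expected))
    return p, q, expected, fits


# Observed genotype frequencies from Table 3
populations = {
    "population 1": (0.2, 0.2, 0.6),
    "population 2": (0.09, 0.42, 0.49),
}
for name, observed in populations.items():
    p, q, expected, fits = hardy_weinberg_check(*observed)
    predicted = ", ".join(f"{e:.2f}" for e in expected)
    print(f"{name}: p = {p:.2f}, q = {q:.2f}, predicted = ({predicted}), fits = {fits}")
```

Both populations give $p = 0.3$ and $q = 0.7$, but only population 2 has observed genotype frequencies that match the predicted 0.09, 0.42, and 0.49.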

Here is a graph showing the relationship between allele and genotype frequencies for genes that are in a Hardy-Weinberg equilibrium:

Figure 10: Relationship between allele frequency $p$ (and $q$) and genotype frequency for genes in a Hardy-Weinberg equilibrium.
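The curves in Fig. 10 follow directly from $p^2$, $2pq$, and $q^2$, so they can be tabulated with a short script. This is a minimal sketch; the function name `genotype_frequencies`, the use of the generic alleles $A$ and $a$, and the step size of 0.1 are choices made for this example.

```python
def genotype_frequencies(p):
    """Hardy-Weinberg genotype frequencies (A/A, A/a, a/a) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q


print("   p     q   f(A/A)  f(A/a)  f(a/a)")
for step in range(11):
    p = step / 10
    f_hom_A, f_het, f_hom_a = genotype_frequencies(p)
    print(f"{p:4.1f}  {1 - p:4.1f}  {f_hom_A:6.2f}  {f_het:6.2f}  {f_hom_a:6.2f}")
```

The heterozygote frequency $2pq$ peaks at 0.5 when $p = q = 0.5$, and each homozygote frequency rises toward one as its allele approaches fixation.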

Before (Fig. 4), we needed at least two of the genotype frequencies to calculate the allele frequencies, but if we know that the population is in a Hardy-Weinberg equilibrium, we can get both allele frequencies and all genotype frequencies from just one of the genotype frequencies or one of the allele frequencies.
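As an illustration of this shortcut, the sketch below recovers everything from a single homozygote frequency, using $q = \sqrt{f(\frac{N}{N})}$. The function name `from_homozygote_frequency` is hypothetical, and the calculation is only valid under the assumption that the population really is in a Hardy-Weinberg equilibrium.

```python
import math


def from_homozygote_frequency(f_nn):
    """Recover p, q and all genotype frequencies from f(N/N) alone.

    Assumes Hardy-Weinberg equilibrium, so that f(N/N) = q^2.
    Hypothetical helper for illustration only.
    """
    if not 0.0 <= f_nn <= 1.0:
        raise ValueError("a frequency must lie between 0 and 1")
    q = math.sqrt(f_nn)
    p = 1.0 - q
    return {"p": p, "q": q, "M/M": p * p, "M/N": 2 * p * q, "N/N": q * q}


# Population 2 of Table 3: only f(N/N) = 0.49 is supplied
for name, value in from_homozygote_frequency(0.49).items():
    print(f"{name}: {value:.2f}")
```

Starting from $f(\frac{N}{N}) = 0.49$ alone, this reproduces the full population 2 row of Table 3. Starting from a homozygote frequency avoids an ambiguity: a heterozygote frequency $2pq$ by itself does not say which of the two alleles is the more common one.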

Hardy-Weinberg vs. the real (human) world

Our discussion above of the Hardy-Weinberg equilibrium assumes that there is random mating in human populations. That is to say, in order for a population to be in a Hardy-Weinberg equilibrium, there must be random mating within the population. How good is the random mating assumption in actual human populations? These are some of the conditions that affect the random mating assumption and therefore may affect the Hardy-Weinberg equilibrium:

Genotypic effects on choice of mating partner

Examination of allele frequencies and genotype frequencies for most genes in human populations reveals that they closely fit a Hardy-Weinberg equilibrium. The implication is that in general, humans choose their mates at random with respect to individual genes and alleles. This may seem odd given that personal experience says that choosing a mate is anything but random. However, the usual criteria for choosing mates, such as character, appearance, and social position, are largely not determined genetically and, to the extent that they are genetically determined, these are all very complex traits that are influenced by a large number of different genes. The net result is that our decision of with whom we have children does not in general systematically favor some alleles over others.

One of the exceptional conditions that produce a population that is not in a Hardy-Weinberg equilibrium is known as assortative mating, which means preferential mating between similar individuals. For example, individuals with inherited deafness have a relatively high probability of having children together. But even this type of assortative mating will only affect the genotype frequencies related to deafness.

New mutations

Although new mutations continually arise, mutation rates are usually sufficiently small that in any single generation their effect on allele frequencies is negligible. As will be discussed in the next chapter, mutations compounded over many generations can have a significant effect on allele frequencies.

Selection

“Selection” here has a slightly different but still related meaning from the “selection” discussed in Chapter 08 as a genetic trick for isolating rare events. “Selection” in this context refers to differences in survival or reproduction among different genotypes. As with new mutations, the effect of selection is usually small in any single generation and therefore usually does not affect Hardy-Weinberg equilibria. An exception would be a recessive lethal mutation, which would render the genotype frequency of the homozygote zero regardless of the genotype frequency of the heterozygote. As will be discussed in the next chapter, selection can have a significant effect over many generations.

Genetic drift/founder effect

In small populations, only a small number of individuals pass their alleles on to the next generation. Under these circumstances, chance fluctuations in the alleles that are transmitted can cause significant changes in allele frequency. These effects are usually insignificant for large populations such as that of the U.S.

To see how this would happen, consider a gene in a very large population with a single major dominant allele $A$ and 10 minor recessive alleles $a_1$, $a_2$, $a_3$ … $a_{10}$ with allele frequencies $f(a_1) = f(a_2) = f(a_3) = \dots = f(a_{10}) = 10^{-4}$ and $f(A)=0.999$. Now imagine that a group of 500 individuals from this population moves to an island to start a new population.

The aggregate frequency of recessive alleles in this population is $f(a_n) = 10\times10^{-4}=10^{-3}$, where $a_n$ stands for any one of the 10 recessive alleles. We can calculate the probability that exactly $n$ individuals are carriers (either $\frac{a_n}{a_n}$ or $\frac{A}{a_n}$) as:

$$p(n)=(p^2+2pq)^n \cdot (q^2)^{500-n} \cdot \frac{500!}{n!(500-n)!}$$
Figure 11: Probability that $n$ of the 500 individuals moving to the island are carriers of one of the recessive alleles $a_n$. Based on the information given in the text above, we know that $p=f(a_n)=0.001$ and $q=0.999$. Note that $p$ here is the aggregate recessive allele frequency, the reverse of the convention in Fig. 1.

Plugging in different integer values for $n$ and values for $p$ and $q$ as described in the legend for Fig. 11, we get the following table:

number of carriers $n$ | probability that $n$ carriers are part of the migrating population
1 | 0.368
2 | 0.184
3 | 0.061
4 | 0.015
5 | 0.003
6 | <0.001

Table 4: Probability of there being $n$ carriers of rare recessive alleles in the migrating population in our example.
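The entries in Table 4 can be reproduced with a few lines of Python. This is only a minimal sketch of the formula in Fig. 11; the function name `carrier_probability` and its parameter names are our own choices for illustration, and $p$ and $q$ follow the Fig. 11 legend.

```python
from math import comb

def carrier_probability(n, founders=500, p=0.001):
    """Probability that exactly n of the founders are carriers (Fig. 11).

    p is the aggregate recessive allele frequency and q = 1 - p,
    as in the legend for Fig. 11.
    """
    q = 1.0 - p
    carrier = p**2 + 2 * p * q   # individual has at least one recessive allele
    non_carrier = q**2           # individual is A/A
    return comb(founders, n) * carrier**n * non_carrier**(founders - n)

# Print n and the probability of exactly n carriers, rounded as in Table 4.
for n in range(0, 7):
    print(n, round(carrier_probability(n), 3))

# Probability of at least one carrier among the founders.
at_least_one = 1.0 - carrier_probability(0)
print("at least one carrier:", round(at_least_one, 3))
```

The loop also prints the $n=0$ case, which Table 4 omits; one minus that value is the probability of at least one carrier.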

The way to interpret Table 4 is to say, “The probability that there is exactly one carrier in the population is 0.368”, “The probability that there are exactly two carriers in the population is 0.184”, etc. The likelihood that there will be at least one carrier is actually pretty good; it is given by $1-(q^2)^{500} = 0.632$. Since there are 10 recessive alleles, the minimum number of carriers you would need to have all 10 alleles move to the new population would be $n=5$, since humans are diploid. But the likelihood of there being 5 carriers in the population is 0.003, or less than 1%. Because the alleles are rare, it's much more likely that only one or two carriers will be part of the migration. Also, it's highly unlikely that any of the carriers will be homozygous for any of the recessive alleles (or even carry two different recessive alleles).

Let's say allele $a_1$ is brought over by a single individual with genotype $\frac{A}{a_1}$. This means that alleles $a_2$ through $a_{10}$ are now lost in the new population. Because the 500 diploid founders carry 1000 copies of the gene between them, the new allele frequency for $a_1$ is 1 in 1000, or $f(a_1)=10^{-3}$: a 10-fold increase! Thus, in a stochastic fashion, most of the rare recessive alleles will be lost, whereas an occasional rare allele will experience an increase in frequency. The smaller the founding population, the more likely it is that a rare allele will be lost and the greater the increase in frequency experienced by the alleles that happen to be carried over.
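To get a feel for how founder population size matters, here is a small sketch in Python. It rests on a simplifying assumption that is ours, not part of the example above: the $2N$ allele copies carried by $N$ founders are treated as independent random draws from the original population. The function names are hypothetical.

```python
def loss_probability(allele_freq, founders):
    """Chance that none of the 2 * founders allele copies is the given allele.

    Assumes each copy is an independent draw from the source population.
    """
    return (1.0 - allele_freq) ** (2 * founders)

def single_copy_frequency(founders):
    """Frequency of an allele brought over by one heterozygous founder."""
    return 1.0 / (2 * founders)

original_freq = 1e-4  # f(a_1) in the large source population
for founders in (50, 500, 5000):
    lost = loss_probability(original_freq, founders)
    new_freq = single_copy_frequency(founders)
    fold_change = new_freq / original_freq
    print(founders, round(lost, 3), new_freq, round(fold_change, 1))
```

Each row shows the founder population size, the probability that $a_1$ is lost entirely, the frequency of $a_1$ if exactly one copy makes the trip, and the resulting fold change relative to $10^{-4}$.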

Migration of individuals between different populations

When individuals from populations with different allele frequencies mix, the combined population will be in a Hardy-Weinberg equilibrium after one generation of random mating. The combined population will be out of equilibrium to the extent that mating is assortative.
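A short numerical sketch may help here. The allele frequencies 0.9 and 0.1 and the equal population sizes are made-up values for illustration, and the function names are our own. Under those assumptions, the pooled population has too few heterozygotes for its allele frequency right after mixing, and one round of random mating restores the Hardy-Weinberg proportions.

```python
def hw_genotypes(p):
    """Hardy-Weinberg genotype frequencies (A/A, A/a, a/a) for f(A) = p."""
    q = 1.0 - p
    return (p * p, 2 * p * q, q * q)

def pooled_genotypes(p1, p2, share1=0.5):
    """Genotype frequencies right after pooling two equilibrium populations."""
    g1, g2 = hw_genotypes(p1), hw_genotypes(p2)
    return tuple(share1 * a + (1.0 - share1) * b for a, b in zip(g1, g2))

def allele_freq_A(genotypes):
    """f(A) recovered from genotype frequencies (A/A, A/a, a/a)."""
    hom_A, het, _ = genotypes
    return hom_A + 0.5 * het

mixed = pooled_genotypes(0.9, 0.1)        # hypothetical allele frequencies
p_mixed = allele_freq_A(mixed)            # allele frequency in the pooled group
next_generation = hw_genotypes(p_mixed)   # after one round of random mating

print("just mixed:     ", [round(x, 3) for x in mixed])
print("next generation:", [round(x, 3) for x in next_generation])
```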

An example of calculating allele frequency in humans: albinism

If we are considering rare alleles, we can make the following approximations, allowing us to avoid a lot of messy algebra in our calculations.

For $f(A)=p$ and $f(a)=q$,
If $q \ll 1$, then $p \approx 1$

Therefore, $f(\frac{A}{A})=p^2 \approx 1$, $f(\frac{A}{a})=2pq \approx 2q$, and $f(\frac{a}{a})=q^2$. Since most genetic diseases are rare, these approximations are valid for many of the population genetics calculations that are of medical importance.

Let's look at a real-life example. Albinism occurs in approximately 1 in 20,000 individuals in humans. Let's say that this condition is due to a recessive allele $a$ of a single gene that is in a Hardy-Weinberg equilibrium. Based on this information, we can derive the allele frequency for $a$:

$$f(\frac{a}{a}) = \frac{1}{20000} = 5\times10^{-5}=q^2\\ q=\sqrt{5\times10^{-5}}\approx 7\times10^{-3}$$

And based on this we can also calculate the frequency of heterozygotes in the population:

$$f(\frac{A}{a})=2pq\approx 2q = 1.4\times10^{-2}$$

In other words, approximately 1 in 70 humans are carriers for albinism. We can next calculate the fraction of alleles for albinism that are in individuals that are actually albinos (i.e., their genotype is $\frac{a}{a}$). If we let $\text{N}$ = population size, then the number of $a$ alleles in homozygotes will be $2\times\text{N}\cdot q^2$. The number of $a$ alleles in heterozygotes (carriers) will be $1\times\text{N} \cdot 2pq \approx \text{N}(2q)$. Therefore, the fraction of $a$ alleles in homozygotes is:

$$ \frac{2\times\text{N}\cdot q^2}{2\times\text{N}\cdot q^2+\text{N}(2q)}\\ =\frac{q}{q+1}$$

Since $q \ll 1$, we can approximate $\frac{q}{q+1}$ with just $q$. In other words, the fraction of $a$ alleles in homozygotes is $7\times10^{-3}$. The vast majority of the $a$ alleles ($1-(7\times10^{-3})=99.3\%$) are in heterozygotes.
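The albinism numbers are easy to check numerically. The following Python sketch uses only the 1 in 20,000 incidence from the text and the Hardy-Weinberg assumption; the variable names are our own. It computes the carrier frequency both with and without the $p \approx 1$ approximation, so you can see how little the approximation costs.

```python
from math import sqrt

incidence = 1 / 20000     # f(a/a), from the text
q = sqrt(incidence)       # f(a), assuming Hardy-Weinberg equilibrium
p = 1.0 - q               # f(A)

carriers_exact = 2 * p * q    # f(A/a) without approximation
carriers_approx = 2 * q       # f(A/a) using p ≈ 1

# Fraction of all a alleles that sit in a/a homozygotes, without approximation.
# Each homozygote holds two copies, each heterozygote holds one.
share_in_homozygotes = (2 * q**2) / (2 * q**2 + 2 * p * q)

print("q                      =", q)
print("carriers (exact)       =", carriers_exact)
print("carriers (approximate) =", carriers_approx)
print("one carrier in about   =", round(1 / carriers_approx), "people")
print("share of a in a/a      =", share_in_homozygotes)
```

Without the approximation, the share of $a$ alleles in homozygotes works out to exactly $q$, because $p+q=1$; the approximate route through $\frac{q}{q+1}$ lands on essentially the same number.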

A basic ethics lesson we can get from this simple example is that eugenics is an exercise in futility. Recessive alleles can easily exist at relatively high frequencies inside human populations, with the vast majority of copies carried by heterozygotes.

1)
Note that “P” and “F1” are symbols used in Mendelian genetics for model organisms and don't really apply to population genetics. We're using them here just as an analogy.
chapter_18.1725853785.txt.gz · Last modified: 2024/09/08 20:49 by mike