chapter_21
Differences
This shows you the differences between two versions of the page.
chapter_21 [2024/09/15 20:24] – [Polymorphisms and mapping] mike | chapter_21 [2024/09/18 19:47] (current) – mike | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | <- chapter_20|Chapter 20^table_of_contents|Table of Contents^chapter_22|Chapter 22 -> | ||
+ | |||
<typo fs: | <typo fs: | ||
Line 22: | Line 24: | ||
</ | </ | ||
- | Of course, to many people humans are the organism in which they are most interested, either for their intrinsic interest in themselves or for biomedical applications. Humans are also more difficult to study compared to the other model organisms we have discussed in this book. The human genome is much larger and less gene dense than invertebrates such as Drosophila or //C. elegans//. More importantly, | + | Of course, to many people humans are the organism in which they are most interested, either for their intrinsic interest in themselves or for biomedical applications. Humans are also more difficult to study compared to the other model organisms we have discussed in this book. The human genome is much larger and less gene dense than invertebrates such as Drosophila or //C. elegans//. More importantly, |
===== Polymorphisms and mapping ===== | ===== Polymorphisms and mapping ===== | ||
Line 31: | Line 33: | ||
- You can describe the physical location of a mutation using coordinates on DNA (i.e., " | - You can describe the physical location of a mutation using coordinates on DNA (i.e., " | ||
- | In human genetics, we don't have visible marker mutations such as "white" | + | In human genetics, we don't have visible marker mutations such as $white$ and $yellow$ in Drosophila, and even if we did we could not force humans to do crosses (and even if we could, we would have to wait 20 years to get an F1 generation!). Instead, we rely on DNA polymorphisms as markers. A polymorphism is simply a difference in DNA sequence at a particular location in the genome between individuals in a population. These differences can be inside a gene, or in between genes. DNA polymorphisms include substitutions, |
+ | |||
+ | When human individuals with interesting (usually disease-related) phenotypes are found, we can examine their family tree (i.e., their pedigree), see if there are other affected individuals, | ||
+ | |||
+ | Two types of DNA polymorphisms are of particular importance in human genetics: single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs). | ||
+ | |||
+ | ==== Single nucleotide polymorphisms (SNPs) ==== | ||
+ | |||
+ | A single nucleotide polymorphism, | ||
+ | |||
+ | How frequently are SNPs found? All humans are 99.9% identical at the DNA level. This means that on average, at a randomly selected locus, two randomly selected human alleles will differ at a frequency of 0.001. This implies that your maternal genome (the haploid genome that you inherited from your mother) differs from your paternal genome (inherited from your father) at about 1 bp per 1000. Since the human genome is 3x10< | ||
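The arithmetic behind this estimate can be sketched in a few lines. This is only an illustration: the haploid genome size of roughly 3 billion bp is an assumed round number, and the function name is hypothetical.

```python
# Rough estimate of how many single-base differences separate two haploid
# human genomes, assuming about 1 difference per 1000 bp (as stated above).

HAPLOID_GENOME_BP = 3_000_000_000  # assumption: ~3 billion bp, a round figure
BP_PER_DIFFERENCE = 1000           # about 1 differing bp per 1000 bp


def expected_differences(genome_bp: int, bp_per_difference: int) -> float:
    """Expected number of differing positions between two haploid genomes."""
    return genome_bp / bp_per_difference


n_differences = expected_differences(HAPLOID_GENOME_BP, BP_PER_DIFFERENCE)
print(f"Expected differences: about {n_differences:,.0f}")
```

Under these assumptions, your maternal and paternal genomes would be expected to differ at a few million positions.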
+ | |||
+ | The vast majority (probably 99%) of SNPs are selectively “neutral” changes of little or no functional consequence. This is mostly because they likely fall in the >97% of the human genome that lies outside coding or gene regulatory regions. They can also be silent substitutions in coding sequences, or amino acid substitutions that do not affect protein stability or function. A small minority of SNPs are of functional consequence and are selectively advantageous or disadvantageous; | ||
+ | |||
+ | SNPs can be detected in a variety of ways. Conceptually, | ||
+ | |||
+ | ==== Simple sequence repeats (SSRs) ==== | ||
+ | |||
+ | Simple sequence repeats (SSRs) also go by a variety of other names: microsatellites, | ||
+ | |||
+ | SSRs in noncoding regions typically do not affect gene function, and therefore are usually not under any kind of selection. This means that SSR loci can accumulate mutations, which leads to different alleles in a population. The different alleles vary in length, and this difference in length can be detected by PCR ([[chapter_07|Chapter 07]]). To distinguish between different SSR alleles, a researcher would use primers that flank the SSR; the length of the PCR amplification product can then be determined by electrophoresis, | ||
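To see why SSR alleles give PCR products of different lengths, here is a minimal sketch. Every number in it (the flanking lengths, the "GATA" repeat unit, and the repeat counts) is hypothetical and chosen only for illustration; it does not describe any real locus.

```python
# Predicted PCR product length for an SSR allele: the flanking sequence
# (including the primers) stays constant, so only the repeat count changes
# the product length.

def amplicon_length(left_flank_bp: int, right_flank_bp: int,
                    repeat_unit: str, repeat_count: int) -> int:
    """Length in bp of the PCR product spanning the SSR and both flanks."""
    return left_flank_bp + len(repeat_unit) * repeat_count + right_flank_bp


# Hypothetical alleles that differ only in the number of repeats.
repeat_counts = {"allele_A": 8, "allele_B": 11}

lengths = {
    name: amplicon_length(60, 50, "GATA", count)
    for name, count in repeat_counts.items()
}

for name, length in lengths.items():
    print(f"{name}: {length} bp")
```

In this made-up case, an individual heterozygous for the two alleles would show two bands after electrophoresis, 12 bp apart.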
+ | |||
+ | < | ||
+ | {{ : | ||
+ | < | ||
+ | CODIS (placeholder). Source: [[https:// | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | |||
+ | SSRs can be used as markers for any kind of mapping of human genes, but they are commonly used in forensics. The Federal Bureau of Investigation (FBI) maintains a DNA database called the Combined DNA Index System (CODIS) that contains data on a core set of SSR alleles from convicted offenders or arrestees of various crimes. Prior to 2017, 13 STR loci were used in CODIS entries; since 2017, an additional 7 STR loci have been added. These loci are chosen such that they are unlinked from each other. This maximizes their utility in identifying unique individuals. This technique of using STR allele combinations to identify individuals is called DNA fingerprinting. | ||
+ | |||
+ | A consequence of SSR loci being neutral and not under selection is that these loci are usually in Hardy-Weinberg equilibrium ([[chapter_18|Chapter 18]]). This allows forensic scientists to use the principles of population genetics to calculate expected genotype frequencies from allele frequencies. When the DNA of a suspect matches forensic evidence at a crime scene, allele frequencies (together with information on SSR mutation rates) allow forensic scientists to calculate the likelihood that the match between the combination of SSR alleles found in the evidence and that of the suspect is due to random chance. For instance, let's say there are 11 different alleles at an SSR locus. Let's say that allele 1 has a frequency of 0.5 and allele 2 has a frequency of 0.2; the remaining alleles are rarer and make up the remaining 0.3. The likelihood of a random individual in the population being heterozygous at this locus for alleles 1 and 2 is $2 \times 0.5 \times 0.2 = 0.2$, or 20% (the factor of 2 arises because either allele can come from either parent). Let's just guesstimate that the likelihood for a random match for most SSR loci is about 10% (or 0.1). If a forensic investigator compares 13 different loci and gets a perfect match to a suspect, the likelihood that this match is due to random chance is $0.1^{13}=10^{-13}$, | ||
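The random-match calculation can be sketched as follows. This assumes Hardy-Weinberg proportions at each locus (a heterozygote for two alleles with frequencies p and q is expected at frequency 2pq) and unlinked, independent loci, so per-locus probabilities multiply. The function names are hypothetical, and the 0.1 per-locus value is the guesstimate from the text, not real CODIS data.

```python
import math


def heterozygote_frequency(p: float, q: float) -> float:
    """Expected frequency of a heterozygote for two distinct alleles (2pq)."""
    return 2 * p * q


def homozygote_frequency(p: float) -> float:
    """Expected frequency of a homozygote for one allele (p squared)."""
    return p * p


def random_match_probability(per_locus_probabilities: list[float]) -> float:
    """Chance of a match at every locus, assuming the loci are independent."""
    return math.prod(per_locus_probabilities)


# Single locus: allele 1 at frequency 0.5 and allele 2 at frequency 0.2.
het = heterozygote_frequency(0.5, 0.2)
print(f"Heterozygote (1/2) frequency: {het:.2f}")

# 13 loci, each guesstimated at a 0.1 chance of a random match.
rmp = random_match_probability([0.1] * 13)
print(f"Random match probability over 13 loci: {rmp:.1e}")
```

Multiplying across loci is only valid because the loci are chosen to be unlinked; linked loci would not be inherited independently and the product would overstate the evidence.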
+ | |||
+ | |||
+ | |||
+ | ===== Example of using polymorphisms to map a human mutation: hypolactasia ===== | ||
- | Two types of DNA polymorphisms are of particular importance in human genetics: single nucleotide polymorphisms | + | The digestion |
In 2002, a team of Finnish scientists set out to use human genetics methods to identify mutations that are associated with hypolactasia. They reasoned that the causative mutations are probably not in the protein-coding region of $LCT$, since affected individuals could digest lactose as children (they also knew from other studies that there were no mutations in the $LCT$ gene of individuals who had hypolactasia). Instead, they believed that there may be mutations in nearby cis-acting regulatory sequences (see [[chapter_13|Chap. 13]]) that control the expression of $LCT$, such that it is no longer expressed in adults (they also had some other evidence to support this idea).
The scientists examined the pedigrees of nine Finnish families with a history of hypolactasia (Fig xxx: NOTE: WAITING FOR PERMISSION FROM HHMI TO USE THE FIGURES). From the pedigrees, you can see that the inheritance pattern is consistent with hypolactasia being an autosomal recessive trait. The scientists then collected DNA samples from volunteers in these families and analyzed various polymorphisms. They utilized seven SSRs that flanked the $LCT$ gene on either side. They found strong statistical evidence (see [[chapter_22|Chap. 22]]) for linkage of hypolactasia to an SSR upstream of the $LCT$ gene, consistent with the causative mutation being regulatory rather than coding. Since SNPs are much denser than SSRs, they then used SNPs to further narrow down the genetic interval for the mutations to a 47 kb region of DNA upstream of $LCT$. In this region, they found many DNA polymorphisms among the family members, but only two SNPs that showed complete co-segregation with the hypolactasia trait. Subsequent reverse genetic studies in mice and cultured human cells suggest that these two SNPs may be mutations that affect the ability of a transcription factor called OCT-1 to bind, thereby affecting the expression of $LCT$ in adults. |
chapter_21.1726457079.txt.gz · Last modified: 2024/09/15 20:24 by mike