chapter_22
Differences
This shows you the differences between two versions of the page.
chapter_22 [2024/11/25 17:36] – [LOD score formula for unknown phase] mike | chapter_22 [2024/11/25 18:34] (current) – [LOD score formula for unknown phase] mike | ||
---|---|---|---|
Line 210: | Line 210: | ||
</ | </ | ||
- | We can see that the LOD score is highest around θ = 0.15, at which LOD = 2.35 (you can calculate a more precise maximal value of LOD using some calculus). Experience tells us that LOD > 3.3 is usually the cutoff point at which the linkage is likely to be real (it corresponds to a 0.05 false positive rate); conversely, LOD < -2 usually means we can rule out linkage. Therefore, we can see that the data do not support linkage between $m$ and $s$, but do not rule it out either. | + | <figure Fig3> |
+ | {{ : | ||
+ | < | ||
+ | placeholder. graph plotting LOD as a function of theta. | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | We can see from Table {{ref> | ||
What if our sample size were bigger? Let's say that instead of 20 offspring, we looked at 40 offspring and obtained R=6 and NR=34. At θ=0.15, we can calculate that LOD=4.7. This puts us over the threshold of 3.3. Based on these data, we can conclude that $m$ and $s$ are linked, and the data most strongly support a distance of 15 map units between them. | What if our sample size were bigger? Let's say that instead of 20 offspring, we looked at 40 offspring and obtained R=6 and NR=34. At θ=0.15, we can calculate that LOD=4.7. This puts us over the threshold of 3.3. Based on these data, we can conclude that $m$ and $s$ are linked, and the data most strongly support a distance of 15 map units between them. | ||
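The known-phase calculation for the two sample sizes can be reproduced with a short script. This is a minimal sketch, assuming base-10 logarithms and a likelihood of θ^R · (1-θ)^NR compared against the unlinked case θ = 1/2. The function name `lod_known_phase` is hypothetical, and the counts R=3, NR=17 for the 20-offspring case are an assumption, chosen because they reproduce the quoted LOD of 2.35 at θ = 0.15.

```python
import math


def lod_known_phase(r: int, nr: int, theta: float) -> float:
    """LOD score for r recombinant and nr non-recombinant offspring.

    Assumes the phase is known, so the likelihood under linkage is
    theta^r * (1 - theta)^nr, compared against 0.5^(r + nr) (no linkage).
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    log_linked = r * math.log10(theta) + nr * math.log10(1.0 - theta)
    log_unlinked = (r + nr) * math.log10(0.5)
    return log_linked - log_unlinked


THRESHOLD = 3.3  # cutoff for declaring linkage, as used in the text

# 20 offspring (assumed R=3, NR=17) and 40 offspring (R=6, NR=34) at theta = 0.15
small = lod_known_phase(3, 17, 0.15)
large = lod_known_phase(6, 34, 0.15)

print(f"20 offspring: LOD = {small:.2f}, above threshold: {small > THRESHOLD}")
print(f"40 offspring: LOD = {large:.2f}, above threshold: {large > THRESHOLD}")
```

Working in log space (summing logarithms instead of multiplying many small probabilities) keeps the calculation numerically stable for larger samples.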
Line 216: | Line 223: | ||
==== LOD score formula for unknown phase ==== | ==== LOD score formula for unknown phase ==== | ||
- | Now let's return to a human example: | + | In mouse experiments, |
- | < | + | < |
{{ : | {{ : | ||
< | < | ||
placeholder | placeholder | ||
</ | </ | ||
+ | </ | ||
- | In this example, a family with 5 children is affected by an autosomal dominant mutation $D$. We are testing linkage with marker $m$, of which there are two alleles, $M$ and $m$. $m$ is known to be on the same chromosome as $D$ but it is unknown if they are linked. The genotype of the mother (individual 2) is known to be $\frac{a \; d}{a \; d}$. The genotype of the father (individual 1) is known but the phase is not known; his genotype is either $\frac{A \; D}{a \; d}$ or $\frac{A \; d}{a \; D}$. Based on this information, | + | In this example, a family with 5 children is affected by an autosomal dominant mutation $D$. We are testing linkage with marker $m$, of which there are two alleles, $M$ and $m$. $m$ is known to be on the same chromosome as $D$ but it is unknown if they are linked. The genotype of the mother (individual 2) is known to be $\frac{a \; d}{a \; d}$. The genotype of the father (individual 1) is known but the phase is not known and therefore there are two possible phases; his genotype is either $\frac{A \; D}{a \; d}$ (phase 1) or $\frac{A \; d}{a \; D}$ (phase 2). Based on this information, |
<table Tab6> | <table Tab6> | ||
Line 230: | Line 238: | ||
^ individual | ^ individual | ||
- | ^ inferred genotype | + | ^ inferred |
- | ^ paternal chromosome | + | ^ paternal |
+ | ^ if phase 1 | NR | R | NR | NR | NR | | ||
+ | ^ if phase 2 | R | NR | R | R | R | | ||
+ | ^ maternal %%chromosome%% | ||
</ | </ | ||
< | < | ||
Line 237: | Line 248: | ||
</ | </ | ||
</ | </ | ||
+ | |||
+ | We first define some terms: | ||
+ | |||
+ | * T = total number of informative chromosomes; | ||
+ | * R1 = number of recombinant chromosomes in phase 1 | ||
+ | * NR1 = number of non-recombinant chromosomes in phase 1 | ||
+ | * R2 = number of recombinant chromosomes in phase 2 | ||
+ | * NR2 = number of non-recombinant chromosomes in phase 2 | ||
+ | * θ = recombination fraction (i.e., map distance) between $m$ and $D$ | ||
+ | |||
+ | As before, we define the LOD score as: | ||
+ | |||
+ | $$ \begin{aligned}\text{LOD score} &= \log{\frac{\text{probability of observed pedigree data given } 0<\theta<\frac{1}{2}}{\text{probability of observed pedigree data given } \theta=\frac{1}{2}}} \\ | |
+ | &= \log{\frac{\text{probability}(\theta)}{\text{probability}(\frac{1}{2})}}\end{aligned} | |
+ | $$ | |
+ | |||
+ | Since we don't know what the phase is, we must calculate $\text{probability}(\theta)$ for both phases: | |
+ | |
+ | $$\text{probability}(\theta)_1=\theta^{R1} \cdot (1-\theta)^{NR1} \\ | |
+ | \text{probability}(\theta)_2=\theta^{R2} \cdot (1-\theta)^{NR2}$$ | |
+ | |||
+ | And since phase 1 and phase 2 are equally likely, we take the average of both for the LOD score: | ||
+ | |||
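+ | $$ \begin{aligned} \text{LOD} &= \log{\frac{\frac{1}{2}\,\text{probability}(\theta)_1 + \frac{1}{2}\,\text{probability}(\theta)_2}{\text{probability}(\frac{1}{2})}} \\ | |
+ | &= \log{\frac{\frac{1}{2}\,\theta^{R1} \cdot (1-\theta)^{NR1} + \frac{1}{2}\,\theta^{R2} \cdot (1-\theta)^{NR2}}{(\frac{1}{2})^{T}}} \end{aligned} $$ | |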
+ | |||
+ | In this example, there are a total of 5 informative chromosomes (R1=1, NR1=4; R2=4, NR2=1). Averaging over the two phases, the LOD score reaches its maximum of about 0.125 near θ=0.21 (at θ=0.25, LOD=0.118). This does not cross the threshold of LOD=3.3 and therefore is not evidence of linkage. When the phase is unknown, LOD scores will be substantially lower than if the phase were known. If we knew the phase in our example to be phase 1, we could calculate that at θ=0.25, LOD=0.403. That is still not significant, but it is higher. | |
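The unknown-phase calculation for this pedigree can be sketched in the same way. The snippet below is an illustration under stated assumptions, not a definitive implementation: it uses base-10 logarithms, gives each phase a weight of 1/2, and takes the phase 2 counts to be the phase 1 counts swapped (R2 = NR1, NR2 = R1), as in the table. The function names and the simple grid search over θ are hypothetical conveniences introduced here.

```python
import math


def phase_likelihood(r: int, nr: int, theta: float) -> float:
    """Likelihood of r recombinant and nr non-recombinant chromosomes."""
    return theta ** r * (1.0 - theta) ** nr


def lod_phase_known(r: int, nr: int, theta: float) -> float:
    """LOD when the phase is known (a single likelihood term)."""
    return math.log10(phase_likelihood(r, nr, theta) / 0.5 ** (r + nr))


def lod_phase_unknown(r1: int, nr1: int, theta: float) -> float:
    """LOD when the two phases are equally likely.

    Assumption: switching phase swaps the labels, so R2 = NR1 and NR2 = R1.
    """
    r2, nr2 = nr1, r1
    averaged = 0.5 * phase_likelihood(r1, nr1, theta) \
        + 0.5 * phase_likelihood(r2, nr2, theta)
    return math.log10(averaged / 0.5 ** (r1 + nr1))


def best_theta(lod_fn, steps: int = 499):
    """Grid search over theta = 0.001, 0.002, ..., 0.499 for the highest LOD."""
    grid = [0.5 * i / (steps + 1) for i in range(1, steps + 1)]
    theta = max(grid, key=lod_fn)
    return theta, lod_fn(theta)


# Pedigree from the text: R1=1, NR1=4 (and therefore R2=4, NR2=1)
known = lod_phase_known(1, 4, 0.25)
unknown = lod_phase_unknown(1, 4, 0.25)
peak_theta, peak_lod = best_theta(lambda t: lod_phase_unknown(1, 4, t))

print(f"phase 1 known, theta=0.25: LOD = {known:.3f}")
print(f"phase unknown, theta=0.25: LOD = {unknown:.3f}")
print(f"phase unknown, best theta = {peak_theta:.3f}, LOD = {peak_lod:.3f}")
```

Because the average of two likelihoods can never exceed the larger of the two, the unknown-phase LOD is never higher than the LOD under the better-supported phase.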
+ | |||
+ | One last important point on LOD scores is that they are additive. This is because probabilities (odds) are multiplicative, | ||
+ | |||
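The additivity of LOD scores can be checked numerically. The sketch below assumes known phase, base-10 logarithms, and a single shared value of θ for both families. The two families and their counts are hypothetical, invented only for illustration, and `lod_known_phase` is a helper name introduced here.

```python
import math


def lod_known_phase(r: int, nr: int, theta: float) -> float:
    """Known-phase LOD: log10 of theta^r * (1 - theta)^nr over 0.5^(r + nr)."""
    return (r * math.log10(theta)
            + nr * math.log10(1.0 - theta)
            - (r + nr) * math.log10(0.5))


theta = 0.15

# Two hypothetical, independent families: (recombinants, non-recombinants)
family_a = (1, 4)
family_b = (3, 17)

lod_a = lod_known_phase(*family_a, theta)
lod_b = lod_known_phase(*family_b, theta)

# Pooling independent families multiplies their likelihoods, which adds the
# exponents of theta and (1 - theta), so the logarithms simply add.
lod_pooled = lod_known_phase(family_a[0] + family_b[0],
                             family_a[1] + family_b[1],
                             theta)

print(f"LOD(A) + LOD(B) = {lod_a + lod_b:.4f}")
print(f"LOD(pooled)     = {lod_pooled:.4f}")
```

The two printed values agree, so evidence from separate families can be combined by summing their LOD scores at the same θ.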
chapter_22.1732584984.txt.gz · Last modified: 2024/11/25 17:36 by mike