6_3_plotFig_MutRatesComp module
Plots Figure 4 from the paper, i.e., a comparison of the newly estimated indel rates for the lineages between human and 40 other vertebrates. As a reference, it also includes direct estimates and indirect estimates from previous studies (listed below).
Direct estimates
Direct estimates come from methods that count mutations between generations in present-day individuals. The following studies were used as reference:
Authors | Year | Indel sizes | Indel rate estimation (original) | Indel rate estimation (CI) | Generation Time | Generation Time Interval (CI) | Indel rate PPPY (Per Position Per Year) | Indel rate PPPY (Per Position Per Year) (CI) |
---|---|---|---|---|---|---|---|---|
Kloosterman et al. | 2015 | 1-20 | 0.68*(10**(-9)) | | 29.27 | (24.385, 34.155) | 2.3231886903593173e-11 | (1.990918179344908e-11, 2.788584149604477e-11) |
Besenbacher et al. | 2016 | 1-35 | 0.929*(10**(-9)) | | 30.26 | | 3.07*(10**(-11)) | (2.91*10**(-11), 3.25*(10**(-11))) |
Maretty et al. | 2017 | 1-10 | 1.3*(10**(-9)) | | 27.7 | | 4.70e-11 | |
Besenbacher et al. | 2015 | 1-50 | 1.5e-9 | (1.2e-9, 1.9e-9) | 28.4 | | 5.28169014084507e-11 | (4.225352112676056e-11, 6.690140845070423e-11) |
Kondrashov (del.) | 2002 | 1- | 0.526*(10**(-9)) | (0.216e-9, 0.836e-9) | 20 | | 2.63e-11 | (1.58e-11, 4.18e-11) |
Kondrashov (ins.) | 2002 | 1- | 0.182*(10**(-9)) | (0.072e-9, 0.292e-9) | 20 | | 0.91e-11 | (0.36e-11, 1.46e-11) |
Palamara et al. | 2015 | 1-20 | 1.26*(10**(-9)) | (1.2e-09, 1.32e-09) | 29 | | 4.3448275862068967e-11 | (4.137931034482759e-11, 4.5517241379310344e-11) |
Note
Note that the indel sizes vary between studies (see the values in the “Indel sizes” column). The lower the maximum indel size, the higher the mutation rate estimate.
Indirect estimates
Indirect estimates are computed as the evolutionary distance separating two species divided by twice their divergence time. The following studies were used as reference:
Authors | Year | Species | Indel sizes | Indel rate estimation (original) | Indel rate estimation (CI) | Generation Time | Generation Time Interval (CI) | Indel rate PPPY (Per Position Per Year) | Indel rate PPPY (Per Position Per Year) (CI) |
---|---|---|---|---|---|---|---|---|---|
Nachman and Crowell | 2000 | Chimp | 1-4 | 2.3*(10**(-9)) | | 20 | | 4.95049504950495e-11 | (3.712871287128713e-11, 6.188118811881188e-11) |
Lunter | 2007 | Mouse | 1- | 0.053 | | 2*87*(10**6) | (2*81.3*(10**6), 2*91*(10**6)) | 30.46e-11 | (29.12087912087912e-11, 32.595325953259533e-11) |
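The definition of an indirect estimate (evolutionary distance divided by twice the divergence time) can be checked against the Lunter (2007) row, which lists a distance of 0.053 and twice the divergence time, 2*87*(10**6). The function name below is hypothetical, and the divergence time is assumed to be in years because the result is a per-year rate. This is a sketch and not the script's own code.

```python
def indirect_rate_pppy(distance, divergence_time_years):
    """Indel rate per position per year from a pairwise distance.

    Hypothetical helper: the distance accumulates along both lineages since
    the split, hence the division by twice the divergence time.
    """
    return distance / (2 * divergence_time_years)


# Lunter (2007), human vs. mouse, values taken from the table.
distance = 0.053
point = indirect_rate_pppy(distance, 87e6)
# A longer divergence time yields a lower rate, so the bounds swap.
lower = indirect_rate_pppy(distance, 91e6)
upper = indirect_rate_pppy(distance, 81.3e6)

print(f"{point:.4e} ({lower:.4e}, {upper:.4e})")
```

The three values agree with the PPPY entry and its interval in the Lunter row.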
Plots
In the generated plot, the new estimates are shown in orange, with the rectangle borders representing the standard deviation and the middle point showing the mean indel rate. Indirect and direct estimates from previous studies are shown in green and blue, respectively. All indel rates were converted to “per position per year” (PPPY) to make them comparable. The orange dashed line indicates the average indel rate across all species. If evolution were uniform across all lineages, the values would be concentrated around this line.
Use:
python3 6_3_plotFig_MutRatesComp.py
Example of Usage:
python3 ~/code/6_3_plotFig_MutRatesComp.py
Input Parameter:
To ensure the graphs match those used in the paper, the parameters are hard-coded in the script and cannot be modified via command line.
Pre-requisites
Before using this script, make sure all the required files have been pre-computed:
a) Files with sampled evolutionary times
Make sure to run 5_sampleEvolTimes.py for α=1.1.
b) Logs from evolutionary time estimates
Make sure to keep the logs from 4_estimateEvolTimes.py for α=1.1. They contain the information about windows without estimates.
Time, Memory & Disk space
Running the script on a single core takes under 3 minutes (143.07 seconds) and requires little memory. The output file requires 2.4 MB of disk space.
Output files:
The file “mutRates-comp.highEvolTimeQuant0.99.svg” contains the plot for Figure 4.
Function details
Only the relevant functions are documented below. For more details on any function, check the comments in the source code.
- class 6_3_plotFig_MutRatesComp.MutRateStudy(author, publYear, methodType, ucscName, mutRatePPPY, mutRatePPPY_lb, mutRatePPPY_ub)
Bases: tuple
- author
Alias for field number 0
- methodType
Alias for field number 2
- mutRatePPPY
Alias for field number 4
- mutRatePPPY_lb
Alias for field number 5
- mutRatePPPY_ub
Alias for field number 6
- publYear
Alias for field number 1
- ucscName
Alias for field number 3
- 6_3_plotFig_MutRatesComp.computeDistribQuantile(taudistrib_est_onespecies, quantile)
- 6_3_plotFig_MutRatesComp.computeEvolTimeEmpty(my_dataset, UCSCname, alpha, empty_evoltime_quantile)
- 6_3_plotFig_MutRatesComp.createLines(ax, my_dataset, rows, corr=0)
Creates horizontal lines separating direct and indirect estimates.
- 6_3_plotFig_MutRatesComp.getDirectEstimates()
It returns a list where each entry corresponds to a previous study. All studies returned by this method are Direct Estimates, i.e., they count mutations that occur between generations in present-day individuals. WARNING: The indel sizes vary between studies (see the entry “Indel sizes” in each tuple). The lower the maximum indel size, the higher the mutation rate estimate.
- 6_3_plotFig_MutRatesComp.getEmptyWindows(alpha, my_dataset)
- 6_3_plotFig_MutRatesComp.getExtrapolatedEstimates()
It returns a list where each entry corresponds to a previous study. All studies returned by this method are Extrapolated Estimates, i.e., they estimate the indel rate based on the substitution rate. The generation time in these studies is unclear, so they were left out of the analysis.
- 6_3_plotFig_MutRatesComp.getIndirectEstimates()
It returns a list where each entry corresponds to a previous study. All studies returned by this method are Indirect Estimates, i.e., they compute their estimate based on the evolutionary distance and the divergence time separating two species.
- 6_3_plotFig_MutRatesComp.getNewEstimates(my_dataset, alpha, empty_evoltime_quantile=-1)
- 6_3_plotFig_MutRatesComp.loadOurData(alpha, my_dataset, empty_windows=None, empty_evoltime_quantile=-1)
- 6_3_plotFig_MutRatesComp.makeFigure4(alpha, my_dataset)
- 6_3_plotFig_MutRatesComp.meanEvolTimes(taudistrib_est_onespecies, empty_win_info=(-1, -1), empty_win_tau=-1, bootstrap=False)
- 6_3_plotFig_MutRatesComp.plotErrorRect(ax, study, ycoord, color)
- 6_3_plotFig_MutRatesComp.plotIcon(ax, icon_width, iconFilename, xval, yval, rect)
- 6_3_plotFig_MutRatesComp.plotMutationRateComparison(my_dataset, new_ests, direct_ests, indirect_ests, empty_evoltime_quantile)
- 6_3_plotFig_MutRatesComp.plotMutationRatePerType(ax, mutRateEstsAll, mutRateEsts, markerStyle, icon_width)
- 6_3_plotFig_MutRatesComp.yLabels(ax, my_dataset, rows, corr=0)
Creates the y-labels. Each y-label consists of the name of the species that was compared with human (or “Human” if it is a direct comparison) and the study (author + publication year) the estimate comes from.
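The `MutRateStudy` class documented above (a `tuple` subclass with one alias per field number) has the shape that `collections.namedtuple` produces. A sketch of an equivalent definition follows, filled with the Palamara et al. (2015) row from the direct estimates table. The `methodType` and `ucscName` values are placeholders chosen for illustration, since this page does not state which strings the script uses.

```python
from collections import namedtuple

# Same field order as in the documented signature.
MutRateStudy = namedtuple(
    "MutRateStudy",
    ["author", "publYear", "methodType", "ucscName",
     "mutRatePPPY", "mutRatePPPY_lb", "mutRatePPPY_ub"],
)

# Placeholder values for methodType and ucscName (assumptions, not from the script).
palamara = MutRateStudy(
    author="Palamara et al.",
    publYear=2015,
    methodType="direct",
    ucscName="hg38",
    mutRatePPPY=4.3448275862068967e-11,
    mutRatePPPY_lb=4.137931034482759e-11,
    mutRatePPPY_ub=4.5517241379310344e-11,
)

# Fields are reachable by name or by position.
print(palamara.author, palamara[4])
```

Because the module name starts with a digit, a plain `import` statement cannot load it. `importlib.import_module("6_3_plotFig_MutRatesComp")` is one way to reach the class from other code.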