Activity

  • Kevin Krabbe posted an update 6 years, 7 months ago

    E two classes. For S/HIC, we used the posterior classification probability from the Extra-Trees classifier, obtained using scikit-learn’s predict_proba method. For SFselect+, we used the value of the SVM decision function. For SweepFinder, we used the composite likelihood ratio. For Garud et al.’s method, we used the fraction of accepted simulations (i.e. within a Euclidean distance of 0.1 from the test instance) that were of the first class: for example, for hard vs. soft, this is the number of accepted simulations that were hard sweeps divided by the number of accepted simulations that were either hard sweeps or soft sweeps. For Tajima’s D [36] and Kim and Nielsen’s ω [10], we simply used the values of these statistics.

    Simulating sweeps under non-equilibrium demographic models

    To examine the power and sensitivity of S/HIC under non-equilibrium demographic histories, we simulated training and test datasets from several scenarios that may be relevant to researchers. First, we examined the power of our method under two complex population size histories that are relevant to humans. Second, we examined the case of simple population bottlenecks, as may be common in populations that have recently colonized new locales, using two levels of bottleneck severity. We simulated training and test datasets from Tennessen et al.’s [44] European demographic model (S1 Table). This model parameterizes a population contraction associated with migration out of Africa, a second contraction followed by exponential population growth, and a more recent phase of even faster exponential growth.
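
    The score described earlier for Garud et al.’s method (the share of accepted simulations that belong to the first of two classes) can be illustrated with a minimal sketch. The function name, the (summary vector, label) data layout, the toy values, and the choice to return None when nothing is accepted are all assumptions made for illustration; only the 0.1 Euclidean distance threshold and the two-class ratio come from the text.

```python
import math


def first_class_fraction(test_point, simulations, first_label, second_label,
                         max_dist=0.1):
    """Share of accepted simulations of the two classes that are `first_label`.

    `simulations` is an iterable of (summary_vector, label) pairs (hypothetical
    layout). A simulation is accepted when its Euclidean distance to
    `test_point` is at most `max_dist`. Returns None if no simulation of
    either class is accepted.
    """
    n_first = 0
    n_second = 0
    for vector, label in simulations:
        if math.dist(vector, test_point) > max_dist:
            continue  # rejected: too far from the test instance
        if label == first_label:
            n_first += 1
        elif label == second_label:
            n_second += 1
        # accepted simulations of any other class are ignored
    total = n_first + n_second
    return n_first / total if total else None


# Toy data, invented purely for illustration.
sims = [
    ((0.50, 0.50), "hard"),
    ((0.52, 0.51), "hard"),
    ((0.55, 0.45), "soft"),
    ((0.90, 0.10), "soft"),     # far from the test point, so rejected
    ((0.51, 0.49), "neutral"),  # accepted, but not one of the two classes
]
score = first_class_fraction((0.5, 0.5), sims, "hard", "soft")
```

    Scores of this kind can be compared against a varying threshold in the same way as a classifier probability or a decision-function value.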
    Values of θ and ρ = 4Nr were drawn from prior distributions (S1 Table), allowing for variation within the training data, whose means were chosen from recent estimates of human mutation [45] and recombination rates [46], respectively. For simulations with selection, we drew values of the selection strength from U(5.003, 5.005), and drew the fixation time of the sweeping allele from U(0, 51,000) years ago (i.e. the sweep completed after the migration out of Africa). We also generated simulations of Tennessen et al.’s African demographic model, which consists of exponential population growth beginning 5,100 years ago (S1 Table). We generated two sets of these simulations: one where the selection strength was drawn from U(5.004, 5.005), and one with it drawn from U(5.004, 5.005). The sample size of these simulated data sets was set to 100 chromosomes. These two sets were then combined into a single training set. For these simulations, the sweep was constrained to complete some time during the exponential growth phase (no later than 5,100 years ago).

    Finally, we examined two models with a population size bottleneck. The first was taken from Thornton and Andolfatto [47], and models the demographic history of a European population sample of D. melanogaster (S1 Table). This model consists of a population size reduction 0.044N generations ago to 2.9% of the ancestral population size, and then 0.0084N generations ago the population recovers to its original size. The second bottleneck model we used was identical except the population contraction was less severe (reduction to 29% of the ancestral population size).
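
    The two bottleneck models can be written as a piecewise-constant size history. The following is a minimal sketch, not the simulation code used in the study: the function name, its parameters, and the handling of the exact boundary times are assumptions, and the reduction figures are read as percentages of the ancestral size (2.9% for the severe model, 29% for the milder one). Time is measured backward from the present in units of N generations, as in the text.

```python
def relative_size(t, severity=0.029, t_recover=0.0084, t_contract=0.044):
    """Population size at time `t`, relative to the ancestral size.

    `t` is time before the present in units of N generations. Going backward
    in time, the population is at its original size until `t_recover`, at
    `severity` times that size until `t_contract`, and at the ancestral size
    before that. Boundary handling (half-open interval) is an assumption.
    """
    if t < 0:
        raise ValueError("t must be a non-negative time before the present")
    if t_recover <= t < t_contract:
        return severity  # inside the bottleneck
    return 1.0           # after recovery, or before the contraction


# Severe bottleneck (first model) versus milder bottleneck (second model).
severe = [relative_size(t) for t in (0.0, 0.02, 0.05)]
mild = [relative_size(t, severity=0.29) for t in (0.0, 0.02, 0.05)]
```

    Only the `severity` argument differs between the two models; the timing of the contraction and recovery is shared.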