(v0.7.1.9073) as.mo() self-learning algorithm

2025-07-09 03:22:00 +02:00 · 2019-09-15 22:57:30 +02:00
parent cd178ee569
commit 398c5bdc4f
31 changed files with 1030 additions and 2360 deletions
--- a/vignettes/benchmarks.Rmd
+++ b/vignettes/benchmarks.Rmd
@@ -20,7 +20,7 @@ knitr::opts_chunk$set(
  comment = "#",
  fig.width = 7.5,
  fig.height = 4.5,
-  dpi = 150
+  dpi = 75
 )
 ```

@@ -110,26 +110,40 @@ That takes `r round(mean(T.islandicus$time, na.rm = TRUE) / mean(S.aureus$time,
 In the figure below, we compare *Escherichia coli* (which is very common) with *Prevotella brevis* (which is moderately common) and with *Thermus islandicus* (which is uncommon):

 ```{r, echo = FALSE}
-ggplot.bm(
-  microbenchmark(as.mo("Escherichia coli"),
-                 as.mo("E. coli"),
-                 times = 10), title = "Very common")
+# ggplot.bm(
+#   microbenchmark(as.mo("Escherichia coli"),
+#                  as.mo("E. coli"),
+#                  times = 10), title = "Very common")
+# 
+# ggplot.bm(
+#   microbenchmark(as.mo("Prevotella brevis"),
+#                  as.mo("P. brevis"),
+#                  times = 10), title = "Moderately common")
+# 
+# ggplot.bm(
+#   microbenchmark(as.mo("Thermus islandicus"),
+#                  as.mo("T. islandicus"),
+#                  times = 10), title = "Uncommon")

-ggplot.bm(
-  microbenchmark(as.mo("Prevotella brevis"),
-                 as.mo("P. brevis"),
-                 times = 10), title = "Moderately common")
-
-ggplot.bm(
-  microbenchmark(as.mo("Thermus islandicus"),
-                 as.mo("T. islandicus"),
-                 times = 10), title = "Uncommon")
+par(mar = c(5, 16, 4, 2))
+boxplot(microbenchmark(
+  'as.mo("Thermus islandicus")' = as.mo("Thermus islandicus"),
+  'as.mo("Prevotella brevis")' = as.mo("Prevotella brevis"),
+  'as.mo("Escherichia coli")' = as.mo("Escherichia coli"),
+  'as.mo("T. islandicus")' = as.mo("T. islandicus"),
+  'as.mo("P. brevis")' = as.mo("P. brevis"),
+  'as.mo("E. coli")' = as.mo("E. coli"),
+  times = 10),
+        horizontal = TRUE, las = 1, unit = "s", log = FALSE,
+        xlab = "", ylab = "Time in seconds", ylim = c(0, 0.5),
+        main = "Benchmarks per prevalence")
 ```

-```{r, echo = FALSE, eval = FALSE}
-# In reality, the `as.mo()` functions **learns from its own output to speed up determinations for next times**. In above figure, this effect was disabled to show the difference with the boxplot below - when you would use `as.mo()` yourself:
+In reality, the `as.mo()` function **learns from its own output to speed up subsequent determinations**. In the figure above, this effect was disabled to show the difference with the boxplot below, which reflects what you would see when using `as.mo()` yourself:

-clean_mo_history()
+```{r, echo = FALSE}
+
+clear_mo_history()
 par(mar = c(5, 16, 4, 2))
 boxplot(microbenchmark(
  'as.mo("Thermus islandicus")' = as.mo("Thermus islandicus", force_mo_history = TRUE),
@@ -142,10 +156,10 @@ boxplot(microbenchmark(
        horizontal = TRUE, las = 1, unit = "s", log = FALSE,
        xlab = "", ylab = "Time in seconds", ylim = c(0, 0.5),
        main = "Benchmarks per prevalence")
-
-# The highest outliers are the first times. All next determinations were done in only thousands of seconds. For now, learning only works per session. If R is closed or terminated, the algorithms reset. This will probably be resolved in a next version.
 ```

+The highest outliers are the first runs. All subsequent determinations took only thousandths of a second.
+
 Uncommon microorganisms take a lot more time than common microorganisms. To relieve this pitfall and further improve performance, two important calculations take almost no time at all: **repetitive results** and **already precalculated results**.

 ### Repetitive results