diff --git a/DESCRIPTION b/DESCRIPTION
index 5b271565..5a42cd68 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: AMR
-Version: 0.9.0.9006
-Date: 2019-12-22
+Version: 0.9.0.9007
+Date: 2019-12-27
 Title: Antimicrobial Resistance Analysis
 Authors@R: c(
     person(role = c("aut", "cre"),
diff --git a/NEWS.md b/NEWS.md
index e6a6bd6d..0807204b 100755
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,5 @@
-# AMR 0.9.0.9006
-## Last updated: 22-Dec-2019
+# AMR 0.9.0.9007
+## Last updated: 27-Dec-2019
 ### Changes
 * Speed improvement for `as.mo()` (and consequently all `mo_*` functions that use `as.mo()` internally), especially for the *G. species* format (G for genus), like *E. coli* and *K. pneumoniae*
diff --git a/docs/404.html b/docs/404.html
index 56acff37..08870334 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -84,7 +84,7 @@
diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
index 5ca7bd6b..e9dd676c 100644
--- a/docs/LICENSE-text.html
+++ b/docs/LICENSE-text.html
@@ -84,7 +84,7 @@
diff --git a/docs/articles/benchmarks.html b/docs/articles/benchmarks.html
index bf4f0537..dc813503 100644
--- a/docs/articles/benchmarks.html
+++ b/docs/articles/benchmarks.html
@@ -41,7 +41,7 @@
@@ -187,7 +187,7 @@ benchmarks.Rmd
In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 5 milliseconds means it can determine 200 input values per second. In the case of 100 milliseconds, this is only 10 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first one is a WHONET code) or common laboratory codes, or common full organism names like the last one. Full organism names are always preferred.
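The conversion from time per value to throughput used above is simply 1000 divided by the time in milliseconds. A minimal sketch in base R; the helper name `values_per_second` is made up for illustration and is not part of the AMR package:

```r
# Hypothetical helper (not part of AMR): convert a per-value timing in
# milliseconds into the number of input values handled per second.
values_per_second <- function(ms_per_value) {
  stopifnot(is.numeric(ms_per_value), all(ms_per_value > 0))
  1000 / ms_per_value
}

# 5 ms per value gives 200 values per second; 100 ms gives 10.
values_per_second(c(5, 100))
```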
To achieve this speed, the as.mo() function also takes into account the prevalence of human pathogenic microorganisms. The downside is of course that less prevalent microorganisms will be determined more slowly. See this example for the ID of Methanosarcina semesiae (B_MTHNSR_SEMS), a bug probably never found before in humans:
That takes 6.2 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like Methanosarcina semesiae) are always very fast and only take some thousandths of a second to coerce - they are the most probable input from most data sets.
+# 1825.00 10
+# 1690.00 10
+# 39.06 10
+# 35.94 10
+# 8.74 10
+That takes 6.3 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like Methanosarcina semesiae) are always very fast and only take some thousandths of a second to coerce - they are the most probable input from most data sets.
In the figure below, we compare Escherichia coli (which is very common) with Prevotella brevis (which is moderately common) and with Methanosarcina semesiae (which is uncommon):
The highest outliers are the first runs. All subsequent determinations were done in only thousandths of a second, because the as.mo() function learns from its own output to speed up later determinations.
So transforming 500,000 values (!!) of 50 unique values only takes 0.59 seconds (593 ms). You only lose time on your unique input values.
+# mo_name(x) 550 600 656 627 702 954 100
+So transforming 500,000 values (!!) of 50 unique values only takes 0.63 seconds (627 ms). You only lose time on your unique input values.
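The idea that time is only lost on unique input values can be sketched in base R: run the expensive step once per unique value and map the results back onto the full vector. This is an illustration of the general technique, not the actual internals of `as.mo()` or `mo_name()`; `slow_lookup` and `lookup_unique` are hypothetical names.

```r
# Hypothetical stand-in for an expensive per-value lookup (not the AMR code).
slow_lookup <- function(x) {
  toupper(x)
}

# Run the lookup once per unique value, then map the results back onto the
# full input vector with match().
lookup_unique <- function(x, fun) {
  ux <- unique(x)
  res <- vapply(ux, fun, character(1), USE.NAMES = FALSE)
  res[match(x, ux)]
}

x <- rep(c("e. coli", "s. aureus", "k. pneumoniae"), times = 4)
out <- lookup_unique(x, slow_lookup)

length(out)        # 12, the same length as the input
length(unique(x))  # 3, the number of times slow_lookup() was called
```

With 500,000 values of which 50 are unique, the expensive function would be called 50 times under this scheme; the remaining cost is the `match()` call.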
So going from mo_name("Staphylococcus aureus") to "Staphylococcus aureus" takes 0.0008 seconds - it doesn’t even start calculating if the result would be the same as the expected resulting value. That goes for all helper functions:
So going from mo_name("Staphylococcus aureus") to "Staphylococcus aureus" takes 0.0009 seconds - it doesn’t even start calculating if the result would be the same as the expected resulting value. That goes for all helper functions:
run_it <- microbenchmark(A = mo_species("aureus"),
B = mo_genus("Staphylococcus"),
C = mo_name("Staphylococcus aureus"),
@@ -326,14 +326,14 @@
print(run_it, unit = "ms", signif = 3)
# Unit: milliseconds
# expr min lq mean median uq max neval
-# A 0.445 0.453 0.463 0.464 0.467 0.492 10
-# B 0.484 0.496 0.522 0.502 0.505 0.724 10
-# C 0.667 0.746 0.755 0.758 0.786 0.800 10
-# D 0.488 0.491 0.507 0.505 0.509 0.558 10
-# E 0.454 0.455 0.462 0.461 0.465 0.490 10
-# F 0.432 0.447 0.456 0.458 0.459 0.490 10
-# G 0.438 0.446 0.456 0.454 0.460 0.486 10
-# H 0.439 0.442 0.454 0.450 0.459 0.501 10
Of course, when running mo_phylum("Firmicutes") the function has zero knowledge about the actual microorganism, namely S. aureus. But since the result would be "Firmicutes" too, there is no point in calculating the result. And because this package ‘knows’ all phyla of all known bacteria (according to the Catalogue of Life), it can just return the initial value immediately.
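A rough sketch of this early-return behaviour in base R, under stated assumptions: `known_phyla` is a small illustrative subset rather than the package's reference data, and `phylum_fast` and `placeholder_lookup` are hypothetical names, not AMR functions.

```r
# Hypothetical reference table of valid phylum names (illustrative subset,
# not taken from the AMR package data).
known_phyla <- c("Firmicutes", "Proteobacteria", "Actinobacteria")

# Return inputs that are already valid results unchanged, and only fall
# back to the (expensive) lookup for everything else.
phylum_fast <- function(x, full_lookup) {
  out <- x
  todo <- !(x %in% known_phyla)
  if (any(todo)) {
    out[todo] <- full_lookup(x[todo])
  }
  out
}

# Placeholder for the real lookup: returns NA for every value it receives.
placeholder_lookup <- function(x) rep(NA_character_, length(x))

# "Firmicutes" is returned as-is; only "S. aureus" reaches the lookup.
phylum_fast(c("Firmicutes", "S. aureus"), placeholder_lookup)
```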
Currently supported are German, Dutch, Spanish, Italian, French and Portuguese.
diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png
index 8d3785ba..88d14007 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png differ
diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png
index 25e39db4..5a2e920b 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png differ
diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png
index f88a9ad1..af7f9592 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png differ
diff --git a/docs/articles/index.html b/docs/articles/index.html
index f87f8691..00bc4a00 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -84,7 +84,7 @@
diff --git a/docs/authors.html b/docs/authors.html
index ddd67235..04787bb9 100644
--- a/docs/authors.html
+++ b/docs/authors.html
@@ -84,7 +84,7 @@
diff --git a/docs/extra.js b/docs/extra.js
index 4d4c2704..fa73cc9f 100644
--- a/docs/extra.js
+++ b/docs/extra.js
@@ -94,6 +94,7 @@ $( document ).ready(function() {
     return(x);
   }
   $(".template-authors").html(doct_tit($(".template-authors").html()));
+  $(".template-citation-authors").html(doct_tit($(".template-citation-authors").html()));
   $(".developers").html(doct_tit($(".developers").html()));
   // $("footer").html(doct_tit($("footer").html()));
diff --git a/docs/index.html b/docs/index.html
index 42241438..fc217a54 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -45,7 +45,7 @@
diff --git a/docs/news/index.html b/docs/news/index.html
index 6c94c005..684c5145 100644
--- a/docs/news/index.html
+++ b/docs/news/index.html
@@ -84,7 +84,7 @@
@@ -231,13 +231,13 @@
-as.mo(..., allow_uncertain = 3)
Contents