diff --git a/DESCRIPTION b/DESCRIPTION
index 5b271565..5a42cd68 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: AMR
-Version: 0.9.0.9006
-Date: 2019-12-22
+Version: 0.9.0.9007
+Date: 2019-12-27
 Title: Antimicrobial Resistance Analysis
 Authors@R: c(
 person(role = c("aut", "cre"),
diff --git a/NEWS.md b/NEWS.md
index e6a6bd6d..0807204b 100755
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,5 +1,5 @@
-# AMR 0.9.0.9006
-## Last updated: 22-Dec-2019
+# AMR 0.9.0.9007
+## Last updated: 27-Dec-2019
 ### Changes
 * Speed improvement for `as.mo()` (and consequently all `mo_*` functions that use `as.mo()` internally), especially for the *G. species* format (G for genus), like *E. coli* and *K pneumoniae*
diff --git a/docs/404.html b/docs/404.html
index 56acff37..08870334 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -84,7 +84,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
index 5ca7bd6b..e9dd676c 100644
--- a/docs/LICENSE-text.html
+++ b/docs/LICENSE-text.html
@@ -84,7 +84,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
diff --git a/docs/articles/benchmarks.html b/docs/articles/benchmarks.html
index bf4f0537..dc813503 100644
--- a/docs/articles/benchmarks.html
+++ b/docs/articles/benchmarks.html
@@ -41,7 +41,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
@@ -187,7 +187,7 @@

 Benchmarks
 Matthijs S. Berends
-22 December 2019
+27 December 2019

@@ -221,21 +221,21 @@
 times = 10)
 print(S.aureus, unit = "ms", signif = 2)
 # Unit: milliseconds
-# expr min lq mean median uq max neval
-# as.mo("sau") 9.1 9.3 15 9.8 13.0 34 10
-# as.mo("stau") 33.0 34.0 40 35.0 39.0 60 10
-# as.mo("STAU") 33.0 34.0 51 35.0 57.0 150 10
-# as.mo("staaur") 8.8 9.2 16 10.0 13.0 43 10
-# as.mo("STAAUR") 9.3 9.4 23 9.7 10.0 120 10
-# as.mo("S. aureus") 10.0 10.0 16 11.0 12.0 41 10
-# as.mo("S aureus") 10.0 10.0 28 12.0 35.0 110 10
-# as.mo("Staphylococcus aureus") 4.6 4.8 10 4.9 5.2 56 10
-# as.mo("Staphylococcus aureus (MRSA)") 660.0 670.0 700 680.0 710.0 770 10
-# as.mo("Sthafilokkockus aaureuz") 330.0 350.0 370 370.0 390.0 430 10
-# as.mo("MRSA") 9.2 9.2 14 9.4 9.5 35 10
-# as.mo("VISA") 20.0 20.0 26 21.0 23.0 45 10
-# as.mo("VRSA") 20.0 21.0 30 22.0 44.0 47 10
-# as.mo(22242419) 19.0 20.0 43 20.0 28.0 130 10
+# expr min lq mean median uq max neval
+# as.mo("sau") 9.2 9.3 20.0 10.0 34 39.0 10
+# as.mo("stau") 34.0 34.0 37.0 35.0 35 63.0 10
+# as.mo("STAU") 33.0 34.0 39.0 35.0 36 58.0 10
+# as.mo("staaur") 9.3 9.4 9.8 9.8 10 10.0 10
+# as.mo("STAAUR") 9.3 9.4 16.0 9.7 11 42.0 10
+# as.mo("S. aureus") 10.0 10.0 19.0 11.0 33 44.0 10
+# as.mo("S aureus") 10.0 10.0 18.0 11.0 34 36.0 10
+# as.mo("Staphylococcus aureus") 4.7 4.8 4.9 4.9 5 5.1 10
+# as.mo("Staphylococcus aureus (MRSA)") 650.0 700.0 760.0 740.0 820 940.0 10
+# as.mo("Sthafilokkockus aaureuz") 360.0 390.0 410.0 390.0 430 510.0 10
+# as.mo("MRSA") 9.1 9.4 17.0 9.7 34 35.0 10
+# as.mo("VISA") 20.0 20.0 26.0 21.0 26 45.0 10
+# as.mo("VRSA") 20.0 21.0 34.0 33.0 46 53.0 10
+# as.mo(22242419) 19.0 20.0 25.0 21.0 25 43.0 10

In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 5 milliseconds means it can determine 200 input values per second. In the case of 100 milliseconds, this is only 10 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first one is a WHONET code) or common laboratory codes, or common full organism names like the last one. Full organism names are always preferred.
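
The conversion from time per input value to throughput used above is a simple reciprocal. A minimal base R sketch of that arithmetic follows; the helper name `per_second` is hypothetical and is not part of the AMR package:

```r
# Hypothetical helper (not an AMR function): convert a timing in
# milliseconds per input value into input values processed per second.
per_second <- function(ms) {
  1000 / ms
}

per_second(5)    # 200 input values per second
per_second(100)  # 10 input values per second
```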

To achieve this speed, the as.mo() function also takes into account the prevalence of human pathogenic microorganisms. The downside is of course that less prevalent microorganisms will be determined more slowly. See this example for the ID of Methanosarcina semesiae (B_MTHNSR_SEMS), a bug probably never found before in humans:

@@ -248,18 +248,18 @@
 print(M.semesiae, unit = "ms", signif = 4)
 # Unit: milliseconds
 # expr min lq mean median uq
-# as.mo("metsem") 1469.000 1482.000 1515.000 1507.000 1545.000
-# as.mo("METSEM") 1435.000 1452.000 1490.000 1479.000 1520.000
-# as.mo("M. semesiae") 10.840 11.090 16.220 11.150 11.600
-# as.mo("M. semesiae") 10.670 10.820 20.140 11.180 38.040
-# as.mo("Methanosarcina semesiae") 5.138 5.185 7.838 5.366 5.493
+# as.mo("metsem") 1437.000 1541.000 1599.000 1598.000 1657.000
+# as.mo("METSEM") 1521.000 1539.000 1594.000 1606.000 1622.000
+# as.mo("M. semesiae") 10.840 11.110 19.690 12.620 35.530
+# as.mo("M. semesiae") 10.780 11.000 13.670 11.250 11.450
+# as.mo("Methanosarcina semesiae") 5.385 5.563 6.217 5.784 6.134
 # max neval
-# 1574.00 10
-# 1563.00 10
-# 36.46 10
-# 44.17 10
-# 30.28 10
-That takes 6.2 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like Methanosarcina semesiae) are always very fast and only take a few thousandths of a second to coerce - they are the most probable input from most data sets.
+# 1825.00 10
+# 1690.00 10
+# 39.06 10
+# 35.94 10
+# 8.74 10
+That takes 6.3 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like Methanosarcina semesiae) are always very fast and only take a few thousandths of a second to coerce - they are the most probable input from most data sets.

In the figure below, we compare Escherichia coli (which is very common) with Prevotella brevis (which is moderately common) and with Methanosarcina semesiae (which is uncommon):

The highest outliers are the first runs. All subsequent determinations took only thousandths of a second, because the as.mo() function learns from its own output to speed up later determinations.
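
The effect of reusing earlier output can be illustrated with a small cache. This is a sketch under stated assumptions, not the AMR internals: `lookup_cache`, `slow_lookup()` and `cached_lookup()` are hypothetical names, and `toupper()` with a short sleep merely stands in for an expensive determination.

```r
# Illustrative cache keyed by input string (an assumption, not AMR code).
lookup_cache <- new.env()

# Stand-in for an expensive determination of a single input value.
slow_lookup <- function(x) {
  Sys.sleep(0.01)
  toupper(x)
}

# Look up one non-empty string, computing the result only the first time.
cached_lookup <- function(x) {
  if (!exists(x, envir = lookup_cache, inherits = FALSE)) {
    assign(x, slow_lookup(x), envir = lookup_cache)
  }
  get(x, envir = lookup_cache, inherits = FALSE)
}

system.time(cached_lookup("metsem"))  # first call pays the full cost
system.time(cached_lookup("metsem"))  # later calls read from the cache
```

In this sketch only the first call for a given input reaches `slow_lookup()`, which mirrors the pattern in the figure where the first run is the outlier.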

@@ -296,8 +296,8 @@
 print(run_it, unit = "ms", signif = 3)
 # Unit: milliseconds
 # expr min lq mean median uq max neval
-# mo_name(x) 548 581 605 593 611 735 100
-So transforming 500,000 values (!!) of 50 unique values only takes 0.59 seconds (593 ms). You only lose time on your unique input values.
+# mo_name(x) 550 600 656 627 702 954 100
+So transforming 500,000 values (!!) of 50 unique values only takes 0.63 seconds (627 ms). You only lose time on your unique input values.
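
One common way to pay only for unique input values is to run the expensive step on `unique(x)` and expand the results with `match()`. The following base R sketch shows that general technique; whether AMR does exactly this internally is an assumption, and `apply_on_unique()` is a hypothetical name.

```r
# Hypothetical wrapper: apply a vectorised function `f` only to the
# unique values of `x`, then expand the results back to the full length.
apply_on_unique <- function(x, f) {
  ux <- unique(x)
  f(ux)[match(x, ux)]
}

set.seed(1)
x <- sample(c("E. coli", "S. aureus", "K. pneumoniae"), 500000, replace = TRUE)

# toupper() stands in for an expensive lookup; it runs on 3 values only.
res <- apply_on_unique(x, toupper)
```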

@@ -309,11 +309,11 @@
 times = 10)
 print(run_it, unit = "ms", signif = 3)
 # Unit: milliseconds
-# expr min lq mean median uq max neval
-# A 6.380 6.430 6.800 6.540 6.690 8.94 10
-# B 10.900 10.900 14.200 11.100 11.400 37.20 10
-# C 0.735 0.772 0.832 0.792 0.875 1.01 10
-So going from mo_name("Staphylococcus aureus") to "Staphylococcus aureus" takes 0.0008 seconds - it doesn’t even start calculating if the result would be the same as the expected resulting value. That goes for all helper functions:
+# expr min lq mean median uq max neval
+# A 6.680 7.01 7.730 7.710 8.160 9.36 10
+# B 11.700 12.20 18.000 14.800 15.200 54.30 10
+# C 0.723 0.87 0.924 0.945 0.973 1.06 10
+So going from mo_name("Staphylococcus aureus") to "Staphylococcus aureus" takes 0.0009 seconds - it doesn’t even start calculating if the result would be the same as the expected resulting value. That goes for all helper functions:

run_it <- microbenchmark(A = mo_species("aureus"),
                          B = mo_genus("Staphylococcus"),
                          C = mo_name("Staphylococcus aureus"),
@@ -326,14 +326,14 @@
 print(run_it, unit = "ms", signif = 3)
 # Unit: milliseconds
 #  expr   min    lq  mean median    uq   max neval
-#     A 0.445 0.453 0.463  0.464 0.467 0.492    10
-#     B 0.484 0.496 0.522  0.502 0.505 0.724    10
-#     C 0.667 0.746 0.755  0.758 0.786 0.800    10
-#     D 0.488 0.491 0.507  0.505 0.509 0.558    10
-#     E 0.454 0.455 0.462  0.461 0.465 0.490    10
-#     F 0.432 0.447 0.456  0.458 0.459 0.490    10
-#     G 0.438 0.446 0.456  0.454 0.460 0.486    10
-#     H 0.439 0.442 0.454  0.450 0.459 0.501    10
+#     A 0.528 0.552 0.572  0.564 0.573 0.653    10
+#     B 0.583 0.602 0.623  0.618 0.647 0.665    10
+#     C 0.965 0.979 1.020  1.000 1.040 1.220    10
+#     D 0.582 0.621 0.632  0.627 0.636 0.696    10
+#     E 0.548 0.552 0.585  0.582 0.599 0.646    10
+#     F 0.538 0.540 0.555  0.548 0.567 0.588    10
+#     G 0.539 0.554 0.573  0.568 0.585 0.646    10
+#     H 0.512 0.545 0.554  0.551 0.556 0.628    10

Of course, when running mo_phylum("Firmicutes") the function has zero knowledge about the actual microorganism, namely S. aureus. But since the result would be "Firmicutes" too, there is no point in calculating the result. And because this package ‘knows’ all phyla of all known bacteria (according to the Catalogue of Life), it can just return the initial value immediately.
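
The early return described above can be sketched in base R as follows. This is an illustration only: `known_phyla` is a toy reference vector, and `phylum_of()` and its `lookup` argument are hypothetical names, not the package's actual implementation.

```r
# Toy reference vector; the real package ships a far larger taxonomy.
known_phyla <- c("Firmicutes", "Proteobacteria", "Actinobacteria")

# Hypothetical helper: skip the expensive `lookup` entirely when every
# input value is already a valid phylum name.
phylum_of <- function(x, lookup) {
  if (all(x %in% known_phyla)) {
    return(x)
  }
  lookup(x)
}

# The lookup function is never called here, so stop() is not triggered.
phylum_of("Firmicutes", lookup = function(x) stop("not reached"))
```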

@@ -360,13 +360,13 @@
 print(run_it, unit = "ms", signif = 4)
 # Unit: milliseconds
 # expr min lq mean median uq max neval
-# en 21.02 22.35 26.62 22.93 23.58 55.03 100
-# de 22.22 23.82 29.38 24.32 25.30 60.12 100
-# nl 27.58 28.91 33.76 30.03 30.78 144.40 100
-# es 22.31 23.57 28.23 24.31 25.96 53.68 100
-# it 22.02 23.74 30.82 24.32 26.97 158.30 100
-# fr 22.38 23.39 28.85 24.29 25.71 143.30 100
-# pt 22.14 23.44 27.47 24.17 25.12 56.93 100
+# en 20.88 22.82 27.81 23.63 24.78 78.11 100
+# de 22.56 24.77 31.69 25.54 31.31 150.90 100
+# nl 28.06 30.72 37.50 32.17 41.06 76.13 100
+# es 22.64 24.58 30.66 25.43 32.62 72.82 100
+# it 22.44 24.34 30.07 25.23 32.15 60.99 100
+# fr 22.57 24.85 32.13 25.72 34.49 79.05 100
+# pt 22.54 24.52 30.13 25.29 30.88 80.61 100

Currently supported are German, Dutch, Spanish, Italian, French and Portuguese.

diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png
index 8d3785ba..88d14007 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-5-1.png differ
diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png
index 25e39db4..5a2e920b 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-8-1.png differ
diff --git a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png
index f88a9ad1..af7f9592 100644
Binary files a/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png and b/docs/articles/benchmarks_files/figure-html/unnamed-chunk-9-1.png differ
diff --git a/docs/articles/index.html b/docs/articles/index.html
index f87f8691..00bc4a00 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -84,7 +84,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
diff --git a/docs/authors.html b/docs/authors.html
index ddd67235..04787bb9 100644
--- a/docs/authors.html
+++ b/docs/authors.html
@@ -84,7 +84,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
diff --git a/docs/extra.js b/docs/extra.js
index 4d4c2704..fa73cc9f 100644
--- a/docs/extra.js
+++ b/docs/extra.js
@@ -94,6 +94,7 @@ $( document ).ready(function() {
  return(x);
  }
  $(".template-authors").html(doct_tit($(".template-authors").html()));
+ $(".template-citation-authors").html(doct_tit($(".template-citation-authors").html()));
  $(".developers").html(doct_tit($(".developers").html()));
  // $("footer").html(doct_tit($("footer").html()));
diff --git a/docs/index.html b/docs/index.html
index 42241438..fc217a54 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -45,7 +45,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
diff --git a/docs/news/index.html b/docs/news/index.html
index 6c94c005..684c5145 100644
--- a/docs/news/index.html
+++ b/docs/news/index.html
@@ -84,7 +84,7 @@
 AMR (for R)
- 0.9.0.9006
+ 0.9.0.9007
@@ -231,13 +231,13 @@
-
+
-AMR 0.9.0.9006 Unreleased
+AMR 0.9.0.9007 Unreleased
-
+
-Last updated: 22-Dec-2019
+Last updated: 27-Dec-2019
@@ -1416,7 +1416,7 @@ Using as.mo(..., allow_uncertain = 3)

Contents

diff --git a/pkgdown/extra.js b/pkgdown/extra.js
index 4d4c2704..fa73cc9f 100644
--- a/pkgdown/extra.js
+++ b/pkgdown/extra.js
@@ -94,6 +94,7 @@ $( document ).ready(function() {
  return(x);
  }
  $(".template-authors").html(doct_tit($(".template-authors").html()));
+ $(".template-citation-authors").html(doct_tit($(".template-citation-authors").html()));
  $(".developers").html(doct_tit($(".developers").html()));
  // $("footer").html(doct_tit($("footer").html()));