diff --git a/DESCRIPTION b/DESCRIPTION index 84422237..7636222b 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,5 +1,5 @@ Package: AMR -Version: 1.1.0.9013 +Version: 1.1.0.9014 Date: 2020-05-19 Title: Antimicrobial Resistance Analysis Authors@R: c( @@ -38,13 +38,11 @@ Depends: R (>= 3.0.0) Suggests: cleaner, - covr, dplyr, ggplot2, knitr, microbenchmark, rmarkdown, - rvest, testthat, tidyr, utils diff --git a/NEWS.md b/NEWS.md index bf92cc65..3b30ebbd 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,15 +1,15 @@ -# AMR 1.1.0.9013 +# AMR 1.1.0.9014 ## Last updated: 19-May-2020 ### Breaking -* Removed code dependency on **all** R packages that this `AMR` package required: `cleaner`, `crayon`, `data.table`, `dplyr`, `ggplot2`, `knitr`, `microbenchmark`, `pillar`, `R6`, `rlang`, `tidyr` and `vctrs`. This is a major code change, but will probably not be noticeable by most users. +* Removed code dependency on all other R packages: `cleaner`, `crayon`, `data.table`, `dplyr`, `ggplot2`, `knitr`, `microbenchmark`, `pillar`, `R6`, `rlang`, `tidyr` and `vctrs`. This is a major code change, but will probably not be noticeable by most users. Making this package independent on especially the tidyverse tremendously increases sustainability on the long term, since tidyverse functions change quite often. Most of our functions are replaced with versions that only rely on base R, which keeps this package fully functional for many years to come, without requiring a lot of maintenance to keep up with other packages anymore. Another upside it that this package can now be used with all versions of R since R-3.0.0 (April 2013). Our package is being used in settings where the resources are very limited. Fewer dependencies on newer software is helpful for such settings. Negative effects of this change are: * Function `freq()` that was borrowed from the `cleaner` package was removed. Use `cleaner::freq()`, or run `library("cleaner")` before you use `freq()`. 
* Printing values of class `mo` or `ab` in a tibble will no longer be in colour. - * All functions from the `mo_*` family (like `mo_name()` and `mo_gramstain()`) are noticeably slower when running on tens of thousands of rows. + * All functions from the `mo_*` family (like `mo_name()` and `mo_gramstain()`) are noticeably slower when running on hundreds of thousands of rows. * For developers: classes `mo` and `ab` now both also inherit class `character`, to support any data transformation. This change invalidates code that checks for class length == 1. ### Changed diff --git a/R/like.R b/R/like.R index 42e7a6cb..964fea10 100755 --- a/R/like.R +++ b/R/like.R @@ -60,8 +60,7 @@ #' \dontrun{ #' library(dplyr) #' example_isolates %>% -#' filter(mo_name(mo) %like% "^ent") %>% -#' freq(mo) +#' filter(mo_name(mo) %like% "^ent") #' } like <- function(x, pattern, ignore.case = TRUE) { # set to fixed if no regex found diff --git a/R/mdro.R b/R/mdro.R index adac0965..b515687e 100755 --- a/R/mdro.R +++ b/R/mdro.R @@ -68,6 +68,7 @@ #' @examples #' \dontrun{ #' library(dplyr) +#' library(cleaner) #' #' example_isolates %>% #' mdro() %>% diff --git a/docs/404.html b/docs/404.html index 0e7c7e3f..769e3500 100644 --- a/docs/404.html +++ b/docs/404.html @@ -81,7 +81,7 @@ AMR (for R) - 1.1.0.9013 + 1.1.0.9014 diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index 6b52a397..11879079 100644 --- a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -81,7 +81,7 @@ AMR (for R) - 1.1.0.9013 + 1.1.0.9014 diff --git a/docs/articles/AMR.html b/docs/articles/AMR.html index bbc4833d..6b1e2c66 100644 --- a/docs/articles/AMR.html +++ b/docs/articles/AMR.html @@ -39,7 +39,7 @@ AMR (for R) - 1.1.0.9009 + 1.1.0.9014 @@ -186,7 +186,7 @@

How to conduct AMR analysis

Matthijs S. Berends

-

18 May 2020

+

19 May 2020

Source: vignettes/AMR.Rmd @@ -195,7 +195,7 @@ -

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 18 May 2020.

+

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 19 May 2020.

Introduction

@@ -226,21 +226,21 @@ -2020-05-18 +2020-05-19 abcd Escherichia coli S S -2020-05-18 +2020-05-19 abcd Escherichia coli S R -2020-05-18 +2020-05-19 efgh Escherichia coli R @@ -251,14 +251,15 @@

Needed R packages

-

As with many uses in R, we need some additional packages for AMR analysis. Our package works closely together with the tidyverse packages dplyr and ggplot2 by Dr Hadley Wickham. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R.

-

Our AMR package depends on these packages and even extends their use and functions.

+

As with many uses in R, we need some additional packages for AMR analysis. Our package works closely together with the tidyverse packages dplyr and ggplot2 by RStudio. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R.

+

We will also use the cleaner package, which can be used for cleaning data and creating frequency tables.

library(dplyr)
 library(ggplot2)
 library(AMR)
+library(cleaner)
 
 # (if not yet installed, install with:)
-# install.packages(c("dplyr", "ggplot2", "AMR"))
+# install.packages(c("dplyr", "ggplot2", "AMR", "cleaner"))
@@ -335,68 +336,68 @@ -2013-06-23 -V9 -Hospital C +2017-01-17 +W7 +Hospital B Staphylococcus aureus S -R +S S S F -2016-12-25 -S10 -Hospital D +2017-01-15 +S1 +Hospital A Escherichia coli R S S -R +S F -2011-01-27 -Z10 -Hospital B -Escherichia coli -R +2014-03-03 +F8 +Hospital D +Staphylococcus aureus R S S -F +S +M -2010-09-14 -N1 +2013-10-12 +K3 Hospital B Escherichia coli S S -R -R +S +S M -2017-10-31 -W1 -Hospital D -Klebsiella pneumoniae -R +2017-07-30 +Y4 +Hospital B +Escherichia coli R +S R S F -2014-03-29 -L4 -Hospital B +2011-01-18 +B4 +Hospital C Staphylococcus aureus S S -S +R S M @@ -408,9 +409,9 @@

Cleaning the data

-

We also created a package dedicated to data cleaning and checking, called the cleaner package. It gets automatically installed with the AMR package. For its freq() function to create frequency tables, you don’t even need to load it yourself as it is available through the AMR package as well.

+

We also created a package dedicated to data cleaning and checking, called the cleaner package. Its freq() function can be used to create frequency tables.

For example, for the gender variable:

-
data %>% freq(gender) # this would be the same: freq(data$gender)
+
data %>% freq(gender)

Frequency table

Class: character
Length: 20,000
@@ -431,16 +432,16 @@ Longest: 1

1 M -10,293 -51.47% -10,293 -51.47% +10,336 +51.68% +10,336 +51.68% 2 F -9,707 -48.54% +9,664 +48.32% 20,000 100.00% @@ -480,7 +481,7 @@ Longest: 1

# NOTE: Using column `bacteria` as input for `col_mo`.
# NOTE: Using column `date` as input for `col_date`.
# NOTE: Using column `patient_id` as input for `col_patient_id`.
-

So only 28.6% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:

+

So only 28.3% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:

data_1st <- data %>%
   filter(first == TRUE)

For future use, the above two syntaxes can be shortened with the filter_first_isolate() function:

@@ -490,7 +491,7 @@ Longest: 1

First weighted isolates

-

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient N1, sorted on date:

+

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient T4, sorted on date:

@@ -506,21 +507,21 @@ Longest: 1

- - + + + - - + - - + + - + @@ -528,8 +529,8 @@ Longest: 1

- - + + @@ -539,52 +540,30 @@ Longest: 1

- - + + - + - - + + - + - + - - - - - - - - - - - - - - - - - - - - - - - - + + @@ -593,24 +572,46 @@ Longest: 1

- - - + + + + + + + + + + + + + + + + + + + + + + + + + - - + + - + @@ -642,131 +643,131 @@ Longest: 1

- - + + + - - + - - + + - + - + - - + + - + - - + + - + - + - - + + - + - + - - + + - + - - + + + - - - - + + + - - + + + - - - + + - - + + + - - + - + - - + + - + - +
isolate
12010-03-16N12010-02-08T4 B_ESCHR_COLIR SSSR S TRUE
22010-04-19N12010-03-28T4 B_ESCHR_COLISR S S S
32010-06-26N12010-06-15T4 B_ESCHR_COLI S S
42010-07-17N12010-07-21T4 B_ESCHR_COLI S SSR S FALSE
52010-09-14N12011-01-20T4 B_ESCHR_COLISR S RRS FALSE
62010-09-23N1B_ESCHR_COLISSSSFALSE
72010-12-14N1B_ESCHR_COLISSSSFALSE
82011-07-30N12011-04-24T4 B_ESCHR_COLI S S TRUE
92011-08-21N172011-06-20T4 B_ESCHR_COLIISRRFALSE
82011-06-23T4B_ESCHR_COLIR S SRFALSE
92011-09-20T4B_ESCHR_COLIR SR S FALSE
102011-09-09N12011-10-08T4 B_ESCHR_COLI S SSR S FALSE
12010-03-16N12010-02-08T4 B_ESCHR_COLIR SSSR S TRUE TRUE
22010-04-19N12010-03-28T4 B_ESCHR_COLISR S S S FALSEFALSETRUE
32010-06-26N12010-06-15T4 B_ESCHR_COLI S S S S FALSEFALSETRUE
42010-07-17N12010-07-21T4 B_ESCHR_COLI S SSR S FALSEFALSETRUE
52010-09-14N12011-01-20T4 B_ESCHR_COLISR S RRS FALSE TRUE
62010-09-23N12011-04-24T4 B_ESCHR_COLI S S S SFALSETRUE TRUE
72010-12-14N12011-06-20T4 B_ESCHR_COLII SSSSFALSERR FALSETRUE
82011-07-30N12011-06-23T4 B_ESCHR_COLIR S SSSTRUERFALSE TRUE
92011-08-21N12011-09-20T4 B_ESCHR_COLIR SSSR S FALSEFALSETRUE
102011-09-09N12011-10-08T4 B_ESCHR_COLI S SSR S FALSEFALSETRUE
-

Instead of 2, now 4 isolates are flagged. In total, 75.4% of all isolates are marked ‘first weighted’ - 46.8% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

+

Instead of 2, now 10 isolates are flagged. In total, 75.2% of all isolates are marked ‘first weighted’ - 46.9 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

As with filter_first_isolate(), there’s a shortcut for this new algorithm too:

data_1st <- data %>%
   filter_first_weighted_isolate()
-

So we end up with 15,085 isolates for analysis.

+

So we end up with 15,048 isolates for analysis.

We can remove unneeded columns:

data_1st <- data_1st %>%
   select(-c(first, keyab))
@@ -792,9 +793,9 @@ Longest: 1

1 -2013-06-23 -V9 -Hospital C +2017-01-17 +W7 +Hospital B B_STPHY_AURS S S @@ -808,14 +809,14 @@ Longest: 1

2 -2016-12-25 -S10 -Hospital D +2017-01-15 +S1 +Hospital A B_ESCHR_COLI R S S -R +S F Gram-negative Escherichia @@ -824,66 +825,66 @@ Longest: 1

3 -2011-01-27 -Z10 -Hospital B -B_ESCHR_COLI -R +2014-03-03 +F8 +Hospital D +B_STPHY_AURS R S S -F -Gram-negative -Escherichia -coli +S +M +Gram-positive +Staphylococcus +aureus TRUE -4 -2010-09-14 -N1 +5 +2017-07-30 +Y4 Hospital B B_ESCHR_COLI -S +R S R -R -M +S +F Gram-negative Escherichia coli TRUE -5 -2017-10-31 -W1 -Hospital D -B_KLBSL_PNMN -R -R +6 +2011-01-18 +B4 +Hospital C +B_STPHY_AURS +S +S R S -F -Gram-negative -Klebsiella -pneumoniae +M +Gram-positive +Staphylococcus +aureus TRUE 7 -2010-03-02 -W2 +2013-05-01 +G7 Hospital B -B_ESCHR_COLI -S -S -S -S -F -Gram-negative -Escherichia -coli +B_STRPT_PNMN +I +I +R +R +M +Gram-positive +Streptococcus +pneumoniae TRUE @@ -905,8 +906,8 @@ Longest: 1

data_1st %>% freq(genus, species)

Frequency table

Class: character
-Length: 15,085
-Available: 15,085 (100%, NA: 0 = 0%)
+Length: 15,048
+Available: 15,048 (100%, NA: 0 = 0%)
Unique: 4

Shortest: 16
Longest: 24

@@ -923,33 +924,33 @@ Longest: 24

1 Escherichia coli -7,418 -49.17% -7,418 -49.17% +7,376 +49.02% +7,376 +49.02% 2 Staphylococcus aureus -3,716 -24.63% -11,134 -73.81% +3,819 +25.38% +11,195 +74.40% 3 Streptococcus pneumoniae -2,404 -15.94% -13,538 -89.74% +2,330 +15.48% +13,525 +89.88% 4 Klebsiella pneumoniae -1,547 -10.26% -15,085 +1,523 +10.12% +15,048 100.00% @@ -961,7 +962,7 @@ Longest: 24

The functions resistance() and susceptibility() can be used to calculate antimicrobial resistance or susceptibility. For more specific analyses, the functions proportion_S(), proportion_SI(), proportion_I(), proportion_IR() and proportion_R() can be used to determine the proportion of a specific antimicrobial outcome.

As per the EUCAST guideline of 2019, we calculate resistance as the proportion of R (proportion_R(), equal to resistance()) and susceptibility as the proportion of S and I (proportion_SI(), equal to susceptibility()). These functions can be used on their own:

data_1st %>% resistance(AMX)
-# [1] 0.4658933
+# [1] 0.4628522
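
To make the two definitions concrete, here is a minimal base R sketch of the quantities described above: the proportion of R, and the proportion of S and I, among tested isolates. The helper names `prop_R()` and `prop_SI()` and the input vector are hypothetical and are not the package's implementation; the real functions may apply additional rules that are not modelled here.

```r
# Made-up S/I/R results for one antibiotic; NA means "not tested"
sir <- c("S", "R", "I", "R", "S", "S", NA, "R")

# Proportion of R among tested isolates
# (the quantity described for resistance() / proportion_R())
prop_R <- function(x) {
  tested <- x[!is.na(x)]
  sum(tested == "R") / length(tested)
}

# Proportion of S and I among tested isolates
# (the quantity described for susceptibility() / proportion_SI())
prop_SI <- function(x) {
  tested <- x[!is.na(x)]
  sum(tested %in% c("S", "I")) / length(tested)
}

prop_R(sir)   # 3 of 7 tested isolates are R
prop_SI(sir)  # 4 of 7 tested isolates are S or I
```

Because every tested isolate is exactly one of S, I or R, the two proportions add up to 1 in this sketch.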

Or can be used in conjunction with group_by() and summarise(), both from the dplyr package:

data_1st %>%
   group_by(hospital) %>%
@@ -974,19 +975,19 @@ Longest: 24

Hospital A -0.4701310 +0.4566180 Hospital B -0.4702970 +0.4606292 Hospital C -0.4575579 +0.4775330 Hospital D -0.4574359 +0.4650083 @@ -1004,23 +1005,23 @@ Longest: 24

Hospital A -0.4701310 -4503 +0.4566180 +4518 Hospital B -0.4702970 -5454 +0.4606292 +5245 Hospital C -0.4575579 -2203 +0.4775330 +2270 Hospital D -0.4574359 -2925 +0.4650083 +3015 @@ -1040,27 +1041,27 @@ Longest: 24

Escherichia -0.9207334 -0.8959288 -0.9936641 +0.9197397 +0.8889642 +0.9940347 Klebsiella -0.9198449 -0.9004525 -0.9974144 +0.9277741 +0.8929744 +0.9954038 Staphylococcus -0.9203445 -0.9233046 -0.9938105 +0.9316575 +0.9279916 +0.9963341 Streptococcus -0.6110649 +0.6253219 0.0000000 -0.6110649 +0.6253219 diff --git a/docs/articles/AMR_files/figure-html/plot 1-1.png b/docs/articles/AMR_files/figure-html/plot 1-1.png index d405d2ae..8f26ee84 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 1-1.png and b/docs/articles/AMR_files/figure-html/plot 1-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 3-1.png b/docs/articles/AMR_files/figure-html/plot 3-1.png index 0add7c50..1c6981ed 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 3-1.png and b/docs/articles/AMR_files/figure-html/plot 3-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 4-1.png b/docs/articles/AMR_files/figure-html/plot 4-1.png index 12e57e0d..37a7354b 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 4-1.png and b/docs/articles/AMR_files/figure-html/plot 4-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 5-1.png b/docs/articles/AMR_files/figure-html/plot 5-1.png index 6f0e0fe8..04fb1999 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 5-1.png and b/docs/articles/AMR_files/figure-html/plot 5-1.png differ diff --git a/docs/articles/MDR.html b/docs/articles/MDR.html index e176e090..1f82aa67 100644 --- a/docs/articles/MDR.html +++ b/docs/articles/MDR.html @@ -39,7 +39,7 @@ AMR (for R) - 1.1.0 + 1.1.0.9014
@@ -186,7 +186,7 @@

How to determine multi-drug resistance (MDR)

Matthijs S. Berends

-

15 April 2020

+

19 May 2020

Source: vignettes/MDR.Rmd @@ -209,20 +209,16 @@

Create a frequency table of the results:

freq(my_TB_data$mdr)

Frequency table

@@ -346,40 +343,40 @@ Unique: 5

1 Mono-resistant -3310 -66.20% -3310 -66.20% +3294 +65.88% +3294 +65.88% 2 Negative -637 -12.74% -3947 -78.94% +613 +12.26% +3907 +78.14% 3 Multi-drug-resistant -569 -11.38% -4516 -90.32% +572 +11.44% +4479 +89.58% 4 Poly-resistant -283 -5.66% -4799 -95.98% +312 +6.24% +4791 +95.82% 5 Extensively drug-resistant -201 -4.02% +209 +4.18% 5000 100.00% @@ -401,7 +398,7 @@ Unique: 5

-

Site built with pkgdown 1.5.0.

+

Site built with pkgdown 1.5.1.

diff --git a/docs/articles/WHONET.html b/docs/articles/WHONET.html index 3fe01da6..1831bacd 100644 --- a/docs/articles/WHONET.html +++ b/docs/articles/WHONET.html @@ -39,7 +39,7 @@ AMR (for R) - 1.1.0 + 1.1.0.9014 @@ -186,7 +186,7 @@

How to work with WHONET data

Matthijs S. Berends

-

15 April 2020

+

19 May 2020

Source: vignettes/WHONET.Rmd @@ -210,7 +210,8 @@

First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don’t know it yet, I suggest you read about it on their website: https://www.tidyverse.org/.

library(dplyr)   # part of tidyverse
 library(ggplot2) # part of tidyverse
-library(AMR)     # this package
+library(AMR)     # this package
+library(cleaner) # to create frequency tables

We will have to transform some variables to simplify and automate the analysis: