diff --git a/DESCRIPTION b/DESCRIPTION index 7de38f74..3f55eb76 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 0.8.0.9017 -Date: 2019-11-06 +Version: 0.8.0.9021 +Date: 2019-11-09 Title: Antimicrobial Resistance Analysis Authors@R: c( person(role = c("aut", "cre"), diff --git a/NEWS.md b/NEWS.md index 8c1bcb95..245594e8 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,5 @@ -# AMR 0.8.0.9017 -Last updated: 06-Nov-2019 +# AMR 0.8.0.9021 +Last updated: 09-Nov-2019 ### New * Support for a new MDRO guideline: Magiorakos AP, Srinivasan A *et al.* "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." Clinical Microbiology and Infection (2012). diff --git a/R/ab.R b/R/ab.R index ec6bf1de..d898a378 100755 --- a/R/ab.R +++ b/R/ab.R @@ -75,7 +75,7 @@ as.ab <- function(x, ...) { # remove suffices x_bak_clean <- gsub("_(mic|rsi|dis[ck])$", "", x, ignore.case = TRUE) # remove disk concentrations, like LVX_NM -> LVX - x_bak_clean <- gsub("_[A-Z]{2}[0-9_]{0,3}$", "", x_bak_clean, ignore.case = TRUE) + x_bak_clean <- gsub("_[A-Z]{2}[0-9_.]{0,3}$", "", x_bak_clean, ignore.case = TRUE) # remove part between brackets if that's followed by another string x_bak_clean <- gsub("(.*)+ [(].*[)]", "\\1", x_bak_clean) # keep only max 1 space diff --git a/R/eucast_rules.R b/R/eucast_rules.R index 472209e3..38375369 100755 --- a/R/eucast_rules.R +++ b/R/eucast_rules.R @@ -241,11 +241,11 @@ eucast_rules <- function(x, warned <- FALSE txt_error <- function() { - cat("", bgRed(white(" ERROR ")), "\n\n") + if (info == TRUE) cat("", bgRed(white(" ERROR ")), "\n\n") } txt_warning <- function() { if (warned == FALSE) { - cat("", bgYellow(black(" WARNING "))) + if (info == TRUE) cat("", bgYellow(black(" WARNING "))) } warned <<- TRUE } diff --git a/appveyor.yml b/appveyor.yml index 65d7a8e6..24c366db 100644 --- a/appveyor.yml +++ b/appveyor.yml @@ -42,8 
+42,10 @@ environment: USE_RTOOLS: true matrix: + - R_VERSION: oldrel - R_VERSION: release - - R_VERSION: devel + allow_failures: + - R_VERSION: devel build_script: - travis_tool.sh install_deps diff --git a/docs/404.html b/docs/404.html index ed984d46..d7879f87 100644 --- a/docs/404.html +++ b/docs/404.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index 783ceac4..1153159c 100644 --- a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/articles/AMR.html b/docs/articles/AMR.html index dce1f69e..a65f8394 100644 --- a/docs/articles/AMR.html +++ b/docs/articles/AMR.html @@ -41,7 +41,7 @@ AMR (for R) - 0.8.0 + 0.8.0.9021 @@ -187,7 +187,7 @@

How to conduct AMR analysis

Matthijs S. Berends

-

16 October 2019

+

09 November 2019

@@ -196,7 +196,7 @@ -

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 16 October 2019.

+

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 09 November 2019.

Introduction

@@ -212,21 +212,21 @@ -2019-10-16 +2019-11-09 abcd Escherichia coli S S -2019-10-16 +2019-11-09 abcd Escherichia coli S R -2019-10-16 +2019-11-09 efgh Escherichia coli R @@ -321,65 +321,65 @@ -2012-05-14 -G6 -Hospital B -Escherichia coli -R -S -R -R -M - - -2012-03-26 -F8 -Hospital B -Staphylococcus aureus -S -S -S -S -M - - -2010-04-06 -F7 -Hospital B -Escherichia coli -I -S -R -R -M - - -2016-05-08 -O1 -Hospital B -Escherichia coli -S -S -S -S -F - - -2017-05-22 -T6 -Hospital A -Staphylococcus aureus -R -S -S -S -F - - -2012-06-06 -X7 +2011-07-25 +L6 Hospital D Staphylococcus aureus +S +S +R +S +M + + +2015-11-29 +J1 +Hospital B +Klebsiella pneumoniae +S +S +R +S +M + + +2012-04-04 +C3 +Hospital B +Escherichia coli +R +S +S +S +M + + +2017-10-06 +W8 +Hospital B +Escherichia coli +S +S +S +S +F + + +2013-07-22 +Q6 +Hospital D +Escherichia coli +R +I +S +S +F + + +2011-06-05 +W7 +Hospital B +Escherichia coli R S S @@ -394,10 +394,9 @@

Cleaning the data

-

We also created a package dedicated to data cleaning and checking, called the clean package. It gets automatically installed with the AMR package, so we only have to load it:

-
library(clean)
-

Use the frequency table function freq() from this clean package to look specifically for unique values in any variable. For example, for the gender variable:

-
data %>% freq(gender) # this would be the same: freq(data$gender)
+

We also created a package dedicated to data cleaning and checking, called the cleaner package. It gets automatically installed with the AMR package. For its freq() function to create frequency tables, you don’t even need to load it yourself as it is available through the AMR package as well.

+

For example, for the gender variable:

+
data %>% freq(gender) # this would be the same: freq(data$gender)
# Frequency table 
 # 
 # Class:   factor (numeric)
@@ -407,82 +406,82 @@
 # 
 #      Item     Count   Percent   Cum. Count   Cum. Percent
 # ---  -----  -------  --------  -----------  -------------
-# 1    M       10,380     51.9%       10,380          51.9%
-# 2    F        9,620     48.1%       20,000         100.0%
+# 1    M       10,398    51.99%       10,398         51.99%
+# 2    F        9,602    48.01%       20,000        100.00%

So, we can draw at least two conclusions immediately. From a data scientist's perspective, the data looks clean: only values M and F. From a researcher's perspective: there are slightly more men. Nothing we didn’t already know.

The data is already quite clean, but we still need to transform some variables. The bacteria column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The mutate() function of the dplyr package makes this really easy:

-
data <- data %>%
-  mutate(bacteria = as.mo(bacteria))
+
data <- data %>%
+  mutate(bacteria = as.mo(bacteria))

We also want to transform the antibiotics, because in real-life data we don’t know if they are really clean. The as.rsi() function ensures reliability and reproducibility in these kinds of variables. The mutate_at() function will run the as.rsi() function on defined variables:

-
data <- data %>%
-  mutate_at(vars(AMX:GEN), as.rsi)
+
data <- data %>%
+  mutate_at(vars(AMX:GEN), as.rsi)

Finally, we will apply EUCAST rules on our antimicrobial results. In Europe, most medical microbiological laboratories already apply these rules. Our package features their latest insights on intrinsic resistance and exceptional phenotypes. Moreover, the eucast_rules() function can also apply additional rules, like forcing ampicillin = R when amoxicillin/clavulanic acid = R.

Because the amoxicillin (column AMX) and amoxicillin/clavulanic acid (column AMC) results in our data were generated randomly, some rows will undoubtedly contain AMX = S and AMC = R, which is technically impossible. The eucast_rules() function fixes this:

-
data <- eucast_rules(data, col_mo = "bacteria")
-# 
-# Rules by the European Committee on Antimicrobial Susceptibility Testing (EUCAST)
-# http://eucast.org/
-# 
-# EUCAST Clinical Breakpoints (v9.0, 2019)
-# Aerococcus sanguinicola (no changes)
-# Aerococcus urinae (no changes)
-# Anaerobic Gram-negatives (no changes)
-# Anaerobic Gram-positives (no changes)
-# Campylobacter coli (no changes)
-# Campylobacter jejuni (no changes)
-# Enterobacteriales (Order) (no changes)
-# Enterococcus (no changes)
-# Haemophilus influenzae (no changes)
-# Kingella kingae (no changes)
-# Moraxella catarrhalis (no changes)
-# Pasteurella multocida (no changes)
-# Staphylococcus (no changes)
-# Streptococcus groups A, B, C, G (no changes)
-# Streptococcus pneumoniae (1,483 values changed)
-# Viridans group streptococci (no changes)
-# 
-# EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)
-# Table 01: Intrinsic resistance in Enterobacteriaceae (1,268 values changed)
-# Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)
-# Table 03: Intrinsic resistance in other Gram-negative bacteria (no changes)
-# Table 04: Intrinsic resistance in Gram-positive bacteria (2,755 values changed)
-# Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)
-# Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)
-# Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)
-# Table 12: Interpretive rules for aminoglycosides (no changes)
-# Table 13: Interpretive rules for quinolones (no changes)
-# 
-# Other rules
-# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,282 values changed)
-# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (122 values changed)
-# Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)
-# Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)
-# Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)
-# Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)
-# 
-# --------------------------------------------------------------------------
-# EUCAST rules affected 6,530 out of 20,000 rows, making a total of 7,910 edits
-# => added 0 test results
-# 
-# => changed 7,910 test results
-#    - 109 test results changed from S to I
-#    - 4,678 test results changed from S to R
-#    - 1,098 test results changed from I to S
-#    - 322 test results changed from I to R
-#    - 1,676 test results changed from R to S
-#    - 27 test results changed from R to I
-# --------------------------------------------------------------------------
-# 
-# Use eucast_rules(..., verbose = TRUE) (on your original data) to get a data.frame with all specified edits instead.
+
data <- eucast_rules(data, col_mo = "bacteria")
+# 
+# Rules by the European Committee on Antimicrobial Susceptibility Testing (EUCAST)
+# http://eucast.org/
+# 
+# EUCAST Clinical Breakpoints (v9.0, 2019)
+# Aerococcus sanguinicola (no changes)
+# Aerococcus urinae (no changes)
+# Anaerobic Gram-negatives (no changes)
+# Anaerobic Gram-positives (no changes)
+# Campylobacter coli (no changes)
+# Campylobacter jejuni (no changes)
+# Enterobacterales (Order) (no changes)
+# Enterococcus (no changes)
+# Haemophilus influenzae (no changes)
+# Kingella kingae (no changes)
+# Moraxella catarrhalis (no changes)
+# Pasteurella multocida (no changes)
+# Staphylococcus (no changes)
+# Streptococcus groups A, B, C, G (no changes)
+# Streptococcus pneumoniae (1,452 values changed)
+# Viridans group streptococci (no changes)
+# 
+# EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)
+# Table 01: Intrinsic resistance in Enterobacteriaceae (1,294 values changed)
+# Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)
+# Table 03: Intrinsic resistance in other Gram-negative bacteria (no changes)
+# Table 04: Intrinsic resistance in Gram-positive bacteria (2,721 values changed)
+# Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)
+# Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)
+# Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)
+# Table 12: Interpretive rules for aminoglycosides (no changes)
+# Table 13: Interpretive rules for quinolones (no changes)
+# 
+# Other rules
+# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,287 values changed)
+# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (107 values changed)
+# Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)
+# Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)
+# Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)
+# Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)
+# 
+# --------------------------------------------------------------------------
+# EUCAST rules affected 6,525 out of 20,000 rows, making a total of 7,861 edits
+# => added 0 test results
+# 
+# => changed 7,861 test results
+#    - 100 test results changed from S to I
+#    - 4,694 test results changed from S to R
+#    - 1,140 test results changed from I to S
+#    - 302 test results changed from I to R
+#    - 1,594 test results changed from R to S
+#    - 31 test results changed from R to I
+# --------------------------------------------------------------------------
+# 
+# Use eucast_rules(..., verbose = TRUE) (on your original data) to get a data.frame with all specified edits instead.

Adding new variables

Now that we have the microbial ID, we can add some taxonomic properties:

-
data <- data %>% 
-  mutate(gramstain = mo_gramstain(bacteria),
-         genus = mo_genus(bacteria),
-         species = mo_species(bacteria))
+
data <- data %>% 
+  mutate(gramstain = mo_gramstain(bacteria),
+         genus = mo_genus(bacteria),
+         species = mo_species(bacteria))

First isolates

@@ -493,23 +492,23 @@

(…) When preparing a cumulative antibiogram to guide clinical decisions about empirical antimicrobial therapy of initial infections, only the first isolate of a given species per patient, per analysis period (eg, one year) should be included, irrespective of body site, antimicrobial susceptibility profile, or other phenotypical characteristics (eg, biotype). The first isolate is easily identified, and cumulative antimicrobial susceptibility test data prepared using the first isolate are generally comparable to cumulative antimicrobial susceptibility test data calculated by other methods, providing duplicate isolates are excluded.
M39-A4 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 4th Edition. CLSI, 2014. Chapter 6.4

This AMR package includes this methodology with the first_isolate() function. It uses an episode length of one year (this can be changed by the user) and restarts counting days after every selected isolate. This new variable can easily be added to our data:

- +
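The episode logic described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's first_isolate() implementation: the helper name flag_first_isolates(), its arguments and the "at least 365 days" cut-off are hypothetical and chosen only for this example.

```r
# Hypothetical helper (NOT the AMR package's first_isolate()):
# flags the first isolate per patient and species within each episode.
flag_first_isolates <- function(dates, patient, species, episode_days = 365) {
  dates <- as.Date(dates)
  first <- logical(length(dates))
  # one group of row indices per patient/species combination
  groups <- split(seq_along(dates), list(patient, species), drop = TRUE)
  for (idx in groups) {
    idx <- idx[order(dates[idx])]            # oldest isolate first
    last_selected <- as.Date(NA)
    for (i in idx) {
      if (is.na(last_selected) ||
          as.numeric(dates[i] - last_selected) >= episode_days) {
        first[i] <- TRUE
        last_selected <- dates[i]            # restart the day count here
      }
    }
  }
  first
}

# Made-up isolates, for illustration only
isolates <- data.frame(
  date    = as.Date(c("2010-04-05", "2010-04-09", "2011-04-20", "2010-05-01")),
  patient = c("Y8", "Y8", "Y8", "Z1"),
  species = c("E. coli", "E. coli", "E. coli", "E. coli")
)
isolates$first <- flag_first_isolates(isolates$date,
                                      isolates$patient,
                                      isolates$species)
isolates
```

In this sketch the second isolate of patient Y8 falls within the episode of the first and is skipped, while the third is taken more than a year later and is selected again.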

So only 28.4% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:

- +

For future use, the above two syntaxes can be shortened with the filter_first_isolate() function:

- +

First weighted isolates

-

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient I4, sorted on date:

+

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient Y8, sorted on date:

@@ -525,8 +524,8 @@ - - + + @@ -536,10 +535,10 @@ - - + + - + @@ -547,8 +546,8 @@ - - + + @@ -558,21 +557,21 @@ - - + + - - + + - - + + - + @@ -580,43 +579,43 @@ - - + + - + + + - - - - + + - + - - + + - - + + - - + + - + @@ -624,12 +623,12 @@ - - + + - + @@ -637,16 +636,16 @@
isolate
12010-02-10I42010-04-05Y8 B_ESCHR_COLI R S
22010-02-19I42010-04-09Y8 B_ESCHR_COLISR S S S
32010-03-09I42010-04-14Y8 B_ESCHR_COLI S S
42010-04-06I42010-05-23Y8 B_ESCHR_COLII SRSS S FALSE
52010-07-14I42010-07-05Y8 B_ESCHR_COLISR S S S
62010-08-03I42010-07-05Y8 B_ESCHR_COLIIRSS SRR FALSE
72010-09-19I42010-07-10Y8 B_ESCHR_COLI S SSR S FALSE
82010-09-19I42010-08-07Y8 B_ESCHR_COLISSRR S S FALSE
92010-10-05I42010-09-24Y8 B_ESCHR_COLISR S S S
102010-12-30I42011-02-13Y8 B_ESCHR_COLI S SSR S FALSE

Only 1 isolate is marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The key_antibiotics() function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram negatives and 6 narrow-spectrum ones for Gram positives. These can be defined by the user.

If a column exists with a name like ‘key(…)ab’, the first_isolate() function will automatically use it and determine the first weighted isolates. Mind the NOTEs in the output below:

- + @@ -663,8 +662,8 @@ - - + + @@ -675,124 +674,124 @@ - - + + - + - + - - + + - - - - - - - - - - - - - - - - + + + + + + + + + + + + + + + + - - + + - + + + - - - + - - + + - + - - + + - - + + - + - - + + - + - + - - + + - + - +
isolate
12010-02-10I42010-04-05Y8 B_ESCHR_COLI R S
22010-02-19I42010-04-09Y8 B_ESCHR_COLISR S S S FALSETRUEFALSE
32010-03-09I42010-04-14Y8 B_ESCHR_COLI S S S S FALSEFALSE
42010-04-06I4B_ESCHR_COLIISRSFALSE TRUE
52010-07-14I4
42010-05-23Y8 B_ESCHR_COLI S S S S FALSEFALSE
52010-07-05Y8B_ESCHR_COLIRSSSFALSE TRUE
62010-08-03I42010-07-05Y8 B_ESCHR_COLIIRSS SRR FALSETRUEFALSE
72010-09-19I42010-07-10Y8 B_ESCHR_COLI S SSR S FALSE TRUE
82010-09-19I42010-08-07Y8 B_ESCHR_COLISSRR S S FALSEFALSETRUE
92010-10-05I42010-09-24Y8 B_ESCHR_COLISR S S S FALSEFALSETRUE
102010-12-30I42011-02-13Y8 B_ESCHR_COLI S SSR S FALSEFALSETRUE
-

Instead of 1, now 6 isolates are flagged. In total, 75.7% of all isolates are marked ‘first weighted’ - 47.3% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

+

Instead of 1, now 7 isolates are flagged. In total, 75.4% of all isolates are marked ‘first weighted’ - 47.0% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

As with filter_first_isolate(), there’s a shortcut for this new algorithm too:

- -

So we end up with 15,143 isolates for analysis.

+ +

So we end up with 15,079 isolates for analysis.

We can remove unneeded columns:

- +

Now our data looks like:

-
head(data_1st)
+
head(data_1st)
@@ -813,30 +812,30 @@ - - - - - + + + + + - + - - - + + + - - + + - + + + - - @@ -844,67 +843,67 @@ - - - - - - + + + + + + - - - + + + - - - + + + - + - + - - - + + + - - - - - - - + + + + + + + - - - + + + - - + + - + - - - + + + @@ -919,14 +918,14 @@

Dispersion of species

-

To just get an idea how the species are distributed, create a frequency table with our freq() function. We created the genus and species column earlier based on the microbial ID. With paste(), we can concatenate them together.

-

The freq() function can be used like the base R language was intended:

-
freq(paste(data_1st$genus, data_1st$species))
+

To just get an idea how the species are distributed, create a frequency table with our freq() function. We created the genus and species column earlier based on the microbial ID. With paste(), we can concatenate them together.

+

The freq() function can be used like the base R language was intended:

+
freq(paste(data_1st$genus, data_1st$species))

Or it can be used the dplyr way, which is easier to read:

-
data_1st %>% freq(genus, species)
+
data_1st %>% freq(genus, species)

Frequency table

Class: character
-Length: 15,143 (of which NA: 0 = 0%)
+Length: 15,079 (of which NA: 0 = 0%)
Unique: 4

Shortest: 16
Longest: 24

@@ -943,33 +942,33 @@ Longest: 24

- - - - + + + + - - - - + + + + - - - - + + + + - - - + + + @@ -979,12 +978,12 @@ Longest: 24

Resistance percentages

The functions portion_S(), portion_SI(), portion_I(), portion_IR() and portion_R() can be used to determine the portion of a specific antimicrobial outcome. As per the EUCAST guideline of 2019, we calculate resistance as the portion of R (portion_R()) and susceptibility as the portion of S and I (portion_SI()). These functions can be used on their own:

- +
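To make the definitions above concrete, here is a minimal base R sketch of how such portions are computed. The helper portion_of() is hypothetical and is not one of the package's portion_*() functions, which may apply additional checks that are not shown here.

```r
# Hypothetical helper (NOT the AMR package's portion_*() functions):
# the fraction of tested isolates whose result is one of `values`.
portion_of <- function(x, values) {
  x <- x[!is.na(x)]                  # only count isolates with a result
  if (length(x) == 0) return(NA_real_)
  sum(x %in% values) / length(x)
}

# Made-up results, for illustration only
results <- c("S", "S", "I", "R", "R", "R", NA, "S")

resistance     <- portion_of(results, "R")          # 3 of 7 tested isolates
susceptibility <- portion_of(results, c("S", "I"))  # 4 of 7 tested isolates
```

Because every tested isolate is either R or S/I, the two portions add up to 1.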

Or they can be used in conjunction with group_by() and summarise(), both from the dplyr package:

-
data_1st %>% 
-  group_by(hospital) %>% 
-  summarise(amoxicillin = portion_R(AMX))
+
data_1st %>% 
+  group_by(hospital) %>% 
+  summarise(amoxicillin = portion_R(AMX))
12012-05-14G6Hospital BB_ESCHR_COLIR2011-07-25L6Hospital DB_STPHY_AURSS S RRS MGram-negativeEscherichiacoliGram-positiveStaphylococcusaureus TRUE
32010-04-06F72012-04-04C3 Hospital B B_ESCHR_COLIIRSS SRR M Gram-negative EscherichiaTRUE
52017-05-22T6Hospital AB_STPHY_AURSR42017-10-06W8Hospital BB_ESCHR_COLIS S S S FGram-positiveStaphylococcusaureusGram-negativeEscherichiacoli TRUE
62012-06-06X752013-07-22Q6 Hospital DB_STPHY_AURSB_ESCHR_COLI RSI S S FGram-positiveStaphylococcusaureusGram-negativeEscherichiacoli TRUE
72010-02-15O1Hospital AB_STPHY_AURSSS62011-06-05W7Hospital BB_ESCHR_COLI R SSS FGram-positiveStaphylococcusaureusGram-negativeEscherichiacoli TRUE
92016-11-25C62013-04-24D1 Hospital BB_ESCHR_COLIB_STPHY_AURS S S S S MGram-negativeEscherichiacoliGram-positiveStaphylococcusaureus TRUE
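-1  Escherichia coli           7,512   49.61%    7,512    49.61%
+1  Escherichia coli           7,442   49.35%    7,442    49.35%
-2  Staphylococcus aureus      3,819   25.22%   11,331    74.83%
+2  Staphylococcus aureus      3,732   24.75%   11,174    74.10%
-3  Streptococcus pneumoniae   2,243   14.81%   13,574    89.64%
+3  Streptococcus pneumoniae   2,323   15.41%   13,497    89.51%
-4  Klebsiella pneumoniae      1,569   10.36%   15,143   100.00%
+4  Klebsiella pneumoniae      1,582   10.49%   15,079   100.00%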
@@ -993,27 +992,27 @@ Longest: 24

hospital     amoxicillin
-Hospital A  0.4672547
+Hospital A  0.4627198
-Hospital B  0.4670225
+Hospital B  0.4745283
-Hospital C  0.4648625
+Hospital C  0.4721269
-Hospital D  0.4669489
+Hospital D  0.4549763

Of course it would be very convenient to know the number of isolates responsible for the percentages. For that purpose the n_rsi() function can be used, which works exactly like n_distinct() from the dplyr package. It counts all isolates available for every group (i.e. values S, I or R):

-
data_1st %>% 
-  group_by(hospital) %>% 
-  summarise(amoxicillin = portion_R(AMX),
-            available = n_rsi(AMX))
+
data_1st %>% 
+  group_by(hospital) %>% 
+  summarise(amoxicillin = portion_R(AMX),
+            available = n_rsi(AMX))
@@ -1023,32 +1022,32 @@ Longest: 24

hospital     amoxicillin  available
-Hospital A  0.4672547    4535
+Hospital A  0.4627198    4493
-Hospital B  0.4670225    5246
+Hospital B  0.4745283    5300
-Hospital C  0.4648625    2291
+Hospital C  0.4721269    2332
-Hospital D  0.4669489    3071
+Hospital D  0.4549763    2954

These functions can also be used to get the portion of multiple antibiotics, to calculate empiric susceptibility of combination therapies very easily:

- + @@ -1059,94 +1058,94 @@ Longest: 24

genus
-Escherichia     0.9247870  0.8880458  0.9928115
+Escherichia     0.9250202  0.8972051  0.9965063
-Klebsiella      0.8132569  0.8986616  0.9859783
+Klebsiella      0.8160556  0.8982301  0.9841972
-Staphylococcus  0.9138518  0.9185651  0.9934538
+Staphylococcus  0.9217578  0.9161308  0.9930332
-Streptococcus   0.6290682  0.0000000  0.6290682
+Streptococcus   0.6147223  0.0000000  0.6147223
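The idea behind a combined susceptibility can be sketched in base R as follows. This is an illustration under explicit assumptions, not the package's implementation: an isolate is counted as covered when at least one of the two agents is S or I, and only isolates tested for both agents are included. The package's own handling of missing results may differ.

```r
# Made-up results for two agents, for illustration only
amc <- c("S", "R", "R", "I", "R", NA)
gen <- c("R", "S", "R", "S", "S", "S")

# Assumption: only isolates tested for both agents are counted
tested_both <- !is.na(amc) & !is.na(gen)

# Assumption: covered when at least one agent is S or I
covered <- (amc %in% c("S", "I")) | (gen %in% c("S", "I"))

combined_susceptibility <- mean(covered[tested_both])  # 4 of 5 isolates
```

Under these assumptions the combined value can never be lower than the susceptibility to either agent alone, computed over the same isolates.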

To make a transition to the next part, let’s see how this difference could be plotted:

-
data_1st %>% 
-  group_by(genus) %>% 
-  summarise("1. Amoxi/clav" = portion_SI(AMC),
-            "2. Gentamicin" = portion_SI(GEN),
-            "3. Amoxi/clav + genta" = portion_SI(AMC, GEN)) %>% 
-  tidyr::gather("antibiotic", "S", -genus) %>%
-  ggplot(aes(x = genus,
-             y = S,
-             fill = antibiotic)) +
-  geom_col(position = "dodge2")
+
data_1st %>% 
+  group_by(genus) %>% 
+  summarise("1. Amoxi/clav" = portion_SI(AMC),
+            "2. Gentamicin" = portion_SI(GEN),
+            "3. Amoxi/clav + genta" = portion_SI(AMC, GEN)) %>% 
+  tidyr::gather("antibiotic", "S", -genus) %>%
+  ggplot(aes(x = genus,
+             y = S,
+             fill = antibiotic)) +
+  geom_col(position = "dodge2")

Plots

To show results in plots, most R users would nowadays use the ggplot2 package. This package lets you create plots in layers. You can read more about it on their website. A quick example would look like these syntaxes:

-
ggplot(data = a_data_set,
-       mapping = aes(x = year,
-                     y = value)) +
-  geom_col() +
-  labs(title = "A title",
-       subtitle = "A subtitle",
-       x = "My X axis",
-       y = "My Y axis")
-
-# or as short as:
-ggplot(a_data_set) +
-  geom_bar(aes(year))
+
ggplot(data = a_data_set,
+       mapping = aes(x = year,
+                     y = value)) +
+  geom_col() +
+  labs(title = "A title",
+       subtitle = "A subtitle",
+       x = "My X axis",
+       y = "My Y axis")
+
+# or as short as:
+ggplot(a_data_set) +
+  geom_bar(aes(year))

The AMR package contains functions to extend this ggplot2 package, for example geom_rsi(). It automatically transforms data with count_df() or portion_df() and shows the results in stacked bars. Its simplest and shortest example:

-
ggplot(data_1st) +
-  geom_rsi(translate_ab = FALSE)
+
ggplot(data_1st) +
+  geom_rsi(translate_ab = FALSE)

Omit the translate_ab = FALSE to have the antibiotic codes (AMX, AMC, CIP, GEN) translated to official WHO names (amoxicillin, amoxicillin/clavulanic acid, ciprofloxacin, gentamicin).

If we group on e.g. the genus column and add some additional functions from our package, we can create this:

- +

To simplify this, we also created the ggplot_rsi() function, which combines almost all above functions:

- +

@@ -1154,33 +1153,33 @@ Longest: 24

Independence test

The next example uses the included example_isolates, which is an anonymised data set containing 2,000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. This data.frame can be used to practice AMR analysis.

We will compare the resistance to fosfomycin (column FOS) in hospital A and D. The input for the fisher.test() can be retrieved with a transformation like this:

-
check_FOS <- example_isolates %>%
-  filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D
-  select(hospital_id, FOS) %>%             # select the hospitals and fosfomycin
-  group_by(hospital_id) %>%                # group on the hospitals
-  count_df(combine_SI = TRUE) %>%          # count all isolates per group (hospital_id)
-  tidyr::spread(hospital_id, value) %>%    # transform output so A and D are columns
-  select(A, D) %>%                         # and select these only
-  as.matrix()                              # transform to good old matrix for fisher.test()
-
-check_FOS
-#       A  D
-# [1,] 25 77
-# [2,] 24 33
+
check_FOS <- example_isolates %>%
+  filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D
+  select(hospital_id, FOS) %>%             # select the hospitals and fosfomycin
+  group_by(hospital_id) %>%                # group on the hospitals
+  count_df(combine_SI = TRUE) %>%          # count all isolates per group (hospital_id)
+  tidyr::spread(hospital_id, value) %>%    # transform output so A and D are columns
+  select(A, D) %>%                         # and select these only
+  as.matrix()                              # transform to good old matrix for fisher.test()
+
+check_FOS
+#       A  D
+# [1,] 25 77
+# [2,] 24 33

We can apply the test now with:

- +

As can be seen, the p value is 0.031, which is below the conventional significance level of 0.05. This suggests that the fosfomycin resistance found in hospitals A and D differs significantly.
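The test can be reproduced from the counts printed above with base R alone. This is a minimal sketch that re-enters the 2x2 matrix by hand instead of deriving it from example_isolates; the variable names fos_counts and fos_test are chosen only for this illustration.

```r
# Counts copied from the check_FOS matrix printed above
# (columns are hospitals A and D)
fos_counts <- matrix(c(25, 24, 77, 33),
                     nrow = 2,
                     dimnames = list(NULL, c("A", "D")))

# Fisher's exact test from the stats package (attached by default)
fos_test <- fisher.test(fos_counts)
fos_test$p.value
```

The p value is stored in the p.value element of the returned object, so it can be compared against a chosen significance level programmatically.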

diff --git a/docs/articles/AMR_files/figure-html/plot 1-1.png b/docs/articles/AMR_files/figure-html/plot 1-1.png index a144724e..5b35bfce 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 1-1.png and b/docs/articles/AMR_files/figure-html/plot 1-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 3-1.png b/docs/articles/AMR_files/figure-html/plot 3-1.png index 20422eb8..2356f842 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 3-1.png and b/docs/articles/AMR_files/figure-html/plot 3-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 4-1.png b/docs/articles/AMR_files/figure-html/plot 4-1.png index a48b12f0..a932b081 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 4-1.png and b/docs/articles/AMR_files/figure-html/plot 4-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 5-1.png b/docs/articles/AMR_files/figure-html/plot 5-1.png index a83c0cd7..cebd646f 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 5-1.png and b/docs/articles/AMR_files/figure-html/plot 5-1.png differ diff --git a/docs/articles/EUCAST.html b/docs/articles/EUCAST.html index f526acd5..1b8722e5 100644 --- a/docs/articles/EUCAST.html +++ b/docs/articles/EUCAST.html @@ -41,7 +41,7 @@ AMR (for R) - 0.8.0 + 0.8.0.9021
@@ -187,7 +187,7 @@

How to apply EUCAST rules

Matthijs S. Berends

-

16 October 2019

+

09 November 2019

@@ -204,17 +204,24 @@

EUCAST expert rules are a tabulated collection of expert knowledge on intrinsic resistances, exceptional resistance phenotypes and interpretive rules that may be applied to antimicrobial susceptibility testing in order to reduce errors and make appropriate recommendations for reporting particular resistances.

In Europe, a lot of medical microbiological laboratories already apply these rules (Brown et al., 2015). Our package features their latest insights on intrinsic resistance and exceptional phenotypes (version 9.0, 2019). Moreover, the eucast_rules() function we use for this purpose can also apply additional rules, like forcing ampicillin = R in isolates when amoxicillin/clavulanic acid = R.

-

(more will be available soon)

-
-

-Benefit for empiric therapy success estimation

-

(will be available soon)

-

Examples

-

(will be available soon)

+

These rules can be used to discard impossible bug-drug combinations in your data. For example, Klebsiella produces beta-lactamase that prevents ampicillin (or amoxicillin) from working against it. In other words, every strain of Klebsiella is resistant to ampicillin.

+

Sometimes, laboratory data can still contain such strains reported as susceptible to ampicillin. This could be because an antibiogram is available before an identification is available, and the antibiogram is then not re-interpreted based on the identification (namely, Klebsiella). The EUCAST expert rules solve this:

+
oops <- data.frame(mo = c("Klebsiella", 
+                          "Escherichia"),
+                   ampicillin = "S")
+oops
+#            mo ampicillin
+# 1  Klebsiella          S
+# 2 Escherichia          S
+
+eucast_rules(oops, info = FALSE)
+#            mo ampicillin
+# 1  Klebsiella          R
+# 2 Escherichia          S
diff --git a/docs/articles/MDR.html b/docs/articles/MDR.html index 358937dc..647bb3eb 100644 --- a/docs/articles/MDR.html +++ b/docs/articles/MDR.html @@ -41,7 +41,7 @@ AMR (for R) - 0.8.0 + 0.8.0.9021 @@ -187,7 +187,7 @@

How to determine multi-drug resistance (MDR)

Matthijs S. Berends

-

16 October 2019

+

09 November 2019

@@ -196,67 +196,148 @@ -

With the function mdro(), you can determine multi-drug resistant organisms (MDRO). It currently support these guidelines:

+

With the function mdro(), you can determine multi-drug resistant organisms (MDRO).

+
+

+Type of input

+

The mdro() function takes a data set as input, such as a regular data.frame. It automatically determines the right columns for info about your isolates, like the name of the species and all columns with results of antimicrobial agents. See the help page for more info about how to set the right settings for your data with the command ?mdro.

+

For WHONET data (and most other data), all settings are automatically set correctly.

+
+
+

+Guidelines

+

The function supports multiple guidelines. You can select a guideline with the guideline parameter. Currently supported guidelines are (case-insensitive):

-

As an example, I will make a data set to determine multi-drug resistant TB:

-
# a helper function to get a random vector with values S, I and R
-# with the probabilities 50% - 10% - 40%
-sample_rsi <- function() {
-  sample(c("S", "I", "R"),
-         size = 5000,
-         prob = c(0.5, 0.1, 0.4),
-         replace = TRUE)
-}
-
-my_TB_data <- data.frame(rifampicin = sample_rsi(),
-                         isoniazid = sample_rsi(),
-                         gatifloxacin = sample_rsi(),
-                         ethambutol = sample_rsi(),
-                         pyrazinamide = sample_rsi(),
-                         moxifloxacin = sample_rsi(),
-                         kanamycin = sample_rsi())
+
+
+

+Examples

+

The mdro() function always returns an ordered factor. For example, the default guideline by Magiorakos et al. returns a factor with the levels ‘Negative’, ‘MDR’, ‘XDR’ and ‘PDR’, in that order. If we test that guideline on the included example_isolates data set, we get:

+
library(dplyr) # to support pipes: %>%
+
example_isolates %>% 
+  mdro() %>% 
+  freq() # show frequency table of the result
+# NOTE: Using column `mo` as input for `col_mo`.
+# NOTE: Auto-guessing columns suitable for analysis...OK.
+# NOTE: Reliability will be improved if these antimicrobial results would be available too: SAM (ampicillin/sulbactam), ATM (aztreonam), CTT (cefotetan), CPT (ceftaroline), DAP (daptomycin), DOR (doripenem), ETP (ertapenem), FUS (fusidic acid), GEH (gentamicin-high), LVX (levofloxacin), MNO (minocycline), NET (netilmicin), PLB (polymyxin B), QDA (quinupristin/dalfopristin), STH (streptomycin-high), TLV (telavancin), TCC (ticarcillin/clavulanic acid)
+# Table 1 - S. aureus ... OK
+# Table 2 - Enterococcus spp. ... OK
+# Table 3 - Enterobacteriaceae ... OK
+# Table 4 - Pseudomonas aeruginosa ... OK
+# Table 5 - Acinetobacter spp. ... OK
+# Warning in mdro(.): NA introduced for isolates where the available
+# percentage of antimicrobial classes was below 50% (set with
+# `pct_required_classes`)
+

Frequency table

+

Class: factor > ordered (numeric)
+Length: 2,000 (of which NA: 289 = 14.45%)
+Levels: 4: Negative < Multi-drug-resistant (MDR) < Extensively drug-resistant …
+Unique: 2

|   | Item                       | Count | Percent | Cum. Count | Cum. Percent |
|---|----------------------------|-------|---------|------------|--------------|
| 1 | Negative                   | 1596  | 93.28%  | 1596       | 93.28%       |
| 2 | Multi-drug-resistant (MDR) | 115   | 6.72%   | 1711       | 100.00%      |
+
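Because the result is an ordered factor, its levels can be compared directly. The base R sketch below uses made-up values and the short level names from the text; the labels that mdro() actually returns are longer, as the frequency table shows, so treat the names here as placeholders.

```r
# made-up results, using short placeholder level names
res <- factor(c("Negative", "MDR", "Negative", "XDR"),
              levels = c("Negative", "MDR", "XDR", "PDR"),
              ordered = TRUE)

res >= "MDR"   # TRUE for every isolate that is at least MDR
table(res)     # counts per level, in level order
```

This kind of comparison is handy for filtering, for example keeping only the rows of a data set where the result is at least MDR.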

For another example, I will create a data set to determine multi-drug resistant TB:

+
# a helper function to get a random vector with values S, I and R
+# with the probabilities 50% - 10% - 40%
+sample_rsi <- function() {
+  sample(c("S", "I", "R"),
+         size = 5000,
+         prob = c(0.5, 0.1, 0.4),
+         replace = TRUE)
+}
+
+my_TB_data <- data.frame(rifampicin = sample_rsi(),
+                         isoniazid = sample_rsi(),
+                         gatifloxacin = sample_rsi(),
+                         ethambutol = sample_rsi(),
+                         pyrazinamide = sample_rsi(),
+                         moxifloxacin = sample_rsi(),
+                         kanamycin = sample_rsi())

Because all column names are automatically verified for valid drug names or codes, this would have worked exactly the same:

-
my_TB_data <- data.frame(RIF = sample_rsi(),
-                         INH = sample_rsi(),
-                         GAT = sample_rsi(),
-                         ETH = sample_rsi(),
-                         PZA = sample_rsi(),
-                         MFX = sample_rsi(),
-                         KAN = sample_rsi())
-

The data set looks like this now:

-
head(my_TB_data)
-#   rifampicin isoniazid gatifloxacin ethambutol pyrazinamide moxifloxacin
-# 1          S         S            R          S            R            S
-# 2          S         R            R          S            S            R
-# 3          R         R            S          S            R            S
-# 4          R         R            R          S            S            S
-# 5          R         R            R          R            R            R
-# 6          R         R            R          I            S            R
-#   kanamycin
-# 1         I
-# 2         S
-# 3         R
-# 4         S
-# 5         R
-# 6         S
-

We can now add the interpretation of MDR-TB to our data set:

-
my_TB_data$mdr <- mdr_tb(my_TB_data)
-# NOTE: No column found as input for `col_mo`, assuming all records contain Mycobacterium tuberculosis.
-# Determining multidrug-resistant organisms (MDRO), according to:
-# Guideline: Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis
-# Version:   WHO/HTM/TB/2014.11
-# Author:    WHO (World Health Organization)
-# Source:    https://www.who.int/tb/publications/pmdt_companionhandbook/en/
-# NOTE: Auto-guessing columns suitable for analysis...
-# NOTE: Reliability might be improved if these antimicrobial results would be available too: CAP (capreomycin), RIB (rifabutin), RFP (rifapentine)
-

We also created a package dedicated to data cleaning and checking, called the clean package. It gets automatically installed with the AMR package, so we only have to load it:

-
library(clean)
-

It contains the freq() function, to create a frequency table:

-
freq(my_TB_data$mdr)
+
my_TB_data <- data.frame(RIF = sample_rsi(),
+                         INH = sample_rsi(),
+                         GAT = sample_rsi(),
+                         ETH = sample_rsi(),
+                         PZA = sample_rsi(),
+                         MFX = sample_rsi(),
+                         KAN = sample_rsi())
+

The data set now looks like this:

+
head(my_TB_data)
+#   rifampicin isoniazid gatifloxacin ethambutol pyrazinamide moxifloxacin
+# 1          S         S            R          R            S            R
+# 2          S         R            S          S            R            S
+# 3          R         S            R          S            R            R
+# 4          S         S            R          S            S            I
+# 5          R         R            S          R            S            S
+# 6          S         S            R          R            R            S
+#   kanamycin
+# 1         S
+# 2         S
+# 3         R
+# 4         R
+# 5         S
+# 6         R
+
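As a quick sanity check, the observed proportions of the helper can be compared with the stated 50% - 10% - 40%. This is a base R sketch; the seed and the object name `x` are assumptions added only to make the check reproducible.

```r
set.seed(123) # only to make this sketch reproducible

# same helper as defined above
sample_rsi <- function() {
  sample(c("S", "I", "R"),
         size = 5000,
         prob = c(0.5, 0.1, 0.4),
         replace = TRUE)
}

x <- sample_rsi()
# observed share of each value; should be close to S = 0.5, I = 0.1, R = 0.4
round(prop.table(table(x)), 2)
```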

We can now add the interpretation of MDR-TB to our data set. You can use:

+
mdro(my_TB_data, guideline = "TB")
+

or its shortcut mdr_tb():

+
my_TB_data$mdr <- mdr_tb(my_TB_data)
+# NOTE: No column found as input for `col_mo`, assuming all records contain Mycobacterium tuberculosis.
+# NOTE: Auto-guessing columns suitable for analysis...OK.
+# NOTE: Reliability will be improved if these antimicrobial results would be available too: CAP (capreomycin), RIB (rifabutin), RFP (rifapentine)
+# 
+# Only results with 'R' are considered as resistance. Use `combine_SI = FALSE` to also consider 'I' as resistance.
+# 
+# Determining multidrug-resistant organisms (MDRO), according to:
+# Guideline: Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis
+# Version:   WHO/HTM/TB/2014.11
+# Author:    WHO (World Health Organization)
+# Source:    https://www.who.int/tb/publications/pmdt_companionhandbook/en/
+# 
+# => Found 4371 MDROs out of 5000 tested isolates (87.4%)
+

Create a frequency table of the results:

+
freq(my_TB_data$mdr)

Frequency table

Class: factor > ordered (numeric)
Length: 5,000 (of which NA: 0 = 0%)
@@ -275,46 +356,47 @@ Unique: 5

 1 Mono-resistant
-3276 65.52% 3276 65.52%
+3301 66.02% 3301 66.02%
 2 Negative
-658 13.16% 3934 78.68%
+629 12.58% 3930 78.60%
 3 Multi-drug-resistant
-616 12.32% 4550 91.00%
+596 11.92% 4526 90.52%
 4 Poly-resistant
-256 5.12% 4806 96.12%
+282 5.64% 4808 96.16%
 5
-Extensive drug-resistant 194 3.88%
+Extensively drug-resistant 192 3.84%
 5000 100.00%
+ @@ -187,7 +187,7 @@

How to work with WHONET data

Matthijs S. Berends

-

16 October 2019

+

09 November 2019

@@ -196,18 +196,18 @@ -
-

-Import of data

+
+

+Import of data

This tutorial assumes you already imported the WHONET data with e.g. the readxl package. In RStudio, this can be done using the menu button ‘Import Dataset’ in the tab ‘Environment’. Choose the option ‘From Excel’ and select your exported file. Make sure date fields are imported correctly.

An example syntax could look like this:

library(readxl)
 data <- read_excel(path = "path/to/your/file.xlsx")

This package comes with an example data set WHONET. We will use it for this analysis.

-
-

-Preparation

+
+

+Preparation

First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don’t know it yet, I suggest you read about it on their website: https://www.tidyverse.org/.

library(dplyr)   # part of tidyverse
 library(ggplot2) # part of tidyverse
@@ -224,18 +224,16 @@
   # transform everything from "AMP_ND10" to "CIP_EE" to the new `rsi` class
   mutate_at(vars(AMP_ND10:CIP_EE), as.rsi)

No errors or warnings, so all values are transformed successfully.

-

We created a package dedicated to data cleaning and checking, called the clean package. It gets automatically installed with the AMR package, so we only have to load it:

-
library(clean)
-

It contains the freq() function, to create frequency tables.

+

We also created a package dedicated to data cleaning and checking, called the cleaner package. It is installed automatically with the AMR package. You don’t even need to load it yourself to create frequency tables, because its freq() function is also available through the AMR package.

So let’s check our data, with a couple of frequency tables:

- +

Frequency table

Class: mo (character)
Length: 500 (of which NA: 0 = 0%)
Unique: 39

-

Gram-negative: 281 (56.20%)
-Gram-positive: 219 (43.80%)
+

Gram-negative: 280 (56.00%)
+Gram-positive: 220 (44.00%)
Nr of genera: 17
Nr of species: 39

@@ -331,10 +329,10 @@ Nr of species: 39

(omitted 29 entries, n = 57 [11.40%])

- +

Frequency table

Class: factor > ordered > rsi (numeric)
Length: 500 (of which NA: 19 = 3.8%)
@@ -378,24 +376,17 @@ Unique: 3

-
-

-Analysis

-

(more will be available soon)

+
+

+A first glimpse at results

+

An easy ggplot will already give a lot of information, using the included ggplot_rsi() function:

+
ggplot_rsi(data, translate_ab = 'ab')
+

diff --git a/docs/articles/WHONET_files/figure-html/unnamed-chunk-5-1.png b/docs/articles/WHONET_files/figure-html/unnamed-chunk-5-1.png new file mode 100644 index 00000000..35d2a3ad Binary files /dev/null and b/docs/articles/WHONET_files/figure-html/unnamed-chunk-5-1.png differ diff --git a/docs/articles/index.html b/docs/articles/index.html index 90367097..c00a4829 100644 --- a/docs/articles/index.html +++ b/docs/articles/index.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021
diff --git a/docs/authors.html b/docs/authors.html index dd7f8996..9e5763ff 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/index.html b/docs/index.html index 00399fb6..cf842455 100644 --- a/docs/index.html +++ b/docs/index.html @@ -45,7 +45,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 @@ -194,7 +194,7 @@

(TLDR - to find out how to conduct AMR analysis, please continue reading here to get started.)

18 October 2019
METHODS PAPER PREPRINTED
-A methods paper about this package has been preprinted at bioRxiv. Please see here for the publishers page or click here for the PDF.

+A methods paper about this package has been preprinted at bioRxiv. It was updated on 8 November 2019. Please click here for the publisher's page.


diff --git a/docs/news/index.html b/docs/news/index.html index b35615e7..bdbabb3f 100644 --- a/docs/news/index.html +++ b/docs/news/index.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021
@@ -231,11 +231,11 @@ -
+

-AMR 0.8.0.9017 Unreleased +AMR 0.8.0.9021 Unreleased

-

Last updated: 06-Nov-2019

+

Last updated: 09-Nov-2019

New

@@ -1333,7 +1333,7 @@ Using as.mo(..., allow_uncertain = 3)

Contents

diff --git a/docs/reference/age_groups.html b/docs/reference/age_groups.html index 6a51f1f1..23bd0faa 100644 --- a/docs/reference/age_groups.html +++ b/docs/reference/age_groups.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021
diff --git a/docs/reference/as.ab.html b/docs/reference/as.ab.html index 33af5584..b7ea7b06 100644 --- a/docs/reference/as.ab.html +++ b/docs/reference/as.ab.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021
diff --git a/docs/reference/as.disk.html b/docs/reference/as.disk.html index 3c80a806..0ea7e279 100644 --- a/docs/reference/as.disk.html +++ b/docs/reference/as.disk.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/as.mic.html b/docs/reference/as.mic.html index a6cb3475..69dbea2e 100644 --- a/docs/reference/as.mic.html +++ b/docs/reference/as.mic.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/as.rsi.html b/docs/reference/as.rsi.html index 67d19b4e..414af339 100644 --- a/docs/reference/as.rsi.html +++ b/docs/reference/as.rsi.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/catalogue_of_life_version.html b/docs/reference/catalogue_of_life_version.html index c9855855..72d886df 100644 --- a/docs/reference/catalogue_of_life_version.html +++ b/docs/reference/catalogue_of_life_version.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9008 + 0.8.0.9021 diff --git a/docs/reference/count.html b/docs/reference/count.html index 09129b69..5ef02161 100644 --- a/docs/reference/count.html +++ b/docs/reference/count.html @@ -86,7 +86,7 @@ count_R and count_IR can be used to count resistant isolates, count_S and count_ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/eucast_rules.html b/docs/reference/eucast_rules.html index c722929d..3597c683 100644 --- a/docs/reference/eucast_rules.html +++ b/docs/reference/eucast_rules.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/filter_ab_class.html b/docs/reference/filter_ab_class.html index 070f0b67..f3c0e4e0 100644 --- a/docs/reference/filter_ab_class.html +++ b/docs/reference/filter_ab_class.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/first_isolate.html b/docs/reference/first_isolate.html index fb370aed..407c4177 100644 --- a/docs/reference/first_isolate.html +++ b/docs/reference/first_isolate.html @@ -85,7 +85,7 @@ AMR (for 
R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/g.test.html b/docs/reference/g.test.html index b0f06953..837ddafd 100644 --- a/docs/reference/g.test.html +++ b/docs/reference/g.test.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/ggplot_rsi.html b/docs/reference/ggplot_rsi.html index a8a78244..a37117a4 100644 --- a/docs/reference/ggplot_rsi.html +++ b/docs/reference/ggplot_rsi.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9009 + 0.8.0.9021 diff --git a/docs/reference/index.html b/docs/reference/index.html index dfbbd633..c75b37b0 100644 --- a/docs/reference/index.html +++ b/docs/reference/index.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/like.html b/docs/reference/like.html index 96f8b106..ff0ad7ea 100644 --- a/docs/reference/like.html +++ b/docs/reference/like.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/mdro.html b/docs/reference/mdro.html index 911b415f..55982823 100644 --- a/docs/reference/mdro.html +++ b/docs/reference/mdro.html @@ -85,7 +85,7 @@ AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/portion.html b/docs/reference/portion.html index 4cc9fdb4..e66aec34 100644 --- a/docs/reference/portion.html +++ b/docs/reference/portion.html @@ -86,7 +86,7 @@ portion_R and portion_IR can be used to calculate resistance, portion_S and port AMR (for R) - 0.8.0.9017 + 0.8.0.9021 diff --git a/docs/reference/reexports.html b/docs/reference/reexports.html index 985e135f..c4947971 100644 --- a/docs/reference/reexports.html +++ b/docs/reference/reexports.html @@ -90,7 +90,7 @@ below to see their documentation. AMR (for R) - 0.8.0.9008 + 0.8.0.9021 diff --git a/index.md b/index.md index 26ceeec1..6c5af0ac 100644 --- a/index.md +++ b/index.md @@ -4,7 +4,7 @@ > *18 October 2019* > **METHODS PAPER PREPRINTED** -> A methods paper about this package has been preprinted at bioRxiv. 
Please see [here for the publishers page](https://doi.org/10.1101/810622) or [click here for the PDF](https://www.biorxiv.org/content/early/2019/10/18/810622.full.pdf). +> A methods paper about this package has been preprinted at bioRxiv. It was updated on 8 November 2019. Please click [here for the publishers page](https://doi.org/10.1101/810622). ---- diff --git a/vignettes/AMR.Rmd b/vignettes/AMR.Rmd index 3dae082c..45553ba8 100755 --- a/vignettes/AMR.Rmd +++ b/vignettes/AMR.Rmd @@ -144,13 +144,9 @@ Now, let's start the cleaning and the analysis! # Cleaning the data -We also created a package dedicated to data cleaning and checking, called the `clean` package. It gets automatically installed with the `AMR` package, so we only have to load it: +We also created a package dedicated to data cleaning and checking, called the `cleaner` package. It gets automatically installed with the `AMR` package. For its `freq()` function to create frequency tables, you don't even need to load it yourself as it is available through the `AMR` package as well. -```{r lib clean, message = FALSE} -library(clean) -``` - -Use the frequency table function `freq()` from this `clean` package to look specifically for unique values in any variable. For example, for the `gender` variable: +For example, for the `gender` variable: ```{r freq gender 1, eval = FALSE} data %>% freq(gender) # this would be the same: freq(data$gender) @@ -210,7 +206,7 @@ data <- data %>% mutate(first = first_isolate(.)) ``` -So only `r AMR:::percentage(sum(data$first) / nrow(data))` is suitable for resistance analysis! We can now filter on it with the `filter()` function, also from the `dplyr` package: +So only `r cleaner::percentage(sum(data$first) / nrow(data))` is suitable for resistance analysis! 
We can now filter on it with the `filter()` function, also from the `dplyr` package: ```{r 1st isolate filter} data_1st <- data %>% @@ -230,7 +226,7 @@ data_1st <- data %>% weighted_df <- data %>% filter(bacteria == as.mo("E. coli")) %>% # only most prevalent patient - filter(patient_id == top_freq(freq(., patient_id), 1)[1]) %>% + filter(patient_id == cleaner::top_freq(freq(., patient_id), 1)[1]) %>% arrange(date) %>% select(date, patient_id, bacteria, AMX:GEN, first) %>% # maximum of 10 rows @@ -260,7 +256,7 @@ data <- data %>% weighted_df2 <- data %>% filter(bacteria == as.mo("E. coli")) %>% # only most prevalent patient - filter(patient_id == top_freq(freq(., patient_id), 1)[1]) %>% + filter(patient_id == cleaner::top_freq(freq(., patient_id), 1)[1]) %>% arrange(date) %>% select(date, patient_id, bacteria, AMX:GEN, first, first_weighted) %>% # maximum of 10 rows @@ -272,7 +268,7 @@ weighted_df2 %>% knitr::kable(align = "c") ``` -Instead of `r sum(weighted_df$first)`, now `r sum(weighted_df2$first_weighted)` isolates are flagged. In total, `r AMR:::percentage(sum(data$first_weighted) / nrow(data))` of all isolates are marked 'first weighted' - `r AMR:::percentage((sum(data$first_weighted) / nrow(data)) - (sum(data$first) / nrow(data)))` more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline. +Instead of `r sum(weighted_df$first)`, now `r sum(weighted_df2$first_weighted)` isolates are flagged. In total, `r cleaner::percentage(sum(data$first_weighted) / nrow(data))` of all isolates are marked 'first weighted' - `r cleaner::percentage((sum(data$first_weighted) / nrow(data)) - (sum(data$first) / nrow(data)))` more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline. 
 As with `filter_first_isolate()`, there's a shortcut for this new algorithm too:
 ```{r 1st isolate filter 3, results = 'hide', message = FALSE, warning = FALSE}
diff --git a/vignettes/EUCAST.Rmd b/vignettes/EUCAST.Rmd
index 4db83445..88208167 100644
--- a/vignettes/EUCAST.Rmd
+++ b/vignettes/EUCAST.Rmd
@@ -21,6 +21,7 @@ knitr::opts_chunk$set(
   fig.width = 7.5,
   fig.height = 4.5
 )
+library(AMR)
 ```

 ## Introduction
@@ -31,12 +32,17 @@ What are EUCAST rules? The European Committee on Antimicrobial Susceptibility Te

 In Europe, a lot of medical microbiological laboratories already apply these rules ([Brown *et al.*, 2015](https://www.eurosurveillance.org/content/10.2807/1560-7917.ES2015.20.2.21008)). Our package features their latest insights on intrinsic resistance and exceptional phenotypes (version 9.0, 2019). Moreover, the `eucast_rules()` function we use for this purpose can also apply additional rules, like forcing ampicillin = R in isolates when amoxicillin/clavulanic acid = R.

-*(more will be available soon)*
-
-### Benefit for empiric therapy success estimation
-
-*(will be available soon)*
-
 ## Examples

-*(will be available soon)*
+These rules can be used to discard impossible bug-drug combinations in your data. For example, *Klebsiella* produces beta-lactamase that prevents ampicillin (or amoxicillin) from working against it. In other words, every strain of *Klebsiella* is resistant to ampicillin.
+
+Sometimes, laboratory data can still contain such strains reported as susceptible to ampicillin. This could be because an antibiogram is available before an identification is available, and the antibiogram is then not re-interpreted based on the identification (namely, *Klebsiella*). The EUCAST expert rules solve this:
+
+```{r, warning = FALSE, message = FALSE}
+oops <- data.frame(mo = c("Klebsiella",
+                          "Escherichia"),
+                   ampicillin = "S")
+oops
+
+eucast_rules(oops, info = FALSE)
+```
diff --git a/vignettes/MDR.Rmd b/vignettes/MDR.Rmd
index e0088482..aa57f0d2 100644
--- a/vignettes/MDR.Rmd
+++ b/vignettes/MDR.Rmd
@@ -21,13 +21,54 @@ knitr::opts_chunk$set(
 library(AMR)
 ```

-With the function `mdro()`, you can determine multi-drug resistant organisms (MDRO). It currently support these guidelines:
+With the function `mdro()`, you can determine multi-drug resistant organisms (MDRO).

-* "Intrinsic Resistance and Exceptional Phenotypes Tables", by EUCAST (European Committee on Antimicrobial Susceptibility Testing)
-* "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis", by WHO (World Health Organization)
-* "WIP-Richtlijn Bijzonder Resistente Micro-organismen (BRMO)", by RIVM (Rijksinstituut voor de Volksgezondheid, the Netherlands National Institute for Public Health and the Environment)
+#### Type of input

-As an example, I will make a data set to determine multi-drug resistant TB:
+The `mdro()` function takes a data set as input, such as a regular `data.frame`. It automatically determines the right columns for info about your isolates, like the name of the species and all columns with results of antimicrobial agents. See the help page for more info about how to set the right settings for your data with the command `?mdro`.
+
+For WHONET data (and most other data), all settings are automatically set correctly.
+
+#### Guidelines
+
+The function supports multiple guidelines. You can select a guideline with the `guideline` parameter.
Currently supported guidelines are (case-insensitive): + +* `guideline = "CMI2012"` (default) + + Magiorakos AP, Srinivasan A *et al.* "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." Clinical Microbiology and Infection (2012) ([link](https://www.clinicalmicrobiologyandinfection.com/article/S1198-743X(14)61632-3/fulltext)) +* `guideline = "EUCAST"` + + The European international guideline - EUCAST Expert Rules Version 3.1 "Intrinsic Resistance and Exceptional Phenotypes Tables" ([link](http://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf)) +* `guideline = "TB"` + + The international guideline for multi-drug resistant tuberculosis - World Health Organization "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis" ([link](https://www.who.int/tb/publications/pmdt_companionhandbook/en/)) +* `guideline = "MRGN"` + + The German national guideline - Mueller et al. (2015) Antimicrobial Resistance and Infection Control 4:7. ([link](https://doi.org/10.1186/s13756-015-0047-6)) +* `guideline = "BRMO"` + + The Dutch national guideline - Rijksinstituut voor Volksgezondheid en Milieu "WIP-richtlijn BRMO (Bijzonder Resistente Micro-Organismen) [ZKH]" ([link](https://www.rivm.nl/Documenten_en_publicaties/Professioneel_Praktisch/Richtlijnen/Infectieziekten/WIP_Richtlijnen/WIP_Richtlijnen/Ziekenhuizen/WIP_richtlijn_BRMO_Bijzonder_Resistente_Micro_Organismen_ZKH)) + +#### Examples + +The `mdro()` function always returns an ordered `factor`. For example, the output of the default guideline by Magiorakos *et al.* returns a `factor` with levels 'Negative', 'MDR', 'XDR' or 'PDR' in that order. 
If we test that guideline on the included `example_isolates` data set, we get: + +```{r, message = FALSE} +library(dplyr) # to support pipes: %>% +``` +```{r, results = 'hide'} +example_isolates %>% + mdro() %>% + freq() # show frequency table of the result +``` +```{r, echo = FALSE, results = 'asis', message = FALSE, warning = FALSE} +library(dplyr) # to support pipes: %>% +example_isolates %>% + mdro(info = FALSE) %>% + freq() # show frequency table of the result +``` + +For another example, I will create a data set to determine multi-drug resistant TB: ```{r} # a helper function to get a random vector with values S, I and R @@ -60,25 +101,25 @@ my_TB_data <- data.frame(RIF = sample_rsi(), KAN = sample_rsi()) ``` -The data set looks like this now: +The data set now looks like this: ```{r} head(my_TB_data) ``` -We can now add the interpretation of MDR-TB to our data set: +We can now add the interpretation of MDR-TB to our data set. You can use: + +```r +mdro(my_TB_data, guideline = "TB") +``` + +or its shortcut `mdr_tb()`: ```{r} my_TB_data$mdr <- mdr_tb(my_TB_data) ``` -We also created a package dedicated to data cleaning and checking, called the `clean` package. It gets automatically installed with the `AMR` package, so we only have to load it: - -```{r lib clean, message = FALSE} -library(clean) -``` - -It contains the `freq()` function, to create a frequency table: +Create a frequency table of the results: ```{r, results = 'asis'} freq(my_TB_data$mdr) diff --git a/vignettes/WHONET.Rmd b/vignettes/WHONET.Rmd index 7202547b..478db8a4 100644 --- a/vignettes/WHONET.Rmd +++ b/vignettes/WHONET.Rmd @@ -23,7 +23,7 @@ knitr::opts_chunk$set( ) ``` -# Import of data +### Import of data This tutorial assumes you already imported the WHONET data with e.g. the [`readxl` package](https://readxl.tidyverse.org/). In RStudio, this can be done using the menu button 'Import Dataset' in the tab 'Environment'. Choose the option 'From Excel' and select your exported file. 
Make sure date fields are imported correctly.
@@ -36,7 +36,7 @@ data <- read_excel(path = "path/to/your/file.xlsx")

 This package comes with an [example data set `WHONET`](https://msberends.gitlab.io/AMR/reference/WHONET.html). We will use it for this analysis.

-# Preparation
+### Preparation

 First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don't know it yet, I suggest you read about it on their website: https://www.tidyverse.org/.

@@ -62,13 +62,7 @@ data <- WHONET %>%

 No errors or warnings, so all values are transformed successfully.

-We created a package dedicated to data cleaning and checking, called the `clean` package. It gets automatically installed with the `AMR` package, so we only have to load it:
-
-```{r lib clean, message = FALSE}
-library(clean)
-```
-
-It contains the `freq()` function, to create frequency tables.
+We also created a package dedicated to data cleaning and checking, called the `cleaner` package. It is installed automatically with the `AMR` package. You don't even need to load it yourself to create frequency tables, because its `freq()` function is also available through the `AMR` package.

 So let's check our data, with a couple of frequency tables:

@@ -81,7 +75,10 @@ data %>% freq(mo, nmax = 10)
 data %>% freq(AMC_ND2)
 ```

-# Analysis
+### A first glimpse at results

-*(more will be available soon)*
+An easy ggplot will already give a lot of information, using the included `ggplot_rsi()` function:
+```{r}
+ggplot_rsi(data, translate_ab = 'ab')
+```