diff --git a/DESCRIPTION b/DESCRIPTION index ff87372b..6635ea05 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 0.8.0.9029 -Date: 2019-11-10 +Version: 0.8.0.9030 +Date: 2019-11-11 Title: Antimicrobial Resistance Analysis Authors@R: c( person(role = c("aut", "cre"), @@ -47,7 +47,7 @@ Imports: microbenchmark, pillar, rlang (>= 0.3.1), - tidyr (>= 0.7.0) + tidyr (>= 1.0.0) Suggests: covr (>= 3.0.1), curl, diff --git a/NAMESPACE b/NAMESPACE index 9e49e241..d6109758 100755 --- a/NAMESPACE +++ b/NAMESPACE @@ -321,8 +321,8 @@ importFrom(stats,glm) importFrom(stats,lm) importFrom(stats,pchisq) importFrom(stats,predict) -importFrom(tidyr,gather) -importFrom(tidyr,spread) +importFrom(tidyr,pivot_longer) +importFrom(tidyr,pivot_wider) importFrom(utils,browseURL) importFrom(utils,menu) importFrom(utils,read.csv) diff --git a/NEWS.md b/NEWS.md index b3afc516..dca3b445 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,8 +1,16 @@ -# AMR 0.8.0.9029 -Last updated: 10-Nov-2019 +# AMR 0.8.0.9030 +Last updated: 11-Nov-2019 ### New -* Functions `susceptibility()` and `resistance()` as aliases of `proportion_SI()` and `proportion_R()`, respectively. These functions were added to make it more clear that I should be considered susceptible and not resistant. +* Functions `susceptibility()` and `resistance()` as aliases of `proportion_SI()` and `proportion_R()`, respectively. These functions were added to make it more clear that "I" should be considered susceptible and not resistant. + ```r + library(dplyr) + example_isolates %>% + group_by(bug = mo_name(mo)) %>% + summarise(amoxicillin = resistance(AMX), + amox_clav = resistance(AMC)) %>% + filter(!is.na(amoxicillin) | !is.na(amox_clav)) + ``` * Support for a new MDRO guideline: Magiorakos AP, Srinivasan A *et al.* "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." 
Clinical Microbiology and Infection (2012). * This is now the new default guideline for the `mdro()` function * The new Verbose mode (`mdro(...., verbose = TRUE)`) returns an informative data set where the reason for MDRO determination is given for every isolate, and an list of the resistant antimicrobial agents diff --git a/R/bug_drug_combinations.R b/R/bug_drug_combinations.R index bcdd3728..41c19b18 100644 --- a/R/bug_drug_combinations.R +++ b/R/bug_drug_combinations.R @@ -32,7 +32,7 @@ #' @inheritParams rsi_df #' @inheritParams base::formatC #' @importFrom dplyr %>% rename group_by select mutate filter summarise ungroup -#' @importFrom tidyr spread +#' @importFrom tidyr pivot_longer #' @details The function \code{format} calculates the resistance per bug-drug combination. Use \code{combine_IR = FALSE} (default) to test R vs. S+I and \code{combine_IR = TRUE} to test R+I vs. S. #' #' The language of the output can be overwritten with \code{options(AMR_locale)}, please see \link{translate}. 
@@ -80,7 +80,7 @@ bug_drug_combinations <- function(x, FUN(...)) %>% group_by(mo) %>% select_if(is.rsi) %>% - gather("ab", "value", -mo) %>% + pivot_longer(-mo, names_to = "ab") %>% group_by(mo, ab) %>% summarise(S = sum(value == "S", na.rm = TRUE), I = sum(value == "I", na.rm = TRUE), @@ -93,7 +93,7 @@ bug_drug_combinations <- function(x, } #' @importFrom dplyr everything rename %>% ungroup group_by summarise mutate_all arrange everything lag -#' @importFrom tidyr spread +#' @importFrom tidyr pivot_wider #' @importFrom cleaner percentage #' @exportMethod format.bug_drug_combinations #' @export @@ -135,7 +135,7 @@ format.bug_drug_combinations <- function(x, } ab_txt } - + y <- x %>% mutate(ab = as.ab(ab), ab_txt = give_ab_name(ab = ab, format = translate_ab, language = language)) %>% @@ -146,8 +146,9 @@ format.bug_drug_combinations <- function(x, mutate(txt = paste0(percentage(isolates / total, decimal.mark = decimal.mark, big.mark = big.mark), " (", trimws(format(isolates, big.mark = big.mark)), "/", trimws(format(total, big.mark = big.mark)), ")")) %>% - select(ab, ab_txt, mo, txt) %>% - spread(mo, txt) %>% + select(ab, ab_txt, mo, txt) %>% + arrange(mo) %>% + pivot_wider(names_from = mo, values_from = txt) %>% mutate_all(~ifelse(is.na(.), "", .)) %>% mutate(ab_group = ab_group(ab, language = language), ab_txt) %>% diff --git a/R/resistance_predict.R b/R/resistance_predict.R index b8c9faad..a65deb83 100755 --- a/R/resistance_predict.R +++ b/R/resistance_predict.R @@ -29,12 +29,13 @@ #' @param year_every unit of sequence between lowest year found in the data and \code{year_max} #' @param minimum minimal amount of available isolates per year to include. Years containing less observations will be estimated by the model. #' @param model the statistical model of choice. This could be a generalised linear regression model with binomial distribution (i.e. 
using \code{\link{glm}(..., family = \link{binomial})}), assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance. See Details for all valid options. -#' @param I_as_S a logical to indicate whether values \code{I} should be treated as \code{S} (will otherwise be treated as \code{R}) +#' @param I_as_S a logical to indicate whether values \code{I} should be treated as \code{S} (will otherwise be treated as \code{R}). The default, \code{TRUE}, follows the redefinition by EUCAST about the interpretion of I (increased exposure) in 2019, see section 'Interpretation of S, I and R' below. #' @param preserve_measurements a logical to indicate whether predictions of years that are actually available in the data should be overwritten by the original data. The standard errors of those years will be \code{NA}. #' @param info a logical to indicate whether textual analysis should be printed with the name and \code{\link{summary}} of the statistical model. #' @param main title of the plot #' @param ribbon a logical to indicate whether a ribbon should be shown (default) or error bars #' @param ... parameters passed on to functions +#' @inheritSection as.rsi Interpretation of S, I and R #' @inheritParams first_isolate #' @inheritParams graphics::plot #' @details Valid options for the statistical model are: @@ -59,6 +60,7 @@ #' @export #' @importFrom stats predict glm lm #' @importFrom dplyr %>% pull mutate mutate_at n group_by_at summarise filter filter_at all_vars n_distinct arrange case_when n_groups transmute ungroup +#' @importFrom tidyr pivot_wider #' @inheritSection AMR Read more on our website! 
#' @examples #' x <- resistance_predict(example_isolates, col_ab = "AMX", year_min = 2010, model = "binomial") @@ -161,6 +163,7 @@ resistance_predict <- function(x, } year <- function(x) { + # don't depend on lubridate or so, would be overkill for only this function if (all(grepl("^[0-9]{4}$", x))) { x } else { @@ -192,9 +195,12 @@ resistance_predict <- function(x, } colnames(df) <- c("year", "antibiotic", "observations") + df <- df %>% filter(!is.na(antibiotic)) %>% - tidyr::spread(antibiotic, observations, fill = 0) %>% + pivot_wider(names_from = antibiotic, + values_from = observations, + values_fill = list(observations = 0)) %>% filter((R + S) >= minimum) df_matrix <- df %>% ungroup() %>% diff --git a/R/rsi_calc.R b/R/rsi_calc.R index 46750806..8623f774 100755 --- a/R/rsi_calc.R +++ b/R/rsi_calc.R @@ -167,8 +167,8 @@ rsi_calc <- function(..., } } -#' @importFrom dplyr %>% summarise_if mutate select everything bind_rows -#' @importFrom tidyr gather +#' @importFrom dplyr %>% summarise_if mutate select everything bind_rows arrange +#' @importFrom tidyr pivot_longer rsi_calc_df <- function(type, # "proportion" or "count" data, translate_ab = "name", @@ -247,12 +247,13 @@ rsi_calc_df <- function(type, # "proportion" or "count" } res <- res %>% - gather(antibiotic, value, -interpretation, -data.groups) %>% - select(antibiotic, everything()) + pivot_longer(-c(interpretation, data.groups), names_to = "antibiotic") %>% + select(antibiotic, everything()) %>% + arrange(antibiotic, interpretation) if (!translate_ab == FALSE) { res <- res %>% mutate(antibiotic = AMR::ab_property(antibiotic, property = translate_ab, language = language)) } - res + as.data.frame(res, stringsAsFactors = FALSE) } diff --git a/docs/404.html b/docs/404.html index dea8d387..5927d9a0 100644 --- a/docs/404.html +++ b/docs/404.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index 1e502d69..ba6220da 100644 --- 
a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 diff --git a/docs/articles/AMR.html b/docs/articles/AMR.html index b9b1a724..c6d9730e 100644 --- a/docs/articles/AMR.html +++ b/docs/articles/AMR.html @@ -41,7 +41,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 @@ -187,7 +187,7 @@

How to conduct AMR analysis

Matthijs S. Berends

-

10 November 2019

+

11 November 2019

@@ -196,7 +196,7 @@ -

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 10 November 2019.

+

Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 11 November 2019.

Introduction

@@ -212,21 +212,21 @@ -2019-11-10 +2019-11-11 abcd Escherichia coli S S -2019-11-10 +2019-11-11 abcd Escherichia coli S R -2019-11-10 +2019-11-11 efgh Escherichia coli R @@ -321,71 +321,71 @@ -2015-05-08 -P3 +2011-09-25 +O7 +Hospital C +Staphylococcus aureus +S +S +S +S +F + + +2012-04-04 +O9 +Hospital A +Escherichia coli +S +S +S +S +F + + +2015-03-11 +S3 Hospital A Escherichia coli R -I -S -S -F - - -2017-11-03 -Y8 -Hospital C -Escherichia coli -R -S -S -S -F - - -2013-09-06 -U9 -Hospital B -Escherichia coli -R S S S F -2015-11-16 -E7 +2014-12-11 +G1 Hospital B Escherichia coli -I +S S S S M -2011-04-18 -F4 -Hospital B -Streptococcus pneumoniae -S -I -S -S -M - - -2010-04-22 -L4 +2013-01-02 +J8 Hospital D Escherichia coli S +S +S R -S -S M + +2014-08-17 +S8 +Hospital C +Escherichia coli +S +S +S +S +F +

Now, let’s start the cleaning and the analysis!

@@ -406,8 +406,8 @@ # # Item Count Percent Cum. Count Cum. Percent # --- ----- ------- -------- ----------- ------------- -# 1 M 10,417 52.09% 10,417 52.09% -# 2 F 9,583 47.92% 20,000 100.00% +# 1 M 10,427 52.14% 10,427 52.14% +# 2 F 9,573 47.87% 20,000 100.00%

So, we can draw at least two conclusions immediately. From a data scientist’s perspective, the data looks clean: only values M and F. From a researcher’s perspective: there are slightly more men. Nothing we didn’t already know.

The data is already quite clean, but we still need to transform some variables. The bacteria column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The mutate() function of the dplyr package makes this really easy:

data <- data %>%
@@ -437,14 +437,14 @@
 # Pasteurella multocida (no changes)
 # Staphylococcus (no changes)
 # Streptococcus groups A, B, C, G (no changes)
-# Streptococcus pneumoniae (1,545 values changed)
+# Streptococcus pneumoniae (1,552 values changed)
 # Viridans group streptococci (no changes)
 # 
 # EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)
-# Table 01: Intrinsic resistance in Enterobacteriaceae (1,309 values changed)
+# Table 01: Intrinsic resistance in Enterobacteriaceae (1,279 values changed)
 # Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)
 # Table 03: Intrinsic resistance in other Gram-negative bacteria (no changes)
-# Table 04: Intrinsic resistance in Gram-positive bacteria (2,733 values changed)
+# Table 04: Intrinsic resistance in Gram-positive bacteria (2,800 values changed)
 # Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)
 # Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)
 # Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no changes)
@@ -452,23 +452,23 @@
 # Table 13: Interpretive rules for quinolones (no changes)
 # 
 # Other rules
-# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,194 values changed)
-# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (121 values changed)
+# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2,257 values changed)
+# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (132 values changed)
 # Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no changes)
 # Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)
 # Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no changes)
 # Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)
 # 
 # --------------------------------------------------------------------------
-# EUCAST rules affected 6,489 out of 20,000 rows, making a total of 7,902 edits
+# EUCAST rules affected 6,599 out of 20,000 rows, making a total of 8,020 edits
 # => added 0 test results
 # 
-# => changed 7,902 test results
-#    - 118 test results changed from S to I
-#    - 4,776 test results changed from S to R
-#    - 1,063 test results changed from I to S
-#    - 318 test results changed from I to R
-#    - 1,603 test results changed from R to S
+# => changed 8,020 test results
+#    - 119 test results changed from S to I
+#    - 4,832 test results changed from S to R
+#    - 1,096 test results changed from I to S
+#    - 342 test results changed from I to R
+#    - 1,607 test results changed from R to S
 #    - 24 test results changed from R to I
 # --------------------------------------------------------------------------
 # 
@@ -497,8 +497,8 @@
 # NOTE: Using column `bacteria` as input for `col_mo`.
 # NOTE: Using column `date` as input for `col_date`.
 # NOTE: Using column `patient_id` as input for `col_patient_id`.
-# => Found 5,696 first isolates (28.5% of total)
-

So only 28.5% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:

+# => Found 5,657 first isolates (28.3% of total)
+

So only 28.3% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:

data_1st <- data %>% 
   filter(first == TRUE)

For future use, the above two syntaxes can be shortened with the filter_first_isolate() function:

@@ -508,7 +508,7 @@

First weighted isolates

-

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient P1, sorted on date:

+

We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient D2, sorted on date:

@@ -524,19 +524,19 @@ - - + + - - - + + + - - + + @@ -546,30 +546,30 @@ - - + + - - + + - - + + - + - - + + @@ -579,8 +579,8 @@ - - + + @@ -590,8 +590,8 @@ - - + + @@ -601,10 +601,10 @@ - - + + - + @@ -612,26 +612,26 @@ - - + + - - - - - - - - - - - + + + + + + + + + + +
isolate
12010-01-26P12010-02-14D2 B_ESCHR_COLIISS RSSS TRUE
22010-04-19P12010-04-27D2 B_ESCHR_COLI S S
32010-04-24P12010-05-31D2 B_ESCHR_COLI RRRSS S FALSE
42010-06-11P12010-08-21D2 B_ESCHR_COLI S S SRS FALSE
52010-11-24P12010-09-21D2 B_ESCHR_COLI S S
62010-12-11P12010-10-04D2 B_ESCHR_COLI R S
72010-12-23P12010-10-11D2 B_ESCHR_COLI S S
82011-01-14P12010-11-16D2 B_ESCHR_COLIRS S S S
92011-01-19P12011-03-05D2 B_ESCHR_COLIRR S SFALSE
102011-01-26P1B_ESCHR_COLIRI S S TRUE
102011-04-18D2B_ESCHR_COLISSSSFALSE

Only 2 isolates are marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The key_antibiotics() function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram negatives and 6 narrow-spectrum ones for Gram positives. These can be defined by the user.
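The weighting idea can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper `count_key_differences()`, the one-character-per-antibiotic string encoding, and the use of `.` for a missing result are all hypothetical. The only part taken from the page is that results of I are ignored when comparing key antibiotics.

```r
# Hypothetical helper (not part of the AMR package): count the key
# antibiotics on which two isolates differ, ignoring "I" and missing (".").
count_key_differences <- function(key_a, key_b) {
  a <- strsplit(key_a, "")[[1]]
  b <- strsplit(key_b, "")[[1]]
  stopifnot(length(a) == length(b))
  # only positions where both isolates have a clear S or R are compared
  comparable <- !(a %in% c("I", ".")) & !(b %in% c("I", "."))
  sum(a[comparable] != b[comparable])
}

# made-up key antibiotic strings for two isolates of the same patient
previous_isolate <- "SSRS.SISSR.S"
current_isolate  <- "SRRS.SSSSS.S"

n_diff <- count_key_differences(previous_isolate, current_isolate)
n_diff
```

An isolate whose profile differs from the previous one on at least one comparable key antibiotic could then be treated as a different strain; the exact threshold the package applies is not shown in this diff.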

@@ -645,7 +645,7 @@ # NOTE: Using column `patient_id` as input for `col_patient_id`. # NOTE: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this. # [Criterion] Inclusion based on key antibiotics, ignoring I -# => Found 15,241 first weighted isolates (76.2% of total)
+# => Found 15,009 first weighted isolates (75.0% of total) @@ -662,20 +662,20 @@ - - + + - - - + + + - - + + @@ -686,44 +686,44 @@ - - + + - - + + - - + + - + - - + + - + - - + + @@ -734,8 +734,8 @@ - - + + @@ -746,10 +746,10 @@ - - + + - + @@ -758,35 +758,35 @@ - - + + - - + + + + + + + + + + + + + + - - - - - - - - - - - - - +
isolate
12010-01-26P12010-02-14D2 B_ESCHR_COLIISS RSSS TRUE TRUE
22010-04-19P12010-04-27D2 B_ESCHR_COLI S S
32010-04-24P12010-05-31D2 B_ESCHR_COLI RRRSS S FALSE TRUE
42010-06-11P12010-08-21D2 B_ESCHR_COLI S S SRS FALSE TRUE
52010-11-24P12010-09-21D2 B_ESCHR_COLI S S S S FALSETRUEFALSE
62010-12-11P12010-10-04D2 B_ESCHR_COLI R S
72010-12-23P12010-10-11D2 B_ESCHR_COLI S S
82011-01-14P12010-11-16D2 B_ESCHR_COLIRS S S S
92011-01-19P12011-03-05D2 B_ESCHR_COLIRRSSSSTRUETRUE
102011-04-18D2B_ESCHR_COLISS S S FALSETRUE
102011-01-26P1B_ESCHR_COLIRISSTRUETRUEFALSE
-

Instead of 2, now 10 isolates are flagged. In total, 76.2% of all isolates are marked ‘first weighted’ - 47.7% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

+

Instead of 2, now 8 isolates are flagged. In total, 75.0% of all isolates are marked ‘first weighted’ - 46.8% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.

As with filter_first_isolate(), there’s a shortcut for this new algorithm too:

data_1st <- data %>% 
   filter_first_weighted_isolate()
-

So we end up with 15,241 isolates for analysis.

+

So we end up with 15,009 isolates for analysis.

We can remove unneeded columns:

data_1st <- data_1st %>% 
   select(-c(first, keyab))
@@ -812,57 +812,9 @@ 1 -2015-05-08 -P3 -Hospital A -B_ESCHR_COLI -R -I -S -S -F -Gram-negative -Escherichia -coli -TRUE - - -3 -2013-09-06 -U9 -Hospital B -B_ESCHR_COLI -R -S -S -S -F -Gram-negative -Escherichia -coli -TRUE - - -5 -2011-04-18 -F4 -Hospital B -B_STRPT_PNMN -S -S -S -R -M -Gram-positive -Streptococcus -pneumoniae -TRUE - - -7 -2016-10-10 -S10 -Hospital D +2011-09-25 +O7 +Hospital C B_STPHY_AURS S S @@ -874,36 +826,84 @@ aureus TRUE - -8 -2010-01-21 -R3 -Hospital B -B_STRPT_PNMN + +3 +2015-03-11 +S3 +Hospital A +B_ESCHR_COLI +R +S S S -R -R F -Gram-positive -Streptococcus -pneumoniae +Gram-negative +Escherichia +coli +TRUE + + +4 +2014-12-11 +G1 +Hospital B +B_ESCHR_COLI +S +S +S +S +M +Gram-negative +Escherichia +coli +TRUE + + +5 +2013-01-02 +J8 +Hospital D +B_ESCHR_COLI +S +S +S +R +M +Gram-negative +Escherichia +coli +TRUE + + +7 +2013-08-06 +H4 +Hospital B +B_ESCHR_COLI +R +I +R +S +M +Gram-negative +Escherichia +coli TRUE 9 -2011-11-23 -B7 -Hospital B -B_STRPT_PNMN -R -R +2016-10-03 +F3 +Hospital D +B_ESCHR_COLI +S S R +S M -Gram-positive -Streptococcus -pneumoniae +Gram-negative +Escherichia +coli TRUE @@ -925,7 +925,7 @@
data_1st %>% freq(genus, species)

Frequency table

Class: character
-Length: 15,241 (of which NA: 0 = 0%)
+Length: 15,009 (of which NA: 0 = 0%)
Unique: 4

Shortest: 16
Longest: 24

@@ -942,33 +942,33 @@ Longest: 24

1 Escherichia coli -7,593 -49.82% -7,593 -49.82% +7,411 +49.38% +7,411 +49.38% 2 Staphylococcus aureus -3,734 -24.50% -11,327 -74.32% +3,707 +24.70% +11,118 +74.08% 3 Streptococcus pneumoniae -2,327 -15.27% -13,654 -89.59% +2,318 +15.44% +13,436 +89.52% 4 Klebsiella pneumoniae -1,587 -10.41% -15,241 +1,573 +10.48% +15,009 100.00% @@ -980,7 +980,7 @@ Longest: 24

The functions resistance() and susceptibility() can be used to calculate antimicrobial resistance or susceptibility. For more specific analyses, the functions proportion_S(), proportion_SI(), proportion_I(), proportion_IR() and proportion_R() can be used to determine the proportion of a specific antimicrobial outcome.

As per the EUCAST guideline of 2019, we calculate resistance as the proportion of R (proportion_R(), equal to resistance()) and susceptibility as the proportion of S and I (proportion_SI(), equal to susceptibility()). These functions can be used on their own:

data_1st %>% resistance(AMX)
-# [1] 0.4724099
+# [1] 0.4684523
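As a rough illustration of the definition above, the two proportions can be computed by hand in base R. The vector `results` is made up, and dropping `NA` before counting is an assumption of this sketch rather than documented behaviour of `resistance()`.

```r
# made-up interpretation results for one antibiotic
results <- c("S", "I", "R", "R", "S", NA, "S", "R")

# assumption: untested isolates (NA) are left out of the denominator
tested <- results[!is.na(results)]

prop_R  <- sum(tested == "R") / length(tested)           # resistance: R only
prop_SI <- sum(tested %in% c("S", "I")) / length(tested) # susceptibility: S + I

c(resistance = prop_R, susceptibility = prop_SI)
```

Because every tested isolate is either R or S/I, the two values add up to 1.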

Or can be used in conjunction with group_by() and summarise(), both from the dplyr package:

data_1st %>% 
   group_by(hospital) %>% 
@@ -993,19 +993,19 @@ Longest: 24

Hospital A -0.4637681 +0.4640823 Hospital B -0.4776811 +0.4663609 Hospital C -0.4811697 +0.4736130 Hospital D -0.4694820 +0.4749499 @@ -1023,23 +1023,23 @@ Longest: 24

Hospital A -0.4637681 -4554 +0.4640823 +4566 Hospital B -0.4776811 -5399 +0.4663609 +5232 Hospital C -0.4811697 -2257 +0.4736130 +2217 Hospital D -0.4694820 -3031 +0.4749499 +2994 @@ -1059,27 +1059,27 @@ Longest: 24

Escherichia -0.9212433 -0.8984591 -0.9927565 +0.9211982 +0.8896235 +0.9929834 Klebsiella -0.8298677 -0.8897290 -0.9836169 +0.8239034 +0.8804832 +0.9809282 Staphylococcus -0.9290305 -0.9258168 -0.9946438 +0.9188023 +0.9209603 +0.9932560 Streptococcus -0.6067899 +0.5974978 0.0000000 -0.6067899 +0.5974978 @@ -1089,11 +1089,12 @@ Longest: 24

summarise("1. Amoxi/clav" = susceptibility(AMC), "2. Gentamicin" = susceptibility(GEN), "3. Amoxi/clav + genta" = susceptibility(AMC, GEN)) %>% - tidyr::gather("antibiotic", "S", -genus) %>% - ggplot(aes(x = genus, - y = S, - fill = antibiotic)) + - geom_col(position = "dodge2")
+ # pivot_longer() from the tidyr package "lengthens" data: + tidyr::pivot_longer(-genus, names_to = "antibiotic") %>% + ggplot(aes(x = genus, + y = value, + fill = antibiotic)) + + geom_col(position = "dodge2")
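The switch from `gather()` to `pivot_longer()` can be tried on a small made-up table. The data frame `wide` below is hypothetical. When `values_to` is not given, `pivot_longer()` names the value column `value`, which is why the plot above maps `y = value`.

```r
library(tidyr)

# made-up susceptibility proportions, one column per antibiotic
wide <- data.frame(
  genus = c("Escherichia", "Klebsiella"),
  AMC = c(0.92, 0.82),
  GEN = c(0.89, 0.88),
  stringsAsFactors = FALSE
)

# every column except `genus` becomes a (name, value) pair
long <- pivot_longer(wide, -genus, names_to = "antibiotic")
long
```

Note that `pivot_longer()` returns rows grouped per input row (all antibiotics of the first genus, then the second), whereas `gather()` stacked the columns one after another, so code that relied on the old row order may need an explicit `arrange()`.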

@@ -1154,19 +1155,24 @@ Longest: 24

Independence test

The next example uses the included example_isolates, which is an anonymised data set containing 2,000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. This data.frame can be used to practice AMR analysis.

We will compare the resistance to fosfomycin (column FOS) in hospitals A and D. The input for fisher.test() can be obtained with a transformation like this:

-
check_FOS <- example_isolates %>%
-  filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D
-  select(hospital_id, FOS) %>%             # select the hospitals and fosfomycin
-  group_by(hospital_id) %>%                # group on the hospitals
-  count_df(combine_SI = TRUE) %>%          # count all isolates per group (hospital_id)
-  tidyr::spread(hospital_id, value) %>%    # transform output so A and D are columns
-  select(A, D) %>%                         # and select these only
-  as.matrix()                              # transform to good old matrix for fisher.test()
-
-check_FOS
-#       A  D
-# [1,] 25 77
-# [2,] 24 33
+
# use package 'tidyr' to pivot data; 
+# it gets installed with this 'AMR' package
+library(tidyr)
+
+check_FOS <- example_isolates %>%
+  filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D
+  select(hospital_id, FOS) %>%             # select the hospitals and fosfomycin
+  group_by(hospital_id) %>%                # group on the hospitals
+  count_df(combine_SI = TRUE) %>%          # count all isolates per group (hospital_id)
+  pivot_wider(names_from = hospital_id,    # transform output so A and D are columns
+              values_from = value) %>%     
+  select(A, D) %>%                         # and only select these columns
+  as.matrix()                              # transform to a good old matrix for fisher.test()
+
+check_FOS
+#       A  D
+# [1,] 25 77
+# [2,] 24 33

We can apply the test now with:

# do Fisher's Exact Test
 fisher.test(check_FOS)                            
@@ -1181,7 +1187,7 @@ Longest: 24

# sample estimates: # odds ratio # 0.4488318
-

As can be seen, the p value is 0.031, which means that the fosfomycin resistances found in hospital A and D are really different.

+

As can be seen, the p value is 0.031, which means that the fosfomycin resistance found in hospitals A and D differs significantly.
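To reproduce the test without the package data, the 2x2 matrix printed above can be typed in directly. This is only a sketch: the counts are copied from the `check_FOS` output on this page, and the object name `result` is arbitrary.

```r
# counts copied from the check_FOS output shown on this page;
# matrix() fills column by column, so A = (25, 24) and D = (77, 33)
check_FOS <- matrix(c(25, 24, 77, 33),
                    nrow = 2,
                    dimnames = list(NULL, c("A", "D")))

result <- fisher.test(check_FOS)

result$p.value   # the page reports a p value of 0.031
result$estimate  # the page reports an odds ratio of 0.4488318
```

`fisher.test()` is part of the `stats` package that ships with R, so no extra packages are needed for this step.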

diff --git a/docs/articles/AMR_files/figure-html/plot 1-1.png b/docs/articles/AMR_files/figure-html/plot 1-1.png index ac506bca..a6172ff6 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 1-1.png and b/docs/articles/AMR_files/figure-html/plot 1-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 3-1.png b/docs/articles/AMR_files/figure-html/plot 3-1.png index bf1555f3..4e4dfe54 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 3-1.png and b/docs/articles/AMR_files/figure-html/plot 3-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 4-1.png b/docs/articles/AMR_files/figure-html/plot 4-1.png index 3125f21f..3d384a83 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 4-1.png and b/docs/articles/AMR_files/figure-html/plot 4-1.png differ diff --git a/docs/articles/AMR_files/figure-html/plot 5-1.png b/docs/articles/AMR_files/figure-html/plot 5-1.png index bd2a06c1..cbcb6264 100644 Binary files a/docs/articles/AMR_files/figure-html/plot 5-1.png and b/docs/articles/AMR_files/figure-html/plot 5-1.png differ diff --git a/docs/articles/SPSS.html b/docs/articles/SPSS.html index fc505d82..2863556e 100644 --- a/docs/articles/SPSS.html +++ b/docs/articles/SPSS.html @@ -41,7 +41,7 @@ AMR (for R) - 0.8.0 + 0.8.0.9030 @@ -187,7 +187,7 @@

How to import data from SPSS / SAS / Stata

Matthijs S. Berends

-

16 October 2019

+

11 November 2019

@@ -213,7 +213,7 @@
  • R is extremely flexible.

    -

    Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, gathering, grouping and summarising data, or drawing plots, is endless - with SPSS, SAS or Stata you are bound to their algorithms and format styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software. If you sometimes write syntaxes in SPSS to run a complete analysis or to ‘automate’ some of your work, you could do this a lot less time in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as working with any statistical package, you will have to have knowledge about what you are doing (statistically) and what you are willing to accomplish.

    +

    Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, arranging, grouping and summarising data, or drawing plots, is endless - with SPSS, SAS or Stata you are bound to their algorithms and format styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software. If you sometimes write syntaxes in SPSS to run a complete analysis or to ‘automate’ some of your work, you could do this in a lot less time in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as with any statistical package, you will need to know what you are doing (statistically) and what you want to accomplish.

  • R can be easily automated.

    diff --git a/docs/articles/index.html b/docs/articles/index.html index a1f60b5b..2270e9a5 100644 --- a/docs/articles/index.html +++ b/docs/articles/index.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 diff --git a/docs/authors.html b/docs/authors.html index 3299e539..881f4b86 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 diff --git a/docs/index.html b/docs/index.html index 59f76de3..6686dfef 100644 --- a/docs/index.html +++ b/docs/index.html @@ -45,7 +45,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 diff --git a/docs/news/index.html b/docs/news/index.html index 9471058e..63b55043 100644 --- a/docs/news/index.html +++ b/docs/news/index.html @@ -84,7 +84,7 @@ AMR (for R) - 0.8.0.9029 + 0.8.0.9030 @@ -231,16 +231,24 @@ -
    +

    -AMR 0.8.0.9029 Unreleased +AMR 0.8.0.9030 Unreleased

    -

    Last updated: 10-Nov-2019

    +

    Last updated: 11-Nov-2019

    New

    @@ -438,14 +446,14 @@ Since this is a major change, usage of the old also_single_tested w

    All these lead to the microbial ID of E. coli:

    - +
  • Function mo_info() as an analogy to ab_info(). mo_info() prints a list with the full taxonomy, authors, and the URL to the online database of a microorganism.
  • Function mo_synonyms() to get all previously accepted taxonomic names of a microorganism

  • @@ -567,14 +575,14 @@ Please
    septic_patients %>% 
    -  freq(age) %>% 
    -  boxplot()
    -# grouped boxplots:
    -septic_patients %>% 
    -  group_by(hospital_id) %>% 
    -  freq(age) %>%
    -  boxplot()
    +
    septic_patients %>% 
    +  freq(age) %>% 
    +  boxplot()
    +# grouped boxplots:
    +septic_patients %>% 
    +  group_by(hospital_id) %>% 
    +  freq(age) %>%
    +  boxplot()
    @@ -659,32 +667,32 @@ This data is updated annually - check the included version with the new function
  • New filters for antimicrobial classes. Use these functions to filter isolates on results in one or more antibiotics from a specific class:

    -
    filter_aminoglycosides()
    -filter_carbapenems()
    -filter_cephalosporins()
    -filter_1st_cephalosporins()
    -filter_2nd_cephalosporins()
    -filter_3rd_cephalosporins()
    -filter_4th_cephalosporins()
    -filter_fluoroquinolones()
    -filter_glycopeptides()
    -filter_macrolides()
    -filter_tetracyclines()
    +
    filter_aminoglycosides()
    +filter_carbapenems()
    +filter_cephalosporins()
    +filter_1st_cephalosporins()
    +filter_2nd_cephalosporins()
    +filter_3rd_cephalosporins()
    +filter_4th_cephalosporins()
    +filter_fluoroquinolones()
    +filter_glycopeptides()
    +filter_macrolides()
    +filter_tetracyclines()

    The antibiotics data set will be searched, after which the input data will be checked for column names with a value in any abbreviations, codes or official names found in the antibiotics data set. For example:

    -
    septic_patients %>% filter_glycopeptides(result = "R")
    -# Filtering on glycopeptide antibacterials: any of `vanc` or `teic` is R
    -septic_patients %>% filter_glycopeptides(result = "R", scope = "all")
    -# Filtering on glycopeptide antibacterials: all of `vanc` and `teic` is R
    +
    septic_patients %>% filter_glycopeptides(result = "R")
    +# Filtering on glycopeptide antibacterials: any of `vanc` or `teic` is R
    +septic_patients %>% filter_glycopeptides(result = "R", scope = "all")
    +# Filtering on glycopeptide antibacterials: all of `vanc` and `teic` is R
  • All ab_* functions are deprecated and replaced by atc_* functions:

    -
    ab_property -> atc_property()
    -ab_name -> atc_name()
    -ab_official -> atc_official()
    -ab_trivial_nl -> atc_trivial_nl()
    -ab_certe -> atc_certe()
    -ab_umcg -> atc_umcg()
    -ab_tradenames -> atc_tradenames()
    +
    ab_property -> atc_property()
    +ab_name -> atc_name()
    +ab_official -> atc_official()
    +ab_trivial_nl -> atc_trivial_nl()
    +ab_certe -> atc_certe()
    +ab_umcg -> atc_umcg()
    +ab_tradenames -> atc_tradenames()
    These functions use as.atc() internally. The old atc_property has been renamed atc_online_property(). This is done for two reasons: firstly, not all ATC codes are of antibiotics (ab) but can also be of antivirals or antifungals. Secondly, the input must have class atc or must be coercible to this class. Properties of these classes should start with the same class name, analogous to as.mo() and e.g. mo_genus.
  • New functions set_mo_source() and get_mo_source() to use your own predefined MO codes as input for as.mo() and consequently all mo_* functions
  • Support for the upcoming dplyr version 0.8.0
  • @@ -696,20 +704,20 @@ These functions use as.atc() internally. The old atc_property
  • New function age_groups() to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic antimicrobial resistance analysis per age group.
  • New function ggplot_rsi_predict() as well as the base R plot() function can now be used for resistance prediction calculated with resistance_predict():

    -
    x <- resistance_predict(septic_patients, col_ab = "amox")
    -plot(x)
    -ggplot_rsi_predict(x)
    +
    x <- resistance_predict(septic_patients, col_ab = "amox")
    +plot(x)
    +ggplot_rsi_predict(x)
  • Functions filter_first_isolate() and filter_first_weighted_isolate() to shorten and speed up filtering on data sets with antimicrobial results, e.g.:

    - +

    is equal to:

    -
    septic_patients %>%
    -  mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
    -  filter(only_firsts == TRUE) %>%
    -  select(-only_firsts)
    +
    septic_patients %>%
    +  mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
    +  filter(only_firsts == TRUE) %>%
    +  select(-only_firsts)
  • New function availability() to check the number of available (non-empty) results in a data.frame
  • @@ -738,33 +746,33 @@ These functions use as.atc() internally. The old atc_property

    They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:

    -
    mo_gramstain("E. coli")
    -# [1] "Gram negative"
    -mo_gramstain("E. coli", language = "de") # German
    -# [1] "Gramnegativ"
    -mo_gramstain("E. coli", language = "es") # Spanish
    -# [1] "Gram negativo"
    -mo_fullname("S. group A", language = "pt") # Portuguese
    -# [1] "Streptococcus grupo A"
    +
    mo_gramstain("E. coli")
    +# [1] "Gram negative"
    +mo_gramstain("E. coli", language = "de") # German
    +# [1] "Gramnegativ"
    +mo_gramstain("E. coli", language = "es") # Spanish
    +# [1] "Gram negativo"
    +mo_fullname("S. group A", language = "pt") # Portuguese
    +# [1] "Streptococcus grupo A"

    Furthermore, former taxonomic names will give a note about the current taxonomic name:

    - +
  • Functions count_R, count_IR, count_I, count_SI and count_S to selectively count resistant or susceptible isolates @@ -1337,7 +1345,7 @@ Using as.mo(..., allow_uncertain = 3)

    Contents

    +

    Interpretation of S, I and R


    In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I and R as shown below (http://www.eucast.org/newsiandr/). Results of several consultations on the new definitions are available on the EUCAST website under "Consultations".

    • S - Susceptible, standard dosing regimen: A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent.

    • I - Susceptible, increased exposure: A microorganism is categorised as "Susceptible, Increased exposure" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection.

    • R - Resistant: A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure.


    Exposure is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection.


    This AMR package honours this new insight. Use susceptibility() (equal to proportion_SI()) to determine antimicrobial susceptibility and count_susceptible() (equal to count_SI()) to count susceptible isolates.
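
The grouping of S and I described above can be illustrated with a minimal base R sketch. The helpers `prop_susceptible()` and `prop_resistant()` are hypothetical stand-ins written for this illustration, not the package's implementation. The package's own `susceptibility()` and `resistance()` may apply further rules that are not reproduced here.

```r
# Hypothetical helpers for illustration only; not part of the AMR package.
# Both drop missing results and work on a plain character vector.
prop_susceptible <- function(x) {
  x <- x[!is.na(x)]
  mean(x %in% c("S", "I"))  # "I" (increased exposure) counts as susceptible
}

prop_resistant <- function(x) {
  x <- x[!is.na(x)]
  mean(x == "R")            # only "R" counts as resistant
}

# Made-up test results for a single antibiotic
results <- c("S", "S", "I", "R", "S", "I", "R", NA, "S", "S")

prop_susceptible(results)
prop_resistant(results)
```

Because every non-missing result is either S, I or R, the two proportions add up to 1 in this sketch. Under the pre-2019 reading, where I was grouped with R, the same data would give a lower susceptible proportion.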

    Read more on our website!

    @@ -394,6 +407,7 @@
  • Arguments
  • Value
  • Details
  • Interpretation of S, I and R
  • Read more on our website!
  • See also
  • Examples
  • diff --git a/man/resistance_predict.Rd b/man/resistance_predict.Rd index deb9484f..c29ef00c 100644 --- a/man/resistance_predict.Rd +++ b/man/resistance_predict.Rd @@ -38,7 +38,7 @@ ggplot_rsi_predict(x, main = paste("Resistance Prediction of", x_name), \item{model}{the statistical model of choice. This could be a generalised linear regression model with binomial distribution (i.e. using \code{\link{glm}(..., family = \link{binomial})}), assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance. See Details for all valid options.} -\item{I_as_S}{a logical to indicate whether values \code{I} should be treated as \code{S} (will otherwise be treated as \code{R})} +\item{I_as_S}{a logical to indicate whether values \code{I} should be treated as \code{S} (will otherwise be treated as \code{R}). The default, \code{TRUE}, follows the redefinition by EUCAST about the interpretion of I (increased exposure) in 2019, see section 'Interpretation of S, I and R' below.} \item{preserve_measurements}{a logical to indicate whether predictions of years that are actually available in the data should be overwritten by the original data. The standard errors of those years will be \code{NA}.} @@ -74,6 +74,21 @@ Valid options for the statistical model are: \item{\code{"lin"} or \code{"linear"}: a linear regression model} } } +\section{Interpretation of S, I and R}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I and R as shown below (\url{http://www.eucast.org/newsiandr/}). Results of several consultations on the new definitions are available on the EUCAST website under "Consultations". 
+ +\itemize{ + \item{\strong{S} - }{Susceptible, standard dosing regimen: A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent.} + \item{\strong{I} - }{Susceptible, increased exposure: A microorganism is categorised as "Susceptible, Increased exposure" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection.} + \item{\strong{R} - }{Resistant: A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure.} +} + +Exposure is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. + +This AMR package honours this new insight. Use \code{\link{susceptibility}()} (equal to \code{\link{proportion_SI}()}) to determine antimicrobial susceptibility and \code{\link{count_susceptible}()} (equal to \code{\link{count_SI}()}) to count susceptible isolates. +} + \section{Read more on our website!}{ On our website \url{https://msberends.gitlab.io/AMR} you can find \href{https://msberends.gitlab.io/AMR/articles/AMR.html}{a tutorial} about how to conduct AMR analysis, the \href{https://msberends.gitlab.io/AMR/reference}{complete documentation of all functions} (which reads a lot easier than here in R) and \href{https://msberends.gitlab.io/AMR/articles/WHONET.html}{an example analysis using WHONET data}. diff --git a/vignettes/AMR.Rmd b/vignettes/AMR.Rmd index 1ff4af18..74083d5d 100755 --- a/vignettes/AMR.Rmd +++ b/vignettes/AMR.Rmd @@ -385,9 +385,10 @@ data_1st %>% summarise("1. Amoxi/clav" = susceptibility(AMC), "2. Gentamicin" = susceptibility(GEN), "3. 
Amoxi/clav + genta" = susceptibility(AMC, GEN)) %>% - tidyr::gather("antibiotic", "S", -genus) %>% + # pivot_longer() from the tidyr package "lengthens" data: + tidyr::pivot_longer(-genus, names_to = "antibiotic") %>% ggplot(aes(x = genus, - y = S, + y = value, fill = antibiotic)) + geom_col(position = "dodge2") ``` @@ -463,14 +464,19 @@ The next example uses the included `example_isolates`, which is an anonymised da We will compare the resistance to fosfomycin (column `FOS`) in hospital A and D. The input for the `fisher.test()` can be retrieved with a transformation like this: ```{r, results = 'markup'} +# use package 'tidyr' to pivot data; +# it gets installed with this 'AMR' package +library(tidyr) + check_FOS <- example_isolates %>% filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D select(hospital_id, FOS) %>% # select the hospitals and fosfomycin group_by(hospital_id) %>% # group on the hospitals count_df(combine_SI = TRUE) %>% # count all isolates per group (hospital_id) - tidyr::spread(hospital_id, value) %>% # transform output so A and D are columns - select(A, D) %>% # and select these only - as.matrix() # transform to good old matrix for fisher.test() + pivot_wider(names_from = hospital_id, # transform output so A and D are columns + values_from = value) %>% + select(A, D) %>% # and only select these columns + as.matrix() # transform to a good old matrix for fisher.test() check_FOS ``` @@ -482,4 +488,4 @@ We can apply the test now with: fisher.test(check_FOS) ``` -As can be seen, the p value is `r round(fisher.test(check_FOS)$p.value, 3)`, which means that the fosfomycin resistances found in hospital A and D are really different. +As can be seen, the p value is `r round(fisher.test(check_FOS)$p.value, 3)`, which means that the fosfomycin resistance found in hospital A and D are really different. 
diff --git a/vignettes/SPSS.Rmd b/vignettes/SPSS.Rmd index 7f5bf923..91eb4f4f 100755 --- a/vignettes/SPSS.Rmd +++ b/vignettes/SPSS.Rmd @@ -39,7 +39,7 @@ As said, SPSS is easier to learn than R. But SPSS, SAS and Stata come with major * **R is extremely flexible.** - Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, gathering, grouping and summarising data, or drawing plots, is endless - with SPSS, SAS or Stata you are bound to their algorithms and format styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software. If you sometimes write syntaxes in SPSS to run a complete analysis or to 'automate' some of your work, you could do this a lot less time in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as working with any statistical package, you will have to have knowledge about what you are doing (statistically) and what you are willing to accomplish. + Because you write the syntax yourself, you can do anything you want. The flexibility in transforming, arranging, grouping and summarising data, or drawing plots, is endless - with SPSS, SAS or Stata you are bound to their algorithms and format styles. They may be a bit flexible, but you can probably never create that very specific publication-ready plot without using other (paid) software. If you sometimes write syntaxes in SPSS to run a complete analysis or to 'automate' some of your work, you could do this a lot less time in R. You will notice that writing syntaxes in R is a lot more nifty and clever than in SPSS. Still, as working with any statistical package, you will have to have knowledge about what you are doing (statistically) and what you are willing to accomplish. * **R can be easily automated.**