# ==================================================================== #
# TITLE #
# AMR: An R Package for Working with Antimicrobial Resistance Data #
# #
# SOURCE #
# https://github.com/msberends/AMR #
# #
# CITE AS #
# Berends MS, Luz CF, Friedrich AW, Sinha BNM, Albers CJ, Glasner C #
# (2022). AMR: An R Package for Working with Antimicrobial Resistance #
# Data. Journal of Statistical Software, 104(3), 1-31. #
# doi:10.18637/jss.v104.i03 #
# #
# Developed at the University of Groningen, the Netherlands, in #
# collaboration with non-profit organisations Certe Medical #
# Diagnostics & Advice, and University Medical Center Groningen. #
# #
# This R package is free software; you can freely use and distribute #
# it for both personal and commercial purposes under the terms of the #
# GNU General Public License version 2.0 (GNU GPL-2), as published by #
# the Free Software Foundation. #
# We created this package for both routine data analysis and academic #
# research and it was publicly released in the hope that it will be #
# useful, but it comes WITHOUT ANY WARRANTY OR LIABILITY. #
# #
# Visit our website for the full manual and a complete tutorial about #
# how to conduct AMR data analysis: https://msberends.github.io/AMR/ #
# ==================================================================== #
#' Calculate Microbial Resistance
#'
#' @description These functions can be used to calculate the (co-)resistance or susceptibility of microbial isolates (i.e. percentage of S, SI, I, IR or R). All functions support quasiquotation with pipes, can be used in `summarise()` from the `dplyr` package and also support grouped variables, see *Examples*.
#'
#' Use [resistance()] to calculate resistance and [susceptibility()] to calculate susceptibility.
#' @param ... one or more vectors (or columns) with antibiotic interpretations. They will be transformed internally with [as.rsi()] if needed. Use multiple columns to calculate (the lack of) co-resistance: the probability that one of two drugs has a resistant or susceptible result. See *Examples*.
#' @param minimum the minimum allowed number of available (tested) isolates. Any isolate count lower than `minimum` will return `NA` with a warning. The default number of `30` isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see *Source*.
#' @param as_percent a [logical] to indicate whether the output must be returned as a percentage, i.e. multiplied by 100 and with a % sign (as a character). A value of `0.123456` will then be returned as `"12.3%"`.
#' @param only_all_tested (for combination therapies, i.e. using more than one variable for `...`): a [logical] to indicate that isolates must be tested for all antibiotics, see section *Combination Therapy* below
#' @param data a [data.frame] containing columns with class [`rsi`] (see [as.rsi()])
#' @param translate_ab a column name of the [antibiotics] data set to translate the antibiotic abbreviations to, using [ab_property()]
#' @inheritParams ab_property
#' @param combine_SI a [logical] to indicate whether all values of S and I must be merged into one, so the output only consists of S+I vs. R (susceptible vs. resistant), defaults to `TRUE`
#' @param ab_result antibiotic results to test against, must be one or more values of "R", "S", "I"
#' @param confidence_level the confidence level for the returned confidence interval. For the calculation, the number of S or SI isolates, and R isolates are compared with the total number of available isolates with R, S, or I by using [binom.test()], i.e., the Clopper-Pearson method.
#' @param side the side of the confidence interval to return. Defaults to `"both"` for a length 2 vector, but can also be (abbreviated as) `"min"`/`"left"`/`"lower"`/`"less"` or `"max"`/`"right"`/`"higher"`/`"greater"`.
#' @inheritSection as.rsi Interpretation of R and S/I
#' @details
#' The function [resistance()] is equal to the function [proportion_R()]. The function [susceptibility()] is equal to the function [proportion_SI()].
#'
#' Use [rsi_confidence_interval()] to calculate the confidence interval, which relies on [binom.test()], i.e., the Clopper-Pearson method. By default, this function returns a vector of length 2 for antimicrobial *resistance*. Change the `side` argument to "left"/"min" or "right"/"max" to return a single value, and change the `ab_result` argument to e.g. `c("S", "I")` to test for antimicrobial *susceptibility*, see *Examples*.
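#'
#' As a minimal sketch of that calculation, the interval can be reproduced in base R roughly as follows. The counts `x = 30` and `n = 100` are made-up illustration values and are not taken from any data set:
#'
#' ```r
#' # hypothetical counts: 30 resistant isolates out of 100 tested isolates
#' x <- 30
#' n <- 100
#' # Clopper-Pearson interval as returned by binom.test()
#' ci <- stats::binom.test(x = x, n = n, conf.level = 0.95)$conf.int
#' ci[1] # lower bound, comparable to side = "min"
#' ci[2] # upper bound, comparable to side = "max"
#' ```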
#'
#' **Remember that you should filter your data to let it contain only first isolates!** This is needed to exclude duplicates and to reduce selection bias. Use [first_isolate()] to determine them in your data set.
#'
#' These functions are not meant to count isolates, but to calculate the proportion of resistance/susceptibility. Use the [`count()`][AMR::count()] functions to count isolates. The function [susceptibility()] is essentially equal to `count_susceptible() / count_all()`. *Low counts can influence the outcome - the `proportion` functions may camouflage this, since they only return the proportion (albeit being dependent on the `minimum` argument).*
#'
#' The function [proportion_df()] takes any variable from `data` that has an [`rsi`] class (created with [as.rsi()]) and calculates the proportions R, I and S. It also supports grouped variables. The function [rsi_df()] works exactly like [proportion_df()], but adds the number of isolates.
#' @section Combination Therapy:
#' When using more than one variable for `...` (= combination therapy), use `only_all_tested` to only count isolates that are tested for all antibiotics/variables that you test them for. See this example for two antibiotics, Drug A and Drug B, about how [susceptibility()] works to calculate the %SI:
#'
#' ```
#' --------------------------------------------------------------------
#'                     only_all_tested = FALSE  only_all_tested = TRUE
#'                     -----------------------  -----------------------
#'  Drug A    Drug B   include as  include as   include as  include as
#'                     numerator   denominator  numerator   denominator
#' --------  --------  ----------  -----------  ----------  -----------
#'  S or I    S or I       X            X            X            X
#'    R       S or I       X            X            X            X
#'   <NA>     S or I       X            X            -            -
#'  S or I      R          X            X            X            X
#'    R         R          -            X            -            X
#'   <NA>       R          -            -            -            -
#'  S or I     <NA>        X            X            -            -
#'    R        <NA>        -            -            -            -
#'   <NA>      <NA>        -            -            -            -
#' --------------------------------------------------------------------
#' ```
#'
#' Please note that, in combination therapies, the following holds for `only_all_tested = TRUE`:
#' ```
#'     count_S()    +   count_I()    +   count_R()    = count_all()
#'   proportion_S() + proportion_I() + proportion_R() = 1
#' ```
#' and that, in combination therapies, the following holds for `only_all_tested = FALSE`:
#' ```
#'     count_S()    +   count_I()    +   count_R()    >= count_all()
#'   proportion_S() + proportion_I() + proportion_R() >= 1
#' ```
#'
#' Using `only_all_tested` has no impact when only using one antibiotic as input.
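#'
#' The table above can be retraced with a small base R sketch. The vectors `drug_a` and `drug_b` and the helper `is_si()` are hypothetical illustration names that mirror the nine table rows; they are not part of this package, and the actual implementation may differ:
#'
#' ```r
#' # one element per table row, in the same order as the table
#' drug_a <- c("S", "R", NA, "I", "R", NA, "S", "R", NA)
#' drug_b <- c("S", "I", "S", "R", "R", "R", NA, NA, NA)
#'
#' # hypothetical helper: TRUE where a result is S or I (NA counts as FALSE)
#' is_si <- function(x) !is.na(x) & x %in% c("S", "I")
#' any_si <- is_si(drug_a) | is_si(drug_b)
#' all_tested <- !is.na(drug_a) & !is.na(drug_b)
#'
#' # only_all_tested = FALSE
#' numerator_any <- sum(any_si) # 5
#' denominator_any <- sum(any_si | all_tested) # 6
#'
#' # only_all_tested = TRUE
#' numerator_all <- sum(any_si & all_tested) # 3
#' denominator_all <- sum(all_tested) # 4
#' ```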
#' @source **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition**, 2022, *Clinical and Laboratory Standards Institute (CLSI)*. <https://clsi.org/standards/products/microbiology/documents/m39/>.
#' @seealso [AMR::count()] to count resistant and susceptible isolates.
#' @return A [double] or, when `as_percent = TRUE`, a [character].
#' @rdname proportion
#' @aliases portion
#' @name proportion
#' @export
#' @examples
#' # example_isolates is a data set available in the AMR package.
#' # run ?example_isolates for more info.
#'
#' # base R ------------------------------------------------------------
#' # determines %R
#' resistance(example_isolates$AMX)
#' rsi_confidence_interval(example_isolates$AMX)
#' rsi_confidence_interval(example_isolates$AMX,
#'   confidence_level = 0.975
#' )
#'
#' # determines %S+I:
#' susceptibility(example_isolates$AMX)
#' rsi_confidence_interval(example_isolates$AMX,
#'   ab_result = c("S", "I")
#' )
#'
#' # be more specific
#' proportion_S(example_isolates$AMX)
#' proportion_SI(example_isolates$AMX)
#' proportion_I(example_isolates$AMX)
#' proportion_IR(example_isolates$AMX)
#' proportion_R(example_isolates$AMX)
#'
#' # dplyr -------------------------------------------------------------
#' \donttest{
#' if (require("dplyr")) {
#'
#'   example_isolates %>%
#'     group_by(ward) %>%
#'     summarise(
#'       r = resistance(CIP),
#'       n = n_rsi(CIP)
#'     ) # n_rsi works like n_distinct in dplyr, see ?n_rsi
#'
#' }
#' if (require("dplyr")) {
#'
#'   example_isolates %>%
#'     group_by(ward) %>%
#'     summarise(
#'       cipro_R = resistance(CIP),
#'       ci_min = rsi_confidence_interval(CIP, side = "min"),
#'       ci_max = rsi_confidence_interval(CIP, side = "max"),
#'     )
#'
#' }
#' if (require("dplyr")) {
#'
#'   # scoped dplyr verbs with antibiotic selectors
#'   # (you could also use across() of course)
#'   example_isolates %>%
#'     group_by(ward) %>%
#'     summarise_at(
#'       c(aminoglycosides(), carbapenems()),
#'       resistance
#'     )
#'
#' }
#' if (require("dplyr")) {
#'
#'   example_isolates %>%
#'     group_by(ward) %>%
#'     summarise(
#'       R = resistance(CIP, as_percent = TRUE),
#'       SI = susceptibility(CIP, as_percent = TRUE),
#'       n1 = count_all(CIP), # the actual total; sum of all three
#'       n2 = n_rsi(CIP), # same - analogous to n_distinct
#'       total = n() # the number of rows, NOT the number of tested isolates!
#'     )
#'
#'   # Calculate co-resistance between amoxicillin/clavulanic acid and gentamicin,
#'   # so we can see that combination therapy does a lot more than mono therapy:
#'   example_isolates %>% susceptibility(AMC) # %SI = 76.3%
#'   example_isolates %>% count_all(AMC) # n = 1879
#'
#'   example_isolates %>% susceptibility(GEN) # %SI = 75.4%
#'   example_isolates %>% count_all(GEN) # n = 1855
#'
#'   example_isolates %>% susceptibility(AMC, GEN) # %SI = 94.1%
#'   example_isolates %>% count_all(AMC, GEN) # n = 1939
#'
#'
#'   # See section 'Combination Therapy' on how `only_all_tested` works. Example:
#'   example_isolates %>%
#'     summarise(
#'       numerator = count_susceptible(AMC, GEN),
#'       denominator = count_all(AMC, GEN),
#'       proportion = susceptibility(AMC, GEN)
#'     )
#'
#'   example_isolates %>%
#'     summarise(
#'       numerator = count_susceptible(AMC, GEN, only_all_tested = TRUE),
#'       denominator = count_all(AMC, GEN, only_all_tested = TRUE),
#'       proportion = susceptibility(AMC, GEN, only_all_tested = TRUE)
#'     )
#'
#'
#'   example_isolates %>%
#'     group_by(ward) %>%
#'     summarise(
#'       cipro_p = susceptibility(CIP, as_percent = TRUE),
#'       cipro_n = count_all(CIP),
#'       genta_p = susceptibility(GEN, as_percent = TRUE),
#'       genta_n = count_all(GEN),
#'       combination_p = susceptibility(CIP, GEN, as_percent = TRUE),
#'       combination_n = count_all(CIP, GEN)
#'     )
#'
#'   # Get proportions S/I/R immediately of all rsi columns
#'   example_isolates %>%
#'     select(AMX, CIP) %>%
#'     proportion_df(translate_ab = FALSE)
#'
#'   # It also supports grouping variables
#'   # (use rsi_df to also include the count)
#'   example_isolates %>%
#'     select(ward, AMX, CIP) %>%
#'     group_by(ward) %>%
#'     rsi_df(translate_ab = FALSE)
#' }
#' }
resistance <- function(...,
                       minimum = 30,
                       as_percent = FALSE,
                       only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = "R",
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
susceptibility <- function(...,
                           minimum = 30,
                           as_percent = FALSE,
                           only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = c("S", "I"),
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
rsi_confidence_interval <- function(...,
                                    ab_result = "R",
                                    minimum = 30,
                                    as_percent = FALSE,
                                    only_all_tested = FALSE,
                                    confidence_level = 0.95,
                                    side = "both") {
  meet_criteria(ab_result, allow_class = c("character", "rsi"), has_length = c(1, 2, 3), is_in = c("R", "S", "I"))
  meet_criteria(confidence_level, allow_class = "numeric", is_positive = TRUE, has_length = 1)
  meet_criteria(side, allow_class = "character", has_length = 1, is_in = c("both", "b", "left", "l", "lower", "lowest", "less", "min", "right", "r", "higher", "highest", "greater", "g", "max"))
  # x: number of isolates with a result in `ab_result`
  x <- tryCatch(
    rsi_calc(...,
      ab_result = ab_result,
      only_all_tested = only_all_tested,
      only_count = TRUE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
  # n: number of isolates with any available result (S, I or R)
  n <- tryCatch(
    rsi_calc(...,
      ab_result = c("S", "I", "R"),
      only_all_tested = only_all_tested,
      only_count = TRUE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )

  # too few available results: return NA with a warning
  if (n < minimum) {
    warning_("Introducing NA: ",
      ifelse(n == 0, "no", paste("only", n)),
      " results available for `rsi_confidence_interval()` (`minimum` = ", minimum, ").",
      call = FALSE
    )
    if (as_percent == TRUE) {
      return(NA_character_)
    } else {
      return(NA_real_)
    }
  }

  # Clopper-Pearson interval, see ?binom.test
  out <- stats::binom.test(x = x, n = n, conf.level = confidence_level)$conf.int
  out <- set_clean_class(out, "double")

  # return only the lower or upper bound if requested
  if (side %in% c("left", "l", "lower", "lowest", "less", "min")) {
    out <- out[1]
  } else if (side %in% c("right", "r", "higher", "highest", "greater", "g", "max")) {
    out <- out[2]
  }
  if (as_percent == TRUE) {
    percentage(out, digits = 1)
  } else {
    out
  }
}
#' @rdname proportion
#' @export
proportion_R <- function(...,
                         minimum = 30,
                         as_percent = FALSE,
                         only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = "R",
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
proportion_IR <- function(...,
                          minimum = 30,
                          as_percent = FALSE,
                          only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = c("I", "R"),
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
proportion_I <- function(...,
                         minimum = 30,
                         as_percent = FALSE,
                         only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = "I",
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
proportion_SI <- function(...,
                          minimum = 30,
                          as_percent = FALSE,
                          only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = c("S", "I"),
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
proportion_S <- function(...,
                         minimum = 30,
                         as_percent = FALSE,
                         only_all_tested = FALSE) {
  tryCatch(
    rsi_calc(...,
      ab_result = "S",
      minimum = minimum,
      as_percent = as_percent,
      only_all_tested = only_all_tested,
      only_count = FALSE
    ),
    error = function(e) stop_(gsub("in rsi_calc(): ", "", e$message, fixed = TRUE), call = -5)
  )
}
#' @rdname proportion
#' @export
proportion_df <- function(data,
                          translate_ab = "name",
                          language = get_AMR_locale(),
                          minimum = 30,
                          as_percent = FALSE,
                          combine_SI = TRUE,
                          confidence_level = 0.95) {
  tryCatch(
    rsi_calc_df(
      type = "proportion",
      data = data,
      translate_ab = translate_ab,
      language = language,
      minimum = minimum,
      as_percent = as_percent,
      combine_SI = combine_SI,
      confidence_level = confidence_level
    ),
    error = function(e) stop_(gsub("in rsi_calc_df(): ", "", e$message, fixed = TRUE), call = -5)
  )
}