diff --git a/.github/workflows/publish-to-pypi.yml b/.github/workflows/publish-to-pypi.yml index 2693023d..96845873 100644 --- a/.github/workflows/publish-to-pypi.yml +++ b/.github/workflows/publish-to-pypi.yml @@ -65,5 +65,12 @@ jobs: run: | cd PythonPackage/AMR python -m twine upload dist/* - # for test server: - # python -m twine upload --repository-url https://test.pypi.org/legacy/ dist/* + + - name: Publish to PyPI Testserver + continue-on-error: true + env: + TWINE_USERNAME: "__token__" + TWINE_PASSWORD: ${{ secrets.PYPI_API_TEST_TOKEN }} + run: | + cd PythonPackage/AMR + python -m twine upload --repository-url https://test.pypi.org/legacy/ dist/* diff --git a/DESCRIPTION b/DESCRIPTION index 19a7a28b..7fe377a5 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 2.1.1.9103 -Date: 2024-10-18 +Version: 2.1.1.9106 +Date: 2024-11-21 Title: Antimicrobial Resistance Data Analysis Description: Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial properties by diff --git a/NEWS.md b/NEWS.md index 6682d731..e648b0cb 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,4 @@ -# AMR 2.1.1.9103 +# AMR 2.1.1.9106 *(this beta version will eventually become v3.0. We're happy to reach a new major milestone soon, which will be all about the new One Health support! Install this beta using [the instructions here](https://msberends.github.io/AMR/#latest-development-version).)* diff --git a/data-raw/gpt_training_text_v2.1.1.9106.txt b/data-raw/gpt_training_text_v2.1.1.9106.txt new file mode 100644 index 00000000..f6665eb5 --- /dev/null +++ b/data-raw/gpt_training_text_v2.1.1.9106.txt @@ -0,0 +1,34524 @@ +This files contains all context you must know about the AMR package for R. + + + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/aa_amr-package.R + + + + +#' The `AMR` Package +#' +#' @description +#' Welcome to the `AMR` package. +#' +#' The `AMR` package is a [free and open-source](https://msberends.github.io/AMR/#copyright) R package with [zero dependencies](https://en.wikipedia.org/wiki/Dependency_hell) to simplify the analysis and prediction of Antimicrobial Resistance (AMR) and to work with microbial and antimicrobial data and properties, by using evidence-based methods. **Our aim is to provide a standard** for clean and reproducible AMR data analysis, that can therefore empower epidemiological analyses to continuously enable surveillance and treatment evaluation in any setting. [Many different researchers](https://msberends.github.io/AMR/authors.html) from around the globe are continually helping us to make this a successful and durable project! +#' +#' This work was published in the Journal of Statistical Software (Volume 104(3); \doi{10.18637/jss.v104.i03}) and formed the basis of two PhD theses (\doi{10.33612/diss.177417131} and \doi{10.33612/diss.192486375}). +#' +#' After installing this package, R knows [**`r format_included_data_number(AMR::microorganisms)` microorganisms**](https://msberends.github.io/AMR/reference/microorganisms.html) (updated `r format(TAXONOMY_VERSION$GBIF$accessed_date, "%B %Y")`) and all [**`r format_included_data_number(nrow(AMR::antibiotics) + nrow(AMR::antivirals))` antibiotic, antimycotic and antiviral drugs**](https://msberends.github.io/AMR/reference/antibiotics.html) by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral clinical breakpoint guidelines from CLSI and EUCAST are included, even with epidemiological cut-off (ECOFF) values. It supports and can read any data format, including WHONET data. This package works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). **It was designed to work in any setting, including those with very limited resources**. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the public [University of Groningen](https://www.rug.nl), in collaboration with non-profit organisations [Certe Medical Diagnostics and Advice Foundation](https://www.certe.nl) and [University Medical Center Groningen](https://www.umcg.nl). +#' +#' The `AMR` package is available in `r vector_and(vapply(FUN.VALUE = character(1), LANGUAGES_SUPPORTED_NAMES, function(x) x$exonym), quotes = FALSE, sort = FALSE)`. Antimicrobial drug (group) names and colloquial microorganism names are provided in these languages. +#' @section Reference Data Publicly Available: +#' All data sets in this `AMR` package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @source +#' To cite AMR in publications use: +#' +#' Berends MS, Luz CF, Friedrich AW, Sinha BNM, Albers CJ, Glasner C (2022). "AMR: An R Package for Working with Antimicrobial Resistance Data." _Journal of Statistical Software_, *104*(3), 1-31. \doi{10.18637/jss.v104.i03} +#' +#' A BibTeX entry for LaTeX users is: +#' +#' \preformatted{ +#' `r format(citation("AMR"), style = "bib")` +#' } +#' @name AMR +#' @keywords internal +#' @rdname AMR +"_PACKAGE" + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/aa_globals.R + + + + +# add new version numbers here, and add the rules themselves to "data-raw/eucast_rules.tsv" and clinical_breakpoints +# (sourcing "data-raw/_pre_commit_checks.R" will process the TSV file) +EUCAST_VERSION_BREAKPOINTS <- list( + # "13.0" = list( + # version_txt = "v13.0", + # year = 2023, + # title = "'EUCAST Clinical Breakpoint Tables'", + # url = "https://www.eucast.org/clinical_breakpoints/" + # ), + "12.0" = list( + version_txt = "v12.0", + year = 2022, + title = "'EUCAST Clinical Breakpoint Tables'", + url = "https://www.eucast.org/clinical_breakpoints/" + ), + "11.0" = list( + version_txt = "v11.0", + year = 2021, + title = "'EUCAST Clinical Breakpoint Tables'", + url = "https://www.eucast.org/clinical_breakpoints/" + ), + "10.0" = list( + version_txt = "v10.0", + year = 2020, + title = "'EUCAST Clinical Breakpoint Tables'", + url = "https://www.eucast.org/ast_of_bacteria/previous_versions_of_documents/" + ) +) +EUCAST_VERSION_EXPERT_RULES <- list( + "3.3" = list( + version_txt = "v3.3", + year = 2021, + title = "'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes'", + url = "https://www.eucast.org/expert_rules_and_expected_phenotypes" + ), + "3.2" = list( + version_txt = "v3.2", + year = 2020, + title = "'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes'", + url = "https://www.eucast.org/expert_rules_and_expected_phenotypes" + ), + "3.1" = list( + version_txt = "v3.1", + year = 2016, + title = "'EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes'", + url = "https://www.eucast.org/expert_rules_and_expected_phenotypes" + ) +) +# EUCAST_VERSION_RESISTANTPHENOTYPES <- list( +# "1.2" = list( +# version_txt = "v1.2", +# year = 2023, +# title = "'Expected Resistant Phenotypes'", +# url = "https://www.eucast.org/expert_rules_and_expected_phenotypes" +# ) +# ) + +TAXONOMY_VERSION <- list( + GBIF = list( + name = "Global Biodiversity Information Facility (GBIF)", + accessed_date = as.Date("2024-06-24"), + citation = "GBIF Secretariat (2023). GBIF Backbone Taxonomy. Checklist dataset \\doi{10.15468/39omei}.", + url = "https://www.gbif.org" + ), + LPSN = list( + name = "List of Prokaryotic names with Standing in Nomenclature (LPSN)", + accessed_date = as.Date("2024-06-24"), + citation = "Parte, AC *et al.* (2020). **List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.** International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \\doi{10.1099/ijsem.0.004332}.", + url = "https://lpsn.dsmz.de" + ), + MycoBank = list( + name = "MycoBank", + accessed_date = as.Date("2024-06-24"), + citation = "Vincent, R *et al* (2013). **MycoBank gearing up for new horizons.** IMA Fungus, 4(2), 371-9; \\doi{10.5598/imafungus.2013.04.02.16}.", + url = "https://www.mycobank.org" + ), + BacDive = list( + name = "BacDive", + accessed_date = as.Date("2024-07-16"), + citation = "Reimer, LC *et al.* (2022). ***BacDive* in 2022: the knowledge base for standardized bacterial and archaeal data.** Nucleic Acids Res., 50(D1):D741-D74; \\doi{10.1093/nar/gkab961}.", + url = "https://bacdive.dsmz.de" + ), + SNOMED = list( + name = "Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT)", + accessed_date = as.Date("2024-07-16"), + citation = "Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microorganism', OID 2.16.840.1.114222.4.11.1009 (v12).", + url = "https://phinvads.cdc.gov" + ), + LOINC = list( + name = "Logical Observation Identifiers Names and Codes (LOINC)", + accessed_date = as.Date("2023-10-19"), + citation = "Logical Observation Identifiers Names and Codes (LOINC), Version 2.76 (18 September, 2023).", + url = "https://loinc.org" + ) +) + +globalVariables(c( + ".GenericCallEnv", + ".mo", + ".rowid", + ".syndromic_group", + "ab", + "ab_txt", + "affect_ab_name", + "affect_mo_name", + "angle", + "antibiotic", + "antibiotics", + "atc_group1", + "atc_group2", + "base_ab", + "ci_max", + "ci_min", + "clinical_breakpoints", + "code", + "cols", + "count", + "data", + "disk", + "dosage", + "dose", + "dose_times", + "fullname", + "fullname_lower", + "g_species", + "genus", + "gr", + "group", + "guideline", + "hjust", + "host_index", + "host_match", + "input", + "intrinsic_resistant", + "isolates", + "lang", + "language", + "lookup", + "method", + "mic ", + "mic", + "microorganism", + "microorganisms", + "microorganisms.codes", + "mo", + "n", + "name", + "new", + "numerator", + "observations", + "old", + "old_name", + "pattern", + "R", + "rank_index", + "ref_tbl", + "reference.rule", + "reference.rule_group", + "reference.version", + "rowid", + "rule_group", + "rule_name", + "se_max", + "se_min", + "SI", + "sir", + "species", + "syndromic_group", + "total", + "txt", + "type", + "uti_index", + "value", + "varname", + "x", + "xvar", + "y", + "year", + "yvar" +)) + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/aa_helper_functions.R + + + + +# faster implementation of left_join than using merge() by poorman - we use match(): +pm_left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { + if (is.null(by)) { + by <- intersect(names(x), names(y))[1L] + if (is.na(by)) { + stop_("no common column found for pm_left_join()") + } + pm_join_message(by) + } else if (!is.null(names(by))) { + by <- unname(c(names(by), by)) + } + if (length(by) == 1) { + by <- rep(by, 2) + } + + int_x <- colnames(x) %in% colnames(y) & colnames(x) != by[1] + int_y <- colnames(y) %in% colnames(x) & colnames(y) != by[2] + colnames(x)[int_x] <- paste0(colnames(x)[int_x], suffix[1L]) + colnames(y)[int_y] <- paste0(colnames(y)[int_y], suffix[2L]) + + merged <- cbind( + x, + y[ + match( + x[, by[1], drop = TRUE], + y[, by[2], drop = TRUE] + ), + colnames(y)[!colnames(y) %in% colnames(x) & !colnames(y) == by[2]], + drop = FALSE + ] + ) + + rownames(merged) <- NULL + merged +} + +# support where() like tidyverse (this function will also be used when running `antibiogram()`): +where <- function(fn) { + # based on https://github.com/nathaneastwood/poorman/blob/52eb6947e0b4430cd588976ed8820013eddf955f/R/where.R#L17-L32 + if (!is.function(fn)) { + stop_("`", deparse(substitute(fn)), "()` is not a valid predicate function.") + } + df <- pm_select_env$.data + cols <- pm_select_env$get_colnames() + if (is.null(df)) { + df <- get_current_data("where", call = FALSE) + cols <- colnames(df) + } + preds <- unlist(lapply( + df, + function(x, fn) { + do.call("fn", list(x)) + }, + fn + )) + if (!is.logical(preds)) stop_("`where()` must be used with functions that return `TRUE` or `FALSE`.") + data_cols <- cols + cols <- data_cols[preds] + which(data_cols %in% cols) +} + +# copied and slightly rewritten from {poorman} under permissive license (2021-10-15) +# https://github.com/nathaneastwood/poorman, MIT licensed, Nathan Eastwood, 2020 +case_when_AMR <- function(...) { + fs <- list(...) + lapply(fs, function(x) { + if (!inherits(x, "formula")) { + stop("`case_when()` requires formula inputs.") + } + }) + n <- length(fs) + if (n == 0L) { + stop("No cases provided.") + } + + validate_case_when_length <- function(query, value, fs) { + lhs_lengths <- lengths(query) + rhs_lengths <- lengths(value) + all_lengths <- unique(c(lhs_lengths, rhs_lengths)) + if (length(all_lengths) <= 1L) { + return(all_lengths[[1L]]) + } + non_atomic_lengths <- all_lengths[all_lengths != 1L] + len <- non_atomic_lengths[[1L]] + if (length(non_atomic_lengths) == 1L) { + return(len) + } + inconsistent_lengths <- non_atomic_lengths[-1L] + lhs_problems <- lhs_lengths %in% inconsistent_lengths + rhs_problems <- rhs_lengths %in% inconsistent_lengths + problems <- lhs_problems | rhs_problems + if (any(problems)) { + stop("The following formulas must be length ", len, " or 1, not ", + paste(inconsistent_lengths, collapse = ", "), ".\n ", + paste(fs[problems], collapse = "\n "), + call. = FALSE + ) + } + } + + replace_with <- function(x, i, val, arg_name) { + if (is.null(val)) { + return(x) + } + i[is.na(i)] <- FALSE + if (length(val) == 1L) { + x[i] <- val + } else { + x[i] <- val[i] + } + x + } + + query <- vector("list", n) + value <- vector("list", n) + default_env <- parent.frame() + for (i in seq_len(n)) { + query[[i]] <- eval(fs[[i]][[2]], envir = default_env) + value[[i]] <- eval(fs[[i]][[3]], envir = default_env) + if (!is.logical(query[[i]])) { + stop(fs[[i]][[2]], " does not return a `logical` vector.") + } + } + m <- validate_case_when_length(query, value, fs) + out <- value[[1]][rep(NA_integer_, m)] + replaced <- rep(FALSE, m) + for (i in seq_len(n)) { + out <- replace_with( + out, query[[i]] & !replaced, value[[i]], + NULL + ) + replaced <- replaced | (query[[i]] & !is.na(query[[i]])) + } + out +} + +rbind_AMR <- function(...) { + # this is just rbind(), but with the functionality of dplyr::bind_rows(), + # to allow differences in available columns + l <- list(...) + l_names <- unique(unlist(lapply(l, names))) + l_new <- lapply(l, function(df) { + rownames(df) <- NULL + for (col in l_names[!l_names %in% colnames(df)]) { + # create the new column, could also be length 0 + df[, col] <- rep(NA, NROW(df)) + } + df + }) + do.call(rbind, l_new) +} + +# No export, no Rd +addin_insert_in <- function() { + import_fn("insertText", "rstudioapi")(" %in% ") +} + +# No export, no Rd +addin_insert_like <- function() { + # we want Shift + Ctrl/Cmd + L to iterate over %like%, %unlike%, %like_case%, and %unlike_case% + + getActiveDocumentContext <- import_fn("getActiveDocumentContext", "rstudioapi") + insertText <- import_fn("insertText", "rstudioapi") + modifyRange <- import_fn("modifyRange", "rstudioapi") + document_range <- import_fn("document_range", "rstudioapi") + document_position <- import_fn("document_position", "rstudioapi") + + context <- getActiveDocumentContext() + current_row <- context$selection[[1]]$range$end[1] + current_col <- context$selection[[1]]$range$end[2] + current_row_txt <- context$contents[current_row] + if (is.null(current_row) || current_row_txt %unlike% "%(un)?like") { + insertText(" %like% ") + return(invisible()) + } + + pos_preceded_by <- function(txt) { + if (tryCatch(substr(current_row_txt, current_col - nchar(trimws(txt, which = "right")), current_col) == trimws(txt, which = "right"), + error = function(e) FALSE + )) { + return(TRUE) + } + tryCatch(substr(current_row_txt, current_col - nchar(txt), current_col) %like% paste0("^", txt), + error = function(e) FALSE + ) + } + replace_pos <- function(old, with) { + modifyRange( + document_range( + document_position(current_row, current_col - nchar(old)), + document_position(current_row, current_col) + ), + text = with, + id = context$id + ) + } + + if (pos_preceded_by(" %like% ")) { + replace_pos(" %like% ", with = " %unlike% ") + } else if (pos_preceded_by(" %unlike% ")) { + replace_pos(" %unlike% ", with = " %like_case% ") + } else if (pos_preceded_by(" %like_case% ")) { + replace_pos(" %like_case% ", with = " %unlike_case% ") + } else if (pos_preceded_by(" %unlike_case% ")) { + replace_pos(" %unlike_case% ", with = " %like% ") + } else { + insertText(" %like% ") + } +} + +search_type_in_df <- function(x, type, info = TRUE, add_col_prefix = TRUE) { + meet_criteria(x, allow_class = "data.frame") + meet_criteria(type, allow_class = "character", has_length = 1) + + # try to find columns based on type + found <- NULL + + # remove attributes from other packages + x <- as.data.frame(x, stringsAsFactors = FALSE) + colnames_formatted <- tolower(generalise_antibiotic_name(colnames(x))) + + # -- mo + if (type == "mo") { + add_MO_lookup_to_AMR_env() + + if (any(vapply(FUN.VALUE = logical(1), x, is.mo))) { + # take first 'mo' column + found <- colnames(x)[vapply(FUN.VALUE = logical(1), x, is.mo)] + } else if ("mo" %in% colnames_formatted && + suppressWarnings(all(x$mo %in% c(NA, AMR_env$MO_lookup$mo)))) { + found <- "mo" + } else if (any(colnames_formatted %like_case% "^(mo|microorganism|organism|bacteria|ba[ck]terie)s?$")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "^(mo|microorganism|organism|bacteria|ba[ck]terie)s?$"]) + } else if (any(colnames_formatted %like_case% "^(microorganism|organism|bacteria|ba[ck]terie)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "^(microorganism|organism|bacteria|ba[ck]terie)"]) + } else if (any(colnames_formatted %like_case% "species")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "species"]) + } + } + # -- key antibiotics + if (type %in% c("keyantibiotics", "keyantimicrobials")) { + if (any(colnames_formatted %like_case% "^key.*(ab|antibiotics|antimicrobials)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "^key.*(ab|antibiotics|antimicrobials)"]) + } + } + # -- date + if (type == "date") { + if (any(colnames_formatted %like_case% "^(specimen date|specimen_date|spec_date)")) { + # WHONET support + found <- sort(colnames(x)[colnames_formatted %like_case% "^(specimen date|specimen_date|spec_date)"]) + if (!inherits(pm_pull(x, found), c("Date", "POSIXct"))) { + stop( + font_red(paste0( + "Found column '", font_bold(found), "' to be used as input for `", ifelse(add_col_prefix, "col_", ""), type, + "`, but this column contains no valid dates. Transform its values to valid dates first." + )), + call. = FALSE + ) + } + } else if (any(vapply(FUN.VALUE = logical(1), x, function(x) inherits(x, c("Date", "POSIXct"))))) { + # take first column + found <- colnames(x)[vapply(FUN.VALUE = logical(1), x, function(x) inherits(x, c("Date", "POSIXct")))] + } + } + # -- patient id + if (type == "patient_id") { + crit1 <- colnames_formatted %like_case% "^(patient|patid)" + if (any(crit1)) { + found <- colnames(x)[crit1] + } else { + crit2 <- colnames_formatted %like_case% "(identification |patient|pat.*id)" + if (any(crit2)) { + found <- colnames(x)[crit2] + } + } + } + # -- specimen + if (type == "specimen") { + if (any(colnames_formatted %like_case% "(specimen type|spec_type)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "(specimen type|spec_type)"]) + } else if (any(colnames_formatted %like_case% "^(specimen)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "^(specimen)"]) + } + } + # -- host (animals) + if (type == "host") { + if (any(colnames_formatted %like_case% "^(host|animal)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "^(host|animal)"]) + } else if (any(colnames_formatted %like_case% "((^|[^A-Za-z])host($|[^A-Za-z])|animal)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "((^|[^A-Za-z])host($|[^A-Za-z])|animal)"]) + } + } + # -- UTI (urinary tract infection) + if (type == "uti") { + if (any(colnames_formatted == "uti")) { + found <- colnames(x)[colnames_formatted == "uti"] + } else if (any(colnames_formatted %like_case% "(urine|urinary)")) { + found <- sort(colnames(x)[colnames_formatted %like_case% "(urine|urinary)"]) + } + if (!is.null(found)) { + # this column should contain logicals + if (!is.logical(x[, found, drop = TRUE])) { + message_("Column '", font_bold(found), "' found as input for `", ifelse(add_col_prefix, "col_", ""), type, + "`, but this column does not contain 'logical' values (TRUE/FALSE) and was ignored.", + add_fn = font_red + ) + found <- NULL + } + } + } + + found <- found[1] + + if (!is.null(found) && isTRUE(info)) { + if (message_not_thrown_before("search_in_type", type)) { + msg <- paste0("Using column '", font_bold(found), "' as input for `", ifelse(add_col_prefix, "col_", ""), type, "`.") + if (type %in% c("keyantibiotics", "keyantimicrobials", "specimen")) { + msg <- paste(msg, "Use", font_bold(paste0(ifelse(add_col_prefix, "col_", ""), type), "= FALSE"), "to prevent this.") + } + message_(msg) + } + } + found +} + +is_valid_regex <- function(x) { + regex_at_all <- tryCatch( + vapply( + FUN.VALUE = logical(1), + X = strsplit(x, "", fixed = TRUE), + FUN = function(y) { + any( + y %in% c( + "$", "(", ")", "*", "+", "-", + ".", "?", "[", "]", "^", "{", + "|", "}", "\\" + ), + na.rm = TRUE + ) + }, + USE.NAMES = FALSE + ), + error = function(e) rep(TRUE, length(x)) + ) + regex_valid <- vapply( + FUN.VALUE = logical(1), + X = x, + FUN = function(y) { + !inherits(try(grepl(y, "", perl = TRUE), silent = TRUE), "try-error") + }, + USE.NAMES = FALSE + ) + regex_at_all & regex_valid +} + +stop_ifnot_installed <- function(package) { + installed <- vapply(FUN.VALUE = logical(1), package, requireNamespace, quietly = TRUE) + if (any(!installed) && any(package == "rstudioapi")) { + stop("This function only works in RStudio when using R >= 3.2.", call. = FALSE) + } else if (any(!installed)) { + stop("This requires the ", vector_and(package[!installed]), " package.", + "\nTry to install with install.packages().", + call. = FALSE + ) + } else { + return(invisible()) + } +} + +pkg_is_available <- function(pkg, also_load = FALSE, min_version = NULL) { + if (also_load == TRUE) { + out <- suppressWarnings(require(pkg, character.only = TRUE, warn.conflicts = FALSE)) + } else { + out <- requireNamespace(pkg, quietly = TRUE) + } + if (!is.null(min_version)) { + out <- out && utils::packageVersion(pkg) >= min_version + } + isTRUE(out) +} + +import_fn <- function(name, pkg, error_on_fail = TRUE) { + if (isTRUE(error_on_fail)) { + stop_ifnot_installed(pkg) + } + tryCatch( + # don't use get() to avoid fetching non-API functions + getExportedValue(name = name, ns = asNamespace(pkg)), + error = function(e) { + if (isTRUE(error_on_fail)) { + stop_("function `", name, "()` is not an exported object from package '", pkg, + "'. Please create an issue at ", font_url("https://github.com/msberends/AMR/issues"), ". Many thanks!", + call = FALSE + ) + } else { + return(NULL) + } + } + ) +} + +# this alternative wrapper to the message(), warning() and stop() functions: +# - wraps text to never break lines within words +# - ignores formatted text while wrapping +# - adds indentation dependent on the type of message (such as NOTE) +# - can add additional formatting functions like blue or bold text +word_wrap <- function(..., + add_fn = list(), + as_note = FALSE, + width = 0.95 * getOption("width"), + extra_indent = 0) { + msg <- paste0(c(...), collapse = "") + + if (isTRUE(as_note)) { + msg <- paste0(AMR_env$info_icon, " ", gsub("^note:? ?", "", msg, ignore.case = TRUE)) + } + + if (msg %like% "\n") { + # run word_wraps() over every line here, bind them and return again + return(paste0( + vapply( + FUN.VALUE = character(1), + trimws(unlist(strsplit(msg, "\n", fixed = TRUE)), which = "right"), + word_wrap, + add_fn = add_fn, + as_note = FALSE, + width = width, + extra_indent = extra_indent + ), + collapse = "\n" + )) + } + + # correct for operators (will add the space later on) + ops <- "([,./><\\]\\[])" + msg <- gsub(paste0(ops, " ", ops), "\\1\\2", msg, perl = TRUE) + # we need to correct for already applied style, that adds text like "\033[31m\" + msg_stripped <- gsub("(.*)?\\033\\]8;;.*\\a(.*?)\\033\\]8;;\\a(.*)", "\\1\\2\\3", msg, perl = TRUE) # for font_url() + msg_stripped <- font_stripstyle(msg_stripped) + # where are the spaces now? + msg_stripped_wrapped <- paste0( + strwrap(msg_stripped, + simplify = TRUE, + width = width + ), + collapse = "\n" + ) + msg_stripped_wrapped <- paste0(unlist(strsplit(msg_stripped_wrapped, "(\n|\\*\\|\\*)")), + collapse = "\n" + ) + msg_stripped_spaces <- which(unlist(strsplit(msg_stripped, "", fixed = TRUE)) == " ") + msg_stripped_wrapped_spaces <- which(unlist(strsplit(msg_stripped_wrapped, "", fixed = TRUE)) != "\n") + # so these are the indices of spaces that need to be replaced + replace_spaces <- which(!msg_stripped_spaces %in% msg_stripped_wrapped_spaces) + # put it together + msg <- unlist(strsplit(msg, " ", fixed = TRUE)) + msg[replace_spaces] <- paste0(msg[replace_spaces], "\n") + # add space around operators again + msg <- gsub(paste0(ops, ops), "\\1 \\2", msg, perl = TRUE) + msg <- paste0(msg, collapse = " ") + msg <- gsub("\n ", "\n", msg, fixed = TRUE) + + if (msg_stripped %like% "\u2139 ") { + indentation <- 2 + extra_indent + } else if (msg_stripped %like% "^=> ") { + indentation <- 3 + extra_indent + } else { + indentation <- 0 + extra_indent + } + msg <- gsub("\n", paste0("\n", strrep(" ", indentation)), msg, fixed = TRUE) + # remove trailing empty characters + msg <- gsub("(\n| )+$", "", msg) + + if (length(add_fn) > 0) { + if (!is.list(add_fn)) { + add_fn <- list(add_fn) + } + for (i in seq_len(length(add_fn))) { + msg <- add_fn[[i]](msg) + } + } + + # format backticks + if (pkg_is_available("cli") && + tryCatch(isTRUE(getExportedValue("ansi_has_hyperlink_support", ns = asNamespace("cli"))()), error = function(e) FALSE) && + tryCatch(getExportedValue("isAvailable", ns = asNamespace("rstudioapi"))(), error = function(e) return(FALSE)) && + tryCatch(getExportedValue("versionInfo", ns = asNamespace("rstudioapi"))()$version > "2023.6.0.0", error = function(e) return(FALSE))) { + # we are in a recent version of RStudio, so do something nice: add links to our help pages in the console. + parts <- strsplit(msg, "`", fixed = TRUE)[[1]] + cmds <- parts %in% paste0(ls(envir = asNamespace("AMR")), "()") + # functions with a dot are not allowed: https://github.com/rstudio/rstudio/issues/11273#issuecomment-1156193252 + # lead them to the help page of our package + parts[cmds & parts %like% "[.]"] <- font_url(url = paste0("ide:help:AMR::", gsub("()", "", parts[cmds & parts %like% "[.]"], fixed = TRUE)), + txt = parts[cmds & parts %like% "[.]"]) + # otherwise, give a 'click to run' popup + parts[cmds & parts %unlike% "[.]"] <- font_url(url = paste0("ide:run:AMR::", parts[cmds & parts %unlike% "[.]"]), + txt = parts[cmds & parts %unlike% "[.]"]) + # text starting with `?` must also lead to the help page + parts[parts %like% "^[?]"] <- font_url(url = paste0("ide:help:AMR::", gsub("()", "", gsub("^[?]", "", parts[parts %like% "^[?]"]), fixed = TRUE)), + txt = parts[parts %like% "^[?]"]) + msg <- paste0(parts, collapse = "`") + } + msg <- gsub("`(.+?)`", font_grey_bg("\\1"), msg) + + # clean introduced whitespace in between fullstops + msg <- gsub("[.] +[.]", "..", msg) + # remove extra space that was introduced (e.g. "Smith et al. , 2022") + msg <- gsub(". ,", ".,", msg, fixed = TRUE) + msg <- gsub("[ ,", "[,", msg, fixed = TRUE) + msg <- gsub("/ /", "//", msg, fixed = TRUE) + + msg +} + +message_ <- function(..., + appendLF = TRUE, + add_fn = list(font_blue), + as_note = TRUE) { + message( + word_wrap(..., + add_fn = add_fn, + as_note = as_note + ), + appendLF = appendLF + ) +} + +warning_ <- function(..., + add_fn = list(), + immediate = FALSE, + call = FALSE) { + warning( + trimws2(word_wrap(..., + add_fn = add_fn, + as_note = FALSE + )), + immediate. = immediate, + call. = call + ) +} + +# this alternative to the stop() function: +# - adds the function name where the error was thrown +# - wraps text to never break lines within words +stop_ <- function(..., call = TRUE) { + msg <- paste0(c(...), collapse = "") + msg_call <- "" + if (!isFALSE(call)) { + if (isTRUE(call)) { + call <- as.character(sys.call(-1)[1]) + } else { + # so you can go back more than 1 call, as used in sir_calc(), that now throws a reference to e.g. n_sir() + call <- as.character(sys.call(call)[1]) + } + msg_call <- paste0("in ", call, "():") + } + msg <- trimws2(word_wrap(msg, add_fn = list(), as_note = FALSE)) + if (!is.null(AMR_env$cli_abort) && length(unlist(strsplit(msg, "\n", fixed = TRUE))) <= 1) { + if (is.character(call)) { + call <- as.call(str2lang(paste0(call, "()"))) + } else { + call <- NULL + } + AMR_env$cli_abort(msg, call = call) + } else { + stop(paste(msg_call, msg), call. = FALSE) + } +} + +stop_if <- function(expr, ..., call = TRUE) { + if (isTRUE(expr)) { + if (isTRUE(call)) { + call <- -1 + } + if (!isFALSE(call)) { + # since we're calling stop_(), which is another call + call <- call - 1 + } + stop_(..., call = call) + } +} + +stop_ifnot <- function(expr, ..., call = TRUE) { + if (isFALSE(expr)) { + if (isTRUE(call)) { + call <- -1 + } + if (!isFALSE(call)) { + # since we're calling stop_(), which is another call + call <- call - 1 + } + stop_(..., call = call) + } +} + +"%or%" <- function(x, y) { + if (is.null(x) || is.null(y)) { + if (is.null(x)) { + return(y) + } else { + return(x) + } + } + ifelse(is.na(x), y, x) +} + +return_after_integrity_check <- function(value, type, check_vector) { + if (!all(value[!is.na(value)] %in% check_vector)) { + warning_(paste0("invalid ", type, ", NA generated")) + value[!value %in% check_vector] <- NA + } + value +} + +# transforms data set to a tibble with only ASCII values, to comply with CRAN policies +dataset_UTF8_to_ASCII <- function(df) { + trans <- function(vect) { + iconv(vect, from = "UTF-8", to = "ASCII//TRANSLIT") + } + df <- as.data.frame(df, stringsAsFactors = FALSE) + for (i in seq_len(NCOL(df))) { + col <- df[, i] + if (is.list(col)) { + col <- lapply(col, function(j) trans(j)) + df[, i] <- list(col) + } else { + if (is.factor(col)) { + levels(col) <- trans(levels(col)) + } else if (is.character(col)) { + col <- trans(col) + } else { + col + } + df[, i] <- col + } + } + import_fn("as_tibble", "tibble")(df) +} + +documentation_date <- function(d) { + day <- as.integer(format(d, "%e")) + suffix <- rep("th", length(day)) + suffix[day %in% c(1, 21, 31)] <- "st" + suffix[day %in% c(2, 22)] <- "nd" + suffix[day %in% c(3, 23)] <- "rd" + paste0(month.name[as.integer(format(d, "%m"))], " ", day, suffix, ", ", format(d, "%Y")) +} + +format_included_data_number <- function(data) { + if (is.numeric(data) && length(data) == 1) { + n <- data + } else if (is.data.frame(data)) { + n <- nrow(data) + } else { + n <- length(unique(data)) + } + if (n > 10000) { + rounder <- -3 # round on thousands + } else if (n > 1000) { + rounder <- -2 # round on hundreds + } else if (n < 50) { + # do not round + rounder <- 0 + } else { + rounder <- -1 # round on tens + } + paste0(ifelse(rounder == 0, "", "~"), format(round(n, rounder), decimal.mark = ".", big.mark = " ")) +} + +# for eucast_rules() and mdro(), creates markdown output with URLs and names +create_eucast_ab_documentation <- function() { + x <- trimws(unique(toupper(unlist(strsplit(EUCAST_RULES_DF$then_change_these_antibiotics, ",", fixed = TRUE))))) + ab <- character() + for (val in x) { + if (paste0("AB_", val) %in% ls(envir = asNamespace("AMR"))) { + # antibiotic group names, as defined in data-raw/_pre_commit_checks.R, such as `CARBAPENEMS` + val <- eval(parse(text = paste0("AB_", val)), envir = asNamespace("AMR")) + } else if (val %in% AMR_env$AB_lookup$ab) { + # separate drugs, such as `AMX` + val <- as.ab(val) + } else { + val <- as.sir(NA) + } + ab <- c(ab, val) + } + ab <- unique(ab) + atcs <- ab_atc(ab, only_first = TRUE) + # only keep ABx with an ATC code: + ab <- ab[!is.na(atcs)] + atcs <- atcs[!is.na(atcs)] + + # sort all vectors on name: + ab_names <- ab_name(ab, language = NULL, tolower = TRUE) + ab <- ab[order(ab_names)] + atcs <- atcs[order(ab_names)] + ab_names <- ab_names[order(ab_names)] + # create the text: + atc_txt <- paste0("[", atcs, "](", ab_url(ab), ")") + out <- paste0(ab_names, " (`", ab, "`, ", atc_txt, ")", collapse = ", ") + substr(out, 1, 1) <- toupper(substr(out, 1, 1)) + out +} + +vector_or <- function(v, quotes = TRUE, reverse = FALSE, sort = TRUE, initial_captital = FALSE, last_sep = " or ") { + # makes unique and sorts, and this also removed NAs + v <- unique(v) + if (isTRUE(sort)) { + v <- sort(v) + } + if (isTRUE(reverse)) { + v <- rev(v) + } + if (isTRUE(quotes)) { + quotes <- '"' + } else if (isFALSE(quotes)) { + quotes <- "" + } else { + quotes <- quotes[1L] + } + if (isTRUE(initial_captital)) { + v[1] <- gsub("^([a-z])", "\\U\\1", v[1], perl = TRUE) + } + if (length(v) <= 1) { + return(paste0(quotes, v, quotes)) + } + if (identical(v, c("I", "R", "S"))) { + # class 'sir' should be sorted like this + v <- c("S", "I", "R") + } + if (identical(v, c("I", "NI", "R", "S", "SDD"))) { + # class 'sir' should be sorted like this + v <- c("S", "SDD", "I", "R", "NI") + } + # oxford comma + if (last_sep %in% c(" or ", " and ") && length(v) > 2) { + last_sep <- paste0(",", last_sep) + } + # all commas except for last item, so will become '"val1", "val2", "val3" or "val4"' + paste0( + paste0(quotes, v[seq_len(length(v) - 1)], quotes, collapse = ", "), + last_sep, paste0(quotes, v[length(v)], quotes) + ) +} + +vector_and <- function(v, quotes = TRUE, reverse = FALSE, sort = TRUE, initial_captital = FALSE) { + vector_or( + v = v, quotes = quotes, reverse = reverse, sort = sort, + initial_captital = initial_captital, last_sep = " and " + ) +} + +format_class <- function(class, plural = FALSE) { + class.bak <- class + class[class == "numeric"] <- "number" + class[class == "integer"] <- "whole number" + if (all(c("numeric", "integer") %in% class.bak, na.rm = TRUE)) { + class[class %in% c("number", "whole number")] <- "(whole) number" + } + class[class == "character"] <- "text string" + class[class == "Date"] <- "date" + class[class %in% c("POSIXt", "POSIXct", "POSIXlt")] <- "date/time" + class[class != class.bak] <- paste0( + ifelse(plural, "", "a "), + class[class != class.bak], + ifelse(plural, "s", "") + ) + # exceptions + class[class == "logical"] <- ifelse(plural, "a vector of `TRUE`/`FALSE`", "`TRUE` or `FALSE`") + class[class == "data.frame"] <- "a data set" + if ("list" %in% class) { + class <- "a list" + } + if ("matrix" %in% class) { + class <- "a matrix" + } + if ("custom_eucast_rules" %in% class) { + class <- "input created with `custom_eucast_rules()`" + } + if (any(c("mo", "ab", "sir") %in% class)) { + class <- paste0("of class '", class[1L], "'") + } + class[class == class.bak] <- paste0("of class '", class[class == class.bak], "'") + # output + vector_or(class, quotes = FALSE, sort = FALSE) +} + +# a check for every single argument in all functions +meet_criteria <- function(object, # can be literally `list(...)` for `allow_arguments_from` + allow_class = NULL, + has_length = NULL, + looks_like = NULL, + is_in = NULL, + is_positive = NULL, + is_positive_or_zero = NULL, + is_finite = NULL, + allow_NULL = FALSE, + allow_NA = FALSE, + ignore.case = FALSE, + allow_arguments_from = NULL, # 1 function, or a list of functions + .call_depth = 0) { # depth in calling + + obj_name <- deparse(substitute(object)) + call_depth <- -2 - abs(.call_depth) + + # if object is missing, or another error: + tryCatch(invisible(object), + error = function(e) AMR_env$meet_criteria_error_txt <- e$message + ) + if (!is.null(AMR_env$meet_criteria_error_txt)) { + error_txt <- AMR_env$meet_criteria_error_txt + AMR_env$meet_criteria_error_txt <- NULL + stop(error_txt, call. = FALSE) # don't use stop_() here, our pkg may not be loaded yet + } + AMR_env$meet_criteria_error_txt <- NULL + + if (is.null(object)) { + stop_if(allow_NULL == FALSE, "argument `", obj_name, "` must not be NULL", call = call_depth) + return(invisible()) + } + if (is.null(dim(object)) && length(object) == 1 && suppressWarnings(is.na(object))) { # suppressWarnings for functions + stop_if(allow_NA == FALSE, "argument `", obj_name, "` must not be NA", call = call_depth) + return(invisible()) + } + + if (identical(class(object), "list") && !"list" %in% allow_class) { + # coming from Python, possibly - turn lists (not data.frame) to the underlying data type + object <- unlist(object) + } + + if (!is.null(allow_class) && !(suppressWarnings(all(is.na(object))) && allow_NA == TRUE)) { + stop_ifnot(inherits(object, allow_class), "argument `", obj_name, + "` must be ", format_class(allow_class, plural = isTRUE(has_length > 1)), + ", i.e. not be ", format_class(class(object), plural = isTRUE(has_length > 1)), + call = call_depth + ) + # check data.frames for data + if (inherits(object, "data.frame")) { + stop_if(any(dim(object) == 0), + "the data provided in argument `", obj_name, + "` must contain rows and columns (current dimensions: ", + paste(dim(object), collapse = "x"), ")", + call = call_depth + ) + } + } + if (!is.null(has_length)) { + stop_ifnot(length(object) %in% has_length, "argument `", obj_name, + "` must ", # ifelse(allow_NULL, "be NULL or must ", ""), + "be of length ", vector_or(has_length, quotes = FALSE), + ", not ", length(object), + call = call_depth + ) + } + if (!is.null(looks_like)) { + stop_ifnot(object %like% looks_like, "argument `", obj_name, + "` must ", # ifelse(allow_NULL, "be NULL or must ", ""), + "resemble the regular expression \"", looks_like, "\"", + call = call_depth + ) + } + if (!is.null(is_in)) { + if (ignore.case == TRUE) { + object <- tolower(object) + is_in <- tolower(is_in) + } + is_in.bak <- is_in + if ("logical" %in% allow_class) { + is_in <- is_in[!is_in %in% c("TRUE", "FALSE")] + } + or_values <- vector_or(is_in, quotes = !isTRUE(any(c("numeric", "integer") %in% allow_class))) + if ("logical" %in% allow_class) { + or_values <- paste0(or_values, ", or TRUE or FALSE") + } + stop_ifnot(all(object %in% is_in.bak, na.rm = TRUE), "argument `", obj_name, "` ", + ifelse(!is.null(has_length) && length(has_length) == 1 && has_length == 1, + "must be either ", + "must only contain values " + ), + or_values, + ifelse(allow_NA == TRUE, ", or NA", ""), + call = call_depth + ) + } + if (isTRUE(is_positive)) { + stop_if(is.numeric(object) && !all(object > 0, na.rm = TRUE), "argument `", obj_name, + "` must ", + ifelse(!is.null(has_length) && length(has_length) == 1 && has_length == 1, + "be a number higher than zero", + "all be numbers higher than zero" + ), + call = call_depth + ) + } + if (isTRUE(is_positive_or_zero)) { + stop_if(is.numeric(object) && !all(object >= 0, na.rm = TRUE), "argument `", obj_name, + "` must ", + ifelse(!is.null(has_length) && length(has_length) == 1 && has_length == 1, + "be zero or a positive number", + "all be zero or numbers higher than zero" + ), + call = call_depth + ) + } + if (isTRUE(is_finite)) { + stop_if(is.numeric(object) && !all(is.finite(object[!is.na(object)]), na.rm = TRUE), "argument `", obj_name, + "` must ", + ifelse(!is.null(has_length) && length(has_length) == 1 && has_length == 1, + "be a finite number", + "all be finite numbers" + ), + " (i.e. not be infinite)", + call = call_depth + ) + } + if (!is.null(allow_arguments_from) && !is.null(names(object))) { + args_given <- names(object) + if (is.function(allow_arguments_from)) { + allow_arguments_from <- list(allow_arguments_from) + } + args_allowed <- sort(unique(unlist(lapply(allow_arguments_from, function(x) names(formals(x)))))) + args_allowed <- args_allowed[args_allowed != "..."] + disallowed <- args_given[!args_given %in% args_allowed] + stop_if(length(disallowed) > 0, + ifelse(length(disallowed) == 1, + paste("the argument", vector_and(disallowed), "is"), + paste("the arguments", vector_and(disallowed), "are") + ), + " not valid. Valid arguments are: ", + vector_and(args_allowed), ".", + call = call_depth + ) + } + return(invisible()) +} + +ascertain_sir_classes <- function(x, obj_name) { + sirs <- vapply(FUN.VALUE = logical(1), x, is.sir) + if (!any(sirs, na.rm = TRUE)) { + warning_("the data provided in argument `", obj_name, + "` should contain at least one column of class 'sir'. Eligible SIR column were now guessed. ", + "See `?as.sir`.") + sirs_eligible <- is_sir_eligible(x) + for (col in colnames(x)[sirs_eligible]) { + x[[col]] <- as.sir(x[[col]]) + } + } + x +} + +get_current_data <- function(arg_name, call) { + valid_df <- function(x) { + !is.null(x) && is.data.frame(x) + } + + frms <- sys.frames() + + # check dplyr environments to support dplyr groups + with_mask <- vapply(FUN.VALUE = logical(1), frms, function(e) !is.null(e$mask)) + for (env in frms[which(with_mask)]) { + if (is.function(env$mask$current_rows) && (valid_df(env$data) || valid_df(env$`.data`))) { + # an element `.data` or `data` (containing all data) and `mask` (containing functions) will be in the environment when using dplyr verbs + # we use their mask$current_rows() to get the group rows, since dplyr::cur_data_all() is deprecated and will be removed in the future + # e.g. for `example_isolates %>% group_by(ward) %>% mutate(first = first_isolate(.))` + if (valid_df(env$data)) { + # support for dplyr 1.1.x + df <- env$data + } else { + # support for dplyr 1.0.x + df <- env$`.data` + } + rows <- tryCatch(env$mask$current_rows(), error = function(e) seq_len(NROW(df))) + return(df[rows, , drop = FALSE]) + } + } + + # now go over all underlying environments looking for other dplyr, data.table and base R selection environments + with_generic <- vapply(FUN.VALUE = logical(1), frms, function(e) !is.null(e$`.Generic`)) + for (env in frms[which(with_generic)]) { + if (valid_df(env$`.data`)) { + # an element `.data` will be in the environment when using dplyr::select() + return(env$`.data`) + } else if (valid_df(env$xx)) { + # an element `xx` will be in the environment for rows + cols in base R, e.g. `example_isolates[c(1:3), carbapenems()]` + return(env$xx) + } else if (valid_df(env$x)) { + # an element `x` will be in the environment for only cols in base R, e.g. `example_isolates[, carbapenems()]` + # this element will also be present in data.table environments where there's a .Generic available + return(env$x) + } + } + + # now a special case for dplyr's 'scoped' variants + with_tbl <- vapply(FUN.VALUE = logical(1), frms, function(e) valid_df(e$`.tbl`)) + for (env in frms[which(with_tbl)]) { + if (!is.null(names(env)) && all(c(".tbl", ".vars", ".cols") %in% names(env), na.rm = TRUE)) { + # an element `.tbl` will be in the environment when using scoped dplyr variants, with or without `dplyr::vars()` + # (e.g. `dplyr::summarise_at()` or `dplyr::mutate_at()`) + return(env$`.tbl`) + } + } + + # no data.frame found, so an error must be returned: + if (is.na(arg_name)) { + if (isTRUE(is.numeric(call))) { + fn <- as.character(sys.call(call + 1)[1]) + examples <- paste0( + ", e.g.:\n", + " ", AMR_env$bullet_icon, " your_data %>% select(", fn, "())\n", + " ", AMR_env$bullet_icon, " your_data %>% select(column_a, column_b, ", fn, "())\n", + " ", AMR_env$bullet_icon, " your_data[, ", fn, "()]\n", + " ", AMR_env$bullet_icon, " your_data[, c(\"column_a\", \"column_b\", ", fn, "())]" + ) + } else { + examples <- "" + } + stop_("this function must be used inside a `dplyr` verb or `data.frame` call", + examples, + call = call + ) + } else { + # mimic a base R error that the argument is missing + stop_("argument `", arg_name, "` is missing with no default", call = call) + } +} + +get_current_column <- function() { + # try dplyr::cur_columns() first + cur_column <- import_fn("cur_column", "dplyr", error_on_fail = FALSE) + out <- tryCatch(cur_column(), error = function(e) NULL) + if (!is.null(out)) { + return(out) + } + + # cur_column() doesn't always work (only allowed for certain conditions set by dplyr), but it's probably still possible: + frms <- lapply(sys.frames(), function(env) { + if (tryCatch(!is.null(env$i), error = function(e) FALSE)) { + if (!is.null(env$tibble_vars)) { + # for mutate_if() + # TODO remove later, was part of older dplyr versions (at least not in dplyr 1.1.4) + env$tibble_vars[env$i] + } else { + # for mutate(across()) + if (!is.null(env$data) && is.data.frame(env$data)) { + df <- env$data + } else { + df <- tryCatch(get_current_data(NA, 0), error = function(e) NULL) + } + if (is.data.frame(df)) { + colnames(df)[env$i] + } else { + env$i + } + } + } else { + NULL + } + }) + + vars <- unlist(frms) + if (length(vars) > 0) { + vars[length(vars)] + } else { + # not found, so: + NULL + } +} + +is_null_or_grouped_tbl <- function(x) { + # class "grouped_data" is from {poorman}, see aa_helper_pm_functions.R + # class "grouped_df" is from {dplyr} and might change at one point, so only set in one place; here. + is.null(x) || inherits(x, "grouped_data") || inherits(x, "grouped_df") +} + +get_group_names <- function(x) { + if ("pm_groups" %in% names(attributes(x))) { + pm_get_groups(x) + } else if (!is.null(x) && is_null_or_grouped_tbl(x)) { + grps <- colnames(attributes(x)$groups) + grps[!grps %in% c(".group_id", ".rows")] + } else { + character(0) + } +} + +unique_call_id <- function(entire_session = FALSE, match_fn = NULL) { + if (entire_session == TRUE) { + return(c(envir = "session", call = "session")) + } + + # combination of environment ID (such as "0x7fed4ee8c848") + # and relevant system call (where 'match_fn' is being called in) + calls <- sys.calls() + in_test <- any(as.character(calls[[1]]) %like_case% "run_test_dir|run_test_file|test_all|tinytest|test_package|testthat", na.rm = TRUE) + if (!isTRUE(in_test) && !is.null(match_fn)) { + for (i in seq_len(length(calls))) { + call_clean <- gsub("[^a-zA-Z0-9_().-]", "", as.character(calls[[i]]), perl = TRUE) + if (match_fn %in% call_clean || any(call_clean %like% paste0(match_fn, "\\("), na.rm = TRUE)) { + return(c( + envir = gsub("", "\\1", utils::capture.output(sys.frames()[[1]]), perl = TRUE), + call = paste0(deparse(calls[[i]]), collapse = "") + )) + } + } + } + c( + envir = paste0(sample(c(0:9, letters[1:6]), size = 32, replace = TRUE), collapse = ""), + call = paste0(sample(c(0:9, letters[1:6]), size = 32, replace = TRUE), collapse = "") + ) +} + +#' @noRd +#' @param fn name of the function as a character +#' @param ... character elements to be pasted together as a 'salt' +#' @param entire_session show message once per session +message_not_thrown_before <- function(fn, ..., entire_session = FALSE) { + # this is to prevent that messages/notes will be printed for every dplyr group or more than once per session + # e.g. this would show a msg 4 times: example_isolates %>% group_by(ward) %>% filter(mo_is_gram_negative()) + salt <- gsub("[^a-zA-Z0-9|_-]", "?", substr(paste(c(...), sep = "|", collapse = "|"), 1, 512), perl = TRUE) + not_thrown_before <- is.null(AMR_env[[paste0("thrown_msg.", fn, ".", salt)]]) || + !identical( + AMR_env[[paste0("thrown_msg.", fn, ".", salt)]], + unique_call_id( + entire_session = entire_session, + match_fn = fn + ) + ) + if (isTRUE(not_thrown_before)) { + # message was not thrown before - remember this so on the next run it will return FALSE: + assign( + x = paste0("thrown_msg.", fn, ".", salt), + value = unique_call_id(entire_session = entire_session, match_fn = fn), + envir = AMR_env + ) + } + not_thrown_before +} + +has_colour <- function() { + # this is a base R version of crayon::has_color, but disables colours on emacs + + if (Sys.getenv("EMACS") != "" || Sys.getenv("INSIDE_EMACS") != "") { + # disable on emacs, which only supports 8 colours + return(FALSE) + } + enabled <- getOption("crayon.enabled") + if (!is.null(enabled)) { + return(isTRUE(enabled)) + } + rstudio_with_ansi_support <- function(x) { + if (Sys.getenv("RSTUDIO", "") == "") { + return(FALSE) + } + if ((cols <- Sys.getenv("RSTUDIO_CONSOLE_COLOR", "")) != "" && !is.na(as.double(cols))) { + return(TRUE) + } + tryCatch(getExportedValue("isAvailable", ns = asNamespace("rstudioapi"))(), error = function(e) { + return(FALSE) + }) && + tryCatch(getExportedValue("hasFun", ns = asNamespace("rstudioapi"))("getConsoleHasColor"), error = function(e) { + return(FALSE) + }) + } + if (rstudio_with_ansi_support() && sink.number() == 0) { + return(TRUE) + } + if (!isatty(stdout())) { + return(FALSE) + } + if (tolower(Sys.info()["sysname"]) == "windows") { + if (Sys.getenv("ConEmuANSI") == "ON") { + return(TRUE) + } + if (Sys.getenv("CMDER_ROOT") != "") { + return(TRUE) + } + return(FALSE) + } + if ("COLORTERM" %in% names(Sys.getenv())) { + return(TRUE) + } + if (Sys.getenv("TERM") == "dumb") { + return(FALSE) + } + grepl( + pattern = "^screen|^xterm|^vt100|color|ansi|cygwin|linux", + x = Sys.getenv("TERM"), + ignore.case = TRUE, + perl = TRUE + ) +} + +# set colours if console has_colour() +try_colour <- function(..., before, after, collapse = " ") { + if (length(c(...)) == 0) { + return(character(0)) + } + txt <- paste0(c(...), collapse = collapse) + if (isTRUE(has_colour())) { + if (is.null(collapse)) { + paste0(before, txt, after, collapse = NULL) + } else { + paste0(before, txt, after, collapse = "") + } + } else { + txt + } +} +is_dark <- function() { + if (is.null(AMR_env$is_dark_theme)) { + AMR_env$is_dark_theme <- !has_colour() || tryCatch(isTRUE(getExportedValue("getThemeInfo", ns = asNamespace("rstudioapi"))()$dark), error = function(e) FALSE) + } + isTRUE(AMR_env$is_dark_theme) +} +font_black <- function(..., collapse = " ", adapt = TRUE) { + before <- "\033[38;5;232m" + after <- "\033[39m" + if (isTRUE(adapt) && is_dark()) { + # white + before <- "\033[37m" + after <- "\033[39m" + } + try_colour(..., before = before, after = after, collapse = collapse) +} +font_white <- function(..., collapse = " ", adapt = TRUE) { + before <- "\033[37m" + after <- "\033[39m" + if (isTRUE(adapt) && is_dark()) { + # black + before <- "\033[38;5;232m" + after <- "\033[39m" + } + try_colour(..., before = before, after = after, collapse = collapse) +} +font_blue <- function(..., collapse = " ") { + try_colour(..., before = "\033[34m", after = "\033[39m", collapse = collapse) +} +font_green <- function(..., collapse = " ") { + try_colour(..., before = "\033[32m", after = "\033[39m", collapse = collapse) +} +font_magenta <- function(..., collapse = " ") { + try_colour(..., before = "\033[35m", after = "\033[39m", collapse = collapse) +} +font_red <- function(..., collapse = " ") { + try_colour(..., before = "\033[31m", after = "\033[39m", collapse = collapse) +} +font_silver <- function(..., collapse = " ") { + try_colour(..., before = "\033[90m", after = "\033[39m", collapse = collapse) +} +font_yellow <- function(..., collapse = " ") { + try_colour(..., before = "\033[33m", after = "\033[39m", collapse = collapse) +} +font_subtle <- function(..., collapse = " ") { + try_colour(..., before = "\033[38;5;246m", after = "\033[39m", collapse = collapse) +} +font_grey <- function(..., collapse = " ") { + try_colour(..., before = "\033[38;5;249m", after = "\033[39m", collapse = collapse) +} +font_grey_bg <- function(..., collapse = " ") { + if (is_dark()) { + # similar to HTML #444444 + try_colour(..., before = "\033[48;5;238m", after = "\033[49m", collapse = collapse) + } else { + # similar to HTML #f0f0f0 + try_colour(..., before = "\033[48;5;255m", after = "\033[49m", collapse = collapse) + } +} +font_red_bg <- function(..., collapse = " ") { + # this is #ed553b (picked to be colourblind-safe with other SIR colours) + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;203m", after = "\033[49m", collapse = collapse) +} +font_orange_bg <- function(..., collapse = " ") { + # this is #f6d55c (picked to be colourblind-safe with other SIR colours) + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;222m", after = "\033[49m", collapse = collapse) +} +font_yellow_bg <- function(..., collapse = " ") { + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;228m", after = "\033[49m", collapse = collapse) +} +font_green_bg <- function(..., collapse = " ") { + # this is #3caea3 (picked to be colourblind-safe with other SIR colours) + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;79m", after = "\033[49m", collapse = collapse) +} +font_purple_bg <- function(..., collapse = " ") { + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;89m", after = "\033[49m", collapse = collapse) +} +font_rose_bg <- function(..., collapse = " ") { + try_colour(font_black(..., collapse = collapse, adapt = FALSE), before = "\033[48;5;217m", after = "\033[49m", collapse = collapse) +} +font_na <- function(..., collapse = " ") { + font_red(..., collapse = collapse) +} +font_bold <- function(..., collapse = " ") { + try_colour(..., before = "\033[1m", after = "\033[22m", collapse = collapse) +} +font_italic <- function(..., collapse = " ") { + try_colour(..., before = "\033[3m", after = "\033[23m", collapse = collapse) +} +font_underline <- function(..., collapse = " ") { + try_colour(..., before = "\033[4m", after = "\033[24m", collapse = collapse) +} +font_url <- function(url, txt = url) { + if (tryCatch(isTRUE(getExportedValue("ansi_has_hyperlink_support", ns = asNamespace("cli"))()), error = function(e) FALSE)) { + paste0("\033]8;;", url, "\a", txt, "\033]8;;\a") + } else { + url + } +} +font_stripstyle <- function(x) { + # remove URLs + x <- gsub("\033]8;;(.*?)\a.*?\033]8;;\a", "\\1", x) + # from crayon:::ansi_regex + x <- gsub("(?:(?:\\x{001b}\\[)|\\x{009b})(?:(?:[0-9]{1,3})?(?:(?:;[0-9]{0,3})*)?[A-M|f-m])|\\x{001b}[A-M]", "", x, perl = TRUE) + x +} + +progress_ticker <- function(n = 1, n_min = 0, print = TRUE, clear = TRUE, title = "", only_bar_percent = FALSE, ...) { + if (print == FALSE || n < n_min) { + # create fake/empty object + pb <- list() + pb$tick <- function() { + invisible() + } + pb$kill <- function() { + invisible() + } + set_clean_class(pb, new_class = "txtProgressBar") + } else if (n >= n_min) { + title <- trimws2(title) + if (title != "") { + title <- paste0(title, " ") + } + progress_bar <- import_fn("progress_bar", "progress", error_on_fail = FALSE) + if (!is.null(progress_bar)) { + # so we use progress::progress_bar + # a close()-method was also added, see below for that + pb <- progress_bar$new( + show_after = 0, + format = paste0(title, + ifelse(only_bar_percent == TRUE, "[:bar] :percent", "[:bar] :percent (:current/:total,:eta)")), + clear = clear, + total = n + ) + } else { + # use base R's txtProgressBar + cat(title, "\n", sep = "") + pb <- utils::txtProgressBar(max = n, style = 3) + pb$tick <- function() { + pb$up(pb$getVal() + 1) + } + } + pb + } +} + +#' @method close progress_bar +#' @export +#' @noRd +close.progress_bar <- function(con, ...) { + # for progress::progress_bar$new() + con$terminate() +} + +set_clean_class <- function(x, new_class) { + # return the object with only the new class and no additional attributes where possible + if (is.null(x)) { + x <- NA_character_ + } + if (is.factor(x)) { + # keep only levels and remove all other attributes + lvls <- levels(x) + attributes(x) <- NULL + levels(x) <- lvls + } else if (!is.list(x) && !is.function(x)) { + attributes(x) <- NULL + } + class(x) <- new_class + x +} + +formatted_filesize <- function(...) { + size_kb <- file.size(...) / 1024 + if (size_kb < 1) { + paste(round(size_kb, 1), "kB") + } else if (size_kb < 100) { + paste(round(size_kb, 0), "kB") + } else { + paste(round(size_kb / 1024, 1), "MB") + } +} + +create_pillar_column <- function(x, ...) { + new_pillar_shaft_simple <- import_fn("new_pillar_shaft_simple", "pillar") + new_pillar_shaft_simple(x, ...) +} + +as_original_data_class <- function(df, old_class = NULL, extra_class = NULL) { + if ("tbl_df" %in% old_class && pkg_is_available("tibble")) { + # this will then also remove groups + fn <- import_fn("as_tibble", "tibble") + } else if ("data.table" %in% old_class && pkg_is_available("data.table")) { + fn <- import_fn("as.data.table", "data.table") + } else { + fn <- function(x) base::as.data.frame(df, stringsAsFactors = FALSE) + } + out <- fn(df) + if (!is.null(extra_class)) { + class(out) <- c(extra_class, class(out)) + } + out +} + +# works exactly like round(), but rounds `round2(44.55, 1)` to 44.6 instead of 44.5 +# and adds decimal zeroes until `digits` is reached when force_zero = TRUE +round2 <- function(x, digits = 1, force_zero = TRUE) { + x <- as.double(x) + # https://stackoverflow.com/a/12688836/4575331 + val <- (trunc((abs(x) * 10^digits) + 0.5) / 10^digits) * sign(x) + if (digits > 0 && force_zero == TRUE) { + values_trans <- val[val != as.integer(val) & !is.na(val)] + val[val != as.integer(val) & !is.na(val)] <- paste0( + values_trans, + strrep( + "0", + max( + 0, + digits - nchar( + format( + as.double( + gsub( + ".*[.](.*)$", + "\\1", + values_trans + ) + ), + scientific = FALSE + ) + ) + ) + ) + ) + } + as.double(val) +} + + +# percentage from our other package: 'cleaner' +percentage <- function(x, digits = NULL, ...) { + # getdecimalplaces() function + getdecimalplaces <- function(x, minimum = 0, maximum = 3) { + if (maximum < minimum) { + maximum <- minimum + } + if (minimum > maximum) { + minimum <- maximum + } + max_places <- max(unlist(lapply( + strsplit(sub( + "0+$", "", + as.character(x * 100) + ), ".", fixed = TRUE), + function(y) ifelse(length(y) == 2, nchar(y[2]), 0) + )), na.rm = TRUE) + max( + min(max_places, + maximum, + na.rm = TRUE + ), + minimum, + na.rm = TRUE + ) + } + + # format_percentage() function + format_percentage <- function(x, digits = NULL, ...) { + if (is.null(digits)) { + digits <- getdecimalplaces(x) + } + if (is.null(digits) || is.na(digits) || !is.numeric(digits)) { + digits <- 2 + } + + # round right: percentage(0.4455) and format(as.percentage(0.4455), 1) should return "44.6%", not "44.5%" + x_formatted <- format(round2(as.double(x), digits = digits + 2) * 100, + scientific = FALSE, + digits = max(1, digits), + nsmall = digits, + ... + ) + x_formatted <- paste0(x_formatted, "%") + x_formatted[!grepl(pattern = "^[0-9.,e-]+$", x = x)] <- NA_character_ + x_formatted + } + + # the actual working part + x <- as.double(x) + if (is.null(digits)) { + # max one digit if undefined + digits <- getdecimalplaces(x, minimum = 0, maximum = 1) + } + format_percentage( + structure( + .Data = as.double(x), + class = c("percentage", "numeric") + ), + digits = digits, ... + ) +} + +add_intrinsic_resistance_to_AMR_env <- function() { + # for mo_is_intrinsic_resistant() - saves a lot of time when executed on this vector + if (is.null(AMR_env$intrinsic_resistant)) { + AMR_env$intrinsic_resistant <- paste(AMR::intrinsic_resistant$mo, AMR::intrinsic_resistant$ab) + } +} + +add_MO_lookup_to_AMR_env <- function() { + # for all MO functions, saves a lot of time on package load and in package size + if (is.null(AMR_env$MO_lookup)) { + MO_lookup <- AMR::microorganisms + + MO_lookup$kingdom_index <- NA_real_ + MO_lookup[which(MO_lookup$kingdom == "Bacteria" | MO_lookup$mo == "UNKNOWN"), "kingdom_index"] <- 1 + MO_lookup[which(MO_lookup$kingdom == "Fungi"), "kingdom_index"] <- 1.25 + MO_lookup[which(MO_lookup$kingdom == "Protozoa"), "kingdom_index"] <- 1.5 + MO_lookup[which(MO_lookup$kingdom == "Chromista"), "kingdom_index"] <- 1.75 + MO_lookup[which(MO_lookup$kingdom == "Archaea"), "kingdom_index"] <- 2 + # all the rest + MO_lookup[which(is.na(MO_lookup$kingdom_index)), "kingdom_index"] <- 3 + + # the fullname lowercase, important for the internal algorithms in as.mo() + MO_lookup$fullname_lower <- tolower(trimws2(paste( + MO_lookup$genus, + MO_lookup$species, + MO_lookup$subspecies + ))) + ind <- MO_lookup$genus == "" | grepl("^[(]unknown ", MO_lookup$fullname, perl = TRUE) + MO_lookup[ind, "fullname_lower"] <- tolower(MO_lookup[ind, "fullname", drop = TRUE]) + MO_lookup$fullname_lower <- trimws2(gsub("[^.a-z0-9/ \\-]+", "", MO_lookup$fullname_lower, perl = TRUE)) + # special for Salmonella - they have cities as subspecies but not the species (enterica) in the fullname: + MO_lookup$fullname_lower[which(MO_lookup$subspecies %like_case% "^[A-Z]")] <- gsub(" enterica ", " ", MO_lookup$fullname_lower[which(MO_lookup$subspecies %like_case% "^[A-Z]")], fixed = TRUE) + + MO_lookup$genus_lower <- tolower(MO_lookup$genus) + + MO_lookup$full_first <- substr(MO_lookup$fullname_lower, 1, 1) + MO_lookup$species_first <- tolower(substr(MO_lookup$species, 1, 1)) # tolower for groups (Streptococcus, Salmonella) + MO_lookup$subspecies_first <- tolower(substr(MO_lookup$subspecies, 1, 1)) # tolower for Salmonella serovars + AMR_env$MO_lookup <- MO_lookup + } +} + +trimws2 <- function(..., whitespace = "[\u0009\u000A\u000B\u000C\u000D\u0020\u0085\u00A0\u1680\u180E\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A\u200B\u200C\u200D\u2028\u2029\u202F\u205F\u2060\u3000\uFEFF]") { + # this is even faster than trimws() itself which sets "[ \t\r\n]". + trimws(..., whitespace = whitespace) +} + +totitle <- function(x) { + gsub("^(.)", "\\U\\1", x, perl = TRUE) +} + +readRDS_AMR <- function(file, refhook = NULL) { + # this is readRDS with remote file support + con <- file(file) + on.exit(close(con)) + readRDS(con, refhook = refhook) +} + +# Faster data.table implementations ---- + +match <- function(x, table, ...) { + if (!is.null(AMR_env$chmatch) && inherits(x, "character") && inherits(table, "character")) { + # data.table::chmatch() is much faster than base::match() for character + tryCatch(AMR_env$chmatch(x, table, ...), error = function(e) base::match(x, table, ...)) + } else { + base::match(x, table, ...) + } +} +`%in%` <- function(x, table) { + if (!is.null(AMR_env$chin) && inherits(x, "character") && inherits(table, "character")) { + # data.table::`%chin%`() is much faster than base::`%in%`() for character + tryCatch(AMR_env$chin(x, table), error = function(e) base::`%in%`(x, table)) + } else { + base::`%in%`(x, table) + } +} + +# nolint start + +# Register S3 methods ---- +# copied from vctrs::s3_register by their permission: +# https://github.com/r-lib/vctrs/blob/05968ce8e669f73213e3e894b5f4424af4f46316/R/register-s3.R +s3_register <- function(generic, class, method = NULL) { + stopifnot(is.character(generic), length(generic) == 1) + stopifnot(is.character(class), length(class) == 1) + pieces <- strsplit(generic, "::")[[1]] + stopifnot(length(pieces) == 2) + package <- pieces[[1]] + generic <- pieces[[2]] + caller <- parent.frame() + get_method_env <- function() { + top <- topenv(caller) + if (isNamespace(top)) { + asNamespace(environmentName(top)) + } else { + caller + } + } + get_method <- function(method, env) { + if (is.null(method)) { + get(paste0(generic, ".", class), envir = get_method_env()) + } else { + method + } + } + method_fn <- get_method(method) + stopifnot(is.function(method_fn)) + setHook(packageEvent(package, "onLoad"), function(...) { + ns <- asNamespace(package) + method_fn <- get_method(method) + registerS3method(generic, class, method_fn, envir = ns) + }) + if (!isNamespaceLoaded(package)) { + return(invisible()) + } + envir <- asNamespace(package) + if (exists(generic, envir)) { + registerS3method(generic, class, method_fn, envir = envir) + } + invisible() +} + + +# Support old R versions ---- +# these functions were not available in previous versions of R +# see here for the full list: https://github.com/r-lib/backports +if (getRversion() < "3.1.0") { + # R-3.0 does not contain these functions, set them here to prevent installation failure + # (required for extension of the 'mic' class) + cospi <- function(...) 1 + sinpi <- function(...) 1 + tanpi <- function(...) 1 +} + +if (getRversion() < "3.2.0") { + anyNA <- function(x, recursive = FALSE) { + if (isTRUE(recursive) && (is.list(x) || is.pairlist(x))) { + return(any(rapply(x, anyNA, how = "unlist", recursive = FALSE))) + } + any(is.na(x)) + } + dir.exists <- function(paths) { + x <- base::file.info(paths)$isdir + !is.na(x) & x + } + file.size <- function(...) { + file.info(...)$size + } + file.mtime <- function(...) { + file.info(...)$mtime + } + isNamespaceLoaded <- function(pkg) { + pkg %in% loadedNamespaces() + } + lengths <- function(x, use.names = TRUE) { + vapply(x, length, FUN.VALUE = NA_integer_, USE.NAMES = use.names) + } +} + +if (getRversion() < "3.3.0") { + strrep <- function(x, times) { + x <- as.character(x) + if (length(x) == 0L) { + return(x) + } + unlist(.mapply(function(x, times) { + if (is.na(x) || is.na(times)) { + return(NA_character_) + } + if (times <= 0L) { + return("") + } + paste0(replicate(times, x), collapse = "") + }, list(x = x, times = times), MoreArgs = list()), use.names = FALSE) + } +} + +if (getRversion() < "3.5.0") { + isFALSE <- function(x) { + is.logical(x) && length(x) == 1L && !is.na(x) && !x + } +} + +if (getRversion() < "3.6.0") { + str2lang <- function(s) { + stopifnot(length(s) == 1L) + ex <- parse(text = s, keep.source = FALSE) + stopifnot(length(ex) == 1L) + ex[[1L]] + } + # trims() was introduced in 3.3.0, but its argument `whitespace` only in 3.6.0 + trimws <- function(x, which = c("both", "left", "right"), whitespace = "[ \t\r\n]") { + which <- match.arg(which) + mysub <- function(re, x) sub(re, "", x, perl = TRUE) + switch(which, + left = mysub(paste0("^", whitespace, "+"), x), + right = mysub(paste0(whitespace, "+$"), x), + both = mysub(paste0(whitespace, "+$"), mysub(paste0("^", whitespace, "+"), x)) + ) + } +} + +if (getRversion() < "4.0.0") { + deparse1 <- function(expr, collapse = " ", width.cutoff = 500L, ...) { + paste(deparse(expr, width.cutoff, ...), collapse = collapse) + } +} + +# nolint end + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/aa_helper_pm_functions.R + + + + +# ------------------------------------------------ +# THIS FILE WAS CREATED AUTOMATICALLY! +# Source file: data-raw/reproduction_of_poorman.R +# ------------------------------------------------ + +# poorman: a package to replace all dplyr functions with base R so we can lose dependency on dplyr. +# These functions were downloaded from https://github.com/nathaneastwood/poorman, +# from this commit: https://github.com/nathaneastwood/poorman/tree/52eb6947e0b4430cd588976ed8820013eddf955f. +# +# All functions are prefixed with 'pm_' to make it obvious that they are dplyr substitutes. +# +# All code below was released under MIT license, that permits 'free of charge, to any person obtaining a +# copy of the software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, merge, publish, +# distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software +# is furnished to do so', given that a copyright notice is given in the software. +# +# Copyright notice on 19 September 2020, the day this code was downloaded, as found on +# https://github.com/nathaneastwood/poorman/blob/52eb6947e0b4430cd588976ed8820013eddf955f/LICENSE: +# YEAR: 2020 +# COPYRIGHT HOLDER: Nathan Eastwood + +pm_arrange <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_arrange.grouped_data(.data, ...) + } else { + pm_arrange.default(.data, ...) + } +} + +pm_arrange.default <- function(.data, ...) { + pm_context$setup(.data) + on.exit(pm_context$clean(), add = TRUE) + rows <- eval(substitute(order(...)), envir = pm_context$.data) + .data[rows, , drop = FALSE] +} + +pm_arrange.grouped_data <- function(.data, ...) { + pm_apply_grouped_function("pm_arrange", .data, drop = TRUE, ...) +} +pm_between <- function(x, left, right) { + if (!is.null(attr(x, "class")) && !inherits(x, c("Date", "POSIXct"))) { + warning("`pm_between()` called on numeric vector with S3 class") + } + if (!is.double(x)) x <- as.numeric(x) + x >= as.numeric(left) & x <= as.numeric(right) +} +pm_context <- new.env() + +# Data +pm_context$setup <- function(.data) pm_context$.data <- .data +pm_context$get_data <- function() pm_context$.data +pm_context$get_nrow <- function() nrow(pm_context$.data) +pm_context$get_colnames <- function() colnames(pm_context$.data) +pm_context$clean <- function() rm(list = c(".data"), envir = pm_context) + + +pm_n <- function() { + pm_check_group_pm_context("`pm_n()`") + pm_context$get_nrow() +} + +pm_cur_data <- function() { + pm_check_group_pm_context("`pm_cur_data()`") + data <- pm_context$get_data() + data[, !(colnames(data) %in% pm_get_groups(data)), drop = FALSE] +} + +pm_cur_group <- function() { + pm_check_group_pm_context("`pm_cur_group()`") + data <- pm_context$get_data() + res <- data[1L, pm_get_groups(data), drop = FALSE] + rownames(res) <- NULL + res +} + +pm_cur_group_id <- function() { + pm_check_group_pm_context("`pm_cur_group_id()`") + data <- pm_context$get_data() + res <- data[1L, pm_get_groups(data), drop = FALSE] + details <- pm_get_group_details(data) + details[, ".group_id"] <- seq_len(nrow(details)) + res <- suppressMessages(pm_semi_join(details, res)) + list(res[, ".group_id"]) +} + +pm_cur_group_rows <- function() { + pm_check_group_pm_context("`pm_cur_group_rows()`") + data <- pm_context$get_data() + res <- data[1L, pm_get_groups(data), drop = FALSE] + res <- suppressMessages(pm_semi_join(pm_get_group_details(data), res)) + unlist(res[, ".rows"]) +} + +pm_check_group_pm_context <- function(fn) { + if (is.null(pm_context$.data)) { + stop(fn, " must only be used inside poorman verbs") + } +} +pm_count <- function(x, ..., wt = NULL, sort = FALSE, name = NULL) { + pm_groups <- pm_get_groups(x) + if (!missing(...)) x <- pm_group_by(x, ..., .add = TRUE) + wt <- pm_deparse_var(wt) + res <- do.call(pm_tally, list(x, wt, sort, name)) + if (length(pm_groups) > 0L) res <- do.call(pm_group_by, list(res, as.name(pm_groups))) + res +} + +pm_tally <- function(x, wt = NULL, sort = FALSE, name = NULL) { + name <- pm_check_name(x, name) + wt <- pm_deparse_var(wt) + res <- do.call(pm_summarise, pm_set_names(list(x, pm_tally_n(x, wt)), c(".data", name))) + res <- pm_ungroup(res) + if (isTRUE(sort)) res <- do.call(pm_arrange, list(res, call("pm_desc", as.name(name)))) + rownames(res) <- NULL + res +} + +pm_add_count <- function(x, ..., wt = NULL, sort = FALSE, name = NULL) { + name <- pm_check_name(x, name) + row_names <- rownames(x) + wt <- pm_deparse_var(wt) + if (!missing(...)) x <- pm_group_by(x, ..., .add = TRUE) + res <- do.call(pm_add_tally, list(x, wt, sort, name)) + res[row_names, ] +} + +pm_add_tally <- function(x, wt = NULL, sort = FALSE, name = NULL) { + wt <- pm_deparse_var(wt) + pm_n <- pm_tally_n(x, wt) + name <- pm_check_name(x, name) + res <- do.call(pm_mutate, pm_set_names(list(x, pm_n), c(".data", name))) + + if (isTRUE(sort)) { + do.call(pm_arrange, list(res, call("pm_desc", as.name(name)))) + } else { + res + } +} + +pm_tally_n <- function(x, wt) { + if (is.null(wt) && "pm_n" %in% colnames(x)) { + message("Using `pm_n` as weighting variable") + wt <- "pm_n" + } + pm_context$setup(.data = x) + on.exit(pm_context$clean(), add = TRUE) + if (is.null(wt)) { + call("pm_n") + } else { + call("sum", as.name(wt), na.rm = TRUE) + } +} + +pm_check_name <- function(df, name) { + if (is.null(name)) { + if ("pm_n" %in% colnames(df)) { + stop( + "Column 'pm_n' is already present in output\n", + "* Use `name = \"new_name\"` to pick a new name" + ) + } + return("pm_n") + } + + if (!is.character(name) || length(name) != 1) { + stop("`name` must be a single string") + } + + name +} +pm_desc <- function(x) -xtfrm(x) +pm_distinct <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_distinct.grouped_data(.data, ...) + } else { + pm_distinct.default(.data, ...) + } +} + +pm_distinct.default <- function(.data, ..., .keep_all = FALSE) { + if (ncol(.data) == 0L) { + return(.data[1, ]) + } + cols <- pm_deparse_dots(...) + col_names <- names(cols) + col_len <- length(cols) + if (is.null(col_names) && col_len > 0L) names(cols) <- cols + if (col_len == 0L) { + res <- .data + } else { + res <- pm_mutate(.data, ...) + col_names <- names(cols) + res <- if (!is.null(col_names)) { + zero_names <- nchar(col_names) == 0L + if (any(zero_names)) { + names(cols)[zero_names] <- cols[zero_names] + col_names <- names(cols) + } + suppressMessages(pm_select(res, col_names)) + } else { + suppressMessages(pm_select(res, cols)) + } + } + res <- unique(res) + if (isTRUE(.keep_all)) { + res <- cbind(res, .data[rownames(res), setdiff(colnames(.data), colnames(res)), drop = FALSE]) + } + common_cols <- c(intersect(colnames(.data), colnames(res)), setdiff(col_names, colnames(.data))) + if (length(common_cols) > 0L) res[, common_cols, drop = FALSE] else res +} + +pm_distinct.grouped_data <- function(.data, ..., .keep_all = FALSE) { + pm_apply_grouped_function("pm_distinct", .data, drop = TRUE, ..., .keep_all = .keep_all) +} +pm_eval_env <- new.env() +pm_filter <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_filter.grouped_data(.data, ...) + } else { + pm_filter.default(.data, ...) + } +} + +pm_filter.default <- function(.data, ...) { + conditions <- pm_dotdotdot(...) + cond_class <- vapply(conditions, typeof, NA_character_) + if (any(cond_class != "language")) stop("Conditions must be logical vectors") + pm_context$setup(.data) + on.exit(pm_context$clean(), add = TRUE) + pm_eval_env$env <- parent.frame() + on.exit(rm(list = "env", envir = pm_eval_env), add = TRUE) + rows <- lapply( + conditions, + function(cond, frame) eval(cond, pm_context$.data, frame), + frame = pm_eval_env$env + ) + rows <- Reduce("&", rows) + .data[rows & !is.na(rows), ] +} + +pm_filter.grouped_data <- function(.data, ...) { + rows <- rownames(.data) + res <- pm_apply_grouped_function("pm_filter", .data, drop = TRUE, ...) + res[rows[rows %in% rownames(res)], ] +} +pm_group_by <- function(.data, ..., .add = FALSE) { + pm_check_is_dataframe(.data) + pre_groups <- pm_get_groups(.data) + pm_groups <- pm_deparse_dots(...) + if (isTRUE(.add)) pm_groups <- unique(c(pre_groups, pm_groups)) + unknown <- !(pm_groups %in% colnames(.data)) + if (any(unknown)) stop("Invalid pm_groups: ", pm_groups[unknown]) + class(.data) <- c("grouped_data", class(.data)) + pm_set_groups(.data, pm_groups) +} + +pm_ungroup <- function(x, ...) { + pm_check_is_dataframe(x) + rm_groups <- pm_deparse_dots(...) + pm_groups <- pm_get_groups(x) + if (length(rm_groups) == 0L) rm_groups <- pm_groups + x <- pm_set_groups(x, pm_groups[!(pm_groups %in% rm_groups)]) + if (length(attr(x, "pm_groups")) == 0L) { + attr(x, "pm_groups") <- NULL + class(x) <- class(x)[!(class(x) %in% "grouped_data")] + } + x +} + +pm_set_groups <- function(x, pm_groups) { + attr(x, "pm_groups") <- if (is.null(pm_groups) || length(pm_groups) == 0L) { + NULL + } else { + pm_group_data_worker(x, pm_groups) + } + x +} + +pm_get_groups <- function(x) { + pm_groups <- attr(x, "pm_groups", exact = TRUE) + if (is.null(pm_groups)) character(0) else colnames(pm_groups)[!colnames(pm_groups) %in% c(".group_id", ".rows")] +} + +pm_get_group_details <- function(x) { + pm_groups <- attr(x, "pm_groups", exact = TRUE) + if (is.null(pm_groups)) character(0) else pm_groups +} + +pm_has_groups <- function(x) { + pm_groups <- pm_get_groups(x) + if (length(pm_groups) == 0L) FALSE else TRUE +} + +pm_apply_grouped_function <- function(fn, .data, drop = FALSE, ...) { + pm_groups <- pm_get_groups(.data) + grouped <- pm_split_into_groups(.data, pm_groups, drop) + res <- do.call(rbind, unname(lapply(grouped, fn, ...))) + if (any(pm_groups %in% colnames(res))) { + class(res) <- c("grouped_data", class(res)) + res <- pm_set_groups(res, pm_groups[pm_groups %in% colnames(res)]) + } + res +} + +pm_print.grouped_data <- function(x, ..., digits = NULL, quote = FALSE, right = TRUE, row.names = TRUE, max = NULL) { + class(x) <- "data.frame" + print(x, ..., digits = digits, quote = quote, right = right, row.names = row.names, max = max) + cat("\nGroups: ", paste(pm_get_groups(x), collapse = ", "), "\n\n") +} + +pm_group_data <- function(.data) { + if (!pm_has_groups(.data)) { + return(data.frame(.rows = I(list(seq_len(nrow(.data)))))) + } + pm_groups <- pm_get_groups(.data) + pm_group_data_worker(.data, pm_groups) +} + +pm_group_data_worker <- function(.data, pm_groups) { + res <- unique(.data[, pm_groups, drop = FALSE]) + class(res) <- "data.frame" + nrow_res <- nrow(res) + rows <- rep(list(NA), nrow_res) + for (i in seq_len(nrow_res)) { + rows[[i]] <- which(interaction(.data[, pm_groups]) %in% interaction(res[i, pm_groups])) + } + res$`.rows` <- rows + res <- res[do.call(order, lapply(pm_groups, function(x) res[, x])), , drop = FALSE] + rownames(res) <- NULL + res +} + +pm_group_rows <- function(.data) { + pm_group_data(.data)[[".rows"]] +} + +pm_group_indices <- function(.data) { + if (!pm_has_groups(.data)) { + return(rep(1L, nrow(.data))) + } + pm_groups <- pm_get_groups(.data) + res <- unique(.data[, pm_groups, drop = FALSE]) + res <- res[do.call(order, lapply(pm_groups, function(x) res[, x])), , drop = FALSE] + class(res) <- "data.frame" + nrow_data <- nrow(.data) + rows <- rep(NA, nrow_data) + for (i in seq_len(nrow_data)) { + rows[i] <- which(interaction(res[, pm_groups]) %in% interaction(.data[i, pm_groups])) + } + rows +} + +pm_group_vars <- function(x) { + pm_get_groups(x) +} + +pm_groups <- function(x) { + lapply(pm_get_groups(x), as.symbol) +} + +pm_group_size <- function(x) { + lengths(pm_group_rows(x)) +} + +pm_n_groups <- function(x) { + nrow(pm_group_data(x)) +} +# pm_group_split <- function(.data, ..., .keep = TRUE) { +# dots_len <- ...length() > 0L +# if (pm_has_groups(.data) && isTRUE(dots_len)) { +# warning("... is ignored in pm_group_split(), please use pm_group_by(..., .add = TRUE) %pm>% pm_group_split()") +# } +# if (!pm_has_groups(.data) && isTRUE(dots_len)) { +# .data <- pm_group_by(.data, ...) +# } +# if (!pm_has_groups(.data) && isFALSE(dots_len)) { +# return(list(.data)) +# } +# pm_context$setup(.data) +# on.exit(pm_context$clean(), add = TRUE) +# pm_groups <- pm_get_groups(.data) +# attr(pm_context$.data, "pm_groups") <- NULL +# res <- pm_split_into_groups(pm_context$.data, pm_groups) +# names(res) <- NULL +# if (isFALSE(.keep)) { +# res <- lapply(res, function(x) x[, !colnames(x) %in% pm_groups]) +# } +# any_empty <- unlist(lapply(res, function(x) !(nrow(x) == 0L))) +# res[any_empty] +# } + +pm_group_keys <- function(.data) { + pm_groups <- pm_get_groups(.data) + pm_context$setup(.data) + res <- pm_context$.data[, pm_context$get_colnames() %in% pm_groups, drop = FALSE] + res <- res[!duplicated(res), , drop = FALSE] + if (nrow(res) == 0L) { + return(res) + } + class(res) <- "data.frame" + res <- res[do.call(order, lapply(pm_groups, function(x) res[, x])), , drop = FALSE] + rownames(res) <- NULL + res +} + +pm_split_into_groups <- function(.data, pm_groups, drop = FALSE, ...) { + class(.data) <- "data.frame" + group_factors <- lapply(pm_groups, function(x, .data) as.factor(.data[, x]), .data) + split(x = .data, f = group_factors, drop = drop, ...) +} +pm_if_else <- function(condition, true, false, missing = NULL) { + if (!is.logical(condition)) stop("`condition` must be a logical vector.") + cls_true <- class(true) + cls_false <- class(false) + cls_missing <- class(missing) + if (!identical(cls_true, cls_false)) { + stop("The class of `true` <", class(true), "> is not the same as the class of `false` <", class(false), ">") + } + if (!is.null(missing) && !identical(cls_true, cls_missing)) { + stop("`missing` must be a ", cls_true, " vector, not a ", cls_missing, " vector.") + } + res <- ifelse(condition, true, false) + if (!is.null(missing)) res[is.na(res)] <- missing + attributes(res) <- attributes(true) + res +} + +pm_anti_join <- function(x, y, by = NULL) { + pm_filter_join_worker(x, y, by, type = "anti") +} + +pm_semi_join <- function(x, y, by = NULL) { + pm_filter_join_worker(x, y, by, type = "semi") +} + +pm_filter_join_worker <- function(x, y, by = NULL, type = c("anti", "semi")) { + type <- match.arg(type, choices = c("anti", "semi"), several.ok = FALSE) + if (is.null(by)) { + by <- intersect(names(x), names(y)) + pm_join_message(by) + } + rows <- interaction(x[, by]) %in% interaction(y[, by]) + if (type == "anti") rows <- !rows + res <- x[rows, , drop = FALSE] + rownames(res) <- NULL + res +} + +pm_inner_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { + pm_join_worker(x = x, y = y, by = by, suffix = suffix, sort = FALSE) +} + +# pm_left_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { +# pm_join_worker(x = x, y = y, by = by, suffix = suffix, all.x = TRUE) +# } + +pm_right_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { + pm_join_worker(x = x, y = y, by = by, suffix = suffix, all.y = TRUE) +} + +pm_full_join <- function(x, y, by = NULL, suffix = c(".x", ".y")) { + pm_join_worker(x = x, y = y, by = by, suffix = suffix, all = TRUE) +} + +pm_join_worker <- function(x, y, by = NULL, suffix = c(".x", ".y"), ...) { + x[, ".join_id"] <- seq_len(nrow(x)) + if (is.null(by)) { + by <- intersect(names(x), names(y)) + pm_join_message(by) + merged <- merge(x = x, y = y, by = by, suffixes = suffix, ...)[, union(names(x), names(y))] + } else if (is.null(names(by))) { + merged <- merge(x = x, y = y, by = by, suffixes = suffix, ...) + } else { + merged <- merge(x = x, y = y, by.x = names(by), by.y = by, suffixes = suffix, ...) + } + merged <- merged[order(merged[, ".join_id"]), colnames(merged) != ".join_id"] + rownames(merged) <- NULL + merged +} + +pm_join_message <- function(by) { + if (length(by) > 1L) { + message("Joining, by = c(\"", paste0(by, collapse = "\", \""), "\")\n", sep = "") + } else { + message("Joining, by = \"", by, "\"\n", sep = "") + } +} +pm_lag <- function(x, pm_n = 1L, default = NA) { + if (inherits(x, "ts")) stop("`x` must be a vector, not a `ts` object, do you want `stats::pm_lag()`?") + if (length(pm_n) != 1L || !is.numeric(pm_n) || pm_n < 0L) stop("`pm_n` must be a nonnegative integer scalar") + if (pm_n == 0L) { + return(x) + } + tryCatch( + storage.mode(default) <- typeof(x), + warning = function(w) { + stop("Cannot convert `default` <", typeof(default), "> to `x` <", typeof(x), ">") + } + ) + xlen <- length(x) + pm_n <- pmin(pm_n, xlen) + res <- c(rep(default, pm_n), x[seq_len(xlen - pm_n)]) + attributes(res) <- attributes(x) + res +} + +pm_lead <- function(x, pm_n = 1L, default = NA) { + if (length(pm_n) != 1L || !is.numeric(pm_n) || pm_n < 0L) stop("pm_n must be a nonnegative integer scalar") + if (pm_n == 0L) { + return(x) + } + tryCatch( + storage.mode(default) <- typeof(x), + warning = function(w) { + stop("Cannot convert `default` <", typeof(default), "> to `x` <", typeof(x), ">") + } + ) + xlen <- length(x) + pm_n <- pmin(pm_n, xlen) + res <- c(x[-seq_len(pm_n)], rep(default, pm_n)) + attributes(res) <- attributes(x) + res +} +pm_mutate <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_mutate.grouped_data(.data, ...) + } else { + pm_mutate.default(.data, ...) + } +} + +pm_mutate.default <- function(.data, ...) { + conditions <- pm_dotdotdot(..., .impute_names = TRUE) + .data[, setdiff(names(conditions), names(.data))] <- NA + pm_context$setup(.data) + on.exit(pm_context$clean(), add = TRUE) + for (i in seq_along(conditions)) { + pm_context$.data[, names(conditions)[i]] <- eval(conditions[[i]], envir = pm_context$.data) + } + pm_context$.data +} + +pm_mutate.grouped_data <- function(.data, ...) { + rows <- rownames(.data) + res <- pm_apply_grouped_function("pm_mutate", .data, drop = TRUE, ...) + res[rows, ] +} +pm_n_distinct <- function(..., na.rm = FALSE) { + res <- c(...) + if (is.list(res)) { + return(nrow(unique(as.data.frame(res, stringsAsFactors = FALSE)))) + } + if (isTRUE(na.rm)) res <- res[!is.na(res)] + length(unique(res)) +} +pm_na_if <- function(x, y) { + y_len <- length(y) + x_len <- length(x) + if (!(y_len %in% c(1L, x_len))) stop("`y` must be length ", x_len, " (same as `x`) or 1, not ", y_len) + x[x == y] <- NA + x +} +pm_near <- function(x, y, tol = .Machine$double.eps^0.5) { + abs(x - y) < tol +} +`%pm>%` <- function(lhs, rhs) { + lhs <- substitute(lhs) + rhs <- substitute(rhs) + eval(as.call(c(rhs[[1L]], lhs, as.list(rhs[-1L]))), envir = parent.frame()) +} +pm_pull <- function(.data, var = -1) { + var_deparse <- pm_deparse_var(var) + col_names <- colnames(.data) + if (!(var_deparse %in% col_names) & grepl("^[[:digit:]]+L|[[:digit:]]", var_deparse)) { + var <- as.integer(gsub("L", "", var_deparse)) + var <- pm_if_else(var < 1L, rev(col_names)[abs(var)], col_names[var]) + } else if (var_deparse %in% col_names) { + var <- var_deparse + } + .data[, var, drop = TRUE] +} +pm_set_names <- function(object = nm, nm) { + names(object) <- nm + object +} + +pm_vec_head <- function(x, pm_n = 6L, ...) { + stopifnot(length(pm_n) == 1L) + pm_n <- if (pm_n < 0L) max(length(x) + pm_n, 0L) else min(pm_n, length(x)) + x[seq_len(pm_n)] +} +pm_relocate <- function(.data, ..., .before = NULL, .after = NULL) { + pm_check_is_dataframe(.data) + data_names <- colnames(.data) + col_pos <- pm_select_positions(.data, ...) + + .before <- pm_deparse_var(.before) + .after <- pm_deparse_var(.after) + has_before <- !is.null(.before) + has_after <- !is.null(.after) + + if (has_before && has_after) { + stop("You must supply only one of `.before` and `.after`") + } else if (has_before) { + pm_where <- min(match(.before, data_names)) + col_pos <- c(setdiff(col_pos, pm_where), pm_where) + } else if (has_after) { + pm_where <- max(match(.after, data_names)) + col_pos <- c(pm_where, setdiff(col_pos, pm_where)) + } else { + pm_where <- 1L + col_pos <- union(col_pos, pm_where) + } + lhs <- setdiff(seq(1L, pm_where - 1L), col_pos) + rhs <- setdiff(seq(pm_where + 1L, ncol(.data)), col_pos) + col_pos <- unique(c(lhs, col_pos, rhs)) + col_pos <- col_pos[col_pos <= length(data_names)] + + res <- .data[col_pos] + if (pm_has_groups(.data)) res <- pm_set_groups(res, pm_get_groups(.data)) + res +} +pm_rename <- function(.data, ...) { + pm_check_is_dataframe(.data) + new_names <- names(pm_deparse_dots(...)) + if (length(new_names) == 0L) { + warning("You didn't give any new names") + return(.data) + } + col_pos <- pm_select_positions(.data, ...) + old_names <- colnames(.data)[col_pos] + new_names_zero <- nchar(new_names) == 0L + if (any(new_names_zero)) { + warning("You didn't provide new names for: ", paste0("`", old_names[new_names_zero], collapse = ", "), "`") + new_names[new_names_zero] <- old_names[new_names_zero] + } + colnames(.data)[col_pos] <- new_names + .data +} + +pm_rename_with <- function(.data, .fn, .cols = pm_everything(), ...) { + if (!is.function(.fn)) stop("`", .fn, "` is not a valid function") + grouped <- inherits(.data, "grouped_data") + if (grouped) grp_pos <- which(colnames(.data) %in% pm_group_vars(.data)) + col_pos <- eval(substitute(pm_select_positions(.data, .cols))) + cols <- colnames(.data)[col_pos] + new_cols <- .fn(cols, ...) + if (any(duplicated(new_cols))) { + stop("New names must be unique however `", deparse(substitute(.fn)), "` returns duplicate column names") + } + colnames(.data)[col_pos] <- new_cols + if (grouped) .data <- pm_set_groups(.data, colnames(.data)[grp_pos]) + .data +} +pm_replace_with <- function(x, i, val, arg_name) { + if (is.null(val)) { + return(x) + } + pm_check_length(val, x, arg_name) + pm_check_type(val, x, arg_name) + pm_check_class(val, x, arg_name) + i[is.na(i)] <- FALSE + if (length(val) == 1L) { + x[i] <- val + } else { + x[i] <- val[i] + } + x +} + +pm_check_length <- function(x, y, arg_name) { + length_x <- length(x) + length_y <- length(y) + if (all(length_x %in% c(1L, length_y))) { + return() + } + if (length_y == 1) { + stop(arg_name, " must be length 1, not ", paste(length_x, sep = ", ")) + } else { + stop(arg_name, " must be length ", length_y, " or 1, not ", length_x) + } +} + +pm_check_type <- function(x, y, arg_name) { + x_type <- typeof(x) + y_type <- typeof(y) + if (identical(x_type, y_type)) { + return() + } + stop(arg_name, " must be `", y_type, "`, not `", x_type, "`") +} + +pm_check_class <- function(x, y, arg_name) { + if (!is.object(x)) { + return() + } + exp_classes <- class(y) + out_classes <- class(x) + if (identical(out_classes, exp_classes)) { + return() + } + stop(arg_name, " must have class `", exp_classes, "`, not class `", out_classes, "`") +} +pm_rownames_to_column <- function(.data, var = "rowname") { + pm_check_is_dataframe(.data) + col_names <- colnames(.data) + if (var %in% col_names) stop("Column `", var, "` already exists in `.data`") + .data[, var] <- rownames(.data) + rownames(.data) <- NULL + .data[, c(var, setdiff(col_names, var))] +} +pm_starts_with <- function(match, ignore.case = TRUE, vars = pm_peek_vars()) { + grep(pattern = paste0("^", paste0(match, collapse = "|^")), x = vars, ignore.case = ignore.case) +} + +pm_ends_with <- function(match, ignore.case = TRUE, vars = pm_peek_vars()) { + grep(pattern = paste0(paste0(match, collapse = "$|"), "$"), x = vars, ignore.case = ignore.case) +} + +pm_contains <- function(match, ignore.case = TRUE, vars = pm_peek_vars()) { + pm_matches <- lapply( + match, + function(x) { + if (isTRUE(ignore.case)) { + match_u <- toupper(x) + match_l <- tolower(x) + pos_u <- grep(pattern = match_u, x = toupper(vars), fixed = TRUE) + pos_l <- grep(pattern = match_l, x = tolower(vars), fixed = TRUE) + unique(c(pos_l, pos_u)) + } else { + grep(pattern = x, x = vars, fixed = TRUE) + } + } + ) + unique(unlist(pm_matches)) +} + +pm_matches <- function(match, ignore.case = TRUE, perl = FALSE, vars = pm_peek_vars()) { + grep(pattern = match, x = vars, ignore.case = ignore.case, perl = perl) +} + +pm_num_range <- function(prefix, range, width = NULL, vars = pm_peek_vars()) { + if (!is.null(width)) { + range <- sprintf(paste0("%0", width, "d"), range) + } + find <- paste0(prefix, range) + if (any(duplicated(vars))) { + stop("Column names must be unique") + } else { + x <- match(find, vars) + x[!is.na(x)] + } +} + +pm_all_of <- function(x, vars = pm_peek_vars()) { + x_ <- !x %in% vars + if (any(x_)) { + which_x_ <- which(x_) + if (length(which_x_) == 1L) { + stop("The column ", x[which_x_], " does not exist.") + } else { + stop("The columns ", paste(x[which_x_], collapse = ", "), " do not exist.") + } + } else { + which(vars %in% x) + } +} + +pm_any_of <- function(x, vars = pm_peek_vars()) { + which(vars %in% x) +} + +pm_everything <- function(vars = pm_peek_vars()) { + seq_along(vars) +} + +pm_last_col <- function(offset = 0L, vars = pm_peek_vars()) { + if (!pm_is_wholenumber(offset)) stop("`offset` must be an integer") + pm_n <- length(vars) + if (offset && pm_n <= offset) { + stop("`offset` must be smaller than the number of `vars`") + } else if (pm_n == 0) { + stop("Can't pm_select last column when `vars` is empty") + } else { + pm_n - offset + } +} + +pm_peek_vars <- function() { + pm_select_env$get_colnames() +} +pm_select_positions <- function(.data, ..., .group_pos = FALSE) { + cols <- pm_dotdotdot(...) + pm_select_env$setup(.data = .data, calling_frame = parent.frame(2L)) + on.exit(pm_select_env$clean(), add = TRUE) + data_names <- pm_select_env$get_colnames() + pos <- unlist(lapply(cols, pm_eval_expr)) + col_len <- pm_select_env$get_ncol() + if (any(pos > col_len)) { + oor <- pos[which(pos > col_len)] + oor_len <- length(oor) + stop( + "Location", if (oor_len > 1) "s " else " ", pm_collapse_to_sentence(oor), + if (oor_len > 1) " don't " else " doesn't ", "exist. There are only ", col_len, " columns." + ) + } + if (isTRUE(.group_pos)) { + pm_groups <- pm_get_groups(.data) + missing_groups <- !(pm_groups %in% cols) + if (any(missing_groups)) { + sel_missing <- pm_groups[missing_groups] + message("Adding missing grouping variables: `", paste(sel_missing, collapse = "`, `"), "`") + readd <- match(sel_missing, data_names) + if (length(names(cols)) > 0L) names(readd) <- data_names[readd] + pos <- c(readd, pos) + } + } + pos[!duplicated(pos)] +} + +pm_eval_expr <- function(x) { + type <- typeof(x) + switch(type, + "integer" = x, + "double" = as.integer(x), + "character" = pm_select_char(x), + "symbol" = pm_select_symbol(x), + "language" = pm_eval_call(x), + stop("Expressions of type <", typeof(x), "> cannot be evaluated for use when subsetting.") + ) +} + +pm_select_char <- function(expr) { + pos <- match(expr, pm_select_env$get_colnames()) + if (is.na(pos)) stop("Column `", expr, "` does not exist") + pos +} + +pm_select_symbol <- function(expr) { + expr_name <- as.character(expr) + if (grepl("^is\\.", expr_name) && pm_is_function(expr)) { + stop( + "Predicate functions must be wrapped in `pm_where()`.\n\n", + sprintf(" data %%pm>%% pm_select(pm_where(%s))", expr_name) + ) + } + res <- try(pm_select_char(as.character(expr)), silent = TRUE) + if (inherits(res, "try-error")) { + res <- tryCatch( + unlist(lapply(eval(expr, envir = pm_select_env$calling_frame), pm_eval_expr)), + error = function(e) stop("Column ", expr, " does not exist.") + ) + } + res +} + +pm_eval_call <- function(x) { + type <- as.character(x[[1]]) + switch(type, + `:` = pm_select_seq(x), + `!` = pm_select_negate(x), + `-` = pm_select_minus(x), + `c` = pm_select_c(x), + `(` = pm_select_bracket(x), + pm_select_pm_context(x) + ) +} + +pm_select_seq <- function(expr) { + x <- pm_eval_expr(expr[[2]]) + y <- pm_eval_expr(expr[[3]]) + x:y +} + +pm_select_negate <- function(expr) { + x <- if (pm_is_negated_colon(expr)) { + expr <- call(":", expr[[2]][[2]], expr[[2]][[3]][[2]]) + pm_eval_expr(expr) + } else { + pm_eval_expr(expr[[2]]) + } + x * -1L +} + +pm_is_negated_colon <- function(expr) { + expr[[1]] == "!" && length(expr[[2]]) > 1L && expr[[2]][[1]] == ":" && expr[[2]][[3]][[1]] == "!" +} + +pm_select_minus <- function(expr) { + x <- pm_eval_expr(expr[[2]]) + x * -1L +} + +pm_select_c <- function(expr) { + lst_expr <- as.list(expr) + lst_expr[[1]] <- NULL + unlist(lapply(lst_expr, pm_eval_expr)) +} + +pm_select_bracket <- function(expr) { + pm_eval_expr(expr[[2]]) +} + +pm_select_pm_context <- function(expr) { + eval(expr, envir = pm_select_env$.data) +} + +pm_select_env <- new.env() +pm_select_env$setup <- function(.data, calling_frame) { + pm_select_env$.data <- .data + pm_select_env$calling_frame <- calling_frame +} +pm_select_env$clean <- function() { + rm(list = c(".data", "calling_frame"), envir = pm_select_env) +} +pm_select_env$get_colnames <- function() colnames(pm_select_env$.data) +pm_select_env$get_nrow <- function() nrow(pm_select_env$.data) +pm_select_env$get_ncol <- function() ncol(pm_select_env$.data) + +pm_select <- function(.data, ...) { + col_pos <- pm_select_positions(.data, ..., .group_pos = TRUE) + map_names <- names(col_pos) + map_names_length <- nchar(map_names) + if (any(map_names_length == 0L)) { + no_new_names <- which(map_names_length == 0L) + map_names[no_new_names] <- colnames(.data)[no_new_names] + } + res <- .data[, col_pos, drop = FALSE] + if (!is.null(map_names) && all(col_pos > 0L)) colnames(res) <- map_names + if (pm_has_groups(.data)) res <- pm_set_groups(res, pm_get_groups(.data)) + res +} +pm_summarise <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_summarise.grouped_data(.data, ...) + } else { + pm_summarise.default(.data, ...) + } +} + +pm_summarise.default <- function(.data, ...) { + fns <- pm_dotdotdot(...) + pm_context$setup(.data) + on.exit(pm_context$clean(), add = TRUE) + pm_groups_exist <- pm_has_groups(pm_context$.data) + if (pm_groups_exist) { + group <- unique(pm_context$.data[, pm_get_groups(pm_context$.data), drop = FALSE]) + } + res <- lapply( + fns, + function(x) { + x_res <- do.call(with, list(pm_context$.data, x)) + if (is.list(x_res)) I(x_res) else x_res + } + ) + res <- as.data.frame(res, stringsAsFactors = FALSE) + fn_names <- names(fns) + colnames(res) <- if (is.null(fn_names)) fns else fn_names + if (pm_groups_exist) res <- cbind(group, res, row.names = NULL) + res +} + +pm_summarise.grouped_data <- function(.data, ...) { + pm_groups <- pm_get_groups(.data) + res <- pm_apply_grouped_function("pm_summarise", .data, drop = TRUE, ...) + res <- res[do.call(order, lapply(pm_groups, function(x) res[, x])), ] + rownames(res) <- NULL + res +} + +pm_transmute <- function(.data, ...) { + pm_check_is_dataframe(.data) + if ("grouped_data" %in% class(.data)) { + pm_transmute.grouped_data(.data, ...) + } else { + pm_transmute.default(.data, ...) + } +} + +pm_transmute.default <- function(.data, ...) { + conditions <- pm_deparse_dots(...) + mutated <- pm_mutate(.data, ...) + mutated[, names(conditions), drop = FALSE] +} + +pm_transmute.grouped_data <- function(.data, ...) { + rows <- rownames(.data) + res <- pm_apply_grouped_function("pm_transmute", .data, drop = TRUE, ...) + res[rows, ] +} +pm_dotdotdot <- function(..., .impute_names = FALSE) { + dots <- eval(substitute(alist(...))) + if (isTRUE(.impute_names)) { + pm_deparse_dots <- lapply(dots, deparse) + names_dots <- names(dots) + unnamed <- if (is.null(names_dots)) rep(TRUE, length(dots)) else nchar(names_dots) == 0L + names(dots)[unnamed] <- pm_deparse_dots[unnamed] + } + dots +} + +pm_deparse_dots <- function(...) { + vapply(substitute(...()), deparse, NA_character_) +} + +pm_deparse_var <- function(var, frame = if (is.null(pm_eval_env$env)) parent.frame() else pm_eval_env$env) { + sub_var <- eval(substitute(substitute(var)), frame) + if (is.symbol(sub_var)) var <- as.character(sub_var) + var +} + +pm_check_is_dataframe <- function(.data) { + parent_fn <- all.names(sys.call(-1L), max.names = 1L) + if (!is.data.frame(.data)) stop(parent_fn, " must be given a data.frame") + invisible() +} + +pm_is_wholenumber <- function(x) { + x %% 1L == 0L +} + +pm_seq2 <- function(from, to) { + if (length(from) != 1) stop("`from` must be length one") + if (length(to) != 1) stop("`to` must be length one") + if (from > to) integer() else seq.int(from, to) +} + +pm_is_function <- function(x, frame) { + res <- tryCatch( + is.function(x), + warning = function(w) FALSE, + error = function(e) FALSE + ) + if (isTRUE(res)) { + return(res) + } + res <- tryCatch( + is.function(eval(x)), + warning = function(w) FALSE, + error = function(e) FALSE + ) + if (isTRUE(res)) { + return(res) + } + res <- tryCatch( + is.function(eval(as.symbol(deparse(substitute(x))))), + warning = function(w) FALSE, + error = function(e) FALSE + ) + if (isTRUE(res)) { + return(res) + } + FALSE +} + +pm_collapse_to_sentence <- function(x) { + len_x <- length(x) + if (len_x == 0L) { + stop("Length of `x` is 0") + } else if (len_x == 1L) { + as.character(x) + } else if (len_x == 2L) { + paste(x, collapse = " and ") + } else { + paste(paste(x[1:(len_x - 1)], collapse = ", "), x[len_x], sep = " and ") + } +} +pm_where <- function(fn) { + if (!pm_is_function(fn)) { + stop(pm_deparse_var(fn), " is not a valid predicate function.") + } + preds <- unlist(lapply( + pm_select_env$.data, + function(x, fn) { + do.call("fn", list(x)) + }, + fn + )) + if (!is.logical(preds)) stop("`pm_where()` must be used with functions that return `TRUE` or `FALSE`.") + data_cols <- pm_select_env$get_colnames() + cols <- data_cols[preds] + which(data_cols %in% cols) +} + +pm_cume_dist <- function(x) { + rank(x, ties.method = "max", na.last = "keep") / sum(!is.na(x)) +} + +pm_dense_rank <- function(x) { + match(x, sort(unique(x))) +} + +pm_min_rank <- function(x) { + rank(x, ties.method = "min", na.last = "keep") +} + +pm_ntile <- function(x = pm_row_number(), pm_n) { + if (!missing(x)) x <- pm_row_number(x) + len <- length(x) - sum(is.na(x)) + pm_n <- as.integer(floor(pm_n)) + if (len == 0L) { + rep(NA_integer_, length(x)) + } else { + pm_n_larger <- as.integer(len %% pm_n) + pm_n_smaller <- as.integer(pm_n - pm_n_larger) + size <- len / pm_n + larger_size <- as.integer(ceiling(size)) + smaller_size <- as.integer(floor(size)) + larger_threshold <- larger_size * pm_n_larger + bins <- pm_if_else( + x <= larger_threshold, + (x + (larger_size - 1L)) / larger_size, + (x + (-larger_threshold + smaller_size - 1L)) / smaller_size + pm_n_larger + ) + as.integer(floor(bins)) + } +} + +pm_percent_rank <- function(x) { + (pm_min_rank(x) - 1) / (sum(!is.na(x)) - 1) +} + +pm_row_number <- function(x) { + if (missing(x)) seq_len(pm_n()) else rank(x, ties.method = "first", na.last = "keep") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/aa_options.R + + + + +#' Options for the AMR package +#' +#' This is an overview of all the package-specific [options()] you can set in the `AMR` package. +#' @section Options: +#' * `AMR_antibiogram_formatting_type` \cr A [numeric] (1-12) to use in [antibiogram()], to indicate which formatting type to use. +#' * `AMR_breakpoint_type` \cr A [character] to use in [as.sir()], to indicate which breakpoint type to use. This must be either `r vector_or(clinical_breakpoints$type)`. +#' * `AMR_cleaning_regex` \cr A [regular expression][base::regex] (case-insensitive) to use in [as.mo()] and all [`mo_*`][mo_property()] functions, to clean the user input. The default is the outcome of [mo_cleaning_regex()], which removes texts between brackets and texts such as "species" and "serovar". +#' * `AMR_custom_ab` \cr A file location to an RDS file, to use custom antimicrobial drugs with this package. This is explained in [add_custom_antimicrobials()]. +#' * `AMR_custom_mo` \cr A file location to an RDS file, to use custom microorganisms with this package. This is explained in [add_custom_microorganisms()]. +#' * `AMR_eucastrules` \cr A [character] to set the default types of rules for [eucast_rules()] function, must be one or more of: `"breakpoints"`, `"expert"`, `"other"`, `"custom"`, `"all"`, and defaults to `c("breakpoints", "expert")`. +#' * `AMR_guideline` \cr A [character] to set the default guideline for interpreting MIC values and disk diffusion diameters with [as.sir()]. Can be only the guideline name (e.g., `"CLSI"`) or the name with a year (e.g. `"CLSI 2019"`). The default to the latest implemented EUCAST guideline, currently \code{"`r clinical_breakpoints$guideline[1]`"}. Supported guideline are currently EUCAST (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`) and CLSI (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`). +#' * `AMR_ignore_pattern` \cr A [regular expression][base::regex] to ignore (i.e., make `NA`) any match given in [as.mo()] and all [`mo_*`][mo_property()] functions. +#' * `AMR_include_PKPD` \cr A [logical] to use in [as.sir()], to indicate that PK/PD clinical breakpoints must be applied as a last resort - the default is `TRUE`. +#' * `AMR_include_screening` \cr A [logical] to use in [as.sir()], to indicate that clinical breakpoints for screening are allowed - the default is `FALSE`. +#' * `AMR_keep_synonyms` \cr A [logical] to use in [as.mo()] and all [`mo_*`][mo_property()] functions, to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is `FALSE`. +#' * `AMR_locale` \cr A [character] to set the language for the `AMR` package, can be one of these supported language names or ISO-639-1 codes: `r vector_or(paste0(sapply(LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), " (" , LANGUAGES_SUPPORTED, ")"), quotes = FALSE, sort = FALSE)`. The default is the current system language (if supported, English otherwise). +#' * `AMR_mo_source` \cr A file location for a manual code list to be used in [as.mo()] and all [`mo_*`][mo_property()] functions. This is explained in [set_mo_source()]. +#' +#' @section Saving Settings Between Sessions: +#' Settings in \R are not saved globally and are thus lost when \R is exited. You can save your options to your own `.Rprofile` file, which is a user-specific file. You can edit it using: +#' +#' ```r +#' utils::file.edit("~/.Rprofile") +#' ``` +#' +#' In this file, you can set options such as... +#' +#' ```r +#' options(AMR_locale = "pt") +#' options(AMR_include_PKPD = TRUE) +#' ``` +#' +#' ...to add Portuguese language support of antibiotics, and allow PK/PD rules when interpreting MIC values with [as.sir()]. +#' +#' ### Share Options Within Team +#' +#' For a more global approach, e.g. within a (data) team, save an options file to a remote file location, such as a shared network drive, and have each user read in this file automatically at start-up. This would work in this way: +#' +#' 1. Save a plain text file to e.g. "X:/team_folder/R_options.R" and fill it with preferred settings. +#' +#' 2. For each user, open the `.Rprofile` file using `utils::file.edit("~/.Rprofile")` and put in there: +#' +#' ```r +#' source("X:/team_folder/R_options.R") +#' ``` +#' +#' 3. Reload R/RStudio and check the settings with [getOption()], e.g. `getOption("AMR_locale")` if you have set that value. +#' +#' Now the team settings are configured in only one place, and can be maintained there. +#' @keywords internal +#' @name AMR-options +NULL + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ab.R + + + + +#' Transform Input to an Antibiotic ID +#' +#' Use this function to determine the antibiotic drug code of one or more antibiotics. The data set [antibiotics] will be searched for abbreviations, official names and synonyms (brand names). +#' @param x a [character] vector to determine to antibiotic ID +#' @param flag_multiple_results a [logical] to indicate whether a note should be printed to the console that probably more than one antibiotic drug code or name can be retrieved from a single input value. +#' @param info a [logical] to indicate whether a progress bar should be printed - the default is `TRUE` only in interactive mode +#' @param ... arguments passed on to internal functions +#' @rdname as.ab +#' @inheritSection WHOCC WHOCC +#' @details All entries in the [antibiotics] data set have three different identifiers: a human readable EARS-Net code (column `ab`, used by ECDC and WHONET), an ATC code (column `atc`, used by WHO), and a CID code (column `cid`, Compound ID, used by PubChem). The data set contains more than 5,000 official brand names from many different countries, as found in PubChem. Not that some drugs contain multiple ATC codes. +#' +#' All these properties will be searched for the user input. The [as.ab()] can correct for different forms of misspelling: +#' +#' * Wrong spelling of drug names (such as "tobramicin" or "gentamycin"), which corrects for most audible similarities such as f/ph, x/ks, c/z/s, t/th, etc. +#' * Too few or too many vowels or consonants +#' * Switching two characters (such as "mreopenem", often the case in clinical data, when doctors typed too fast) +#' * Digitalised paper records, leaving artefacts like 0/o/O (zero and O's), B/8, n/r, etc. +#' +#' Use the [`ab_*`][ab_property()] functions to get properties based on the returned antibiotic ID, see *Examples*. +#' +#' Note: the [as.ab()] and [`ab_*`][ab_property()] functions may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. +#' +#' You can add your own manual codes to be considered by [as.ab()] and all [`ab_*`][ab_property()] functions, see [add_custom_antimicrobials()]. +#' @section Source: +#' World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} +#' +#' European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +#' @aliases ab +#' @return A [character] [vector] with additional class [`ab`] +#' @seealso +#' * [antibiotics] for the [data.frame] that is being used to determine ATCs +#' * [ab_from_text()] for a function to retrieve antimicrobial drugs from clinical text (from health care records) +#' @inheritSection AMR Reference Data Publicly Available +#' @export +#' @examples +#' # these examples all return "ERY", the ID of erythromycin: +#' as.ab("J01FA01") +#' as.ab("J 01 FA 01") +#' as.ab("Erythromycin") +#' as.ab("eryt") +#' as.ab(" eryt 123") +#' as.ab("ERYT") +#' as.ab("ERY") +#' as.ab("eritromicine") # spelled wrong, yet works +#' as.ab("Erythrocin") # trade name +#' as.ab("Romycin") # trade name +#' +#' # spelling from different languages and dyslexia are no problem +#' ab_atc("ceftriaxon") +#' ab_atc("cephtriaxone") # small spelling error +#' ab_atc("cephthriaxone") # or a bit more severe +#' ab_atc("seephthriaaksone") # and even this works +#' +#' # use ab_* functions to get a specific properties (see ?ab_property); +#' # they use as.ab() internally: +#' ab_name("J01FA01") +#' ab_name("eryt") +#' +#' \donttest{ +#' if (require("dplyr")) { +#' # you can quickly rename 'sir' columns using set_ab_names() with dplyr: +#' example_isolates %>% +#' set_ab_names(where(is.sir), property = "atc") +#' } +#' } +as.ab <- function(x, flag_multiple_results = TRUE, info = interactive(), ...) { + meet_criteria(x, allow_class = c("character", "numeric", "integer", "factor"), allow_NA = TRUE) + meet_criteria(flag_multiple_results, allow_class = "logical", has_length = 1) + meet_criteria(info, allow_class = "logical", has_length = 1) + + if (is.ab(x)) { + return(x) + } + if (all(x %in% c(AMR_env$AB_lookup$ab, NA))) { + # all valid AB codes, but not yet right class + return(set_clean_class(x, + new_class = c("ab", "character") + )) + } + + loop_time <- list(...)$loop_time + if (is.null(loop_time)) { + loop_time <- 1 + } + already_regex <- isTRUE(list(...)$already_regex) + fast_mode <- isTRUE(list(...)$fast_mode) + + x_bak <- x + x <- toupper(x) + + # remove diacritics + x <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT") + x <- gsub('"', "", x, fixed = TRUE) + x <- gsub("(specimen|specimen date|specimen_date|spec_date|gender|^dates?$|animal|host($|[a-z]))", "", x, ignore.case = TRUE, perl = TRUE) + # penicillin is a special case: we call it so, but then most often mean benzylpenicillin + x[x %like_case% "^PENICILLIN" & x %unlike_case% "[ /+-]"] <- "benzylpenicillin" + x_bak_clean <- x + if (already_regex == FALSE) { + x_bak_clean <- generalise_antibiotic_name(x_bak_clean) + } + + x <- unique(x_bak_clean) # this means that every x is in fact generalise_antibiotic_name(x) + x_new <- rep(NA_character_, length(x)) + x_unknown <- character(0) + x_unknown_ATCs <- character(0) + + note_if_more_than_one_found <- function(found, index, from_text) { + if (loop_time == 1 && isTRUE(length(from_text) > 1)) { + abnames <- ab_name(from_text, tolower = TRUE, loop_time = loop_time + 1) + if (ab_name(found[1L], language = NULL) %like% "(clavulanic acid|(avi|tazo|mono|vabor)bactam)") { + abnames <- abnames[!abnames %in% c("clavulanic acid", "avibactam", "tazobactam", "vaborbactam", "monobactam")] + } + if (length(abnames) > 1) { + if (toupper(paste(abnames, collapse = " ")) %in% AMR_env$AB_lookup$generalised_name) { + # if the found values combined is a valid AB, return that + found <- AMR_env$AB_lookup$ab[match(toupper(paste(abnames, collapse = " ")), AMR_env$AB_lookup$generalised_name)][1] + } else { + message_( + "More than one result was found for item ", index, ": ", + vector_and(abnames, quotes = FALSE) + ) + } + } + } + found[1L] + } + + # Fill in names, AB codes, CID codes and ATC codes directly (`x` is already clean and uppercase) + known_names <- x %in% AMR_env$AB_lookup$generalised_name + x_new[known_names] <- AMR_env$AB_lookup$ab[match(x[known_names], AMR_env$AB_lookup$generalised_name)] + known_codes_ab <- x %in% AMR_env$AB_lookup$ab + known_codes_atc <- vapply(FUN.VALUE = logical(1), x, function(x_) x_ %in% unlist(AMR_env$AB_lookup$atc), USE.NAMES = FALSE) + known_codes_cid <- x %in% AMR_env$AB_lookup$cid + x_new[known_codes_ab] <- AMR_env$AB_lookup$ab[match(x[known_codes_ab], AMR_env$AB_lookup$ab)] + x_new[known_codes_atc] <- AMR_env$AB_lookup$ab[vapply( + FUN.VALUE = integer(1), + x[known_codes_atc], + function(x_) { + which(vapply( + FUN.VALUE = logical(1), + AMR_env$AB_lookup$atc, + function(atc) x_ %in% atc + ))[1L] + }, + USE.NAMES = FALSE + )] + x_new[known_codes_cid] <- AMR_env$AB_lookup$ab[match(x[known_codes_cid], AMR_env$AB_lookup$cid)] + previously_coerced <- x %in% AMR_env$ab_previously_coerced$x + x_new[previously_coerced & is.na(x_new)] <- AMR_env$ab_previously_coerced$ab[match(x[is.na(x_new) & x %in% AMR_env$ab_previously_coerced$x], AMR_env$ab_previously_coerced$x)] + already_known <- known_names | known_codes_ab | known_codes_atc | known_codes_cid | previously_coerced + + # fix for NAs + x_new[is.na(x)] <- NA + already_known[is.na(x)] <- FALSE + + if (loop_time == 1 && sum(already_known) < length(x)) { + progress <- progress_ticker(n = sum(!already_known), n_min = 25, print = info) # start if n >= 25 + on.exit(close(progress)) + } + + for (i in which(!already_known)) { + if (loop_time == 1) { + progress$tick() + } + + if (is.na(x[i]) || is.null(x[i])) { + next + } + if (identical(x[i], "") || + # prevent "bacteria" from coercing to TMP, since Bacterial is a brand name of it: + identical(tolower(x[i]), "bacteria")) { + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + next + } + if (x[i] %like_case% "[A-Z][0-9][0-9][A-Z][A-Z][0-9][0-9]") { + # seems an ATC code, but the available ones are in `already_known`, so: + x_unknown <- c(x_unknown, x[i]) + x_unknown_ATCs <- c(x_unknown_ATCs, x[i]) + x_new[i] <- NA_character_ + next + } + + if (fast_mode == FALSE && flag_multiple_results == TRUE && x[i] %like% "[ ]") { + from_text <- tryCatch(suppressWarnings(ab_from_text(x[i], loop_time = loop_time + 1, translate_ab = FALSE)[[1]]), + error = function(e) character(0) + ) + } else { + from_text <- character(0) + } + + # old code for phenoxymethylpenicillin (Peni V) + if (x[i] == "PNV") { + x_new[i] <- "PHN" + next + } + + # exact LOINC code + loinc_found <- unlist(lapply( + AMR_env$AB_lookup$generalised_loinc, + function(s) x[i] %in% s + )) + found <- AMR_env$AB_lookup$ab[loinc_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # exact synonym + synonym_found <- unlist(lapply( + AMR_env$AB_lookup$generalised_synonyms, + function(s) x[i] %in% s + )) + found <- AMR_env$AB_lookup$ab[synonym_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # exact abbreviation + abbr_found <- unlist(lapply( + AMR_env$AB_lookup$generalised_abbreviations, + # require at least 2 characters for abbreviations + function(s) x[i] %in% s && nchar(x[i]) >= 2 + )) + found <- AMR_env$AB_lookup$ab[abbr_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # length of input is quite long, and Levenshtein distance is only max 2 + if (nchar(x[i]) >= 10) { + levenshtein <- as.double(utils::adist(x[i], AMR_env$AB_lookup$generalised_name)) + if (any(levenshtein <= 2)) { + found <- AMR_env$AB_lookup$ab[which(levenshtein <= 2)] + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # allow characters that resemble others, but only continue when having more than 3 characters + if (nchar(x[i]) <= 3) { + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + next + } + x_spelling <- x[i] + if (already_regex == FALSE) { + x_spelling <- gsub("[IY]+", "[IY]+", x_spelling, perl = TRUE) + x_spelling <- gsub("(C|K|Q|QU|S|Z|X|KS)+", "(C|K|Q|QU|S|Z|X|KS)+", x_spelling, perl = TRUE) + x_spelling <- gsub("(PH|F|V)+", "(PH|F|V)+", x_spelling, perl = TRUE) + x_spelling <- gsub("(TH|T)+", "(TH|T)+", x_spelling, perl = TRUE) + x_spelling <- gsub("A+", "A+", x_spelling, perl = TRUE) + x_spelling <- gsub("E+", "E+", x_spelling, perl = TRUE) + x_spelling <- gsub("O+", "O+", x_spelling, perl = TRUE) + # allow any ending of -in/-ine and -im/-ime + x_spelling <- gsub("(\\[IY\\]\\+(N|M)|\\[IY\\]\\+(N|M)E\\+?)$", "[IY]+(N|M)E*", x_spelling, perl = TRUE) + # allow any ending of -ol/-ole + x_spelling <- gsub("(O\\+L|O\\+LE\\+)$", "O+LE*", x_spelling, perl = TRUE) + # allow any ending of -on/-one + x_spelling <- gsub("(O\\+N|O\\+NE\\+)$", "O+NE*", x_spelling, perl = TRUE) + # replace multiple same characters to single one with '+', like "ll" -> "l+" + x_spelling <- gsub("(.)\\1+", "\\1+", x_spelling, perl = TRUE) + # replace spaces and slashes with a possibility on both + x_spelling <- gsub("[ /]", "( .*|.*/)", x_spelling, perl = TRUE) + # correct for digital reading text (OCR) + x_spelling <- gsub("[NRD8B]", "[NRD8B]", x_spelling, perl = TRUE) + x_spelling <- gsub("(O|0)", "(O|0)+", x_spelling, perl = TRUE) + x_spelling <- gsub("++", "+", x_spelling, fixed = TRUE) + } + + # try if name starts with it + found <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$generalised_name %like% paste0("^", x_spelling)), "ab", drop = TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + # try if name ends with it + found <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$generalised_name %like% paste0(x_spelling, "$")), "ab", drop = TRUE] + if (nchar(x[i]) >= 4 && length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # and try if any synonym starts with it + synonym_found <- unlist(lapply( + AMR_env$AB_lookup$generalised_synonyms, + function(s) any(s %like% paste0("^", x_spelling)) + )) + found <- AMR_env$AB_lookup$ab[synonym_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # INITIAL SEARCH - More uncertain results ---- + if (loop_time <= 2 && fast_mode == FALSE) { + # only run on first and second try + + # try by removing all spaces + if (x[i] %like% " ") { + found <- suppressWarnings(as.ab(gsub(" +", "", x[i], perl = TRUE), loop_time = loop_time + 2)) + if (length(found) > 0 && !is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # try by removing all spaces and numbers + if (x[i] %like% " " || x[i] %like% "[0-9]") { + found <- suppressWarnings(as.ab(gsub("[ 0-9]", "", x[i], perl = TRUE), loop_time = loop_time + 2)) + if (length(found) > 0 && !is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # transform back from other languages and try again + x_translated <- paste( + lapply( + strsplit(x[i], "[^A-Z0-9]"), + function(y) { + for (i in seq_len(length(y))) { + for (lang in LANGUAGES_SUPPORTED[LANGUAGES_SUPPORTED != "en"]) { + y[i] <- ifelse(tolower(y[i]) %in% tolower(TRANSLATIONS[, lang, drop = TRUE]), + TRANSLATIONS[which(tolower(TRANSLATIONS[, lang, drop = TRUE]) == tolower(y[i]) & + !isFALSE(TRANSLATIONS$fixed)), "pattern"], + y[i] + ) + } + } + generalise_antibiotic_name(y) + } + )[[1]], + collapse = "/" + ) + x_translated_guess <- suppressWarnings(as.ab(x_translated, loop_time = loop_time + 2)) + if (!is.na(x_translated_guess)) { + x_new[i] <- x_translated_guess + next + } + + # now also try to coerce brandname combinations like "Amoxy/clavulanic acid" + x_translated <- paste( + lapply( + strsplit(x_translated, "[^A-Z0-9 ]"), + function(y) { + for (i in seq_len(length(y))) { + y_name <- suppressWarnings(ab_name(y[i], language = NULL, loop_time = loop_time + 2)) + y[i] <- ifelse(!is.na(y_name), + y_name, + y[i] + ) + } + generalise_antibiotic_name(y) + } + )[[1]], + collapse = "/" + ) + x_translated_guess <- suppressWarnings(as.ab(x_translated, loop_time = loop_time + 2)) + if (!is.na(x_translated_guess)) { + x_new[i] <- x_translated_guess + next + } + + # try by removing all trailing capitals + if (x[i] %like_case% "[a-z]+[A-Z]+$") { + found <- suppressWarnings(as.ab(gsub("[A-Z]+$", "", x[i], perl = TRUE), loop_time = loop_time + 2)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # keep only letters + found <- suppressWarnings(as.ab(gsub("[^A-Z]", "", x[i], perl = TRUE), loop_time = loop_time + 2)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # try from a bigger text, like from a health care record, see ?ab_from_text + # already calculated above if flag_multiple_results = TRUE + if (flag_multiple_results == TRUE) { + found <- from_text[1L] + } else { + found <- tryCatch(suppressWarnings(ab_from_text(x[i], loop_time = loop_time + 2, translate_ab = FALSE)[[1]][1L]), + error = function(e) NA_character_ + ) + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # first 5 except for cephalosporins, then first 7 (those cephalosporins all start quite the same!) + found <- suppressWarnings(as.ab(substr(x[i], 1, 5), loop_time = loop_time + 2)) + if (!is.na(found) && ab_group(found, loop_time = loop_time + 1) %unlike% "cephalosporins") { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + found <- suppressWarnings(as.ab(substr(x[i], 1, 7), loop_time = loop_time + 2)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # make all consonants facultative + search_str <- gsub("([BCDFGHJKLMNPQRSTVWXZ])", "\\1*", x[i], perl = TRUE) + found <- suppressWarnings(as.ab(search_str, loop_time = loop_time + 2, already_regex = TRUE)) + # keep at least 4 normal characters + if (nchar(gsub(".\\*", "", search_str, perl = TRUE)) < 4) { + found <- NA + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # make all vowels facultative + search_str <- gsub("([AEIOUY])", "\\1*", x[i], perl = TRUE) + found <- suppressWarnings(as.ab(search_str, loop_time = loop_time + 2, already_regex = TRUE)) + # keep at least 5 normal characters + if (nchar(gsub(".\\*", "", search_str, perl = TRUE)) < 5) { + found <- NA + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # allow misspelling of vowels + x_spelling <- gsub("A+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("E+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("I+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("O+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("U+", "[AEIOU]+", x_spelling, fixed = TRUE) + found <- suppressWarnings(as.ab(x_spelling, loop_time = loop_time + 2, already_regex = TRUE)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # try with switched character, like "mreopenem" + for (j in seq_len(nchar(x[i]))) { + x_switched <- paste0( + # beginning part: + substr(x[i], 1, j - 1), + # here is the switching of 2 characters: + substr(x[i], j + 1, j + 1), + substr(x[i], j, j), + # ending part: + substr(x[i], j + 2, nchar(x[i])) + ) + found <- suppressWarnings(as.ab(x_switched, loop_time = loop_time + 1)) + if (!is.na(found)) { + break + } + } + if (!is.na(found)) { + x_new[i] <- found[1L] + next + } + } # end of loop_time <= 2 + + # not found + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + } + + if (loop_time == 1 && sum(already_known) < length(x)) { + close(progress) + } + + # save to package env to save time for next time + if (loop_time == 1) { + AMR_env$ab_previously_coerced <- AMR_env$ab_previously_coerced[which(!AMR_env$ab_previously_coerced$x %in% x), , drop = FALSE] + AMR_env$ab_previously_coerced <- unique(rbind_AMR( + AMR_env$ab_previously_coerced, + data.frame( + x = x, + ab = x_new, + x_bak = x_bak[match(x, x_bak_clean)], + stringsAsFactors = FALSE + ) + )) + } + + # take failed ATC codes apart from rest + if (length(x_unknown_ATCs) > 0 && fast_mode == FALSE) { + warning_( + "in `as.ab()`: these ATC codes are not (yet) in the antibiotics data set: ", + vector_and(x_unknown_ATCs), "." + ) + } + x_unknown <- x_unknown[!x_unknown %in% x_unknown_ATCs] + x_unknown <- c( + x_unknown, + AMR_env$ab_previously_coerced$x_bak[which(AMR_env$ab_previously_coerced$x %in% x & is.na(AMR_env$ab_previously_coerced$ab))] + ) + x_unknown <- x_unknown[!x_unknown %in% c("", NA)] + if (length(x_unknown) > 0 && fast_mode == FALSE) { + warning_( + "in `as.ab()`: these values could not be coerced to a valid antimicrobial ID: ", + vector_and(x_unknown), "." + ) + } + + x_result <- x_new[match(x_bak_clean, x)] + if (length(x_result) == 0) { + x_result <- NA_character_ + } + + set_clean_class(x_result, + new_class = c("ab", "character") + ) +} + +#' @rdname as.ab +#' @export +is.ab <- function(x) { + inherits(x, "ab") +} + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.ab <- function(x, ...) { + out <- trimws(format(x)) + out[is.na(x)] <- font_na(NA) + + # add the names to the drugs as mouse-over! + if (tryCatch(isTRUE(getExportedValue("ansi_has_hyperlink_support", ns = asNamespace("cli"))()), error = function(e) FALSE)) { + out[!is.na(x)] <- font_url(url = paste0(x[!is.na(x)], ": ", ab_name(x[!is.na(x)])), + txt = out[!is.na(x)]) + } + + create_pillar_column(out, align = "left", min_width = 4) +} + +# will be exported using s3_register() in R/zzz.R +type_sum.ab <- function(x, ...) { + "ab" +} + +#' @method print ab +#' @export +#' @noRd +print.ab <- function(x, ...) { + cat("Class 'ab'\n") + print(as.character(x), quote = FALSE) +} + +#' @method as.data.frame ab +#' @export +#' @noRd +as.data.frame.ab <- function(x, ...) { + nm <- deparse1(substitute(x)) + if (!"nm" %in% names(list(...))) { + as.data.frame.vector(as.ab(x), ..., nm = nm) + } else { + as.data.frame.vector(as.ab(x), ...) + } +} +#' @method [ ab +#' @export +#' @noRd +"[.ab" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [[ ab +#' @export +#' @noRd +"[[.ab" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [<- ab +#' @export +#' @noRd +"[<-.ab" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + return_after_integrity_check(y, "antimicrobial drug code", AMR_env$AB_lookup$ab) +} +#' @method [[<- ab +#' @export +#' @noRd +"[[<-.ab" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + return_after_integrity_check(y, "antimicrobial drug code", AMR_env$AB_lookup$ab) +} +#' @method c ab +#' @export +#' @noRd +c.ab <- function(...) { + x <- list(...)[[1L]] + y <- NextMethod() + attributes(y) <- attributes(x) + return_after_integrity_check(y, "antimicrobial drug code", AMR_env$AB_lookup$ab) +} + +#' @method unique ab +#' @export +#' @noRd +unique.ab <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method rep ab +#' @export +#' @noRd +rep.ab <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +generalise_antibiotic_name <- function(x) { + x <- toupper(x) + # remove suffices + x <- gsub("_(MIC|RSI|SIR|DIS[CK])$", "", x, perl = TRUE) + # remove disk concentrations, like LVX_NM -> LVX + x <- gsub("_[A-Z]{2}[0-9_.]{0,3}$", "", x, perl = TRUE) + # keep only max 1 space + x <- trimws2(gsub(" +", " ", x, perl = TRUE)) + # non-character, space or number should be a slash + x <- gsub("[^A-Z0-9 -)(]", "/", x, perl = TRUE) + # correct for 'high level' antibiotics + x <- gsub("([^A-Z0-9/ -]+)?(HIGH(.?LE?VE?L)?|[^A-Z0-9/]H[^A-Z0-9]?L)([^A-Z0-9 -]+)?", "-HIGH", x, perl = TRUE) + # remove part between brackets if that's followed by another string + x <- gsub("(.*)+ [(].*[)]", "\\1", x) + # spaces around non-characters must be removed: amox + clav -> amox/clav + x <- gsub("(.*[A-Z0-9]) ([^A-Z0-9].*)", "\\1\\2", x, perl = TRUE) + x <- gsub("(.*[^A-Z0-9]) ([A-Z0-9].*)", "\\1\\2", x, perl = TRUE) + # remove hyphen after a starting "co" + x <- gsub("^CO-", "CO", x, perl = TRUE) + # replace operators with a space + x <- gsub("(/| AND | WITH | W/|[+]|[-])+", " ", x, perl = TRUE) + x +} + +get_translate_ab <- function(translate_ab) { + translate_ab <- as.character(translate_ab)[1L] + if (translate_ab %in% c("TRUE", "official")) { + return("name") + } else if (translate_ab %in% c(NA_character_, "FALSE")) { + return(FALSE) + } else { + translate_ab <- tolower(translate_ab) + stop_ifnot(translate_ab %in% colnames(AMR::antibiotics), + "invalid value for 'translate_ab', this must be a column name of the antibiotics data set\n", + "or TRUE (equals 'name') or FALSE to not translate at all.", + call = FALSE + ) + translate_ab + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ab_from_text.R + + + + +#' Retrieve Antimicrobial Drug Names and Doses from Clinical Text +#' +#' Use this function on e.g. clinical texts from health care records. It returns a [list] with all antimicrobial drugs, doses and forms of administration found in the texts. +#' @param text text to analyse +#' @param type type of property to search for, either `"drug"`, `"dose"` or `"administration"`, see *Examples* +#' @param collapse a [character] to pass on to `paste(, collapse = ...)` to only return one [character] per element of `text`, see *Examples* +#' @param translate_ab if `type = "drug"`: a column name of the [antibiotics] data set to translate the antibiotic abbreviations to, using [ab_property()]. The default is `FALSE`. Using `TRUE` is equal to using "name". +#' @param thorough_search a [logical] to indicate whether the input must be extensively searched for misspelling and other faulty input values. Setting this to `TRUE` will take considerably more time than when using `FALSE`. At default, it will turn `TRUE` when all input elements contain a maximum of three words. +#' @param info a [logical] to indicate whether a progress bar should be printed - the default is `TRUE` only in interactive mode +#' @param ... arguments passed on to [as.ab()] +#' @details This function is also internally used by [as.ab()], although it then only searches for the first drug name and will throw a note if more drug names could have been returned. Note: the [as.ab()] function may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. +#' +#' ### Argument `type` +#' At default, the function will search for antimicrobial drug names. All text elements will be searched for official names, ATC codes and brand names. As it uses [as.ab()] internally, it will correct for misspelling. +#' +#' With `type = "dose"` (or similar, like "dosing", "doses"), all text elements will be searched for [numeric] values that are higher than 100 and do not resemble years. The output will be [numeric]. It supports any unit (g, mg, IE, etc.) and multiple values in one clinical text, see *Examples*. +#' +#' With `type = "administration"` (or abbreviations, like "admin", "adm"), all text elements will be searched for a form of drug administration. It supports the following forms (including common abbreviations): buccal, implant, inhalation, instillation, intravenous, nasal, oral, parenteral, rectal, sublingual, transdermal and vaginal. Abbreviations for oral (such as 'po', 'per os') will become "oral", all values for intravenous (such as 'iv', 'intraven') will become "iv". It supports multiple values in one clinical text, see *Examples*. +#' +#' ### Argument `collapse` +#' Without using `collapse`, this function will return a [list]. This can be convenient to use e.g. inside a `mutate()`):\cr +#' `df %>% mutate(abx = ab_from_text(clinical_text))` +#' +#' The returned AB codes can be transformed to official names, groups, etc. with all [`ab_*`][ab_property()] functions such as [ab_name()] and [ab_group()], or by using the `translate_ab` argument. +#' +#' With using `collapse`, this function will return a [character]:\cr +#' `df %>% mutate(abx = ab_from_text(clinical_text, collapse = "|"))` +#' @export +#' @return A [list], or a [character] if `collapse` is not `NULL` +#' @examples +#' # mind the bad spelling of amoxicillin in this line, +#' # straight from a true health care record: +#' ab_from_text("28/03/2020 regular amoxicilliin 500mg po tid") +#' +#' ab_from_text("500 mg amoxi po and 400mg cipro iv") +#' ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "dose") +#' ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "admin") +#' +#' ab_from_text("500 mg amoxi po and 400mg cipro iv", collapse = ", ") +#' \donttest{ +#' # if you want to know which antibiotic groups were administered, do e.g.: +#' abx <- ab_from_text("500 mg amoxi po and 400mg cipro iv") +#' ab_group(abx[[1]]) +#' +#' if (require("dplyr")) { +#' tibble(clinical_text = c( +#' "given 400mg cipro and 500 mg amox", +#' "started on doxy iv today" +#' )) %>% +#' mutate( +#' abx_codes = ab_from_text(clinical_text), +#' abx_doses = ab_from_text(clinical_text, type = "doses"), +#' abx_admin = ab_from_text(clinical_text, type = "admin"), +#' abx_coll = ab_from_text(clinical_text, collapse = "|"), +#' abx_coll_names = ab_from_text(clinical_text, +#' collapse = "|", +#' translate_ab = "name" +#' ), +#' abx_coll_doses = ab_from_text(clinical_text, +#' type = "doses", +#' collapse = "|" +#' ), +#' abx_coll_admin = ab_from_text(clinical_text, +#' type = "admin", +#' collapse = "|" +#' ) +#' ) +#' } +#' } +ab_from_text <- function(text, + type = c("drug", "dose", "administration"), + collapse = NULL, + translate_ab = FALSE, + thorough_search = NULL, + info = interactive(), + ...) { + if (missing(type)) { + type <- type[1L] + } + + meet_criteria(text) + meet_criteria(type, allow_class = "character", has_length = 1) + meet_criteria(collapse, has_length = 1, allow_NULL = TRUE) + meet_criteria(translate_ab, allow_NULL = FALSE) # get_translate_ab() will be more informative about what's allowed + meet_criteria(thorough_search, allow_class = "logical", has_length = 1, allow_NULL = TRUE) + meet_criteria(info, allow_class = "logical", has_length = 1) + + type <- tolower(trimws2(type)) + + text <- tolower(as.character(text)) + text_split_all <- strsplit(text, "[ ;.,:\\|]") + progress <- progress_ticker(n = length(text_split_all), n_min = 5, print = info) + on.exit(close(progress)) + + if (type %like% "(drug|ab|anti)") { + translate_ab <- get_translate_ab(translate_ab) + + if (isTRUE(thorough_search) || + (isTRUE(is.null(thorough_search)) && max(vapply(FUN.VALUE = double(1), text_split_all, length), na.rm = TRUE) <= 3)) { + text_split_all <- text_split_all[nchar(text_split_all) >= 4 & grepl("[a-z]+", text_split_all)] + result <- lapply(text_split_all, function(text_split) { + progress$tick() + text_split <- text_split[text_split %like% "[A-Z]" & text_split %unlike% "[0-9]"] + if (length(text_split) == 0) { + return(as.ab(NA_character_)) + } + suppressWarnings( + as.ab(text_split, ...) + ) + }) + } else { + # no thorough search + abbr <- unlist(AMR::antibiotics$abbreviations) + abbr <- abbr[nchar(abbr) >= 4] + names_atc <- substr(c(AMR::antibiotics$name, AMR::antibiotics$atc), 1, 5) + synonyms <- unlist(AMR::antibiotics$synonyms) + synonyms <- synonyms[nchar(synonyms) >= 4] + # regular expression must not be too long, so split synonyms in two: + synonyms_part1 <- synonyms[seq_len(0.5 * length(synonyms))] + synonyms_part2 <- synonyms[!synonyms %in% synonyms_part1] + to_regex <- function(x) { + paste0( + "^(", + paste0(unique(gsub("[^a-z0-9]+", "", sort(tolower(x)))), collapse = "|"), + ").*" + ) + } + result <- lapply(text_split_all, function(text_split) { + progress$tick() + suppressWarnings( + as.ab( + unique(c( + text_split[text_split %like_case% to_regex(abbr)], + text_split[text_split %like_case% to_regex(names_atc)], + text_split[text_split %like_case% to_regex(synonyms_part1)], + text_split[text_split %like_case% to_regex(synonyms_part2)] + )), + ... + ) + ) + }) + } + + close(progress) + + result <- lapply(result, function(out) { + out <- out[!is.na(out)] + if (length(out) == 0) { + as.ab(NA) + } else { + if (!isFALSE(translate_ab)) { + out <- ab_property(out, property = translate_ab, initial_search = FALSE) + } + out + } + }) + } else if (type %like% "dos") { + text_split_all <- strsplit(text, " ", fixed = TRUE) + result <- lapply(text_split_all, function(text_split) { + text_split <- text_split[text_split %like% "^[0-9]{2,}(/[0-9]+)?[a-z]*$"] + # only left part of "/", like 500 in "500/125" + text_split <- gsub("/.*", "", text_split) + text_split <- gsub(",", ".", text_split, fixed = TRUE) # foreign system using comma as decimal sep + text_split <- as.double(gsub("[^0-9.]", "", text_split)) + # minimal 100 units/mg and no years that unlikely doses + text_split <- text_split[text_split >= 100 & !text_split %in% c(1951:1999, 2001:2049)] + + if (length(text_split) > 0) { + text_split + } else { + NA_real_ + } + }) + } else if (type %like% "adm") { + result <- lapply(text_split_all, function(text_split) { + text_split <- text_split[text_split %like% "(^iv$|intraven|^po$|per os|oral|implant|inhal|instill|nasal|paren|rectal|sublingual|buccal|trans.*dermal|vaginal)"] + if (length(text_split) > 0) { + text_split <- gsub("(^po$|.*per os.*)", "oral", text_split) + text_split <- gsub("(^iv$|.*intraven.*)", "iv", text_split) + text_split + } else { + NA_character_ + } + }) + } else { + stop_("`type` must be either 'drug', 'dose' or 'administration'") + } + + # collapse text if needed + if (!is.null(collapse)) { + result <- vapply(FUN.VALUE = character(1), result, function(x) { + if (length(x) == 1 & all(is.na(x))) { + NA_character_ + } else { + paste0(x, collapse = collapse) + } + }) + } + + result +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ab_property.R + + + + +#' Get Properties of an Antibiotic +#' +#' Use these functions to return a specific property of an antibiotic from the [antibiotics] data set. All input values will be evaluated internally with [as.ab()]. +#' @param x any (vector of) text that can be coerced to a valid antibiotic drug code with [as.ab()] +#' @param tolower a [logical] to indicate whether the first [character] of every output should be transformed to a lower case [character]. This will lead to e.g. "polymyxin B" and not "polymyxin b". +#' @param property one of the column names of one of the [antibiotics] data set: `vector_or(colnames(antibiotics), sort = FALSE)`. +#' @param language language of the returned text - the default is the current system language (see [get_AMR_locale()]) and can also be set with the package option [`AMR_locale`][AMR-options]. Use `language = NULL` or `language = ""` to prevent translation. +#' @param administration way of administration, either `"oral"` or `"iv"` +#' @param open browse the URL using [utils::browseURL()] +#' @param ... in case of [set_ab_names()] and `data` is a [data.frame]: columns to select (supports tidy selection such as `column1:column4`), otherwise other arguments passed on to [as.ab()] +#' @param data a [data.frame] of which the columns need to be renamed, or a [character] vector of column names +#' @param snake_case a [logical] to indicate whether the names should be in so-called [snake case](https://en.wikipedia.org/wiki/Snake_case): in lower case and all spaces/slashes replaced with an underscore (`_`) +#' @param only_first a [logical] to indicate whether only the first ATC code must be returned, with giving preference to J0-codes (i.e., the antimicrobial drug group) +#' @details All output [will be translated][translate] where possible. +#' +#' The function [ab_url()] will return the direct URL to the official WHO website. A warning will be returned if the required ATC code is not available. +#' +#' The function [set_ab_names()] is a special column renaming function for [data.frame]s. It renames columns names that resemble antimicrobial drugs. It always makes sure that the new column names are unique. If `property = "atc"` is set, preference is given to ATC codes from the J-group. +#' @inheritSection as.ab Source +#' @rdname ab_property +#' @name ab_property +#' @return +#' - An [integer] in case of [ab_cid()] +#' - A named [list] in case of [ab_info()] and multiple [ab_atc()]/[ab_synonyms()]/[ab_tradenames()] +#' - A [double] in case of [ab_ddd()] +#' - A [data.frame] in case of [set_ab_names()] +#' - A [character] in all other cases +#' @export +#' @seealso [antibiotics] +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' # all properties: +#' ab_name("AMX") +#' ab_atc("AMX") +#' ab_cid("AMX") +#' ab_synonyms("AMX") +#' ab_tradenames("AMX") +#' ab_group("AMX") +#' ab_atc_group1("AMX") +#' ab_atc_group2("AMX") +#' ab_url("AMX") +#' +#' # smart lowercase transformation +#' ab_name(x = c("AMC", "PLB")) +#' ab_name(x = c("AMC", "PLB"), tolower = TRUE) +#' +#' # defined daily doses (DDD) +#' ab_ddd("AMX", "oral") +#' ab_ddd_units("AMX", "oral") +#' ab_ddd("AMX", "iv") +#' ab_ddd_units("AMX", "iv") +#' +#' ab_info("AMX") # all properties as a list +#' +#' # all ab_* functions use as.ab() internally, so you can go from 'any' to 'any': +#' ab_atc("AMP") +#' ab_group("J01CA01") +#' ab_loinc("ampicillin") +#' ab_name("21066-6") +#' ab_name(6249) +#' ab_name("J01CA01") +#' +#' # spelling from different languages and dyslexia are no problem +#' ab_atc("ceftriaxon") +#' ab_atc("cephtriaxone") +#' ab_atc("cephthriaxone") +#' ab_atc("seephthriaaksone") +#' +#' # use set_ab_names() for renaming columns +#' colnames(example_isolates) +#' colnames(set_ab_names(example_isolates)) +#' colnames(set_ab_names(example_isolates, NIT:VAN)) +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' set_ab_names() +#' +#' # this does the same: +#' example_isolates %>% +#' rename_with(set_ab_names) +#' +#' # set_ab_names() works with any AB property: +#' example_isolates %>% +#' set_ab_names(property = "atc") +#' +#' example_isolates %>% +#' set_ab_names(where(is.sir)) %>% +#' colnames() +#' +#' example_isolates %>% +#' set_ab_names(NIT:VAN) %>% +#' colnames() +#' } +#' } +ab_name <- function(x, language = get_AMR_locale(), tolower = FALSE, ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(tolower, allow_class = "logical", has_length = 1) + + x <- translate_into_language(ab_validate(x = x, property = "name", ...), language = language, only_affect_ab_names = TRUE) + if (tolower == TRUE) { + # use perl to only transform the first character + # as we want "polymyxin B", not "polymyxin b" + x <- gsub("^([A-Z])", "\\L\\1", x, perl = TRUE) + } + x +} + +#' @rdname ab_property +#' @export +ab_cid <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + ab_validate(x = x, property = "cid", ...) +} + +#' @rdname ab_property +#' @export +ab_synonyms <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + syns <- ab_validate(x = x, property = "synonyms", ...) + names(syns) <- x + if (length(syns) == 1) { + unname(unlist(syns)) + } else { + syns + } +} + +#' @rdname ab_property +#' @export +ab_tradenames <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + ab_synonyms(x, ...) +} + +#' @rdname ab_property +#' @export +ab_group <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + translate_into_language(ab_validate(x = x, property = "group", ...), language = language, only_affect_ab_names = TRUE) +} + +#' @rdname ab_property +#' @aliases ATC +#' @export +ab_atc <- function(x, only_first = FALSE, ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(only_first, allow_class = "logical", has_length = 1) + + atcs <- ab_validate(x = x, property = "atc", ...) + + if (only_first == TRUE) { + atcs <- vapply( + FUN.VALUE = character(1), + # get only the first ATC code + atcs, + function(x) { + # try to get the J-group + if (any(x %like% "^J")) { + x[x %like% "^J"][1L] + } else { + as.character(x[1L]) + } + } + ) + } else if (length(atcs) == 1) { + atcs <- unname(unlist(atcs)) + } else { + names(atcs) <- x + } + + atcs +} + +#' @rdname ab_property +#' @export +ab_atc_group1 <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + translate_into_language(ab_validate(x = x, property = "atc_group1", ...), language = language, only_affect_ab_names = TRUE) +} + +#' @rdname ab_property +#' @export +ab_atc_group2 <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + translate_into_language(ab_validate(x = x, property = "atc_group2", ...), language = language, only_affect_ab_names = TRUE) +} + +#' @rdname ab_property +#' @export +ab_loinc <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + loincs <- ab_validate(x = x, property = "loinc", ...) + names(loincs) <- x + if (length(loincs) == 1) { + unname(unlist(loincs)) + } else { + loincs + } +} + +#' @rdname ab_property +#' @export +ab_ddd <- function(x, administration = "oral", ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(administration, is_in = c("oral", "iv"), has_length = 1) + + x <- as.ab(x, ...) + ddd_prop <- paste0(administration, "_ddd") + out <- ab_validate(x = x, property = ddd_prop) + + if (any(ab_name(x, language = NULL) %like% "/" & is.na(out))) { + warning_( + "in `ab_ddd()`: DDDs of some combined products are available for different dose combinations and not (yet) part of the AMR package.", + "Please refer to the WHOCC website:\n", + "atcddd.fhi.no/ddd/list_of_ddds_combined_products/" + ) + } + out +} + +#' @rdname ab_property +#' @export +ab_ddd_units <- function(x, administration = "oral", ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(administration, is_in = c("oral", "iv"), has_length = 1) + + x <- as.ab(x, ...) + ddd_prop <- paste0(administration, "_units") + out <- ab_validate(x = x, property = ddd_prop) + + if (any(ab_name(x, language = NULL) %like% "/" & is.na(out))) { + warning_( + "in `ab_ddd_units()`: DDDs of some combined products are available for different dose combinations and not (yet) part of the AMR package.", + "Please refer to the WHOCC website:\n", + "atcddd.fhi.no/ddd/list_of_ddds_combined_products/" + ) + } + out +} + +#' @rdname ab_property +#' @export +ab_info <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + + x <- as.ab(x, ...) + list( + ab = as.character(x), + cid = ab_cid(x), + name = ab_name(x, language = language), + group = ab_group(x, language = language), + atc = ab_atc(x), + atc_group1 = ab_atc_group1(x, language = language), + atc_group2 = ab_atc_group2(x, language = language), + tradenames = ab_tradenames(x), + loinc = ab_loinc(x), + ddd = list( + oral = list( + amount = ab_ddd(x, administration = "oral"), + units = ab_ddd_units(x, administration = "oral") + ), + iv = list( + amount = ab_ddd(x, administration = "iv"), + units = ab_ddd_units(x, administration = "iv") + ) + ) + ) +} + + +#' @rdname ab_property +#' @export +ab_url <- function(x, open = FALSE, ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(open, allow_class = "logical", has_length = 1) + + ab <- as.ab(x = x, ...) + atcs <- ab_atc(ab, only_first = TRUE) + u <- character(length(atcs)) + # veterinary codes + u[atcs %like% "^Q"] <- paste0("https://atcddd.fhi.no/atcvet/atcvet_index/?code=", atcs[atcs %like% "^Q"], "&showdescription=no") + u[atcs %unlike% "^Q"] <- paste0("https://atcddd.fhi.no/atc_ddd_index//?code=", atcs[atcs %unlike% "^Q"], "&showdescription=no") + u[is.na(atcs)] <- NA_character_ + names(u) <- ab_name(ab) + + NAs <- ab_name(ab, tolower = TRUE, language = NULL)[!is.na(ab) & is.na(atcs)] + if (length(NAs) > 0) { + warning_("in `ab_url()`: no ATC code available for ", vector_and(NAs, quotes = FALSE), ".") + } + + if (open == TRUE) { + if (length(u) > 1 && !is.na(u[1L])) { + warning_("in `ab_url()`: only the first URL will be opened, as `browseURL()` only suports one string.") + } + if (!is.na(u[1L])) { + utils::browseURL(u[1L]) + } + } + u +} + +#' @rdname ab_property +#' @export +ab_property <- function(x, property = "name", language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(property, is_in = colnames(AMR::antibiotics), has_length = 1) + language <- validate_language(language) + translate_into_language(ab_validate(x = x, property = property, ...), language = language) +} + +#' @rdname ab_property +#' @aliases ATC +#' @export +set_ab_names <- function(data, ..., property = "name", language = get_AMR_locale(), snake_case = NULL) { + meet_criteria(data, allow_class = c("data.frame", "character")) + meet_criteria(property, is_in = colnames(AMR::antibiotics), has_length = 1, ignore.case = TRUE) + language <- validate_language(language) + meet_criteria(snake_case, allow_class = "logical", has_length = 1, allow_NULL = TRUE) + + x_deparsed <- deparse(substitute(data)) + if (length(x_deparsed) > 1 || any(x_deparsed %unlike% "[a-z]+")) { + x_deparsed <- "your_data" + } + + property <- tolower(property) + if (is.null(snake_case)) { + snake_case <- property == "name" + } + + if (is.data.frame(data)) { + if (tryCatch(length(c(...)) > 1, error = function(e) TRUE)) { + df <- tryCatch(suppressWarnings(pm_select(data, ...)), + error = function(e) { + data[, c(...), drop = FALSE] + } + ) + } else if (tryCatch(is.character(c(...)), error = function(e) FALSE)) { + df <- data[, c(...), drop = FALSE] + } else { + df <- data + } + vars <- get_column_abx(df, info = FALSE, only_sir_columns = FALSE, sort = FALSE, fn = "set_ab_names") + if (length(vars) == 0) { + message_("No columns with antibiotic results found for `set_ab_names()`, leaving names unchanged.") + return(data) + } + } else { + # quickly get antibiotic drug codes + vars_ab <- as.ab(data, fast_mode = TRUE) + vars <- data[!is.na(vars_ab)] + } + x <- vapply( + FUN.VALUE = character(1), + ab_property(vars, property = property, language = language), + function(x) { + if (property == "atc") { + # try to get the J-group + if (any(x %like% "^J")) { + x[x %like% "^J"][1L] + } else { + as.character(x[1L]) + } + } else { + as.character(x[1L]) + } + }, + USE.NAMES = FALSE + ) + if (any(x %in% c("", NA))) { + warning_( + "in `set_ab_names()`: no ", property, " found for column(s): ", + vector_and(vars[x %in% c("", NA)], sort = FALSE) + ) + x[x %in% c("", NA)] <- vars[x %in% c("", NA)] + } + + if (snake_case == TRUE) { + x <- tolower(gsub("[^a-zA-Z0-9]+", "_", x)) + } + + if (anyDuplicated(x)) { + # very hacky way of adding the index to each duplicate + # so "Amoxicillin", "Amoxicillin", "Amoxicillin" + # will be "Amoxicillin", "Amoxicillin_2", "Amoxicillin_3" + invisible(lapply( + unique(x), + function(u) { + dups <- which(x == u) + if (length(dups) > 1) { + # there are duplicates + dup_add_int <- dups[2:length(dups)] + x[dup_add_int] <<- paste0(x[dup_add_int], "_", 2:length(dups)) + } + } + )) + } + if (is.data.frame(data)) { + colnames(data)[colnames(data) %in% vars] <- x + data + } else { + data[which(!is.na(vars_ab))] <- x + data + } +} + +ab_validate <- function(x, property, ...) { + if (tryCatch(all(x[!is.na(x)] %in% AMR_env$AB_lookup$ab), error = function(e) FALSE)) { + # special case for ab_* functions where class is already 'ab' + x <- AMR_env$AB_lookup[match(x, AMR_env$AB_lookup$ab), property, drop = TRUE] + } else { + # try to catch an error when inputting an invalid argument + # so the 'call.' can be set to FALSE + tryCatch(x[1L] %in% AMR_env$AB_lookup[1, property, drop = TRUE], + error = function(e) stop(e$message, call. = FALSE) + ) + + if (!all(x %in% AMR_env$AB_lookup[, property, drop = TRUE])) { + x <- as.ab(x, ...) + if (all(is.na(x)) && is.list(AMR_env$AB_lookup[, property, drop = TRUE])) { + x <- rep(NA_character_, length(x)) + } else { + x <- AMR_env$AB_lookup[match(x, AMR_env$AB_lookup$ab), property, drop = TRUE] + } + } + } + + if (property == "ab") { + return(set_clean_class(x, new_class = c("ab", "character"))) + } else if (property == "cid") { + return(as.integer(x)) + } else if (property %like% "ddd") { + return(as.double(x)) + } else { + x[is.na(x)] <- NA + return(x) + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ab_selectors.R + + + + +#' Antibiotic Selectors +#' +#' @description These functions allow for filtering rows and selecting columns based on antibiotic test results that are of a specific antibiotic class or group (according to the [antibiotics] data set), without the need to define the columns or antibiotic abbreviations. +#' +#' In short, if you have a column name that resembles an antimicrobial drug, it will be picked up by any of these functions that matches its pharmaceutical class: "cefazolin", "kefzol", "CZO" and "J01DB04" will all be picked up by [cephalosporins()]. +#' @param ab_class an antimicrobial class or a part of it, such as `"carba"` and `"carbapenems"`. The columns `group`, `atc_group1` and `atc_group2` of the [antibiotics] data set will be searched (case-insensitive) for this value. +#' @param filter an [expression] to be evaluated in the [antibiotics] data set, such as `name %like% "trim"` +#' @param only_sir_columns a [logical] to indicate whether only columns of class `sir` must be selected (default is `FALSE`), see [as.sir()] +#' @param only_treatable a [logical] to indicate whether antimicrobial drugs should be excluded that are only for laboratory tests (default is `TRUE`), such as gentamicin-high (`GEH`) and imipenem/EDTA (`IPE`) +#' @param ... ignored, only in place to allow future extensions +#' @details +#' These functions can be used in data set calls for selecting columns and filtering rows. They work with base \R, the Tidyverse, and `data.table`. They are heavily inspired by the [Tidyverse selection helpers][tidyselect::language] such as [`everything()`][tidyselect::everything()], but are not limited to `dplyr` verbs. Nonetheless, they are very convenient to use with `dplyr` functions such as [`select()`][dplyr::select()], [`filter()`][dplyr::filter()] and [`summarise()`][dplyr::summarise()], see *Examples*. +#' +#' All columns in the data in which these functions are called will be searched for known antibiotic names, abbreviations, brand names, and codes (ATC, EARS-Net, WHO, etc.) according to the [antibiotics] data set. This means that a selector such as [aminoglycosides()] will pick up column names like 'gen', 'genta', 'J01GB03', 'tobra', 'Tobracin', etc. +#' +#' The [ab_class()] function can be used to filter/select on a manually defined antibiotic class. It searches for results in the [antibiotics] data set within the columns `group`, `atc_group1` and `atc_group2`. +#' @section Full list of supported (antibiotic) classes: +#' +#' `r paste0(" * ", na.omit(sapply(DEFINED_AB_GROUPS, function(ab) ifelse(tolower(gsub("^AB_", "", ab)) %in% ls(envir = asNamespace("AMR")), paste0("[", tolower(gsub("^AB_", "", ab)), "()] can select: \\cr ", vector_and(paste0(ab_name(eval(parse(text = ab), envir = asNamespace("AMR")), language = NULL, tolower = TRUE), " (", eval(parse(text = ab), envir = asNamespace("AMR")), ")"), quotes = FALSE, sort = TRUE)), character(0)), USE.NAMES = FALSE)), "\n", collapse = "")` +#' @rdname antibiotic_class_selectors +#' @name antibiotic_class_selectors +#' @return (internally) a [character] vector of column names, with additional class `"ab_selector"` +#' @export +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates. +#' example_isolates +#' +#' +#' # Examples sections below are split into 'dplyr', 'base R', and 'data.table': +#' +#' \donttest{ +#' # dplyr ------------------------------------------------------------------- +#' +#' if (require("dplyr")) { +#' example_isolates %>% select(carbapenems()) +#' } +#' +#' if (require("dplyr")) { +#' # select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB' +#' example_isolates %>% select(mo, aminoglycosides()) +#' } +#' +#' if (require("dplyr")) { +#' # select only antibiotic columns with DDDs for oral treatment +#' example_isolates %>% select(administrable_per_os()) +#' } +#' +#' if (require("dplyr")) { +#' # get AMR for all aminoglycosides e.g., per ward: +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise(across(aminoglycosides(), +#' resistance)) +#' } +#' if (require("dplyr")) { +#' # You can combine selectors with '&' to be more specific: +#' example_isolates %>% +#' select(penicillins() & administrable_per_os()) +#' } +#' if (require("dplyr")) { +#' # get AMR for only drugs that matter - no intrinsic resistance: +#' example_isolates %>% +#' filter(mo_genus() %in% c("Escherichia", "Klebsiella")) %>% +#' group_by(ward) %>% +#' summarise_at(not_intrinsic_resistant(), +#' resistance) +#' } +#' if (require("dplyr")) { +#' # get susceptibility for antibiotics whose name contains "trim": +#' example_isolates %>% +#' filter(first_isolate()) %>% +#' group_by(ward) %>% +#' summarise(across(ab_selector(name %like% "trim"), susceptibility)) +#' } +#' if (require("dplyr")) { +#' # this will select columns 'IPM' (imipenem) and 'MEM' (meropenem): +#' example_isolates %>% +#' select(carbapenems()) +#' } +#' if (require("dplyr")) { +#' # this will select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB': +#' example_isolates %>% +#' select(mo, aminoglycosides()) +#' } +#' if (require("dplyr")) { +#' # any() and all() work in dplyr's filter() too: +#' example_isolates %>% +#' filter( +#' any(aminoglycosides() == "R"), +#' all(cephalosporins_2nd() == "R") +#' ) +#' } +#' if (require("dplyr")) { +#' # also works with c(): +#' example_isolates %>% +#' filter(any(c(carbapenems(), aminoglycosides()) == "R")) +#' } +#' if (require("dplyr")) { +#' # not setting any/all will automatically apply all(): +#' example_isolates %>% +#' filter(aminoglycosides() == "R") +#' } +#' if (require("dplyr")) { +#' # this will select columns 'mo' and all antimycobacterial drugs ('RIF'): +#' example_isolates %>% +#' select(mo, ab_class("mycobact")) +#' } +#' if (require("dplyr")) { +#' # get bug/drug combinations for only glycopeptides in Gram-positives: +#' example_isolates %>% +#' filter(mo_is_gram_positive()) %>% +#' select(mo, glycopeptides()) %>% +#' bug_drug_combinations() %>% +#' format() +#' } +#' if (require("dplyr")) { +#' data.frame( +#' some_column = "some_value", +#' J01CA01 = "S" +#' ) %>% # ATC code of ampicillin +#' select(penicillins()) # only the 'J01CA01' column will be selected +#' } +#' if (require("dplyr")) { +#' # with recent versions of dplyr, this is all equal: +#' x <- example_isolates[carbapenems() == "R", ] +#' y <- example_isolates %>% filter(carbapenems() == "R") +#' z <- example_isolates %>% filter(if_all(carbapenems(), ~ .x == "R")) +#' identical(x, y) && identical(y, z) +#' } +#' +#' +#' # base R ------------------------------------------------------------------ +#' +#' # select columns 'IPM' (imipenem) and 'MEM' (meropenem) +#' example_isolates[, carbapenems()] +#' +#' # select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB' +#' example_isolates[, c("mo", aminoglycosides())] +#' +#' # select only antibiotic columns with DDDs for oral treatment +#' example_isolates[, administrable_per_os()] +#' +#' # filter using any() or all() +#' example_isolates[any(carbapenems() == "R"), ] +#' subset(example_isolates, any(carbapenems() == "R")) +#' +#' # filter on any or all results in the carbapenem columns (i.e., IPM, MEM): +#' example_isolates[any(carbapenems()), ] +#' example_isolates[all(carbapenems()), ] +#' +#' # filter with multiple antibiotic selectors using c() +#' example_isolates[all(c(carbapenems(), aminoglycosides()) == "R"), ] +#' +#' # filter + select in one go: get penicillins in carbapenem-resistant strains +#' example_isolates[any(carbapenems() == "R"), penicillins()] +#' +#' # You can combine selectors with '&' to be more specific. For example, +#' # penicillins() would select benzylpenicillin ('peni G') and +#' # administrable_per_os() would select erythromycin. Yet, when combined these +#' # drugs are both omitted since benzylpenicillin is not administrable per os +#' # and erythromycin is not a penicillin: +#' example_isolates[, penicillins() & administrable_per_os()] +#' +#' # ab_selector() applies a filter in the `antibiotics` data set and is thus +#' # very flexible. For instance, to select antibiotic columns with an oral DDD +#' # of at least 1 gram: +#' example_isolates[, ab_selector(oral_ddd > 1 & oral_units == "g")] +#' +#' +#' # data.table -------------------------------------------------------------- +#' +#' # data.table is supported as well, just use it in the same way as with +#' # base R, but add `with = FALSE` if using a single AB selector. +#' +#' if (require("data.table")) { +#' dt <- as.data.table(example_isolates) +#' +#' # this does not work, it returns column *names* +#' dt[, carbapenems()] +#' } +#' if (require("data.table")) { +#' # so `with = FALSE` is required +#' dt[, carbapenems(), with = FALSE] +#' } +#' +#' # for multiple selections or AB selectors, `with = FALSE` is not needed: +#' if (require("data.table")) { +#' dt[, c("mo", aminoglycosides())] +#' } +#' if (require("data.table")) { +#' dt[, c(carbapenems(), aminoglycosides())] +#' } +#' +#' # row filters are also supported: +#' if (require("data.table")) { +#' dt[any(carbapenems() == "S"), ] +#' } +#' if (require("data.table")) { +#' dt[any(carbapenems() == "S"), penicillins(), with = FALSE] +#' } +#' } +ab_class <- function(ab_class, + only_sir_columns = FALSE, + only_treatable = TRUE, + ...) { + meet_criteria(ab_class, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec(NULL, only_sir_columns = only_sir_columns, ab_class_args = ab_class, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @details The [ab_selector()] function can be used to internally filter the [antibiotics] data set on any results, see *Examples*. It allows for filtering on a (part of) a certain name, and/or a group name or even a minimum of DDDs for oral treatment. This function yields the highest flexibility, but is also the least user-friendly, since it requires a hard-coded filter to set. +#' @export +ab_selector <- function(filter, + only_sir_columns = FALSE, + only_treatable = TRUE, + ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + + # get_current_data() has to run each time, for cases where e.g., filter() and select() are used in same call + # but it only takes a couple of milliseconds + vars_df <- get_current_data(arg_name = NA, call = -2) + # to improve speed, get_column_abx() will only run once when e.g. in a select or group call + ab_in_data <- get_column_abx(vars_df, + info = FALSE, only_sir_columns = only_sir_columns, + sort = FALSE, fn = "ab_selector" + ) + call <- substitute(filter) + agents <- tryCatch(AMR_env$AB_lookup[which(eval(call, envir = AMR_env$AB_lookup)), "ab", drop = TRUE], + error = function(e) stop_(e$message, call = -5) + ) + agents <- ab_in_data[ab_in_data %in% agents] + message_agent_names( + function_name = "ab_selector", + agents = agents, + ab_group = NULL, + examples = "", + call = call + ) + structure(unname(agents), + class = c("ab_selector", "character") + ) +} + +#' @rdname antibiotic_class_selectors +#' @export +aminoglycosides <- function(only_sir_columns = FALSE, only_treatable = TRUE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec("aminoglycosides", only_sir_columns = only_sir_columns, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @export +aminopenicillins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("aminopenicillins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +antifungals <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("antifungals", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +antimycobacterials <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("antimycobacterials", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +betalactams <- function(only_sir_columns = FALSE, only_treatable = TRUE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec("betalactams", only_sir_columns = only_sir_columns, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @export +carbapenems <- function(only_sir_columns = FALSE, only_treatable = TRUE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec("carbapenems", only_sir_columns = only_sir_columns, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins_1st <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins_1st", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins_2nd <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins_2nd", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins_3rd <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins_3rd", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins_4th <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins_4th", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +cephalosporins_5th <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("cephalosporins_5th", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +fluoroquinolones <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("fluoroquinolones", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +glycopeptides <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("glycopeptides", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +lincosamides <- function(only_sir_columns = FALSE, only_treatable = TRUE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec("lincosamides", only_sir_columns = only_sir_columns, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @export +lipoglycopeptides <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("lipoglycopeptides", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +macrolides <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("macrolides", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +nitrofurans <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("nitrofurans", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +oxazolidinones <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("oxazolidinones", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +penicillins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("penicillins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +polymyxins <- function(only_sir_columns = FALSE, only_treatable = TRUE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(only_treatable, allow_class = "logical", has_length = 1) + ab_select_exec("polymyxins", only_sir_columns = only_sir_columns, only_treatable = only_treatable) +} + +#' @rdname antibiotic_class_selectors +#' @export +quinolones <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("quinolones", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +rifamycins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("rifamycins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +streptogramins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("streptogramins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +tetracyclines <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("tetracyclines", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +trimethoprims <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("trimethoprims", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @export +ureidopenicillins <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + ab_select_exec("ureidopenicillins", only_sir_columns = only_sir_columns) +} + +#' @rdname antibiotic_class_selectors +#' @details The [administrable_per_os()] and [administrable_iv()] functions also rely on the [antibiotics] data set - antibiotic columns will be matched where a DDD (defined daily dose) for resp. oral and IV treatment is available in the [antibiotics] data set. +#' @export +administrable_per_os <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + # get_current_data() has to run each time, for cases where e.g., filter() and select() are used in same call + # but it only takes a couple of milliseconds + vars_df <- get_current_data(arg_name = NA, call = -2) + # to improve speed, get_column_abx() will only run once when e.g. in a select or group call + ab_in_data <- get_column_abx(vars_df, + info = FALSE, only_sir_columns = only_sir_columns, + sort = FALSE, fn = "administrable_per_os" + ) + agents_all <- AMR_env$AB_lookup[which(!is.na(AMR_env$AB_lookup$oral_ddd)), "ab", drop = TRUE] + agents <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$ab %in% ab_in_data & !is.na(AMR_env$AB_lookup$oral_ddd)), "ab", drop = TRUE] + agents <- ab_in_data[ab_in_data %in% agents] + message_agent_names( + function_name = "administrable_per_os", + agents = agents, + ab_group = "administrable_per_os", + examples = paste0( + " (such as ", + vector_or( + ab_name( + sample(agents_all, + size = min(5, length(agents_all)), + replace = FALSE + ), + tolower = TRUE, + language = NULL + ), + quotes = FALSE + ), + ")" + ) + ) + structure(unname(agents), + class = c("ab_selector", "character") + ) +} + +#' @rdname antibiotic_class_selectors +#' @export +administrable_iv <- function(only_sir_columns = FALSE, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + # get_current_data() has to run each time, for cases where e.g., filter() and select() are used in same call + # but it only takes a couple of milliseconds + vars_df <- get_current_data(arg_name = NA, call = -2) + # to improve speed, get_column_abx() will only run once when e.g. in a select or group call + ab_in_data <- get_column_abx(vars_df, + info = FALSE, only_sir_columns = only_sir_columns, + sort = FALSE, fn = "administrable_iv" + ) + agents_all <- AMR_env$AB_lookup[which(!is.na(AMR_env$AB_lookup$iv_ddd)), "ab", drop = TRUE] + agents <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$ab %in% ab_in_data & !is.na(AMR_env$AB_lookup$iv_ddd)), "ab", drop = TRUE] + agents <- ab_in_data[ab_in_data %in% agents] + message_agent_names( + function_name = "administrable_iv", + agents = agents, + ab_group = "administrable_iv", + examples = "" + ) + structure(unname(agents), + class = c("ab_selector", "character") + ) +} + +#' @rdname antibiotic_class_selectors +#' @inheritParams eucast_rules +#' @details The [not_intrinsic_resistant()] function can be used to only select antibiotic columns that pose no intrinsic resistance for the microorganisms in the data set. For example, if a data set contains only microorganism codes or names of *E. coli* and *K. pneumoniae* and contains a column "vancomycin", this column will be removed (or rather, unselected) using this function. It currently applies `r format_eucast_version_nr(names(EUCAST_VERSION_EXPERT_RULES[1]))` to determine intrinsic resistance, using the [eucast_rules()] function internally. Because of this determination, this function is quite slow in terms of performance. +#' @export +not_intrinsic_resistant <- function(only_sir_columns = FALSE, col_mo = NULL, version_expertrules = 3.3, ...) { + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + # get_current_data() has to run each time, for cases where e.g., filter() and select() are used in same call + # but it only takes a couple of milliseconds + vars_df <- get_current_data(arg_name = NA, call = -2) + # to improve speed, get_column_abx() will only run once when e.g. in a select or group call + ab_in_data <- get_column_abx(vars_df, + info = FALSE, only_sir_columns = only_sir_columns, + sort = FALSE, fn = "not_intrinsic_resistant" + ) + # intrinsic vars + vars_df_R <- tryCatch( + sapply( + eucast_rules(vars_df, + col_mo = col_mo, + version_expertrules = version_expertrules, + rules = "expert", + info = FALSE + ), + function(col) { + tryCatch(!any(is.na(col)) && all(col == "R"), + error = function(e) FALSE + ) + } + ), + error = function(e) stop_("in not_intrinsic_resistant(): ", e$message, call = FALSE) + ) + + agents <- ab_in_data[ab_in_data %in% names(vars_df_R[which(vars_df_R)])] + if (length(agents) > 0 && + message_not_thrown_before("not_intrinsic_resistant", sort(agents))) { + agents_formatted <- paste0("'", font_bold(agents, collapse = NULL), "'") + agents_names <- ab_name(names(agents), tolower = TRUE, language = NULL) + need_name <- generalise_antibiotic_name(agents) != generalise_antibiotic_name(agents_names) + agents_formatted[need_name] <- paste0(agents_formatted[need_name], " (", agents_names[need_name], ")") + message_( + "For `not_intrinsic_resistant()` removing ", + ifelse(length(agents) == 1, "column ", "columns "), + vector_and(agents_formatted, quotes = FALSE, sort = FALSE) + ) + } + + vars_df_R <- names(vars_df_R)[which(!vars_df_R)] + # find columns that are abx, but also intrinsic R + out <- unname(intersect(ab_in_data, vars_df_R)) + structure(out, + class = c("ab_selector", "character") + ) +} + +ab_select_exec <- function(function_name, + only_sir_columns = FALSE, + only_treatable = FALSE, + ab_class_args = NULL) { + # get_current_data() has to run each time, for cases where e.g., filter() and select() are used in same call + # but it only takes a couple of milliseconds + vars_df <- get_current_data(arg_name = NA, call = -3) + # to improve speed, get_column_abx() will only run once when e.g. in a select or group call + ab_in_data <- get_column_abx(vars_df, + info = FALSE, only_sir_columns = only_sir_columns, + sort = FALSE, fn = function_name + ) + + # untreatable drugs + if (only_treatable == TRUE) { + untreatable <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$name %like% "-high|EDTA|polysorbate|macromethod|screening|nacubactam"), "ab", drop = TRUE] + if (any(untreatable %in% names(ab_in_data))) { + if (message_not_thrown_before(function_name, "ab_class", "untreatable")) { + warning_( + "in `", function_name, "()`: some drugs were ignored since they cannot be used for treating patients: ", + vector_and( + ab_name(names(ab_in_data)[names(ab_in_data) %in% untreatable], + language = NULL, + tolower = TRUE + ), + quotes = FALSE, + sort = TRUE + ), ". They can be included using `", function_name, "(only_treatable = FALSE)`." + ) + } + ab_in_data <- ab_in_data[!names(ab_in_data) %in% untreatable] + } + } + + if (length(ab_in_data) == 0) { + message_("No antimicrobial drugs found in the data.") + return(NULL) + } + + if (is.null(ab_class_args) || isTRUE(function_name %in% c("antifungals", "antimycobacterials"))) { + ab_group <- NULL + if (isTRUE(function_name == "antifungals")) { + abx <- AMR_env$AB_lookup$ab[which(AMR_env$AB_lookup$group == "Antifungals")] + } else if (isTRUE(function_name == "antimycobacterials")) { + abx <- AMR_env$AB_lookup$ab[which(AMR_env$AB_lookup$group == "Antimycobacterials")] + } else { + # their upper case equivalent are vectors with class 'ab', created in data-raw/_pre_commit_checks.R + # carbapenems() gets its codes from AMR:::AB_CARBAPENEMS + abx <- get(paste0("AB_", toupper(function_name)), envir = asNamespace("AMR")) + # manually added codes from add_custom_antimicrobials() must also be supported + if (length(AMR_env$custom_ab_codes) > 0) { + custom_ab <- AMR_env$AB_lookup[which(AMR_env$AB_lookup$ab %in% AMR_env$custom_ab_codes), ] + check_string <- paste0(custom_ab$group, custom_ab$atc_group1, custom_ab$atc_group2) + if (function_name == "betalactams") { + find_group <- "beta-lactams" + } else if (function_name %like% "cephalosporins_") { + find_group <- gsub("_(.*)$", paste0(" (\\1 gen.)"), function_name) + } else { + find_group <- function_name + } + abx <- c(abx, custom_ab$ab[which(check_string %like% find_group)]) + } + ab_group <- function_name + } + examples <- paste0(" (such as ", vector_or( + ab_name(sample(abx, size = min(2, length(abx)), replace = FALSE), + tolower = TRUE, + language = NULL + ), + quotes = FALSE + ), ")") + } else { + # this for the 'manual' ab_class() function + abx <- subset( + AMR_env$AB_lookup, + group %like% ab_class_args | + atc_group1 %like% ab_class_args | + atc_group2 %like% ab_class_args + )$ab + ab_group <- find_ab_group(ab_class_args) + function_name <- "ab_class" + examples <- paste0(" (such as ", find_ab_names(ab_class_args, 2), ")") + } + + # get the columns with a group names in the chosen ab class + agents <- ab_in_data[names(ab_in_data) %in% abx] + + message_agent_names( + function_name = function_name, + agents = agents, + ab_group = ab_group, + examples = examples, + ab_class_args = ab_class_args + ) + + structure(unname(agents), + class = c("ab_selector", "character") + ) +} + +#' @method print ab_selector +#' @export +#' @noRd +print.ab_selector <- function(x, ...) { + warning_("It should never be needed to print an antibiotic selector class. Are you using data.table? Then add the argument `with = FALSE`, see our examples at `?ab_selector`.", + immediate = TRUE) + cat("Class 'ab_selector'\n") + print(as.character(x), quote = FALSE) +} + +#' @method c ab_selector +#' @export +#' @noRd +c.ab_selector <- function(...) { + structure(unlist(lapply(list(...), as.character)), + class = c("ab_selector", "character") + ) +} + +all_any_ab_selector <- function(type, ..., na.rm = TRUE) { + cols_ab <- c(...) + result <- cols_ab[toupper(cols_ab) %in% c("S", "SDD", "I", "R", "NI")] + if (length(result) == 0) { + message_("Filtering ", type, " of columns ", vector_and(font_bold(cols_ab, collapse = NULL), quotes = "'"), ' to contain value "S", "I" or "R"') + result <- c("S", "SDD", "I", "R", "NI") + } + cols_ab <- cols_ab[!cols_ab %in% result] + df <- get_current_data(arg_name = NA, call = -3) + + if (type == "all") { + scope_fn <- all + } else { + scope_fn <- any + } + + x_transposed <- as.list(as.data.frame(t(df[, cols_ab, drop = FALSE]), stringsAsFactors = FALSE)) + vapply( + FUN.VALUE = logical(1), + X = x_transposed, + FUN = function(y) scope_fn(y %in% result, na.rm = na.rm), + USE.NAMES = FALSE + ) +} + +#' @method all ab_selector +#' @export +#' @noRd +all.ab_selector <- function(..., na.rm = FALSE) { + all_any_ab_selector("all", ..., na.rm = na.rm) +} + +#' @method any ab_selector +#' @export +#' @noRd +any.ab_selector <- function(..., na.rm = FALSE) { + all_any_ab_selector("any", ..., na.rm = na.rm) +} + + +#' @method all ab_selector_any_all +#' @export +#' @noRd +all.ab_selector_any_all <- function(..., na.rm = FALSE) { + # this is all() on a logical vector from `==.ab_selector` or `!=.ab_selector` + # e.g., example_isolates %>% filter(all(carbapenems() == "R")) + # so just return the vector as is, only correcting for na.rm + out <- unclass(c(...)) + if (isTRUE(na.rm)) { + out <- out[!is.na(out)] + } + out +} + +#' @method any ab_selector_any_all +#' @export +#' @noRd +any.ab_selector_any_all <- function(..., na.rm = FALSE) { + # this is any() on a logical vector from `==.ab_selector` or `!=.ab_selector` + # e.g., example_isolates %>% filter(any(carbapenems() == "R")) + # so just return the vector as is, only correcting for na.rm + out <- unclass(c(...)) + if (isTRUE(na.rm)) { + out <- out[!is.na(out)] + } + out +} + +#' @method == ab_selector +#' @export +#' @noRd +`==.ab_selector` <- function(e1, e2) { + calls <- as.character(match.call()) + fn_name <- calls[2] + fn_name <- gsub("^(c\\()(.*)(\\))$", "\\2", fn_name) + if (is_any(fn_name)) { + type <- "any" + } else if (is_all(fn_name)) { + type <- "all" + } else { + type <- "all" + if (length(e1) > 1) { + message_( + "Assuming a filter on ", type, " ", length(e1), " ", gsub("[\\(\\)]", "", fn_name), + ". Wrap around `all()` or `any()` to prevent this note." + ) + } + } + structure(all_any_ab_selector(type = type, e1, e2), + class = c("ab_selector_any_all", "logical") + ) +} + +#' @method != ab_selector +#' @export +#' @noRd +`!=.ab_selector` <- function(e1, e2) { + calls <- as.character(match.call()) + fn_name <- calls[2] + fn_name <- gsub("^(c\\()(.*)(\\))$", "\\2", fn_name) + if (is_any(fn_name)) { + type <- "any" + } else if (is_all(fn_name)) { + type <- "all" + } else { + type <- "all" + if (length(e1) > 1) { + message_( + "Assuming a filter on ", type, " ", length(e1), " ", gsub("[\\(\\)]", "", fn_name), + ". Wrap around `all()` or `any()` to prevent this note." + ) + } + } + # this is `!=`, so turn around the values + sir <- c("S", "SDD", "I", "R", "NI") + e2 <- sir[sir != e2] + structure(all_any_ab_selector(type = type, e1, e2), + class = c("ab_selector_any_all", "logical") + ) +} + +#' @method & ab_selector +#' @export +#' @noRd +`&.ab_selector` <- function(e1, e2) { + # this is only required for base R, since tidyselect has already implemented this + # e.g., for: example_isolates[, penicillins() & administrable_per_os()] + structure(intersect(unclass(e1), unclass(e2)), + class = c("ab_selector", "character") + ) +} +#' @method | ab_selector +#' @export +#' @noRd +`|.ab_selector` <- function(e1, e2) { + # this is only required for base R, since tidyselect has already implemented this + # e.g., for: example_isolates[, penicillins() | administrable_per_os()] + structure(union(unclass(e1), unclass(e2)), + class = c("ab_selector", "character") + ) +} + +is_any <- function(el1) { + syscalls <- paste0(trimws2(deparse(sys.calls())), collapse = " ") + el1 <- gsub("(.*),.*", "\\1", el1) + syscalls %like% paste0("[^_a-zA-Z0-9]any\\(", "(c\\()?", el1) +} +is_all <- function(el1) { + syscalls <- paste0(trimws2(deparse(sys.calls())), collapse = " ") + el1 <- gsub("(.*),.*", "\\1", el1) + syscalls %like% paste0("[^_a-zA-Z0-9]all\\(", "(c\\()?", el1) +} + +find_ab_group <- function(ab_class_args) { + ab_class_args <- gsub("[^a-zA-Z0-9]", ".*", ab_class_args) + AMR_env$AB_lookup %pm>% + subset(group %like% ab_class_args | + atc_group1 %like% ab_class_args | + atc_group2 %like% ab_class_args) %pm>% + pm_pull(group) %pm>% + unique() %pm>% + tolower() %pm>% + sort() %pm>% + paste(collapse = "/") +} + +find_ab_names <- function(ab_group, n = 3) { + ab_group <- gsub("[^a-zA-Z|0-9]", ".*", ab_group) + + # try popular first, they have DDDs + drugs <- AMR_env$AB_lookup[which((!is.na(AMR_env$AB_lookup$iv_ddd) | !is.na(AMR_env$AB_lookup$oral_ddd)) & + AMR_env$AB_lookup$name %unlike% " " & + AMR_env$AB_lookup$group %like% ab_group & + AMR_env$AB_lookup$ab %unlike% "[0-9]$"), ]$name + if (length(drugs) < n) { + # now try it all + drugs <- AMR_env$AB_lookup[which((AMR_env$AB_lookup$group %like% ab_group | + AMR_env$AB_lookup$atc_group1 %like% ab_group | + AMR_env$AB_lookup$atc_group2 %like% ab_group) & + AMR_env$AB_lookup$ab %unlike% "[0-9]$"), ]$name + } + if (length(drugs) == 0) { + return("??") + } + vector_or( + ab_name(sample(drugs, size = min(n, length(drugs)), replace = FALSE), + tolower = TRUE, + language = NULL + ), + quotes = FALSE + ) +} + +message_agent_names <- function(function_name, agents, ab_group = NULL, examples = "", ab_class_args = NULL, call = NULL) { + if (message_not_thrown_before(function_name, sort(agents))) { + if (length(agents) == 0) { + if (is.null(ab_group)) { + message_("For `", function_name, "()` no antimicrobial drugs found", examples, ".") + } else if (ab_group == "administrable_per_os") { + message_("No orally administrable drugs found", examples, ".") + } else if (ab_group == "administrable_iv") { + message_("No IV administrable drugs found", examples, ".") + } else { + message_("No antimicrobial drugs of class '", ab_group, "' found", examples, ".") + } + } else { + agents_formatted <- paste0("'", font_bold(agents, collapse = NULL), "'") + agents_names <- ab_name(names(agents), tolower = TRUE, language = NULL) + need_name <- generalise_antibiotic_name(agents) != generalise_antibiotic_name(agents_names) + agents_formatted[need_name] <- paste0(agents_formatted[need_name], " (", agents_names[need_name], ")") + message_( + "For `", function_name, "(", + ifelse(function_name == "ab_class", + paste0("\"", ab_class_args, "\""), + ifelse(!is.null(call), + paste0(deparse(call), collapse = " "), + "" + ) + ), + ")` using ", + ifelse(length(agents) == 1, "column ", "columns "), + vector_and(agents_formatted, quotes = FALSE, sort = FALSE) + ) + } + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/age.R + + + + +#' Age in Years of Individuals +#' +#' Calculates age in years based on a reference date, which is the system date at default. +#' @param x date(s), [character] (vectors) will be coerced with [as.POSIXlt()] +#' @param reference reference date(s) (default is today), [character] (vectors) will be coerced with [as.POSIXlt()] +#' @param exact a [logical] to indicate whether age calculation should be exact, i.e. with decimals. It divides the number of days of [year-to-date](https://en.wikipedia.org/wiki/Year-to-date) (YTD) of `x` by the number of days in the year of `reference` (either 365 or 366). +#' @param na.rm a [logical] to indicate whether missing values should be removed +#' @param ... arguments passed on to [as.POSIXlt()], such as `origin` +#' @details Ages below 0 will be returned as `NA` with a warning. Ages above 120 will only give a warning. +#' +#' This function vectorises over both `x` and `reference`, meaning that either can have a length of 1 while the other argument has a larger length. +#' @return An [integer] (no decimals) if `exact = FALSE`, a [double] (with decimals) otherwise +#' @seealso To split ages into groups, use the [age_groups()] function. +#' @export +#' @examples +#' # 10 random pre-Y2K birth dates +#' df <- data.frame(birth_date = as.Date("2000-01-01") - runif(10) * 25000) +#' +#' # add ages +#' df$age <- age(df$birth_date) +#' +#' # add exact ages +#' df$age_exact <- age(df$birth_date, exact = TRUE) +#' +#' # add age at millenium switch +#' df$age_at_y2k <- age(df$birth_date, "2000-01-01") +#' +#' df +age <- function(x, reference = Sys.Date(), exact = FALSE, na.rm = FALSE, ...) { + meet_criteria(x, allow_class = c("character", "Date", "POSIXt")) + meet_criteria(reference, allow_class = c("character", "Date", "POSIXt")) + meet_criteria(exact, allow_class = "logical", has_length = 1) + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + + if (length(x) != length(reference)) { + if (length(x) == 1) { + x <- rep(x, length(reference)) + } else if (length(reference) == 1) { + reference <- rep(reference, length(x)) + } else { + stop_("`x` and `reference` must be of same length, or `reference` must be of length 1.") + } + } + x <- as.POSIXlt(x, ...) + reference <- as.POSIXlt(reference, ...) + + # from https://stackoverflow.com/a/25450756/4575331 + years_gap <- reference$year - x$year + ages <- ifelse(reference$mon < x$mon | (reference$mon == x$mon & reference$mday < x$mday), + as.integer(years_gap - 1), + as.integer(years_gap) + ) + + # add decimals + if (exact == TRUE) { + # get dates of `x` when `x` would have the year of `reference` + x_in_reference_year <- as.POSIXlt( + paste0( + format(as.Date(reference), "%Y"), + format(as.Date(x), "-%m-%d") + ), + format = "%Y-%m-%d" + ) + # get differences in days + n_days_x_rest <- as.double(difftime(as.Date(reference), + as.Date(x_in_reference_year), + units = "days" + )) + # get numbers of days the years of `reference` has for a reliable denominator + n_days_reference_year <- as.POSIXlt(paste0(format(as.Date(reference), "%Y"), "-12-31"), + format = "%Y-%m-%d" + )$yday + 1 + # add decimal parts of year + mod <- n_days_x_rest / n_days_reference_year + # negative mods are cases where `x_in_reference_year` > `reference` - so 'add' a year + mod[!is.na(mod) & mod < 0] <- mod[!is.na(mod) & mod < 0] + 1 + # and finally add to ages + ages <- ages + mod + } + + if (any(ages < 0, na.rm = TRUE)) { + ages[!is.na(ages) & ages < 0] <- NA + warning_("in `age()`: NAs introduced for ages below 0.") + } + if (any(ages > 120, na.rm = TRUE)) { + warning_("in `age()`: some ages are above 120.") + } + + if (isTRUE(na.rm)) { + ages <- ages[!is.na(ages)] + } + + if (exact == TRUE) { + as.double(ages) + } else { + as.integer(ages) + } +} + +#' Split Ages into Age Groups +#' +#' Split ages into age groups defined by the `split` argument. This allows for easier demographic (antimicrobial resistance) analysis. +#' @param x age, e.g. calculated with [age()] +#' @param split_at values to split `x` at - the default is age groups 0-11, 12-24, 25-54, 55-74 and 75+. See *Details*. +#' @param na.rm a [logical] to indicate whether missing values should be removed +#' @details To split ages, the input for the `split_at` argument can be: +#' +#' * A [numeric] vector. A value of e.g. `c(10, 20)` will split `x` on 0-9, 10-19 and 20+. A value of only `50` will split `x` on 0-49 and 50+. +#' The default is to split on young children (0-11), youth (12-24), young adults (25-54), middle-aged adults (55-74) and elderly (75+). +#' * A character: +#' - `"children"` or `"kids"`, equivalent of: `c(0, 1, 2, 4, 6, 13, 18)`. This will split on 0, 1, 2-3, 4-5, 6-12, 13-17 and 18+. +#' - `"elderly"` or `"seniors"`, equivalent of: `c(65, 75, 85)`. This will split on 0-64, 65-74, 75-84, 85+. +#' - `"fives"`, equivalent of: `1:20 * 5`. This will split on 0-4, 5-9, ..., 95-99, 100+. +#' - `"tens"`, equivalent of: `1:10 * 10`. This will split on 0-9, 10-19, ..., 90-99, 100+. +#' @return Ordered [factor] +#' @seealso To determine ages, based on one or more reference dates, use the [age()] function. +#' @export +#' @examples +#' ages <- c(3, 8, 16, 54, 31, 76, 101, 43, 21) +#' +#' # split into 0-49 and 50+ +#' age_groups(ages, 50) +#' +#' # split into 0-19, 20-49 and 50+ +#' age_groups(ages, c(20, 50)) +#' +#' # split into groups of ten years +#' age_groups(ages, 1:10 * 10) +#' age_groups(ages, split_at = "tens") +#' +#' # split into groups of five years +#' age_groups(ages, 1:20 * 5) +#' age_groups(ages, split_at = "fives") +#' +#' # split specifically for children +#' age_groups(ages, c(1, 2, 4, 6, 13, 18)) +#' age_groups(ages, "children") +#' +#' \donttest{ +#' # resistance of ciprofloxacin per age group +#' if (require("dplyr") && require("ggplot2")) { +#' example_isolates %>% +#' filter_first_isolate() %>% +#' filter(mo == as.mo("Escherichia coli")) %>% +#' group_by(age_group = age_groups(age)) %>% +#' select(age_group, CIP) %>% +#' ggplot_sir( +#' x = "age_group", +#' minimum = 0, +#' x.title = "Age Group", +#' title = "Ciprofloxacin resistance per age group" +#' ) +#' } +#' } +age_groups <- function(x, split_at = c(12, 25, 55, 75), na.rm = FALSE) { + meet_criteria(x, allow_class = c("numeric", "integer"), is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(split_at, allow_class = c("numeric", "integer", "character"), is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + + if (any(x < 0, na.rm = TRUE)) { + x[x < 0] <- NA + warning_("in `age_groups()`: NAs introduced for ages below 0.") + } + if (is.character(split_at)) { + split_at <- split_at[1L] + if (split_at %like% "^(child|kid|junior)") { + split_at <- c(0, 1, 2, 4, 6, 13, 18) + } else if (split_at %like% "^(elder|senior)") { + split_at <- c(65, 75, 85) + } else if (split_at %like% "^five") { + split_at <- 1:20 * 5 + } else if (split_at %like% "^ten") { + split_at <- 1:10 * 10 + } + } + split_at <- sort(unique(as.integer(split_at))) + if (!split_at[1] == 0) { + # add base number 0 + split_at <- c(0, split_at) + } + split_at <- split_at[!is.na(split_at)] + stop_if(length(split_at) == 1, "invalid value for `split_at`") # only 0 is available + + # turn input values to 'split_at' indices + y <- x + lbls <- split_at + for (i in seq_len(length(split_at))) { + y[x >= split_at[i]] <- i + # create labels + lbls[i - 1] <- paste0(unique(c(split_at[i - 1], split_at[i] - 1)), collapse = "-") + } + + # last category + lbls[length(lbls)] <- paste0(split_at[length(split_at)], "+") + + agegroups <- factor(lbls[y], levels = lbls, ordered = TRUE) + + if (isTRUE(na.rm)) { + agegroups <- agegroups[!is.na(agegroups)] + } + + agegroups +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/antibiogram.R + + + + +#' Generate Traditional, Combination, Syndromic, or WISCA Antibiograms +#' +#' Create detailed antibiograms with options for traditional, combination, syndromic, and Bayesian WISCA methods. Based on the approaches of Klinker *et al.*, Barbieri *et al.*, and the Bayesian WISCA model (Weighted-Incidence Syndromic Combination Antibiogram) by Bielicki *et al.*, this function provides flexible output formats including plots and tables, ideal for integration with R Markdown and Quarto reports. +#' @param x a [data.frame] containing at least a column with microorganisms and columns with antibiotic results (class 'sir', see [as.sir()]) +#' @param antibiotics vector of any antibiotic name or code (will be evaluated with [as.ab()], column name of `x`, or (any combinations of) [antibiotic selectors][antibiotic_class_selectors] such as [aminoglycosides()] or [carbapenems()]. For combination antibiograms, this can also be set to values separated with `"+"`, such as "TZP+TOB" or "cipro + genta", given that columns resembling such antibiotics exist in `x`. See *Examples*. +#' @param mo_transform a character to transform microorganism input - must be `"name"`, `"shortname"` (default), `"gramstain"`, or one of the column names of the [microorganisms] data set: `r vector_or(colnames(microorganisms), sort = FALSE, quotes = TRUE)`. Can also be `NULL` to not transform the input. +#' @param ab_transform a character to transform antibiotic input - must be one of the column names of the [antibiotics] data set (defaults to `"name"`): `r vector_or(colnames(antibiotics), sort = FALSE, quotes = TRUE)`. Can also be `NULL` to not transform the input. +#' @param syndromic_group a column name of `x`, or values calculated to split rows of `x`, e.g. by using [ifelse()] or [`case_when()`][dplyr::case_when()]. See *Examples*. +#' @param add_total_n a [logical] to indicate whether total available numbers per pathogen should be added to the table (default is `TRUE`). This will add the lowest and highest number of available isolate per antibiotic (e.g, if for *E. coli* 200 isolates are available for ciprofloxacin and 150 for amoxicillin, the returned number will be "150-200"). +#' @param only_all_tested (for combination antibiograms): a [logical] to indicate that isolates must be tested for all antibiotics, see *Details* +#' @param digits number of digits to use for rounding the susceptibility percentage +#' @param formatting_type numeric value (1–12) indicating how the 'cells' of the antibiogram table should be formatted. See *Details* > *Formatting Type* for a list of options. +#' @param col_mo column name of the names or codes of the microorganisms (see [as.mo()]) - the default is the first column of class [`mo`]. Values will be coerced using [as.mo()]. +#' @param language language to translate text, which defaults to the system language (see [get_AMR_locale()]) +#' @param minimum the minimum allowed number of available (tested) isolates. Any isolate count lower than `minimum` will return `NA` with a warning. The default number of `30` isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see *Source*. +#' @param combine_SI a [logical] to indicate whether all susceptibility should be determined by results of either S, SDD, or I, instead of only S (default is `TRUE`) +#' @param sep a separating character for antibiotic columns in combination antibiograms +#' @param info a [logical] to indicate info should be printed - the default is `TRUE` only in interactive mode +#' @param object an [antibiogram()] object +#' @param ... when used in [R Markdown or Quarto][knitr::kable()]: arguments passed on to [knitr::kable()] (otherwise, has no use) +#' @details This function returns a table with values between 0 and 100 for *susceptibility*, not resistance. +#' +#' **Remember that you should filter your data to let it contain only first isolates!** This is needed to exclude duplicates and to reduce selection bias. Use [first_isolate()] to determine them in your data set with one of the four available algorithms. +#' +#' ### Formatting Type +#' +#' The formatting of the 'cells' of the table can be set with the argument `formatting_type`. In these examples, `5` is the susceptibility percentage, `15` the numerator, and `300` the denominator: +#' +#' 1. 5 +#' 2. 15 +#' 3. 300 +#' 4. 15/300 +#' 5. 5 (300) +#' 6. 5% (300) +#' 7. 5 (N=300) +#' 8. 5% (N=300) +#' 9. 5 (15/300) +#' 10. 5% (15/300) +#' 11. 5 (N=15/300) +#' 12. 5% (N=15/300) +#' +#' The default is `10`, which can be set globally with the package option [`AMR_antibiogram_formatting_type`][AMR-options], e.g. `options(AMR_antibiogram_formatting_type = 5)`. +#' +#' Set `digits` (defaults to `0`) to alter the rounding of the susceptibility percentage. +#' +#' ### Antibiogram Types +#' +#' There are four antibiogram types, as summarised by Klinker *et al.* (2021, \doi{10.1177/20499361211011373}), and they are all supported by [antibiogram()]. Use WISCA whenever possible, since it provides precise coverage estimates by accounting for pathogen incidence and antimicrobial susceptibility. See the section *Why Use WISCA?* on this page. +#' +#' The four antibiogram types: +#' +#' 1. **Traditional Antibiogram** +#' +#' Case example: Susceptibility of *Pseudomonas aeruginosa* to piperacillin/tazobactam (TZP) +#' +#' Code example: +#' +#' ```r +#' antibiogram(your_data, +#' antibiotics = "TZP") +#' ``` +#' +#' 2. **Combination Antibiogram** +#' +#' Case example: Additional susceptibility of *Pseudomonas aeruginosa* to TZP + tobramycin versus TZP alone +#' +#' Code example: +#' +#' ```r +#' antibiogram(your_data, +#' antibiotics = c("TZP", "TZP+TOB", "TZP+GEN")) +#' ``` +#' +#' 3. **Syndromic Antibiogram** +#' +#' Case example: Susceptibility of *Pseudomonas aeruginosa* to TZP among respiratory specimens (obtained among ICU patients only) +#' +#' Code example: +#' +#' ```r +#' antibiogram(your_data, +#' antibiotics = penicillins(), +#' syndromic_group = "ward") +#' ``` +#' +#' 4. **Weighted-Incidence Syndromic Combination Antibiogram (WISCA)** +#' +#' WISCA enhances empirical antibiotic selection by weighting the incidence of pathogens in specific clinical syndromes and combining them with their susceptibility data. It provides an estimation of regimen coverage by aggregating pathogen incidences and susceptibilities across potential causative organisms. See also the section *Why Use WISCA?* on this page. +#' +#' Case example: Susceptibility of *Pseudomonas aeruginosa* to TZP among respiratory specimens (obtained among ICU patients only) for male patients age >=65 years with heart failure +#' +#' Code example: +#' +#' ```r +#' library(dplyr) +#' your_data %>% +#' filter(ward == "ICU" & specimen_type == "Respiratory") %>% +#' antibiogram(antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"), +#' syndromic_group = ifelse(.$age >= 65 & +#' .$gender == "Male" & +#' .$condition == "Heart Disease", +#' "Study Group", "Control Group")) +#' ``` +#' +#' WISCA uses a sophisticated Bayesian decision model to combine both local and pooled antimicrobial resistance data. This approach not only evaluates local patterns but can also draw on multi-centre datasets to improve regimen accuracy, even in low-incidence infections like paediatric bloodstream infections (BSIs). +#' +#' ### Inclusion in Combination Antibiogram and Syndromic Antibiogram +#' +#' Note that for types 2 and 3 (Combination Antibiogram and Syndromic Antibiogram), it is important to realise that susceptibility can be calculated in two ways, which can be set with the `only_all_tested` argument (default is `FALSE`). See this example for two antibiotics, Drug A and Drug B, about how [antibiogram()] works to calculate the %SI: +#' +#' ``` +#' -------------------------------------------------------------------- +#' only_all_tested = FALSE only_all_tested = TRUE +#' ----------------------- ----------------------- +#' Drug A Drug B include as include as include as include as +#' numerator denominator numerator denominator +#' -------- -------- ---------- ----------- ---------- ----------- +#' S or I S or I X X X X +#' R S or I X X X X +#' S or I X X - - +#' S or I R X X X X +#' R R - X - X +#' R - - - - +#' S or I X X - - +#' R - - - - +#' - - - - +#' -------------------------------------------------------------------- +#' ``` +#' +#' ### Plotting +#' +#' All types of antibiograms as listed above can be plotted (using [ggplot2::autoplot()] or base \R's [plot()] and [barplot()]). +#' +#' THe outcome of [antibiogram()] can also be used directly in R Markdown / Quarto (i.e., `knitr`) for reports. In this case, [knitr::kable()] will be applied automatically and microorganism names will even be printed in italics at default (see argument `italicise`). +#' +#' You can also use functions from specific 'table reporting' packages to transform the output of [antibiogram()] to your needs, e.g. with `flextable::as_flextable()` or `gt::gt()`. +#' +#' @section Why Use WISCA?: +#' WISCA is a powerful tool for guiding empirical antibiotic therapy because it provides precise coverage estimates by accounting for pathogen incidence and antimicrobial susceptibility. This is particularly important in empirical treatment, where the causative pathogen is often unknown at the outset. Traditional antibiograms do not reflect the weighted likelihood of specific pathogens based on clinical syndromes, which can lead to suboptimal treatment choices. +#' +#' The Bayesian WISCA, as described by Bielicki *et al.* (2016), improves on earlier methods by handling uncertainties common in smaller datasets, such as low-incidence infections. This method offers a significant advantage by: +#' +#' 1. Pooling Data from Multiple Sources:\cr WISCA uses pooled data from multiple hospitals or surveillance sources to overcome limitations of small sample sizes at individual institutions, allowing for more confident selection of narrow-spectrum antibiotics or combinations. +#' 2. Bayesian Framework:\cr The Bayesian decision tree model accounts for both local data and prior knowledge (such as inherent resistance patterns) to estimate regimen coverage. It allows for a more precise estimation of coverage, even in cases where susceptibility data is missing or incomplete. +#' 3. Incorporating Pathogen and Regimen Uncertainty:\cr WISCA allows clinicians to see the likelihood that an empirical regimen will be effective against all relevant pathogens, taking into account uncertainties related to both pathogen prevalence and antimicrobial resistance. This leads to better-informed, data-driven clinical decisions. +#' 4. Scenarios for Optimising Treatment:\cr For hospitals or settings with low-incidence infections, WISCA helps determine whether local data is sufficient or if pooling with external data is necessary. It also identifies statistically significant differences or similarities between antibiotic regimens, enabling clinicians to choose optimal therapies with greater confidence. +#' +#' WISCA is essential in optimising empirical treatment by shifting away from broad-spectrum antibiotics, which are often overused in empirical settings. By offering precise estimates based on syndromic patterns and pooled data, WISCA supports antimicrobial stewardship by guiding more targeted therapy, reducing unnecessary broad-spectrum use, and combating the rise of antimicrobial resistance. +#' @source +#' * Bielicki JA *et al.* (2016). **Selecting appropriate empirical antibiotic regimens for paediatric bloodstream infections: application of a Bayesian decision model to local and pooled antimicrobial resistance surveillance data** *Journal of Antimicrobial Chemotherapy* 71(3); \doi{10.1093/jac/dkv397} +#' * Klinker KP *et al.* (2021). **Antimicrobial stewardship and antibiograms: importance of moving beyond traditional antibiograms**. *Therapeutic Advances in Infectious Disease*, May 5;8:20499361211011373; \doi{10.1177/20499361211011373} +#' * Barbieri E *et al.* (2021). **Development of a Weighted-Incidence Syndromic Combination Antibiogram (WISCA) to guide the choice of the empiric antibiotic treatment for urinary tract infection in paediatric patients: a Bayesian approach** *Antimicrobial Resistance & Infection Control* May 1;10(1):74; \doi{10.1186/s13756-021-00939-2} +#' * **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition**, 2022, *Clinical and Laboratory Standards Institute (CLSI)*. . +#' @rdname antibiogram +#' @name antibiogram +#' @export +#' @examples +#' # example_isolates is a data set available in the AMR package. +#' # run ?example_isolates for more info. +#' example_isolates +#' +#' \donttest{ +#' # Traditional antibiogram ---------------------------------------------- +#' +#' antibiogram(example_isolates, +#' antibiotics = c(aminoglycosides(), carbapenems()) +#' ) +#' +#' antibiogram(example_isolates, +#' antibiotics = aminoglycosides(), +#' ab_transform = "atc", +#' mo_transform = "gramstain" +#' ) +#' +#' antibiogram(example_isolates, +#' antibiotics = carbapenems(), +#' ab_transform = "name", +#' mo_transform = "name" +#' ) +#' +#' +#' # Combined antibiogram ------------------------------------------------- +#' +#' # combined antibiotics yield higher empiric coverage +#' antibiogram(example_isolates, +#' antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"), +#' mo_transform = "gramstain" +#' ) +#' +#' # names of antibiotics do not need to resemble columns exactly: +#' antibiogram(example_isolates, +#' antibiotics = c("Cipro", "cipro + genta"), +#' mo_transform = "gramstain", +#' ab_transform = "name", +#' sep = " & " +#' ) +#' +#' +#' # Syndromic antibiogram ------------------------------------------------ +#' +#' # the data set could contain a filter for e.g. respiratory specimens +#' antibiogram(example_isolates, +#' antibiotics = c(aminoglycosides(), carbapenems()), +#' syndromic_group = "ward" +#' ) +#' +#' # now define a data set with only E. coli +#' ex1 <- example_isolates[which(mo_genus() == "Escherichia"), ] +#' +#' # with a custom language, though this will be determined automatically +#' # (i.e., this table will be in Spanish on Spanish systems) +#' antibiogram(ex1, +#' antibiotics = aminoglycosides(), +#' ab_transform = "name", +#' syndromic_group = ifelse(ex1$ward == "ICU", +#' "UCI", "No UCI" +#' ), +#' language = "es" +#' ) +#' +#' +#' # Weighted-incidence syndromic combination antibiogram (WISCA) --------- +#' +#' # the data set could contain a filter for e.g. respiratory specimens/ICU +#' antibiogram(example_isolates, +#' antibiotics = c("AMC", "AMC+CIP", "TZP", "TZP+TOB"), +#' mo_transform = "gramstain", +#' minimum = 10, # this should be >=30, but now just as example +#' syndromic_group = ifelse(example_isolates$age >= 65 & +#' example_isolates$gender == "M", +#' "WISCA Group 1", "WISCA Group 2" +#' ) +#' ) +#' +#' +#' # Print the output for R Markdown / Quarto ----------------------------- +#' +#' ureido <- antibiogram(example_isolates, +#' antibiotics = ureidopenicillins(), +#' ab_transform = "name" +#' ) +#' +#' # in an Rmd file, you would just need to return `ureido` in a chunk, +#' # but to be explicit here: +#' if (requireNamespace("knitr")) { +#' cat(knitr::knit_print(ureido)) +#' } +#' +#' +#' # Generate plots with ggplot2 or base R -------------------------------- +#' +#' ab1 <- antibiogram(example_isolates, +#' antibiotics = c("AMC", "CIP", "TZP", "TZP+TOB"), +#' mo_transform = "gramstain" +#' ) +#' ab2 <- antibiogram(example_isolates, +#' antibiotics = c("AMC", "CIP", "TZP", "TZP+TOB"), +#' mo_transform = "gramstain", +#' syndromic_group = "ward" +#' ) +#' +#' if (requireNamespace("ggplot2")) { +#' ggplot2::autoplot(ab1) +#' } +#' if (requireNamespace("ggplot2")) { +#' ggplot2::autoplot(ab2) +#' } +#' +#' plot(ab1) +#' plot(ab2) +#' } +antibiogram <- function(x, + antibiotics = where(is.sir), + mo_transform = "shortname", + ab_transform = "name", + syndromic_group = NULL, + add_total_n = FALSE, + only_all_tested = FALSE, + digits = 0, + formatting_type = getOption("AMR_antibiogram_formatting_type", 10), + col_mo = NULL, + language = get_AMR_locale(), + minimum = 30, + combine_SI = TRUE, + sep = " + ", + info = interactive()) { + meet_criteria(x, allow_class = "data.frame") + x <- ascertain_sir_classes(x, "x") + meet_criteria(mo_transform, allow_class = "character", has_length = 1, is_in = c("name", "shortname", "gramstain", colnames(AMR::microorganisms)), allow_NULL = TRUE) + meet_criteria(ab_transform, allow_class = "character", has_length = 1, is_in = colnames(AMR::antibiotics), allow_NULL = TRUE) + meet_criteria(syndromic_group, allow_class = "character", allow_NULL = TRUE, allow_NA = TRUE) + meet_criteria(add_total_n, allow_class = "logical", has_length = 1) + meet_criteria(only_all_tested, allow_class = "logical", has_length = 1) + meet_criteria(digits, allow_class = c("numeric", "integer"), has_length = 1, is_finite = TRUE) + meet_criteria(formatting_type, allow_class = c("numeric", "integer"), has_length = 1, is_in = c(1:12)) + meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + language <- validate_language(language) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(sep, allow_class = "character", has_length = 1) + meet_criteria(info, allow_class = "logical", has_length = 1) + + # try to find columns based on type + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = interactive()) + stop_if(is.null(col_mo), "`col_mo` must be set") + } + # transform MOs + x$`.mo` <- x[, col_mo, drop = TRUE] + if (is.null(mo_transform)) { + # leave as is + } else if (mo_transform == "gramstain") { + x$`.mo` <- mo_gramstain(x$`.mo`, language = language) + } else if (mo_transform == "shortname") { + x$`.mo` <- mo_shortname(x$`.mo`, language = language) + } else if (mo_transform == "name") { + x$`.mo` <- mo_name(x$`.mo`, language = language) + } else { + x$`.mo` <- mo_property(x$`.mo`, property = mo_transform, language = language) + } + x$`.mo`[is.na(x$`.mo`)] <- "(??)" + + # get syndromic groups + if (!is.null(syndromic_group)) { + if (length(syndromic_group) == 1 && syndromic_group %in% colnames(x)) { + x$`.syndromic_group` <- x[, syndromic_group, drop = TRUE] + } else if (!is.null(syndromic_group)) { + x$`.syndromic_group` <- syndromic_group + } + x$`.syndromic_group`[is.na(x$`.syndromic_group`) | x$`.syndromic_group` == ""] <- paste0("(", translate_AMR("unknown", language = language), ")") + has_syndromic_group <- TRUE + } else { + has_syndromic_group <- FALSE + } + + # get antibiotics + if (tryCatch(is.character(antibiotics), error = function(e) FALSE)) { + antibiotics.bak <- antibiotics + # split antibiotics on separator and make it a list + antibiotics <- strsplit(gsub(" ", "", antibiotics), "+", fixed = TRUE) + # get available antibiotics in data set + df_ab <- get_column_abx(x, verbose = FALSE, info = FALSE) + # get antibiotics from user + user_ab <- suppressMessages(suppressWarnings(lapply(antibiotics, as.ab, flag_multiple_results = FALSE, info = FALSE))) + non_existing <- character(0) + user_ab <- lapply(user_ab, function(x) { + out <- unname(df_ab[match(x, names(df_ab))]) + non_existing <<- c(non_existing, x[is.na(out) & !is.na(x)]) + # remove non-existing columns + out[!is.na(out)] + }) + user_ab <- user_ab[unlist(lapply(user_ab, length)) > 0] + + if (length(non_existing) > 0) { + warning_("The following antibiotics were not available and ignored: ", vector_and(ab_name(non_existing, language = NULL, tolower = TRUE), quotes = FALSE)) + } + + # make list unique + antibiotics <- unique(user_ab) + # go through list to set AMR in combinations + for (i in seq_len(length(antibiotics))) { + abx <- antibiotics[[i]] + for (ab in abx) { + # make sure they are SIR columns + x[, ab] <- as.sir(x[, ab, drop = TRUE]) + } + new_colname <- paste0(trimws(abx), collapse = sep) + if (length(abx) == 1) { + next + } else { + # determine whether this new column should contain S, I, R, or NA + if (isTRUE(combine_SI)) { + S_values <- c("S", "SDD", "I") + } else { + S_values <- "S" + } + other_values <- setdiff(c("S", "SDD", "I", "R", "NI"), S_values) + x_transposed <- as.list(as.data.frame(t(x[, abx, drop = FALSE]), stringsAsFactors = FALSE)) + if (isTRUE(only_all_tested)) { + x[new_colname] <- as.sir(vapply(FUN.VALUE = character(1), x_transposed, function(x) ifelse(anyNA(x), NA_character_, ifelse(any(x %in% S_values), "S", "R")), USE.NAMES = FALSE)) + } else { + x[new_colname] <- as.sir(vapply( + FUN.VALUE = character(1), x_transposed, function(x) ifelse(any(x %in% S_values, na.rm = TRUE), "S", ifelse(anyNA(x), NA_character_, "R")), + USE.NAMES = FALSE + )) + } + } + antibiotics[[i]] <- new_colname + } + antibiotics <- unlist(antibiotics) + } else { + antibiotics <- colnames(suppressWarnings(x[, antibiotics, drop = FALSE])) + } + + if (isTRUE(has_syndromic_group)) { + out <- x %pm>% + pm_select(.syndromic_group, .mo, antibiotics) %pm>% + pm_group_by(.syndromic_group) + } else { + out <- x %pm>% + pm_select(.mo, antibiotics) + } + + # get numbers of S, I, R (per group) + out <- out %pm>% + bug_drug_combinations( + col_mo = ".mo", + FUN = function(x) x + ) + counts <- out + + if (isTRUE(combine_SI)) { + out$numerator <- out$S + out$I + out$SDD + } else { + out$numerator <- out$S + } + if (all(out$total < minimum, na.rm = TRUE)) { + warning_("All combinations had less than `minimum = ", minimum, "` results, returning an empty antibiogram") + return(as_original_data_class(data.frame(), class(out), extra_class = "antibiogram")) + } else if (any(out$total < minimum, na.rm = TRUE)) { + if (isTRUE(info)) { + message_("NOTE: ", sum(out$total < minimum, na.rm = TRUE), " combinations had less than `minimum = ", minimum, "` results and were ignored", add_fn = font_red) + } + out <- out %pm>% + subset(total >= minimum) + } + + # regroup for summarising + if (isTRUE(has_syndromic_group)) { + colnames(out)[1] <- "syndromic_group" + out <- out %pm>% + pm_group_by(syndromic_group, mo, ab) + } else { + out <- out %pm>% + pm_group_by(mo, ab) + } + + # formatting type: + # 1. 5 + # 2. 15 + # 3. 300 + # 4. 15/300 + # 5. 5 (300) + # 6. 5% (300) + # 7. 5 (N=300) + # 8. 5% (N=300) + # 9. 5 (15/300) + # 10. 5% (15/300) + # 11. 5 (N=15/300) + # 12. 5% (N=15/300) + out_numeric <- out %pm>% + pm_summarise(percentage = numerator / total, + numerator = numerator, + total = total) + out$digits <- digits # since pm_sumarise() cannot work with an object outside the current frame + if (formatting_type == 1) out <- out %pm>% pm_summarise(out_value = round((numerator / total) * 100, digits = digits)) + if (formatting_type == 2) out <- out %pm>% pm_summarise(out_value = numerator) + if (formatting_type == 3) out <- out %pm>% pm_summarise(out_value = total) + if (formatting_type == 4) out <- out %pm>% pm_summarise(out_value = paste0(numerator, "/", total)) + if (formatting_type == 5) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), " (", total, ")")) + if (formatting_type == 6) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), "% (", total, ")")) + if (formatting_type == 7) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), " (N=", total, ")")) + if (formatting_type == 8) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), "% (N=", total, ")")) + if (formatting_type == 9) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), " (", numerator, "/", total, ")")) + if (formatting_type == 10) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), "% (", numerator, "/", total, ")")) + if (formatting_type == 11) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), " (N=", numerator, "/", total, ")")) + if (formatting_type == 12) out <- out %pm>% pm_summarise(out_value = paste0(round((numerator / total) * 100, digits = digits), "% (N=", numerator, "/", total, ")")) + + # transform names of antibiotics + ab_naming_function <- function(x, t, l, s) { + x <- strsplit(x, s, fixed = TRUE) + out <- character(length = length(x)) + for (i in seq_len(length(x))) { + a <- x[[i]] + if (is.null(t)) { + # leave as is + } else if (t == "atc") { + a <- ab_atc(a, only_first = TRUE, language = l) + } else { + a <- ab_property(a, property = t, language = l) + } + if (length(a) > 1) { + a <- paste0(trimws(a), collapse = sep) + } + out[i] <- a + } + out + } + out$ab <- ab_naming_function(out$ab, t = ab_transform, l = language, s = sep) + out_numeric$ab <- ab_naming_function(out_numeric$ab, t = ab_transform, l = language, s = sep) + + # transform long to wide + long_to_wide <- function(object) { + object <- object %pm>% + # an unclassed data.frame is required for stats::reshape() + as.data.frame(stringsAsFactors = FALSE) %pm>% + stats::reshape(direction = "wide", idvar = "mo", timevar = "ab", v.names = "out_value") + colnames(object) <- gsub("^out_value?[.]", "", colnames(object)) + return(object) + } + + # ungroup for long -> wide transformation + attr(out, "pm_groups") <- NULL + attr(out, "groups") <- NULL + class(out) <- class(out)[!class(out) %in% c("grouped_df", "grouped_data")] + + if (isTRUE(has_syndromic_group)) { + grps <- unique(out$syndromic_group) + for (i in seq_len(length(grps))) { + grp <- grps[i] + if (i == 1) { + new_df <- long_to_wide(out[which(out$syndromic_group == grp), , drop = FALSE]) + } else { + new_df <- rbind_AMR( + new_df, + long_to_wide(out[which(out$syndromic_group == grp), , drop = FALSE]) + ) + } + } + # sort rows + new_df <- new_df %pm>% pm_arrange(mo, syndromic_group) + # sort columns + new_df <- new_df[, c("syndromic_group", "mo", sort(colnames(new_df)[!colnames(new_df) %in% c("syndromic_group", "mo")])), drop = FALSE] + colnames(new_df)[1:2] <- translate_AMR(c("Syndromic Group", "Pathogen"), language = language) + } else { + new_df <- long_to_wide(out) + # sort rows + new_df <- new_df %pm>% pm_arrange(mo) + # sort columns + new_df <- new_df[, c("mo", sort(colnames(new_df)[colnames(new_df) != "mo"])), drop = FALSE] + colnames(new_df)[1] <- translate_AMR("Pathogen", language = language) + } + + # add total N if indicated + if (isTRUE(add_total_n)) { + if (isTRUE(has_syndromic_group)) { + n_per_mo <- counts %pm>% + pm_group_by(mo, .syndromic_group) %pm>% + pm_summarise(paste0(min(total, na.rm = TRUE), "-", max(total, na.rm = TRUE))) + colnames(n_per_mo) <- c("mo", "syn", "count") + count_group <- n_per_mo$count[match(paste(new_df[[2]], new_df[[1]]), paste(n_per_mo$mo, n_per_mo$syn))] + edit_col <- 2 + } else { + n_per_mo <- counts %pm>% + pm_group_by(mo) %pm>% + pm_summarise(paste0(min(total, na.rm = TRUE), "-", max(total, na.rm = TRUE))) + colnames(n_per_mo) <- c("mo", "count") + count_group <- n_per_mo$count[match(new_df[[1]], n_per_mo$mo)] + edit_col <- 1 + } + if (NCOL(new_df) == edit_col + 1) { + # only 1 antibiotic + new_df[[edit_col]] <- paste0(new_df[[edit_col]], " (", unlist(lapply(strsplit(x = count_group, split = "-", fixed = TRUE), function(x) x[1])), ")") + colnames(new_df)[edit_col] <- paste(colnames(new_df)[edit_col], "(N)") + } else { + # more than 1 antibiotic + new_df[[edit_col]] <- paste0(new_df[[edit_col]], " (", count_group, ")") + colnames(new_df)[edit_col] <- paste(colnames(new_df)[edit_col], "(N min-max)") + } + } + + out <- as_original_data_class(new_df, class(x), extra_class = "antibiogram") + rownames(out) <- NULL + structure(out, + has_syndromic_group = has_syndromic_group, + out_numeric = out_numeric, + combine_SI = combine_SI + ) +} + +# will be exported in R/zzz.R +tbl_sum.antibiogram <- function(x, ...) { + dims <- paste(format(NROW(x), big.mark = ","), AMR_env$cross_icon, format(NCOL(x), big.mark = ",")) + names(dims) <- "An Antibiogram" + dims +} + +# will be exported in R/zzz.R +tbl_format_footer.antibiogram <- function(x, ...) { + footer <- NextMethod() + if (NROW(x) == 0) { + return(footer) + } + c(footer, font_subtle(paste0("# Use `plot()` or `ggplot2::autoplot()` to create a plot of this antibiogram,\n", + "# or use it directly in R Markdown or ", + font_url("https://quarto.org", "Quarto"), ", see ", word_wrap("?antibiogram")))) +} + +#' @export +#' @rdname antibiogram +plot.antibiogram <- function(x, ...) { + df <- attributes(x)$out_numeric + if ("syndromic_group" %in% colnames(df)) { + # barplot in base R does not support facets - paste columns together + df$mo <- paste(df$mo, "-", df$syndromic_group) + df$syndromic_group <- NULL + df <- df[order(df$mo), , drop = FALSE] + } + mo_levels <- unique(df$mo) + mfrow_old <- graphics::par()$mfrow + sqrt_levels <- sqrt(length(mo_levels)) + graphics::par(mfrow = c(ceiling(sqrt_levels), floor(sqrt_levels))) + for (i in seq_along(mo_levels)) { + mo <- mo_levels[i] + df_sub <- df[df$mo == mo, , drop = FALSE] + + barplot( + height = df_sub$percentage * 100, + xlab = NULL, + ylab = ifelse(isTRUE(attributes(x)$combine_SI), "%SI", "%S"), + names.arg = df_sub$ab, + col = "#aaaaaa", + beside = TRUE, + main = mo, + legend = NULL + ) + } + graphics::par(mfrow = mfrow_old) +} + +#' @export +#' @noRd +barplot.antibiogram <- function(height, ...) { + plot(height, ...) +} + +#' @method autoplot antibiogram +#' @rdname antibiogram +# will be exported using s3_register() in R/zzz.R +autoplot.antibiogram <- function(object, ...) { + df <- attributes(object)$out_numeric + ggplot2::ggplot(df) + + ggplot2::geom_col( + ggplot2::aes( + x = ab, + y = percentage * 100, + fill = if ("syndromic_group" %in% colnames(df)) { + syndromic_group + } else { + NULL + } + ), + position = ggplot2::position_dodge2(preserve = "single") + ) + + ggplot2::facet_wrap("mo") + + ggplot2::labs( + y = ifelse(isTRUE(attributes(object)$combine_SI), "%SI", "%S"), + x = NULL, + fill = if ("syndromic_group" %in% colnames(df)) { + colnames(object)[1] + } else { + NULL + } + ) +} + +# will be exported in zzz.R +#' @method knit_print antibiogram +#' @param italicise a [logical] to indicate whether the microorganism names in the [knitr][knitr::kable()] table should be made italic, using [italicise_taxonomy()]. +#' @param na character to use for showing `NA` values +#' @rdname antibiogram +knit_print.antibiogram <- function(x, italicise = TRUE, na = getOption("knitr.kable.NA", default = ""), ...) { + stop_ifnot_installed("knitr") + meet_criteria(italicise, allow_class = "logical", has_length = 1) + meet_criteria(na, allow_class = "character", has_length = 1, allow_NA = TRUE) + + if (isTRUE(italicise)) { + # make all microorganism names italic, according to nomenclature + names_col <- ifelse(isTRUE(attributes(x)$has_syndromic_group), 2, 1) + x[[names_col]] <- italicise_taxonomy(x[[names_col]], type = "markdown") + } + + old_option <- getOption("knitr.kable.NA") + options(knitr.kable.NA = na) + on.exit(options(knitr.kable.NA = old_option)) + + out <- paste(c("", "", knitr::kable(x, ..., output = FALSE)), collapse = "\n") + knitr::asis_output(out) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/atc_online.R + + + + +#' Get ATC Properties from WHOCC Website +#' +#' Gets data from the WHOCC website to determine properties of an Anatomical Therapeutic Chemical (ATC) (e.g. an antibiotic), such as the name, defined daily dose (DDD) or standard unit. +#' @param atc_code a [character] (vector) with ATC code(s) of antibiotics, will be coerced with [as.ab()] and [ab_atc()] internally if not a valid ATC code +#' @param property property of an ATC code. Valid values are `"ATC"`, `"Name"`, `"DDD"`, `"U"` (`"unit"`), `"Adm.R"`, `"Note"` and `groups`. For this last option, all hierarchical groups of an ATC code will be returned, see *Examples*. +#' @param administration type of administration when using `property = "Adm.R"`, see *Details* +#' @param url url of website of the WHOCC. The sign `%s` can be used as a placeholder for ATC codes. +#' @param url_vet url of website of the WHOCC for veterinary medicine. The sign `%s` can be used as a placeholder for ATC_vet codes (that all start with "Q"). +#' @param ... arguments to pass on to `atc_property` +#' @details +#' Options for argument `administration`: +#' +#' - `"Implant"` = Implant +#' - `"Inhal"` = Inhalation +#' - `"Instill"` = Instillation +#' - `"N"` = nasal +#' - `"O"` = oral +#' - `"P"` = parenteral +#' - `"R"` = rectal +#' - `"SL"` = sublingual/buccal +#' - `"TD"` = transdermal +#' - `"V"` = vaginal +#' +#' Abbreviations of return values when using `property = "U"` (unit): +#' +#' - `"g"` = gram +#' - `"mg"` = milligram +#' - `"mcg"` = microgram +#' - `"U"` = unit +#' - `"TU"` = thousand units +#' - `"MU"` = million units +#' - `"mmol"` = millimole +#' - `"ml"` = millilitre (e.g. eyedrops) +#' +#' **N.B. This function requires an internet connection and only works if the following packages are installed: `curl`, `rvest`, `xml2`.** +#' @export +#' @rdname atc_online +#' @source +#' @examples +#' \donttest{ +#' if (requireNamespace("curl") && requireNamespace("rvest") && requireNamespace("xml2")) { +#' # oral DDD (Defined Daily Dose) of amoxicillin +#' atc_online_property("J01CA04", "DDD", "O") +#' atc_online_ddd(ab_atc("amox")) +#' +#' # parenteral DDD (Defined Daily Dose) of amoxicillin +#' atc_online_property("J01CA04", "DDD", "P") +#' +#' atc_online_property("J01CA04", property = "groups") # search hierarchical groups of amoxicillin +#' } +#' } +atc_online_property <- function(atc_code, + property, + administration = "O", + url = "https://atcddd.fhi.no/atc_ddd_index/?code=%s&showdescription=no", + url_vet = "https://atcddd.fhi.no/atcvet/atcvet_index/?code=%s&showdescription=no") { + meet_criteria(atc_code, allow_class = "character", allow_NA = TRUE) + meet_criteria(property, allow_class = "character", has_length = 1, is_in = c("ATC", "Name", "DDD", "U", "unit", "Adm.R", "Note", "groups"), ignore.case = TRUE) + meet_criteria(administration, allow_class = "character", has_length = 1) + meet_criteria(url, allow_class = "character", has_length = 1, looks_like = "https?://") + meet_criteria(url_vet, allow_class = "character", has_length = 1, looks_like = "https?://") + + has_internet <- import_fn("has_internet", "curl") + html_attr <- import_fn("html_attr", "rvest") + html_children <- import_fn("html_children", "rvest") + html_node <- import_fn("html_node", "rvest") + html_nodes <- import_fn("html_nodes", "rvest") + html_table <- import_fn("html_table", "rvest") + html_text <- import_fn("html_text", "rvest") + read_html <- import_fn("read_html", "xml2") + + if (!all(atc_code %in% unlist(AMR::antibiotics$atc))) { + atc_code <- as.character(ab_atc(atc_code, only_first = TRUE)) + } + + if (!has_internet()) { + message_("There appears to be no internet connection, returning NA.", + add_fn = font_red, + as_note = FALSE + ) + return(rep(NA, length(atc_code))) + } + + property <- tolower(property) + # also allow unit as property + if (property == "unit") { + property <- "u" + } + if (property == "ddd") { + returnvalue <- rep(NA_real_, length(atc_code)) + } else if (property == "groups") { + returnvalue <- list() + } else { + returnvalue <- rep(NA_character_, length(atc_code)) + } + + progress <- progress_ticker(n = length(atc_code), 3) + on.exit(close(progress)) + + for (i in seq_len(length(atc_code))) { + progress$tick() + + if (is.na(atc_code[i])) { + next + } + + if (atc_code[i] %like% "^Q") { + # veterinary drugs, ATC_vet codes start with a "Q" + atc_url <- url_vet + } else { + atc_url <- url + } + atc_url <- sub("%s", atc_code[i], atc_url, fixed = TRUE) + + if (property == "groups") { + out <- tryCatch( + read_html(atc_url) %pm>% + html_node("#content") %pm>% + html_children() %pm>% + html_node("a"), + error = function(e) NULL + ) + if (is.null(out)) { + message_("Connection to ", atc_url, " failed.") + return(rep(NA, length(atc_code))) + } + + # get URLS of items + hrefs <- out %pm>% html_attr("href") + # get text of items + texts <- out %pm>% html_text() + # select only text items where URL like "code=" + texts <- texts[grepl("?code=", tolower(hrefs), fixed = TRUE)] + # last one is antibiotics, skip it + texts <- texts[seq_len(length(texts)) - 1] + returnvalue <- c(list(texts), returnvalue) + } else { + out <- tryCatch( + read_html(atc_url) %pm>% + html_nodes("table") %pm>% + html_table(header = TRUE) %pm>% + as.data.frame(stringsAsFactors = FALSE), + error = function(e) NULL + ) + if (is.null(out)) { + message_("Connection to ", atc_url, " failed.") + return(rep(NA, length(atc_code))) + } + + # case insensitive column names + colnames(out) <- gsub("^atc.*", "atc", tolower(colnames(out))) + + if (length(out) == 0) { + message_("in `atc_online_property()`: no properties found for ATC ", atc_code[i], ". Please check ", font_url(atc_url, "this WHOCC webpage"), ".") + returnvalue[i] <- NA + next + } + + if (property %in% c("atc", "name")) { + # ATC and name are only in first row + returnvalue[i] <- out[1, property, drop = TRUE] + } else { + if (!"adm.r" %in% colnames(out) || is.na(out[1, "adm.r", drop = TRUE])) { + returnvalue[i] <- NA + next + } else { + for (j in seq_len(nrow(out))) { + if (out[j, "adm.r"] == administration) { + returnvalue[i] <- out[j, property, drop = TRUE] + } + } + } + } + } + } + + if (property == "groups" && length(returnvalue) == 1) { + returnvalue <- returnvalue[[1]] + } + + returnvalue +} + +#' @rdname atc_online +#' @export +atc_online_groups <- function(atc_code, ...) { + meet_criteria(atc_code, allow_class = "character", allow_NA = TRUE) + atc_online_property(atc_code = atc_code, property = "groups", ...) +} + +#' @rdname atc_online +#' @export +atc_online_ddd <- function(atc_code, ...) { + meet_criteria(atc_code, allow_class = "character", allow_NA = TRUE) + atc_online_property(atc_code = atc_code, property = "ddd", ...) +} + +#' @rdname atc_online +#' @export +atc_online_ddd_units <- function(atc_code, ...) { + meet_criteria(atc_code, allow_class = "character", allow_NA = TRUE) + atc_online_property(atc_code = atc_code, property = "unit", ...) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/av.R + + + + +#' Transform Input to an Antiviral Drug ID +#' +#' Use this function to determine the antiviral drug code of one or more antiviral drugs. The data set [antivirals] will be searched for abbreviations, official names and synonyms (brand names). +#' @param x a [character] vector to determine to antiviral drug ID +#' @param flag_multiple_results a [logical] to indicate whether a note should be printed to the console that probably more than one antiviral drug code or name can be retrieved from a single input value. +#' @param info a [logical] to indicate whether a progress bar should be printed - the default is `TRUE` only in interactive mode +#' @param ... arguments passed on to internal functions +#' @rdname as.av +#' @inheritSection WHOCC WHOCC +#' @details All entries in the [antivirals] data set have three different identifiers: a human readable EARS-Net code (column `ab`, used by ECDC and WHONET), an ATC code (column `atc`, used by WHO), and a CID code (column `cid`, Compound ID, used by PubChem). The data set contains more than 5,000 official brand names from many different countries, as found in PubChem. Not that some drugs contain multiple ATC codes. +#' +#' All these properties will be searched for the user input. The [as.av()] can correct for different forms of misspelling: +#' +#' * Wrong spelling of drug names (such as "acyclovir"), which corrects for most audible similarities such as f/ph, x/ks, c/z/s, t/th, etc. +#' * Too few or too many vowels or consonants +#' * Switching two characters (such as "aycclovir", often the case in clinical data, when doctors typed too fast) +#' * Digitalised paper records, leaving artefacts like 0/o/O (zero and O's), B/8, n/r, etc. +#' +#' Use the [`av_*`][av_property()] functions to get properties based on the returned antiviral drug ID, see *Examples*. +#' +#' Note: the [as.av()] and [`av_*`][av_property()] functions may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. +#' @section Source: +#' World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} +#' +#' European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +#' @aliases av +#' @return A [character] [vector] with additional class [`ab`] +#' @seealso +#' * [antivirals] for the [data.frame] that is being used to determine ATCs +#' * [av_from_text()] for a function to retrieve antimicrobial drugs from clinical text (from health care records) +#' @inheritSection AMR Reference Data Publicly Available +#' @export +#' @examples +#' # these examples all return "ACI", the ID of aciclovir: +#' as.av("J05AB01") +#' as.av("J 05 AB 01") +#' as.av("Aciclovir") +#' as.av("aciclo") +#' as.av(" aciclo 123") +#' as.av("ACICL") +#' as.av("ACI") +#' as.av("Virorax") # trade name +#' as.av("Zovirax") # trade name +#' +#' as.av("acyklofir") # severe spelling error, yet works +#' +#' # use av_* functions to get a specific properties (see ?av_property); +#' # they use as.av() internally: +#' av_name("J05AB01") +#' av_name("acicl") +as.av <- function(x, flag_multiple_results = TRUE, info = interactive(), ...) { + meet_criteria(x, allow_class = c("character", "numeric", "integer", "factor"), allow_NA = TRUE) + meet_criteria(flag_multiple_results, allow_class = "logical", has_length = 1) + meet_criteria(info, allow_class = "logical", has_length = 1) + + if (is.av(x)) { + return(x) + } + if (all(x %in% c(AMR_env$AV_lookup$av, NA))) { + # all valid AB codes, but not yet right class + return(set_clean_class(x, + new_class = c("av", "character") + )) + } + + initial_search <- is.null(list(...)$initial_search) + already_regex <- isTRUE(list(...)$already_regex) + fast_mode <- isTRUE(list(...)$fast_mode) + + x_bak <- x + x <- toupper(x) + + # remove diacritics + x <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT") + x <- gsub('"', "", x, fixed = TRUE) + x <- gsub("(specimen|specimen date|specimen_date|spec_date|gender|^dates?$)", "", x, ignore.case = TRUE, perl = TRUE) + x_bak_clean <- x + if (already_regex == FALSE) { + x_bak_clean <- generalise_antibiotic_name(x_bak_clean) + } + + x <- unique(x_bak_clean) # this means that every x is in fact generalise_antibiotic_name(x) + x_new <- rep(NA_character_, length(x)) + x_unknown <- character(0) + x_unknown_ATCs <- character(0) + + note_if_more_than_one_found <- function(found, index, from_text) { + if (isTRUE(initial_search) && isTRUE(length(from_text) > 1)) { + avnames <- av_name(from_text, tolower = TRUE, initial_search = FALSE) + if (av_name(found[1L], language = NULL) %like% "(clavulanic acid|avibactam)") { + avnames <- avnames[!avnames %in% c("clavulanic acid", "avibactam")] + } + if (length(avnames) > 1) { + warning_( + "More than one result was found for item ", index, ": ", + vector_and(avnames, quotes = FALSE) + ) + } + } + found[1L] + } + + # Fill in names, AB codes, CID codes and ATC codes directly (`x` is already clean and uppercase) + known_names <- x %in% AMR_env$AV_lookup$generalised_name + x_new[known_names] <- AMR_env$AV_lookup$av[match(x[known_names], AMR_env$AV_lookup$generalised_name)] + known_codes_av <- x %in% AMR_env$AV_lookup$av + known_codes_atc <- vapply(FUN.VALUE = logical(1), x, function(x_) x_ %in% unlist(AMR_env$AV_lookup$atc), USE.NAMES = FALSE) + known_codes_cid <- x %in% AMR_env$AV_lookup$cid + x_new[known_codes_av] <- AMR_env$AV_lookup$av[match(x[known_codes_av], AMR_env$AV_lookup$av)] + x_new[known_codes_atc] <- AMR_env$AV_lookup$av[vapply( + FUN.VALUE = integer(1), + x[known_codes_atc], + function(x_) { + which(vapply( + FUN.VALUE = logical(1), + AMR_env$AV_lookup$atc, + function(atc) x_ %in% atc + ))[1L] + }, + USE.NAMES = FALSE + )] + x_new[known_codes_cid] <- AMR_env$AV_lookup$av[match(x[known_codes_cid], AMR_env$AV_lookup$cid)] + previously_coerced <- x %in% AMR_env$av_previously_coerced$x + x_new[previously_coerced & is.na(x_new)] <- AMR_env$av_previously_coerced$av[match(x[is.na(x_new) & x %in% AMR_env$av_previously_coerced$x], AMR_env$av_previously_coerced$x)] + already_known <- known_names | known_codes_av | known_codes_atc | known_codes_cid | previously_coerced + + # fix for NAs + x_new[is.na(x)] <- NA + already_known[is.na(x)] <- FALSE + + if (isTRUE(initial_search) && sum(already_known) < length(x)) { + progress <- progress_ticker(n = sum(!already_known), n_min = 25, print = info) # start if n >= 25 + on.exit(close(progress)) + } + + for (i in which(!already_known)) { + if (isTRUE(initial_search)) { + progress$tick() + } + + if (is.na(x[i]) || is.null(x[i])) { + next + } + if (identical(x[i], "") || + # prevent "bacteria" from coercing to TMP, since Bacterial is a brand name of it: + identical(tolower(x[i]), "bacteria")) { + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + next + } + if (x[i] %like_case% "[A-Z][0-9][0-9][A-Z][A-Z][0-9][0-9]") { + # seems an ATC code, but the available ones are in `already_known`, so: + x_unknown <- c(x_unknown, x[i]) + x_unknown_ATCs <- c(x_unknown_ATCs, x[i]) + x_new[i] <- NA_character_ + next + } + + if (fast_mode == FALSE && flag_multiple_results == TRUE && x[i] %like% "[ ]") { + from_text <- tryCatch(suppressWarnings(av_from_text(x[i], initial_search = FALSE, translate_av = FALSE)[[1]]), + error = function(e) character(0) + ) + } else { + from_text <- character(0) + } + + # old code for phenoxymethylpenicillin (Peni V) + if (x[i] == "PNV") { + x_new[i] <- "PHN" + next + } + + # exact LOINC code + loinc_found <- unlist(lapply( + AMR_env$AV_lookup$generalised_loinc, + function(s) x[i] %in% s + )) + found <- AMR_env$AV_lookup$av[loinc_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # exact synonym + synonym_found <- unlist(lapply( + AMR_env$AV_lookup$generalised_synonyms, + function(s) x[i] %in% s + )) + found <- AMR_env$AV_lookup$av[synonym_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # length of input is quite long, and Levenshtein distance is only max 2 + if (nchar(x[i]) >= 10) { + levenshtein <- as.double(utils::adist(x[i], AMR_env$AV_lookup$generalised_name)) + if (any(levenshtein <= 2)) { + found <- AMR_env$AV_lookup$av[which(levenshtein <= 2)] + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # allow characters that resemble others, but only continue when having more than 3 characters + if (nchar(x[i]) <= 3) { + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + next + } + x_spelling <- x[i] + if (already_regex == FALSE) { + x_spelling <- gsub("[IY]+", "[IY]+", x_spelling, perl = TRUE) + x_spelling <- gsub("(C|K|Q|QU|S|Z|X|KS)+", "(C|K|Q|QU|S|Z|X|KS)+", x_spelling, perl = TRUE) + x_spelling <- gsub("(PH|F|V)+", "(PH|F|V)+", x_spelling, perl = TRUE) + x_spelling <- gsub("(TH|T)+", "(TH|T)+", x_spelling, perl = TRUE) + x_spelling <- gsub("A+", "A+", x_spelling, perl = TRUE) + x_spelling <- gsub("E+", "E+", x_spelling, perl = TRUE) + x_spelling <- gsub("O+", "O+", x_spelling, perl = TRUE) + # allow any ending of -in/-ine and -im/-ime + x_spelling <- gsub("(\\[IY\\]\\+(N|M)|\\[IY\\]\\+(N|M)E\\+?)$", "[IY]+(N|M)E*", x_spelling, perl = TRUE) + # allow any ending of -ol/-ole + x_spelling <- gsub("(O\\+L|O\\+LE\\+)$", "O+LE*", x_spelling, perl = TRUE) + # allow any ending of -on/-one + x_spelling <- gsub("(O\\+N|O\\+NE\\+)$", "O+NE*", x_spelling, perl = TRUE) + # replace multiple same characters to single one with '+', like "ll" -> "l+" + x_spelling <- gsub("(.)\\1+", "\\1+", x_spelling, perl = TRUE) + # replace spaces and slashes with a possibility on both + x_spelling <- gsub("[ /]", "( .*|.*/)", x_spelling, perl = TRUE) + # correct for digital reading text (OCR) + x_spelling <- gsub("[NRD8B]", "[NRD8B]", x_spelling, perl = TRUE) + x_spelling <- gsub("(O|0)", "(O|0)+", x_spelling, perl = TRUE) + x_spelling <- gsub("++", "+", x_spelling, fixed = TRUE) + } + + # try if name starts with it + found <- AMR_env$AV_lookup[which(AMR_env$AV_lookup$generalised_name %like% paste0("^", x_spelling)), "av", drop = TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + # try if name ends with it + found <- AMR_env$AV_lookup[which(AMR_env$AV_lookup$generalised_name %like% paste0(x_spelling, "$")), "av", drop = TRUE] + if (nchar(x[i]) >= 4 && length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # and try if any synonym starts with it + synonym_found <- unlist(lapply( + AMR_env$AV_lookup$generalised_synonyms, + function(s) any(s %like% paste0("^", x_spelling)) + )) + found <- AMR_env$AV_lookup$av[synonym_found == TRUE] + if (length(found) > 0) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # INITIAL SEARCH - More uncertain results ---- + + if (isTRUE(initial_search) && fast_mode == FALSE) { + # only run on first try + + # try by removing all spaces + if (x[i] %like% " ") { + found <- suppressWarnings(as.av(gsub(" +", "", x[i], perl = TRUE), initial_search = FALSE)) + if (length(found) > 0 && !is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # try by removing all spaces and numbers + if (x[i] %like% " " || x[i] %like% "[0-9]") { + found <- suppressWarnings(as.av(gsub("[ 0-9]", "", x[i], perl = TRUE), initial_search = FALSE)) + if (length(found) > 0 && !is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # transform back from other languages and try again + x_translated <- paste( + lapply( + strsplit(x[i], "[^A-Z0-9]"), + function(y) { + for (i in seq_len(length(y))) { + for (lang in LANGUAGES_SUPPORTED[LANGUAGES_SUPPORTED != "en"]) { + y[i] <- ifelse(tolower(y[i]) %in% tolower(TRANSLATIONS[, lang, drop = TRUE]), + TRANSLATIONS[which(tolower(TRANSLATIONS[, lang, drop = TRUE]) == tolower(y[i]) & + !isFALSE(TRANSLATIONS$fixed)), "pattern"], + y[i] + ) + } + } + generalise_antibiotic_name(y) + } + )[[1]], + collapse = "/" + ) + x_translated_guess <- suppressWarnings(as.av(x_translated, initial_search = FALSE)) + if (!is.na(x_translated_guess)) { + x_new[i] <- x_translated_guess + next + } + + # now also try to coerce brandname combinations like "Amoxy/clavulanic acid" + x_translated <- paste( + lapply( + strsplit(x_translated, "[^A-Z0-9 ]"), + function(y) { + for (i in seq_len(length(y))) { + y_name <- suppressWarnings(av_name(y[i], language = NULL, initial_search = FALSE)) + y[i] <- ifelse(!is.na(y_name), + y_name, + y[i] + ) + } + generalise_antibiotic_name(y) + } + )[[1]], + collapse = "/" + ) + x_translated_guess <- suppressWarnings(as.av(x_translated, initial_search = FALSE)) + if (!is.na(x_translated_guess)) { + x_new[i] <- x_translated_guess + next + } + + # try by removing all trailing capitals + if (x[i] %like_case% "[a-z]+[A-Z]+$") { + found <- suppressWarnings(as.av(gsub("[A-Z]+$", "", x[i], perl = TRUE), initial_search = FALSE)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + } + + # keep only letters + found <- suppressWarnings(as.av(gsub("[^A-Z]", "", x[i], perl = TRUE), initial_search = FALSE)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # try from a bigger text, like from a health care record, see ?av_from_text + # already calculated above if flag_multiple_results = TRUE + if (flag_multiple_results == TRUE) { + found <- from_text[1L] + } else { + found <- tryCatch(suppressWarnings(av_from_text(x[i], initial_search = FALSE, translate_av = FALSE)[[1]][1L]), + error = function(e) NA_character_ + ) + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # first 5 + found <- suppressWarnings(as.av(substr(x[i], 1, 5), initial_search = FALSE)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # make all consonants facultative + search_str <- gsub("([BCDFGHJKLMNPQRSTVWXZ])", "\\1*", x[i], perl = TRUE) + found <- suppressWarnings(as.av(search_str, initial_search = FALSE, already_regex = TRUE)) + # keep at least 4 normal characters + if (nchar(gsub(".\\*", "", search_str, perl = TRUE)) < 4) { + found <- NA + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # make all vowels facultative + search_str <- gsub("([AEIOUY])", "\\1*", x[i], perl = TRUE) + found <- suppressWarnings(as.av(search_str, initial_search = FALSE, already_regex = TRUE)) + # keep at least 5 normal characters + if (nchar(gsub(".\\*", "", search_str, perl = TRUE)) < 5) { + found <- NA + } + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # allow misspelling of vowels + x_spelling <- gsub("A+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("E+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("I+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("O+", "[AEIOU]+", x_spelling, fixed = TRUE) + x_spelling <- gsub("U+", "[AEIOU]+", x_spelling, fixed = TRUE) + found <- suppressWarnings(as.av(x_spelling, initial_search = FALSE, already_regex = TRUE)) + if (!is.na(found)) { + x_new[i] <- note_if_more_than_one_found(found, i, from_text) + next + } + + # try with switched character, like "mreopenem" + for (j in seq_len(nchar(x[i]))) { + x_switched <- paste0( + # beginning part: + substr(x[i], 1, j - 1), + # here is the switching of 2 characters: + substr(x[i], j + 1, j + 1), + substr(x[i], j, j), + # ending part: + substr(x[i], j + 2, nchar(x[i])) + ) + found <- suppressWarnings(as.av(x_switched, initial_search = FALSE)) + if (!is.na(found)) { + break + } + } + if (!is.na(found)) { + x_new[i] <- found[1L] + next + } + } # end of initial_search = TRUE + + # not found + x_unknown <- c(x_unknown, x_bak[x[i] == x_bak_clean][1]) + } + + if (isTRUE(initial_search) && sum(already_known) < length(x)) { + close(progress) + } + + # save to package env to save time for next time + if (isTRUE(initial_search)) { + AMR_env$av_previously_coerced <- AMR_env$av_previously_coerced[which(!AMR_env$av_previously_coerced$x %in% x), , drop = FALSE] + AMR_env$av_previously_coerced <- unique(rbind_AMR( + AMR_env$av_previously_coerced, + data.frame( + x = x, + av = x_new, + x_bak = x_bak[match(x, x_bak_clean)], + stringsAsFactors = FALSE + ) + )) + } + + # take failed ATC codes apart from rest + if (length(x_unknown_ATCs) > 0 && fast_mode == FALSE) { + warning_( + "in `as.av()`: these ATC codes are not (yet) in the antivirals data set: ", + vector_and(x_unknown_ATCs), "." + ) + } + x_unknown <- x_unknown[!x_unknown %in% x_unknown_ATCs] + x_unknown <- c( + x_unknown, + AMR_env$av_previously_coerced$x_bak[which(AMR_env$av_previously_coerced$x %in% x & is.na(AMR_env$av_previously_coerced$av))] + ) + if (length(x_unknown) > 0 && fast_mode == FALSE) { + warning_( + "in `as.av()`: these values could not be coerced to a valid antiviral drug ID: ", + vector_and(x_unknown), "." + ) + } + + x_result <- x_new[match(x_bak_clean, x)] + if (length(x_result) == 0) { + x_result <- NA_character_ + } + + set_clean_class(x_result, + new_class = c("av", "character") + ) +} + +#' @rdname as.av +#' @export +is.av <- function(x) { + inherits(x, "av") +} + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.av <- function(x, ...) { + out <- trimws(format(x)) + out[!is.na(x)] <- gsub("+", font_subtle("+"), out[!is.na(x)], fixed = TRUE) + out[is.na(x)] <- font_na(NA) + create_pillar_column(out, align = "left", min_width = 4) +} + +# will be exported using s3_register() in R/zzz.R +type_sum.av <- function(x, ...) { + "av" +} + +#' @method print av +#' @export +#' @noRd +print.av <- function(x, ...) { + cat("Class 'av'\n") + print(as.character(x), quote = FALSE) +} + +#' @method as.data.frame av +#' @export +#' @noRd +as.data.frame.av <- function(x, ...) { + nm <- deparse1(substitute(x)) + if (!"nm" %in% names(list(...))) { + as.data.frame.vector(as.av(x), ..., nm = nm) + } else { + as.data.frame.vector(as.av(x), ...) + } +} +#' @method [ av +#' @export +#' @noRd +"[.av" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [[ av +#' @export +#' @noRd +"[[.av" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [<- av +#' @export +#' @noRd +"[<-.av" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + return_after_integrity_check(y, "antiviral drug code", AMR_env$AV_lookup$av) +} +#' @method [[<- av +#' @export +#' @noRd +"[[<-.av" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + return_after_integrity_check(y, "antiviral drug code", AMR_env$AV_lookup$av) +} +#' @method c av +#' @export +#' @noRd +c.av <- function(...) { + x <- list(...)[[1L]] + y <- NextMethod() + attributes(y) <- attributes(x) + return_after_integrity_check(y, "antiviral drug code", AMR_env$AV_lookup$av) +} + +#' @method unique av +#' @export +#' @noRd +unique.av <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method rep av +#' @export +#' @noRd +rep.av <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +get_translate_av <- function(translate_av) { + translate_av <- as.character(translate_av)[1L] + if (translate_av %in% c("TRUE", "official")) { + return("name") + } else if (translate_av %in% c(NA_character_, "FALSE")) { + return(FALSE) + } else { + translate_av <- tolower(translate_av) + stop_ifnot(translate_av %in% colnames(AMR::antivirals), + "invalid value for 'translate_av', this must be a column name of the antivirals data set\n", + "or TRUE (equals 'name') or FALSE to not translate at all.", + call = FALSE + ) + translate_av + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/av_from_text.R + + + + +#' Retrieve Antiviral Drug Names and Doses from Clinical Text +#' +#' Use this function on e.g. clinical texts from health care records. It returns a [list] with all antiviral drugs, doses and forms of administration found in the texts. +#' @param text text to analyse +#' @param type type of property to search for, either `"drug"`, `"dose"` or `"administration"`, see *Examples* +#' @param collapse a [character] to pass on to `paste(, collapse = ...)` to only return one [character] per element of `text`, see *Examples* +#' @param translate_av if `type = "drug"`: a column name of the [antivirals] data set to translate the antibiotic abbreviations to, using [av_property()]. The default is `FALSE`. Using `TRUE` is equal to using "name". +#' @param thorough_search a [logical] to indicate whether the input must be extensively searched for misspelling and other faulty input values. Setting this to `TRUE` will take considerably more time than when using `FALSE`. At default, it will turn `TRUE` when all input elements contain a maximum of three words. +#' @param info a [logical] to indicate whether a progress bar should be printed - the default is `TRUE` only in interactive mode +#' @param ... arguments passed on to [as.av()] +#' @details This function is also internally used by [as.av()], although it then only searches for the first drug name and will throw a note if more drug names could have been returned. Note: the [as.av()] function may use very long regular expression to match brand names of antiviral drugs. This may fail on some systems. +#' +#' ### Argument `type` +#' At default, the function will search for antiviral drug names. All text elements will be searched for official names, ATC codes and brand names. As it uses [as.av()] internally, it will correct for misspelling. +#' +#' With `type = "dose"` (or similar, like "dosing", "doses"), all text elements will be searched for [numeric] values that are higher than 100 and do not resemble years. The output will be [numeric]. It supports any unit (g, mg, IE, etc.) and multiple values in one clinical text, see *Examples*. +#' +#' With `type = "administration"` (or abbreviations, like "admin", "adm"), all text elements will be searched for a form of drug administration. It supports the following forms (including common abbreviations): buccal, implant, inhalation, instillation, intravenous, nasal, oral, parenteral, rectal, sublingual, transdermal and vaginal. Abbreviations for oral (such as 'po', 'per os') will become "oral", all values for intravenous (such as 'iv', 'intraven') will become "iv". It supports multiple values in one clinical text, see *Examples*. +#' +#' ### Argument `collapse` +#' Without using `collapse`, this function will return a [list]. This can be convenient to use e.g. inside a `mutate()`):\cr +#' `df %>% mutate(avx = av_from_text(clinical_text))` +#' +#' The returned AV codes can be transformed to official names, groups, etc. with all [`av_*`][av_property()] functions such as [av_name()] and [av_group()], or by using the `translate_av` argument. +#' +#' With using `collapse`, this function will return a [character]:\cr +#' `df %>% mutate(avx = av_from_text(clinical_text, collapse = "|"))` +#' @export +#' @return A [list], or a [character] if `collapse` is not `NULL` +#' @examples +#' av_from_text("28/03/2020 valaciclovir po tid") +#' av_from_text("28/03/2020 valaciclovir po tid", type = "admin") +av_from_text <- function(text, + type = c("drug", "dose", "administration"), + collapse = NULL, + translate_av = FALSE, + thorough_search = NULL, + info = interactive(), + ...) { + if (missing(type)) { + type <- type[1L] + } + + meet_criteria(text) + meet_criteria(type, allow_class = "character", has_length = 1) + meet_criteria(collapse, has_length = 1, allow_NULL = TRUE) + meet_criteria(translate_av, allow_NULL = FALSE) # get_translate_av() will be more informative about what's allowed + meet_criteria(thorough_search, allow_class = "logical", has_length = 1, allow_NULL = TRUE) + meet_criteria(info, allow_class = "logical", has_length = 1) + + type <- tolower(trimws2(type)) + + text <- tolower(as.character(text)) + text_split_all <- strsplit(text, "[ ;.,:\\|]") + progress <- progress_ticker(n = length(text_split_all), n_min = 5, print = info) + on.exit(close(progress)) + + if (type %like% "(drug|ab|anti)") { + translate_av <- get_translate_av(translate_av) + + if (isTRUE(thorough_search) || + (isTRUE(is.null(thorough_search)) && max(vapply(FUN.VALUE = double(1), text_split_all, length), na.rm = TRUE) <= 3)) { + text_split_all <- text_split_all[nchar(text_split_all) >= 4 & grepl("[a-z]+", text_split_all)] + result <- lapply(text_split_all, function(text_split) { + progress$tick() + suppressWarnings( + as.av(text_split, ...) + ) + }) + } else { + # no thorough search + names_atc <- substr(c(AMR::antivirals$name, AMR::antivirals$atc), 1, 5) + synonyms <- unlist(AMR::antivirals$synonyms) + synonyms <- synonyms[nchar(synonyms) >= 4] + # regular expression must not be too long, so split synonyms in two: + synonyms_part1 <- synonyms[seq_len(0.5 * length(synonyms))] + synonyms_part2 <- synonyms[!synonyms %in% synonyms_part1] + to_regex <- function(x) { + paste0( + "^(", + paste0(unique(gsub("[^a-z0-9]+", "", sort(tolower(x)))), collapse = "|"), + ").*" + ) + } + result <- lapply(text_split_all, function(text_split) { + progress$tick() + suppressWarnings( + as.av( + unique(c( + text_split[text_split %like_case% to_regex(names_atc)], + text_split[text_split %like_case% to_regex(synonyms_part1)], + text_split[text_split %like_case% to_regex(synonyms_part2)] + )), + ... + ) + ) + }) + } + + close(progress) + + result <- lapply(result, function(out) { + out <- out[!is.na(out)] + if (length(out) == 0) { + as.av(NA) + } else { + if (!isFALSE(translate_av)) { + out <- av_property(out, property = translate_av, initial_search = FALSE) + } + out + } + }) + } else if (type %like% "dos") { + text_split_all <- strsplit(text, " ", fixed = TRUE) + result <- lapply(text_split_all, function(text_split) { + text_split <- text_split[text_split %like% "^[0-9]{2,}(/[0-9]+)?[a-z]*$"] + # only left part of "/", like 500 in "500/125" + text_split <- gsub("/.*", "", text_split) + text_split <- gsub(",", ".", text_split, fixed = TRUE) # foreign system using comma as decimal sep + text_split <- as.double(gsub("[^0-9.]", "", text_split)) + # minimal 100 units/mg and no years that unlikely doses + text_split <- text_split[text_split >= 100 & !text_split %in% c(1951:1999, 2001:2049)] + + if (length(text_split) > 0) { + text_split + } else { + NA_real_ + } + }) + } else if (type %like% "adm") { + result <- lapply(text_split_all, function(text_split) { + text_split <- text_split[text_split %like% "(^iv$|intraven|^po$|per os|oral|implant|inhal|instill|nasal|paren|rectal|sublingual|buccal|trans.*dermal|vaginal)"] + if (length(text_split) > 0) { + text_split <- gsub("(^po$|.*per os.*)", "oral", text_split) + text_split <- gsub("(^iv$|.*intraven.*)", "iv", text_split) + text_split + } else { + NA_character_ + } + }) + } else { + stop_("`type` must be either 'drug', 'dose' or 'administration'") + } + + # collapse text if needed + if (!is.null(collapse)) { + result <- vapply(FUN.VALUE = character(1), result, function(x) { + if (length(x) == 1 & all(is.na(x))) { + NA_character_ + } else { + paste0(x, collapse = collapse) + } + }) + } + + result +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/av_property.R + + + + +#' Get Properties of an Antiviral Drug +#' +#' Use these functions to return a specific property of an antiviral drug from the [antivirals] data set. All input values will be evaluated internally with [as.av()]. +#' @param x any (vector of) text that can be coerced to a valid antiviral drug code with [as.av()] +#' @param tolower a [logical] to indicate whether the first [character] of every output should be transformed to a lower case [character]. +#' @param property one of the column names of one of the [antivirals] data set: `vector_or(colnames(antivirals), sort = FALSE)`. +#' @param language language of the returned text - the default is system language (see [get_AMR_locale()]) and can also be set with the package option [`AMR_locale`][AMR-options]. Use `language = NULL` or `language = ""` to prevent translation. +#' @param administration way of administration, either `"oral"` or `"iv"` +#' @param open browse the URL using [utils::browseURL()] +#' @param ... other arguments passed on to [as.av()] +#' @details All output [will be translated][translate] where possible. +#' +#' The function [av_url()] will return the direct URL to the official WHO website. A warning will be returned if the required ATC code is not available. +#' @inheritSection as.av Source +#' @rdname av_property +#' @name av_property +#' @return +#' - An [integer] in case of [av_cid()] +#' - A named [list] in case of [av_info()] and multiple [av_atc()]/[av_synonyms()]/[av_tradenames()] +#' - A [double] in case of [av_ddd()] +#' - A [character] in all other cases +#' @export +#' @seealso [antivirals] +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' # all properties: +#' av_name("ACI") +#' av_atc("ACI") +#' av_cid("ACI") +#' av_synonyms("ACI") +#' av_tradenames("ACI") +#' av_group("ACI") +#' av_url("ACI") +#' +#' # lowercase transformation +#' av_name(x = c("ACI", "VALA")) +#' av_name(x = c("ACI", "VALA"), tolower = TRUE) +#' +#' # defined daily doses (DDD) +#' av_ddd("ACI", "oral") +#' av_ddd_units("ACI", "oral") +#' av_ddd("ACI", "iv") +#' av_ddd_units("ACI", "iv") +#' +#' av_info("ACI") # all properties as a list +#' +#' # all av_* functions use as.av() internally, so you can go from 'any' to 'any': +#' av_atc("ACI") +#' av_group("J05AB01") +#' av_loinc("abacavir") +#' av_name("29113-8") +#' av_name(135398513) +#' av_name("J05AB01") +av_name <- function(x, language = get_AMR_locale(), tolower = FALSE, ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(tolower, allow_class = "logical", has_length = 1) + + x <- translate_into_language(av_validate(x = x, property = "name", ...), language = language, only_affect_ab_names = TRUE) + if (tolower == TRUE) { + # use perl to only transform the first character + # as we want "polymyxin B", not "polymyxin b" + x <- gsub("^([A-Z])", "\\L\\1", x, perl = TRUE) + } + x +} + +#' @rdname av_property +#' @export +av_cid <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + av_validate(x = x, property = "cid", ...) +} + +#' @rdname av_property +#' @export +av_synonyms <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + syns <- av_validate(x = x, property = "synonyms", ...) + names(syns) <- x + if (length(syns) == 1) { + unname(unlist(syns)) + } else { + syns + } +} + +#' @rdname av_property +#' @export +av_tradenames <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + av_synonyms(x, ...) +} + +#' @rdname av_property +#' @export +av_group <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + translate_into_language(av_validate(x = x, property = "atc_group", ...), language = language, only_affect_ab_names = TRUE) +} + +#' @rdname av_property +#' @export +av_atc <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + # ATCs in the antivirals data set are not a list + av_validate(x = x, property = "atc", ...) +} + +#' @rdname av_property +#' @export +av_loinc <- function(x, ...) { + meet_criteria(x, allow_NA = TRUE) + loincs <- av_validate(x = x, property = "loinc", ...) + names(loincs) <- x + if (length(loincs) == 1) { + unname(unlist(loincs)) + } else { + loincs + } +} + +#' @rdname av_property +#' @export +av_ddd <- function(x, administration = "oral", ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(administration, is_in = c("oral", "iv"), has_length = 1) + + x <- as.av(x, ...) + ddd_prop <- paste0(administration, "_ddd") + out <- av_validate(x = x, property = ddd_prop) + + if (any(av_name(x, language = NULL) %like% "/" & is.na(out))) { + warning_( + "in `av_ddd()`: DDDs of some combined products are available for different dose combinations and not (yet) part of the AMR package.", + "Please refer to the WHOCC website:\n", + "atcddd.fhi.no/ddd/list_of_ddds_combined_products/" + ) + } + out +} + +#' @rdname av_property +#' @export +av_ddd_units <- function(x, administration = "oral", ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(administration, is_in = c("oral", "iv"), has_length = 1) + + x <- as.av(x, ...) + ddd_prop <- paste0(administration, "_units") + out <- av_validate(x = x, property = ddd_prop) + + if (any(av_name(x, language = NULL) %like% "/" & is.na(out))) { + warning_( + "in `av_ddd_units()`: DDDs of some combined products are available for different dose combinations and not (yet) part of the AMR package.", + "Please refer to the WHOCC website:\n", + "atcddd.fhi.no/ddd/list_of_ddds_combined_products/" + ) + } + out +} + +#' @rdname av_property +#' @export +av_info <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + + x <- as.av(x, ...) + list( + av = as.character(x), + cid = av_cid(x), + name = av_name(x, language = language), + group = av_group(x, language = language), + atc = av_atc(x), + tradenames = av_tradenames(x), + loinc = av_loinc(x), + ddd = list( + oral = list( + amount = av_ddd(x, administration = "oral"), + units = av_ddd_units(x, administration = "oral") + ), + iv = list( + amount = av_ddd(x, administration = "iv"), + units = av_ddd_units(x, administration = "iv") + ) + ) + ) +} + + +#' @rdname av_property +#' @export +av_url <- function(x, open = FALSE, ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(open, allow_class = "logical", has_length = 1) + + av <- as.av(x = x, ...) + atcs <- av_atc(av, only_first = TRUE) + u <- paste0("https://atcddd.fhi.no/atc_ddd_index/?code=", atcs, "&showdescription=no") + u[is.na(atcs)] <- NA_character_ + names(u) <- av_name(av) + + NAs <- av_name(av, tolower = TRUE, language = NULL)[!is.na(av) & is.na(atcs)] + if (length(NAs) > 0) { + warning_("in `av_url()`: no ATC code available for ", vector_and(NAs, quotes = FALSE), ".") + } + + if (open == TRUE) { + if (length(u) > 1 && !is.na(u[1L])) { + warning_("in `av_url()`: only the first URL will be opened, as `browseURL()` only suports one string.") + } + if (!is.na(u[1L])) { + utils::browseURL(u[1L]) + } + } + u +} + +#' @rdname av_property +#' @export +av_property <- function(x, property = "name", language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(property, is_in = colnames(AMR::antivirals), has_length = 1) + language <- validate_language(language) + translate_into_language(av_validate(x = x, property = property, ...), language = language) +} + +av_validate <- function(x, property, ...) { + if (tryCatch(all(x[!is.na(x)] %in% AMR_env$AV_lookup$av), error = function(e) FALSE)) { + # special case for av_* functions where class is already 'av' + x <- AMR_env$AV_lookup[match(x, AMR_env$AV_lookup$av), property, drop = TRUE] + } else { + # try to catch an error when inputting an invalid argument + # so the 'call.' can be set to FALSE + tryCatch(x[1L] %in% AMR_env$AV_lookup[1, property, drop = TRUE], + error = function(e) stop(e$message, call. = FALSE) + ) + + if (!all(x %in% AMR_env$AV_lookup[, property, drop = TRUE])) { + x <- as.av(x, ...) + if (all(is.na(x)) && is.list(AMR_env$AV_lookup[, property, drop = TRUE])) { + x <- rep(NA_character_, length(x)) + } else { + x <- AMR_env$AV_lookup[match(x, AMR_env$AV_lookup$av), property, drop = TRUE] + } + } + } + + if (property == "av") { + return(set_clean_class(x, new_class = c("av", "character"))) + } else if (property == "cid") { + return(as.integer(x)) + } else if (property %like% "ddd") { + return(as.double(x)) + } else { + x[is.na(x)] <- NA + return(x) + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/availability.R + + + + +#' Check Availability of Columns +#' +#' Easy check for data availability of all columns in a data set. This makes it easy to get an idea of which antimicrobial combinations can be used for calculation with e.g. [susceptibility()] and [resistance()]. +#' @param tbl a [data.frame] or [list] +#' @param width number of characters to present the visual availability - the default is filling the width of the console +#' @details The function returns a [data.frame] with columns `"resistant"` and `"visual_resistance"`. The values in that columns are calculated with [resistance()]. +#' @return [data.frame] with column names of `tbl` as row names +#' @export +#' @examples +#' availability(example_isolates) +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' filter(mo == as.mo("Escherichia coli")) %>% +#' select_if(is.sir) %>% +#' availability() +#' } +#' } +availability <- function(tbl, width = NULL) { + meet_criteria(tbl, allow_class = "data.frame") + meet_criteria(width, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive = TRUE, is_finite = TRUE) + + tbl <- as.data.frame(tbl, stringsAsFactors = FALSE) + + x <- vapply(FUN.VALUE = double(1), tbl, function(x) { + 1 - sum(is.na(x)) / length(x) + }) + n <- vapply(FUN.VALUE = double(1), tbl, function(x) length(x[!is.na(x)])) + R <- vapply(FUN.VALUE = double(1), tbl, function(x) ifelse(is.sir(x), resistance(x, minimum = 0), NA_real_)) + R_print <- character(length(R)) + R_print[!is.na(R)] <- percentage(R[!is.na(R)]) + R_print[is.na(R)] <- "" + + if (is.null(width)) { + width <- getOption("width", 100) - + (max(nchar(colnames(tbl))) + + # count col + 8 + + # available % column + 10 + + # resistant % column + 10 + + # extra margin + 5) + width <- width / 2 + } + + if (length(R[is.na(R)]) == ncol(tbl)) { + width <- width * 2 + 10 + } + + x_chars_R <- strrep("#", round(width * R, digits = 2)) + x_chars_SI <- strrep("-", width - nchar(x_chars_R)) + vis_resistance <- paste0("|", x_chars_R, x_chars_SI, "|") + vis_resistance[is.na(R)] <- "" + + x_chars <- strrep("#", round(x, digits = 2) / (1 / width)) + x_chars_empty <- strrep("-", width - nchar(x_chars)) + + df <- data.frame( + count = n, + available = percentage(x), + visual_availabilty = paste0("|", x_chars, x_chars_empty, "|"), + resistant = R_print, + visual_resistance = vis_resistance, + stringsAsFactors = FALSE + ) + if (length(R[is.na(R)]) == ncol(tbl)) { + df[, 1:3, drop = FALSE] + } else { + df + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/bug_drug_combinations.R + + + + +#' Determine Bug-Drug Combinations +#' +#' Determine antimicrobial resistance (AMR) of all bug-drug combinations in your data set where at least 30 (default) isolates are available per species. Use [format()] on the result to prettify it to a publishable/printable format, see *Examples*. +#' @inheritParams eucast_rules +#' @param combine_SI a [logical] to indicate whether values S, SDD, and I should be summed, so resistance will be based on only R - the default is `TRUE` +#' @param add_ab_group a [logical] to indicate where the group of the antimicrobials must be included as a first column +#' @param remove_intrinsic_resistant [logical] to indicate that rows and columns with 100% resistance for all tested antimicrobials must be removed from the table +#' @param FUN the function to call on the `mo` column to transform the microorganism codes - the default is [mo_shortname()] +#' @param translate_ab a [character] of length 1 containing column names of the [antibiotics] data set +#' @param ... arguments passed on to `FUN` +#' @inheritParams sir_df +#' @inheritParams base::formatC +#' @details The function [format()] calculates the resistance per bug-drug combination and returns a table ready for reporting/publishing. Use `combine_SI = TRUE` (default) to test R vs. S+I and `combine_SI = FALSE` to test R+I vs. S. This table can also directly be used in R Markdown / Quarto without the need for e.g. [knitr::kable()]. +#' @export +#' @rdname bug_drug_combinations +#' @return The function [bug_drug_combinations()] returns a [data.frame] with columns "mo", "ab", "S", "SDD", "I", "R", and "total". +#' @examples +#' # example_isolates is a data set available in the AMR package. +#' # run ?example_isolates for more info. +#' example_isolates +#' +#' \donttest{ +#' x <- bug_drug_combinations(example_isolates) +#' head(x) +#' format(x, translate_ab = "name (atc)") +#' +#' # Use FUN to change to transformation of microorganism codes +#' bug_drug_combinations(example_isolates, +#' FUN = mo_gramstain +#' ) +#' +#' bug_drug_combinations(example_isolates, +#' FUN = function(x) { +#' ifelse(x == as.mo("Escherichia coli"), +#' "E. coli", +#' "Others" +#' ) +#' } +#' ) +#' } +bug_drug_combinations <- function(x, + col_mo = NULL, + FUN = mo_shortname, + ...) { + meet_criteria(x, allow_class = "data.frame") + x <- ascertain_sir_classes(x, "x") + meet_criteria(col_mo, allow_class = "character", is_in = colnames(x), has_length = 1, allow_NULL = TRUE) + meet_criteria(FUN, allow_class = "function", has_length = 1) + + # try to find columns based on type + # -- mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo") + stop_if(is.null(col_mo), "`col_mo` must be set") + } else { + stop_ifnot(col_mo %in% colnames(x), "column '", col_mo, "' (`col_mo`) not found") + } + + x.bak <- x + x <- as.data.frame(x, stringsAsFactors = FALSE) + x[, col_mo] <- FUN(x[, col_mo, drop = TRUE], ...) + + unique_mo <- sort(unique(x[, col_mo, drop = TRUE])) + + # select only groups and antibiotics + if (is_null_or_grouped_tbl(x.bak)) { + data_has_groups <- TRUE + groups <- get_group_names(x.bak) + x <- x[, c(groups, col_mo, colnames(x)[vapply(FUN.VALUE = logical(1), x, is.sir)]), drop = FALSE] + } else { + data_has_groups <- FALSE + x <- x[, c(col_mo, names(which(vapply(FUN.VALUE = logical(1), x, is.sir)))), drop = FALSE] + } + + run_it <- function(x) { + out <- data.frame( + mo = character(0), + ab = character(0), + S = integer(0), + SDD = integer(0), + I = integer(0), + R = integer(0), + total = integer(0), + stringsAsFactors = FALSE + ) + if (data_has_groups) { + group_values <- unique(x[, which(colnames(x) %in% groups), drop = FALSE]) + rownames(group_values) <- NULL + x <- x[, which(!colnames(x) %in% groups), drop = FALSE] + } + + for (i in seq_len(length(unique_mo))) { + # filter on MO group and only select SIR columns + x_mo_filter <- x[which(x[, col_mo, drop = TRUE] == unique_mo[i]), names(which(vapply(FUN.VALUE = logical(1), x, is.sir))), drop = FALSE] + # turn and merge everything + pivot <- lapply(x_mo_filter, function(x) { + m <- as.matrix(table(as.sir(x))) + data.frame(S = m["S", ], SDD = m["SDD", ], I = m["I", ], R = m["R", ], NI = m["NI", ], stringsAsFactors = FALSE) + }) + merged <- do.call(rbind_AMR, pivot) + out_group <- data.frame( + mo = rep(unique_mo[i], NROW(merged)), + ab = rownames(merged), + S = merged$S, + SDD = merged$SDD, + I = merged$I, + R = merged$R, + NI = merged$NI, + total = merged$S + merged$SDD + merged$I + merged$R + merged$NI, + stringsAsFactors = FALSE + ) + if (data_has_groups) { + if (nrow(group_values) < nrow(out_group)) { + # repeat group_values for the number of rows in out_group + repeated <- rep(seq_len(nrow(group_values)), + each = nrow(out_group) / nrow(group_values) + ) + group_values <- group_values[repeated, , drop = FALSE] + } + out_group <- cbind(group_values, out_group) + } + out <- rbind_AMR(out, out_group) + } + out + } + # based on pm_apply_grouped_function + apply_group <- function(.data, fn, groups, drop = FALSE, ...) { + grouped <- pm_split_into_groups(.data, groups, drop) + res <- do.call(rbind_AMR, unname(lapply(grouped, fn, ...))) + if (any(groups %in% colnames(res))) { + class(res) <- c("grouped_data", class(res)) + res <- pm_set_groups(res, groups[groups %in% colnames(res)]) + } + res + } + + if (data_has_groups) { + out <- apply_group(x, "run_it", groups) + } else { + out <- run_it(x) + } + out <- out %pm>% pm_arrange(mo, ab) + out <- as_original_data_class(out, class(x.bak)) # will remove tibble groups + rownames(out) <- NULL + structure(out, class = c("bug_drug_combinations", ifelse(data_has_groups, "grouped", character(0)), class(out))) +} + +#' @method format bug_drug_combinations +#' @export +#' @rdname bug_drug_combinations +format.bug_drug_combinations <- function(x, + translate_ab = "name (ab, atc)", + language = get_AMR_locale(), + minimum = 30, + combine_SI = TRUE, + add_ab_group = TRUE, + remove_intrinsic_resistant = FALSE, + decimal.mark = getOption("OutDec"), + big.mark = ifelse(decimal.mark == ",", ".", ","), + ...) { + meet_criteria(x, allow_class = "data.frame") + meet_criteria(translate_ab, allow_class = c("character", "logical"), has_length = 1, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(add_ab_group, allow_class = "logical", has_length = 1) + meet_criteria(remove_intrinsic_resistant, allow_class = "logical", has_length = 1) + meet_criteria(decimal.mark, allow_class = "character", has_length = 1) + meet_criteria(big.mark, allow_class = "character", has_length = 1) + + x.bak <- x + if (inherits(x, "grouped")) { + # bug_drug_combinations() has been run on groups, so de-group here + warning_("in `format()`: formatting the output of `bug_drug_combinations()` does not support grouped variables, they were ignored") + x <- as.data.frame(x, stringsAsFactors = FALSE) + idx <- split(seq_len(nrow(x)), paste0(x$mo, "%%", x$ab)) + x <- data.frame( + mo = gsub("(.*)%%(.*)", "\\1", names(idx)), + ab = gsub("(.*)%%(.*)", "\\2", names(idx)), + S = vapply(FUN.VALUE = double(1), idx, function(i) sum(x$S[i], na.rm = TRUE)), + SDD = vapply(FUN.VALUE = double(1), idx, function(i) sum(x$SDD[i], na.rm = TRUE)), + I = vapply(FUN.VALUE = double(1), idx, function(i) sum(x$I[i], na.rm = TRUE)), + R = vapply(FUN.VALUE = double(1), idx, function(i) sum(x$R[i], na.rm = TRUE)), + NI = vapply(FUN.VALUE = double(1), idx, function(i) sum(x$NI[i], na.rm = TRUE)), + total = vapply(FUN.VALUE = double(1), idx, function(i) { + sum(x$S[i], na.rm = TRUE) + + sum(x$SDD[i], na.rm = TRUE) + + sum(x$I[i], na.rm = TRUE) + + sum(x$R[i], na.rm = TRUE) + + sum(x$NI[i], na.rm = TRUE) + }), + stringsAsFactors = FALSE + ) + } + + x <- as.data.frame(x, stringsAsFactors = FALSE) + x <- subset(x, total >= minimum) + + if (remove_intrinsic_resistant == TRUE) { + x <- subset(x, R != total) + } + if (combine_SI == TRUE) { + x$isolates <- x$R + } else { + x$isolates <- x$R + x$I + x$SDD + } + + give_ab_name <- function(ab, format, language) { + format <- tolower(format) + ab_txt <- rep(format, length(ab)) + for (i in seq_len(length(ab_txt))) { + ab_txt[i] <- gsub("ab", as.character(as.ab(ab[i])), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("cid", ab_cid(ab[i]), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("group", ab_group(ab[i], language = language), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("atc_group1", ab_atc_group1(ab[i], language = language), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("atc_group2", ab_atc_group2(ab[i], language = language), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("atc", ab_atc(ab[i], only_first = TRUE), ab_txt[i], fixed = TRUE) + ab_txt[i] <- gsub("name", ab_name(ab[i], language = language), ab_txt[i], fixed = TRUE) + ab_txt[i] + } + ab_txt + } + + remove_NAs <- function(.data) { + cols <- colnames(.data) + .data <- as.data.frame(lapply(.data, function(x) ifelse(is.na(x), "", x)), + stringsAsFactors = FALSE + ) + colnames(.data) <- cols + .data + } + + create_var <- function(.data, ...) { + dots <- list(...) + for (i in seq_len(length(dots))) { + .data[, names(dots)[i]] <- dots[[i]] + } + .data + } + + y <- x %pm>% + create_var( + ab = as.ab(x$ab), + ab_txt = give_ab_name(ab = x$ab, format = translate_ab, language = language) + ) %pm>% + pm_group_by(ab, ab_txt, mo) %pm>% + pm_summarise( + isolates = sum(isolates, na.rm = TRUE), + total = sum(total, na.rm = TRUE) + ) %pm>% + pm_ungroup() + + y <- y %pm>% + create_var(txt = paste0( + percentage(y$isolates / y$total, decimal.mark = decimal.mark, big.mark = big.mark), + " (", trimws(format(y$isolates, big.mark = big.mark)), "/", + trimws(format(y$total, big.mark = big.mark)), ")" + )) %pm>% + pm_select(ab, ab_txt, mo, txt) %pm>% + pm_arrange(mo) + + # replace tidyr::pivot_wider() from here + for (i in unique(y$mo)) { + mo_group <- y[which(y$mo == i), c("ab", "txt"), drop = FALSE] + colnames(mo_group) <- c("ab", i) + rownames(mo_group) <- NULL + y <- y %pm>% + pm_left_join(mo_group, by = "ab") + } + y <- y %pm>% + pm_distinct(ab, .keep_all = TRUE) %pm>% + pm_select(-mo, -txt) %pm>% + # replace tidyr::pivot_wider() until here + remove_NAs() + + select_ab_vars <- function(.data) { + .data[, c("ab_group", "ab_txt", colnames(.data)[!colnames(.data) %in% c("ab_group", "ab_txt", "ab")]), drop = FALSE] + } + + y <- y %pm>% + create_var(ab_group = ab_group(y$ab, language = language)) %pm>% + select_ab_vars() %pm>% + pm_arrange(ab_group, ab_txt) + y <- y %pm>% + create_var(ab_group = ifelse(y$ab_group != pm_lag(y$ab_group) | is.na(pm_lag(y$ab_group)), y$ab_group, "")) + + if (add_ab_group == FALSE) { + y <- y %pm>% + pm_select(-ab_group) %pm>% + pm_rename("Drug" = ab_txt) + colnames(y)[1] <- translate_into_language(colnames(y)[1], language, only_unknown = FALSE) + } else { + y <- y %pm>% + pm_rename( + "Group" = ab_group, + "Drug" = ab_txt + ) + } + + if (!is.null(language)) { + colnames(y) <- translate_into_language(colnames(y), language, only_unknown = FALSE) + } + + if (remove_intrinsic_resistant == TRUE) { + y <- y[, !vapply(FUN.VALUE = logical(1), y, function(col) all(col %like% "100", na.rm = TRUE) & !anyNA(col)), drop = FALSE] + } + + rownames(y) <- NULL + as_original_data_class(y, class(x.bak), extra_class = "formatted_bug_drug_combinations") # will remove tibble groups +} + +# will be exported in zzz.R +knit_print.formatted_bug_drug_combinations <- function(x, ...) { + stop_ifnot_installed("knitr") + # make columns with MO names italic according to nomenclature + colnames(x)[3:NCOL(x)] <- italicise_taxonomy(colnames(x)[3:NCOL(x)], type = "markdown") + knitr::asis_output(paste("", "", knitr::kable(x, ...), collapse = "\n")) +} + +#' @method print bug_drug_combinations +#' @export +print.bug_drug_combinations <- function(x, ...) { + x_class <- class(x) + print( + set_clean_class(x, + new_class = x_class[!x_class %in% c("bug_drug_combinations", "grouped")] + ), + ... + ) + message_("Use 'format()' on this result to get a publishable/printable format.", + ifelse(inherits(x, "grouped"), " Note: The grouping variable(s) will be ignored.", ""), + as_note = FALSE + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/count.R + + + + +#' Count Available Isolates +#' +#' @description These functions can be used to count resistant/susceptible microbial isolates. All functions support quasiquotation with pipes, can be used in `summarise()` from the `dplyr` package and also support grouped variables, see *Examples*. +#' +#' [count_resistant()] should be used to count resistant isolates, [count_susceptible()] should be used to count susceptible isolates. +#' @param ... one or more vectors (or columns) with antibiotic interpretations. They will be transformed internally with [as.sir()] if needed. +#' @inheritParams proportion +#' @inheritSection as.sir Interpretation of SIR +#' @details These functions are meant to count isolates. Use the [resistance()]/[susceptibility()] functions to calculate microbial resistance/susceptibility. +#' +#' The function [count_resistant()] is equal to the function [count_R()]. The function [count_susceptible()] is equal to the function [count_SI()]. +#' +#' The function [n_sir()] is an alias of [count_all()]. They can be used to count all available isolates, i.e. where all input antibiotics have an available result (S, I or R). Their use is equal to `n_distinct()`. Their function is equal to `count_susceptible(...) + count_resistant(...)`. +#' +#' The function [count_df()] takes any variable from `data` that has an [`sir`] class (created with [as.sir()]) and counts the number of S's, I's and R's. It also supports grouped variables. The function [sir_df()] works exactly like [count_df()], but adds the percentage of S, I and R. +#' @inheritSection proportion Combination Therapy +#' @seealso [`proportion_*`][proportion] to calculate microbial resistance and susceptibility. +#' @return An [integer] +#' @rdname count +#' @name count +#' @export +#' @examples +#' # example_isolates is a data set available in the AMR package. +#' # run ?example_isolates for more info. +#' +#' # base R ------------------------------------------------------------ +#' count_resistant(example_isolates$AMX) # counts "R" +#' count_susceptible(example_isolates$AMX) # counts "S" and "I" +#' count_all(example_isolates$AMX) # counts "S", "I" and "R" +#' +#' # be more specific +#' count_S(example_isolates$AMX) +#' count_SI(example_isolates$AMX) +#' count_I(example_isolates$AMX) +#' count_IR(example_isolates$AMX) +#' count_R(example_isolates$AMX) +#' +#' # Count all available isolates +#' count_all(example_isolates$AMX) +#' n_sir(example_isolates$AMX) +#' +#' # n_sir() is an alias of count_all(). +#' # Since it counts all available isolates, you can +#' # calculate back to count e.g. susceptible isolates. +#' # These results are the same: +#' count_susceptible(example_isolates$AMX) +#' susceptibility(example_isolates$AMX) * n_sir(example_isolates$AMX) +#' +#' # dplyr ------------------------------------------------------------- +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise( +#' R = count_R(CIP), +#' I = count_I(CIP), +#' S = count_S(CIP), +#' n1 = count_all(CIP), # the actual total; sum of all three +#' n2 = n_sir(CIP), # same - analogous to n_distinct +#' total = n() +#' ) # NOT the number of tested isolates! +#' +#' # Number of available isolates for a whole antibiotic class +#' # (i.e., in this data set columns GEN, TOB, AMK, KAN) +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise(across(aminoglycosides(), n_sir)) +#' +#' # Count co-resistance between amoxicillin/clav acid and gentamicin, +#' # so we can see that combination therapy does a lot more than mono therapy. +#' # Please mind that `susceptibility()` calculates percentages right away instead. +#' example_isolates %>% count_susceptible(AMC) # 1433 +#' example_isolates %>% count_all(AMC) # 1879 +#' +#' example_isolates %>% count_susceptible(GEN) # 1399 +#' example_isolates %>% count_all(GEN) # 1855 +#' +#' example_isolates %>% count_susceptible(AMC, GEN) # 1764 +#' example_isolates %>% count_all(AMC, GEN) # 1936 +#' +#' # Get number of S+I vs. R immediately of selected columns +#' example_isolates %>% +#' select(AMX, CIP) %>% +#' count_df(translate = FALSE) +#' +#' # It also supports grouping variables +#' example_isolates %>% +#' select(ward, AMX, CIP) %>% +#' group_by(ward) %>% +#' count_df(translate = FALSE) +#' } +#' } +count_resistant <- function(..., only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = "R", + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_susceptible <- function(..., only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = c("S", "SDD", "I"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_S <- function(..., only_all_tested = FALSE) { + if (message_not_thrown_before("count_S", entire_session = TRUE)) { + message_("Using `count_S()` is discouraged; use `count_susceptible()` instead to also consider \"I\" and \"SDD\" being susceptible. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = "S", + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_SI <- function(..., only_all_tested = FALSE) { + if (message_not_thrown_before("count_SI", entire_session = TRUE)) { + message_("Note that `count_SI()` will also count dose-dependent susceptibility, 'SDD'. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("S", "SDD", "I"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_I <- function(..., only_all_tested = FALSE) { + if (message_not_thrown_before("count_I", entire_session = TRUE)) { + message_("Note that `count_I()` will also count dose-dependent susceptibility, 'SDD'. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("I", "SDD"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_IR <- function(..., only_all_tested = FALSE) { + if (message_not_thrown_before("count_IR", entire_session = TRUE)) { + message_("Using `count_IR()` is discouraged; use `count_resistant()` instead to not consider \"I\" and \"SDD\" being resistant. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("I", "SDD", "R"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_R <- function(..., only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = "R", + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +count_all <- function(..., only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = c("S", "SDD", "I", "R", "NI"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname count +#' @export +n_sir <- count_all + +#' @rdname count +#' @export +count_df <- function(data, + translate_ab = "name", + language = get_AMR_locale(), + combine_SI = TRUE) { + tryCatch( + sir_calc_df( + type = "count", + data = data, + translate_ab = translate_ab, + language = language, + combine_SI = combine_SI, + confidence_level = 0.95 # doesn't matter, will be removed + ), + error = function(e) stop_(gsub("in sir_calc_df(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/custom_antimicrobials.R + + + + +#' Add Custom Antimicrobials +#' +#' With [add_custom_antimicrobials()] you can add your own custom antimicrobial drug names and codes. +#' @param x a [data.frame] resembling the [antibiotics] data set, at least containing columns "ab" and "name" +#' @details **Important:** Due to how \R works, the [add_custom_antimicrobials()] function has to be run in every \R session - added antimicrobials are not stored between sessions and are thus lost when \R is exited. +#' +#' There are two ways to circumvent this and automate the process of adding antimicrobials: +#' +#' **Method 1:** Using the package option [`AMR_custom_ab`][AMR-options], which is the preferred method. To use this method: +#' +#' 1. Create a data set in the structure of the [antibiotics] data set (containing at the very least columns "ab" and "name") and save it with [saveRDS()] to a location of choice, e.g. `"~/my_custom_ab.rds"`, or any remote location. +#' +#' 2. Set the file location to the package option [`AMR_custom_ab`][AMR-options]: `options(AMR_custom_ab = "~/my_custom_ab.rds")`. This can even be a remote file location, such as an https URL. Since options are not saved between \R sessions, it is best to save this option to the `.Rprofile` file so that it will be loaded on start-up of \R. To do this, open the `.Rprofile` file using e.g. `utils::file.edit("~/.Rprofile")`, add this text and save the file: +#' +#' ```r +#' # Add custom antimicrobial codes: +#' options(AMR_custom_ab = "~/my_custom_ab.rds") +#' ``` +#' +#' Upon package load, this file will be loaded and run through the [add_custom_antimicrobials()] function. +#' +#' **Method 2:** Loading the antimicrobial additions directly from your `.Rprofile` file. Note that the definitions will be stored in a user-specific \R file, which is a suboptimal workflow. To use this method: +#' +#' 1. Edit the `.Rprofile` file using e.g. `utils::file.edit("~/.Rprofile")`. +#' +#' 2. Add a text like below and save the file: +#' +#' ```r +#' # Add custom antibiotic drug codes: +#' AMR::add_custom_antimicrobials( +#' data.frame(ab = "TESTAB", +#' name = "Test Antibiotic", +#' group = "Test Group") +#' ) +#' ``` +#' +#' Use [clear_custom_antimicrobials()] to clear the previously added antimicrobials. +#' @seealso [add_custom_microorganisms()] to add custom microorganisms. +#' @rdname add_custom_antimicrobials +#' @export +#' @examples +#' \donttest{ +#' +#' # returns NA and throws a warning (which is suppressed here): +#' suppressWarnings( +#' as.ab("testab") +#' ) +#' +#' # now add a custom entry - it will be considered by as.ab() and +#' # all ab_*() functions +#' add_custom_antimicrobials( +#' data.frame( +#' ab = "TESTAB", +#' name = "Test Antibiotic", +#' # you can add any property present in the +#' # 'antibiotics' data set, such as 'group': +#' group = "Test Group" +#' ) +#' ) +#' +#' # "testab" is now a new antibiotic: +#' as.ab("testab") +#' ab_name("testab") +#' ab_group("testab") +#' +#' ab_info("testab") +#' +#' +#' # Add Co-fluampicil, which is one of the many J01CR50 codes, see +#' # https://atcddd.fhi.no/ddd/list_of_ddds_combined_products/ +#' add_custom_antimicrobials( +#' data.frame( +#' ab = "COFLU", +#' name = "Co-fluampicil", +#' atc = "J01CR50", +#' group = "Beta-lactams/penicillins" +#' ) +#' ) +#' ab_atc("Co-fluampicil") +#' ab_name("J01CR50") +#' +#' # even antibiotic selectors work +#' x <- data.frame( +#' random_column = "some value", +#' coflu = as.sir("S"), +#' ampicillin = as.sir("R") +#' ) +#' x +#' x[, betalactams()] +#' } +add_custom_antimicrobials <- function(x) { + meet_criteria(x, allow_class = "data.frame") + stop_ifnot( + all(c("ab", "name") %in% colnames(x)), + "`x` must contain columns \"ab\" and \"name\"." + ) + stop_if( + any(x$ab %in% AMR_env$AB_lookup$ab), + "Antimicrobial drug code(s) ", vector_and(x$ab[x$ab %in% AMR_env$AB_lookup$ab]), " already exist in the internal `antibiotics` data set." + ) + # remove any extra class/type, such as grouped tbl, or data.table: + x <- as.data.frame(x, stringsAsFactors = FALSE) + # keep only columns available in the antibiotics data set + x <- x[, colnames(AMR_env$AB_lookup)[colnames(AMR_env$AB_lookup) %in% colnames(x)], drop = FALSE] + x$generalised_name <- generalise_antibiotic_name(x$name) + x$generalised_all <- as.list(x$generalised_name) + for (col in colnames(x)) { + if (is.list(AMR_env$AB_lookup[, col, drop = TRUE]) & !is.list(x[, col, drop = TRUE])) { + x[, col] <- as.list(x[, col, drop = TRUE]) + } + } + + AMR_env$custom_ab_codes <- c(AMR_env$custom_ab_codes, x$ab) + class(AMR_env$AB_lookup$ab) <- "character" + + new_df <- AMR_env$AB_lookup[0, , drop = FALSE][seq_len(NROW(x)), , drop = FALSE] + rownames(new_df) <- NULL + list_cols <- vapply(FUN.VALUE = logical(1), new_df, is.list) + for (l in which(list_cols)) { + # prevent binding NULLs in lists, replace with NA + new_df[, l] <- as.list(NA_character_) + } + for (col in colnames(x)) { + # assign new values + new_df[, col] <- x[, col, drop = TRUE] + } + AMR_env$AB_lookup <- unique(rbind_AMR(AMR_env$AB_lookup, new_df)) + + AMR_env$ab_previously_coerced <- AMR_env$ab_previously_coerced[which(!AMR_env$ab_previously_coerced$ab %in% x$ab), , drop = FALSE] + class(AMR_env$AB_lookup$ab) <- c("ab", "character") + message_("Added ", nr2char(nrow(x)), " record", ifelse(nrow(x) > 1, "s", ""), " to the internal `antibiotics` data set.") +} + +#' @rdname add_custom_antimicrobials +#' @export +clear_custom_antimicrobials <- function() { + n <- nrow(AMR_env$AB_lookup) + AMR_env$AB_lookup <- cbind(AMR::antibiotics, AB_LOOKUP) + n2 <- nrow(AMR_env$AB_lookup) + AMR_env$custom_ab_codes <- character(0) + AMR_env$ab_previously_coerced <- AMR_env$ab_previously_coerced[which(AMR_env$ab_previously_coerced$ab %in% AMR_env$AB_lookup$ab), , drop = FALSE] + message_("Cleared ", nr2char(n - n2), " custom record", ifelse(n - n2 > 1, "s", ""), " from the internal `antibiotics` data set.") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/custom_eucast_rules.R + + + + +#' Define Custom EUCAST Rules +#' +#' Define custom EUCAST rules for your organisation or specific analysis and use the output of this function in [eucast_rules()]. +#' @param ... rules in [formula][base::tilde] notation, see below for instructions, and in *Examples* +#' @details +#' Some organisations have their own adoption of EUCAST rules. This function can be used to define custom EUCAST rules to be used in the [eucast_rules()] function. +#' @section How it works: +#' +#' ### Basics +#' +#' If you are familiar with the [`case_when()`][dplyr::case_when()] function of the `dplyr` package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written *before* the tilde (`~`) and the consequence of the rule is written *after* the tilde: +#' +#' ```r +#' x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S", +#' TZP == "R" ~ aminopenicillins == "R") +#' ``` +#' +#' These are two custom EUCAST rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work: +#' +#' ```r +#' x +#' #> A set of custom EUCAST rules: +#' #> +#' #> 1. If TZP is "S" then set to S : +#' #> amoxicillin (AMX), ampicillin (AMP) +#' #> +#' #> 2. If TZP is "R" then set to R : +#' #> amoxicillin (AMX), ampicillin (AMP) +#' ``` +#' +#' The rules (the part *before* the tilde, in above example `TZP == "S"` and `TZP == "R"`) must be evaluable in your data set: it should be able to run as a filter in your data set without errors. This means for the above example that the column `TZP` must exist. We will create a sample data set and test the rules set: +#' +#' ```r +#' df <- data.frame(mo = c("Escherichia coli", "Klebsiella pneumoniae"), +#' TZP = as.sir("R"), +#' ampi = as.sir("S"), +#' cipro = as.sir("S")) +#' df +#' #> mo TZP ampi cipro +#' #> 1 Escherichia coli R S S +#' #> 2 Klebsiella pneumoniae R S S +#' +#' eucast_rules(df, rules = "custom", custom_rules = x, info = FALSE) +#' #> mo TZP ampi cipro +#' #> 1 Escherichia coli R R S +#' #> 2 Klebsiella pneumoniae R R S +#' ``` +#' +#' ### Using taxonomic properties in rules +#' +#' There is one exception in columns used for the rules: all column names of the [microorganisms] data set can also be used, but do not have to exist in the data set. These column names are: `r vector_and(colnames(microorganisms), sort = FALSE)`. Thus, this next example will work as well, despite the fact that the `df` data set does not contain a column `genus`: +#' +#' ```r +#' y <- custom_eucast_rules(TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S", +#' TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R") +#' +#' eucast_rules(df, rules = "custom", custom_rules = y, info = FALSE) +#' #> mo TZP ampi cipro +#' #> 1 Escherichia coli R S S +#' #> 2 Klebsiella pneumoniae R R S +#' ``` +#' +#' ### Usage of multiple antibiotics and antibiotic group names +#' +#' You can define antibiotic groups instead of single antibiotics for the rule consequence, which is the part *after* the tilde (~). In the examples above, the antibiotic group `aminopenicillins` includes both ampicillin and amoxicillin. +#' +#' Rules can also be applied to multiple antibiotics and antibiotic groups simultaneously. Use the `c()` function to combine multiple antibiotics. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R": +#' +#' ```r +#' x <- custom_eucast_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R") +#' x +#' #> A set of custom EUCAST rules: +#' #> +#' #> 1. If TZP is "R" then set to "R": +#' #> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP) +#' ``` +#' +#' These `r length(DEFINED_AB_GROUPS)` antibiotic groups are allowed in the rules (case-insensitive) and can be used in any combination: +#' +#' `r paste0(" * ", sapply(DEFINED_AB_GROUPS, function(x) paste0(tolower(gsub("^AB_", "", x)), "\\cr(", vector_and(ab_name(eval(parse(text = x), envir = asNamespace("AMR")), language = NULL, tolower = TRUE), quotes = FALSE), ")"), USE.NAMES = FALSE), "\n", collapse = "")` +#' @returns A [list] containing the custom rules +#' @export +#' @examples +#' x <- custom_eucast_rules( +#' AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", +#' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I" +#' ) +#' x +#' +#' # run the custom rule set (verbose = TRUE will return a logbook instead of the data set): +#' eucast_rules(example_isolates, +#' rules = "custom", +#' custom_rules = x, +#' info = FALSE, +#' verbose = TRUE +#' ) +#' +#' # combine rule sets +#' x2 <- c( +#' x, +#' custom_eucast_rules(TZP == "R" ~ carbapenems == "R") +#' ) +#' x2 +custom_eucast_rules <- function(...) { + dots <- tryCatch(list(...), + error = function(e) "error" + ) + stop_if( + identical(dots, "error"), + "rules must be a valid formula inputs (e.g., using '~'), see `?custom_eucast_rules`" + ) + n_dots <- length(dots) + stop_if(n_dots == 0, "no custom rules were set. Please read the documentation using `?custom_eucast_rules`.") + out <- vector("list", n_dots) + for (i in seq_len(n_dots)) { + stop_ifnot( + inherits(dots[[i]], "formula"), + "rule ", i, " must be a valid formula input (e.g., using '~'), see `?custom_eucast_rules`" + ) + + # Query + qry <- dots[[i]][[2]] + if (inherits(qry, "call")) { + qry <- as.expression(qry) + } + qry <- as.character(qry) + # these will prevent vectorisation, so replace them: + qry <- gsub("&&", "&", qry, fixed = TRUE) + qry <- gsub("||", "|", qry, fixed = TRUE) + # format nicely, setting spaces around operators + qry <- gsub(" *([&|+-/*^><==]+) *", " \\1 ", qry) + qry <- gsub(" ?, ?", ", ", qry) + qry <- gsub("'", "\"", qry, fixed = TRUE) + out[[i]]$query <- as.expression(qry) + + # Resulting rule + result <- dots[[i]][[3]] + stop_ifnot( + deparse(result) %like% "==", + "the result of rule ", i, " (the part after the `~`) must contain `==`, such as in `... ~ ampicillin == \"R\"`, see `?custom_eucast_rules`" + ) + result_group <- as.character(result)[[2]] + result_group<- as.character(str2lang(result_group)) + result_group <- result_group[result_group != "c"] + result_group_agents <- character(0) + for (j in seq_len(length(result_group))) { + if (paste0("AB_", toupper(result_group[j]), "S") %in% DEFINED_AB_GROUPS) { + # support for e.g. 'aminopenicillin' if user meant 'aminopenicillins' + result_group[j] <- paste0(result_group[j], "s") + } + if (paste0("AB_", toupper(result_group[j])) %in% DEFINED_AB_GROUPS) { + result_group_agents <- c(result_group_agents, + eval(parse(text = paste0("AB_", toupper(result_group[j]))), envir = asNamespace("AMR"))) + } else { + out_group <- tryCatch( + suppressWarnings(as.ab(result_group[j], + fast_mode = TRUE, + flag_multiple_results = FALSE + )), + error = function(e) NA_character_ + ) + if (!all(is.na(out_group))) { + result_group_agents <- c(result_group_agents, out_group) + } + } + } + result_group_agents <- result_group_agents[!is.na(result_group_agents)] + + stop_if( + length(result_group_agents) == 0, + "this result of rule ", i, " could not be translated to a single antimicrobial drug/group: \"", + as.character(result)[[2]], "\".\n\nThe input can be a name or code of an antimicrobial drug, or be one of: ", + vector_or(tolower(gsub("AB_", "", DEFINED_AB_GROUPS)), quotes = FALSE), "." + ) + result_value <- as.character(result)[[3]] + result_value[result_value == "NA"] <- NA + stop_ifnot( + result_value %in% c("S", "SDD", "I", "R", "NI", NA), + "the resulting value of rule ", i, " must be either \"S\", \"SDD\", \"I\", \"R\", \"NI\" or NA" + ) + result_value <- as.sir(result_value) + + out[[i]]$result_group <- result_group_agents + out[[i]]$result_value <- result_value + } + + names(out) <- paste0("rule", seq_len(n_dots)) + set_clean_class(out, new_class = c("custom_eucast_rules", "list")) +} + +#' @method c custom_eucast_rules +#' @noRd +#' @export +c.custom_eucast_rules <- function(x, ...) { + if (length(list(...)) == 0) { + return(x) + } + out <- unclass(x) + for (e in list(...)) { + out <- c(out, unclass(e)) + } + names(out) <- paste0("rule", seq_len(length(out))) + set_clean_class(out, new_class = c("custom_eucast_rules", "list")) +} + +#' @method as.list custom_eucast_rules +#' @noRd +#' @export +as.list.custom_eucast_rules <- function(x, ...) { + c(x, ...) +} + +#' @method print custom_eucast_rules +#' @export +#' @noRd +print.custom_eucast_rules <- function(x, ...) { + cat("A set of custom EUCAST rules:\n") + for (i in seq_len(length(x))) { + rule <- x[[i]] + rule$query <- format_custom_query_rule(rule$query) + if (is.na(rule$result_value)) { + val <- font_red("") + } else if (rule$result_value == "R") { + val <- font_red_bg(" R ") + } else if (rule$result_value == "S") { + val <- font_green_bg(" S ") + } else { + val <- font_orange_bg(" I ") + } + agents <- paste0( + font_blue(ab_name(rule$result_group, language = NULL, tolower = TRUE), + collapse = NULL + ), + " (", rule$result_group, ")" + ) + agents <- sort(agents) + rule_if <- word_wrap( + paste0( + i, ". ", font_bold("If "), font_blue(rule$query), font_bold(" then "), + "set to {result}:" + ), + extra_indent = 5 + ) + rule_if <- gsub("{result}", val, rule_if, fixed = TRUE) + rule_then <- paste0(" ", word_wrap(paste0(agents, collapse = ", "), extra_indent = 5)) + cat("\n ", rule_if, "\n", rule_then, "\n", sep = "") + } +} + +format_custom_query_rule <- function(query, colours = has_colour()) { + # font_black() is a bit expensive so do it once: + txt <- font_black("{text}") + query <- gsub(" & ", sub("{text}", font_bold(" and "), txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" | ", sub("{text}", " or ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" + ", sub("{text}", " plus ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" - ", sub("{text}", " minus ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" / ", sub("{text}", " divided by ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" * ", sub("{text}", " times ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" == ", sub("{text}", " is ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" > ", sub("{text}", " is higher than ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" < ", sub("{text}", " is lower than ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" >= ", sub("{text}", " is higher than or equal to ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" <= ", sub("{text}", " is lower than or equal to ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" ^ ", sub("{text}", " to the power of ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" %in% ", sub("{text}", " is one of ", txt, fixed = TRUE), query, fixed = TRUE) + query <- gsub(" %like% ", sub("{text}", " resembles ", txt, fixed = TRUE), query, fixed = TRUE) + if (colours == TRUE) { + query <- gsub('"R"', font_red_bg(" R "), query, fixed = TRUE) + query <- gsub('"S"', font_green_bg(" S "), query, fixed = TRUE) + query <- gsub('"I"', font_orange_bg(" I "), query, fixed = TRUE) + } + # replace the black colour 'stops' with blue colour 'starts' + query <- gsub("\033[39m", "\033[34m", as.character(query), fixed = TRUE) + # start with blue + query <- paste0("\033[34m", query) + if (colours == FALSE) { + query <- font_stripstyle(query) + } + query +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/custom_microorganisms.R + + + + +#' Add Custom Microorganisms +#' +#' With [add_custom_microorganisms()] you can add your own custom microorganisms, such the non-taxonomic outcome of laboratory analysis. +#' @param x a [data.frame] resembling the [microorganisms] data set, at least containing column "genus" (case-insensitive) +#' @details This function will fill in missing taxonomy for you, if specific taxonomic columns are missing, see *Examples*. +#' +#' **Important:** Due to how \R works, the [add_custom_microorganisms()] function has to be run in every \R session - added microorganisms are not stored between sessions and are thus lost when \R is exited. +#' +#' There are two ways to circumvent this and automate the process of adding microorganisms: +#' +#' **Method 1:** Using the package option [`AMR_custom_mo`][AMR-options], which is the preferred method. To use this method: +#' +#' 1. Create a data set in the structure of the [microorganisms] data set (containing at the very least column "genus") and save it with [saveRDS()] to a location of choice, e.g. `"~/my_custom_mo.rds"`, or any remote location. +#' +#' 2. Set the file location to the package option [`AMR_custom_mo`][AMR-options]: `options(AMR_custom_mo = "~/my_custom_mo.rds")`. This can even be a remote file location, such as an https URL. Since options are not saved between \R sessions, it is best to save this option to the `.Rprofile` file so that it will be loaded on start-up of \R. To do this, open the `.Rprofile` file using e.g. `utils::file.edit("~/.Rprofile")`, add this text and save the file: +#' +#' ```r +#' # Add custom microorganism codes: +#' options(AMR_custom_mo = "~/my_custom_mo.rds") +#' ``` +#' +#' Upon package load, this file will be loaded and run through the [add_custom_microorganisms()] function. +#' +#' **Method 2:** Loading the microorganism directly from your `.Rprofile` file. Note that the definitions will be stored in a user-specific \R file, which is a suboptimal workflow. To use this method: +#' +#' 1. Edit the `.Rprofile` file using e.g. `utils::file.edit("~/.Rprofile")`. +#' +#' 2. Add a text like below and save the file: +#' +#' ```r +#' # Add custom antibiotic drug codes: +#' AMR::add_custom_microorganisms( +#' data.frame(genus = "Enterobacter", +#' species = "asburiae/cloacae") +#' ) +#' ``` +#' +#' Use [clear_custom_microorganisms()] to clear the previously added microorganisms. +#' @seealso [add_custom_antimicrobials()] to add custom antimicrobials. +#' @rdname add_custom_microorganisms +#' @export +#' @examples +#' \donttest{ +#' # a combination of species is not formal taxonomy, so +#' # this will result in "Enterobacter cloacae cloacae", +#' # since it resembles the input best: +#' mo_name("Enterobacter asburiae/cloacae") +#' +#' # now add a custom entry - it will be considered by as.mo() and +#' # all mo_*() functions +#' add_custom_microorganisms( +#' data.frame( +#' genus = "Enterobacter", +#' species = "asburiae/cloacae" +#' ) +#' ) +#' +#' # E. asburiae/cloacae is now a new microorganism: +#' mo_name("Enterobacter asburiae/cloacae") +#' +#' # its code: +#' as.mo("Enterobacter asburiae/cloacae") +#' +#' # all internal algorithms will work as well: +#' mo_name("Ent asburia cloacae") +#' +#' # and even the taxonomy was added based on the genus! +#' mo_family("E. asburiae/cloacae") +#' mo_gramstain("Enterobacter asburiae/cloacae") +#' +#' mo_info("Enterobacter asburiae/cloacae") +#' +#' +#' # the function tries to be forgiving: +#' add_custom_microorganisms( +#' data.frame( +#' GENUS = "BACTEROIDES / PARABACTEROIDES SLASHLINE", +#' SPECIES = "SPECIES" +#' ) +#' ) +#' mo_name("BACTEROIDES / PARABACTEROIDES") +#' mo_rank("BACTEROIDES / PARABACTEROIDES") +#' +#' # taxonomy still works, even though a slashline genus was given as input: +#' mo_family("Bacteroides/Parabacteroides") +#' +#' +#' # for groups and complexes, set them as species or subspecies: +#' add_custom_microorganisms( +#' data.frame( +#' genus = "Citrobacter", +#' species = c("freundii", "braakii complex"), +#' subspecies = c("complex", "") +#' ) +#' ) +#' mo_name(c("C. freundii complex", "C. braakii complex")) +#' mo_species(c("C. freundii complex", "C. braakii complex")) +#' mo_gramstain(c("C. freundii complex", "C. braakii complex")) +#' } +add_custom_microorganisms <- function(x) { + meet_criteria(x, allow_class = "data.frame") + stop_ifnot("genus" %in% tolower(colnames(x)), paste0("`x` must contain column 'genus'.")) + + add_MO_lookup_to_AMR_env() + + # remove any extra class/type, such as grouped tbl, or data.table: + x <- as.data.frame(x, stringsAsFactors = FALSE) + colnames(x) <- tolower(colnames(x)) + # rename 'name' to 'fullname' if it's in the data set + if ("name" %in% colnames(x) && !"fullname" %in% colnames(x)) { + colnames(x)[colnames(x) == "name"] <- "fullname" + } + # keep only columns available in the microorganisms data set + x <- x[, colnames(AMR_env$MO_lookup)[colnames(AMR_env$MO_lookup) %in% colnames(x)], drop = FALSE] + + # clean the input ---- + for (col in c("genus", "species", "subspecies")) { + if (!col %in% colnames(x)) { + x[, col] <- "" + } + if (is.factor(x[, col, drop = TRUE])) { + x[, col] <- as.character(x[, col, drop = TRUE]) + } + col_ <- x[, col, drop = TRUE] + col_ <- tolower(col_) + col_ <- gsub("slashline", "", col_, fixed = TRUE) + col_ <- trimws2(col_) + col_[col_ %like% "(sub)?species"] <- "" + col_ <- gsub(" *([/-]) *", "\\1", col_, perl = TRUE) + # groups are in our taxonomic table with a capital G + col_ <- gsub(" group( |$)", " Group\\1", col_, perl = TRUE) + + col_[is.na(col_)] <- "" + if (col == "genus") { + substr(col_, 1, 1) <- toupper(substr(col_, 1, 1)) + col_ <- gsub("/([a-z])", "/\\U\\1", col_, perl = TRUE) + stop_if(any(col_ == ""), "the 'genus' column cannot be empty") + stop_if(any(col_ %like% " "), "the 'genus' column must not contain spaces") + } + x[, col] <- col_ + } + # if subspecies is a group or complex, add it to the species and empty the subspecies + x$species[which(x$subspecies %in% c("group", "Group", "complex"))] <- paste( + x$species[which(x$subspecies %in% c("group", "Group", "complex"))], + x$subspecies[which(x$subspecies %in% c("group", "Group", "complex"))] + ) + x$subspecies[which(x$subspecies %in% c("group", "Group", "complex"))] <- "" + + if ("rank" %in% colnames(x)) { + stop_ifnot( + all(x$rank %in% AMR_env$MO_lookup$rank), + "the 'rank' column can only contain these values: ", vector_or(AMR_env$MO_lookup$rank) + ) + } else { + x$rank <- ifelse(x$subspecies != "", "subspecies", + ifelse(x$species != "", "species", + ifelse(x$genus != "", "genus", + stop("in add_custom_microorganisms(): only microorganisms up to the genus level can be added", + call. = FALSE + ) + ) + ) + ) + } + x$source <- "Added by user" + if (!"fullname" %in% colnames(x)) { + x$fullname <- trimws2(paste(x$genus, x$species, x$subspecies)) + } + if (!"kingdom" %in% colnames(x)) x$kingdom <- "" + if (!"phylum" %in% colnames(x)) x$phylum <- "" + if (!"class" %in% colnames(x)) x$class <- "" + if (!"order" %in% colnames(x)) x$order <- "" + if (!"family" %in% colnames(x)) x$family <- "" + x$kingdom[is.na(x$kingdom)] <- "" + x$phylum[is.na(x$phylum)] <- "" + x$class[is.na(x$class)] <- "" + x$order[is.na(x$order)] <- "" + x$family[is.na(x$family)] <- "" + + for (col in colnames(x)) { + if (is.factor(x[, col, drop = TRUE])) { + x[, col] <- as.character(x[, col, drop = TRUE]) + } + if (is.list(AMR_env$MO_lookup[, col, drop = TRUE])) { + x[, col] <- as.list(x[, col, drop = TRUE]) + } + } + + # fill in taxonomy based on genus + genus_to_check <- gsub("^(.*)[^a-zA-Z].*", "\\1", x$genus, perl = TRUE) + x$kingdom[which(x$kingdom == "" & genus_to_check != "")] <- AMR_env$MO_lookup$kingdom[match(genus_to_check[which(x$kingdom == "" & genus_to_check != "")], AMR_env$MO_lookup$genus)] + x$phylum[which(x$phylum == "" & genus_to_check != "")] <- AMR_env$MO_lookup$phylum[match(genus_to_check[which(x$phylum == "" & genus_to_check != "")], AMR_env$MO_lookup$genus)] + x$class[which(x$class == "" & genus_to_check != "")] <- AMR_env$MO_lookup$class[match(genus_to_check[which(x$class == "" & genus_to_check != "")], AMR_env$MO_lookup$genus)] + x$order[which(x$order == "" & genus_to_check != "")] <- AMR_env$MO_lookup$order[match(genus_to_check[which(x$order == "" & genus_to_check != "")], AMR_env$MO_lookup$genus)] + x$family[which(x$family == "" & genus_to_check != "")] <- AMR_env$MO_lookup$family[match(genus_to_check[which(x$family == "" & genus_to_check != "")], AMR_env$MO_lookup$genus)] + + # fill in other columns that are used in internal algorithms + x$prevalence <- NA_real_ + x$prevalence[which(genus_to_check != "")] <- AMR_env$MO_lookup$prevalence[match(genus_to_check[which(genus_to_check != "")], AMR_env$MO_lookup$genus)] + x$prevalence[is.na(x$prevalence)] <- 1.25 + x$status <- "accepted" + x$ref <- paste("Self-added,", format(Sys.Date(), "%Y")) + x$kingdom_index <- AMR_env$MO_lookup$kingdom_index[match(genus_to_check, AMR_env$MO_lookup$genus)] + # complete missing kingdom index, so mo_matching_score() will not return NA + x$kingdom_index[is.na(x$kingdom_index)] <- 1 + x$fullname_lower <- tolower(x$fullname) + x$full_first <- substr(x$fullname_lower, 1, 1) + x$species_first <- tolower(substr(x$species, 1, 1)) + x$subspecies_first <- tolower(substr(x$subspecies, 1, 1)) + + if (!"mo" %in% colnames(x)) { + # create the mo code + x$mo <- NA_character_ + } + x$mo <- trimws2(as.character(x$mo)) + x$mo[x$mo == ""] <- NA_character_ + current <- sum(AMR_env$MO_lookup$source == "Added by user", na.rm = TRUE) + x$mo[is.na(x$mo)] <- paste0( + "CUSTOM", + seq.int(from = current + 1, to = current + nrow(x), by = 1), + "_", + trimws( + paste(abbreviate_mo(x$genus, 5), + abbreviate_mo(x$species, 4, hyphen_as_space = TRUE), + abbreviate_mo(x$subspecies, 4, hyphen_as_space = TRUE), + sep = "_"), + whitespace = "_")) + stop_if(anyDuplicated(c(as.character(AMR_env$MO_lookup$mo), x$mo)), "MO codes must be unique and not match existing MO codes of the AMR package") + + # add to package ---- + AMR_env$custom_mo_codes <- c(AMR_env$custom_mo_codes, x$mo) + class(AMR_env$MO_lookup$mo) <- "character" + + new_df <- AMR_env$MO_lookup[0, , drop = FALSE][seq_len(NROW(x)), , drop = FALSE] + rownames(new_df) <- NULL + list_cols <- vapply(FUN.VALUE = logical(1), new_df, is.list) + for (l in which(list_cols)) { + # prevent binding NULLs in lists, replace with NA + new_df[, l] <- as.list(NA_character_) + } + for (col in colnames(x)) { + # assign new values + new_df[, col] <- x[, col, drop = TRUE] + } + + # clear previous coercions + suppressMessages(mo_reset_session()) + + AMR_env$MO_lookup <- unique(rbind_AMR(AMR_env$MO_lookup, new_df)) + class(AMR_env$MO_lookup$mo) <- c("mo", "character") + if (nrow(x) <= 3) { + message_("Added ", vector_and(italicise(x$fullname), quotes = FALSE), " to the internal `microorganisms` data set.") + } else { + message_("Added ", nr2char(nrow(x)), " records to the internal `microorganisms` data set.") + } +} + +#' @rdname add_custom_microorganisms +#' @export +clear_custom_microorganisms <- function() { + n <- nrow(AMR_env$MO_lookup) + + # reset + AMR_env$MO_lookup <- NULL + add_MO_lookup_to_AMR_env() + + # clear previous coercions + suppressMessages(mo_reset_session()) + + n2 <- nrow(AMR_env$MO_lookup) + AMR_env$custom_mo_codes <- character(0) + AMR_env$mo_previously_coerced <- AMR_env$mo_previously_coerced[which(AMR_env$mo_previously_coerced$mo %in% AMR_env$MO_lookup$mo), , drop = FALSE] + AMR_env$mo_uncertainties <- AMR_env$mo_uncertainties[0, , drop = FALSE] + message_("Cleared ", nr2char(n - n2), " custom record", ifelse(n - n2 > 1, "s", ""), " from the internal `microorganisms` data set.") +} + +abbreviate_mo <- function(x, minlength = 5, prefix = "", hyphen_as_space = FALSE, ...) { + if (hyphen_as_space == TRUE) { + x <- gsub("-", " ", x, fixed = TRUE) + } + # keep a starting Latin ae + suppressWarnings( + gsub("(\u00C6|\u00E6)+", + "AE", + toupper( + paste0(prefix, + abbreviate( + gsub("^ae", + "\u00E6\u00E6", + x, + ignore.case = TRUE), + minlength = minlength, + use.classes = TRUE, + method = "both.sides", + ... + )))) + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/data.R + + + + +#' Data Sets with `r format(nrow(antibiotics) + nrow(antivirals), big.mark = " ")` Antimicrobial Drugs +#' +#' Two data sets containing all antibiotics/antimycotics and antivirals. Use [as.ab()] or one of the [`ab_*`][ab_property()] functions to retrieve values from the [antibiotics] data set. Three identifiers are included in this data set: an antibiotic ID (`ab`, primarily used in this package) as defined by WHONET/EARS-Net, an ATC code (`atc`) as defined by the WHO, and a Compound ID (`cid`) as found in PubChem. Other properties in this data set are derived from one or more of these codes. Note that some drugs have multiple ATC codes. +#' @format +#' ### For the [antibiotics] data set: a [tibble][tibble::tibble] with `r nrow(antibiotics)` observations and `r ncol(antibiotics)` variables: +#' - `ab`\cr Antibiotic ID as used in this package (such as `AMC`), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available. ***This is a unique identifier.*** +#' - `cid`\cr Compound ID as found in PubChem. ***This is a unique identifier.*** +#' - `name`\cr Official name as used by WHONET/EARS-Net or the WHO. ***This is a unique identifier.*** +#' - `group`\cr A short and concise group name, based on WHONET and WHOCC definitions +#' - `atc`\cr ATC codes (Anatomical Therapeutic Chemical) as defined by the WHOCC, like `J01CR02` +#' - `atc_group1`\cr Official pharmacological subgroup (3rd level ATC code) as defined by the WHOCC, like `"Macrolides, lincosamides and streptogramins"` +#' - `atc_group2`\cr Official chemical subgroup (4th level ATC code) as defined by the WHOCC, like `"Macrolides"` +#' - `abbr`\cr List of abbreviations as used in many countries, also for antibiotic susceptibility testing (AST) +#' - `synonyms`\cr Synonyms (often trade names) of a drug, as found in PubChem based on their compound ID +#' - `oral_ddd`\cr Defined Daily Dose (DDD), oral treatment, currently available for `r sum(!is.na(antibiotics$oral_ddd))` drugs +#' - `oral_units`\cr Units of `oral_ddd` +#' - `iv_ddd`\cr Defined Daily Dose (DDD), parenteral (intravenous) treatment, currently available for `r sum(!is.na(antibiotics$iv_ddd))` drugs +#' - `iv_units`\cr Units of `iv_ddd` +#' - `loinc`\cr All codes associated with the name of the antimicrobial drug from `r TAXONOMY_VERSION$LOINC$citation` Use [ab_loinc()] to retrieve them quickly, see [ab_property()]. +#' +#' ### For the [antivirals] data set: a [tibble][tibble::tibble] with `r nrow(antivirals)` observations and `r ncol(antivirals)` variables: +#' - `av`\cr Antiviral ID as used in this package (such as `ACI`), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available. ***This is a unique identifier.*** Combinations are codes that contain a `+` to indicate this, such as `ATA+COBI` for atazanavir/cobicistat. +#' - `name`\cr Official name as used by WHONET/EARS-Net or the WHO. ***This is a unique identifier.*** +#' - `atc`\cr ATC codes (Anatomical Therapeutic Chemical) as defined by the WHOCC +#' - `cid`\cr Compound ID as found in PubChem. ***This is a unique identifier.*** +#' - `atc_group`\cr Official pharmacological subgroup (3rd level ATC code) as defined by the WHOCC +#' - `synonyms`\cr Synonyms (often trade names) of a drug, as found in PubChem based on their compound ID +#' - `oral_ddd`\cr Defined Daily Dose (DDD), oral treatment +#' - `oral_units`\cr Units of `oral_ddd` +#' - `iv_ddd`\cr Defined Daily Dose (DDD), parenteral treatment +#' - `iv_units`\cr Units of `iv_ddd` +#' - `loinc`\cr All codes associated with the name of the antiviral drug from `r TAXONOMY_VERSION$LOINC$citation` Use [av_loinc()] to retrieve them quickly, see [av_property()]. +#' @details Properties that are based on an ATC code are only available when an ATC is available. These properties are: `atc_group1`, `atc_group2`, `oral_ddd`, `oral_units`, `iv_ddd` and `iv_units`. +#' +#' Synonyms (i.e. trade names) were derived from the PubChem Compound ID (column `cid`) and consequently only available where a CID is available. +#' +#' ### Direct download +#' Like all data sets in this package, these data sets are publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @source +#' +#' * World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology (WHOCC): +#' +#' * `r TAXONOMY_VERSION$LOINC$citation` Accessed from <`r TAXONOMY_VERSION$LOINC$url`> on `r documentation_date(TAXONOMY_VERSION$LOINC$accessed_date)`. +#' +#' * European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: +#' @inheritSection WHOCC WHOCC +#' @seealso [microorganisms], [intrinsic_resistant] +#' @examples +#' antibiotics +#' antivirals +"antibiotics" + +#' @rdname antibiotics +"antivirals" + +#' Data Set with `r format(nrow(microorganisms), big.mark = " ")` Taxonomic Records of Microorganisms +#' +#' @description +#' A data set containing the full microbial taxonomy (**last updated: `r documentation_date(max(TAXONOMY_VERSION$GBIF$accessed_date, TAXONOMY_VERSION$LPSN$accessed_date, TAXONOMY_VERSION$MycoBank$accessed_date))`**) of `r nr2char(length(unique(microorganisms$kingdom[!microorganisms$kingdom %like% "unknown"])))` kingdoms. This data set is the backbone of this `AMR` package. MO codes can be looked up using [as.mo()] and microorganism properties can be looked up using any of the [`mo_*`][mo_property()] functions. +#' +#' This data set is carefully crafted, yet made 100% reproducible from public and authoritative taxonomic sources (using [this script](https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R)), namely: *`r TAXONOMY_VERSION$LPSN$name`* for bacteria, *`r TAXONOMY_VERSION$MycoBank$name`* for fungi, and *`r TAXONOMY_VERSION$GBIF$name`* for all others taxons. +#' @format A [tibble][tibble::tibble] with `r format(nrow(microorganisms), big.mark = " ")` observations and `r ncol(microorganisms)` variables: +#' - `mo`\cr ID of microorganism as used by this package. ***This is a unique identifier.*** +#' - `fullname`\cr Full name, like `"Escherichia coli"`. For the taxonomic ranks genus, species and subspecies, this is the 'pasted' text of genus, species, and subspecies. For all taxonomic ranks higher than genus, this is the name of the taxon. ***This is a unique identifier.*** +#' - `status` \cr Status of the taxon, either `r vector_or(microorganisms$status)` +#' - `kingdom`, `phylum`, `class`, `order`, `family`, `genus`, `species`, `subspecies`\cr Taxonomic rank of the microorganism. Note that for fungi, *phylum* is equal to their taxonomic *division*. Also, for fungi, *subkingdom* and *subdivision* were left out since they do not occur in the bacterial taxonomy. +#' - `rank`\cr Text of the taxonomic rank of the microorganism, such as `"species"` or `"genus"` +#' - `ref`\cr Author(s) and year of related scientific publication. This contains only the *first surname* and year of the *latest* authors, e.g. "Wallis *et al.* 2006 *emend.* Smith and Jones 2018" becomes "Smith *et al.*, 2018". This field is directly retrieved from the source specified in the column `source`. Moreover, accents were removed to comply with CRAN that only allows ASCII characters. +#' - `oxygen_tolerance` \cr Oxygen tolerance, either `r vector_or(microorganisms$oxygen_tolerance)`. These data were retrieved from BacDive (see *Source*). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently `r round(length(microorganisms$oxygen_tolerance[which(!is.na(microorganisms$oxygen_tolerance))]) / nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]) * 100, 1)`% of all `r format_included_data_number(nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]))` bacteria in the data set contain an oxygen tolerance. +#' - `source`\cr Either `r vector_or(microorganisms$source)` (see *Source*) +#' - `lpsn`\cr Identifier ('Record number') of `r TAXONOMY_VERSION$LPSN$name`. This will be the first/highest LPSN identifier to keep one identifier per row. For example, *Acetobacter ascendens* has LPSN Record number 7864 and 11011. Only the first is available in the `microorganisms` data set. ***This is a unique identifier***, though available for only `r format_included_data_number(sum(!is.na(microorganisms$lpsn)))` records. +#' - `lpsn_parent`\cr LPSN identifier of the parent taxon +#' - `lpsn_renamed_to`\cr LPSN identifier of the currently valid taxon +#' - `mycobank`\cr Identifier ('MycoBank #') of `r TAXONOMY_VERSION$MycoBank$name`. ***This is a unique identifier***, though available for only `r format_included_data_number(sum(!is.na(microorganisms$mycobank)))` records. +#' - `mycobank_parent`\cr MycoBank identifier of the parent taxon +#' - `mycobank_renamed_to`\cr MycoBank identifier of the currently valid taxon +#' - `gbif`\cr Identifier ('taxonID') of `r TAXONOMY_VERSION$GBIF$name`. ***This is a unique identifier***, though available for only `r format_included_data_number(sum(!is.na(microorganisms$gbif)))` records. +#' - `gbif_parent`\cr GBIF identifier of the parent taxon +#' - `gbif_renamed_to`\cr GBIF identifier of the currently valid taxon +#' - `prevalence`\cr Prevalence of the microorganism based on Bartlett *et al.* (2022, \doi{10.1099/mic.0.001269}), see [mo_matching_score()] for the full explanation +#' - `snomed`\cr Systematized Nomenclature of Medicine (SNOMED) code of the microorganism, version of `r documentation_date(TAXONOMY_VERSION$SNOMED$accessed_date)` (see *Source*). Use [mo_snomed()] to retrieve it quickly, see [mo_property()]. +#' @details +#' Please note that entries are only based on LPSN, MycoBank, and GBIF (see below). Since these sources incorporate entries based on (recent) publications in the International Journal of Systematic and Evolutionary Microbiology (IJSEM), it can happen that the year of publication is sometimes later than one might expect. +#' +#' For example, *Staphylococcus pettenkoferi* was described for the first time in Diagnostic Microbiology and Infectious Disease in 2002 (\doi{10.1016/s0732-8893(02)00399-1}), but it was not until 2007 that a publication in IJSEM followed (\doi{10.1099/ijs.0.64381-0}). Consequently, the `AMR` package returns 2007 for `mo_year("S. pettenkoferi")`. +#' +#' @section Included Taxa: +#' Included taxonomic data from [LPSN](`r TAXONOMY_VERSION$LPSN$url`), [MycoBank](`r TAXONOMY_VERSION$MycoBank$url`), and [GBIF](`r TAXONOMY_VERSION$GBIF$url`) are: +#' - All `r format_included_data_number(microorganisms[which(microorganisms$kingdom %in% c("Archeae", "Bacteria")), , drop = FALSE])` (sub)species from the kingdoms of Archaea and Bacteria +#' - `r format_included_data_number(microorganisms[which(microorganisms$kingdom == "Fungi"), , drop = FALSE])` species from the kingdom of Fungi. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package. Only relevant fungi are covered (such as all species of *Aspergillus*, *Candida*, *Cryptococcus*, *Histoplasma*, *Pneumocystis*, *Saccharomyces* and *Trichophyton*). +#' - `r format_included_data_number(microorganisms[which(microorganisms$kingdom == "Protozoa"), , drop = FALSE])` (sub)species from the kingdom of Protozoa +#' - `r format_included_data_number(microorganisms[which(microorganisms$kingdom == "Animalia"), , drop = FALSE])` (sub)species from `r format_included_data_number(microorganisms[which(microorganisms$kingdom == "Animalia"), "genus", drop = TRUE])` other relevant genera from the kingdom of Animalia (such as *Strongyloides* and *Taenia*) +#' - All `r format_included_data_number(microorganisms[which(microorganisms$status != "accepted"), , drop = FALSE])` previously accepted names of all included (sub)species (these were taxonomically renamed) +#' - The complete taxonomic tree of all included (sub)species: from kingdom to subspecies +#' - The identifier of the parent taxons +#' - The year and first author of the related scientific publication +#' +#' ### Manual additions +#' For convenience, some entries were added manually: +#' +#' - `r format_included_data_number(microorganisms[which(microorganisms$source == "manually added" & microorganisms$genus == "Salmonella"), , drop = FALSE])` entries of *Salmonella*, such as the city-like serovars and groups A to H +#' - `r format_included_data_number(length(which(microorganisms$rank == "species group")))` species groups (such as the beta-haemolytic *Streptococcus* groups A to K, coagulase-negative *Staphylococcus* (CoNS), *Mycobacterium tuberculosis* complex, etc.), of which the group compositions are stored in the [microorganisms.groups] data set +#' - 1 entry of *Blastocystis* (*B. hominis*), although it officially does not exist (Noel *et al.* 2005, PMID 15634993) +#' - 1 entry of *Moraxella* (*M. catarrhalis*), which was formally named *Branhamella catarrhalis* (Catlin, 1970) though this change was never accepted within the field of clinical microbiology +#' - 8 other 'undefined' entries (unknown, unknown Gram-negatives, unknown Gram-positives, unknown yeast, unknown fungus, and unknown anaerobic Gram-pos/Gram-neg bacteria) +#' +#' The syntax used to transform the original data to a cleansed \R format, can be [found here](https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R). +#' +#' ### Direct download +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @source +#' Taxonomic entries were imported in this order of importance: +#' 1. `r TAXONOMY_VERSION$LPSN$name`:\cr\cr +#' `r TAXONOMY_VERSION$LPSN$citation` Accessed from <`r TAXONOMY_VERSION$LPSN$url`> on `r documentation_date(TAXONOMY_VERSION$LPSN$accessed_date)`. +#' +#' 2. `r TAXONOMY_VERSION$MycoBank$name`:\cr\cr +#' `r TAXONOMY_VERSION$MycoBank$citation` Accessed from <`r TAXONOMY_VERSION$MycoBank$url`> on `r documentation_date(TAXONOMY_VERSION$MycoBank$accessed_date)`. +#' +#' 3. `r TAXONOMY_VERSION$GBIF$name`:\cr\cr +#' `r TAXONOMY_VERSION$GBIF$citation` Accessed from <`r TAXONOMY_VERSION$GBIF$url`> on `r documentation_date(TAXONOMY_VERSION$GBIF$accessed_date)`. +#' +#' Furthermore, these sources were used for additional details: +#' +#' * `r TAXONOMY_VERSION$BacDive$name`:\cr\cr +#' `r TAXONOMY_VERSION$BacDive$citation` Accessed from <`r TAXONOMY_VERSION$BacDive$url`> on `r documentation_date(TAXONOMY_VERSION$BacDive$accessed_date)`. +#' +#' * `r TAXONOMY_VERSION$SNOMED$name`:\cr\cr +#' `r TAXONOMY_VERSION$SNOMED$citation` Accessed from <`r TAXONOMY_VERSION$SNOMED$url`> on `r documentation_date(TAXONOMY_VERSION$SNOMED$accessed_date)`. +#' +#' * Grimont *et al.* (2007). Antigenic Formulae of the Salmonella Serovars, 9th Edition. WHO Collaborating Centre for Reference and Research on *Salmonella* (WHOCC-SALM). +#' +#' * Bartlett *et al.* (2022). **A comprehensive list of bacterial pathogens infecting humans** *Microbiology* 168:001269; \doi{10.1099/mic.0.001269} +#' @seealso [as.mo()], [mo_property()], [microorganisms.groups], [microorganisms.codes], [intrinsic_resistant] +#' @examples +#' microorganisms +"microorganisms" + +#' Data Set with `r format(nrow(microorganisms.codes), big.mark = " ")` Common Microorganism Codes +#' +#' A data set containing commonly used codes for microorganisms, from laboratory systems and [WHONET](https://whonet.org). Define your own with [set_mo_source()]. They will all be searched when using [as.mo()] and consequently all the [`mo_*`][mo_property()] functions. +#' @format A [tibble][tibble::tibble] with `r format(nrow(microorganisms.codes), big.mark = " ")` observations and `r ncol(microorganisms.codes)` variables: +#' - `code`\cr Commonly used code of a microorganism. ***This is a unique identifier.*** +#' - `mo`\cr ID of the microorganism in the [microorganisms] data set +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @seealso [as.mo()] [microorganisms] +#' @examples +#' microorganisms.codes +#' +#' # 'ECO' or 'eco' is the WHONET code for E. coli: +#' microorganisms.codes[microorganisms.codes$code == "ECO", ] +#' +#' # and therefore, 'eco' will be understood as E. coli in this package: +#' mo_info("eco") +#' +#' # works for all AMR functions: +#' mo_is_intrinsic_resistant("eco", ab = "vancomycin") +"microorganisms.codes" + +#' Data Set with `r format(nrow(microorganisms.groups), big.mark = " ")` Microorganisms In Species Groups +#' +#' A data set containing species groups and microbiological complexes, which are used in [the clinical breakpoints table][clinical_breakpoints]. +#' @format A [tibble][tibble::tibble] with `r format(nrow(microorganisms.groups), big.mark = " ")` observations and `r ncol(microorganisms.groups)` variables: +#' - `mo_group`\cr ID of the species group / microbiological complex +#' - `mo`\cr ID of the microorganism belonging in the species group / microbiological complex +#' - `mo_group_name`\cr Name of the species group / microbiological complex, as retrieved with [mo_name()] +#' - `mo_name`\cr Name of the microorganism belonging in the species group / microbiological complex, as retrieved with [mo_name()] +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @seealso [as.mo()] [microorganisms] +#' @examples +#' microorganisms.groups +#' +#' # these are all species in the Bacteroides fragilis group, as per WHONET: +#' microorganisms.groups[microorganisms.groups$mo_group == "B_BCTRD_FRGL-C", ] +"microorganisms.groups" + +#' Data Set with `r format(nrow(example_isolates), big.mark = " ")` Example Isolates +#' +#' A data set containing `r format(nrow(example_isolates), big.mark = " ")` microbial isolates with their full antibiograms. This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis. For examples, please read [the tutorial on our website](https://msberends.github.io/AMR/articles/AMR.html). +#' @format A [tibble][tibble::tibble] with `r format(nrow(example_isolates), big.mark = " ")` observations and `r ncol(example_isolates)` variables: +#' - `date`\cr Date of receipt at the laboratory +#' - `patient`\cr ID of the patient +#' - `age`\cr Age of the patient +#' - `gender`\cr Gender of the patient, either `r vector_or(example_isolates$gender)` +#' - `ward`\cr Ward type where the patient was admitted, either `r vector_or(example_isolates$ward)` +#' - `mo`\cr ID of microorganism created with [as.mo()], see also the [microorganisms] data set +#' - `PEN:RIF`\cr `r sum(vapply(FUN.VALUE = logical(1), example_isolates, is.sir))` different antibiotics with class [`sir`] (see [as.sir()]); these column names occur in the [antibiotics] data set and can be translated with [set_ab_names()] or [ab_name()] +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @examples +#' example_isolates +"example_isolates" + +#' Data Set with Unclean Data +#' +#' A data set containing `r format(nrow(example_isolates_unclean), big.mark = " ")` microbial isolates that are not cleaned up and consequently not ready for AMR data analysis. This data set can be used for practice. +#' @format A [tibble][tibble::tibble] with `r format(nrow(example_isolates_unclean), big.mark = " ")` observations and `r ncol(example_isolates_unclean)` variables: +#' - `patient_id`\cr ID of the patient +#' - `date`\cr date of receipt at the laboratory +#' - `hospital`\cr ID of the hospital, from A to C +#' - `bacteria`\cr info about microorganism that can be transformed with [as.mo()], see also [microorganisms] +#' - `AMX:GEN`\cr 4 different antibiotics that have to be transformed with [as.sir()] +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @examples +#' example_isolates_unclean +"example_isolates_unclean" + +#' Data Set with `r format(nrow(WHONET), big.mark = " ")` Isolates - WHONET Example +#' +#' This example data set has the exact same structure as an export file from WHONET. Such files can be used with this package, as this example data set shows. The antibiotic results are from our [example_isolates] data set. All patient names were created using online surname generators and are only in place for practice purposes. +#' @format A [tibble][tibble::tibble] with `r format(nrow(WHONET), big.mark = " ")` observations and `r ncol(WHONET)` variables: +#' - `Identification number`\cr ID of the sample +#' - `Specimen number`\cr ID of the specimen +#' - `Organism`\cr Name of the microorganism. Before analysis, you should transform this to a valid microbial class, using [as.mo()]. +#' - `Country`\cr Country of origin +#' - `Laboratory`\cr Name of laboratory +#' - `Last name`\cr Fictitious last name of patient +#' - `First name`\cr Fictitious initial of patient +#' - `Sex`\cr Fictitious gender of patient +#' - `Age`\cr Fictitious age of patient +#' - `Age category`\cr Age group, can also be looked up using [age_groups()] +#' - `Date of admission`\cr [Date] of hospital admission +#' - `Specimen date`\cr [Date] when specimen was received at laboratory +#' - `Specimen type`\cr Specimen type or group +#' - `Specimen type (Numeric)`\cr Translation of `"Specimen type"` +#' - `Reason`\cr Reason of request with Differential Diagnosis +#' - `Isolate number`\cr ID of isolate +#' - `Organism type`\cr Type of microorganism, can also be looked up using [mo_type()] +#' - `Serotype`\cr Serotype of microorganism +#' - `Beta-lactamase`\cr Microorganism produces beta-lactamase? +#' - `ESBL`\cr Microorganism produces extended spectrum beta-lactamase? +#' - `Carbapenemase`\cr Microorganism produces carbapenemase? +#' - `MRSA screening test`\cr Microorganism is possible MRSA? +#' - `Inducible clindamycin resistance`\cr Clindamycin can be induced? +#' - `Comment`\cr Other comments +#' - `Date of data entry`\cr [Date] this data was entered in WHONET +#' - `AMP_ND10:CIP_EE`\cr `r sum(vapply(FUN.VALUE = logical(1), WHONET, is.sir))` different antibiotics. You can lookup the abbreviations in the [antibiotics] data set, or use e.g. [`ab_name("AMP")`][ab_name()] to get the official name immediately. Before analysis, you should transform this to a valid antibiotic class, using [as.sir()]. +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @examples +#' WHONET +"WHONET" + +#' Data Set with Clinical Breakpoints for SIR Interpretation +#' +#' @description Data set containing clinical breakpoints to interpret MIC and disk diffusion to SIR values, according to international guidelines. This dataset contain breakpoints for humans, `r length(unique(clinical_breakpoints$host[!clinical_breakpoints$host %in% clinical_breakpoints$type]))` different animal groups, and ECOFFs. +#' +#' These breakpoints are currently implemented: +#' - For **clinical microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`; +#' - For **veterinary microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`; +#' - For **ECOFFs** (Epidemiological Cut-off Values): EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`. +#' +#' Use [as.sir()] to transform MICs or disks measurements to SIR values. +#' @format A [tibble][tibble::tibble] with `r format(nrow(clinical_breakpoints), big.mark = " ")` observations and `r ncol(clinical_breakpoints)` variables: +#' - `guideline`\cr Name of the guideline +#' - `type`\cr Breakpoint type, either `r vector_or(clinical_breakpoints$type)` +#' - `host`\cr Host of infectious agent. This is mostly useful for veterinary breakpoints and is either `r vector_or(clinical_breakpoints$host)` +#' - `method`\cr Testing method, either `r vector_or(clinical_breakpoints$method)` +#' - `site`\cr Body site for which the breakpoint must be applied, e.g. "Oral" or "Respiratory" +#' - `mo`\cr Microbial ID, see [as.mo()] +#' - `rank_index`\cr Taxonomic rank index of `mo` from 1 (subspecies/infraspecies) to 5 (unknown microorganism) +#' - `ab`\cr Antibiotic code as used by this package, EARS-Net and WHONET, see [as.ab()] +#' - `ref_tbl`\cr Info about where the guideline rule can be found +#' - `disk_dose`\cr Dose of the used disk diffusion method +#' - `breakpoint_S`\cr Lowest MIC value or highest number of millimetres that leads to "S" +#' - `breakpoint_R`\cr Highest MIC value or lowest number of millimetres that leads to "R" +#' - `uti`\cr A [logical] value (`TRUE`/`FALSE`) to indicate whether the rule applies to a urinary tract infection (UTI) +#' - `is_SDD`\cr A [logical] value (`TRUE`/`FALSE`) to indicate whether the intermediate range between "S" and "R" should be interpreted as "SDD", instead of "I". This currently applies to `r sum(clinical_breakpoints$is_SDD)` breakpoints. +#' @details +#' ### Different types of breakpoints +#' Supported types of breakpoints are `r vector_and(clinical_breakpoints$type, quote = FALSE)`. ECOFF (Epidemiological cut-off) values are used in antimicrobial susceptibility testing to differentiate between wild-type and non-wild-type strains of bacteria or fungi. +#' +#' The default is `"human"`, which can also be set with the package option [`AMR_breakpoint_type`][AMR-options]. Use [`as.sir(..., breakpoint_type = ...)`][as.sir()] to interpret raw data using a specific breakpoint type, e.g. `as.sir(..., breakpoint_type = "ECOFF")` to use ECOFFs. +#' +#' ### Imported from WHONET +#' Clinical breakpoints in this package were validated through and imported from [WHONET](https://whonet.org), a free desktop Windows application developed and supported by the WHO Collaborating Centre for Surveillance of Antimicrobial Resistance. More can be read on [their website](https://whonet.org). The developers of WHONET and this `AMR` package have been in contact about sharing their work. We highly appreciate their great development on the WHONET software. +#' +#' ### Response from CLSI and EUCAST +#' The CEO of CLSI and the chairman of EUCAST have endorsed the work and public use of this `AMR` package (and consequently the use of their breakpoints) in June 2023, when future development of distributing clinical breakpoints was discussed in a meeting between CLSI, EUCAST, WHO, developers of WHONET software, and developers of this `AMR` package. +#' +#' ### Download +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). They allow for machine reading EUCAST and CLSI guidelines, which is almost impossible with the MS Excel and PDF files distributed by EUCAST and CLSI, though initiatives have started to overcome these burdens. +#' +#' **NOTE:** this `AMR` package (and the WHONET software as well) contains rather complex internal methods to apply the guidelines. For example, some breakpoints must be applied on certain species groups (which are in case of this package available through the [microorganisms.groups] data set). It is important that this is considered when using the breakpoints for own use. +#' @seealso [intrinsic_resistant] +#' @examples +#' clinical_breakpoints +"clinical_breakpoints" + +#' Data Set with Bacterial Intrinsic Resistance +#' +#' Data set containing defined intrinsic resistance by EUCAST of all bug-drug combinations. +#' @format A [tibble][tibble::tibble] with `r format(nrow(intrinsic_resistant), big.mark = " ")` observations and `r ncol(intrinsic_resistant)` variables: +#' - `mo`\cr Microorganism ID +#' - `ab`\cr Antibiotic ID +#' @details +#' This data set is based on `r format_eucast_version_nr(3.3)`. +#' +#' ### Direct download +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' +#' They **allow for machine reading EUCAST and CLSI guidelines**, which is almost impossible with the MS Excel and PDF files distributed by EUCAST and CLSI. +#' @examples +#' intrinsic_resistant +"intrinsic_resistant" + +#' Data Set with Treatment Dosages as Defined by EUCAST +#' +#' EUCAST breakpoints used in this package are based on the dosages in this data set. They can be retrieved with [eucast_dosage()]. +#' @format A [tibble][tibble::tibble] with `r format(nrow(dosage), big.mark = " ")` observations and `r ncol(dosage)` variables: +#' - `ab`\cr Antibiotic ID as used in this package (such as `AMC`), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available +#' - `name`\cr Official name of the antimicrobial drug as used by WHONET/EARS-Net or the WHO +#' - `type`\cr Type of the dosage, either `r vector_or(dosage$type)` +#' - `dose`\cr Dose, such as "2 g" or "25 mg/kg" +#' - `dose_times`\cr Number of times a dose must be administered +#' - `administration`\cr Route of administration, either `r vector_or(dosage$administration)` +#' - `notes`\cr Additional dosage notes +#' - `original_txt`\cr Original text in the PDF file of EUCAST +#' - `eucast_version`\cr Version number of the EUCAST Clinical Breakpoints guideline to which these dosages apply, either `r vector_or(dosage$eucast_version, quotes = FALSE, sort = TRUE, reverse = TRUE)` +#' @details +#' Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit [our website for the download links](https://msberends.github.io/AMR/articles/datasets.html). The actual files are of course available on [our GitHub repository](https://github.com/msberends/AMR/tree/main/data-raw). +#' @examples +#' dosage +"dosage" + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/disk.R + + + + +#' Transform Input to Disk Diffusion Diameters +#' +#' This transforms a vector to a new class [`disk`], which is a disk diffusion growth zone size (around an antibiotic disk) in millimetres between 0 and 50. +#' @rdname as.disk +#' @param x vector +#' @param na.rm a [logical] indicating whether missing values should be removed +#' @details Interpret disk values as SIR values with [as.sir()]. It supports guidelines from EUCAST and CLSI. +#' +#' Disk diffusion growth zone sizes must be between 0 and 50 millimetres. Values higher than 50 but lower than 100 will be maximised to 50. All others input values outside the 0-50 range will return `NA`. +#' @return An [integer] with additional class [`disk`] +#' @aliases disk +#' @export +#' @seealso [as.sir()] +#' @examples +#' # transform existing disk zones to the `disk` class (using base R) +#' df <- data.frame( +#' microorganism = "Escherichia coli", +#' AMP = 20, +#' CIP = 14, +#' GEN = 18, +#' TOB = 16 +#' ) +#' df[, 2:5] <- lapply(df[, 2:5], as.disk) +#' str(df) +#' +#' \donttest{ +#' # transforming is easier with dplyr: +#' if (require("dplyr")) { +#' df %>% mutate(across(AMP:TOB, as.disk)) +#' } +#' } +#' +#' # interpret disk values, see ?as.sir +#' as.sir( +#' x = as.disk(18), +#' mo = "Strep pneu", # `mo` will be coerced with as.mo() +#' ab = "ampicillin", # and `ab` with as.ab() +#' guideline = "EUCAST" +#' ) +#' +#' # interpret whole data set, pretend to be all from urinary tract infections: +#' as.sir(df, uti = TRUE) +as.disk <- function(x, na.rm = FALSE) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + + if (!is.disk(x)) { + x <- unlist(x) + if (isTRUE(na.rm)) { + x <- x[!is.na(x)] + } + x[trimws2(x) == ""] <- NA + x.bak <- x + + na_before <- length(x[is.na(x)]) + + # heavily based on cleaner::clean_double(): + clean_double2 <- function(x, remove = "[^0-9.,-]", fixed = FALSE) { + x <- gsub(",", ".", x, fixed = TRUE) + # remove ending dot/comma + x <- gsub("[,.]$", "", x) + # only keep last dot/comma + reverse <- function(x) vapply(FUN.VALUE = character(1), lapply(strsplit(x, NULL), rev), paste, collapse = "") + x <- sub("{{dot}}", ".", + gsub(".", "", + reverse(sub(".", "}}tod{{", + reverse(x), + fixed = TRUE + )), + fixed = TRUE + ), + fixed = TRUE + ) + x_clean <- gsub(remove, "", x, ignore.case = TRUE, fixed = fixed) + # remove everything that is not a number or dot + as.double(gsub("[^0-9.]+", "", x_clean)) + } + + # round up and make it an integer + x <- as.integer(ceiling(clean_double2(x))) + + # disks can never be less than 0 mm or more than 50 mm + x[x < 0 | x > 99] <- NA_integer_ + x[x > 50] <- 50L + na_after <- length(x[is.na(x)]) + + if (na_before != na_after) { + list_missing <- x.bak[is.na(x) & !is.na(x.bak)] %pm>% + unique() %pm>% + sort() %pm>% + vector_and(quotes = TRUE) + cur_col <- get_current_column() + warning_("in `as.disk()`: ", na_after - na_before, " result", + ifelse(na_after - na_before > 1, "s", ""), + ifelse(is.null(cur_col), "", paste0(" in column '", cur_col, "'")), + " truncated (", + round(((na_after - na_before) / length(x)) * 100), + "%) that were invalid disk zones: ", + list_missing, + call = FALSE + ) + } + } + set_clean_class(as.integer(x), + new_class = c("disk", "integer") + ) +} + +all_valid_disks <- function(x) { + if (!inherits(x, c("disk", "character", "numeric", "integer"))) { + return(FALSE) + } + x_disk <- tryCatch(suppressWarnings(as.disk(x[!is.na(x)])), + error = function(e) NA + ) + !anyNA(x_disk) && !all(is.na(x)) +} + +#' @rdname as.disk +#' @details `NA_disk_` is a missing value of the new `disk` class. +#' @export +NA_disk_ <- set_clean_class(as.integer(NA_real_), + new_class = c("disk", "integer") +) + +#' @rdname as.disk +#' @export +is.disk <- function(x) { + inherits(x, "disk") +} + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.disk <- function(x, ...) { + out <- trimws(format(x)) + out[is.na(x)] <- font_na(NA) + create_pillar_column(out, align = "right", width = 2) +} + +#' @method print disk +#' @export +#' @noRd +print.disk <- function(x, ...) { + cat("Class 'disk'\n") + print(as.integer(x), quote = FALSE) +} + +#' @method [ disk +#' @export +#' @noRd +"[.disk" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [[ disk +#' @export +#' @noRd +"[[.disk" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [<- disk +#' @export +#' @noRd +"[<-.disk" <- function(i, j, ..., value) { + value <- as.disk(value) + y <- NextMethod() + attributes(y) <- attributes(i) + y +} +#' @method [[<- disk +#' @export +#' @noRd +"[[<-.disk" <- function(i, j, ..., value) { + value <- as.disk(value) + y <- NextMethod() + attributes(y) <- attributes(i) + y +} +#' @method c disk +#' @export +#' @noRd +c.disk <- function(...) { + as.disk(unlist(lapply(list(...), as.character))) +} + +#' @method unique disk +#' @export +#' @noRd +unique.disk <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method rep disk +#' @export +#' @noRd +rep.disk <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +# will be exported using s3_register() in R/zzz.R +get_skimmers.disk <- function(column) { + skimr::sfl( + skim_type = "disk", + min = ~ min(as.double(.), na.rm = TRUE), + max = ~ max(as.double(.), na.rm = TRUE), + median = ~ stats::median(as.double(.), na.rm = TRUE), + n_unique = ~ length(unique(stats::na.omit(.))), + hist = ~ skimr::inline_hist(stats::na.omit(as.double(.))) + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/eucast_rules.R + + + + + +format_eucast_version_nr <- function(version, markdown = TRUE) { + # for documentation - adds title, version number, year and url in markdown language + lst <- c(EUCAST_VERSION_BREAKPOINTS, EUCAST_VERSION_EXPERT_RULES) + version <- format(unique(version), nsmall = 1) + txt <- character(0) + for (i in seq_len(length(version))) { + v <- version[i] + if (markdown == TRUE) { + txt <- c(txt, paste0( + "[", lst[[v]]$title, " ", lst[[v]]$version_txt, "](", lst[[v]]$url, ")", + " (", lst[[v]]$year, ")" + )) + } else { + txt <- c(txt, paste0( + lst[[version]]$title, " ", lst[[v]]$version_txt, + " (", lst[[v]]$year, ")" + )) + } + } + vector_and(txt, quotes = FALSE) +} + +#' Apply EUCAST Rules +#' +#' @description +#' Apply rules for clinical breakpoints and intrinsic resistance as defined by the European Committee on Antimicrobial Susceptibility Testing (EUCAST, ), see *Source*. Use [eucast_dosage()] to get a [data.frame] with advised dosages of a certain bug-drug combination, which is based on the [dosage] data set. +#' +#' To improve the interpretation of the antibiogram before EUCAST rules are applied, some non-EUCAST rules can applied at default, see *Details*. +#' @param x a data set with antibiotic columns, such as `amox`, `AMX` and `AMC` +#' @param info a [logical] to indicate whether progress should be printed to the console - the default is only print while in interactive sessions +#' @param rules a [character] vector that specifies which rules should be applied. Must be one or more of `"breakpoints"`, `"expert"`, `"other"`, `"custom"`, `"all"`, and defaults to `c("breakpoints", "expert")`. The default value can be set to another value using the package option [`AMR_eucastrules`][AMR-options]: `options(AMR_eucastrules = "all")`. If using `"custom"`, be sure to fill in argument `custom_rules` too. Custom rules can be created with [custom_eucast_rules()]. +#' @param verbose a [logical] to turn Verbose mode on and off (default is off). In Verbose mode, the function does not apply rules to the data, but instead returns a data set in logbook form with extensive info about which rows and columns would be effected and in which way. Using Verbose mode takes a lot more time. +#' @param version_breakpoints the version number to use for the EUCAST Clinical Breakpoints guideline. Can be `r vector_or(names(EUCAST_VERSION_BREAKPOINTS), reverse = TRUE)`. +#' @param version_expertrules the version number to use for the EUCAST Expert Rules and Intrinsic Resistance guideline. Can be `r vector_or(names(EUCAST_VERSION_EXPERT_RULES), reverse = TRUE)`. +# @param version_resistant_phenotypes the version number to use for the EUCAST Expected Resistant Phenotypes. Can be `r vector_or(names(EUCAST_VERSION_RESISTANTPHENOTYPES), reverse = TRUE)`. +#' @param ampc_cephalosporin_resistance a [character] value that should be applied to cefotaxime, ceftriaxone and ceftazidime for AmpC de-repressed cephalosporin-resistant mutants - the default is `NA`. Currently only works when `version_expertrules` is `3.2` and higher; these version of '*EUCAST Expert Rules on Enterobacterales*' state that results of cefotaxime, ceftriaxone and ceftazidime should be reported with a note, or results should be suppressed (emptied) for these three drugs. A value of `NA` (the default) for this argument will remove results for these three drugs, while e.g. a value of `"R"` will make the results for these drugs resistant. Use `NULL` or `FALSE` to not alter results for these three drugs of AmpC de-repressed cephalosporin-resistant mutants. Using `TRUE` is equal to using `"R"`. \cr For *EUCAST Expert Rules* v3.2, this rule applies to: `r vector_and(gsub("[^a-zA-Z ]+", "", unlist(strsplit(EUCAST_RULES_DF[which(EUCAST_RULES_DF$reference.version %in% c(3.2, 3.3) & EUCAST_RULES_DF$reference.rule %like% "ampc"), "this_value"][1], "|", fixed = TRUE))), quotes = "*")`. +#' @param ... column name of an antibiotic, see section *Antibiotics* below +#' @param ab any (vector of) text that can be coerced to a valid antibiotic drug code with [as.ab()] +#' @param administration route of administration, either `r vector_or(dosage$administration)` +#' @param only_sir_columns a [logical] to indicate whether only antibiotic columns must be detected that were transformed to class `sir` (see [as.sir()]) on beforehand (default is `FALSE`) +#' @param custom_rules custom rules to apply, created with [custom_eucast_rules()] +#' @inheritParams first_isolate +#' @details +#' **Note:** This function does not translate MIC values to SIR values. Use [as.sir()] for that. \cr +#' **Note:** When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr +#' +#' The file containing all EUCAST rules is located here: . **Note:** Old taxonomic names are replaced with the current taxonomy where applicable. For example, *Ochrobactrum anthropi* was renamed to *Brucella anthropi* in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The `AMR` package contains the full microbial taxonomy updated until `r documentation_date(max(TAXONOMY_VERSION$GBIF$accessed_date, TAXONOMY_VERSION$LPSN$accessed_date))`, see [microorganisms]. +#' +#' ### Custom Rules +#' +#' Custom rules can be created using [custom_eucast_rules()], e.g.: +#' +#' ```r +#' x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", +#' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I") +#' +#' eucast_rules(example_isolates, rules = "custom", custom_rules = x) +#' ``` +#' +#' ### 'Other' Rules +#' +#' Before further processing, two non-EUCAST rules about drug combinations can be applied to improve the efficacy of the EUCAST rules, and the reliability of your data (analysis). These rules are: +#' +#' 1. A drug **with** enzyme inhibitor will be set to S if the same drug **without** enzyme inhibitor is S +#' 2. A drug **without** enzyme inhibitor will be set to R if the same drug **with** enzyme inhibitor is R +#' +#' Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set. +#' +#' Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include `"other"` to the `rules` argument, or use `eucast_rules(..., rules = "all")`. You can also set the package option [`AMR_eucastrules`][AMR-options], i.e. run `options(AMR_eucastrules = "all")`. +#' @section Antibiotics: +#' To define antibiotics column names, leave as it is to determine it automatically with [guess_ab_col()] or input a text (case-insensitive), or use `NULL` to skip a column (e.g. `TIC = NULL` to skip ticarcillin). Manually defined but non-existing columns will be skipped with a warning. +#' +#' The following antibiotics are eligible for the functions [eucast_rules()] and [mdro()]. These are shown below in the format 'name (`antimicrobial ID`, [ATC code](https://atcddd.fhi.no/atc/structure_and_principles/))', sorted alphabetically: +#' +#' `r create_eucast_ab_documentation()` +#' @aliases EUCAST +#' @rdname eucast_rules +#' @export +#' @return The input of `x`, possibly with edited values of antibiotics. Or, if `verbose = TRUE`, a [data.frame] with all original and new values of the affected bug-drug combinations. +#' @source +#' - EUCAST Expert Rules. Version 2.0, 2012.\cr +#' Leclercq et al. **EUCAST expert rules in antimicrobial susceptibility testing.** *Clin Microbiol Infect.* 2013;19(2):141-60; \doi{https://doi.org/10.1111/j.1469-0691.2011.03703.x} +#' - EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes Tables. Version 3.1, 2016. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf) +#' - EUCAST Intrinsic Resistance and Unusual Phenotypes. Version 3.2, 2020. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf) +#' - EUCAST Intrinsic Resistance and Unusual Phenotypes. Version 3.3, 2021. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2021/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.3_20211018.pdf) +#' - EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 9.0, 2019. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_9.0_Breakpoint_Tables.xlsx) +#' - EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 10.0, 2020. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_10.0_Breakpoint_Tables.xlsx) +#' - EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 11.0, 2021. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_11.0_Breakpoint_Tables.xlsx) +#' - EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 12.0, 2022. [(link)](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_12.0_Breakpoint_Tables.xlsx) +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' \donttest{ +#' a <- data.frame( +#' mo = c( +#' "Staphylococcus aureus", +#' "Enterococcus faecalis", +#' "Escherichia coli", +#' "Klebsiella pneumoniae", +#' "Pseudomonas aeruginosa" +#' ), +#' VAN = "-", # Vancomycin +#' AMX = "-", # Amoxicillin +#' COL = "-", # Colistin +#' CAZ = "-", # Ceftazidime +#' CXM = "-", # Cefuroxime +#' PEN = "S", # Benzylpenicillin +#' FOX = "S", # Cefoxitin +#' stringsAsFactors = FALSE +#' ) +#' +#' head(a) +#' +#' +#' # apply EUCAST rules: some results wil be changed +#' b <- eucast_rules(a) +#' +#' head(b) +#' +#' +#' # do not apply EUCAST rules, but rather get a data.frame +#' # containing all details about the transformations: +#' c <- eucast_rules(a, verbose = TRUE) +#' head(c) +#' } +#' +#' # Dosage guidelines: +#' +#' eucast_dosage(c("tobra", "genta", "cipro"), "iv") +#' +#' eucast_dosage(c("tobra", "genta", "cipro"), "iv", version_breakpoints = 10) +eucast_rules <- function(x, + col_mo = NULL, + info = interactive(), + rules = getOption("AMR_eucastrules", default = c("breakpoints", "expert")), + verbose = FALSE, + version_breakpoints = 12.0, + version_expertrules = 3.3, + # TODO version_resistant_phenotypes = 1.2, + ampc_cephalosporin_resistance = NA, + only_sir_columns = FALSE, + custom_rules = NULL, + ...) { + meet_criteria(x, allow_class = "data.frame") + meet_criteria(col_mo, allow_class = "character", has_length = 1, is_in = colnames(x), allow_NULL = TRUE) + meet_criteria(info, allow_class = "logical", has_length = 1) + meet_criteria(rules, allow_class = "character", has_length = c(1, 2, 3, 4, 5), is_in = c("breakpoints", "expert", "other", "all", "custom")) + meet_criteria(verbose, allow_class = "logical", has_length = 1) + meet_criteria(version_breakpoints, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_BREAKPOINTS))) + meet_criteria(version_expertrules, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_EXPERT_RULES))) + # meet_criteria(version_resistant_phenotypes, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_RESISTANTPHENOTYPES))) + meet_criteria(ampc_cephalosporin_resistance, allow_class = c("logical", "character", "sir"), has_length = 1, allow_NA = TRUE, allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(custom_rules, allow_class = "custom_eucast_rules", allow_NULL = TRUE) + + add_MO_lookup_to_AMR_env() + + if ("custom" %in% rules && is.null(custom_rules)) { + warning_("in `eucast_rules()`: no custom rules were set with the `custom_rules` argument", + immediate = TRUE + ) + rules <- rules[rules != "custom"] + if (length(rules) == 0) { + if (isTRUE(info)) { + message_("No other rules were set, returning original data", add_fn = font_red, as_note = FALSE) + } + return(x) + } + } + + x_deparsed <- deparse(substitute(x)) + if (length(x_deparsed) > 1 || any(x_deparsed %unlike% "[a-z]+")) { + x_deparsed <- "your_data" + } + + breakpoints_info <- EUCAST_VERSION_BREAKPOINTS[[which(as.double(names(EUCAST_VERSION_BREAKPOINTS)) == version_breakpoints)]] + expertrules_info <- EUCAST_VERSION_EXPERT_RULES[[which(as.double(names(EUCAST_VERSION_EXPERT_RULES)) == version_expertrules)]] + # resistantphenotypes_info <- EUCAST_VERSION_RESISTANTPHENOTYPES[[which(as.double(names(EUCAST_VERSION_RESISTANTPHENOTYPES)) == version_resistant_phenotypes)]] + + # support old setting (until AMR v1.3.0) + if (missing(rules) && !is.null(getOption("AMR.eucast_rules"))) { + rules <- getOption("AMR.eucast_rules") + } + + if (interactive() && isTRUE(verbose) && isTRUE(info)) { + txt <- paste0( + "WARNING: In Verbose mode, the eucast_rules() function does not apply rules to the data, but instead returns a data set in logbook form with extensive info about which rows and columns would be effected and in which way.", + "\n\nThis may overwrite your existing data if you use e.g.:", + "\ndata <- eucast_rules(data, verbose = TRUE)\n\nDo you want to continue?" + ) + showQuestion <- import_fn("showQuestion", "rstudioapi", error_on_fail = FALSE) + if (!is.null(showQuestion)) { + q_continue <- showQuestion("Using verbose = TRUE with eucast_rules()", txt) + } else { + q_continue <- utils::menu(choices = c("OK", "Cancel"), graphics = FALSE, title = txt) + } + if (q_continue %in% c(FALSE, 2)) { + message_("Cancelled, returning original data", add_fn = font_red, as_note = FALSE) + return(x) + } + } + + # try to find columns based on type + # -- mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = info) + stop_if(is.null(col_mo), "`col_mo` must be set") + } + + decimal.mark <- getOption("OutDec") + big.mark <- ifelse(decimal.mark != ",", ",", " ") + formatnr <- function(x, big = big.mark, dec = decimal.mark) { + trimws(format(x, big.mark = big, decimal.mark = dec)) + } + + warned <- FALSE + warn_lacking_sir_class <- character(0) + txt_ok <- function(n_added, n_changed, warned = FALSE) { + if (warned == FALSE) { + if (n_added + n_changed == 0) { + cat(font_subtle(" (no changes)\n")) + } else { + # opening + if (n_added > 0 && n_changed == 0) { + cat(font_green(" (")) + } else if (n_added == 0 && n_changed > 0) { + cat(font_blue(" (")) + } else { + cat(font_grey(" (")) + } + # additions + if (n_added > 0) { + if (n_added == 1) { + cat(font_green("1 value added")) + } else { + cat(font_green(formatnr(n_added), "values added")) + } + } + # separator + if (n_added > 0 && n_changed > 0) { + cat(font_grey(", ")) + } + # changes + if (n_changed > 0) { + if (n_changed == 1) { + cat(font_blue("1 value changed")) + } else { + cat(font_blue(formatnr(n_changed), "values changed")) + } + } + # closing + if (n_added > 0 && n_changed == 0) { + cat(font_green(")\n")) + } else if (n_added == 0 && n_changed > 0) { + cat(font_blue(")\n")) + } else { + cat(font_grey(")\n")) + } + } + warned <<- FALSE + } + } + + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c( + "AMC", + "AMP", + "AMX", + "CIP", + "ERY", + "FOX1", + "GEN", + "MFX", + "NAL", + "NOR", + "PEN", + "PIP", + "TCY", + "TIC", + "TOB" + ), + hard_dependencies = NULL, + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "eucast_rules", + ... + ) + + if (!"AMP" %in% names(cols_ab) && "AMX" %in% names(cols_ab)) { + # ampicillin column is missing, but amoxicillin is available + if (isTRUE(info)) { + message_("Using column '", cols_ab[names(cols_ab) == "AMX"], "' as input for ampicillin since many EUCAST rules depend on it.") + } + cols_ab <- c(cols_ab, c(AMP = unname(cols_ab[names(cols_ab) == "AMX"]))) + } + + # data preparation ---- + if (isTRUE(info) && NROW(x) > 10000) { + message_("Preparing data...", appendLF = FALSE, as_note = FALSE) + } + + # Some helper functions --------------------------------------------------- + get_antibiotic_names <- function(x) { + x <- x %pm>% + strsplit(",") %pm>% + unlist() %pm>% + trimws2() %pm>% + vapply(FUN.VALUE = character(1), function(x) if (x %in% AMR::antibiotics$ab) ab_name(x, language = NULL, tolower = TRUE, fast_mode = TRUE) else x) %pm>% + sort() %pm>% + paste(collapse = ", ") + x <- gsub("_", " ", x, fixed = TRUE) + x <- gsub("except CAZ", paste("except", ab_name("CAZ", language = NULL, tolower = TRUE)), x, fixed = TRUE) + x <- gsub("except TGC", paste("except", ab_name("TGC", language = NULL, tolower = TRUE)), x, fixed = TRUE) + x <- gsub("cephalosporins (1st|2nd|3rd|4th|5th)", "cephalosporins (\\1 gen.)", x) + x + } + format_antibiotic_names <- function(ab_names, ab_results) { + ab_names <- trimws2(unlist(strsplit(ab_names, ","))) + ab_results <- trimws2(unlist(strsplit(ab_results, ","))) + if (length(ab_results) == 1) { + if (length(ab_names) == 1) { + # like FOX S + x <- paste(ab_names, "is") + } else if (length(ab_names) == 2) { + # like PEN,FOX S + x <- paste(paste0(ab_names, collapse = " and "), "are both") + } else { + # like PEN,FOX,GEN S (although dependency on > 2 ABx does not exist at the moment) + # nolint start + # x <- paste(paste0(ab_names, collapse = " and "), "are all") + # nolint end + } + return(paste0(x, " '", ab_results, "'")) + } else { + if (length(ab_names) == 2) { + # like PEN,FOX S,R + paste0( + ab_names[1], " is '", ab_results[1], "' and ", + ab_names[2], " is '", ab_results[2], "'" + ) + } else { + # like PEN,FOX,GEN S,R,R (although dependency on > 2 ABx does not exist at the moment) + paste0( + ab_names[1], " is '", ab_results[1], "' and ", + ab_names[2], " is '", ab_results[2], "' and ", + ab_names[3], " is '", ab_results[3], "'" + ) + } + } + } + as.sir_no_warning <- function(x) { + if (is.sir(x)) { + return(x) + } + suppressWarnings(as.sir(x)) + } + + # Preparing the data ------------------------------------------------------ + + verbose_info <- data.frame( + rowid = character(0), + col = character(0), + mo_fullname = character(0), + old = as.sir(character(0)), + new = as.sir(character(0)), + rule = character(0), + rule_group = character(0), + rule_name = character(0), + rule_source = character(0), + stringsAsFactors = FALSE + ) + + old_cols <- colnames(x) + old_attributes <- attributes(x) + x <- as.data.frame(x, stringsAsFactors = FALSE) # no tibbles, data.tables, etc. + rownames(x) <- NULL # will later be restored with old_attributes + # create unique row IDs - combination of the MO and all ABx columns (so they will only run once per unique combination) + x$`.rowid` <- vapply( + FUN.VALUE = character(1), + as.list(as.data.frame(t(x[, c(col_mo, cols_ab), drop = FALSE]), + stringsAsFactors = FALSE + )), + function(x) { + x[is.na(x)] <- "." + paste0(x, collapse = "") + } + ) + + # save original table, with the new .rowid column + x.bak <- x + # keep only unique rows for MO and ABx + x <- x %pm>% + pm_arrange(`.rowid`) %pm>% + # big speed gain! only analyse unique rows: + pm_distinct(`.rowid`, .keep_all = TRUE) %pm>% + as.data.frame(stringsAsFactors = FALSE) + x[, col_mo] <- as.mo(as.character(x[, col_mo, drop = TRUE]), info = info) + # rename col_mo to prevent interference with joined columns + colnames(x)[colnames(x) == col_mo] <- ".col_mo" + col_mo <- ".col_mo" + # join to microorganisms data set + x <- left_join_microorganisms(x, by = col_mo, suffix = c("_oldcols", "")) + x$gramstain <- mo_gramstain(x[, col_mo, drop = TRUE], language = NULL, info = FALSE) + x$genus_species <- trimws(paste(x$genus, x$species)) + if (isTRUE(info) && NROW(x) > 10000) { + message_(" OK.", add_fn = list(font_green, font_bold), as_note = FALSE) + } + + if (any(x$genus == "Staphylococcus", na.rm = TRUE)) { + all_staph <- AMR_env$MO_lookup[which(AMR_env$MO_lookup$genus == "Staphylococcus"), , drop = FALSE] + all_staph$CNS_CPS <- suppressWarnings(mo_name(all_staph$mo, Becker = "all", language = NULL, info = FALSE)) + } + if (any(x$genus == "Streptococcus", na.rm = TRUE)) { + all_strep <- AMR_env$MO_lookup[which(AMR_env$MO_lookup$genus == "Streptococcus"), , drop = FALSE] + all_strep$Lancefield <- suppressWarnings(mo_name(all_strep$mo, Lancefield = TRUE, language = NULL, info = FALSE)) + } + + n_added <- 0 + n_changed <- 0 + + # Other rules: enzyme inhibitors ------------------------------------------ + if (any(c("all", "other") %in% rules)) { + if (isTRUE(info)) { + cat("\n") + cat(word_wrap( + font_bold(paste0( + "Rules by this AMR package (", + font_red(paste0( + "v", utils::packageDescription("AMR")$Version, ", ", + format(as.Date(utils::packageDescription("AMR")$Date), format = "%Y") + )), "), see `?eucast_rules`\n" + )) + )) + } + ab_enzyme <- subset(AMR::antibiotics, name %like% "/")[, c("ab", "name"), drop = FALSE] + colnames(ab_enzyme) <- c("enzyme_ab", "enzyme_name") + ab_enzyme$base_name <- gsub("^([a-zA-Z0-9]+).*", "\\1", ab_enzyme$enzyme_name) + ab_enzyme$base_ab <- AMR::antibiotics[match(ab_enzyme$base_name, AMR::antibiotics$name), "ab", drop = TRUE] + ab_enzyme <- subset(ab_enzyme, !is.na(base_ab)) + # make ampicillin and amoxicillin interchangable + ampi <- subset(ab_enzyme, base_ab == "AMX") + ampi$base_ab <- "AMP" + ampi$base_name <- ab_name("AMP", language = NULL) + amox <- subset(ab_enzyme, base_ab == "AMP") + amox$base_ab <- "AMX" + amox$base_name <- ab_name("AMX", language = NULL) + # merge and sort + ab_enzyme <- rbind_AMR(ab_enzyme, ampi, amox) + ab_enzyme <- ab_enzyme[order(ab_enzyme$enzyme_name), , drop = FALSE] + + for (i in seq_len(nrow(ab_enzyme))) { + # check if both base and base + enzyme inhibitor are part of the data set + if (all(c(ab_enzyme$base_ab[i], ab_enzyme$enzyme_ab[i]) %in% names(cols_ab), na.rm = TRUE)) { + col_base <- unname(cols_ab[ab_enzyme$base_ab[i]]) + col_enzyme <- unname(cols_ab[ab_enzyme$enzyme_ab[i]]) + + # Set base to R where base + enzyme inhibitor is R ---- + rule_current <- paste0( + ab_enzyme$base_name[i], " ('", font_bold(col_base), "') = R if ", + tolower(ab_enzyme$enzyme_name[i]), " ('", font_bold(col_enzyme), "') = R" + ) + if (isTRUE(info)) { + cat(word_wrap(rule_current, + width = getOption("width") - 30, + extra_indent = 6 + )) + } + run_changes <- edit_sir( + x = x, + to = "R", + rule = c( + rule_current, "Other rules", "", + paste0("Non-EUCAST: AMR package v", utils::packageDescription("AMR")$Version) + ), + rows = which(as.sir_no_warning(x[, col_enzyme, drop = TRUE]) == "R"), + cols = col_base, + last_verbose_info = verbose_info, + original_data = x.bak, + warned = warned, + info = info, + verbose = verbose + ) + n_added <- n_added + run_changes$added + n_changed <- n_changed + run_changes$changed + verbose_info <- run_changes$verbose_info + x <- run_changes$output + warn_lacking_sir_class <- c(warn_lacking_sir_class, run_changes$sir_warn) + # Print number of new changes + if (isTRUE(info)) { + # print only on last one of rules in this group + txt_ok(n_added = n_added, n_changed = n_changed, warned = warned) + # and reset counters + n_added <- 0 + n_changed <- 0 + } + + # Set base + enzyme inhibitor to S where base is S ---- + rule_current <- paste0( + ab_enzyme$enzyme_name[i], " ('", font_bold(col_enzyme), "') = S if ", + tolower(ab_enzyme$base_name[i]), " ('", font_bold(col_base), "') = S" + ) + + if (isTRUE(info)) { + cat(word_wrap(rule_current, + width = getOption("width") - 30, + extra_indent = 6 + )) + } + run_changes <- edit_sir( + x = x, + to = "S", + rule = c( + rule_current, "Other rules", "", + paste0("Non-EUCAST: AMR package v", utils::packageDescription("AMR")$Version) + ), + rows = which(as.sir_no_warning(x[, col_base, drop = TRUE]) == "S"), + cols = col_enzyme, + last_verbose_info = verbose_info, + original_data = x.bak, + warned = warned, + info = info, + verbose = verbose + ) + n_added <- n_added + run_changes$added + n_changed <- n_changed + run_changes$changed + verbose_info <- run_changes$verbose_info + x <- run_changes$output + warn_lacking_sir_class <- c(warn_lacking_sir_class, run_changes$sir_warn) + # Print number of new changes + if (isTRUE(info)) { + # print only on last one of rules in this group + txt_ok(n_added = n_added, n_changed = n_changed, warned = warned) + # and reset counters + n_added <- 0 + n_changed <- 0 + } + } + } + } else { + if (isTRUE(info)) { + cat("\n") + message_("Skipping inheritance rules defined by this AMR package, such as setting trimethoprim (TMP) = R where trimethoprim/sulfamethoxazole (SXT) = R. Add \"other\" or \"all\" to the `rules` argument to apply those rules.") + } + } + + if (!any(c("all", "custom") %in% rules) && !is.null(custom_rules)) { + if (isTRUE(info)) { + message_("Skipping custom EUCAST rules, since the `rules` argument does not contain \"custom\".") + } + custom_rules <- NULL + } + + # Official EUCAST rules --------------------------------------------------- + eucast_notification_shown <- FALSE + if (!is.null(list(...)$eucast_rules_df)) { + # this allows: eucast_rules(x, eucast_rules_df = AMR:::EUCAST_RULES_DF %>% filter(is.na(have_these_values))) + eucast_rules_df <- list(...)$eucast_rules_df + } else { + # otherwise internal data file, created in data-raw/_pre_commit_checks.R + eucast_rules_df <- EUCAST_RULES_DF + } + + # filter on user-set guideline versions ---- + if (any(c("all", "breakpoints") %in% rules)) { + eucast_rules_df <- subset( + eucast_rules_df, + reference.rule_group %unlike% "breakpoint" | + (reference.rule_group %like% "breakpoint" & reference.version == version_breakpoints) + ) + } + if (any(c("all", "expert") %in% rules)) { + eucast_rules_df <- subset( + eucast_rules_df, + reference.rule_group %unlike% "expert" | + (reference.rule_group %like% "expert" & reference.version == version_expertrules) + ) + } + # filter out AmpC de-repressed cephalosporin-resistant mutants ---- + # no need to filter on version number here - the rules contain these version number, so are inherently filtered + # cefotaxime, ceftriaxone, ceftazidime + if (is.null(ampc_cephalosporin_resistance) || isFALSE(ampc_cephalosporin_resistance)) { + eucast_rules_df <- subset( + eucast_rules_df, + reference.rule %unlike% "ampc" + ) + } else { + if (isTRUE(ampc_cephalosporin_resistance)) { + ampc_cephalosporin_resistance <- "R" + } + eucast_rules_df[which(eucast_rules_df$reference.rule %like% "ampc"), "to_value"] <- as.character(ampc_cephalosporin_resistance) + } + + # Go over all rules and apply them ---- + for (i in seq_len(nrow(eucast_rules_df))) { + rule_previous <- eucast_rules_df[max(1, i - 1), "reference.rule", drop = TRUE] + rule_current <- eucast_rules_df[i, "reference.rule", drop = TRUE] + rule_next <- eucast_rules_df[min(nrow(eucast_rules_df), i + 1), "reference.rule", drop = TRUE] + rule_group_previous <- eucast_rules_df[max(1, i - 1), "reference.rule_group", drop = TRUE] + rule_group_current <- eucast_rules_df[i, "reference.rule_group", drop = TRUE] + # don't apply rules if user doesn't want to apply them + if (rule_group_current %like% "breakpoint" && !any(c("all", "breakpoints") %in% rules)) { + next + } + if (rule_group_current %like% "expert" && !any(c("all", "expert") %in% rules)) { + next + } + + if (isFALSE(info) || isFALSE(verbose)) { + rule_text <- "" + } else { + if (is.na(eucast_rules_df[i, "and_these_antibiotics", drop = TRUE])) { + rule_text <- paste0("always report as '", eucast_rules_df[i, "to_value", drop = TRUE], "': ", get_antibiotic_names(eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE])) + } else { + rule_text <- paste0( + "report as '", eucast_rules_df[i, "to_value", drop = TRUE], "' when ", + format_antibiotic_names( + ab_names = get_antibiotic_names(eucast_rules_df[i, "and_these_antibiotics", drop = TRUE]), + ab_results = eucast_rules_df[i, "have_these_values", drop = TRUE] + ), ": ", + get_antibiotic_names(eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE]) + ) + } + } + if (i == 1) { + rule_previous <- "" + rule_group_previous <- "" + } + if (i == nrow(eucast_rules_df)) { + rule_next <- "" + } + + if (isTRUE(info)) { + # Print EUCAST intro ------------------------------------------------------ + if (rule_group_current %unlike% "other" && eucast_notification_shown == FALSE) { + cat( + paste0( + "\n", font_grey(strrep("-", 0.95 * getOption("width", 100))), "\n", + word_wrap("Rules by the ", font_bold("European Committee on Antimicrobial Susceptibility Testing (EUCAST)")), "\n", + font_blue("https://eucast.org/"), "\n" + ) + ) + eucast_notification_shown <- TRUE + } + + # Print rule (group) ------------------------------------------------------ + if (rule_group_current != rule_group_previous) { + # is new rule group, one of Breakpoints, Expert Rules and Other + cat(font_bold( + ifelse( + rule_group_current %like% "breakpoint", + paste0( + "\n", + word_wrap( + breakpoints_info$title, " (", + font_red(paste0(breakpoints_info$version_txt, ", ", breakpoints_info$year)), ")\n" + ) + ), + ifelse( + rule_group_current %like% "expert", + paste0( + "\n", + word_wrap( + expertrules_info$title, " (", + font_red(paste0(expertrules_info$version_txt, ", ", expertrules_info$year)), ")\n" + ) + ), + "" + ) + ) + ), "\n") + } + # Print rule ------------------------------------------------------------- + if (rule_current != rule_previous) { + # is new rule within group, print its name + cat(italicise_taxonomy( + word_wrap(rule_current, + width = getOption("width") - 30, + extra_indent = 6 + ), + type = "ansi" + )) + warned <- FALSE + } + } + + # Get rule from file ------------------------------------------------------ + if_mo_property <- trimws(eucast_rules_df[i, "if_mo_property", drop = TRUE]) + like_is_one_of <- trimws(eucast_rules_df[i, "like.is.one_of", drop = TRUE]) + mo_value <- trimws(eucast_rules_df[i, "this_value", drop = TRUE]) + + # be sure to comprise all coagulase-negative/-positive staphylococci when they are mentioned + if (mo_value %like% "coagulase" && any(x$genus == "Staphylococcus", na.rm = TRUE)) { + if (mo_value %like% "negative") { + eucast_rules_df[i, "this_value"] <- paste0( + "^(", paste0( + all_staph[which(all_staph$CNS_CPS %like% "negative"), + "fullname", + drop = TRUE + ], + collapse = "|" + ), + ")$" + ) + } else { + eucast_rules_df[i, "this_value"] <- paste0( + "^(", paste0( + all_staph[which(all_staph$CNS_CPS %like% "positive"), + "fullname", + drop = TRUE + ], + collapse = "|" + ), + ")$" + ) + } + like_is_one_of <- "like" + } + # be sure to comprise all beta-haemolytic Streptococci (Lancefield groups A, B, C and G) when they are mentioned + if (mo_value %like% "group [ABCG]" && any(x$genus == "Streptococcus", na.rm = TRUE)) { + eucast_rules_df[i, "this_value"] <- paste0( + "^(", paste0( + all_strep[which(all_strep$Lancefield %like% "group [ABCG]"), + "fullname", + drop = TRUE + ], + collapse = "|" + ), + ")$" + ) + like_is_one_of <- "like" + } + + if (like_is_one_of == "is") { + # so e.g. 'Enterococcus' will turn into '^Enterococcus$' + mo_value <- paste0("^", mo_value, "$") + } else if (like_is_one_of == "one_of") { + # so 'Clostridium, Actinomyces, ...' will turn into '^(Clostridium|Actinomyces|...)$' + mo_value <- paste0( + "^(", + paste(trimws(unlist(strsplit(mo_value, ",", fixed = TRUE))), + collapse = "|" + ), + ")$" + ) + } else if (like_is_one_of != "like") { + stop("invalid value for column 'like.is.one_of'", call. = FALSE) + } + + source_antibiotics <- eucast_rules_df[i, "and_these_antibiotics", drop = TRUE] + source_value <- trimws(unlist(strsplit(eucast_rules_df[i, "have_these_values", drop = TRUE], ",", fixed = TRUE))) + target_antibiotics <- eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE] + target_value <- eucast_rules_df[i, "to_value", drop = TRUE] + + if (is.na(source_antibiotics)) { + rows <- tryCatch(which(x[, if_mo_property, drop = TRUE] %like% mo_value), + error = function(e) integer(0) + ) + } else { + source_antibiotics <- get_ab_from_namespace(source_antibiotics, cols_ab) + if (length(source_value) == 1 && length(source_antibiotics) > 1) { + source_value <- rep(source_value, length(source_antibiotics)) + } + if (length(source_antibiotics) == 0) { + rows <- integer(0) + } else if (length(source_antibiotics) == 1) { + rows <- tryCatch( + which(x[, if_mo_property, drop = TRUE] %like% mo_value & + as.sir_no_warning(x[, source_antibiotics[1L]]) == source_value[1L]), + error = function(e) integer(0) + ) + } else if (length(source_antibiotics) == 2) { + rows <- tryCatch( + which(x[, if_mo_property, drop = TRUE] %like% mo_value & + as.sir_no_warning(x[, source_antibiotics[1L]]) == source_value[1L] & + as.sir_no_warning(x[, source_antibiotics[2L]]) == source_value[2L]), + error = function(e) integer(0) + ) + # nolint start + # } else if (length(source_antibiotics) == 3) { + # rows <- tryCatch(which(x[, if_mo_property, drop = TRUE] %like% mo_value + # & as.sir_no_warning(x[, source_antibiotics[1L]]) == source_value[1L] + # & as.sir_no_warning(x[, source_antibiotics[2L]]) == source_value[2L] + # & as.sir_no_warning(x[, source_antibiotics[3L]]) == source_value[3L]), + # error = function(e) integer(0)) + # nolint end + } else { + stop_("only 2 antibiotics supported for source_antibiotics") + } + } + + cols <- get_ab_from_namespace(target_antibiotics, cols_ab) + + # Apply rule on data ------------------------------------------------------ + # this will return the unique number of changes + run_changes <- edit_sir( + x = x, + to = target_value, + rule = c( + rule_text, rule_group_current, rule_current, + ifelse(rule_group_current %like% "breakpoint", + paste0(breakpoints_info$title, " ", breakpoints_info$version_txt, ", ", breakpoints_info$year), + paste0(expertrules_info$title, " ", expertrules_info$version_txt, ", ", expertrules_info$year) + ) + ), + rows = rows, + cols = cols, + last_verbose_info = verbose_info, + original_data = x.bak, + warned = warned, + info = info, + verbose = verbose + ) + n_added <- n_added + run_changes$added + n_changed <- n_changed + run_changes$changed + verbose_info <- run_changes$verbose_info + x <- run_changes$output + warn_lacking_sir_class <- c(warn_lacking_sir_class, run_changes$sir_warn) + # Print number of new changes --------------------------------------------- + if (isTRUE(info) && rule_next != rule_current) { + # print only on last one of rules in this group + txt_ok(n_added = n_added, n_changed = n_changed, warned = warned) + # and reset counters + n_added <- 0 + n_changed <- 0 + } + } # end of going over all rules + + # Apply custom rules ---- + if (!is.null(custom_rules)) { + if (isTRUE(info)) { + cat("\n") + cat(font_bold("Custom EUCAST rules, set by user"), "\n") + } + for (i in seq_len(length(custom_rules))) { + rule <- custom_rules[[i]] + rows <- which(eval(parse(text = rule$query), envir = x)) + cols <- as.character(rule$result_group) + cols <- c( + cols[cols %in% colnames(x)], # direct column names + unname(cols_ab[names(cols_ab) %in% cols]) + ) # based on previous cols_ab finding + cols <- unique(cols) + target_value <- as.character(rule$result_value) + rule_text <- paste0( + "report as '", target_value, "' when ", + format_custom_query_rule(rule$query, colours = FALSE), ": ", + get_antibiotic_names(cols) + ) + if (isTRUE(info)) { + # print rule + cat(italicise_taxonomy( + word_wrap(format_custom_query_rule(rule$query, colours = FALSE), + width = getOption("width") - 30, + extra_indent = 6 + ), + type = "ansi" + )) + cat("\n") + warned <- FALSE + } + run_changes <- edit_sir( + x = x, + to = target_value, + rule = c( + rule_text, + "Custom EUCAST rules", + paste0("Custom EUCAST rule ", i), + paste0( + "Object '", deparse(substitute(custom_rules)), + "' consisting of ", length(custom_rules), " custom rules" + ) + ), + rows = rows, + cols = cols, + last_verbose_info = verbose_info, + original_data = x.bak, + warned = warned, + info = info, + verbose = verbose + ) + n_added <- n_added + run_changes$added + n_changed <- n_changed + run_changes$changed + verbose_info <- run_changes$verbose_info + x <- run_changes$output + warn_lacking_sir_class <- c(warn_lacking_sir_class, run_changes$sir_warn) + # Print number of new changes --------------------------------------------- + if (isTRUE(info) && rule_next != rule_current) { + # print only on last one of rules in this group + txt_ok(n_added = n_added, n_changed = n_changed, warned = warned) + # and reset counters + n_added <- 0 + n_changed <- 0 + } + } + } + + # Print overview ---------------------------------------------------------- + if (isTRUE(info) || isTRUE(verbose)) { + verbose_info <- x.bak %pm>% + pm_mutate(row = pm_row_number()) %pm>% + pm_select(`.rowid`, row) %pm>% + pm_right_join(verbose_info, + by = c(".rowid" = "rowid") + ) %pm>% + pm_select(-`.rowid`) %pm>% + pm_select(row, pm_everything()) %pm>% + pm_filter(!is.na(new) | is.na(new) & !is.na(old)) %pm>% + pm_arrange(row, rule_group, rule_name, col) + rownames(verbose_info) <- NULL + } + + if (isTRUE(info)) { + if (isTRUE(verbose)) { + wouldve <- "would have " + } else { + wouldve <- "" + } + + cat(paste0("\n", font_grey(strrep("-", 0.95 * getOption("width", 100))), "\n")) + cat(word_wrap(paste0( + "The rules ", paste0(wouldve, "affected "), + font_bold( + formatnr(pm_n_distinct(verbose_info$row)), + "out of", formatnr(nrow(x.bak)), + "rows" + ), + ", making a total of ", + font_bold(formatnr(nrow(verbose_info)), "edits\n") + ))) + + total_n_added <- verbose_info %pm>% pm_filter(is.na(old)) %pm>% nrow() + total_n_changed <- verbose_info %pm>% pm_filter(!is.na(old)) %pm>% nrow() + + # print added values + if (total_n_added == 0) { + colour <- cat # is function + } else { + colour <- font_green # is function + } + cat(colour(paste0( + "=> ", wouldve, "added ", + font_bold(formatnr(verbose_info %pm>% + pm_filter(is.na(old)) %pm>% + nrow()), "test results"), + "\n" + ))) + if (total_n_added > 0) { + added_summary <- verbose_info %pm>% + pm_filter(is.na(old)) %pm>% + pm_count(new, name = "n") + cat(paste(" -", + paste0( + formatnr(added_summary$n), " test result", ifelse(added_summary$n > 1, "s", ""), + " added as ", paste0('"', added_summary$new, '"') + ), + collapse = "\n" + )) + } + + # print changed values + if (total_n_changed == 0) { + colour <- cat # is function + } else { + colour <- font_blue # is function + } + if (total_n_added + total_n_changed > 0) { + cat("\n") + } + cat(colour(paste0( + "=> ", wouldve, "changed ", + font_bold(formatnr(verbose_info %pm>% + pm_filter(!is.na(old)) %pm>% + nrow()), "test results"), + "\n" + ))) + if (total_n_changed > 0) { + changed_summary <- verbose_info %pm>% + pm_filter(!is.na(old)) %pm>% + pm_mutate(new = ifelse(is.na(new), "NA", new)) %pm>% + pm_count(old, new, name = "n") + cat(paste(" -", + paste0( + formatnr(changed_summary$n), " test result", ifelse(changed_summary$n > 1, "s", ""), " changed from ", + paste0('"', changed_summary$old, '"'), " to ", paste0('"', changed_summary$new, '"') + ), + collapse = "\n" + )) + cat("\n") + } + + cat(paste0(font_grey(strrep("-", 0.95 * getOption("width", 100))), "\n")) + + if (isFALSE(verbose) && total_n_added + total_n_changed > 0) { + cat("\n", word_wrap("Use ", font_bold("eucast_rules(..., verbose = TRUE)"), " (on your original data) to get a data.frame with all specified edits instead."), "\n\n", sep = "") + } else if (isTRUE(verbose)) { + cat("\n", word_wrap("Used 'Verbose mode' (", font_bold("verbose = TRUE"), "), which returns a data.frame with all specified edits.\nUse ", font_bold("verbose = FALSE"), " to apply the rules on your data."), "\n\n", sep = "") + } + } + + if (length(warn_lacking_sir_class) > 0) { + warn_lacking_sir_class <- unique(warn_lacking_sir_class) + # take order from original data set + warn_lacking_sir_class <- warn_lacking_sir_class[order(colnames(x.bak))] + warn_lacking_sir_class <- warn_lacking_sir_class[!is.na(warn_lacking_sir_class)] + warning_( + "in `eucast_rules()`: not all columns with antimicrobial results are of class 'sir'. Transform them on beforehand, with e.g.:\n", + " - ", x_deparsed, " %>% as.sir(", ifelse(length(warn_lacking_sir_class) == 1, + warn_lacking_sir_class, + paste0(warn_lacking_sir_class[1], ":", warn_lacking_sir_class[length(warn_lacking_sir_class)]) + ), ")\n", + " - ", x_deparsed, " %>% mutate_if(is_sir_eligible, as.sir)\n", + " - ", x_deparsed, " %>% mutate(across(where(is_sir_eligible), as.sir))" + ) + } + + # Return data set --------------------------------------------------------- + if (isTRUE(verbose)) { + as_original_data_class(verbose_info, old_attributes$class) # will remove tibble groups + } else { + # x was analysed with only unique rows, so join everything together again + x <- x[, c(cols_ab, ".rowid"), drop = FALSE] + x.bak <- x.bak[, setdiff(colnames(x.bak), cols_ab), drop = FALSE] + x.bak <- x.bak %pm>% + pm_left_join(x, by = ".rowid") + x.bak <- x.bak[, old_cols, drop = FALSE] + # reset original attributes + attributes(x.bak) <- old_attributes + x.bak <- as_original_data_class(x.bak, old_class = class(x.bak)) # will remove tibble groups + x.bak + } +} + +# helper function for editing the table ---- +edit_sir <- function(x, + to, + rule, + rows, + cols, + last_verbose_info, + original_data, + warned, + info, + verbose) { + cols <- unique(cols[!is.na(cols) & !is.null(cols)]) + + # for Verbose Mode, keep track of all changes and return them + track_changes <- list( + added = 0, + changed = 0, + output = x, + verbose_info = last_verbose_info, + sir_warn = character(0) + ) + + txt_error <- function() { + if (isTRUE(info)) cat("", font_red_bg(" ERROR "), "\n\n") + } + txt_warning <- function() { + if (warned == FALSE) { + if (isTRUE(info)) cat(" ", font_orange_bg(" WARNING "), sep = "") + } + warned <<- TRUE + } + + if (length(rows) > 0 && length(cols) > 0) { + new_edits <- x + if (any(!vapply(FUN.VALUE = logical(1), x[, cols, drop = FALSE], is.sir), na.rm = TRUE)) { + track_changes$sir_warn <- cols[!vapply(FUN.VALUE = logical(1), x[, cols, drop = FALSE], is.sir)] + } + tryCatch( + # insert into original table + new_edits[rows, cols] <- to, + warning = function(w) { + if (w$message %like% "invalid factor level") { + xyz <- vapply(FUN.VALUE = logical(1), cols, function(col) { + new_edits[, col] <<- factor( + x = as.character(pm_pull(new_edits, col)), + levels = unique(c(to, levels(pm_pull(new_edits, col)))) + ) + TRUE + }) + suppressWarnings(new_edits[rows, cols] <<- to) + warning_( + "in `eucast_rules()`: value \"", to, "\" added to the factor levels of column", + ifelse(length(cols) == 1, "", "s"), + " ", vector_and(cols, quotes = "`", sort = FALSE), + " because this value was not an existing factor level." + ) + txt_warning() + warned <- FALSE + } else { + warning_("in `eucast_rules()`: ", w$message) + txt_warning() + } + }, + error = function(e) { + txt_error() + stop( + paste0( + "In row(s) ", paste(rows[seq_len(min(length(rows), 10))], collapse = ","), + ifelse(length(rows) > 10, "...", ""), + " while writing value '", to, + "' to column(s) `", paste(cols, collapse = "`, `"), + "`:\n", e$message + ), + call. = FALSE + ) + } + ) + + track_changes$output <- new_edits + if ((isTRUE(info) || isTRUE(verbose)) && !isTRUE(all.equal(x, track_changes$output))) { + get_original_rows <- function(rowids) { + as.integer(rownames(original_data[which(original_data$.rowid %in% rowids), , drop = FALSE])) + } + for (i in seq_len(length(cols))) { + verbose_new <- data.frame( + rowid = new_edits[rows, ".rowid", drop = TRUE], + col = cols[i], + mo_fullname = new_edits[rows, "fullname", drop = TRUE], + old = x[rows, cols[i], drop = TRUE], + new = to, + rule = font_stripstyle(rule[1]), + rule_group = font_stripstyle(rule[2]), + rule_name = font_stripstyle(rule[3]), + rule_source = font_stripstyle(rule[4]), + stringsAsFactors = FALSE + ) + colnames(verbose_new) <- c( + "rowid", "col", "mo_fullname", "old", "new", + "rule", "rule_group", "rule_name", "rule_source" + ) + verbose_new <- verbose_new %pm>% pm_filter(old != new | is.na(old) | is.na(new) & !is.na(old)) + # save changes to data set 'verbose_info' + track_changes$verbose_info <- rbind_AMR( + track_changes$verbose_info, + verbose_new + ) + # count adds and changes + track_changes$added <- track_changes$added + verbose_new %pm>% + pm_filter(is.na(old)) %pm>% + pm_pull(rowid) %pm>% + get_original_rows() %pm>% + length() + track_changes$changed <- track_changes$changed + verbose_new %pm>% + pm_filter(!is.na(old)) %pm>% + pm_pull(rowid) %pm>% + get_original_rows() %pm>% + length() + } + } + } + return(track_changes) +} + +#' @rdname eucast_rules +#' @export +eucast_dosage <- function(ab, administration = "iv", version_breakpoints = 12.0) { + meet_criteria(ab, allow_class = c("character", "numeric", "integer", "factor")) + meet_criteria(administration, allow_class = "character", is_in = dosage$administration[!is.na(dosage$administration)], has_length = 1) + meet_criteria(version_breakpoints, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_BREAKPOINTS))) + + # show used version_breakpoints number once per session (AMR_env will reload every session) + if (message_not_thrown_before("eucast_dosage", "v", gsub("[^0-9]", "", version_breakpoints), entire_session = TRUE)) { + message_( + "Dosages for antimicrobial drugs, as meant for ", + format_eucast_version_nr(version_breakpoints, markdown = FALSE), ". ", + font_red("This note will be shown once per session.") + ) + } + + ab <- as.ab(ab) + lst <- vector("list", length = length(ab)) + for (i in seq_len(length(ab))) { + df <- AMR::dosage[which(AMR::dosage$ab == ab[i] & AMR::dosage$administration == administration), , drop = FALSE] + lst[[i]] <- list( + ab = "", + name = "", + standard_dosage = ifelse("standard_dosage" %in% df$type, + df[which(df$type == "standard_dosage"), "original_txt", drop = TRUE], + NA_character_ + ), + high_dosage = ifelse("high_dosage" %in% df$type, + df[which(df$type == "high_dosage"), "original_txt", drop = TRUE], + NA_character_ + ) + ) + } + out <- do.call(rbind_AMR, lapply(lst, as.data.frame, stringsAsFactors = FALSE)) + rownames(out) <- NULL + out$ab <- ab + out$name <- ab_name(ab, language = NULL) + if (pkg_is_available("tibble")) { + import_fn("as_tibble", "tibble")(out) + } else { + out + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/export_biosample.R + + + + +#' Export Data Set as NCBI BioSample Antibiogram +#' +#' +#' @param x a data set +#' @param filename a character string specifying the file name +#' @param type a character string specifying the type of data set, either "pathogen MIC" or "beta-lactamase MIC", see +#' @keywords internal +export_ncbi_biosample <- function(x, + filename = paste0("biosample_", format(Sys.time(), "%Y-%m-%d-%H%M%S"), ".xlsx"), + type = "pathogen MIC", + columns = where(is.mic), + save_as_xlsx = TRUE) { + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(filename, allow_class = "character", has_length = 1) + meet_criteria(type, allow_class = "character", has_length = 1, is_in = c("pathogen MIC", "beta-lactamase MIC")) + meet_criteria(save_as_xlsx, allow_class = "logical", has_length = 1) + + out <- x %pm>% + pm_select(columns) + stop_if(NROW(out) == 0, "No columns found.") + + if (isTRUE(save_as_xlsx)) { + export <- import_fn("write.xlsx", pkg = "openxlsx", error_on_fail = TRUE) + export(out, file = filename, overwrite = TRUE, asTable = FALSE) + } else { + out + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/first_isolate.R + + + + +#' Determine First Isolates +#' +#' Determine first isolates of all microorganisms of every patient per episode and (if needed) per specimen type. These functions support all four methods as summarised by Hindler *et al.* in 2007 (\doi{10.1086/511864}). To determine patient episodes not necessarily based on microorganisms, use [is_new_episode()] that also supports grouping with the `dplyr` package. +#' @param x a [data.frame] containing isolates. Can be left blank for automatic determination, see *Examples*. +#' @param col_date column name of the result date (or date that is was received on the lab) - the default is the first column with a date class +#' @param col_patient_id column name of the unique IDs of the patients - the default is the first column that starts with 'patient' or 'patid' (case insensitive) +#' @param col_mo column name of the names or codes of the microorganisms (see [as.mo()]) - the default is the first column of class [`mo`]. Values will be coerced using [as.mo()]. +#' @param col_testcode column name of the test codes. Use `col_testcode = NULL` to **not** exclude certain test codes (such as test codes for screening). In that case `testcodes_exclude` will be ignored. +#' @param col_specimen column name of the specimen type or group +#' @param col_icu column name of the logicals (`TRUE`/`FALSE`) whether a ward or department is an Intensive Care Unit (ICU). This can also be a [logical] vector with the same length as rows in `x`. +#' @param col_keyantimicrobials (only useful when `method = "phenotype-based"`) column name of the key antimicrobials to determine first isolates, see [key_antimicrobials()]. The default is the first column that starts with 'key' followed by 'ab' or 'antibiotics' or 'antimicrobials' (case insensitive). Use `col_keyantimicrobials = FALSE` to prevent this. Can also be the output of [key_antimicrobials()]. +#' @param episode_days episode in days after which a genus/species combination will be determined as 'first isolate' again. The default of 365 days is based on the guideline by CLSI, see *Source*. +#' @param testcodes_exclude a [character] vector with test codes that should be excluded (case-insensitive) +#' @param icu_exclude a [logical] to indicate whether ICU isolates should be excluded (rows with value `TRUE` in the column set with `col_icu`) +#' @param specimen_group value in the column set with `col_specimen` to filter on +#' @param type type to determine weighed isolates; can be `"keyantimicrobials"` or `"points"`, see *Details* +#' @param method the method to apply, either `"phenotype-based"`, `"episode-based"`, `"patient-based"` or `"isolate-based"` (can be abbreviated), see *Details*. The default is `"phenotype-based"` if antimicrobial test results are present in the data, and `"episode-based"` otherwise. +#' @param ignore_I [logical] to indicate whether antibiotic interpretations with `"I"` will be ignored when `type = "keyantimicrobials"`, see *Details* +#' @param points_threshold minimum number of points to require before differences in the antibiogram will lead to inclusion of an isolate when `type = "points"`, see *Details* +#' @param info a [logical] to indicate info should be printed - the default is `TRUE` only in interactive mode +#' @param include_unknown a [logical] to indicate whether 'unknown' microorganisms should be included too, i.e. microbial code `"UNKNOWN"`, which defaults to `FALSE`. For WHONET users, this means that all records with organism code `"con"` (*contamination*) will be excluded at default. Isolates with a microbial ID of `NA` will always be excluded as first isolate. +#' @param include_untested_sir a [logical] to indicate whether also rows without antibiotic results are still eligible for becoming a first isolate. Use `include_untested_sir = FALSE` to always return `FALSE` for such rows. This checks the data set for columns of class `sir` and consequently requires transforming columns with antibiotic results using [as.sir()] first. +#' @param ... arguments passed on to [first_isolate()] when using [filter_first_isolate()], otherwise arguments passed on to [key_antimicrobials()] (such as `universal`, `gram_negative`, `gram_positive`) +#' @details +#' To conduct epidemiological analyses on antimicrobial resistance data, only so-called first isolates should be included to prevent overestimation and underestimation of antimicrobial resistance. Different methods can be used to do so, see below. +#' +#' These functions are context-aware. This means that the `x` argument can be left blank if used inside a [data.frame] call, see *Examples*. +#' +#' The [first_isolate()] function is a wrapper around the [is_new_episode()] function, but more efficient for data sets containing microorganism codes or names. +#' +#' All isolates with a microbial ID of `NA` will be excluded as first isolate. +#' +#' ### Different methods +#' +#' According to Hindler *et al.* (2007, \doi{10.1086/511864}), there are different methods (algorithms) to select first isolates with increasing reliability: isolate-based, patient-based, episode-based and phenotype-based. All methods select on a combination of the taxonomic genus and species (not subspecies). +#' +#' All mentioned methods are covered in the [first_isolate()] function: +#' +#' +#' | **Method** | **Function to apply** | +#' |--------------------------------------------------|-------------------------------------------------------| +#' | **Isolate-based** | `first_isolate(x, method = "isolate-based")` | +#' | *(= all isolates)* | | +#' | | | +#' | | | +#' | **Patient-based** | `first_isolate(x, method = "patient-based")` | +#' | *(= first isolate per patient)* | | +#' | | | +#' | | | +#' | **Episode-based** | `first_isolate(x, method = "episode-based")`, or: | +#' | *(= first isolate per episode)* | | +#' | - 7-Day interval from initial isolate | - `first_isolate(x, method = "e", episode_days = 7)` | +#' | - 30-Day interval from initial isolate | - `first_isolate(x, method = "e", episode_days = 30)` | +#' | | | +#' | | | +#' | **Phenotype-based** | `first_isolate(x, method = "phenotype-based")`, or: | +#' | *(= first isolate per phenotype)* | | +#' | - Major difference in any antimicrobial result | - `first_isolate(x, type = "points")` | +#' | - Any difference in key antimicrobial results | - `first_isolate(x, type = "keyantimicrobials")` | +#' +#' ### Isolate-based +#' +#' This method does not require any selection, as all isolates should be included. It does, however, respect all arguments set in the [first_isolate()] function. For example, the default setting for `include_unknown` (`FALSE`) will omit selection of rows without a microbial ID. +#' +#' ### Patient-based +#' +#' To include every genus-species combination per patient once, set the `episode_days` to `Inf`. Although often inappropriate, this method makes sure that no duplicate isolates are selected from the same patient. In a large longitudinal data set, this could mean that isolates are *excluded* that were found years after the initial isolate. +#' +#' ### Episode-based +#' +#' To include every genus-species combination per patient episode once, set the `episode_days` to a sensible number of days. Depending on the type of analysis, this could be 14, 30, 60 or 365. Short episodes are common for analysing specific hospital or ward data, long episodes are common for analysing regional and national data. +#' +#' This is the most common method to correct for duplicate isolates. Patients are categorised into episodes based on their ID and dates (e.g., the date of specimen receipt or laboratory result). While this is a common method, it does not take into account antimicrobial test results. This means that e.g. a methicillin-resistant *Staphylococcus aureus* (MRSA) isolate cannot be differentiated from a wildtype *Staphylococcus aureus* isolate. +#' +#' ### Phenotype-based +#' +#' This is a more reliable method, since it also *weighs* the antibiogram (antimicrobial test results) yielding so-called 'first weighted isolates'. There are two different methods to weigh the antibiogram: +#' +#' 1. Using `type = "points"` and argument `points_threshold` (default) +#' +#' This method weighs *all* antimicrobial drugs available in the data set. Any difference from I to S or R (or vice versa) counts as `0.5` points, a difference from S to R (or vice versa) counts as `1` point. When the sum of points exceeds `points_threshold`, which defaults to `2`, an isolate will be selected as a first weighted isolate. +#' +#' All antimicrobials are internally selected using the [all_antimicrobials()] function. The output of this function does not need to be passed to the [first_isolate()] function. +#' +#' +#' 2. Using `type = "keyantimicrobials"` and argument `ignore_I` +#' +#' This method only weighs specific antimicrobial drugs, called *key antimicrobials*. Any difference from S to R (or vice versa) in these key antimicrobials will select an isolate as a first weighted isolate. With `ignore_I = FALSE`, also differences from I to S or R (or vice versa) will lead to this. +#' +#' Key antimicrobials are internally selected using the [key_antimicrobials()] function, but can also be added manually as a variable to the data and set in the `col_keyantimicrobials` argument. Another option is to pass the output of the [key_antimicrobials()] function directly to the `col_keyantimicrobials` argument. +#' +#' +#' The default method is phenotype-based (using `type = "points"`) and episode-based (using `episode_days = 365`). This makes sure that every genus-species combination is selected per patient once per year, while taking into account all antimicrobial test results. If no antimicrobial test results are available in the data set, only the episode-based method is applied at default. +#' @rdname first_isolate +#' @seealso [key_antimicrobials()] +#' @export +#' @return A [logical] vector +#' @source Methodology of this function is strictly based on: +#' +#' - **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition**, 2022, *Clinical and Laboratory Standards Institute (CLSI)*. . +#' +#' - Hindler JF and Stelling J (2007). **Analysis and Presentation of Cumulative Antibiograms: A New Consensus Guideline from the Clinical and Laboratory Standards Institute.** Clinical Infectious Diseases, 44(6), 867-873. \doi{10.1086/511864} +#' @examples +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates. +#' +#' example_isolates[first_isolate(info = TRUE), ] +#' \donttest{ +#' # get all first Gram-negatives +#' example_isolates[which(first_isolate(info = FALSE) & mo_is_gram_negative()), ] +#' +#' if (require("dplyr")) { +#' # filter on first isolates using dplyr: +#' example_isolates %>% +#' filter(first_isolate(info = TRUE)) +#' } +#' if (require("dplyr")) { +#' # short-hand version: +#' example_isolates %>% +#' filter_first_isolate(info = FALSE) +#' } +#' if (require("dplyr")) { +#' # flag the first isolates per group: +#' example_isolates %>% +#' group_by(ward) %>% +#' mutate(first = first_isolate(info = TRUE)) %>% +#' select(ward, date, patient, mo, first) +#' } +#' } +first_isolate <- function(x = NULL, + col_date = NULL, + col_patient_id = NULL, + col_mo = NULL, + col_testcode = NULL, + col_specimen = NULL, + col_icu = NULL, + col_keyantimicrobials = NULL, + episode_days = 365, + testcodes_exclude = NULL, + icu_exclude = FALSE, + specimen_group = NULL, + type = "points", + method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"), + ignore_I = TRUE, + points_threshold = 2, + info = interactive(), + include_unknown = FALSE, + include_untested_sir = TRUE, + ...) { + if (is_null_or_grouped_tbl(x)) { + # when `x` is left blank, auto determine it (get_current_data() searches underlying data within call) + # is also fix for using a grouped df as input (a dot as first argument) + x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x) + } + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(col_testcode, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + if (isFALSE(col_specimen)) { + col_specimen <- NULL + } + meet_criteria(col_specimen, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + if (is.logical(col_icu)) { + meet_criteria(col_icu, allow_class = "logical", has_length = c(1, nrow(x)), allow_NA = TRUE, allow_NULL = TRUE) + x$newvar_is_icu <- col_icu + } else if (!is.null(col_icu)) { + # add "logical" to the allowed classes here, since it may give an error in certain user input, and should then also say that logicals can be used too + meet_criteria(col_icu, allow_class = c("character", "logical"), has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + x$newvar_is_icu <- x[, col_icu, drop = TRUE] + } else { + x$newvar_is_icu <- NA + } + # method + method <- coerce_method(method) + meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based")) + # key antimicrobials + if (length(col_keyantimicrobials) > 1) { + meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = nrow(x)) + x$keyabcol <- col_keyantimicrobials + col_keyantimicrobials <- "keyabcol" + } else { + if (isFALSE(col_keyantimicrobials)) { + col_keyantimicrobials <- NULL + # method cannot be phenotype-based anymore + if (method == "phenotype-based") { + method <- "episode-based" + } + } + meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + } + meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE) + meet_criteria(testcodes_exclude, allow_class = "character", allow_NULL = TRUE) + meet_criteria(icu_exclude, allow_class = "logical", has_length = 1) + meet_criteria(specimen_group, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(type, allow_class = "character", has_length = 1, is_in = c("points", "keyantimicrobials")) + meet_criteria(ignore_I, allow_class = "logical", has_length = 1) + meet_criteria(points_threshold, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(info, allow_class = "logical", has_length = 1) + meet_criteria(include_unknown, allow_class = "logical", has_length = 1) + meet_criteria(include_untested_sir, allow_class = "logical", has_length = 1) + + # remove data.table, grouping from tibbles, etc. + x <- as.data.frame(x, stringsAsFactors = FALSE) + + any_col_contains_sir <- any(vapply( + FUN.VALUE = logical(1), + X = x, + # check only first 10,000 rows + FUN = function(x) any(as.character(x[1:10000]) %in% c("S", "SDD", "I", "R", "NI"), na.rm = TRUE), + USE.NAMES = FALSE + )) + if (method == "phenotype-based" && !any_col_contains_sir) { + method <- "episode-based" + } + if (isTRUE(info) && message_not_thrown_before("first_isolate", "method")) { + message_( + paste0( + "Determining first isolates ", + ifelse(method %in% c("episode-based", "phenotype-based"), + ifelse(is.infinite(episode_days), + paste(font_bold("without"), " a specified episode length"), + paste("using an episode length of", font_bold(paste(episode_days, "days"))) + ), + "" + ) + ), + add_fn = font_red + ) + } + + # try to find columns based on type + # -- mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = info) + stop_if(is.null(col_mo), "`col_mo` must be set") + } + + # methods ---- + if (method == "isolate-based") { + episode_days <- Inf + col_keyantimicrobials <- NULL + x$dummy_dates <- Sys.Date() + col_date <- "dummy_dates" + x$dummy_patients <- paste("dummy", seq_len(nrow(x))) # all 'patients' must be unique + col_patient_id <- "dummy_patients" + } else if (method == "patient-based") { + episode_days <- Inf + col_keyantimicrobials <- NULL + } else if (method == "episode-based") { + col_keyantimicrobials <- NULL + } else if (method == "phenotype-based") { + if (missing(type) && !is.null(col_keyantimicrobials)) { + # type = "points" is default, but not set explicitly, while col_keyantimicrobials is + type <- "keyantimicrobials" + } + if (type == "points") { + x$keyantimicrobials <- all_antimicrobials(x, only_sir_columns = FALSE) + col_keyantimicrobials <- "keyantimicrobials" + } else if (type == "keyantimicrobials" && is.null(col_keyantimicrobials)) { + col_keyantimicrobials <- search_type_in_df(x = x, type = "keyantimicrobials", info = info) + if (is.null(col_keyantimicrobials)) { + # still not found as a column, create it ourselves + x$keyantimicrobials <- key_antimicrobials(x, only_sir_columns = FALSE, col_mo = col_mo, ...) + col_keyantimicrobials <- "keyantimicrobials" + } + } + } + + # -- date + if (is.null(col_date)) { + col_date <- search_type_in_df(x = x, type = "date", info = info) + stop_if(is.null(col_date), "`col_date` must be set") + } + + # -- patient id + if (is.null(col_patient_id)) { + if (all(c("First name", "Last name", "Sex") %in% colnames(x))) { + # WHONET support + x$patient_id <- paste(x$`First name`, x$`Last name`, x$Sex) + col_patient_id <- "patient_id" + message_("Using combined columns '", font_bold("First name"), "', '", font_bold("Last name"), "' and '", font_bold("Sex"), "' as input for `col_patient_id`") + } else { + col_patient_id <- search_type_in_df(x = x, type = "patient_id", info = info) + } + stop_if(is.null(col_patient_id), "`col_patient_id` must be set") + } + + # -- specimen + if (is.null(col_specimen) && !is.null(specimen_group)) { + col_specimen <- search_type_in_df(x = x, type = "specimen", info = info) + } + + # check if columns exist + check_columns_existance <- function(column, tblname = x) { + if (!is.null(column)) { + stop_ifnot(column %in% colnames(tblname), + "Column '", column, "' not found.", + call = FALSE + ) + } + } + + check_columns_existance(col_date) + check_columns_existance(col_patient_id) + check_columns_existance(col_mo) + check_columns_existance(col_testcode) + check_columns_existance(col_keyantimicrobials) + + # convert dates to Date + dates <- as.Date(x[, col_date, drop = TRUE]) + dates[is.na(dates)] <- as.Date("1970-01-01") + x[, col_date] <- dates + + # create original row index + x$newvar_row_index <- seq_len(nrow(x)) + x$newvar_mo <- as.mo(x[, col_mo, drop = TRUE], keep_synonyms = TRUE, info = FALSE) + x$newvar_genus_species <- paste(mo_genus(x$newvar_mo, keep_synonyms = TRUE, info = FALSE), mo_species(x$newvar_mo, keep_synonyms = TRUE, info = FALSE)) + x$newvar_date <- x[, col_date, drop = TRUE] + x$newvar_patient_id <- as.character(x[, col_patient_id, drop = TRUE]) + + if (is.null(col_testcode)) { + testcodes_exclude <- NULL + } + # remove testcodes + if (!is.null(testcodes_exclude) && isTRUE(info) && message_not_thrown_before("first_isolate", "excludingtestcodes")) { + message_("Excluding test codes: ", vector_and(testcodes_exclude, quotes = TRUE), + add_fn = font_red + ) + } + + if (is.null(col_specimen)) { + specimen_group <- NULL + } + + # filter on specimen group and keyantibiotics when they are filled in + if (!is.null(specimen_group)) { + check_columns_existance(col_specimen, x) + if (isTRUE(info) && message_not_thrown_before("first_isolate", "excludingspecimen")) { + message_("Excluding other than specimen group '", specimen_group, "'", + add_fn = font_red + ) + } + } + if (!is.null(col_keyantimicrobials)) { + x$newvar_key_ab <- as.character(x[, col_keyantimicrobials, drop = TRUE]) + } + + if (is.null(testcodes_exclude)) { + testcodes_exclude <- "" + } + + # arrange data to the right sorting + if (is.null(specimen_group)) { + x <- x[order( + x$newvar_patient_id, + x$newvar_genus_species, + x$newvar_date + ), ] + rownames(x) <- NULL + row.start <- 1 + row.end <- nrow(x) + } else { + # filtering on specimen and only analyse these rows to save time + x <- x[order( + pm_pull(x, col_specimen), + x$newvar_patient_id, + x$newvar_genus_species, + x$newvar_date + ), ] + rownames(x) <- NULL + suppressWarnings( + row.start <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% min(na.rm = TRUE) + ) + suppressWarnings( + row.end <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% max(na.rm = TRUE) + ) + } + + # speed up - return immediately if obvious + if (abs(row.start) == Inf || abs(row.end) == Inf) { + if (isTRUE(info)) { + message_("=> Found ", font_bold("no isolates"), + add_fn = font_black, + as_note = FALSE + ) + } + return(rep(FALSE, nrow(x))) + } + if (row.start == row.end) { + if (isTRUE(info)) { + message_("=> Found ", font_bold("1 first isolate"), ", as the data only contained 1 row", + add_fn = font_black, + as_note = FALSE + ) + } + return(TRUE) + } + if (length(c(row.start:row.end)) == pm_n_distinct(x[c(row.start:row.end), col_mo, drop = TRUE])) { + if (isTRUE(info)) { + message_("=> Found ", font_bold(paste(length(c(row.start:row.end)), "first isolates")), + ", as all isolates were different microbial species", + add_fn = font_black, + as_note = FALSE + ) + } + return(rep(TRUE, length(c(row.start:row.end)))) + } + + # did find some isolates - add new index numbers of rows + x$newvar_row_index_sorted <- seq_len(nrow(x)) + + scope.size <- nrow(x[which(x$newvar_row_index_sorted %in% seq(row.start, row.end, 1) & + !is.na(x$newvar_mo)), , drop = FALSE]) + + # Analysis of first isolate ---- + if (!is.null(col_keyantimicrobials)) { + if (isTRUE(info) && message_not_thrown_before("first_isolate", "type")) { + if (type == "keyantimicrobials") { + message_("Basing inclusion on key antimicrobials, ", + ifelse(ignore_I == FALSE, "not ", ""), + "ignoring I", + add_fn = font_red + ) + } + if (type == "points") { + message_("Basing inclusion on all antimicrobial results, using a points threshold of ", + points_threshold, + add_fn = font_red + ) + } + } + } + + x$other_pat_or_mo <- !(x$newvar_patient_id == pm_lag(x$newvar_patient_id) & x$newvar_genus_species == pm_lag(x$newvar_genus_species)) + + x$newvar_episode_group <- paste(x$newvar_patient_id, x$newvar_genus_species) + + x$more_than_episode_ago <- unlist( + lapply( + split( + x$newvar_date, + x$newvar_episode_group + ), + is_new_episode, + episode_days = episode_days, + drop = FALSE + ), + use.names = FALSE + ) + + if (!is.null(col_keyantimicrobials)) { + # using phenotypes + x$different_antibiogram <- !unlist( + lapply( + split( + x$newvar_key_ab, + x$newvar_episode_group + ), + duplicated_antibiogram, + points_threshold = points_threshold, + ignore_I = ignore_I, + type = type + ), + use.names = FALSE + ) + } else { + x$different_antibiogram <- FALSE + } + + x$newvar_first_isolate <- x$newvar_row_index_sorted >= row.start & + x$newvar_row_index_sorted <= row.end & + x$newvar_genus_species != "" & + (x$other_pat_or_mo | x$more_than_episode_ago | x$different_antibiogram) + + decimal.mark <- getOption("OutDec") + big.mark <- ifelse(decimal.mark != ",", ",", " ") + + # first one as TRUE + x[row.start, "newvar_first_isolate"] <- TRUE + # no tests that should be included, or ICU + if (!is.null(col_testcode)) { + x[which(x[, col_testcode] %in% tolower(testcodes_exclude)), "newvar_first_isolate"] <- FALSE + } + if (any(!is.na(x$newvar_is_icu)) && any(x$newvar_is_icu == TRUE, na.rm = TRUE)) { + if (icu_exclude == TRUE) { + if (isTRUE(info)) { + message_("Excluding ", format(sum(x$newvar_is_icu, na.rm = TRUE), decimal.mark = decimal.mark, big.mark = big.mark), " isolates from ICU.", + add_fn = font_red) + } + x[which(x$newvar_is_icu), "newvar_first_isolate"] <- FALSE + } else if (isTRUE(info)) { + message_("Including isolates from ICU.") + } + } + + if (isTRUE(info)) { + # print group name if used in dplyr::group_by() + cur_group <- import_fn("cur_group", "dplyr", error_on_fail = FALSE) + if (!is.null(cur_group)) { + group_df <- tryCatch(cur_group(), error = function(e) data.frame()) + if (NCOL(group_df) > 0) { + # transform factors to characters + group <- vapply(FUN.VALUE = character(1), group_df, function(x) { + if (is.numeric(x)) { + format(x) + } else if (is.logical(x)) { + as.character(x) + } else { + paste0('"', x, '"') + } + }) + message_("\nGroup: ", paste0(names(group), " = ", group, collapse = ", "), "\n", + as_note = FALSE, + add_fn = font_red + ) + } + } + } + + # handle empty microorganisms + if (any(x$newvar_mo == "UNKNOWN", na.rm = TRUE) && isTRUE(info)) { + message_( + ifelse(include_unknown == TRUE, "Including ", "Excluding "), + format(sum(x$newvar_mo == "UNKNOWN", na.rm = TRUE), + decimal.mark = decimal.mark, big.mark = big.mark + ), + " isolates with a microbial ID 'UNKNOWN' (in column '", font_bold(col_mo), "')", + add_fn = font_red + ) + } + x[which(x$newvar_mo == "UNKNOWN"), "newvar_first_isolate"] <- include_unknown + + # exclude all NAs + if (anyNA(x$newvar_mo) && isTRUE(info)) { + message_( + "Excluding ", format(sum(is.na(x$newvar_mo), na.rm = TRUE), + decimal.mark = decimal.mark, big.mark = big.mark + ), + " isolates with a microbial ID `NA` (in column '", font_bold(col_mo), "')", + add_fn = font_red + ) + } + x[which(is.na(x$newvar_mo)), "newvar_first_isolate"] <- FALSE + + # handle isolates without antibiogram + if (include_untested_sir == FALSE && any(is.sir(x))) { + sir_all_NA <- which(unname(vapply( + FUN.VALUE = logical(1), + as.data.frame(t(x[, is.sir(x), drop = FALSE])), + function(sir_values) all(is.na(sir_values)) + ))) + x[sir_all_NA, "newvar_first_isolate"] <- FALSE + } + + # arrange back according to original sorting again + x <- x[order(x$newvar_row_index), , drop = FALSE] + rownames(x) <- NULL + + if (isTRUE(info)) { + n_found <- sum(x$newvar_first_isolate, na.rm = TRUE) + p_found_total <- percentage(n_found / nrow(x[which(!is.na(x$newvar_mo)), , drop = FALSE]), digits = 1) + p_found_scope <- percentage(n_found / scope.size, digits = 1) + if (p_found_total %unlike% "[.]") { + p_found_total <- gsub("%", ".0%", p_found_total, fixed = TRUE) + } + if (p_found_scope %unlike% "[.]") { + p_found_scope <- gsub("%", ".0%", p_found_scope, fixed = TRUE) + } + # mark up number of found + n_found <- format(n_found, big.mark = big.mark, decimal.mark = decimal.mark) + message_( + paste0( + "=> Found ", + font_bold(paste0( + n_found, + ifelse(method == "isolate-based", "", paste0(" '", method, "'")), + " first isolates" + )), + " (", + ifelse(p_found_total != p_found_scope, + paste0(p_found_scope, " within scope and "), + "" + ), + p_found_total, " of total where a microbial ID was available)" + ), + add_fn = font_black, as_note = FALSE + ) + } + + x$newvar_first_isolate +} + +#' @rdname first_isolate +#' @export +filter_first_isolate <- function(x = NULL, + col_date = NULL, + col_patient_id = NULL, + col_mo = NULL, + episode_days = 365, + method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"), + ...) { + if (is_null_or_grouped_tbl(x)) { + # when `x` is left blank, auto determine it (get_current_data() searches underlying data within call) + # is also fix for using a grouped df as input (a dot as first argument) + x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x) + } + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x)) + meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE) + method <- coerce_method(method) + meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based")) + + subset(x, first_isolate( + x = x, + col_date = col_date, + col_patient_id = col_patient_id, + col_mo = col_mo, + episode_days = episode_days, + method = method, + ... + )) +} + +coerce_method <- function(method) { + if (is.null(method)) { + return(method) + } + method <- tolower(as.character(method[1L])) + method[method %like% "^(p$|pheno)"] <- "phenotype-based" + method[method %like% "^(e$|episode)"] <- "episode-based" + method[method %like% "^pat"] <- "patient-based" + method[method %like% "^(i$|iso)"] <- "isolate-based" + method +} + +duplicated_antibiogram <- function(antibiogram, points_threshold, ignore_I, type) { + if (length(antibiogram) == 1) { + # fast return, only 1 isolate + return(FALSE) + } + # first sort on data availability - count the dots and order that ascending so that highest availability of SIR is on top + number_dots <- vapply(FUN.VALUE = integer(1), + antibiogram, + function(x) sum(strsplit(x, "", fixed = TRUE)[[1]] == "."), + USE.NAMES = FALSE) + new_order <- order(number_dots, antibiogram) + antibiogram.bak <- antibiogram + antibiogram <- antibiogram[new_order] + + out <- rep(NA, length(antibiogram)) + out[1] <- FALSE + out[2] <- antimicrobials_equal(antibiogram[1], antibiogram[2], + ignore_I = ignore_I, points_threshold = points_threshold, + type = type) + if (length(antibiogram) == 2) { + # fast return, no further check required + return(out) + } + + # we can skip the duplicates - they are never unique antibiograms of course + duplicates <- duplicated(antibiogram) + out[3:length(out)][duplicates[3:length(out)] == TRUE] <- TRUE + if (all(duplicates[3:length(out)] == TRUE, na.rm = TRUE)) { + # fast return, no further check required + return(c(out[1:2], rep(TRUE, length(out) - 2))) + } + + for (na in antibiogram[is.na(out)]) { + # check if this antibiogram has any change with other antibiograms + out[which(antibiogram == na)] <- all( + vapply(FUN.VALUE = logical(1), + antibiogram[!is.na(out) & antibiogram != na], + function(y) antimicrobials_equal(y = y, z = na, + ignore_I = ignore_I, points_threshold = points_threshold, + type = type))) + } + + out <- out[order(new_order)] + # rerun duplicated again + duplicates <- duplicated(antibiogram.bak) + out[duplicates == TRUE] <- TRUE + out +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/g.test.R + + + + +#' *G*-test for Count Data +#' +#' [g.test()] performs chi-squared contingency table tests and goodness-of-fit tests, just like [chisq.test()] but is more reliable (1). A *G*-test can be used to see whether the number of observations in each category fits a theoretical expectation (called a ***G*-test of goodness-of-fit**), or to see whether the proportions of one variable are different for different values of the other variable (called a ***G*-test of independence**). +#' @inherit stats::chisq.test params return +#' @details If `x` is a [matrix] with one row or column, or if `x` is a vector and `y` is not given, then a *goodness-of-fit test* is performed (`x` is treated as a one-dimensional contingency table). The entries of `x` must be non-negative integers. In this case, the hypothesis tested is whether the population probabilities equal those in `p`, or are all equal if `p` is not given. +#' +#' If `x` is a [matrix] with at least two rows and columns, it is taken as a two-dimensional contingency table: the entries of `x` must be non-negative integers. Otherwise, `x` and `y` must be vectors or factors of the same length; cases with missing values are removed, the objects are coerced to factors, and the contingency table is computed from these. Then Pearson's chi-squared test is performed of the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals. +#' +#' The p-value is computed from the asymptotic chi-squared distribution of the test statistic. +#' +#' In the contingency table case simulation is done by random sampling from the set of all contingency tables with given marginals, and works only if the marginals are strictly positive. Note that this is not the usual sampling situation assumed for a chi-squared test (such as the *G*-test) but rather that for Fisher's exact test. +#' +#' In the goodness-of-fit case simulation is done by random sampling from the discrete distribution specified by `p`, each sample being of size `n = sum(x)`. This simulation is done in \R and may be slow. +#' +#' ### *G*-test Of Goodness-of-Fit (Likelihood Ratio Test) +#' Use the *G*-test of goodness-of-fit when you have one nominal variable with two or more values (such as male and female, or red, pink and white flowers). You compare the observed counts of numbers of observations in each category with the expected counts, which you calculate using some kind of theoretical expectation (such as a 1:1 sex ratio or a 1:2:1 ratio in a genetic cross). +#' +#' If the expected number of observations in any category is too small, the *G*-test may give inaccurate results, and you should use an exact test instead ([fisher.test()]). +#' +#' The *G*-test of goodness-of-fit is an alternative to the chi-square test of goodness-of-fit ([chisq.test()]); each of these tests has some advantages and some disadvantages, and the results of the two tests are usually very similar. +#' +#' ### *G*-test of Independence +#' Use the *G*-test of independence when you have two nominal variables, each with two or more possible values. You want to know whether the proportions for one variable are different among values of the other variable. +#' +#' It is also possible to do a *G*-test of independence with more than two nominal variables. For example, Jackson et al. (2013) also had data for children under 3, so you could do an analysis of old vs. young, thigh vs. arm, and reaction vs. no reaction, all analyzed together. +#' +#' Fisher's exact test ([fisher.test()]) is an **exact** test, where the *G*-test is still only an **approximation**. For any 2x2 table, Fisher's Exact test may be slower but will still run in seconds, even if the sum of your observations is multiple millions. +#' +#' The *G*-test of independence is an alternative to the chi-square test of independence ([chisq.test()]), and they will give approximately the same results. +#' +#' ### How the Test Works +#' Unlike the exact test of goodness-of-fit ([fisher.test()]), the *G*-test does not directly calculate the probability of obtaining the observed results or something more extreme. Instead, like almost all statistical tests, the *G*-test has an intermediate step; it uses the data to calculate a test statistic that measures how far the observed data are from the null expectation. You then use a mathematical relationship, in this case the chi-square distribution, to estimate the probability of obtaining that value of the test statistic. +#' +#' The *G*-test uses the log of the ratio of two likelihoods as the test statistic, which is why it is also called a likelihood ratio test or log-likelihood ratio test. The formula to calculate a *G*-statistic is: +#' +#' \eqn{G = 2 * sum(x * log(x / E))} +#' +#' where `E` are the expected values. Since this is chi-square distributed, the p value can be calculated in \R with: +#' ``` +#' p <- stats::pchisq(G, df, lower.tail = FALSE) +#' ``` +#' where `df` are the degrees of freedom. +#' +#' If there are more than two categories and you want to find out which ones are significantly different from their null expectation, you can use the same method of testing each category vs. the sum of all categories, with the Bonferroni correction. You use *G*-tests for each category, of course. +#' @seealso [chisq.test()] +#' @references 1. McDonald, J.H. 2014. **Handbook of Biological Statistics (3rd ed.)**. Sparky House Publishing, Baltimore, Maryland. . +#' @source The code for this function is identical to that of [chisq.test()], except that: +#' - The calculation of the statistic was changed to \eqn{2 * sum(x * log(x / E))} +#' - Yates' continuity correction was removed as it does not apply to a *G*-test +#' - The possibility to simulate p values with `simulate.p.value` was removed +#' @export +#' @importFrom stats pchisq complete.cases +#' @examples +#' # = EXAMPLE 1 = +#' # Shivrain et al. (2006) crossed clearfield rice (which are resistant +#' # to the herbicide imazethapyr) with red rice (which are susceptible to +#' # imazethapyr). They then crossed the hybrid offspring and examined the +#' # F2 generation, where they found 772 resistant plants, 1611 moderately +#' # resistant plants, and 737 susceptible plants. If resistance is controlled +#' # by a single gene with two co-dominant alleles, you would expect a 1:2:1 +#' # ratio. +#' +#' x <- c(772, 1611, 737) +#' g.test(x, p = c(1, 2, 1) / 4) +#' +#' # There is no significant difference from a 1:2:1 ratio. +#' # Meaning: resistance controlled by a single gene with two co-dominant +#' # alleles, is plausible. +#' +#' +#' # = EXAMPLE 2 = +#' # Red crossbills (Loxia curvirostra) have the tip of the upper bill either +#' # right or left of the lower bill, which helps them extract seeds from pine +#' # cones. Some have hypothesized that frequency-dependent selection would +#' # keep the number of right and left-billed birds at a 1:1 ratio. Groth (1992) +#' # observed 1752 right-billed and 1895 left-billed crossbills. +#' +#' x <- c(1752, 1895) +#' g.test(x) +#' +#' # There is a significant difference from a 1:1 ratio. +#' # Meaning: there are significantly more left-billed birds. +g.test <- function(x, + y = NULL, + # correct = TRUE, + p = rep(1 / length(x), length(x)), + rescale.p = FALSE) { + DNAME <- deparse(substitute(x)) + if (is.data.frame(x)) { + x <- as.matrix(x) + } + if (is.matrix(x)) { + if (min(dim(x)) == 1L) { + x <- as.vector(x) + } + } + if (!is.matrix(x) && !is.null(y)) { + if (length(x) != length(y)) { + stop("'x' and 'y' must have the same length") + } + DNAME2 <- deparse(substitute(y)) + xname <- if (length(DNAME) > 1L || nchar(DNAME, "w") > + 30) { + "" + } else { + DNAME + } + yname <- if (length(DNAME2) > 1L || nchar(DNAME2, "w") > + 30) { + "" + } else { + DNAME2 + } + OK <- complete.cases(x, y) + x <- factor(x[OK]) + y <- factor(y[OK]) + if ((nlevels(x) < 2L) || (nlevels(y) < 2L)) { + stop("'x' and 'y' must have at least 2 levels") + } + x <- table(x, y) + names(dimnames(x)) <- c(xname, yname) + DNAME <- paste( + paste(DNAME, collapse = "\n"), "and", + paste(DNAME2, collapse = "\n") + ) + } + if (any(x < 0) || anyNA(x)) { + stop("all entries of 'x' must be nonnegative and finite") + } + if ((n <- sum(x)) == 0) { + stop("at least one entry of 'x' must be positive") + } + + + if (is.matrix(x)) { + METHOD <- "G-test of independence" + nr <- as.integer(nrow(x)) + nc <- as.integer(ncol(x)) + if (is.na(nr) || is.na(nc) || is.na(nr * nc)) { + stop("invalid nrow(x) or ncol(x)", domain = NA) + } + # add fisher.test suggestion + if (nr == 2 && nc == 2) { + warning("`fisher.test()` is always more reliable for 2x2 tables and although much slower, often only takes seconds.") + } + sr <- rowSums(x) + sc <- colSums(x) + E <- outer(sr, sc, "*") / n + v <- function(r, c, n) c * r * (n - r) * (n - c) / n^3 + V <- outer(sr, sc, v, n) + dimnames(E) <- dimnames(x) + + STATISTIC <- 2 * sum(x * log(x / E), na.rm = TRUE) # sum((abs(x - E) - YATES)^2/E) for chisq.test + PARAMETER <- (nr - 1L) * (nc - 1L) + PVAL <- pchisq(STATISTIC, PARAMETER, lower.tail = FALSE) + } else { + if (length(dim(x)) > 2L) { + stop("invalid 'x'") + } + if (length(x) == 1L) { + stop("'x' must at least have 2 elements") + } + if (length(x) != length(p)) { + stop("'x' and 'p' must have the same number of elements") + } + if (any(p < 0)) { + stop("probabilities must be non-negative.") + } + if (abs(sum(p) - 1) > sqrt(.Machine$double.eps)) { + if (rescale.p) { + p <- p / sum(p) + } else { + stop("probabilities must sum to 1.") + } + } + METHOD <- "G-test of goodness-of-fit (likelihood ratio test)" + E <- n * p + V <- n * p * (1 - p) + STATISTIC <- 2 * sum(x * log(x / E)) # sum((x - E)^2/E) for chisq.test + names(E) <- names(x) + + PARAMETER <- length(x) - 1 + PVAL <- pchisq(STATISTIC, PARAMETER, lower.tail = FALSE) + } + names(STATISTIC) <- "X-squared" + names(PARAMETER) <- "df" + if (any(E < 5) && is.finite(PARAMETER)) { + warning("G-statistic approximation may be incorrect due to E < 5") + } + + structure(list( + statistic = STATISTIC, argument = PARAMETER, + p.value = PVAL, method = METHOD, data.name = DNAME, + observed = x, expected = E, residuals = (x - E) / sqrt(E), + stdres = (x - E) / sqrt(V) + ), class = "htest") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/get_episode.R + + + + +#' Determine Clinical or Epidemic Episodes +#' +#' These functions determine which items in a vector can be considered (the start of) a new episode. This can be used to determine clinical episodes for any epidemiological analysis. The [get_episode()] function returns the index number of the episode per group, while the [is_new_episode()] function returns `TRUE` for every new [get_episode()] index. Both absolute and relative episode determination are supported. +#' @param x vector of dates (class `Date` or `POSIXt`), will be sorted internally to determine episodes +#' @param episode_days episode length in days to specify the time period after which a new episode begins, can also be less than a day or `Inf`, see *Details* +#' @param case_free_days (inter-epidemic) interval length in days after which a new episode will start, can also be less than a day or `Inf`, see *Details* +#' @param ... ignored, only in place to allow future extensions +#' @details Episodes can be determined in two ways: absolute and relative. +#' +#' 1. Absolute +#' +#' This method uses `episode_days` to define an episode length in days, after which a new episode will start. A common use case in AMR data analysis is microbial epidemiology: episodes of *S. aureus* bacteraemia in ICU patients for example. The episode length could then be 30 days, so that new *S. aureus* isolates after an ICU episode of 30 days will be considered a different (or new) episode. +#' +#' Thus, this method counts **since the start of the previous episode**. +#' +#' 2. Relative +#' +#' This method uses `case_free_days` to quantify the duration of case-free days (the inter-epidemic interval), after which a new episode will start. A common use case is infectious disease epidemiology: episodes of norovirus outbreaks in a hospital for example. The case-free period could then be 14 days, so that new norovirus cases after that time will be considered a different (or new) episode. +#' +#' Thus, this methods counts **since the last case in the previous episode**. +#' +#' In a table: +#' +#' | Date | Using `episode_days = 7` | Using `case_free_days = 7` | +#' |:----------:|:------------------------:|:--------------------------:| +#' | 2023-01-01 | 1 | 1 | +#' | 2023-01-02 | 1 | 1 | +#' | 2023-01-05 | 1 | 1 | +#' | 2023-01-08 | 2** | 1 | +#' | 2023-02-21 | 3 | 2*** | +#' | 2023-02-22 | 3 | 2 | +#' | 2023-02-23 | 3 | 2 | +#' | 2023-02-24 | 3 | 2 | +#' | 2023-03-01 | 4 | 2 | +#' +#' ** This marks the start of a new episode, because 8 January 2023 is more than 7 days since the start of the previous episode (1 January 2023). \cr +#' *** This marks the start of a new episode, because 21 January 2023 is more than 7 days since the last case in the previous episode (8 January 2023). +#' +#' Either `episode_days` or `case_free_days` must be provided in the function. +#' +#' ### Difference between `get_episode()` and `is_new_episode()` +#' +#' The [get_episode()] function returns the index number of the episode, so all cases/patients/isolates in the first episode will have the number 1, all cases/patients/isolates in the second episode will have the number 2, etc. +#' +#' The [is_new_episode()] function on the other hand, returns `TRUE` for every new [get_episode()] index. +#' +#' To specify, when setting `episode_days = 365` (using method 1 as explained above), this is how the two functions differ: +#' +#' | patient | date | `get_episode()` | `is_new_episode()` | +#' |:---------:|:----------:|:---------------:|:------------------:| +#' | A | 2019-01-01 | 1 | TRUE | +#' | A | 2019-03-01 | 1 | FALSE | +#' | A | 2021-01-01 | 2 | TRUE | +#' | B | 2008-01-01 | 1 | TRUE | +#' | B | 2008-01-01 | 1 | FALSE | +#' | C | 2020-01-01 | 1 | TRUE | +#' +#' ### Other +#' +#' The [first_isolate()] function is a wrapper around the [is_new_episode()] function, but is more efficient for data sets containing microorganism codes or names and allows for different isolate selection methods. +#' +#' The `dplyr` package is not required for these functions to work, but these episode functions do support [variable grouping][dplyr::group_by()] and work conveniently inside `dplyr` verbs such as [`filter()`][dplyr::filter()], [`mutate()`][dplyr::mutate()] and [`summarise()`][dplyr::summarise()]. +#' @return +#' * [get_episode()]: an [integer] vector +#' * [is_new_episode()]: a [logical] vector +#' @seealso [first_isolate()] +#' @rdname get_episode +#' @export +#' @examples +#' # difference between absolute and relative determination of episodes: +#' x <- data.frame(dates = as.Date(c( +#' "2021-01-01", +#' "2021-01-02", +#' "2021-01-05", +#' "2021-01-08", +#' "2021-02-21", +#' "2021-02-22", +#' "2021-02-23", +#' "2021-02-24", +#' "2021-03-01", +#' "2021-03-01" +#' ))) +#' x$absolute <- get_episode(x$dates, episode_days = 7) +#' x$relative <- get_episode(x$dates, case_free_days = 7) +#' x +#' +#' +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates +#' df <- example_isolates[sample(seq_len(2000), size = 100), ] +#' +#' get_episode(df$date, episode_days = 60) # indices +#' is_new_episode(df$date, episode_days = 60) # TRUE/FALSE +#' +#' # filter on results from the third 60-day episode only, using base R +#' df[which(get_episode(df$date, 60) == 3), ] +#' +#' # the functions also work for less than a day, e.g. to include one per hour: +#' get_episode( +#' c( +#' Sys.time(), +#' Sys.time() + 60 * 60 +#' ), +#' episode_days = 1 / 24 +#' ) +#' +#' \donttest{ +#' if (require("dplyr")) { +#' # is_new_episode() can also be used in dplyr verbs to determine patient +#' # episodes based on any (combination of) grouping variables: +#' df %>% +#' mutate(condition = sample( +#' x = c("A", "B", "C"), +#' size = 100, +#' replace = TRUE +#' )) %>% +#' group_by(patient, condition) %>% +#' mutate(new_episode = is_new_episode(date, 365)) %>% +#' select(patient, date, condition, new_episode) %>% +#' arrange(patient, condition, date) +#' } +#' +#' if (require("dplyr")) { +#' df %>% +#' group_by(ward, patient) %>% +#' transmute(date, +#' patient, +#' new_index = get_episode(date, 60), +#' new_logical = is_new_episode(date, 60) +#' ) %>% +#' arrange(patient, ward, date) +#' } +#' +#' if (require("dplyr")) { +#' df %>% +#' group_by(ward) %>% +#' summarise( +#' n_patients = n_distinct(patient), +#' n_episodes_365 = sum(is_new_episode(date, episode_days = 365)), +#' n_episodes_60 = sum(is_new_episode(date, episode_days = 60)), +#' n_episodes_30 = sum(is_new_episode(date, episode_days = 30)) +#' ) +#' } +#' +#' # grouping on patients and microorganisms leads to the same +#' # results as first_isolate() when using 'episode-based': +#' if (require("dplyr")) { +#' x <- df %>% +#' filter_first_isolate( +#' include_unknown = TRUE, +#' method = "episode-based" +#' ) +#' +#' y <- df %>% +#' group_by(patient, mo) %>% +#' filter(is_new_episode(date, 365)) %>% +#' ungroup() +#' +#' identical(x, y) +#' } +#' +#' # but is_new_episode() has a lot more flexibility than first_isolate(), +#' # since you can now group on anything that seems relevant: +#' if (require("dplyr")) { +#' df %>% +#' group_by(patient, mo, ward) %>% +#' mutate(flag_episode = is_new_episode(date, 365)) %>% +#' select(group_vars(.), flag_episode) +#' } +#' } +get_episode <- function(x, episode_days = NULL, case_free_days = NULL, ...) { + meet_criteria(x, allow_class = c("Date", "POSIXt"), allow_NA = TRUE) + meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE, allow_NULL = TRUE) + meet_criteria(case_free_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE, allow_NULL = TRUE) + as.integer(exec_episode(x, episode_days, case_free_days, ...)) +} + +#' @rdname get_episode +#' @export +is_new_episode <- function(x, episode_days = NULL, case_free_days = NULL, ...) { + meet_criteria(x, allow_class = c("Date", "POSIXt"), allow_NA = TRUE) + meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE, allow_NULL = TRUE) + meet_criteria(case_free_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE, allow_NULL = TRUE) + !duplicated(exec_episode(x, episode_days, case_free_days, ...)) +} + +exec_episode <- function(x, episode_days, case_free_days, ...) { + stop_ifnot(is.null(episode_days) || is.null(case_free_days), + "either argument `episode_days` or argument `case_free_days` must be set.", + call = -2 + ) + + # running as.double() on a POSIXct object will return its number of seconds since 1970-01-01 + x <- as.double(as.POSIXct(x)) # as.POSIXct() required for Date classes + + # since x is now in seconds, get seconds from episode_days as well + episode_seconds <- episode_days * 60 * 60 * 24 + case_free_seconds <- case_free_days * 60 * 60 * 24 + + if (length(x) == 1) { # this will also match 1 NA, which is fine + return(1) + } else if (length(x) == 2 && all(!is.na(x))) { + if ((length(episode_seconds) > 0 && (max(x) - min(x)) >= episode_seconds) || + (length(case_free_seconds) > 0 && (max(x) - min(x)) >= case_free_seconds)) { + if (x[1] <= x[2]) { + return(c(1, 2)) + } else { + return(c(2, 1)) + } + } else { + return(c(1, 1)) + } + } + + run_episodes <- function(x, episode_sec, case_free_sec) { + NAs <- which(is.na(x)) + x[NAs] <- 0 + + indices <- integer(length = length(x)) + start <- x[1] + ind <- 1 + indices[ind] <- 1 + for (i in 2:length(x)) { + if ((length(episode_sec) > 0 && (x[i] - start) >= episode_sec) || + (length(case_free_sec) > 0 && (x[i] - x[i - 1]) >= case_free_sec)) { + ind <- ind + 1 + start <- x[i] + } + indices[i] <- ind + } + indices[NAs] <- NA + indices + } + + ord <- order(x) + out <- run_episodes(x[ord], episode_seconds, case_free_seconds)[order(ord)] + out[is.na(x) & ord != 1] <- NA # every NA expect for the first must remain NA + out +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ggplot_pca.R + + + + +#' PCA Biplot with `ggplot2` +#' +#' Produces a `ggplot2` variant of a so-called [biplot](https://en.wikipedia.org/wiki/Biplot) for PCA (principal component analysis), but is more flexible and more appealing than the base \R [biplot()] function. +#' @param x an object returned by [pca()], [prcomp()] or [princomp()] +#' @inheritParams stats::biplot.prcomp +#' @param labels an optional vector of labels for the observations. If set, the labels will be placed below their respective points. When using the [pca()] function as input for `x`, this will be determined automatically based on the attribute `non_numeric_cols`, see [pca()]. +#' @param labels_textsize the size of the text used for the labels +#' @param labels_text_placement adjustment factor the placement of the variable names (`>=1` means further away from the arrow head) +#' @param groups an optional vector of groups for the labels, with the same length as `labels`. If set, the points and labels will be coloured according to these groups. When using the [pca()] function as input for `x`, this will be determined automatically based on the attribute `non_numeric_cols`, see [pca()]. +#' @param ellipse a [logical] to indicate whether a normal data ellipse should be drawn for each group (set with `groups`) +#' @param ellipse_prob statistical size of the ellipse in normal probability +#' @param ellipse_size the size of the ellipse line +#' @param ellipse_alpha the alpha (transparency) of the ellipse line +#' @param points_size the size of the points +#' @param points_alpha the alpha (transparency) of the points +#' @param arrows a [logical] to indicate whether arrows should be drawn +#' @param arrows_textsize the size of the text for variable names +#' @param arrows_colour the colour of the arrow and their text +#' @param arrows_size the size (thickness) of the arrow lines +#' @param arrows_textsize the size of the text at the end of the arrows +#' @param arrows_textangled a [logical] whether the text at the end of the arrows should be angled +#' @param arrows_alpha the alpha (transparency) of the arrows and their text +#' @param base_textsize the text size for all plot elements except the labels and arrows +#' @param ... arguments passed on to functions +#' @source The [ggplot_pca()] function is based on the `ggbiplot()` function from the `ggbiplot` package by Vince Vu, as found on GitHub: (retrieved: 2 March 2020, their latest commit: [`7325e88`](https://github.com/vqv/ggbiplot/commit/7325e880485bea4c07465a0304c470608fffb5d9); 12 February 2015). +#' +#' As per their GPL-2 licence that demands documentation of code changes, the changes made based on the source code were: +#' 1. Rewritten code to remove the dependency on packages `plyr`, `scales` and `grid` +#' 2. Parametrised more options, like arrow and ellipse settings +#' 3. Hardened all input possibilities by defining the exact type of user input for every argument +#' 4. Added total amount of explained variance as a caption in the plot +#' 5. Cleaned all syntax based on the `lintr` package, fixed grammatical errors and added integrity checks +#' 6. Updated documentation +#' @details The colours for labels and points can be changed by adding another scale layer for colour, such as `scale_colour_viridis_d()` and `scale_colour_brewer()`. +#' @rdname ggplot_pca +#' @export +#' @examples +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates. +#' +#' \donttest{ +#' if (require("dplyr")) { +#' # calculate the resistance per group first +#' resistance_data <- example_isolates %>% +#' group_by( +#' order = mo_order(mo), # group on anything, like order +#' genus = mo_genus(mo) +#' ) %>% # and genus as we do here; +#' filter(n() >= 30) %>% # filter on only 30 results per group +#' summarise_if(is.sir, resistance) # then get resistance of all drugs +#' +#' # now conduct PCA for certain antimicrobial drugs +#' pca_result <- resistance_data %>% +#' pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT) +#' +#' summary(pca_result) +#' +#' # old base R plotting method: +#' biplot(pca_result, main = "Base R biplot") +#' +#' # new ggplot2 plotting method using this package: +#' if (require("ggplot2")) { +#' ggplot_pca(pca_result) + +#' labs(title = "ggplot2 biplot") +#' } +#' if (require("ggplot2")) { +#' # still extendible with any ggplot2 function +#' ggplot_pca(pca_result) + +#' scale_colour_viridis_d() + +#' labs(title = "ggplot2 biplot") +#' } +#' } +#' } +ggplot_pca <- function(x, + choices = 1:2, + scale = 1, + pc.biplot = TRUE, + labels = NULL, + labels_textsize = 3, + labels_text_placement = 1.5, + groups = NULL, + ellipse = TRUE, + ellipse_prob = 0.68, + ellipse_size = 0.5, + ellipse_alpha = 0.5, + points_size = 2, + points_alpha = 0.25, + arrows = TRUE, + arrows_colour = "darkblue", + arrows_size = 0.5, + arrows_textsize = 3, + arrows_textangled = TRUE, + arrows_alpha = 0.75, + base_textsize = 10, + ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(x, allow_class = c("prcomp", "princomp", "PCA", "lda")) + meet_criteria(choices, allow_class = c("numeric", "integer"), has_length = 2, is_positive = TRUE, is_finite = TRUE) + meet_criteria(scale, allow_class = c("numeric", "integer", "logical"), has_length = 1) + meet_criteria(pc.biplot, allow_class = "logical", has_length = 1) + meet_criteria(labels, allow_class = "character", allow_NULL = TRUE) + meet_criteria(labels_textsize, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(labels_text_placement, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(groups, allow_class = "character", allow_NULL = TRUE) + meet_criteria(ellipse, allow_class = "logical", has_length = 1) + meet_criteria(ellipse_prob, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(ellipse_size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(ellipse_alpha, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(points_size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(points_alpha, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(arrows, allow_class = "logical", has_length = 1) + meet_criteria(arrows_colour, allow_class = "character", has_length = 1) + meet_criteria(arrows_size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(arrows_textsize, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(arrows_textangled, allow_class = "logical", has_length = 1) + meet_criteria(arrows_alpha, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(base_textsize, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + + calculations <- pca_calculations( + pca_model = x, + groups = groups, + groups_missing = missing(groups), + labels = labels, + labels_missing = missing(labels), + choices = choices, + scale = scale, + pc.biplot = pc.biplot, + ellipse_prob = ellipse_prob, + labels_text_placement = labels_text_placement + ) + choices <- calculations$choices + df.u <- calculations$df.u + df.v <- calculations$df.v + ell <- calculations$ell + groups <- calculations$groups + group_name <- calculations$group_name + labels <- calculations$labels + + # Append the proportion of explained variance to the axis labels + if ((1 - as.integer(scale)) == 0) { + u.axis.labs <- paste0("Standardised PC", choices) + } else { + u.axis.labs <- paste0("PC", choices) + } + u.axis.labs <- paste0( + u.axis.labs, + paste0( + "\n(explained var: ", + percentage(x$sdev[choices]^2 / sum(x$sdev^2)), + ")" + ) + ) + + # Score Labels + if (!is.null(labels)) { + df.u$labels <- labels + } + + # Grouping variable + if (!is.null(groups)) { + df.u$groups <- groups + } + + # Base plot + g <- ggplot2::ggplot( + data = df.u, + ggplot2::aes(x = xvar, y = yvar) + ) + + ggplot2::xlab(u.axis.labs[1]) + + ggplot2::ylab(u.axis.labs[2]) + + ggplot2::expand_limits( + x = c(-1.15, 1.15), + y = c(-1.15, 1.15) + ) + + # Draw either labels or points + if (!is.null(df.u$labels)) { + if (!is.null(df.u$groups)) { + g <- g + ggplot2::geom_point(ggplot2::aes(colour = groups), + alpha = points_alpha, + size = points_size + ) + + ggplot2::geom_text(ggplot2::aes(label = labels, colour = groups), + nudge_y = -0.05, + size = labels_textsize + ) + + ggplot2::labs(colour = group_name) + } else { + g <- g + ggplot2::geom_point( + alpha = points_alpha, + size = points_size + ) + + ggplot2::geom_text(ggplot2::aes(label = labels), + nudge_y = -0.05, + size = labels_textsize + ) + } + } else { + if (!is.null(df.u$groups)) { + g <- g + ggplot2::geom_point(ggplot2::aes(colour = groups), + alpha = points_alpha, + size = points_size + ) + + ggplot2::labs(colour = group_name) + } else { + g <- g + ggplot2::geom_point( + alpha = points_alpha, + size = points_size + ) + } + } + + # Overlay a concentration ellipse if there are groups + if (!is.null(df.u$groups) && !is.null(ell) && isTRUE(ellipse)) { + g <- g + ggplot2::geom_path( + data = ell, + ggplot2::aes(colour = groups, group = groups), + size = ellipse_size, + alpha = points_alpha + ) + } + + # Label the variable axes + if (arrows == TRUE) { + g <- g + ggplot2::geom_segment( + data = df.v, + ggplot2::aes(x = 0, y = 0, xend = xvar, yend = yvar), + arrow = ggplot2::arrow( + length = ggplot2::unit(0.5, "picas"), + angle = 20, + ends = "last", + type = "open" + ), + colour = arrows_colour, + size = arrows_size, + alpha = arrows_alpha + ) + if (arrows_textangled == TRUE) { + g <- g + ggplot2::geom_text( + data = df.v, + ggplot2::aes(label = varname, x = xvar, y = yvar, angle = angle, hjust = hjust), + colour = arrows_colour, + size = arrows_textsize, + alpha = arrows_alpha + ) + } else { + g <- g + ggplot2::geom_text( + data = df.v, + ggplot2::aes(label = varname, x = xvar, y = yvar, hjust = hjust), + colour = arrows_colour, + size = arrows_textsize, + alpha = arrows_alpha + ) + } + } + + # Add caption label about total explained variance + g <- g + ggplot2::labs(caption = paste0( + "Total explained variance: ", + percentage(sum(x$sdev[choices]^2 / sum(x$sdev^2))) + )) + + # mark-up nicely + g <- g + ggplot2::theme_minimal(base_size = base_textsize) + + ggplot2::theme( + panel.grid.major = ggplot2::element_line(colour = "grey85"), + panel.grid.minor = ggplot2::element_blank(), + # centre title and subtitle + plot.title = ggplot2::element_text(hjust = 0.5), + plot.subtitle = ggplot2::element_text(hjust = 0.5) + ) + + g +} + +#' @importFrom stats qchisq var +pca_calculations <- function(pca_model, + groups = NULL, + groups_missing = TRUE, + labels = NULL, + labels_missing = TRUE, + choices = 1:2, + scale = 1, + pc.biplot = TRUE, + ellipse_prob = 0.68, + labels_text_placement = 1.5) { + non_numeric_cols <- attributes(pca_model)$non_numeric_cols + if (groups_missing) { + groups <- tryCatch(non_numeric_cols[[1]], + error = function(e) NULL + ) + group_name <- tryCatch(colnames(non_numeric_cols[1]), + error = function(e) NULL + ) + } + if (labels_missing) { + labels <- tryCatch(non_numeric_cols[[2]], + error = function(e) NULL + ) + } + if (!is.null(groups) && is.null(labels)) { + # turn them around + labels <- groups + groups <- NULL + group_name <- NULL + } + + # Recover the SVD + if (inherits(pca_model, "prcomp")) { + nobs.factor <- sqrt(nrow(pca_model$x) - 1) + d <- pca_model$sdev + u <- sweep(pca_model$x, 2, 1 / (d * nobs.factor), FUN = "*") + v <- pca_model$rotation + } else if (inherits(pca_model, "princomp")) { + nobs.factor <- sqrt(pca_model$n.obs) + d <- pca_model$sdev + u <- sweep(pca_model$scores, 2, 1 / (d * nobs.factor), FUN = "*") + v <- pca_model$loadings + } else if (inherits(pca_model, "PCA")) { + nobs.factor <- sqrt(nrow(pca_model$call$X)) + d <- unlist(sqrt(pca_model$eig)[1]) + u <- sweep(pca_model$ind$coord, 2, 1 / (d * nobs.factor), FUN = "*") + v <- sweep(pca_model$var$coord, 2, sqrt(pca_model$eig[seq_len(ncol(pca_model$var$coord)), 1]), FUN = "/") + } else if (inherits(pca_model, "lda")) { + nobs.factor <- sqrt(pca_model$N) + d <- pca_model$svd + u <- predict(pca_model)$x / nobs.factor + v <- pca_model$scaling + } else { + stop("Expected an object of class prcomp, princomp, PCA, or lda") + } + + # Scores + choices <- pmin(choices, ncol(u)) + obs.scale <- 1 - as.integer(scale) + df.u <- as.data.frame(sweep(u[, choices], 2, d[choices]^obs.scale, FUN = "*"), + stringsAsFactors = FALSE + ) + + # Directions + v <- sweep(v, 2, d^as.integer(scale), FUN = "*") + df.v <- as.data.frame(v[, choices], + stringsAsFactors = FALSE + ) + + names(df.u) <- c("xvar", "yvar") + names(df.v) <- names(df.u) + + if (isTRUE(pc.biplot)) { + df.u <- df.u * nobs.factor + } + + # Scale the radius of the correlation circle so that it corresponds to + # a data ellipse for the standardized PC scores + circle_prob <- 0.69 + r <- sqrt(qchisq(circle_prob, df = 2)) * prod(colMeans(df.u^2))^(0.25) + + # Scale directions + v.scale <- rowSums(v^2) + df.v <- r * df.v / sqrt(max(v.scale)) + + # Grouping variable + if (!is.null(groups)) { + df.u$groups <- groups + } + + df.v$varname <- rownames(v) + + # Variables for text label placement + df.v$angle <- with(df.v, (180 / pi) * atan(yvar / xvar)) + df.v$hjust <- with(df.v, (1 - labels_text_placement * sign(xvar)) / 2) + + if (!is.null(df.u$groups)) { + theta <- c(seq(-pi, pi, length = 50), seq(pi, -pi, length = 50)) + circle <- cbind(cos(theta), sin(theta)) + + df.groups <- lapply(unique(df.u$groups), function(g, df = df.u) { + x <- df[which(df$groups == g), , drop = FALSE] + if (nrow(x) <= 2) { + return(data.frame( + X1 = numeric(0), + X2 = numeric(0), + groups = character(0), + stringsAsFactors = FALSE + )) + } + sigma <- var(cbind(x$xvar, x$yvar)) + mu <- c(mean(x$xvar), mean(x$yvar)) + ed <- sqrt(qchisq(ellipse_prob, df = 2)) + data.frame( + sweep(circle %*% chol(sigma) * ed, + MARGIN = 2, + STATS = mu, + FUN = "+" + ), + groups = x$groups[1], + stringsAsFactors = FALSE + ) + }) + ell <- do.call(rbind, df.groups) + if (NROW(ell) == 0) { + ell <- NULL + } else { + names(ell)[1:2] <- c("xvar", "yvar") + } + } else { + ell <- NULL + } + + list( + choices = choices, + df.u = df.u, + df.v = df.v, + ell = ell, + groups = groups, + group_name = group_name, + labels = labels + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/ggplot_sir.R + + + + +#' AMR Plots with `ggplot2` +#' +#' Use these functions to create bar plots for AMR data analysis. All functions rely on [ggplot2][ggplot2::ggplot()] functions. +#' @param data a [data.frame] with column(s) of class [`sir`] (see [as.sir()]) +#' @param position position adjustment of bars, either `"fill"`, `"stack"` or `"dodge"` +#' @param x variable to show on x axis, either `"antibiotic"` (default) or `"interpretation"` or a grouping variable +#' @param fill variable to categorise using the plots legend, either `"antibiotic"` (default) or `"interpretation"` or a grouping variable +#' @param breaks a [numeric] vector of positions +#' @param limits a [numeric] vector of length two providing limits of the scale, use `NA` to refer to the existing minimum or maximum +#' @param facet variable to split plots by, either `"interpretation"` (default) or `"antibiotic"` or a grouping variable +#' @inheritParams proportion +#' @param nrow (when using `facet`) number of rows +#' @param colours a named vactor with colour to be used for filling. The default colours are colour-blind friendly. +#' @param aesthetics aesthetics to apply the colours to - the default is "fill" but can also be (a combination of) "alpha", "colour", "fill", "linetype", "shape" or "size" +#' @param datalabels show datalabels using [labels_sir_count()] +#' @param datalabels.size size of the datalabels +#' @param datalabels.colour colour of the datalabels +#' @param title text to show as title of the plot +#' @param subtitle text to show as subtitle of the plot +#' @param caption text to show as caption of the plot +#' @param x.title text to show as x axis description +#' @param y.title text to show as y axis description +#' @param ... other arguments passed on to [geom_sir()] or, in case of [scale_sir_colours()], named values to set colours. The default colours are colour-blind friendly, while maintaining the convention that e.g. 'susceptible' should be green and 'resistant' should be red. See *Examples*. +#' @details At default, the names of antibiotics will be shown on the plots using [ab_name()]. This can be set with the `translate_ab` argument. See [count_df()]. +#' +#' ### The Functions +#' [geom_sir()] will take any variable from the data that has an [`sir`] class (created with [as.sir()]) using [sir_df()] and will plot bars with the percentage S, I, and R. The default behaviour is to have the bars stacked and to have the different antibiotics on the x axis. +#' +#' [facet_sir()] creates 2d plots (at default based on S/I/R) using [ggplot2::facet_wrap()]. +#' +#' [scale_y_percent()] transforms the y axis to a 0 to 100% range using [ggplot2::scale_y_continuous()]. +#' +#' [scale_sir_colours()] sets colours to the bars (green for S, yellow for I, and red for R). with multilingual support. The default colours are colour-blind friendly, while maintaining the convention that e.g. 'susceptible' should be green and 'resistant' should be red. +#' +#' [theme_sir()] is a [ggplot2 theme][[ggplot2::theme()] with minimal distraction. +#' +#' [labels_sir_count()] print datalabels on the bars with percentage and amount of isolates using [ggplot2::geom_text()]. +#' +#' [ggplot_sir()] is a wrapper around all above functions that uses data as first input. This makes it possible to use this function after a pipe (`%>%`). See *Examples*. +#' @rdname ggplot_sir +#' @export +#' @examples +#' \donttest{ +#' if (require("ggplot2") && require("dplyr")) { +#' # get antimicrobial results for drugs against a UTI: +#' ggplot(example_isolates %>% select(AMX, NIT, FOS, TMP, CIP)) + +#' geom_sir() +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # prettify the plot using some additional functions: +#' df <- example_isolates %>% select(AMX, NIT, FOS, TMP, CIP) +#' ggplot(df) + +#' geom_sir() + +#' scale_y_percent() + +#' scale_sir_colours() + +#' labels_sir_count() + +#' theme_sir() +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # or better yet, simplify this using the wrapper function - a single command: +#' example_isolates %>% +#' select(AMX, NIT, FOS, TMP, CIP) %>% +#' ggplot_sir() +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # get only proportions and no counts: +#' example_isolates %>% +#' select(AMX, NIT, FOS, TMP, CIP) %>% +#' ggplot_sir(datalabels = FALSE) +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # add other ggplot2 arguments as you like: +#' example_isolates %>% +#' select(AMX, NIT, FOS, TMP, CIP) %>% +#' ggplot_sir( +#' width = 0.5, +#' colour = "black", +#' size = 1, +#' linetype = 2, +#' alpha = 0.25 +#' ) +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # you can alter the colours with colour names: +#' example_isolates %>% +#' select(AMX) %>% +#' ggplot_sir(colours = c(SI = "yellow")) +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # but you can also use the built-in colour-blind friendly colours for +#' # your plots, where "S" is green, "I" is yellow and "R" is red: +#' data.frame( +#' x = c("Value1", "Value2", "Value3"), +#' y = c(1, 2, 3), +#' z = c("Value4", "Value5", "Value6") +#' ) %>% +#' ggplot() + +#' geom_col(aes(x = x, y = y, fill = z)) + +#' scale_sir_colours(Value4 = "S", Value5 = "I", Value6 = "R") +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # resistance of ciprofloxacine per age group +#' example_isolates %>% +#' mutate(first_isolate = first_isolate()) %>% +#' filter( +#' first_isolate == TRUE, +#' mo == as.mo("Escherichia coli") +#' ) %>% +#' # age_groups() is also a function in this AMR package: +#' group_by(age_group = age_groups(age)) %>% +#' select(age_group, CIP) %>% +#' ggplot_sir(x = "age_group") +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # a shorter version which also adjusts data label colours: +#' example_isolates %>% +#' select(AMX, NIT, FOS, TMP, CIP) %>% +#' ggplot_sir(colours = FALSE) +#' } +#' if (require("ggplot2") && require("dplyr")) { +#' # it also supports groups (don't forget to use the group var on `x` or `facet`): +#' example_isolates %>% +#' filter(mo_is_gram_negative(), ward != "Outpatient") %>% +#' # select only UTI-specific drugs +#' select(ward, AMX, NIT, FOS, TMP, CIP) %>% +#' group_by(ward) %>% +#' ggplot_sir( +#' x = "ward", +#' facet = "antibiotic", +#' nrow = 1, +#' title = "AMR of Anti-UTI Drugs Per Ward", +#' x.title = "Ward", +#' datalabels = FALSE +#' ) +#' } +#' } +ggplot_sir <- function(data, + position = NULL, + x = "antibiotic", + fill = "interpretation", + # params = list(), + facet = NULL, + breaks = seq(0, 1, 0.1), + limits = NULL, + translate_ab = "name", + combine_SI = TRUE, + minimum = 30, + language = get_AMR_locale(), + nrow = NULL, + colours = c( + S = "#3CAEA3", + SI = "#3CAEA3", + I = "#F6D55C", + IR = "#ED553B", + R = "#ED553B" + ), + datalabels = TRUE, + datalabels.size = 2.5, + datalabels.colour = "grey15", + title = NULL, + subtitle = NULL, + caption = NULL, + x.title = "Antimicrobial", + y.title = "Proportion", + ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(data, allow_class = "data.frame") + data <- ascertain_sir_classes(data, "data") + meet_criteria(position, allow_class = "character", has_length = 1, is_in = c("fill", "stack", "dodge"), allow_NULL = TRUE) + meet_criteria(x, allow_class = "character", has_length = 1) + meet_criteria(fill, allow_class = "character", has_length = 1) + meet_criteria(facet, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(breaks, allow_class = c("numeric", "integer")) + meet_criteria(limits, allow_class = c("numeric", "integer"), has_length = 2, allow_NULL = TRUE, allow_NA = TRUE) + meet_criteria(translate_ab, allow_class = c("character", "logical"), has_length = 1, allow_NA = TRUE) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + language <- validate_language(language) + meet_criteria(nrow, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive = TRUE, is_finite = TRUE) + meet_criteria(colours, allow_class = c("character", "logical")) + meet_criteria(datalabels, allow_class = "logical", has_length = 1) + meet_criteria(datalabels.size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(datalabels.colour, allow_class = "character", has_length = 1) + meet_criteria(title, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(subtitle, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(caption, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(x.title, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(y.title, allow_class = "character", has_length = 1, allow_NULL = TRUE) + + # we work with aes_string later on + x_deparse <- deparse(substitute(x)) + if (x_deparse != "x") { + x <- x_deparse + } + if (x %like% '".*"') { + x <- substr(x, 2, nchar(x) - 1) + } + facet_deparse <- deparse(substitute(facet)) + if (facet_deparse != "facet") { + facet <- facet_deparse + } + if (facet %like% '".*"') { + facet <- substr(facet, 2, nchar(facet) - 1) + } + if (facet %in% c("NULL", "")) { + facet <- NULL + } + + if (is.null(position)) { + position <- "fill" + } + + p <- ggplot2::ggplot(data = data) + + geom_sir( + position = position, x = x, fill = fill, translate_ab = translate_ab, + minimum = minimum, language = language, + combine_SI = combine_SI, ... + ) + + theme_sir() + + if (fill == "interpretation") { + p <- p + scale_sir_colours(colours = colours) + } + + if (identical(position, "fill")) { + # proportions, so use y scale with percentage + p <- p + scale_y_percent(breaks = breaks, limits = limits) + } + + if (datalabels == TRUE) { + p <- p + labels_sir_count( + position = position, + x = x, + translate_ab = translate_ab, + minimum = minimum, + language = language, + combine_SI = combine_SI, + datalabels.size = datalabels.size, + datalabels.colour = datalabels.colour + ) + } + + if (!is.null(facet)) { + p <- p + facet_sir(facet = facet, nrow = nrow) + } + + p <- p + ggplot2::labs( + title = title, + subtitle = subtitle, + caption = caption, + x = x.title, + y = y.title + ) + + p +} + +#' @rdname ggplot_sir +#' @export +geom_sir <- function(position = NULL, + x = c("antibiotic", "interpretation"), + fill = "interpretation", + translate_ab = "name", + minimum = 30, + language = get_AMR_locale(), + combine_SI = TRUE, + ...) { + x <- x[1] + stop_ifnot_installed("ggplot2") + stop_if(is.data.frame(position), "`position` is invalid. Did you accidentally use '%>%' instead of '+'?") + meet_criteria(position, allow_class = "character", has_length = 1, is_in = c("fill", "stack", "dodge"), allow_NULL = TRUE) + meet_criteria(x, allow_class = "character", has_length = 1) + meet_criteria(fill, allow_class = "character", has_length = 1) + meet_criteria(translate_ab, allow_class = c("character", "logical"), has_length = 1, allow_NA = TRUE) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + language <- validate_language(language) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + + y <- "value" + if (missing(position) || is.null(position)) { + position <- "fill" + } + + if (identical(position, "fill")) { + position <- ggplot2::position_fill(vjust = 0.5, reverse = TRUE) + } + + # we work with aes_string later on + x_deparse <- deparse(substitute(x)) + if (x_deparse != "x") { + x <- x_deparse + } + if (x %like% '".*"') { + x <- substr(x, 2, nchar(x) - 1) + } + + if (tolower(x) %in% tolower(c("ab", "abx", "antibiotics"))) { + x <- "antibiotic" + } else if (tolower(x) %in% tolower(c("SIR", "sir", "interpretations", "result"))) { + x <- "interpretation" + } + + ggplot2::geom_col( + data = function(x) { + sir_df( + data = x, + translate_ab = translate_ab, + language = language, + minimum = minimum, + combine_SI = combine_SI + ) + }, + mapping = ggplot2::aes_string(x = x, y = y, fill = fill), + position = position, + ... + ) +} + +#' @rdname ggplot_sir +#' @export +facet_sir <- function(facet = c("interpretation", "antibiotic"), nrow = NULL) { + facet <- facet[1] + stop_ifnot_installed("ggplot2") + meet_criteria(facet, allow_class = "character", has_length = 1) + meet_criteria(nrow, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive = TRUE, is_finite = TRUE) + + # we work with aes_string later on + facet_deparse <- deparse(substitute(facet)) + if (facet_deparse != "facet") { + facet <- facet_deparse + } + if (facet %like% '".*"') { + facet <- substr(facet, 2, nchar(facet) - 1) + } + + if (tolower(facet) %in% tolower(c("SIR", "sir", "interpretations", "result"))) { + facet <- "interpretation" + } else if (tolower(facet) %in% tolower(c("ab", "abx", "antibiotics"))) { + facet <- "antibiotic" + } + + ggplot2::facet_wrap(facets = facet, scales = "free_x", nrow = nrow) +} + +#' @rdname ggplot_sir +#' @export +scale_y_percent <- function(breaks = seq(0, 1, 0.1), limits = NULL) { + stop_ifnot_installed("ggplot2") + meet_criteria(breaks, allow_class = c("numeric", "integer")) + meet_criteria(limits, allow_class = c("numeric", "integer"), has_length = 2, allow_NULL = TRUE, allow_NA = TRUE) + + if (all(breaks[breaks != 0] > 1)) { + breaks <- breaks / 100 + } + ggplot2::scale_y_continuous( + breaks = breaks, + labels = percentage(breaks), + limits = limits + ) +} + +#' @rdname ggplot_sir +#' @export +scale_sir_colours <- function(..., + aesthetics = "fill") { + stop_ifnot_installed("ggplot2") + meet_criteria(aesthetics, allow_class = "character", is_in = c("alpha", "colour", "color", "fill", "linetype", "shape", "size")) + # behaviour until AMR pkg v1.5.0 and also when coming from ggplot_sir() + if ("colours" %in% names(list(...))) { + original_cols <- c( + S = "#3CAEA3", + SI = "#3CAEA3", + I = "#F6D55C", + IR = "#ED553B", + R = "#ED553B" + ) + colours <- replace(original_cols, names(list(...)$colours), list(...)$colours) + # limits = force is needed in ggplot2 3.3.4 and 3.3.5, see here; + # https://github.com/tidyverse/ggplot2/issues/4511#issuecomment-866185530 + return(ggplot2::scale_fill_manual(values = colours, limits = force)) + } + if (identical(unlist(list(...)), FALSE)) { + return(invisible()) + } + + names_susceptible <- c( + "S", "SI", "IS", "S+I", "I+S", "susceptible", "Susceptible", + unique(TRANSLATIONS[which(TRANSLATIONS$pattern == "Susceptible"), + "replacement", + drop = TRUE + ]) + ) + names_incr_exposure <- c( + "I", "intermediate", "increased exposure", "incr. exposure", + "Increased exposure", "Incr. exposure", "Susceptible, incr. exp.", + unique(TRANSLATIONS[which(TRANSLATIONS$pattern == "Intermediate"), + "replacement", + drop = TRUE + ]), + unique(TRANSLATIONS[which(TRANSLATIONS$pattern == "Susceptible, incr. exp."), + "replacement", + drop = TRUE + ]) + ) + names_resistant <- c( + "R", "IR", "RI", "R+I", "I+R", "resistant", "Resistant", + unique(TRANSLATIONS[which(TRANSLATIONS$pattern == "Resistant"), + "replacement", + drop = TRUE + ]) + ) + + susceptible <- rep("#3CAEA3", length(names_susceptible)) + names(susceptible) <- names_susceptible + incr_exposure <- rep("#F6D55C", length(names_incr_exposure)) + names(incr_exposure) <- names_incr_exposure + resistant <- rep("#ED553B", length(names_resistant)) + names(resistant) <- names_resistant + + original_cols <- c(susceptible, incr_exposure, resistant) + dots <- c(...) + # replace S, I, R as colours: scale_sir_colours(mydatavalue = "S") + dots[dots == "S"] <- "#3CAEA3" + dots[dots == "I"] <- "#F6D55C" + dots[dots == "R"] <- "#ED553B" + cols <- replace(original_cols, names(dots), dots) + # limits = force is needed in ggplot2 3.3.4 and 3.3.5, see here; + # https://github.com/tidyverse/ggplot2/issues/4511#issuecomment-866185530 + ggplot2::scale_discrete_manual(aesthetics = aesthetics, values = cols, limits = force) +} + +#' @rdname ggplot_sir +#' @export +theme_sir <- function() { + stop_ifnot_installed("ggplot2") + ggplot2::theme_minimal(base_size = 10) + + ggplot2::theme( + panel.grid.major.x = ggplot2::element_blank(), + panel.grid.minor = ggplot2::element_blank(), + panel.grid.major.y = ggplot2::element_line(colour = "grey75"), + # center title and subtitle + plot.title = ggplot2::element_text(hjust = 0.5), + plot.subtitle = ggplot2::element_text(hjust = 0.5) + ) +} + +#' @rdname ggplot_sir +#' @export +labels_sir_count <- function(position = NULL, + x = "antibiotic", + translate_ab = "name", + minimum = 30, + language = get_AMR_locale(), + combine_SI = TRUE, + datalabels.size = 3, + datalabels.colour = "grey15") { + stop_ifnot_installed("ggplot2") + meet_criteria(position, allow_class = "character", has_length = 1, is_in = c("fill", "stack", "dodge"), allow_NULL = TRUE) + meet_criteria(x, allow_class = "character", has_length = 1) + meet_criteria(translate_ab, allow_class = c("character", "logical"), has_length = 1, allow_NA = TRUE) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + language <- validate_language(language) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(datalabels.size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(datalabels.colour, allow_class = "character", has_length = 1) + + if (is.null(position)) { + position <- "fill" + } + if (identical(position, "fill")) { + position <- ggplot2::position_fill(vjust = 0.5, reverse = TRUE) + } + x_name <- x + ggplot2::geom_text( + mapping = ggplot2::aes_string( + label = "lbl", + x = x, + y = "value" + ), + position = position, + inherit.aes = FALSE, + size = datalabels.size, + colour = datalabels.colour, + lineheight = 0.75, + data = function(x) { + transformed <- sir_df( + data = x, + translate_ab = translate_ab, + combine_SI = combine_SI, + minimum = minimum, + language = language + ) + transformed$gr <- transformed[, x_name, drop = TRUE] + transformed %pm>% + pm_group_by(gr) %pm>% + pm_mutate(lbl = paste0("n=", isolates)) %pm>% + pm_ungroup() %pm>% + pm_select(-gr) + } + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/guess_ab_col.R + + + + +#' Guess Antibiotic Column +#' +#' This tries to find a column name in a data set based on information from the [antibiotics] data set. Also supports WHONET abbreviations. +#' @param x a [data.frame] +#' @param search_string a text to search `x` for, will be checked with [as.ab()] if this value is not a column in `x` +#' @param verbose a [logical] to indicate whether additional info should be printed +#' @param only_sir_columns a [logical] to indicate whether only antibiotic columns must be detected that were transformed to class `sir` (see [as.sir()]) on beforehand (default is `FALSE`) +#' @details You can look for an antibiotic (trade) name or abbreviation and it will search `x` and the [antibiotics] data set for any column containing a name or code of that antibiotic. +#' @return A column name of `x`, or `NULL` when no result is found. +#' @export +#' @examples +#' df <- data.frame( +#' amox = "S", +#' tetr = "R" +#' ) +#' +#' guess_ab_col(df, "amoxicillin") +#' guess_ab_col(df, "J01AA07") # ATC code of tetracycline +#' +#' guess_ab_col(df, "J01AA07", verbose = TRUE) +#' # NOTE: Using column 'tetr' as input for J01AA07 (tetracycline). +#' +#' # WHONET codes +#' df <- data.frame( +#' AMP_ND10 = "R", +#' AMC_ED20 = "S" +#' ) +#' guess_ab_col(df, "ampicillin") +#' guess_ab_col(df, "J01CR02") +#' guess_ab_col(df, as.ab("augmentin")) +guess_ab_col <- function(x = NULL, search_string = NULL, verbose = FALSE, only_sir_columns = FALSE) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(search_string, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(verbose, allow_class = "logical", has_length = 1) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + + if (is.null(x) && is.null(search_string)) { + return(as.name("guess_ab_col")) + } else { + meet_criteria(search_string, allow_class = "character", has_length = 1, allow_NULL = FALSE) + } + + all_found <- get_column_abx(x, + info = verbose, only_sir_columns = only_sir_columns, + verbose = verbose, fn = "guess_ab_col" + ) + search_string.ab <- suppressWarnings(as.ab(search_string)) + ab_result <- unname(all_found[names(all_found) == search_string.ab]) + + if (length(ab_result) == 0) { + if (isTRUE(verbose)) { + message_("No column found as input for ", search_string, + " (", ab_name(search_string, language = NULL, tolower = TRUE), ").", + add_fn = font_black, + as_note = FALSE + ) + } + return(NULL) + } else { + if (isTRUE(verbose)) { + message_( + "Using column '", font_bold(ab_result), "' as input for ", search_string, + " (", ab_name(search_string, language = NULL, tolower = TRUE), ")." + ) + } + return(ab_result) + } +} + +get_column_abx <- function(x, + ..., + soft_dependencies = NULL, + hard_dependencies = NULL, + verbose = FALSE, + info = TRUE, + only_sir_columns = FALSE, + sort = TRUE, + reuse_previous_result = TRUE, + fn = NULL) { + # check if retrieved before, then get it from package environment + if (isTRUE(reuse_previous_result) && identical( + unique_call_id( + entire_session = FALSE, + match_fn = fn + ), + AMR_env$get_column_abx.call + )) { + # so within the same call, within the same environment, we got here again. + # but we could've come from another function within the same call, so now only check the columns that changed + + # first remove the columns that are not existing anymore + previous <- AMR_env$get_column_abx.out + current <- previous[previous %in% colnames(x)] + + # then compare columns in current call with columns in original call + new_cols <- colnames(x)[!colnames(x) %in% AMR_env$get_column_abx.checked_cols] + if (length(new_cols) > 0) { + # these columns did not exist in the last call, so add them + new_cols_sir <- get_column_abx(x[, new_cols, drop = FALSE], reuse_previous_result = FALSE, info = FALSE, sort = FALSE) + current <- c(current, new_cols_sir) + # order according to columns in current call + current <- current[match(colnames(x)[colnames(x) %in% current], current)] + } + + # update pkg environment to improve speed on next run + AMR_env$get_column_abx.out <- current + AMR_env$get_column_abx.checked_cols <- colnames(x) + + # and return right values + return(AMR_env$get_column_abx.out) + } + + meet_criteria(x, allow_class = "data.frame") + meet_criteria(soft_dependencies, allow_class = "character", allow_NULL = TRUE) + meet_criteria(hard_dependencies, allow_class = "character", allow_NULL = TRUE) + meet_criteria(verbose, allow_class = "logical", has_length = 1) + meet_criteria(info, allow_class = "logical", has_length = 1) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + meet_criteria(sort, allow_class = "logical", has_length = 1) + + if (isTRUE(info)) { + message_("Auto-guessing columns suitable for analysis", appendLF = FALSE, as_note = FALSE) + } + + x <- as.data.frame(x, stringsAsFactors = FALSE) + x.bak <- x + if (only_sir_columns == TRUE) { + x <- x[, which(is.sir(x)), drop = FALSE] + } + + if (NROW(x) > 10000) { + # only test maximum of 10,000 values per column + if (isTRUE(info)) { + message_(" (using only ", font_bold("the first 10,000 rows"), ")...", + appendLF = FALSE, + as_note = FALSE + ) + } + x <- x[1:10000, , drop = FALSE] + } else if (isTRUE(info)) { + message_("...", appendLF = FALSE, as_note = FALSE) + } + + # only check columns that are a valid AB code, ATC code, name, abbreviation or synonym, + # or already have the 'sir' class (as.sir) + # and that they have no more than 50% invalid values + vectr_antibiotics <- unlist(AMR_env$AB_lookup$generalised_all) + vectr_antibiotics <- vectr_antibiotics[!is.na(vectr_antibiotics) & nchar(vectr_antibiotics) >= 3] + x_columns <- vapply( + FUN.VALUE = character(1), + colnames(x), + function(col, df = x) { + if (generalise_antibiotic_name(col) %in% vectr_antibiotics || + is.sir(x[, col, drop = TRUE]) || + is_sir_eligible(x[, col, drop = TRUE], threshold = 0.5) + ) { + return(col) + } else { + return(NA_character_) + } + }, USE.NAMES = FALSE + ) + + x_columns <- x_columns[!is.na(x_columns)] + x <- x[, x_columns, drop = FALSE] # without drop = FALSE, x will become a vector when x_columns is length 1 + df_trans <- data.frame( + colnames = colnames(x), + abcode = suppressWarnings(as.ab(colnames(x), info = FALSE)), + stringsAsFactors = FALSE + ) + df_trans <- df_trans[!is.na(df_trans$abcode), , drop = FALSE] + out <- as.character(df_trans$colnames) + names(out) <- df_trans$abcode + + # add from self-defined dots (...): + # such as get_column_abx(example_isolates %>% rename(thisone = AMX), amox = "thisone") + all_okay <- TRUE + dots <- list(...) + # remove data.frames, since this is also used running `eucast_rules(eucast_rules_df = df)` + dots <- dots[!vapply(FUN.VALUE = logical(1), dots, is.data.frame)] + if (length(dots) > 0) { + newnames <- suppressWarnings(as.ab(names(dots), info = FALSE)) + if (anyNA(newnames)) { + if (isTRUE(info)) { + message_(" WARNING", add_fn = list(font_yellow, font_bold), as_note = FALSE) + } + warning_("Invalid antibiotic reference(s): ", vector_and(names(dots)[is.na(newnames)], quotes = FALSE), + call = FALSE, + immediate = TRUE + ) + all_okay <- FALSE + } + unexisting_cols <- which(!vapply(FUN.VALUE = logical(1), dots, function(col) all(col %in% x_columns))) + if (length(unexisting_cols) > 0) { + if (isTRUE(info)) { + message_(" ERROR", add_fn = list(font_red, font_bold), as_note = FALSE) + } + stop_("Column(s) not found: ", vector_and(unlist(dots[[unexisting_cols]]), quotes = FALSE), + call = FALSE + ) + all_okay <- FALSE + } + # turn all NULLs to NAs + dots <- unlist(lapply(dots, function(dot) if (is.null(dot)) NA else dot)) + names(dots) <- newnames + dots <- dots[!is.na(names(dots))] + # merge, but overwrite automatically determined ones by 'dots' + out <- c(out[!out %in% dots & !names(out) %in% names(dots)], dots) + # delete NAs, this will make e.g. eucast_rules(... TMP = NULL) work to prevent TMP from being used + out <- out[!is.na(out)] + } + + if (length(out) == 0) { + if (isTRUE(info) && all_okay == TRUE) { + message_("No columns found.") + } + AMR_env$get_column_abx.call <- unique_call_id(entire_session = FALSE, match_fn = fn) + AMR_env$get_column_abx.checked_cols <- colnames(x.bak) + AMR_env$get_column_abx.out <- out + return(out) + } + + # sort on name + if (sort == TRUE) { + out <- out[order(names(out), out)] + } + # only keep the first hits, no duplicates + duplicates <- c(out[duplicated(names(out))], out[duplicated(unname(out))]) + if (length(duplicates) > 0) { + all_okay <- FALSE + } + + if (isTRUE(info)) { + if (all_okay == TRUE) { + message_(" OK.", add_fn = list(font_green, font_bold), as_note = FALSE) + } else { + message_(" WARNING.", add_fn = list(font_yellow, font_bold), as_note = FALSE) + } + for (i in seq_len(length(out))) { + if (isTRUE(verbose) && !names(out[i]) %in% names(duplicates)) { + message_( + "Using column '", font_bold(out[i]), "' as input for ", names(out)[i], + " (", ab_name(names(out)[i], tolower = TRUE, language = NULL), ")." + ) + } + if (names(out[i]) %in% names(duplicates)) { + already_set_as <- out[unname(out) == unname(out[i])][1L] + if (names(out)[i] != names(already_set_as)) { + warning_( + paste0( + "Column '", font_bold(out[i]), "' will not be used for ", + names(out)[i], " (", ab_name(names(out)[i], tolower = TRUE, language = NULL), ")", + ", as it is already set for ", + names(already_set_as), " (", ab_name(names(already_set_as), tolower = TRUE, language = NULL), ")" + ), + add_fn = font_red, + immediate = verbose + ) + } + } + } + } + + out <- out[!duplicated(names(out))] + out <- out[!duplicated(unname(out))] + if (sort == TRUE) { + out <- out[order(names(out), out)] + } + + if (!is.null(hard_dependencies)) { + hard_dependencies <- unique(hard_dependencies) + if (!all(hard_dependencies %in% names(out))) { + # missing a hard dependency will return NA and consequently the data will not be analysed + missing <- hard_dependencies[!hard_dependencies %in% names(out)] + generate_warning_abs_missing(missing, any = FALSE) + return(NA) + } + } + if (!is.null(soft_dependencies)) { + soft_dependencies <- unique(soft_dependencies) + if (isTRUE(info) && !all(soft_dependencies %in% names(out))) { + # missing a soft dependency may lower the reliability + missing <- soft_dependencies[!soft_dependencies %in% names(out)] + missing_msg <- vector_and( + paste0( + ab_name(missing, tolower = TRUE, language = NULL), + " (", font_bold(missing, collapse = NULL), ")" + ), + quotes = FALSE + ) + message_( + "Reliability would be improved if these antimicrobial results would be available too: ", + missing_msg + ) + } + } + + AMR_env$get_column_abx.call <- unique_call_id(entire_session = FALSE, match_fn = fn) + AMR_env$get_column_abx.checked_cols <- colnames(x.bak) + AMR_env$get_column_abx.out <- out + out +} + +get_ab_from_namespace <- function(x, cols_ab) { + # cols_ab comes from get_column_abx() + + x <- trimws2(unique(toupper(unlist(strsplit(x, ",", fixed = TRUE))))) + x_new <- character() + for (val in x) { + if (paste0("AB_", val) %in% ls(envir = asNamespace("AMR"))) { + # antibiotic group names, as defined in data-raw/_pre_commit_checks.R, such as `AB_CARBAPENEMS` + val <- eval(parse(text = paste0("AB_", val)), envir = asNamespace("AMR")) + } else if (val %in% AMR_env$AB_lookup$ab) { + # separate drugs, such as `AMX` + val <- as.ab(val) + } else { + stop_("unknown antimicrobial drug (group): ", val, call = FALSE) + } + x_new <- c(x_new, val) + } + x_new <- unique(x_new) + out <- cols_ab[match(x_new, names(cols_ab))] + out[!is.na(out)] +} + +generate_warning_abs_missing <- function(missing, any = FALSE) { + missing <- paste0(missing, " (", ab_name(missing, tolower = TRUE, language = NULL), ")") + if (any == TRUE) { + any_txt <- c(" any of", "is") + } else { + any_txt <- c("", "are") + } + warning_( + paste0( + "Introducing NAs since", any_txt[1], " these antimicrobials ", any_txt[2], " required: ", + vector_and(missing, quotes = FALSE) + ), + immediate = TRUE + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/italicise_taxonomy.R + + + + +#' Italicise Taxonomic Families, Genera, Species, Subspecies +#' +#' According to the binomial nomenclature, the lowest four taxonomic levels (family, genus, species, subspecies) should be printed in italics. This function finds taxonomic names within strings and makes them italic. +#' @param string a [character] (vector) +#' @param type type of conversion of the taxonomic names, either "markdown", "html" or "ansi", see *Details* +#' @details +#' This function finds the taxonomic names and makes them italic based on the [microorganisms] data set. +#' +#' The taxonomic names can be italicised using markdown (the default) by adding `*` before and after the taxonomic names, or `` and `` when using html. When using 'ansi', ANSI colours will be added using `\033[3m` before and `\033[23m` after the taxonomic names. If multiple ANSI colours are not available, no conversion will occur. +#' +#' This function also supports abbreviation of the genus if it is followed by a species, such as "E. coli" and "K. pneumoniae ozaenae". +#' @export +#' @examples +#' italicise_taxonomy("An overview of Staphylococcus aureus isolates") +#' italicise_taxonomy("An overview of S. aureus isolates") +#' +#' cat(italicise_taxonomy("An overview of S. aureus isolates", type = "ansi")) +italicise_taxonomy <- function(string, type = c("markdown", "ansi", "html")) { + if (missing(type)) { + type <- "markdown" + } + meet_criteria(string, allow_class = "character") + meet_criteria(type, allow_class = "character", has_length = 1, is_in = c("markdown", "ansi", "html")) + + add_MO_lookup_to_AMR_env() + + if (type == "markdown") { + before <- "*" + after <- "*" + } else if (type == "html") { + before <- "" + after <- "" + } else if (type == "ansi") { + if (!has_colour() && !identical(Sys.getenv("IN_PKGDOWN"), "true")) { + return(string) + } + before <- "\033[3m" + after <- "\033[23m" + } + + vapply( + FUN.VALUE = character(1), + string, + function(s) { + s_split <- unlist(strsplit(s, " ", fixed = TRUE)) + + search_strings <- gsub("[^a-zA-Z-]", "", s_split) + + ind_species <- search_strings != "" & + search_strings %in% AMR_env$MO_lookup[ + which(AMR_env$MO_lookup$rank %in% c( + "family", + "genus", + "species", + "subspecies", + "infraspecies", + "subsp." + )), + "species", + drop = TRUE + ] + + ind_fullname <- search_strings != "" & + search_strings %in% c( + AMR_env$MO_lookup[ + which(AMR_env$MO_lookup$rank %in% c( + "family", + "genus", + "species", + "subspecies", + "infraspecies", + "subsp." + )), + "fullname", + drop = TRUE + ], + AMR_env$MO_lookup[ + which(AMR_env$MO_lookup$rank %in% c( + "family", + "genus", + "species", + "subspecies", + "infraspecies", + "subsp." + )), + "subspecies", + drop = TRUE + ] + ) + + # also support E. coli, add "E." to indices + has_previous_genera_abbr <- s_split[which(ind_species) - 1] %like_case% "^[A-Z][.]?$" + ind_species <- c(which(ind_species), which(ind_species)[has_previous_genera_abbr] - 1) + + ind <- c(ind_species, which(ind_fullname)) + + s_split[ind] <- paste0(before, s_split[ind], after) + s_paste <- paste(s_split, collapse = " ") + + # clean up a bit + s_paste <- gsub(paste0(after, " ", before), " ", s_paste, fixed = TRUE) + + s_paste + }, + USE.NAMES = FALSE + ) +} + +#' @rdname italicise_taxonomy +#' @export +italicize_taxonomy <- function(string, type = c("markdown", "ansi", "html")) { + if (missing(type)) { + type <- "markdown" + } + italicise_taxonomy(string = string, type = type) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/join_microorganisms.R + + + + +#' Join [microorganisms] to a Data Set +#' +#' Join the data set [microorganisms] easily to an existing data set or to a [character] vector. +#' @rdname join +#' @name join +#' @aliases join inner_join +#' @param x existing data set to join, or [character] vector. In case of a [character] vector, the resulting [data.frame] will contain a column 'x' with these values. +#' @param by a variable to join by - if left empty will search for a column with class [`mo`] (created with [as.mo()]) or will be `"mo"` if that column name exists in `x`, could otherwise be a column name of `x` with values that exist in `microorganisms$mo` (such as `by = "bacteria_id"`), or another column in [microorganisms] (but then it should be named, like `by = c("bacteria_id" = "fullname")`) +#' @param suffix if there are non-joined duplicate variables in `x` and `y`, these suffixes will be added to the output to disambiguate them. Should be a [character] vector of length 2. +#' @param ... ignored, only in place to allow future extensions +#' @details **Note:** As opposed to the `join()` functions of `dplyr`, [character] vectors are supported and at default existing columns will get a suffix `"2"` and the newly joined columns will not get a suffix. +#' +#' If the `dplyr` package is installed, their join functions will be used. Otherwise, the much slower [merge()] and [interaction()] functions from base \R will be used. +#' @return a [data.frame] +#' @export +#' @examples +#' left_join_microorganisms(as.mo("K. pneumoniae")) +#' left_join_microorganisms("B_KLBSL_PNMN") +#' +#' df <- data.frame( +#' date = seq( +#' from = as.Date("2018-01-01"), +#' to = as.Date("2018-01-07"), +#' by = 1 +#' ), +#' bacteria = as.mo(c( +#' "S. aureus", "MRSA", "MSSA", "STAAUR", +#' "E. coli", "E. coli", "E. coli" +#' )), +#' stringsAsFactors = FALSE +#' ) +#' colnames(df) +#' +#' df_joined <- left_join_microorganisms(df, "bacteria") +#' colnames(df_joined) +#' +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' left_join_microorganisms() %>% +#' colnames() +#' } +#' } +inner_join_microorganisms <- function(x, by = NULL, suffix = c("2", ""), ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + meet_criteria(suffix, allow_class = "character", has_length = 2) + + join_microorganisms(type = "inner_join", x = x, by = by, suffix = suffix, ...) +} + +#' @rdname join +#' @export +left_join_microorganisms <- function(x, by = NULL, suffix = c("2", ""), ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + meet_criteria(suffix, allow_class = "character", has_length = 2) + + join_microorganisms(type = "left_join", x = x, by = by, suffix = suffix, ...) +} + +#' @rdname join +#' @export +right_join_microorganisms <- function(x, by = NULL, suffix = c("2", ""), ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + meet_criteria(suffix, allow_class = "character", has_length = 2) + + join_microorganisms(type = "right_join", x = x, by = by, suffix = suffix, ...) +} + +#' @rdname join +#' @export +full_join_microorganisms <- function(x, by = NULL, suffix = c("2", ""), ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + meet_criteria(suffix, allow_class = "character", has_length = 2) + + join_microorganisms(type = "full_join", x = x, by = by, suffix = suffix, ...) +} + +#' @rdname join +#' @export +semi_join_microorganisms <- function(x, by = NULL, ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + + join_microorganisms(type = "semi_join", x = x, by = by, ...) +} + +#' @rdname join +#' @export +anti_join_microorganisms <- function(x, by = NULL, ...) { + meet_criteria(x, allow_class = c("data.frame", "character")) + meet_criteria(by, allow_class = "character", allow_NULL = TRUE) + + join_microorganisms(type = "anti_join", x = x, by = by, ...) +} + +join_microorganisms <- function(type, x, by, suffix, ...) { + add_MO_lookup_to_AMR_env() + + if (!is.data.frame(x)) { + if (pkg_is_available("tibble")) { + x <- import_fn("tibble", "tibble")(mo = x) + } else { + x <- data.frame(mo = x, stringsAsFactors = FALSE) + } + by <- "mo" + } + x.bak <- x + if (is.null(by)) { + by <- search_type_in_df(x, "mo", info = FALSE) + if (is.null(by) && NCOL(x) == 1) { + by <- colnames(x)[1L] + } else { + stop_if(is.null(by), "no column with microorganism names or codes found, set this column with `by`", call = -2) + } + message_('Joining, by = "', by, '"', add_fn = font_black, as_note = FALSE) # message same as dplyr::join functions + } + if (!all(x[, by, drop = TRUE] %in% AMR_env$MO_lookup$mo, na.rm = TRUE)) { + x$join.mo <- as.mo(x[, by, drop = TRUE]) + by <- c("join.mo" = "mo") + } else { + x[, by] <- as.mo(x[, by, drop = TRUE]) + } + + if (is.null(names(by))) { + # will always be joined to microorganisms$mo, so add name to that + by <- stats::setNames("mo", by) + } + + # use dplyr if available - it's much faster than poorman alternatives + dplyr_join <- import_fn(name = type, pkg = "dplyr", error_on_fail = FALSE) + if (!is.null(dplyr_join)) { + join_fn <- dplyr_join + } else { + # otherwise use poorman, see R/aa_helper_pm_functions.R + join_fn <- get(paste0("pm_", type), envir = asNamespace("AMR")) + } + MO_df <- AMR_env$MO_lookup[, colnames(AMR::microorganisms), drop = FALSE] + if (type %like% "full|left|right|inner") { + joined <- join_fn(x = x, y = MO_df, by = by, suffix = suffix, ...) + } else { + joined <- join_fn(x = x, y = MO_df, by = by, ...) + } + + if ("join.mo" %in% colnames(joined)) { + if ("mo" %in% colnames(joined)) { + ind_mo <- which(colnames(joined) %in% c("mo", "join.mo")) + colnames(joined)[ind_mo[1L]] <- paste0("mo", suffix[1L]) + colnames(joined)[ind_mo[2L]] <- paste0("mo", suffix[2L]) + } else { + colnames(joined)[colnames(joined) == "join.mo"] <- "mo" + } + } + + if (type %like% "full|left|right|inner" && NROW(joined) > NROW(x)) { + warning_("in `", type, "_microorganisms()`: the newly joined data set contains ", nrow(joined) - nrow(x), " rows more than the number of rows of `x`.") + } + + as_original_data_class(joined, class(x.bak)) # will remove tibble groups +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/key_antimicrobials.R + + + + +#' (Key) Antimicrobials for First Weighted Isolates +#' +#' These functions can be used to determine first weighted isolates by considering the phenotype for isolate selection (see [first_isolate()]). Using a phenotype-based method to determine first isolates is more reliable than methods that disregard phenotypes. +#' @param x a [data.frame] with antibiotics columns, like `AMX` or `amox`. Can be left blank to determine automatically +#' @param y,z [character] vectors to compare +#' @inheritParams first_isolate +#' @param universal names of **broad-spectrum** antimicrobial drugs, case-insensitive. Set to `NULL` to ignore. See *Details* for the default antimicrobial drugs +#' @param gram_negative names of antibiotic drugs for **Gram-positives**, case-insensitive. Set to `NULL` to ignore. See *Details* for the default antibiotic drugs +#' @param gram_positive names of antibiotic drugs for **Gram-negatives**, case-insensitive. Set to `NULL` to ignore. See *Details* for the default antibiotic drugs +#' @param antifungal names of antifungal drugs for **fungi**, case-insensitive. Set to `NULL` to ignore. See *Details* for the default antifungal drugs +#' @param only_sir_columns a [logical] to indicate whether only columns must be included that were transformed to class `sir` (see [as.sir()]) on beforehand (default is `FALSE`) +#' @param ... ignored, only in place to allow future extensions +#' @details +#' The [key_antimicrobials()] and [all_antimicrobials()] functions are context-aware. This means that the `x` argument can be left blank if used inside a [data.frame] call, see *Examples*. +#' +#' The function [key_antimicrobials()] returns a [character] vector with 12 antimicrobial results for every isolate. The function [all_antimicrobials()] returns a [character] vector with all antimicrobial drug results for every isolate. These vectors can then be compared using [antimicrobials_equal()], to check if two isolates have generally the same antibiogram. Missing and invalid values are replaced with a dot (`"."`) by [key_antimicrobials()] and ignored by [antimicrobials_equal()]. +#' +#' Please see the [first_isolate()] function how these important functions enable the 'phenotype-based' method for determination of first isolates. +#' +#' The default antimicrobial drugs used for **all rows** (set in `universal`) are: +#' +#' - Ampicillin +#' - Amoxicillin/clavulanic acid +#' - Cefuroxime +#' - Ciprofloxacin +#' - Piperacillin/tazobactam +#' - Trimethoprim/sulfamethoxazole +#' +#' The default antimicrobial drugs used for **Gram-negative bacteria** (set in `gram_negative`) are: +#' +#' - Cefotaxime +#' - Ceftazidime +#' - Colistin +#' - Gentamicin +#' - Meropenem +#' - Tobramycin +#' +#' The default antimicrobial drugs used for **Gram-positive bacteria** (set in `gram_positive`) are: +#' +#' - Erythromycin +#' - Oxacillin +#' - Rifampin +#' - Teicoplanin +#' - Tetracycline +#' - Vancomycin +#' +#' +#' The default antimicrobial drugs used for **fungi** (set in `antifungal`) are: +#' +#' - Anidulafungin +#' - Caspofungin +#' - Fluconazole +#' - Miconazole +#' - Nystatin +#' - Voriconazole +#' @rdname key_antimicrobials +#' @export +#' @seealso [first_isolate()] +#' @examples +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates. +#' +#' # output of the `key_antimicrobials()` function could be like this: +#' strainA <- "SSSRR.S.R..S" +#' strainB <- "SSSIRSSSRSSS" +#' +#' # those strings can be compared with: +#' antimicrobials_equal(strainA, strainB, type = "keyantimicrobials") +#' # TRUE, because I is ignored (as well as missing values) +#' +#' antimicrobials_equal(strainA, strainB, type = "keyantimicrobials", ignore_I = FALSE) +#' # FALSE, because I is not ignored and so the 4th [character] differs +#' +#' \donttest{ +#' if (require("dplyr")) { +#' # set key antibiotics to a new variable +#' my_patients <- example_isolates %>% +#' mutate(keyab = key_antimicrobials(antifungal = NULL)) %>% # no need to define `x` +#' mutate( +#' # now calculate first isolates +#' first_regular = first_isolate(col_keyantimicrobials = FALSE), +#' # and first WEIGHTED isolates +#' first_weighted = first_isolate(col_keyantimicrobials = "keyab") +#' ) +#' +#' # Check the difference in this data set, 'weighted' results in more isolates: +#' sum(my_patients$first_regular, na.rm = TRUE) +#' sum(my_patients$first_weighted, na.rm = TRUE) +#' } +#' } +key_antimicrobials <- function(x = NULL, + col_mo = NULL, + universal = c( + "ampicillin", "amoxicillin/clavulanic acid", "cefuroxime", + "piperacillin/tazobactam", "ciprofloxacin", "trimethoprim/sulfamethoxazole" + ), + gram_negative = c( + "gentamicin", "tobramycin", "colistin", + "cefotaxime", "ceftazidime", "meropenem" + ), + gram_positive = c( + "vancomycin", "teicoplanin", "tetracycline", + "erythromycin", "oxacillin", "rifampin" + ), + antifungal = c( + "anidulafungin", "caspofungin", "fluconazole", + "miconazole", "nystatin", "voriconazole" + ), + only_sir_columns = FALSE, + ...) { + if (is_null_or_grouped_tbl(x)) { + # when `x` is left blank, auto determine it (get_current_data() searches underlying data within call) + # is also fix for using a grouped df as input (a dot as first argument) + x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x) + } + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, allow_NA = TRUE, is_in = colnames(x)) + meet_criteria(universal, allow_class = "character", allow_NULL = TRUE) + meet_criteria(gram_negative, allow_class = "character", allow_NULL = TRUE) + meet_criteria(gram_positive, allow_class = "character", allow_NULL = TRUE) + meet_criteria(antifungal, allow_class = "character", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + + # force regular data.frame, not a tibble or data.table + x <- as.data.frame(x, stringsAsFactors = FALSE) + cols <- get_column_abx(x, info = FALSE, only_sir_columns = only_sir_columns, fn = "key_antimicrobials") + + # try to find columns based on type + # -- mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = FALSE) + } + if (is.null(col_mo)) { + warning_("in `key_antimicrobials()`: no column found for `col_mo`, ignoring antibiotics set in `gram_negative` and `gram_positive`, and antimycotics set in `antifungal`") + gramstain <- NA_character_ + kingdom <- NA_character_ + } else { + x.mo <- as.mo(x[, col_mo, drop = TRUE]) + gramstain <- mo_gramstain(x.mo, language = NULL) + kingdom <- mo_kingdom(x.mo, language = NULL) + } + + AMR_string <- function(x, values, name, filter, cols = cols) { + if (is.null(values)) { + return(rep(NA_character_, length(which(filter)))) + } + + values_old_length <- length(values) + values <- as.ab(values, flag_multiple_results = FALSE, info = FALSE) + values <- cols[names(cols) %in% values] + values_new_length <- length(values) + + if (values_new_length < values_old_length && + any(filter, na.rm = TRUE) && + message_not_thrown_before("key_antimicrobials", name)) { + warning_( + "in `key_antimicrobials()`: ", + ifelse(values_new_length == 0, + "No columns available ", + paste0("Only using ", values_new_length, " out of ", values_old_length, " defined columns ") + ), + "as key antimicrobials for ", name, "s. See `?key_antimicrobials`." + ) + } + + generate_antimcrobials_string(x[which(filter), c(universal, values), drop = FALSE]) + } + + if (is.null(universal)) { + universal <- character(0) + } else { + universal <- as.ab(universal, flag_multiple_results = FALSE, info = FALSE) + universal <- cols[names(cols) %in% universal] + } + + key_ab <- rep(NA_character_, nrow(x)) + + key_ab[which(gramstain == "Gram-negative")] <- AMR_string( + x = x, + values = gram_negative, + name = "Gram-negative", + filter = gramstain == "Gram-negative", + cols = cols + ) + + key_ab[which(gramstain == "Gram-positive")] <- AMR_string( + x = x, + values = gram_positive, + name = "Gram-positive", + filter = gramstain == "Gram-positive", + cols = cols + ) + + key_ab[which(kingdom == "Fungi")] <- AMR_string( + x = x, + values = antifungal, + name = "antifungal", + filter = kingdom == "Fungi", + cols = cols + ) + + # back-up - only use `universal` + key_ab[which(is.na(key_ab))] <- AMR_string( + x = x, + values = character(0), + name = "", + filter = is.na(key_ab), + cols = cols + ) + + if (length(unique(key_ab)) == 1) { + warning_("in `key_antimicrobials()`: no distinct key antibiotics determined.") + } + + key_ab +} + +#' @rdname key_antimicrobials +#' @export +all_antimicrobials <- function(x = NULL, + only_sir_columns = FALSE, + ...) { + if (is_null_or_grouped_tbl(x)) { + # when `x` is left blank, auto determine it (get_current_data() searches underlying data within call) + # is also fix for using a grouped df as input (a dot as first argument) + x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x) + } + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + + # force regular data.frame, not a tibble or data.table + x <- as.data.frame(x, stringsAsFactors = FALSE) + cols <- get_column_abx(x, + only_sir_columns = only_sir_columns, info = FALSE, + sort = FALSE, fn = "all_antimicrobials" + ) + + generate_antimcrobials_string(x[, cols, drop = FALSE]) +} + +generate_antimcrobials_string <- function(df) { + if (NCOL(df) == 0) { + return(rep("", NROW(df))) + } + if (NROW(df) == 0) { + return(character(0)) + } + tryCatch( + { + do.call( + paste0, + lapply( + as.list(df), + function(x) { + x <- toupper(as.character(x)) + x[x == "SDD"] <- "I" + # ignore "NI" here, no use for determining first isolates + x[!x %in% c("S", "I", "R")] <- "." + paste(x) + } + ) + ) + }, + error = function(e) rep(strrep(".", NCOL(df)), NROW(df)) + ) +} + +#' @rdname key_antimicrobials +#' @export +antimicrobials_equal <- function(y, + z, + type = c("points", "keyantimicrobials"), + ignore_I = TRUE, + points_threshold = 2, + ...) { + meet_criteria(y, allow_class = "character") + meet_criteria(z, allow_class = "character") + stop_if(missing(type), "argument \"type\" is missing, with no default") + meet_criteria(type, allow_class = "character", has_length = 1, is_in = c("points", "keyantimicrobials")) + meet_criteria(ignore_I, allow_class = "logical", has_length = 1) + meet_criteria(points_threshold, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + stop_ifnot(length(y) == length(z), "length of `y` and `z` must be equal") + + key2sir <- function(val) { + val <- strsplit(val, "", fixed = TRUE)[[1L]] + val.int <- rep(NA_real_, length(val)) + val.int[val == "S"] <- 1 + val.int[val %in% c("I", "SDD")] <- 2 + val.int[val == "R"] <- 3 + val.int + } + # only run on uniques + uniq <- unique(c(y, z)) + uniq_list <- lapply(uniq, key2sir) + names(uniq_list) <- uniq + + y <- uniq_list[match(y, names(uniq_list))] + z <- uniq_list[match(z, names(uniq_list))] + + determine_equality <- function(a, b, type, points_threshold, ignore_I) { + if (length(a) != length(b)) { + # incomparable, so not equal + return(FALSE) + } + # ignore NAs on both sides + NA_ind <- which(is.na(a) | is.na(b)) + a[NA_ind] <- NA_real_ + b[NA_ind] <- NA_real_ + + if (type == "points") { + # count points for every single character: + # - no change is 0 points + # - I <-> S|R is 0.5 point + # - S|R <-> R|S is 1 point + # use the levels of as.sir (S = 1, I = 2, R = 3) + # and divide by 2 (S = 0.5, I = 1, R = 1.5) + (sum(abs(a - b), na.rm = TRUE) / 2) < points_threshold + } else { + if (ignore_I == TRUE) { + ind <- which(a == 2 | b == 2) # since as.double(as.sir("I")) == 2 + a[ind] <- NA_real_ + b[ind] <- NA_real_ + } + all(a == b, na.rm = TRUE) + } + } + out <- unlist(Map( + f = determine_equality, + y, + z, + MoreArgs = list( + type = type, + points_threshold = points_threshold, + ignore_I = ignore_I + ), + USE.NAMES = FALSE + )) + out[is.na(y) | is.na(z)] <- NA + out +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/kurtosis.R + + + + +#' Kurtosis of the Sample +#' +#' @description Kurtosis is a measure of the "tailedness" of the probability distribution of a real-valued random variable. A normal distribution has a kurtosis of 3 and a excess kurtosis of 0. +#' @param x a vector of values, a [matrix] or a [data.frame] +#' @param na.rm a [logical] to indicate whether `NA` values should be stripped before the computation proceeds +#' @param excess a [logical] to indicate whether the *excess kurtosis* should be returned, defined as the kurtosis minus 3. +#' @seealso [skewness()] +#' @rdname kurtosis +#' @export +#' @examples +#' kurtosis(rnorm(10000)) +#' kurtosis(rnorm(10000), excess = TRUE) +kurtosis <- function(x, na.rm = FALSE, excess = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + meet_criteria(excess, allow_class = "logical", has_length = 1) + UseMethod("kurtosis") +} + +#' @method kurtosis default +#' @rdname kurtosis +#' @export +kurtosis.default <- function(x, na.rm = FALSE, excess = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + meet_criteria(excess, allow_class = "logical", has_length = 1) + x <- as.vector(x) + if (isTRUE(na.rm)) { + x <- x[!is.na(x)] + } + n <- length(x) + k <- n * sum((x - mean(x, na.rm = na.rm))^4, na.rm = na.rm) / + (sum((x - mean(x, na.rm = na.rm))^2, na.rm = na.rm)^2) + k - ifelse(excess, 3, 0) +} + +#' @method kurtosis matrix +#' @rdname kurtosis +#' @export +kurtosis.matrix <- function(x, na.rm = FALSE, excess = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + meet_criteria(excess, allow_class = "logical", has_length = 1) + apply(x, 2, kurtosis.default, na.rm = na.rm, excess = excess) +} + +#' @method kurtosis data.frame +#' @rdname kurtosis +#' @export +kurtosis.data.frame <- function(x, na.rm = FALSE, excess = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + meet_criteria(excess, allow_class = "logical", has_length = 1) + vapply(FUN.VALUE = double(1), x, kurtosis.default, na.rm = na.rm, excess = excess) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/like.R + + + + +#' Vectorised Pattern Matching with Keyboard Shortcut +#' +#' Convenient wrapper around [grepl()] to match a pattern: `x %like% pattern`. It always returns a [`logical`] vector and is always case-insensitive (use `x %like_case% pattern` for case-sensitive matching). Also, `pattern` can be as long as `x` to compare items of each index in both vectors, or they both can have the same length to iterate over all cases. +#' @param x a [character] vector where matches are sought, or an object which can be coerced by [as.character()] to a [character] vector. +#' @param pattern a [character] vector containing regular expressions (or a [character] string for `fixed = TRUE`) to be matched in the given [character] vector. Coerced by [as.character()] to a [character] string if possible. +#' @param ignore.case if `FALSE`, the pattern matching is *case sensitive* and if `TRUE`, case is ignored during matching. +#' @return A [logical] vector +#' @name like +#' @rdname like +#' @export +#' @details +#' These [like()] and `%like%`/`%unlike%` functions: +#' * Are case-insensitive (use `%like_case%`/`%unlike_case%` for case-sensitive matching) +#' * Support multiple patterns +#' * Check if `pattern` is a valid regular expression and sets `fixed = TRUE` if not, to greatly improve speed (vectorised over `pattern`) +#' * Always use compatibility with Perl unless `fixed = TRUE`, to greatly improve speed +#' +#' Using RStudio? The `%like%`/`%unlike%` functions can also be directly inserted in your code from the Addins menu and can have its own keyboard shortcut like `Shift+Ctrl+L` or `Shift+Cmd+L` (see menu `Tools` > `Modify Keyboard Shortcuts...`). If you keep pressing your shortcut, the inserted text will be iterated over `%like%` -> `%unlike%` -> `%like_case%` -> `%unlike_case%`. +#' @source Idea from the [`like` function from the `data.table` package](https://github.com/Rdatatable/data.table/blob/ec1259af1bf13fc0c96a1d3f9e84d55d8106a9a4/R/like.R), although altered as explained in *Details*. +#' @seealso [grepl()] + +#' @examples +#' # data.table has a more limited version of %like%, so unload it: +#' try(detach("package:data.table", unload = TRUE), silent = TRUE) +#' +#' a <- "This is a test" +#' b <- "TEST" +#' a %like% b +#' b %like% a +#' +#' # also supports multiple patterns +#' a <- c("Test case", "Something different", "Yet another thing") +#' b <- c("case", "diff", "yet") +#' a %like% b +#' a %unlike% b +#' +#' a[1] %like% b +#' a %like% b[1] +#' +#' \donttest{ +#' # get isolates whose name start with 'Entero' (case-insensitive) +#' example_isolates[which(mo_name() %like% "^entero"), ] +#' +#' if (require("dplyr")) { +#' example_isolates %>% +#' filter(mo_name() %like% "^ent") +#' } +#' } +like <- function(x, pattern, ignore.case = TRUE) { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(pattern, allow_NA = FALSE) + meet_criteria(ignore.case, allow_class = "logical", has_length = 1) + + if (all(is.na(x))) { + return(rep(FALSE, length(x))) + } + + # set to fixed if no valid regex (vectorised) + fixed <- !is_valid_regex(pattern) + + if (ignore.case == TRUE) { + # set here, otherwise if fixed = TRUE, this warning will be thrown: argument `ignore.case = TRUE` will be ignored + x <- tolower(x) + pattern <- tolower(pattern) + } + + if (is.factor(x)) { + x <- as.character(x) + } + + if (length(pattern) == 1) { + grepl(pattern, x, ignore.case = FALSE, fixed = fixed, perl = !fixed) + } else { + if (length(x) == 1) { + x <- rep(x, length(pattern)) + } else if (length(pattern) != length(x)) { + stop_( + "arguments `x` and `pattern` must be of same length, or either one must be 1 ", + "(`x` has length ", length(x), " and `pattern` has length ", length(pattern), ")" + ) + } + unlist( + Map( + f = grepl, + x = x, + pattern = pattern, + fixed = fixed, + perl = !fixed, + MoreArgs = list(ignore.case = FALSE), + USE.NAMES = FALSE + ) + ) + } +} + +#' @rdname like +#' @export +"%like%" <- function(x, pattern) { + like(x, pattern, ignore.case = TRUE) +} + +#' @rdname like +#' @export +"%unlike%" <- function(x, pattern) { + !like(x, pattern, ignore.case = TRUE) +} + +#' @rdname like +#' @export +"%like_case%" <- function(x, pattern) { + like(x, pattern, ignore.case = FALSE) +} + +#' @rdname like +#' @export +"%unlike_case%" <- function(x, pattern) { + !like(x, pattern, ignore.case = FALSE) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mdro.R + + + + +#' Determine Multidrug-Resistant Organisms (MDRO) +#' +#' Determine which isolates are multidrug-resistant organisms (MDRO) according to international, national and custom guidelines. +#' @param x a [data.frame] with antibiotics columns, like `AMX` or `amox`. Can be left blank for automatic determination. +#' @param guideline a specific guideline to follow, see sections *Supported international / national guidelines* and *Using Custom Guidelines* below. When left empty, the publication by Magiorakos *et al.* (see below) will be followed. +#' @param ... in case of [custom_mdro_guideline()]: a set of rules, see section *Using Custom Guidelines* below. Otherwise: column name of an antibiotic, see section *Antibiotics* below. +#' @param as_factor a [logical] to indicate whether the returned value should be an ordered [factor] (`TRUE`, default), or otherwise a [character] vector +#' @inheritParams eucast_rules +#' @param pct_required_classes minimal required percentage of antimicrobial classes that must be available per isolate, rounded down. For example, with the default guideline, 17 antimicrobial classes must be available for *S. aureus*. Setting this `pct_required_classes` argument to `0.5` (default) means that for every *S. aureus* isolate at least 8 different classes must be available. Any lower number of available classes will return `NA` for that isolate. +#' @param combine_SI a [logical] to indicate whether all values of S and I must be merged into one, so resistance is only considered when isolates are R, not I. As this is the default behaviour of the [mdro()] function, it follows the redefinition by EUCAST about the interpretation of I (increased exposure) in 2019, see section 'Interpretation of S, I and R' below. When using `combine_SI = FALSE`, resistance is considered when isolates are R or I. +#' @param verbose a [logical] to turn Verbose mode on and off (default is off). In Verbose mode, the function does not return the MDRO results, but instead returns a data set in logbook form with extensive info about which isolates would be MDRO-positive, or why they are not. +#' @inheritSection eucast_rules Antibiotics +#' @details +#' These functions are context-aware. This means that the `x` argument can be left blank if used inside a [data.frame] call, see *Examples*. +#' +#' For the `pct_required_classes` argument, values above 1 will be divided by 100. This is to support both fractions (`0.75` or `3/4`) and percentages (`75`). +#' +#' **Note:** Every test that involves the Enterobacteriaceae family, will internally be performed using its newly named *order* Enterobacterales, since the Enterobacteriaceae family has been taxonomically reclassified by Adeolu *et al.* in 2016. Before that, Enterobacteriaceae was the only family under the Enterobacteriales (with an i) order. All species under the old Enterobacteriaceae family are still under the new Enterobacterales (without an i) order, but divided into multiple families. The way tests are performed now by this [mdro()] function makes sure that results from before 2016 and after 2016 are identical. +#' +#' @section Supported International / National Guidelines: +#' +#' Currently supported guidelines are (case-insensitive): +#' +#' * `guideline = "CMI2012"` (default) +#' +#' Magiorakos AP, Srinivasan A *et al.* "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." Clinical Microbiology and Infection (2012) (\doi{10.1111/j.1469-0691.2011.03570.x}) +#' +#' * `guideline = "EUCAST3.3"` (or simply `guideline = "EUCAST"`) +#' +#' The European international guideline - EUCAST Expert Rules Version 3.3 "Intrinsic Resistance and Unusual Phenotypes" ([link](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2021/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.3_20211018.pdf)) +#' +#' * `guideline = "EUCAST3.2"` +#' +#' The European international guideline - EUCAST Expert Rules Version 3.2 "Intrinsic Resistance and Unusual Phenotypes" ([link](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf)) +#' +#' * `guideline = "EUCAST3.1"` +#' +#' The European international guideline - EUCAST Expert Rules Version 3.1 "Intrinsic Resistance and Exceptional Phenotypes Tables" ([link](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf)) +#' +#' * `guideline = "TB"` +#' +#' The international guideline for multi-drug resistant tuberculosis - World Health Organization "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis" ([link](https://www.who.int/publications/i/item/9789241548809)) +#' +#' * `guideline = "MRGN"` +#' +#' The German national guideline - Mueller et al. (2015) Antimicrobial Resistance and Infection Control 4:7; \doi{10.1186/s13756-015-0047-6} +#' +#' * `guideline = "BRMO"` +#' +#' The Dutch national guideline - Rijksinstituut voor Volksgezondheid en Milieu "WIP-richtlijn BRMO (Bijzonder Resistente Micro-Organismen) (ZKH)" ([link](https://www.rivm.nl/wip-richtlijn-brmo-bijzonder-resistente-micro-organismen-zkh)) +#' +#' Please suggest your own (country-specific) guidelines by letting us know: . +#' +#' @section Using Custom Guidelines: +#' +#' Custom guidelines can be set with the [custom_mdro_guideline()] function. This is of great importance if you have custom rules to determine MDROs in your hospital, e.g., rules that are dependent on ward, state of contact isolation or other variables in your data. +#' +#' If you are familiar with the [`case_when()`][dplyr::case_when()] function of the `dplyr` package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule is written *before* the tilde (`~`) and the consequence of the rule is written *after* the tilde: +#' +#' ``` +#' custom <- custom_mdro_guideline(CIP == "R" & age > 60 ~ "Elderly Type A", +#' ERY == "R" & age > 60 ~ "Elderly Type B") +#' ``` +#' +#' If a row/an isolate matches the first rule, the value after the first `~` (in this case *'Elderly Type A'*) will be set as MDRO value. Otherwise, the second rule will be tried and so on. The number of rules is unlimited. +#' +#' You can print the rules set in the console for an overview. Colours will help reading it if your console supports colours. +#' +#' ``` +#' custom +#' #> A set of custom MDRO rules: +#' #> 1. CIP is "R" and age is higher than 60 -> Elderly Type A +#' #> 2. ERY is "R" and age is higher than 60 -> Elderly Type B +#' #> 3. Otherwise -> Negative +#' #> +#' #> Unmatched rows will return NA. +#' ``` +#' +#' The outcome of the function can be used for the `guideline` argument in the [mdro()] function: +#' +#' ``` +#' x <- mdro(example_isolates, +#' guideline = custom) +#' table(x) +#' #> Negative Elderly Type A Elderly Type B +#' #> 1070 198 732 +#' ``` +#' +#' Rules can also be combined with other custom rules by using [c()]: +#' +#' ``` +#' x <- mdro(example_isolates, +#' guideline = c(custom, +#' custom_mdro_guideline(ERY == "R" & age > 50 ~ "Elderly Type C"))) +#' table(x) +#' #> Negative Elderly Type A Elderly Type B Elderly Type C +#' #> 961 198 732 109 +#' ``` +#' +#' The rules set (the `custom` object in this case) could be exported to a shared file location using [saveRDS()] if you collaborate with multiple users. The custom rules set could then be imported using [readRDS()]. +#' @inheritSection as.sir Interpretation of SIR +#' @return +#' - CMI 2012 paper - function [mdr_cmi2012()] or [mdro()]:\cr +#' Ordered [factor] with levels `Negative` < `Multi-drug-resistant (MDR)` < `Extensively drug-resistant (XDR)` < `Pandrug-resistant (PDR)` +#' - TB guideline - function [mdr_tb()] or [`mdro(..., guideline = "TB")`][mdro()]:\cr +#' Ordered [factor] with levels `Negative` < `Mono-resistant` < `Poly-resistant` < `Multi-drug-resistant` < `Extensively drug-resistant` +#' - German guideline - function [mrgn()] or [`mdro(..., guideline = "MRGN")`][mdro()]:\cr +#' Ordered [factor] with levels `Negative` < `3MRGN` < `4MRGN` +#' - Everything else, except for custom guidelines:\cr +#' Ordered [factor] with levels `Negative` < `Positive, unconfirmed` < `Positive`. The value `"Positive, unconfirmed"` means that, according to the guideline, it is not entirely sure if the isolate is multi-drug resistant and this should be confirmed with additional (e.g. molecular) tests +#' @rdname mdro +#' @aliases MDR XDR PDR BRMO 3MRGN 4MRGN +#' @export +#' @source +#' See the supported guidelines above for the [list] of publications used for this function. +#' @examples +#' out <- mdro(example_isolates, guideline = "EUCAST") +#' str(out) +#' table(out) +#' +#' out <- mdro(example_isolates, +#' guideline = custom_mdro_guideline( +#' AMX == "R" ~ "Custom MDRO 1", +#' VAN == "R" ~ "Custom MDRO 2" +#' ) +#' ) +#' table(out) +#' +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' mdro() %>% +#' table() +#' +#' # no need to define `x` when used inside dplyr verbs: +#' example_isolates %>% +#' mutate(MDRO = mdro()) %>% +#' pull(MDRO) %>% +#' table() +#' } +#' } +mdro <- function(x = NULL, + guideline = "CMI2012", + col_mo = NULL, + info = interactive(), + pct_required_classes = 0.5, + combine_SI = TRUE, + verbose = FALSE, + only_sir_columns = FALSE, + ...) { + if (is_null_or_grouped_tbl(x)) { + # when `x` is left blank, auto determine it (get_current_data() searches underlying data within call) + # is also a fix for using a grouped df as input (i.e., a dot as first argument) + x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x) + } + meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 + meet_criteria(guideline, allow_class = c("list", "character"), allow_NULL = TRUE) + if (!is.list(guideline)) { + meet_criteria(guideline, allow_class = "character", has_length = 1, allow_NULL = TRUE) + } + meet_criteria(col_mo, allow_class = "character", has_length = 1, is_in = colnames(x), allow_NULL = TRUE) + meet_criteria(info, allow_class = "logical", has_length = 1) + meet_criteria(pct_required_classes, allow_class = "numeric", has_length = 1) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(verbose, allow_class = "logical", has_length = 1) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + + if (!any(is_sir_eligible(x))) { + stop_("There were no possible SIR columns found in the data set. Transform columns with `as.sir()` for valid antimicrobial interpretations.") + } + + info.bak <- info + # don't throw info's more than once per call + if (isTRUE(info)) { + info <- message_not_thrown_before("mdro") + } + + if (interactive() && isTRUE(verbose) && isTRUE(info)) { + txt <- paste0( + "WARNING: In Verbose mode, the mdro() function does not return the MDRO results, but instead returns a data set in logbook form with extensive info about which isolates would be MDRO-positive, or why they are not.", + "\n\nThis may overwrite your existing data if you use e.g.:", + "\ndata <- mdro(data, verbose = TRUE)\n\nDo you want to continue?" + ) + showQuestion <- import_fn("showQuestion", "rstudioapi", error_on_fail = FALSE) + if (!is.null(showQuestion)) { + q_continue <- showQuestion("Using verbose = TRUE with mdro()", txt) + } else { + q_continue <- utils::menu(choices = c("OK", "Cancel"), graphics = FALSE, title = txt) + } + if (q_continue %in% c(FALSE, 2)) { + message_("Cancelled, returning original data", add_fn = font_red, as_note = FALSE) + return(x) + } + } + + group_msg <- "" + if (isTRUE(info.bak)) { + # print group name if used in dplyr::group_by() + cur_group <- import_fn("cur_group", "dplyr", error_on_fail = FALSE) + if (!is.null(cur_group)) { + group_df <- tryCatch(cur_group(), error = function(e) data.frame()) + if (NCOL(group_df) > 0) { + # transform factors to characters + group <- vapply(FUN.VALUE = character(1), group_df, function(x) { + if (is.numeric(x)) { + format(x) + } else if (is.logical(x)) { + as.character(x) + } else { + paste0('"', x, '"') + } + }) + group_msg <- paste0("\nGroup: ", paste0(names(group), " = ", group, collapse = ", "), "\n") + } + } + } + + # force regular [data.frame], not a tibble or data.table + x <- as.data.frame(x, stringsAsFactors = FALSE) + + if (pct_required_classes > 1) { + # allow pct_required_classes = 75 -> pct_required_classes = 0.75 + pct_required_classes <- pct_required_classes / 100 + } + + guideline.bak <- guideline + if (is.list(guideline)) { + # Custom MDRO guideline --------------------------------------------------- + stop_ifnot(inherits(guideline, "custom_mdro_guideline"), "use `custom_mdro_guideline()` to create custom guidelines") + if (isTRUE(info)) { + txt <- paste0( + "Determining MDROs based on custom rules", + ifelse(isTRUE(attributes(guideline)$as_factor), + paste0(", resulting in factor levels: ", paste0(attributes(guideline)$values, collapse = " < ")), + "" + ), + "." + ) + txt <- word_wrap(txt) + cat(txt, "\n", sep = "") + } + x <- run_custom_mdro_guideline(df = x, guideline = guideline, info = info) + if (isTRUE(info.bak)) { + cat(group_msg) + if (sum(!is.na(x$MDRO)) == 0) { + cat(word_wrap(font_bold(paste0("=> Found 0 MDROs since no isolates are covered by the custom guideline")))) + } else { + cat(word_wrap(font_bold(paste0( + "=> Found ", sum(x$MDRO != "Negative", na.rm = TRUE), + " custom defined MDROs out of ", sum(!is.na(x$MDRO)), + " isolates (", + trimws(percentage(sum(x$MDRO != "Negative", na.rm = TRUE) / sum(!is.na(x$MDRO)))), + ")\n" + )))) + } + } + if (isTRUE(verbose)) { + return(x[, c( + "row_number", + "MDRO", + "reason", + "columns_nonsusceptible" + )]) + } else { + return(x$MDRO) + } + } + guideline <- tolower(gsub("[^a-zA-Z0-9.]+", "", guideline)) + if (is.null(guideline)) { + # default to the paper by Magiorakos et al. (2012) + guideline <- "cmi2012" + } + if (guideline == "eucast") { + # turn into latest EUCAST guideline + guideline <- "eucast3.3" + } + if (guideline == "nl") { + guideline <- "brmo" + } + if (guideline == "de") { + guideline <- "mrgn" + } + stop_ifnot( + guideline %in% c("brmo", "mrgn", "eucast3.1", "eucast3.2", "eucast3.3", "tb", "cmi2012"), + "invalid guideline: ", guideline.bak + ) + guideline <- list(code = guideline) + + # try to find columns based on type + # -- mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = info) + } + if (is.null(col_mo) && guideline$code == "tb") { + message_( + "No column found as input for `col_mo`, ", + font_bold(paste0("assuming all rows contain ", font_italic("Mycobacterium tuberculosis"), ".")) + ) + x$mo <- as.mo("Mycobacterium tuberculosis", keep_synonyms = TRUE) + col_mo <- "mo" + } + stop_if(is.null(col_mo), "`col_mo` must be set") + + if (guideline$code == "cmi2012") { + guideline$name <- "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." + guideline$author <- "Magiorakos AP, Srinivasan A, Carey RB, ..., Vatopoulos A, Weber JT, Monnet DL" + guideline$version <- NA + guideline$source_url <- paste0("Clinical Microbiology and Infection 18:3, 2012; ", font_url("https://doi.org/10.1111/j.1469-0691.2011.03570.x", "doi: 10.1111/j.1469-0691.2011.03570.x")) + guideline$type <- "MDRs/XDRs/PDRs" + } else if (guideline$code == "eucast3.1") { + guideline$name <- "EUCAST Expert Rules, \"Intrinsic Resistance and Exceptional Phenotypes Tables\"" + guideline$author <- "EUCAST (European Committee on Antimicrobial Susceptibility Testing)" + guideline$version <- "3.1, 2016" + guideline$source_url <- font_url("https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf", "Direct download") + guideline$type <- "EUCAST Exceptional Phenotypes" + } else if (guideline$code == "eucast3.2") { + guideline$name <- "EUCAST Expert Rules, \"Intrinsic Resistance and Unusual Phenotypes\"" + guideline$author <- "EUCAST (European Committee on Antimicrobial Susceptibility Testing)" + guideline$version <- "3.2, February 2020" + guideline$source_url <- font_url("https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf", "Direct download") + guideline$type <- "EUCAST Unusual Phenotypes" + } else if (guideline$code == "eucast3.3") { + guideline$name <- "EUCAST Expert Rules, \"Intrinsic Resistance and Unusual Phenotypes\"" + guideline$author <- "EUCAST (European Committee on Antimicrobial Susceptibility Testing)" + guideline$version <- "3.3, October 2021" + guideline$source_url <- font_url("https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2021/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.3_20211018.pdf", "Direct download") + guideline$type <- "EUCAST Unusual Phenotypes" + } else if (guideline$code == "tb") { + guideline$name <- "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis" + guideline$author <- "WHO (World Health Organization)" + guideline$version <- "WHO/HTM/TB/2014.11, 2014" + guideline$source_url <- font_url("https://www.who.int/publications/i/item/9789241548809", "Direct download") + guideline$type <- "MDR-TB's" + + # support per country: + } else if (guideline$code == "mrgn") { + guideline$name <- "Cross-border comparison of the Dutch and German guidelines on multidrug-resistant Gram-negative microorganisms" + guideline$author <- "M\u00fcller J, Voss A, K\u00f6ck R, ..., Kern WV, Wendt C, Friedrich AW" + guideline$version <- NA + guideline$source_url <- paste0("Antimicrobial Resistance and Infection Control 4:7, 2015; ", font_url("https://doi.org/10.1186/s13756-015-0047-6", "doi: 10.1186/s13756-015-0047-6")) + guideline$type <- "MRGNs" + } else if (guideline$code == "brmo") { + guideline$name <- "WIP-Richtlijn Bijzonder Resistente Micro-organismen (BRMO)" + guideline$author <- "RIVM (Rijksinstituut voor de Volksgezondheid)" + guideline$version <- "Revision as of December 2017" + guideline$source_url <- font_url("https://www.rivm.nl/Documenten_en_publicaties/Professioneel_Praktisch/Richtlijnen/Infectieziekten/WIP_Richtlijnen/WIP_Richtlijnen/Ziekenhuizen/WIP_richtlijn_BRMO_Bijzonder_Resistente_Micro_Organismen_ZKH", "Direct download") + guideline$type <- "BRMOs" + } else { + stop("This guideline is currently unsupported: ", guideline$code, call. = FALSE) + } + + if (guideline$code == "cmi2012") { + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c( + # [table] 1 (S aureus) + "GEN", "RIF", "CPT", "OXA", "CIP", "MFX", "SXT", "FUS", "VAN", "TEC", "TLV", "TGC", "CLI", "DAP", "ERY", "LNZ", "CHL", "FOS", "QDA", "TCY", "DOX", "MNO", + # [table] 2 (Enterococcus) + "GEH", "STH", "IPM", "MEM", "DOR", "CIP", "LVX", "MFX", "VAN", "TEC", "TGC", "DAP", "LNZ", "AMP", "QDA", "DOX", "MNO", + # [table] 3 (Enterobacteriaceae) + "GEN", "TOB", "AMK", "NET", "CPT", "TCC", "TZP", "ETP", "IPM", "MEM", "DOR", "CZO", "CXM", "CTX", "CAZ", "FEP", "FOX", "CTT", "CIP", "SXT", "TGC", "ATM", "AMP", "AMC", "SAM", "CHL", "FOS", "COL", "TCY", "DOX", "MNO", + # [table] 4 (Pseudomonas) + "GEN", "TOB", "AMK", "NET", "IPM", "MEM", "DOR", "CAZ", "FEP", "CIP", "LVX", "TCC", "TZP", "ATM", "FOS", "COL", "PLB", + # [table] 5 (Acinetobacter) + "GEN", "TOB", "AMK", "NET", "IPM", "MEM", "DOR", "CIP", "LVX", "TZP", "TCC", "CTX", "CRO", "CAZ", "FEP", "SXT", "SAM", "COL", "PLB", "TCY", "DOX", "MNO" + ), + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } else if (guideline$code == "eucast3.2") { + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c("AMP", "AMX", "CIP", "DAL", "DAP", "ERV", "FDX", "GEN", "LNZ", "MEM", "MTR", "OMC", "ORI", "PEN", "QDA", "RIF", "TEC", "TGC", "TLV", "TOB", "TZD", "VAN"), + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } else if (guideline$code == "eucast3.3") { + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c("AMP", "AMX", "CIP", "DAL", "DAP", "ERV", "FDX", "GEN", "LNZ", "MEM", "MTR", "OMC", "ORI", "PEN", "QDA", "RIF", "TEC", "TGC", "TLV", "TOB", "TZD", "VAN"), + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } else if (guideline$code == "tb") { + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c("CAP", "ETH", "GAT", "INH", "PZA", "RIF", "RIB", "RFP"), + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } else if (guideline$code == "mrgn") { + cols_ab <- get_column_abx( + x = x, + soft_dependencies = c("PIP", "CTX", "CAZ", "IPM", "MEM", "CIP"), + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } else { + cols_ab <- get_column_abx( + x = x, + verbose = verbose, + info = info, + only_sir_columns = only_sir_columns, + fn = "mdro", + ... + ) + } + if (!"AMP" %in% names(cols_ab) && "AMX" %in% names(cols_ab)) { + # ampicillin column is missing, but amoxicillin is available + if (isTRUE(info)) { + message_("Using column '", cols_ab[names(cols_ab) == "AMX"], "' as input for ampicillin since many EUCAST rules depend on it.") + } + cols_ab <- c(cols_ab, c(AMP = unname(cols_ab[names(cols_ab) == "AMX"]))) + } + + # nolint start + AMC <- cols_ab["AMC"] + AMK <- cols_ab["AMK"] + AMP <- cols_ab["AMP"] + AMX <- cols_ab["AMX"] + ATM <- cols_ab["ATM"] + AZL <- cols_ab["AZL"] + AZM <- cols_ab["AZM"] + BPR <- cols_ab["BPR"] + CAC <- cols_ab["CAC"] + CAT <- cols_ab["CAT"] + CAZ <- cols_ab["CAZ"] + CCV <- cols_ab["CCV"] + CDR <- cols_ab["CDR"] + CDZ <- cols_ab["CDZ"] + CEC <- cols_ab["CEC"] + CED <- cols_ab["CED"] + CEI <- cols_ab["CEI"] + CEP <- cols_ab["CEP"] + CFM <- cols_ab["CFM"] + CFM1 <- cols_ab["CFM1"] + CFP <- cols_ab["CFP"] + CFR <- cols_ab["CFR"] + CFS <- cols_ab["CFS"] + CHL <- cols_ab["CHL"] + CID <- cols_ab["CID"] + CIP <- cols_ab["CIP"] + CLI <- cols_ab["CLI"] + CLR <- cols_ab["CLR"] + CMX <- cols_ab["CMX"] + CMZ <- cols_ab["CMZ"] + CND <- cols_ab["CND"] + COL <- cols_ab["COL"] + CPD <- cols_ab["CPD"] + CPM <- cols_ab["CPM"] + CPO <- cols_ab["CPO"] + CPR <- cols_ab["CPR"] + CPT <- cols_ab["CPT"] + CRD <- cols_ab["CRD"] + CRO <- cols_ab["CRO"] + CSL <- cols_ab["CSL"] + CTB <- cols_ab["CTB"] + CTF <- cols_ab["CTF"] + CTL <- cols_ab["CTL"] + CTT <- cols_ab["CTT"] + CTX <- cols_ab["CTX"] + CTZ <- cols_ab["CTZ"] + CXM <- cols_ab["CXM"] + CZD <- cols_ab["CZD"] + CZO <- cols_ab["CZO"] + CZX <- cols_ab["CZX"] + DAL <- cols_ab["DAL"] + DAP <- cols_ab["DAP"] + DIT <- cols_ab["DIT"] + DIZ <- cols_ab["DIZ"] + DOR <- cols_ab["DOR"] + DOX <- cols_ab["DOX"] + ENX <- cols_ab["ENX"] + ERV <- cols_ab["ERV"] + ERY <- cols_ab["ERY"] + ETP <- cols_ab["ETP"] + FDX <- cols_ab["FDX"] + FEP <- cols_ab["FEP"] + FLC <- cols_ab["FLC"] + FLE <- cols_ab["FLE"] + FOS <- cols_ab["FOS"] + FOX <- cols_ab["FOX"] + FUS <- cols_ab["FUS"] + GAT <- cols_ab["GAT"] + GEH <- cols_ab["GEH"] + GEM <- cols_ab["GEM"] + GEN <- cols_ab["GEN"] + GRX <- cols_ab["GRX"] + HAP <- cols_ab["HAP"] + IPM <- cols_ab["IPM"] + KAN <- cols_ab["KAN"] + LEX <- cols_ab["LEX"] + LIN <- cols_ab["LIN"] + LNZ <- cols_ab["LNZ"] + LOM <- cols_ab["LOM"] + LOR <- cols_ab["LOR"] + LTM <- cols_ab["LTM"] + LVX <- cols_ab["LVX"] + MAN <- cols_ab["MAN"] + MEM <- cols_ab["MEM"] + MEV <- cols_ab["MEV"] + MEZ <- cols_ab["MEZ"] + MFX <- cols_ab["MFX"] + MNO <- cols_ab["MNO"] + MTR <- cols_ab["MTR"] + NAL <- cols_ab["NAL"] + NEO <- cols_ab["NEO"] + NET <- cols_ab["NET"] + NIT <- cols_ab["NIT"] + NOR <- cols_ab["NOR"] + NOV <- cols_ab["NOV"] + OFX <- cols_ab["OFX"] + OMC <- cols_ab["OMC"] + ORI <- cols_ab["ORI"] + OXA <- cols_ab["OXA"] + PAZ <- cols_ab["PAZ"] + PEF <- cols_ab["PEF"] + PEN <- cols_ab["PEN"] + PIP <- cols_ab["PIP"] + PLB <- cols_ab["PLB"] + PRI <- cols_ab["PRI"] + PRU <- cols_ab["PRU"] + QDA <- cols_ab["QDA"] + RFL <- cols_ab["RFL"] + RID <- cols_ab["RID"] + RIF <- cols_ab["RIF"] + RXT <- cols_ab["RXT"] + SAM <- cols_ab["SAM"] + SIS <- cols_ab["SIS"] + SPT <- cols_ab["SPT"] + SPX <- cols_ab["SPX"] + STH <- cols_ab["STH"] + SXT <- cols_ab["SXT"] + TCC <- cols_ab["TCC"] + TCY <- cols_ab["TCY"] + TEC <- cols_ab["TEC"] + TGC <- cols_ab["TGC"] + TIC <- cols_ab["TIC"] + TLV <- cols_ab["TLV"] + TMP <- cols_ab["TMP"] + TMX <- cols_ab["TMX"] + TOB <- cols_ab["TOB"] + TVA <- cols_ab["TVA"] + TZD <- cols_ab["TZD"] + TZP <- cols_ab["TZP"] + VAN <- cols_ab["VAN"] + # additional for TB + CAP <- cols_ab["CAP"] + ETH <- cols_ab["ETH"] + GAT <- cols_ab["GAT"] + INH <- cols_ab["INH"] + PZA <- cols_ab["PZA"] + RIF <- cols_ab["RIF"] + RIB <- cols_ab["RIB"] + RFP <- cols_ab["RFP"] + abx_tb <- c(CAP, ETH, GAT, INH, PZA, RIF, RIB, RFP) + abx_tb <- abx_tb[!is.na(abx_tb)] + stop_if(guideline$code == "tb" & length(abx_tb) == 0, "no antimycobacterials found in data set") + # nolint end + + if (isTRUE(combine_SI)) { + search_result <- "R" + } else { + search_result <- c("R", "I") + } + + if (isTRUE(info)) { + if (isTRUE(combine_SI)) { + cat(font_red("\nOnly results with 'R' are considered as resistance. Use `combine_SI = FALSE` to also consider 'I' as resistance.\n")) + } else { + cat(font_red("\nResults with 'R' or 'I' are considered as resistance. Use `combine_SI = TRUE` to only consider 'R' as resistance.\n")) + } + cat("\n", word_wrap("Determining multidrug-resistant organisms (MDRO), according to:"), "\n", + word_wrap(paste0(font_bold("Guideline: "), font_italic(guideline$name)), extra_indent = 11, as_note = FALSE), "\n", + word_wrap(paste0(font_bold("Author(s): "), guideline$author), extra_indent = 11, as_note = FALSE), "\n", + ifelse(!is.na(guideline$version), + paste0(word_wrap(paste0(font_bold("Version: "), guideline$version), extra_indent = 11, as_note = FALSE), "\n"), + "" + ), + paste0(font_bold("Source: "), guideline$source_url), + "\n\n", + sep = "" + ) + } + + ab_missing <- function(ab) { + isTRUE(ab %in% c(NULL, NA)) | length(ab) == 0 + } + ab_NA <- function(x) { + x[!is.na(x)] + } + try_ab <- function(expr) { + out <- tryCatch(expr, error = function(e) FALSE) + out[is.na(out)] <- FALSE + out + } + + # antibiotic classes + # nolint start + aminoglycosides <- c(TOB, GEN) + cephalosporins <- c(CDZ, CAC, CEC, CFR, RID, MAN, CTZ, CZD, CZO, CDR, DIT, FEP, CAT, CFM, CMX, CMZ, DIZ, CID, CFP, CSL, CND, CTX, CTT, CTF, FOX, CPM, CPO, CPD, CPR, CRD, CFS, CPT, CAZ, CCV, CTL, CTB, CZX, BPR, CFM1, CEI, CRO, CXM, LEX, CEP, HAP, CED, LTM, LOR) + cephalosporins_1st <- c(CAC, CFR, RID, CTZ, CZD, CZO, CRD, CTL, LEX, CEP, HAP, CED) + cephalosporins_2nd <- c(CEC, MAN, CMZ, CID, CND, CTT, CTF, FOX, CPR, CXM, LOR) + cephalosporins_3rd <- c(CDZ, CDR, DIT, CAT, CFM, CMX, DIZ, CFP, CSL, CTX, CPM, CPD, CFS, CAZ, CCV, CTB, CZX, CRO, LTM) + carbapenems <- c(DOR, ETP, IPM, MEM, MEV) + fluoroquinolones <- c(CIP, ENX, FLE, GAT, GEM, GRX, LVX, LOM, MFX, NOR, OFX, PAZ, PEF, PRU, RFL, SPX, TMX, TVA) + # nolint end + + # helper function for editing the table + trans_tbl <- function(to, rows, cols, any_all) { + cols <- cols[!ab_missing(cols)] + cols <- cols[!is.na(cols)] + if (length(rows) > 0 && length(cols) > 0) { + x[, cols] <- as.data.frame( + lapply( + x[, cols, drop = FALSE], + function(col) as.sir(col) + ), + stringsAsFactors = FALSE + ) + x[rows, "columns_nonsusceptible"] <<- vapply( + FUN.VALUE = character(1), + rows, + function(row, group_vct = cols) { + cols_nonsus <- vapply( + FUN.VALUE = logical(1), + x[row, group_vct, drop = FALSE], + function(y) y %in% search_result + ) + paste( + sort(c( + unlist(strsplit(x[row, "columns_nonsusceptible", drop = TRUE], ", ", fixed = TRUE)), + names(cols_nonsus)[cols_nonsus] + )), + collapse = ", " + ) + } + ) + + if (any_all == "any") { + search_function <- any + } else if (any_all == "all") { + search_function <- all + } + x_transposed <- as.list(as.data.frame(t(x[, cols, drop = FALSE]), + stringsAsFactors = FALSE + )) + rows_affected <- vapply( + FUN.VALUE = logical(1), + x_transposed, + function(y) search_function(y %in% search_result, na.rm = TRUE) + ) + rows_affected <- x[which(rows_affected), "row_number", drop = TRUE] + rows_to_change <- rows[rows %in% rows_affected] + x[rows_to_change, "MDRO"] <<- to + x[rows_to_change, "reason"] <<- paste0( + any_all, + " of the required antibiotics ", + ifelse(any_all == "any", "is", "are"), + " R", + ifelse(!isTRUE(combine_SI), " or I", "") + ) + } + } + + trans_tbl2 <- function(txt, rows, lst) { + if (isTRUE(info)) { + message_(txt, "...", appendLF = FALSE, as_note = FALSE) + } + if (length(rows) > 0) { + # function specific for the CMI paper of 2012 (Magiorakos et al.) + lst_vector <- unlist(lst)[!is.na(unlist(lst))] + # keep only unique ones: + lst_vector <- lst_vector[!duplicated(paste(lst_vector, names(lst_vector)))] + + x[, lst_vector] <- as.data.frame( + lapply( + x[, lst_vector, drop = FALSE], + function(col) as.sir(col) + ), + stringsAsFactors = FALSE + ) + x[rows, "classes_in_guideline"] <<- length(lst) + x[rows, "classes_available"] <<- vapply( + FUN.VALUE = double(1), + rows, + function(row, group_tbl = lst) { + sum(vapply( + FUN.VALUE = logical(1), + group_tbl, + function(group) any(unlist(x[row, group[!is.na(group)], drop = TRUE]) %in% c("S", "SDD", "I", "R")) + )) + } + ) + + if (isTRUE(verbose)) { + x[rows, "columns_nonsusceptible"] <<- vapply( + FUN.VALUE = character(1), + rows, + function(row, group_vct = lst_vector) { + cols_nonsus <- vapply(FUN.VALUE = logical(1), x[row, group_vct, drop = FALSE], function(y) y %in% search_result) + paste(sort(names(cols_nonsus)[cols_nonsus]), collapse = ", ") + } + ) + } + x[rows, "classes_affected"] <<- vapply( + FUN.VALUE = double(1), + rows, + function(row, group_tbl = lst) { + sum( + vapply( + FUN.VALUE = logical(1), + group_tbl, + function(group) { + any(unlist(x[row, group[!is.na(group)], drop = TRUE]) %in% search_result, na.rm = TRUE) + } + ), + na.rm = TRUE + ) + } + ) + # for PDR; all drugs are R (or I if combine_SI = FALSE) + x_transposed <- as.list(as.data.frame(t(x[rows, lst_vector, drop = FALSE]), + stringsAsFactors = FALSE + )) + row_filter <- vapply(FUN.VALUE = logical(1), x_transposed, function(y) all(y %in% search_result, na.rm = TRUE)) + x[which(row_filter), "classes_affected"] <<- 999 + } + + if (isTRUE(info)) { + message_(" OK.", add_fn = list(font_green, font_bold), as_note = FALSE) + } + } + + x[, col_mo] <- as.mo(as.character(x[, col_mo, drop = TRUE])) + # rename col_mo to prevent interference with joined columns + colnames(x)[colnames(x) == col_mo] <- ".col_mo" + col_mo <- ".col_mo" + # join to microorganisms data set + x <- left_join_microorganisms(x, by = col_mo) + x$MDRO <- ifelse(!is.na(x$genus), 1, NA_integer_) + x$row_number <- seq_len(nrow(x)) + x$reason <- paste0("not covered by ", toupper(guideline$code), " guideline") + x$columns_nonsusceptible <- "" + + if (guideline$code == "cmi2012") { + # CMI, 2012 --------------------------------------------------------------- + # Non-susceptible = R and I + # (see header 'Approaches to Creating Definitions for MDR, XDR and PDR' in paper) + + # take amoxicillin if ampicillin is unavailable + if (is.na(AMP) && !is.na(AMX)) { + if (isTRUE(verbose)) { + message_("Filling ampicillin (AMP) results with amoxicillin (AMX) results") + } + AMP <- AMX + } + # take ceftriaxone if cefotaxime is unavailable and vice versa + if (is.na(CRO) && !is.na(CTX)) { + if (isTRUE(verbose)) { + message_("Filling ceftriaxone (CRO) results with cefotaxime (CTX) results") + } + CRO <- CTX + } + if (is.na(CTX) && !is.na(CRO)) { + if (isTRUE(verbose)) { + message_("Filling cefotaxime (CTX) results with ceftriaxone (CRO) results") + } + CTX <- CRO + } + + # intrinsic resistant must not be considered for the determination of MDR, + # so let's just remove them, meticulously following the paper + x[which(x$genus == "Enterococcus" & x$species == "faecium"), ab_NA(IPM)] <- NA + x[which(x$genus == "Enterococcus" & x$species == "faecalis"), ab_NA(QDA)] <- NA + x[which((x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii")), ab_NA(c(GEN, TOB, NET))] <- NA + x[which(x$genus == "Escherichia" & x$species == "hermannii"), ab_NA(c(TCC, TZP))] <- NA + x[which((x$genus == "Citrobacter" & x$species == "freundii") | + (x$genus == "Enterobacter" & x$species == "aerogenes") | + (x$genus == "Klebsiella" & x$species == "aerogenes") # new name (2017) + | (x$genus == "Enterobacter" & x$species == "cloacae") | + (x$genus == "Hafnia" & x$species == "alvei") | + (x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(CZO)] <- NA + x[which((x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(CXM)] <- NA + x[which((x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "mirabilis") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii")), ab_NA(TGC)] <- NA + x[which((x$genus == "Citrobacter" & x$species == "koseri") | + (x$genus == "Citrobacter" & x$species == "freundii") | + (x$genus == "Enterobacter" & x$species == "aerogenes") | + (x$genus == "Klebsiella" & x$species == "aerogenes") # new name (2017) + | (x$genus == "Enterobacter" & x$species == "cloacae") | + (x$genus == "Escherichia" & x$species == "hermannii") | + (x$genus == "Hafnia" & x$species == "alvei") | + (x$genus == "Klebsiella") | + (x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(AMP)] <- NA + x[which((x$genus == "Citrobacter" & x$species == "freundii") | + (x$genus == "Enterobacter" & x$species == "aerogenes") | + (x$genus == "Klebsiella" & x$species == "aerogenes") # new name (2017) + | (x$genus == "Enterobacter" & x$species == "cloacae") | + (x$genus == "Hafnia" & x$species == "alvei") | + (x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(AMC)] <- NA + x[which((x$genus == "Citrobacter" & x$species == "freundii") | + (x$genus == "Citrobacter" & x$species == "koseri") | + (x$genus == "Enterobacter" & x$species == "aerogenes") | + (x$genus == "Klebsiella" & x$species == "aerogenes") # new name (2017) + | (x$genus == "Enterobacter" & x$species == "cloacae") | + (x$genus == "Hafnia" & x$species == "alvei") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(SAM)] <- NA + x[which((x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "mirabilis") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii") | + (x$genus == "Serratia" & x$species == "marcescens")), ab_NA(COL)] <- NA + x[which((x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "mirabilis") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii")), ab_NA(TCY)] <- NA + x[which((x$genus == "Morganella" & x$species == "morganii") | + (x$genus == "Proteus" & x$species == "penneri") | + (x$genus == "Proteus" & x$species == "vulgaris") | + (x$genus == "Providencia" & x$species == "rettgeri") | + (x$genus == "Providencia" & x$species == "stuartii")), ab_NA(c(DOX, MNO))] <- NA + + x$classes_in_guideline <- NA_integer_ + x$classes_available <- NA_integer_ + x$classes_affected <- NA_integer_ + + # now add the MDR levels to the data + trans_tbl( + 2, + which(x$genus == "Staphylococcus" & x$species == "aureus"), + c(OXA, FOX), + "any" + ) + trans_tbl2( + paste("Table 1 -", font_italic("Staphylococcus aureus")), + which(x$genus == "Staphylococcus" & x$species == "aureus"), + list( + GEN, + RIF, + CPT, + c(OXA, FOX), + c(CIP, MFX), + SXT, + FUS, + c(VAN, TEC, TLV), + TGC, + CLI, + DAP, + ERY, + LNZ, + CHL, + FOS, + QDA, + c(TCY, DOX, MNO) + ) + ) + trans_tbl2( + paste("Table 2 -", font_italic("Enterococcus"), "spp."), + which(x$genus == "Enterococcus"), + list( + GEH, + STH, + c(IPM, MEM, DOR), + c(CIP, LVX, MFX), + c(VAN, TEC), + TGC, + DAP, + LNZ, + AMP, + QDA, + c(DOX, MNO) + ) + ) + trans_tbl2( + paste0("Table 3 - ", font_italic("Enterobacteriaceae")), + # this new order was previously 'Enterobacteriales' and contained only the family 'Enterobacteriaceae': + which(x$order == "Enterobacterales"), + list( + c(GEN, TOB, AMK, NET), + CPT, + c(TCC, TZP), + c(ETP, IPM, MEM, DOR), + CZO, + CXM, + c(CTX, CAZ, FEP), + c(FOX, CTT), + CIP, + SXT, + TGC, + ATM, + AMP, + c(AMC, SAM), + CHL, + FOS, + COL, + c(TCY, DOX, MNO) + ) + ) + trans_tbl2( + paste("Table 4 -", font_italic("Pseudomonas aeruginosa")), + which(x$genus == "Pseudomonas" & x$species == "aeruginosa"), + list( + c(GEN, TOB, AMK, NET), + c(IPM, MEM, DOR), + c(CAZ, FEP), + c(CIP, LVX), + c(TCC, TZP), + ATM, + FOS, + c(COL, PLB) + ) + ) + trans_tbl2( + paste("Table 5 -", font_italic("Acinetobacter"), "spp."), + which(x$genus == "Acinetobacter"), + list( + c(GEN, TOB, AMK, NET), + c(IPM, MEM, DOR), + c(CIP, LVX), + c(TZP, TCC), + c(CTX, CRO, CAZ, FEP), + SXT, + SAM, + c(COL, PLB), + c(TCY, DOX, MNO) + ) + ) + + # now set MDROs: + # MDR (=2): >=3 classes affected + x[which(x$classes_affected >= 3), "MDRO"] <- 2 + if (isTRUE(verbose)) { + x[which(x$classes_affected >= 3), "reason"] <- paste0( + "at least 3 classes contain R", + ifelse(!isTRUE(combine_SI), " or I", ""), ": ", + x$classes_affected[which(x$classes_affected >= 3)], + " out of ", x$classes_available[which(x$classes_affected >= 3)], " available classes" + ) + } + + # XDR (=3): all but <=2 classes affected + x[which((x$classes_in_guideline - x$classes_affected) <= 2), "MDRO"] <- 3 + if (isTRUE(verbose)) { + x[which(x$MDRO == 3), "reason"] <- paste0( + "less than 3 classes remain susceptible (", x$classes_in_guideline[which((x$classes_in_guideline - x$classes_affected) <= 2)] - x$classes_affected[which(x$MDRO == 3)], + " out of ", x$classes_in_guideline[which(x$MDRO == 3)], " classes)" + ) + } + + # PDR (=4): all drugs are R + x[which(x$classes_affected == 999 & x$classes_in_guideline == x$classes_available), "MDRO"] <- 4 + if (isTRUE(verbose)) { + x[which(x$MDRO == 4), "reason"] <- paste( + "all antibiotics in all", + x$classes_in_guideline[which(x$MDRO == 4)], + "classes were tested R", + ifelse(!isTRUE(combine_SI), " or I", "") + ) + } + + # not enough classes available + x[which(x$MDRO %in% c(1, 3) & x$classes_available < floor(x$classes_in_guideline * pct_required_classes)), "MDRO"] <- -1 + if (isTRUE(verbose)) { + x[which(x$MDRO == -1), "reason"] <- paste0( + "not enough classes available: ", x$classes_available[which(x$MDRO == -1)], + " of required ", (floor(x$classes_in_guideline * pct_required_classes))[which(x$MDRO == -1)], + " (~", percentage(pct_required_classes), " of ", x$classes_in_guideline[which(x$MDRO == -1)], ")" + ) + } + + # add antibiotic names of resistant ones to verbose output + } + + if (guideline$code == "eucast3.1") { + # EUCAST 3.1 -------------------------------------------------------------- + # Table 5 + trans_tbl( + 3, + which(x$order == "Enterobacterales" | + (x$genus == "Pseudomonas" & x$species == "aeruginosa") | + x$genus == "Acinetobacter"), + COL, + "all" + ) + trans_tbl( + 3, + which(x$genus == "Salmonella" & x$species == "Typhi"), + c(carbapenems, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Haemophilus" & x$species == "influenzae"), + c(cephalosporins_3rd, carbapenems, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Moraxella" & x$species == "catarrhalis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "meningitidis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "gonorrhoeae"), + AZM, + "any" + ) + # Table 6 + trans_tbl( + 3, + which(x$fullname %like% "^(Coagulase-negative|Staphylococcus (aureus|epidermidis|hominis|haemolyticus|intermedius|pseudointermedius))"), + c(VAN, TEC, DAP, LNZ, QDA, TGC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Corynebacterium"), + c(VAN, TEC, DAP, LNZ, QDA, TGC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Streptococcus" & x$species == "pneumoniae"), + c(carbapenems, VAN, TEC, DAP, LNZ, QDA, TGC, RIF), + "any" + ) + trans_tbl( + 3, # Sr. groups A/B/C/G + which(x$fullname %like% "^Streptococcus (group (A|B|C|G)|pyogenes|agalactiae|equisimilis|equi|zooepidemicus|dysgalactiae|anginosus)"), + c(PEN, cephalosporins, VAN, TEC, DAP, LNZ, QDA, TGC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus"), + c(DAP, LNZ, TGC, TEC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus" & x$species == "faecalis"), + c(AMP, AMX), + "any" + ) + # Table 7 + trans_tbl( + 3, + which(x$genus == "Bacteroides"), + MTR, + "any" + ) + trans_tbl( + 3, + which(x$genus %in% c("Clostridium", "Clostridioides") & x$species == "difficile"), + c(MTR, VAN), + "any" + ) + } + + if (guideline$code == "eucast3.2") { + # EUCAST 3.2 -------------------------------------------------------------- + # Table 6 + trans_tbl( + 3, + which((x$order == "Enterobacterales" & + !x$family == "Morganellaceae" & + !(x$genus == "Serratia" & x$species == "marcescens")) | + (x$genus == "Pseudomonas" & x$species == "aeruginosa") | + x$genus == "Acinetobacter"), + COL, + "all" + ) + trans_tbl( + 3, + which(x$genus == "Salmonella" & x$species == "Typhi"), + c(carbapenems), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Haemophilus" & x$species == "influenzae"), + c(cephalosporins_3rd, carbapenems, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Moraxella" & x$species == "catarrhalis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "meningitidis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "gonorrhoeae"), + SPT, + "any" + ) + # Table 7 + trans_tbl( + 3, + which(x$genus == "Staphylococcus" & x$species == "aureus"), + c(VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$mo %in% MO_CONS), # coagulase-negative Staphylococcus + c(VAN, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Corynebacterium"), + c(VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Streptococcus" & x$species == "pneumoniae"), + c(carbapenems, VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC, RIF), + "any" + ) + trans_tbl( + 3, # Sr. groups A/B/C/G + which(x$mo %in% MO_STREP_ABCG), + c(PEN, cephalosporins, VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus"), + c(DAP, LNZ, TGC, ERV, OMC, TEC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus" & x$species == "faecalis"), + c(AMP, AMX), + "any" + ) + # Table 8 + trans_tbl( + 3, + which(x$genus == "Bacteroides"), + MTR, + "any" + ) + trans_tbl( + 3, + which(x$genus %in% c("Clostridium", "Clostridioides") & x$species == "difficile"), + c(MTR, VAN, FDX), + "any" + ) + } + + if (guideline$code == "eucast3.3") { + # EUCAST 3.3 -------------------------------------------------------------- + # note: this guideline is equal to EUCAST 3.2 - no MDRO insights changed + # Table 6 + trans_tbl( + 3, + which((x$order == "Enterobacterales" & + !x$family == "Morganellaceae" & + !(x$genus == "Serratia" & x$species == "marcescens")) | + (x$genus == "Pseudomonas" & x$species == "aeruginosa") | + x$genus == "Acinetobacter"), + COL, + "all" + ) + trans_tbl( + 3, + which(x$genus == "Salmonella" & x$species == "Typhi"), + c(carbapenems), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Haemophilus" & x$species == "influenzae"), + c(cephalosporins_3rd, carbapenems, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Moraxella" & x$species == "catarrhalis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "meningitidis"), + c(cephalosporins_3rd, fluoroquinolones), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Neisseria" & x$species == "gonorrhoeae"), + SPT, + "any" + ) + # Table 7 + trans_tbl( + 3, + which(x$genus == "Staphylococcus" & x$species == "aureus"), + c(VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$mo %in% MO_CONS), # coagulase-negative Staphylococcus + c(VAN, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Corynebacterium"), + c(VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Streptococcus" & x$species == "pneumoniae"), + c(carbapenems, VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC, RIF), + "any" + ) + trans_tbl( + 3, # Sr. groups A/B/C/G + which(x$mo %in% MO_STREP_ABCG), + c(PEN, cephalosporins, VAN, TEC, TLV, DAL, ORI, DAP, LNZ, TZD, QDA, TGC, ERV, OMC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus"), + c(DAP, LNZ, TGC, ERV, OMC, TEC), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus" & x$species == "faecalis"), + c(AMP, AMX), + "any" + ) + # Table 8 + trans_tbl( + 3, + which(x$genus == "Bacteroides"), + MTR, + "any" + ) + trans_tbl( + 3, + which(x$genus %in% c("Clostridium", "Clostridioides") & x$species == "difficile"), + c(MTR, VAN, FDX), + "any" + ) + } + + if (guideline$code == "mrgn") { + # Germany ----------------------------------------------------------------- + + # Table 1 + trans_tbl( + 2, # 3MRGN + which((x$order == "Enterobacterales" | # following in fact the old Enterobacteriaceae classification + (x$genus == "Acinetobacter" & x$species == "baumannii")) & + try_ab(x[, PIP, drop = TRUE] == "R") & + (try_ab(x[, CTX, drop = TRUE] == "R") | try_ab(x[, CAZ, drop = TRUE] == "R")) & + (try_ab(x[, IPM, drop = TRUE] != "R") | try_ab(x[, MEM, drop = TRUE] != "R")) & + try_ab(x[, CIP, drop = TRUE] == "R")), + c(PIP, CTX, CAZ, IPM, MEM, CIP), + "any" + ) + + trans_tbl( + 3, # 4MRGN, overwrites 3MRGN if applicable + which((x$order == "Enterobacterales" | # following in fact the old Enterobacteriaceae classification + (x$genus == "Acinetobacter" & x$species == "baumannii")) & + try_ab(x[, PIP, drop = TRUE] == "R") & + (try_ab(x[, CTX, drop = TRUE] == "R") | try_ab(x[, CAZ, drop = TRUE] == "R")) & + (try_ab(x[, IPM, drop = TRUE] == "R") | try_ab(x[, MEM, drop = TRUE] == "R")) & + try_ab(x[, CIP, drop = TRUE] == "R")), + c(PIP, CTX, CAZ, IPM, MEM, CIP), + "any" + ) + + trans_tbl( + 3, # 4MRGN, overwrites 3MRGN if applicable + which((x$order == "Enterobacterales" | # following in fact the old Enterobacteriaceae classification + (x$genus == "Acinetobacter" & x$species == "baumannii")) & + (try_ab(x[, IPM, drop = TRUE] == "R") | try_ab(x[, MEM, drop = TRUE] == "R"))), + c(IPM, MEM), + "any" + ) + + trans_tbl( + 2, # 3MRGN, if only 1 group is S + which(x$genus == "Pseudomonas" & x$species == "aeruginosa" & + try_ab(x[, PIP, drop = TRUE] == "S") + + try_ab(x[, CTX, drop = TRUE] == "S") + + try_ab(x[, CAZ, drop = TRUE] == "S") + + try_ab(x[, IPM, drop = TRUE] == "S") + + try_ab(x[, MEM, drop = TRUE] == "S") + + try_ab(x[, CIP, drop = TRUE] == "S") == 1), + c(PIP, CTX, CAZ, IPM, MEM, CIP), + "any" + ) + + trans_tbl( + 3, # 4MRGN otherwise + which((x$genus == "Pseudomonas" & x$species == "aeruginosa") & + try_ab(x[, PIP, drop = TRUE] == "R") & + (try_ab(x[, CTX, drop = TRUE] == "R") | try_ab(x[, CAZ, drop = TRUE] == "R")) & + (try_ab(x[, IPM, drop = TRUE] == "R") | try_ab(x[, MEM, drop = TRUE] == "R")) & + try_ab(x[, CIP, drop = TRUE] == "R")), + c(PIP, CTX, CAZ, IPM, MEM, CIP), + "any" + ) + + x[which(x$MDRO == 2), "reason"] <- "3MRGN" + x[which(x$MDRO == 3), "reason"] <- "4MRGN" + } + + if (guideline$code == "brmo") { + # Netherlands ------------------------------------------------------------- + aminoglycosides <- aminoglycosides[!is.na(aminoglycosides)] + fluoroquinolones <- fluoroquinolones[!is.na(fluoroquinolones)] + carbapenems <- carbapenems[!is.na(carbapenems)] + amino <- AMX %or% AMP + third <- CAZ %or% CTX + ESBLs <- c(amino, third) + ESBLs <- ESBLs[!is.na(ESBLs)] + if (length(ESBLs) != 2) { + ESBLs <- character(0) + } + + # Table 1 + trans_tbl( + 3, + which(x$order == "Enterobacterales"), # following in fact the old Enterobacteriaceae classification + c(aminoglycosides, fluoroquinolones), + "all" + ) + + trans_tbl( + 2, + which(x$order == "Enterobacterales"), # following in fact the old Enterobacteriaceae classification + carbapenems, + "any" + ) + + trans_tbl( + 2, + which(x$order == "Enterobacterales"), # following in fact the old Enterobacteriaceae classification + ESBLs, + "all" + ) + + # Table 2 + trans_tbl( + 2, + which(x$genus == "Acinetobacter"), + c(carbapenems), + "any" + ) + trans_tbl( + 3, + which(x$genus == "Acinetobacter"), + c(aminoglycosides, fluoroquinolones), + "all" + ) + + trans_tbl( + 3, + which(x$genus == "Stenotrophomonas" & x$species == "maltophilia"), + SXT, + "all" + ) + + if (!ab_missing(MEM) && !ab_missing(IPM) && + !ab_missing(GEN) && !ab_missing(TOB) && + !ab_missing(CIP) && + !ab_missing(CAZ) && + !ab_missing(TZP)) { + x$psae <- 0 + x[which(x[, MEM, drop = TRUE] == "R" | x[, IPM, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, MEM, drop = TRUE] == "R" | x[, IPM, drop = TRUE] == "R"), "psae"] + x[which(x[, GEN, drop = TRUE] == "R" & x[, TOB, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, GEN, drop = TRUE] == "R" & x[, TOB, drop = TRUE] == "R"), "psae"] + x[which(x[, CIP, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, CIP, drop = TRUE] == "R"), "psae"] + x[which(x[, CAZ, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, CAZ, drop = TRUE] == "R"), "psae"] + x[which(x[, TZP, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, TZP, drop = TRUE] == "R"), "psae"] + } else { + x$psae <- 0 + } + trans_tbl( + 3, + which(x$genus == "Pseudomonas" & x$species == "aeruginosa" & x$psae >= 3), + c(CAZ, CIP, GEN, IPM, MEM, TOB, TZP), + "any" + ) + x[which( + x$genus == "Pseudomonas" & x$species == "aeruginosa" & + x$psae >= 3 + ), "reason"] <- paste0("at least 3 classes contain R", ifelse(!isTRUE(combine_SI), " or I", "")) + + # Table 3 + trans_tbl( + 3, + which(x$genus == "Streptococcus" & x$species == "pneumoniae"), + PEN, + "all" + ) + trans_tbl( + 3, + which(x$genus == "Streptococcus" & x$species == "pneumoniae"), + VAN, + "all" + ) + trans_tbl( + 3, + which(x$genus == "Enterococcus" & x$species == "faecium"), + c(PEN, VAN), + "all" + ) + } + + if (guideline$code == "tb") { + # Tuberculosis ------------------------------------------------------------ + prepare_drug <- function(ab) { + # returns vector values of drug + # if `ab` is a column name, looks up the values in `x` + if (length(ab) == 1 && is.character(ab)) { + if (ab %in% colnames(x)) { + ab <- x[, ab, drop = TRUE] + } + } + ab <- as.character(as.sir(ab)) + ab[is.na(ab)] <- "" + ab + } + drug_is_R <- function(ab) { + # returns [logical] vector + ab <- prepare_drug(ab) + if (length(ab) == 0) { + rep(FALSE, NROW(x)) + } else if (length(ab) == 1) { + rep(ab, NROW(x)) == "R" + } else { + ab == "R" + } + } + drug_is_not_R <- function(ab) { + # returns [logical] vector + ab <- prepare_drug(ab) + if (length(ab) == 0) { + rep(TRUE, NROW(x)) + } else if (length(ab) == 1) { + rep(ab, NROW(x)) != "R" + } else { + ab != "R" + } + } + + x$mono_count <- 0 + x[drug_is_R(INH), "mono_count"] <- x[drug_is_R(INH), "mono_count", drop = TRUE] + 1 + x[drug_is_R(RIF), "mono_count"] <- x[drug_is_R(RIF), "mono_count", drop = TRUE] + 1 + x[drug_is_R(ETH), "mono_count"] <- x[drug_is_R(ETH), "mono_count", drop = TRUE] + 1 + x[drug_is_R(PZA), "mono_count"] <- x[drug_is_R(PZA), "mono_count", drop = TRUE] + 1 + x[drug_is_R(RIB), "mono_count"] <- x[drug_is_R(RIB), "mono_count", drop = TRUE] + 1 + x[drug_is_R(RFP), "mono_count"] <- x[drug_is_R(RFP), "mono_count", drop = TRUE] + 1 + + x$mono <- x$mono_count > 0 + x$poly <- x$mono_count > 1 & drug_is_not_R(RIF) & drug_is_not_R(INH) + x$mdr <- drug_is_R(RIF) & drug_is_R(INH) + x$xdr <- drug_is_R(LVX) | drug_is_R(MFX) | drug_is_R(GAT) + x$second <- drug_is_R(CAP) | drug_is_R(KAN) | drug_is_R(AMK) + x$xdr <- x$mdr & x$xdr & x$second + x$MDRO <- ifelse(x$xdr, 5, + ifelse(x$mdr, 4, + ifelse(x$poly, 3, + ifelse(x$mono, 2, + 1 + ) + ) + ) + ) + # keep all real TB, make other species NA + x$MDRO <- ifelse(x$fullname == "Mycobacterium tuberculosis", x$MDRO, NA_real_) + x$reason <- "PDR/MDR/XDR criteria were met" + } + + # some more info on negative results + if (isTRUE(verbose)) { + if (guideline$code == "cmi2012") { + x[which(x$MDRO == 1 & !is.na(x$classes_affected)), "reason"] <- paste0( + x$classes_affected[which(x$MDRO == 1 & !is.na(x$classes_affected))], + " of ", + x$classes_available[which(x$MDRO == 1 & !is.na(x$classes_affected))], + " available classes contain R", + ifelse(!isTRUE(combine_SI), " or I", ""), + " (3 required for MDR)" + ) + } else { + x[which(x$MDRO == 1), "reason"] <- "too few antibiotics are R" + } + } + + if (isTRUE(info.bak)) { + cat(group_msg) + if (sum(!is.na(x$MDRO)) == 0) { + cat(font_bold(paste0("=> Found 0 MDROs since no isolates are covered by the guideline"))) + } else { + cat(font_bold(paste0( + "=> Found ", sum(x$MDRO %in% 2:5, na.rm = TRUE), " ", guideline$type, " out of ", sum(!is.na(x$MDRO)), + " isolates (", trimws(percentage(sum(x$MDRO %in% 2:5, na.rm = TRUE) / sum(!is.na(x$MDRO)))), ")" + ))) + } + } + + # Fill in blanks ---- + # for rows that have no results + x_transposed <- as.list(as.data.frame(t(x[, cols_ab, drop = FALSE]), + stringsAsFactors = FALSE + )) + rows_empty <- which(vapply( + FUN.VALUE = logical(1), + x_transposed, + function(y) all(is.na(y)) + )) + if (length(rows_empty) > 0) { + if (isTRUE(info.bak)) { + cat(font_italic(paste0(" (", length(rows_empty), " isolates had no test results)\n"))) + } + x[rows_empty, "MDRO"] <- NA + x[rows_empty, "reason"] <- "none of the antibiotics have test results" + } else if (isTRUE(info.bak)) { + cat("\n") + } + + # Results ---- + if (guideline$code == "cmi2012") { + if (any(x$MDRO == -1, na.rm = TRUE)) { + if (message_not_thrown_before("mdro", "availability")) { + warning_( + "in `mdro()`: NA introduced for isolates where the available percentage of antimicrobial classes was below ", + percentage(pct_required_classes), " (set with `pct_required_classes`)" + ) + } + # set these -1s to NA + x[which(x$MDRO == -1), "MDRO"] <- NA_integer_ + } + x$MDRO <- factor( + x = x$MDRO, + levels = 1:4, + labels = c( + "Negative", "Multi-drug-resistant (MDR)", + "Extensively drug-resistant (XDR)", "Pandrug-resistant (PDR)" + ), + ordered = TRUE + ) + } else if (guideline$code == "tb") { + x$MDRO <- factor( + x = x$MDRO, + levels = 1:5, + labels = c( + "Negative", "Mono-resistant", "Poly-resistant", + "Multi-drug-resistant", "Extensively drug-resistant" + ), + ordered = TRUE + ) + } else if (guideline$code == "mrgn") { + x$MDRO <- factor( + x = x$MDRO, + levels = 1:3, + labels = c("Negative", "3MRGN", "4MRGN"), + ordered = TRUE + ) + } else { + x$MDRO <- factor( + x = x$MDRO, + levels = 1:3, + labels = c("Negative", "Positive, unconfirmed", "Positive"), + ordered = TRUE + ) + } + + if (isTRUE(verbose)) { + colnames(x)[colnames(x) == col_mo] <- "microorganism" + x$microorganism <- mo_name(x$microorganism, language = NULL) + x[, c( + "row_number", + "microorganism", + "MDRO", + "reason", + "columns_nonsusceptible" + ), + drop = FALSE + ] + } else { + x$MDRO + } +} + +#' @rdname mdro +#' @export +custom_mdro_guideline <- function(..., as_factor = TRUE) { + meet_criteria(as_factor, allow_class = "logical", has_length = 1) + + dots <- tryCatch(list(...), + error = function(e) "error" + ) + stop_if( + identical(dots, "error"), + "rules must be a valid formula inputs (e.g., using '~'), see `?mdro`" + ) + n_dots <- length(dots) + stop_if(n_dots == 0, "no custom rules were set. Please read the documentation using `?mdro`.") + out <- vector("list", n_dots) + for (i in seq_len(n_dots)) { + stop_ifnot( + inherits(dots[[i]], "formula"), + "rule ", i, " must be a valid formula input (e.g., using '~'), see `?mdro`" + ) + + # Query + qry <- dots[[i]][[2]] + if (inherits(qry, "call")) { + qry <- as.expression(qry) + } + qry <- as.character(qry) + # these will prevent vectorisation, so replace them: + qry <- gsub("&&", "&", qry, fixed = TRUE) + qry <- gsub("||", "|", qry, fixed = TRUE) + # support filter()-like writing: custom_mdro_guideline('CIP == "R", AMX == "S"' ~ "result 1") + qry <- gsub(" *, *", " & ", qry) + # format nicely, setting spaces around operators + qry <- gsub(" *([&|+-/*^><==]+) *", " \\1 ", qry) + qry <- gsub("'", "\"", qry, fixed = TRUE) + out[[i]]$query <- as.expression(qry) + + # Value + val <- tryCatch(eval(dots[[i]][[3]]), error = function(e) NULL) + stop_if(is.null(val), "rule ", i, " must return a valid value, it now returns an error: ", tryCatch(eval(dots[[i]][[3]]), error = function(e) e$message)) + stop_if(length(val) > 1, "rule ", i, " must return a value of length 1, not ", length(val)) + out[[i]]$value <- as.character(val) + } + + names(out) <- paste0("rule", seq_len(n_dots)) + out <- set_clean_class(out, new_class = c("custom_mdro_guideline", "list")) + attr(out, "values") <- unname(c("Negative", vapply(FUN.VALUE = character(1), unclass(out), function(x) x$value))) + attr(out, "as_factor") <- as_factor + out +} + +#' @method c custom_mdro_guideline +#' @noRd +#' @export +c.custom_mdro_guideline <- function(x, ..., as_factor = NULL) { + if (length(list(...)) == 0) { + return(x) + } + if (!is.null(as_factor)) { + meet_criteria(as_factor, allow_class = "logical", has_length = 1) + } else { + as_factor <- attributes(x)$as_factor + } + for (g in list(...)) { + stop_ifnot(inherits(g, "custom_mdro_guideline"), + "for combining custom MDRO guidelines, all rules must be created with `custom_mdro_guideline()`", + call = FALSE + ) + vals <- attributes(x)$values + if (!all(attributes(g)$values %in% vals)) { + vals <- unname(unique(c(vals, attributes(g)$values))) + } + attributes(g) <- NULL + x <- c(unclass(x), unclass(g)) + attr(x, "values") <- vals + } + names(x) <- paste0("rule", seq_len(length(x))) + x <- set_clean_class(x, new_class = c("custom_mdro_guideline", "list")) + attr(x, "values") <- vals + attr(x, "as_factor") <- as_factor + x +} + +#' @method as.list custom_mdro_guideline +#' @noRd +#' @export +as.list.custom_mdro_guideline <- function(x, ...) { + c(x, ...) +} + +#' @method print custom_mdro_guideline +#' @export +#' @noRd +print.custom_mdro_guideline <- function(x, ...) { + cat("A set of custom MDRO rules:\n") + for (i in seq_len(length(x))) { + rule <- x[[i]] + rule$query <- format_custom_query_rule(rule$query) + cat(" ", i, ". ", font_bold("If "), font_blue(rule$query), font_bold(" then: "), font_red(rule$value), "\n", sep = "") + } + cat(" ", i + 1, ". ", font_bold("Otherwise: "), font_red(paste0("Negative")), "\n", sep = "") + cat("\nUnmatched rows will return ", font_red("NA"), ".\n", sep = "") + if (isTRUE(attributes(x)$as_factor)) { + cat("Results will be of class 'factor', with ordered levels: ", paste0(attributes(x)$values, collapse = " < "), "\n", sep = "") + } else { + cat("Results will be of class 'character'.\n") + } +} + +run_custom_mdro_guideline <- function(df, guideline, info) { + n_dots <- length(guideline) + stop_if(n_dots == 0, "no custom guidelines set", call = -2) + out <- character(length = NROW(df)) + reasons <- character(length = NROW(df)) + for (i in seq_len(n_dots)) { + qry <- tryCatch(eval(parse(text = guideline[[i]]$query), envir = df, enclos = parent.frame()), + error = function(e) { + AMR_env$err_msg <- e$message + return("error") + } + ) + if (identical(qry, "error")) { + warning_("in `custom_mdro_guideline()`: rule ", i, + " (`", as.character(guideline[[i]]$query), "`) was ignored because of this error message: ", + AMR_env$err_msg, + call = FALSE, + add_fn = font_red + ) + next + } + stop_ifnot(is.logical(qry), "in custom_mdro_guideline(): rule ", i, " (`", guideline[[i]]$query, + "`) must return `TRUE` or `FALSE`, not ", + format_class(class(qry), plural = FALSE), + call = FALSE + ) + + new_mdros <- which(qry == TRUE & out == "") + + if (isTRUE(info)) { + cat(word_wrap( + "- Custom MDRO rule ", i, ": `", as.character(guideline[[i]]$query), + "` (", length(new_mdros), " rows matched)" + ), "\n", sep = "") + } + val <- guideline[[i]]$value + out[new_mdros] <- val + reasons[new_mdros] <- paste0( + "matched rule ", + gsub("rule", "", names(guideline)[i], fixed = TRUE), ": ", as.character(guideline[[i]]$query) + ) + } + out[out == ""] <- "Negative" + reasons[out == "Negative"] <- "no rules matched" + + if (isTRUE(attributes(guideline)$as_factor)) { + out <- factor(out, levels = attributes(guideline)$values, ordered = TRUE) + } + + columns_nonsusceptible <- as.data.frame(t(df[, is.sir(df), drop = FALSE] == "R")) + columns_nonsusceptible <- vapply( + FUN.VALUE = character(1), + columns_nonsusceptible, + function(x) paste0(rownames(columns_nonsusceptible)[which(x)], collapse = " ") + ) + columns_nonsusceptible[is.na(out)] <- NA_character_ + + data.frame( + row_number = seq_len(NROW(df)), + MDRO = out, + reason = reasons, + columns_nonsusceptible = columns_nonsusceptible, + stringsAsFactors = FALSE + ) +} + +#' @rdname mdro +#' @export +brmo <- function(x = NULL, only_sir_columns = FALSE, ...) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + stop_if( + "guideline" %in% names(list(...)), + "argument `guideline` must not be set since this is a guideline-specific function" + ) + mdro(x = x, only_sir_columns = only_sir_columns, guideline = "BRMO", ...) +} + +#' @rdname mdro +#' @export +mrgn <- function(x = NULL, only_sir_columns = FALSE, ...) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + stop_if( + "guideline" %in% names(list(...)), + "argument `guideline` must not be set since this is a guideline-specific function" + ) + mdro(x = x, only_sir_columns = only_sir_columns, guideline = "MRGN", ...) +} + +#' @rdname mdro +#' @export +mdr_tb <- function(x = NULL, only_sir_columns = FALSE, ...) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + stop_if( + "guideline" %in% names(list(...)), + "argument `guideline` must not be set since this is a guideline-specific function" + ) + mdro(x = x, only_sir_columns = only_sir_columns, guideline = "TB", ...) +} + +#' @rdname mdro +#' @export +mdr_cmi2012 <- function(x = NULL, only_sir_columns = FALSE, ...) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + stop_if( + "guideline" %in% names(list(...)), + "argument `guideline` must not be set since this is a guideline-specific function" + ) + mdro(x = x, only_sir_columns = only_sir_columns, guideline = "CMI2012", ...) +} + +#' @rdname mdro +#' @export +eucast_exceptional_phenotypes <- function(x = NULL, only_sir_columns = FALSE, ...) { + meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) + stop_if( + "guideline" %in% names(list(...)), + "argument `guideline` must not be set since this is a guideline-specific function" + ) + mdro(x = x, only_sir_columns = only_sir_columns, guideline = "EUCAST", ...) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mean_amr_distance.R + + + + +#' Calculate the Mean AMR Distance +#' +#' Calculates a normalised mean for antimicrobial resistance between multiple observations, to help to identify similar isolates without comparing antibiograms by hand. +#' @param x a vector of class [sir][as.sir()], [mic][as.mic()] or [disk][as.disk()], or a [data.frame] containing columns of any of these classes +#' @param ... variables to select (supports [tidyselect language][tidyselect::language] such as `column1:column4` and `where(is.mic)`, and can thus also be [antibiotic selectors][ab_selector()] +#' @param combine_SI a [logical] to indicate whether all values of S, SDD, and I must be merged into one, so the input only consists of S+I vs. R (susceptible vs. resistant) - the default is `TRUE` +#' @details The mean AMR distance is effectively [the Z-score](https://en.wikipedia.org/wiki/Standard_score); a normalised numeric value to compare AMR test results which can help to identify similar isolates, without comparing antibiograms by hand. +#' +#' MIC values (see [as.mic()]) are transformed with [log2()] first; their distance is thus calculated as `(log2(x) - mean(log2(x))) / sd(log2(x))`. +#' +#' SIR values (see [as.sir()]) are transformed using `"S"` = 1, `"I"` = 2, and `"R"` = 3. If `combine_SI` is `TRUE` (default), the `"I"` will be considered to be 1. +#' +#' For data sets, the mean AMR distance will be calculated per column, after which the mean per row will be returned, see *Examples*. +#' +#' Use [amr_distance_from_row()] to subtract distances from the distance of one row, see *Examples*. +#' @section Interpretation: +#' Isolates with distances less than 0.01 difference from each other should be considered similar. Differences lower than 0.025 should be considered suspicious. +#' @export +#' @examples +#' sir <- random_sir(10) +#' sir +#' mean_amr_distance(sir) +#' +#' mic <- random_mic(10) +#' mic +#' mean_amr_distance(mic) +#' # equal to the Z-score of their log2: +#' (log2(mic) - mean(log2(mic))) / sd(log2(mic)) +#' +#' disk <- random_disk(10) +#' disk +#' mean_amr_distance(disk) +#' +#' y <- data.frame( +#' id = LETTERS[1:10], +#' amox = random_sir(10, ab = "amox", mo = "Escherichia coli"), +#' cipr = random_disk(10, ab = "cipr", mo = "Escherichia coli"), +#' gent = random_mic(10, ab = "gent", mo = "Escherichia coli"), +#' tobr = random_mic(10, ab = "tobr", mo = "Escherichia coli") +#' ) +#' y +#' mean_amr_distance(y) +#' y$amr_distance <- mean_amr_distance(y, where(is.mic)) +#' y[order(y$amr_distance), ] +#' +#' if (require("dplyr")) { +#' y %>% +#' mutate( +#' amr_distance = mean_amr_distance(y), +#' check_id_C = amr_distance_from_row(amr_distance, id == "C") +#' ) %>% +#' arrange(check_id_C) +#' } +#' if (require("dplyr")) { +#' # support for groups +#' example_isolates %>% +#' filter(mo_genus() == "Enterococcus" & mo_species() != "") %>% +#' select(mo, TCY, carbapenems()) %>% +#' group_by(mo) %>% +#' mutate(dist = mean_amr_distance(.)) %>% +#' arrange(mo, dist) +#' } +mean_amr_distance <- function(x, ...) { + UseMethod("mean_amr_distance") +} + +#' @noRd +#' @export +mean_amr_distance.default <- function(x, ...) { + x <- as.double(x) + # calculate z-score + (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE) +} + +#' @noRd +#' @export +mean_amr_distance.mic <- function(x, ...) { + mean_amr_distance(log2(x)) +} + +#' @noRd +#' @export +mean_amr_distance.disk <- function(x, ...) { + mean_amr_distance(as.double(x)) +} + +#' @rdname mean_amr_distance +#' @export +mean_amr_distance.sir <- function(x, ..., combine_SI = TRUE) { + meet_criteria(combine_SI, allow_class = "logical", has_length = 1, .call_depth = -1) + if (isTRUE(combine_SI)) { + x[x %in% c("I", "SDD")] <- "S" + } + mean_amr_distance(as.double(x)) +} + +#' @rdname mean_amr_distance +#' @export +mean_amr_distance.data.frame <- function(x, ..., combine_SI = TRUE) { + meet_criteria(combine_SI, allow_class = "logical", has_length = 1, .call_depth = -1) + df <- x + if (is_null_or_grouped_tbl(df)) { + df <- get_current_data("x", -2) + } + df <- as.data.frame(df, stringsAsFactors = FALSE) + if (tryCatch(length(list(...)) > 0, error = function(e) TRUE)) { + out <- tryCatch(suppressWarnings(c(...)), error = function(e) NULL) + if (!is.null(out)) { + df <- df[, out, drop = FALSE] + } else { + df <- pm_select(df, ...) + } + } + df_classes <- colnames(df)[vapply(FUN.VALUE = logical(1), df, function(x) is.disk(x) | is.mic(x) | is.disk(x), USE.NAMES = FALSE)] + df_antibiotics <- unname(get_column_abx(df, info = FALSE)) + df <- df[, colnames(df)[colnames(df) %in% union(df_classes, df_antibiotics)], drop = FALSE] + + stop_if(ncol(df) < 2, + "data set must contain at least two variables", + call = -2 + ) + if (message_not_thrown_before("mean_amr_distance", "groups")) { + message_("Calculating mean AMR distance based on columns ", vector_and(colnames(df), sort = FALSE)) + } + + res <- vapply( + FUN.VALUE = double(nrow(df)), + df, + mean_amr_distance, + combine_SI = combine_SI + ) + if (is.null(dim(res))) { + if (all(is.na(res))) { + return(NA_real_) + } else { + return(mean(res, na.rm = TRUE)) + } + } + res <- rowMeans(res, na.rm = TRUE) + res[is.infinite(res) | is.nan(res)] <- 0 + res +} + +#' @rdname mean_amr_distance +#' @param amr_distance the outcome of [mean_amr_distance()] +#' @param row an index, such as a row number +#' @export +amr_distance_from_row <- function(amr_distance, row) { + meet_criteria(amr_distance, allow_class = "numeric", is_finite = TRUE) + meet_criteria(row, allow_class = c("logical", "numeric", "integer")) + if (is.logical(row)) { + row <- which(row) + } + abs(amr_distance[row] - amr_distance) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mic.R + + + + +# these are allowed MIC values and will become factor levels +VALID_MIC_LEVELS <- c( + as.double(paste0("0.000", c(1:9))), + as.double(paste0("0.00", c(1:99, 1953125, 390625, 78125))), + as.double(paste0("0.0", c(1:99, 125, 128, 156, 165, 256, 512, 625, 3125, 15625))), + as.double(paste0("0.", c(1:99, 125, 128, 256, 512))), + 1:9, 1.5, + c(10:98)[9:98 %% 2 == TRUE], + 2^c(7:12), 192 * c(1:5), 80 * c(2:12) +) +VALID_MIC_LEVELS <- trimws(gsub("[.]?0+$", "", format(unique(sort(VALID_MIC_LEVELS)), scientific = FALSE), perl = TRUE)) +operators <- c("<", "<=", "", ">=", ">") +VALID_MIC_LEVELS <- c(t(vapply(FUN.VALUE = character(length(VALID_MIC_LEVELS)), + c("<", "<=", "", ">=", ">"), + paste0, + VALID_MIC_LEVELS))) +COMMON_MIC_VALUES <- c(0.001, 0.002, 0.004, 0.008, 0.016, 0.032, 0.064, + 0.125, 0.25, 0.5, 1, 2, 4, 8, 16, 32, + 64, 128, 256, 512, 1024) + +#' Transform Input to Minimum Inhibitory Concentrations (MIC) +#' +#' This transforms vectors to a new class [`mic`], which treats the input as decimal numbers, while maintaining operators (such as ">=") and only allowing valid MIC values known to the field of (medical) microbiology. +#' @rdname as.mic +#' @param x a [character] or [numeric] vector +#' @param na.rm a [logical] indicating whether missing values should be removed +#' @param keep_operators a [character] specifying how to handle operators (such as `>` and `<=`) in the input. Accepts one of three values: `"all"` (or `TRUE`) to keep all operators, `"none"` (or `FALSE`) to remove all operators, or `"edges"` to keep operators only at both ends of the range. +#' @param ... arguments passed on to methods +#' @details To interpret MIC values as SIR values, use [as.sir()] on MIC values. It supports guidelines from EUCAST (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`) and CLSI (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`). +#' +#' This class for MIC values is a quite a special data type: formally it is an ordered [factor] with valid MIC values as [factor] levels (to make sure only valid MIC values are retained), but for any mathematical operation it acts as decimal numbers: +#' +#' ``` +#' x <- random_mic(10) +#' x +#' #> Class 'mic' +#' #> [1] 16 1 8 8 64 >=128 0.0625 32 32 16 +#' +#' is.factor(x) +#' #> [1] TRUE +#' +#' x[1] * 2 +#' #> [1] 32 +#' +#' median(x) +#' #> [1] 26 +#' ``` +#' +#' This makes it possible to maintain operators that often come with MIC values, such ">=" and "<=", even when filtering using [numeric] values in data analysis, e.g.: +#' +#' ``` +#' x[x > 4] +#' #> Class 'mic' +#' #> [1] 16 8 8 64 >=128 32 32 16 +#' +#' df <- data.frame(x, hospital = "A") +#' subset(df, x > 4) # or with dplyr: df %>% filter(x > 4) +#' #> x hospital +#' #> 1 16 A +#' #> 5 64 A +#' #> 6 >=128 A +#' #> 8 32 A +#' #> 9 32 A +#' #> 10 16 A +#' ``` +#' +#' All so-called [group generic functions][groupGeneric()] are implemented for the MIC class (such as `!`, `!=`, `<`, `>=`, [exp()], [log2()]). Some functions of the `stats` package are also implemented (such as [quantile()], [median()], [fivenum()]). Since [sd()] and [var()] are non-generic functions, these could not be extended. Use [mad()] as an alternative, or use e.g. `sd(as.numeric(x))` where `x` is your vector of MIC values. +#' +#' Using [as.double()] or [as.numeric()] on MIC values will remove the operators and return a numeric vector. Do **not** use [as.integer()] on MIC values as by the \R convention on [factor]s, it will return the index of the factor levels (which is often useless for regular users). +#' +#' Use [droplevels()] to drop unused levels. At default, it will return a plain factor. Use `droplevels(..., as.mic = TRUE)` to maintain the `mic` class. +#' +#' With [rescale_mic()], existing MIC ranges can be limited to a defined range of MIC values. This can be useful to better compare MIC distributions. +#' +#' For `ggplot2`, use one of the [`scale_*_mic()`][scale_x_mic()] functions to plot MIC values. They allows custom MIC ranges and to plot intermediate log2 levels for missing MIC values. +#' @return Ordered [factor] with additional class [`mic`], that in mathematical operations acts as a [numeric] vector. Bear in mind that the outcome of any mathematical operation on MICs will return a [numeric] value. +#' @aliases mic +#' @export +#' @seealso [as.sir()] +#' @examples +#' mic_data <- as.mic(c(">=32", "1.0", "1", "1.00", 8, "<=0.128", "8", "16", "16")) +#' mic_data +#' is.mic(mic_data) +#' +#' # this can also coerce combined MIC/SIR values: +#' as.mic("<=0.002; S") +#' +#' # mathematical processing treats MICs as numeric values +#' fivenum(mic_data) +#' quantile(mic_data) +#' all(mic_data < 512) +#' +#' # rescale MICs using rescale_mic() +#' rescale_mic(mic_data, mic_range = c(4, 16)) +#' +#' # interpret MIC values +#' as.sir( +#' x = as.mic(2), +#' mo = as.mo("Streptococcus pneumoniae"), +#' ab = "AMX", +#' guideline = "EUCAST" +#' ) +#' as.sir( +#' x = as.mic(c(0.01, 2, 4, 8)), +#' mo = as.mo("Streptococcus pneumoniae"), +#' ab = "AMX", +#' guideline = "EUCAST" +#' ) +#' +#' # plot MIC values, see ?plot +#' plot(mic_data) +#' plot(mic_data, mo = "E. coli", ab = "cipro") +#' +#' if (require("ggplot2")) { +#' autoplot(mic_data, mo = "E. coli", ab = "cipro") +#' } +#' if (require("ggplot2")) { +#' autoplot(mic_data, mo = "E. coli", ab = "cipro", language = "nl") # Dutch +#' } +as.mic <- function(x, na.rm = FALSE, keep_operators = "all") { + meet_criteria(x, allow_NA = TRUE) + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + meet_criteria(keep_operators, allow_class = c("character", "logical"), is_in = c("all", "none", "edges", FALSE, TRUE), has_length = 1) + if (isTRUE(keep_operators)) { + keep_operators <- "all" + } else if (isFALSE(keep_operators)) { + keep_operators <- "none" + } + + if (is.mic(x) && (keep_operators == "all" || !any(x %like% "[>=<]", na.rm = TRUE))) { + if (!identical(levels(x), VALID_MIC_LEVELS)) { + # from an older AMR version - just update MIC factor levels + x <- set_clean_class(factor(as.character(x), levels = VALID_MIC_LEVELS, ordered = TRUE), + new_class = c("mic", "ordered", "factor")) + } + return(x) + } + + x.bak <- NULL + if (is.numeric(x)) { + x.bak <- format(x, scientific = FALSE) + # MICs never have more than 9 decimals, so: + x <- format(round(x, 9), scientific = FALSE) + } else { + x <- as.character(unlist(x)) + } + if (isTRUE(na.rm)) { + x <- x[!is.na(x)] + } + x <- trimws2(x) + x[x == ""] <- NA + if (is.null(x.bak)) { + x.bak <- x + } + + # comma to period + x <- gsub(",", ".", x, fixed = TRUE) + # transform scientific notation + x[x %like% "[-]?[0-9]+([.][0-9]+)?e[-]?[0-9]+"] <- as.double(x[x %like% "[-]?[0-9]+([.][0-9]+)?e[-]?[0-9]+"]) + # transform Unicode for >= and <= + x <- gsub("\u2264", "<=", x, fixed = TRUE) + x <- gsub("\u2265", ">=", x, fixed = TRUE) + # remove other invalid characters + x <- gsub("[^a-zA-Z0-9.><= ]+", "", x, perl = TRUE) + # remove space between operator and number ("<= 0.002" -> "<=0.002") + x <- gsub("(<|=|>) +", "\\1", x, perl = TRUE) + # transform => to >= and =< to <= + x <- gsub("=<", "<=", x, fixed = TRUE) + x <- gsub("=>", ">=", x, fixed = TRUE) + # dots without a leading zero must start with 0 + x <- gsub("([^0-9]|^)[.]", "\\10.", x, perl = TRUE) + # values like "<=0.2560.512" should be 0.512 + x <- gsub(".*[.].*[.]", "0.", x, perl = TRUE) + # remove ending .0 + x <- gsub("[.]+0$", "", x, perl = TRUE) + # remove all after last digit + x <- gsub("[^0-9]+$", "", x, perl = TRUE) + # keep only one zero before dot + x <- gsub("0+[.]", "0.", x, perl = TRUE) + # starting 00 is probably 0.0 if there's no dot yet + x[x %unlike% "[.]"] <- gsub("^00", "0.0", x[!x %like% "[.]"]) + # remove last zeroes + x <- gsub("([.].?)0+$", "\\1", x, perl = TRUE) + x <- gsub("(.*[.])0+$", "\\10", x, perl = TRUE) + # remove ending .0 again + x[x %like% "[.]"] <- gsub("0+$", "", x[x %like% "[.]"]) + # never end with dot + x <- gsub("[.]$", "", x, perl = TRUE) + # trim it + x <- trimws2(x) + + ## previously unempty values now empty - should return a warning later on + x[x.bak != "" & x == ""] <- "invalid" + + na_before <- x[is.na(x) | x == ""] %pm>% length() + x[!x %in% VALID_MIC_LEVELS] <- NA + na_after <- x[is.na(x) | x == ""] %pm>% length() + + if (na_before != na_after) { + list_missing <- x.bak[is.na(x) & !is.na(x.bak) & x.bak != ""] %pm>% + unique() %pm>% + sort() %pm>% + vector_and(quotes = TRUE) + cur_col <- get_current_column() + warning_("in `as.mic()`: ", na_after - na_before, " result", + ifelse(na_after - na_before > 1, "s", ""), + ifelse(is.null(cur_col), "", paste0(" in column '", cur_col, "'")), + " truncated (", + round(((na_after - na_before) / length(x)) * 100), + "%) that were invalid MICs: ", + list_missing, + call = FALSE + ) + } + + if (keep_operators == "none" && !all(is.na(x))) { + x <- gsub("[>=<]", "", x) + } else if (keep_operators == "edges" && !all(is.na(x))) { + dbls <- as.double(gsub("[>=<]", "", x)) + x[dbls == min(dbls, na.rm = TRUE)] <- paste0("<=", min(dbls, na.rm = TRUE)) + x[dbls == max(dbls, na.rm = TRUE)] <- paste0(">=", max(dbls, na.rm = TRUE)) + keep <- x[dbls == max(dbls, na.rm = TRUE) | dbls == min(dbls, na.rm = TRUE)] + x[!x %in% keep] <- gsub("[>=<]", "", x[!x %in% keep]) + } + + set_clean_class(factor(x, levels = VALID_MIC_LEVELS, ordered = TRUE), + new_class = c("mic", "ordered", "factor")) +} + +#' @rdname as.mic +#' @export +is.mic <- function(x) { + inherits(x, "mic") +} + +#' @rdname as.mic +#' @details `NA_mic_` is a missing value of the new `mic` class, analogous to e.g. base \R's [`NA_character_`][base::NA]. +#' @format NULL +#' @export +NA_mic_ <- set_clean_class(factor(NA, levels = VALID_MIC_LEVELS, ordered = TRUE), + new_class = c("mic", "ordered", "factor") +) + +#' @rdname as.mic +#' @param mic_range a manual range to limit the MIC values, e.g., `mic_range = c(0.001, 32)`. Use `NA` to set no limit on one side, e.g., `mic_range = c(NA, 32)`. +#' @export +rescale_mic <- function(x, mic_range, keep_operators = "edges", as.mic = TRUE) { + meet_criteria(mic_range, allow_class = c("numeric", "integer", "logical"), has_length = 2, allow_NA = TRUE, allow_NULL = TRUE) + stop_ifnot(all(mic_range %in% c(VALID_MIC_LEVELS, NA)), + "Values in `mic_range` must be valid MIC values. Unvalid: ", vector_and(mic_range[mic_range %in% c(VALID_MIC_LEVELS, NA)])) + x <- as.mic(x) + if (is.null(mic_range)) { + mic_range <- c(NA, NA) + } + mic_range <- as.mic(mic_range) + + min_mic <- mic_range[1] + max_mic <- mic_range[2] + if (!is.na(min_mic)) { + x[x < min_mic] <- min_mic + } + if (!is.na(max_mic)) { + x[x > max_mic] <- max_mic + } + + x <- as.mic(x, keep_operators = ifelse(keep_operators == "edges", "none", keep_operators)) + + if (isTRUE(as.mic)) { + if (keep_operators == "edges") { + x[x == min(x, na.rm = TRUE)] <- paste0("<=", x[x == min(x, na.rm = TRUE)]) + x[x == max(x, na.rm = TRUE)] <- paste0(">=", x[x == max(x, na.rm = TRUE)]) + } + return(x) + } + + # create a manual factor with levels only within desired range + expanded <- plotrange_as_table(x, + expand = TRUE, + keep_operators = ifelse(keep_operators == "edges", "none", keep_operators), + mic_range = mic_range) + if (keep_operators == "edges") { + names(expanded)[1] <- paste0("<=", names(expanded)[1]) + names(expanded)[length(expanded)] <- paste0(">=", names(expanded)[length(expanded)]) + } + # MICs contain all MIC levels, so strip this to only existing levels and their intermediate values + out <- factor(names(expanded), + levels = names(expanded), + ordered = TRUE) + # and only keep the ones in the data + if (keep_operators == "edges") { + out <- out[match(x, as.double(as.mic(out, keep_operators = "all")))] + } else { + out <- out[match(x, out)] + } + out +} + +#' @method as.double mic +#' @export +#' @noRd +as.double.mic <- function(x, ...) { + as.double(gsub("[<=>]+", "", as.character(x), perl = TRUE)) +} + +#' @method as.numeric mic +#' @export +#' @noRd +as.numeric.mic <- function(x, ...) { + as.numeric(gsub("[<=>]+", "", as.character(x), perl = TRUE)) +} + +#' @rdname as.mic +#' @method droplevels mic +#' @param as.mic a [logical] to indicate whether the `mic` class should be kept - the default is `FALSE` +#' @export +droplevels.mic <- function(x, as.mic = FALSE, ...) { + x <- as.mic(x) # make sure that currently implemented MIC levels are used + x <- droplevels.factor(x, ...) + if (as.mic == TRUE) { + class(x) <- c("mic", "ordered", "factor") + } + x +} + +all_valid_mics <- function(x) { + if (!inherits(x, c("mic", "character", "factor", "numeric", "integer"))) { + return(FALSE) + } + x_mic <- tryCatch(suppressWarnings(as.mic(x[!is.na(x)])), + error = function(e) NA + ) + !any(is.na(x_mic)) && !all(is.na(x)) +} + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.mic <- function(x, ...) { + if(!identical(levels(x), VALID_MIC_LEVELS) && message_not_thrown_before("pillar_shaft.mic")) { + warning_(AMR_env$sup_1_icon, " These columns contain an outdated or altered structure - convert with `as.mic()` to update", + call = FALSE) + } + crude_numbers <- as.double(x) + operators <- gsub("[^<=>]+", "", as.character(x)) + operators[!is.na(operators) & operators != ""] <- font_silver(operators[!is.na(operators) & operators != ""], collapse = NULL) + out <- trimws(paste0(operators, trimws(format(crude_numbers)))) + out[is.na(x)] <- font_na(NA) + # make trailing zeroes less visible + out[out %like% "[.]"] <- gsub("([.]?0+)$", font_silver("\\1"), out[out %like% "[.]"], perl = TRUE) + create_pillar_column(out, align = "right", width = max(nchar(font_stripstyle(out)))) +} + +# will be exported using s3_register() in R/zzz.R +type_sum.mic <- function(x, ...) { + if(!identical(levels(x), VALID_MIC_LEVELS)) { + paste0("mic", AMR_env$sup_1_icon) + } else { + "mic" + } +} + +#' @method print mic +#' @export +#' @noRd +print.mic <- function(x, ...) { + cat("Class 'mic'") + if(!identical(levels(x), VALID_MIC_LEVELS)) { + cat(font_red(" with an outdated or altered structure - convert with `as.mic()` to update")) + } + cat("\n") + print(as.character(x), quote = FALSE) + att <- attributes(x) + if ("na.action" %in% names(att)) { + cat(font_silver(paste0("(NA ", class(att$na.action), ": ", paste0(att$na.action, collapse = ", "), ")\n"))) + } +} + +#' @method summary mic +#' @export +#' @noRd +summary.mic <- function(object, ...) { + summary(as.double(object), ...) +} + +#' @method as.matrix mic +#' @export +#' @noRd +as.matrix.mic <- function(x, ...) { + as.matrix(as.double(x), ...) +} +#' @method as.vector mic +#' @export +#' @noRd +as.vector.mic <- function(x, mode = "numneric", ...) { + y <- NextMethod() + y <- as.mic(y) + calls <- unlist(lapply(sys.calls(), as.character)) + if (any(calls %in% c("rbind", "cbind")) && message_not_thrown_before("as.vector.mic")) { + warning_("Functions `rbind()` and `cbind()` cannot preserve the structure of MIC values. Use dplyr's `bind_rows()` or `bind_cols()` instead.", call = FALSE) + } + y +} +#' @method as.list mic +#' @export +#' @noRd +as.list.mic <- function(x, ...) { + lapply(as.list(as.character(x), ...), as.mic) +} +#' @method as.data.frame mic +#' @export +#' @noRd +as.data.frame.mic <- function(x, ...) { + as.data.frame.vector(as.mic(x), ...) +} + +#' @method [ mic +#' @export +#' @noRd +"[.mic" <- function(x, ...) { + y <- NextMethod() + as.mic(y) +} +#' @method [[ mic +#' @export +#' @noRd +"[[.mic" <- function(x, ...) { + y <- NextMethod() + as.mic(y) +} +#' @method [<- mic +#' @export +#' @noRd +"[<-.mic" <- function(i, j, ..., value) { + value <- as.mic(value) + y <- NextMethod() + as.mic(y) +} +#' @method [[<- mic +#' @export +#' @noRd +"[[<-.mic" <- function(i, j, ..., value) { + value <- as.mic(value) + y <- NextMethod() + as.mic(y) +} +#' @method c mic +#' @export +#' @noRd +c.mic <- function(...) { + as.mic(unlist(lapply(list(...), as.character))) +} + +#' @method unique mic +#' @export +#' @noRd +unique.mic <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + as.mic(y) +} + +#' @method rep mic +#' @export +#' @noRd +rep.mic <- function(x, ...) { + y <- NextMethod() + as.mic(y) +} + +#' @method sort mic +#' @export +#' @noRd +sort.mic <- function(x, decreasing = FALSE, ...) { + x <- as.mic(x) # make sure that currently implemented MIC levels are used + dbl <- as.double(x) + # make sure that e.g. '<0.001' comes before '0.001', and '>0.001' comes after + dbl[as.character(x) %like% "<[0-9]"] <- dbl[as.character(x) %like% "<[0-9]"] - 0.000002 + dbl[as.character(x) %like% "<="] <- dbl[as.character(x) %like% "<="] - 0.000001 + dbl[as.character(x) %like% ">="] <- dbl[as.character(x) %like% ">="] + 0.000001 + dbl[as.character(x) %like% ">[0-9]"] <- dbl[as.character(x) %like% ">[0-9]"] + 0.000002 + if (decreasing == TRUE) { + x[order(-dbl)] + } else { + x[order(dbl)] + } +} + +#' @method hist mic +#' @importFrom graphics hist +#' @export +#' @noRd +hist.mic <- function(x, ...) { + warning_("in `hist()`: use `plot()` or ggplot2's `autoplot()` for optimal plotting of MIC values") + hist(log2(x)) +} + +# will be exported using s3_register() in R/zzz.R +get_skimmers.mic <- function(column) { + column <- as.mic(column) # make sure that currently implemented MIC levels are used + skimr::sfl( + skim_type = "mic", + p0 = ~ stats::quantile(., probs = 0, na.rm = TRUE, names = FALSE), + p25 = ~ stats::quantile(., probs = 0.25, na.rm = TRUE, names = FALSE), + p50 = ~ stats::quantile(., probs = 0.5, na.rm = TRUE, names = FALSE), + p75 = ~ stats::quantile(., probs = 0.75, na.rm = TRUE, names = FALSE), + p100 = ~ stats::quantile(., probs = 1, na.rm = TRUE, names = FALSE), + hist = ~ skimr::inline_hist(log2(stats::na.omit(.)), 5) + ) +} + +# Miscellaneous mathematical functions ------------------------------------ + +#' @method mean mic +#' @export +#' @noRd +mean.mic <- function(x, trim = 0, na.rm = FALSE, ...) { + mean(as.double(x), trim = trim, na.rm = na.rm, ...) +} + +#' @method median mic +#' @importFrom stats median +#' @export +#' @noRd +median.mic <- function(x, na.rm = FALSE, ...) { + median(as.double(x), na.rm = na.rm, ...) +} + +#' @method quantile mic +#' @importFrom stats quantile +#' @export +#' @noRd +quantile.mic <- function(x, probs = seq(0, 1, 0.25), na.rm = FALSE, + names = TRUE, type = 7, ...) { + quantile(as.double(x), probs = probs, na.rm = na.rm, names = names, type = type, ...) +} + +# Math (see ?groupGeneric) ------------------------------------------------ + +#' @export +Math.mic <- function(x, ...) { + x <- as.double(x) + # set class to numeric, because otherwise NextMethod will be factor (since mic is a factor) + .Class <- class(x) + NextMethod(.Generic) +} + +# Ops (see ?groupGeneric) ------------------------------------------------- + +#' @export +Ops.mic <- function(e1, e2) { + e1_chr <- as.character(e1) + e2_chr <- character(0) + e1 <- as.double(e1) + if (!missing(e2)) { + # when .Generic is `!`, e2 is missing + e2_chr <- as.character(e2) + e2 <- as.double(e2) + } + if (as.character(.Generic) %in% c("<", "<=", "==", "!=", ">", ">=")) { + # make sure that <0.002 is lower than 0.002 + # and that >32 is higher than 32, but equal to >=32 + e1[e1_chr %like% "<" & e1_chr %unlike% "="] <- e1[e1_chr %like% "<" & e1_chr %unlike% "="] - 0.000001 + e1[e1_chr %like% ">" & e1_chr %unlike% "="] <- e1[e1_chr %like% ">" & e1_chr %unlike% "="] + 0.000001 + e2[e2_chr %like% "<" & e2_chr %unlike% "="] <- e2[e2_chr %like% "<" & e2_chr %unlike% "="] - 0.000001 + e2[e2_chr %like% ">" & e2_chr %unlike% "="] <- e2[e2_chr %like% ">" & e2_chr %unlike% "="] + 0.000001 + } + # set .Class to numeric, because otherwise NextMethod will be factor (since mic is a factor) + .Class <- class(e1) + NextMethod(.Generic) +} + +# Complex (see ?groupGeneric) --------------------------------------------- + +#' @export +Complex.mic <- function(z) { + z <- as.double(z) + # set class to numeric, because otherwise NextMethod will be factor (since mic is a factor) + .Class <- class(z) + NextMethod(.Generic) +} + +# Summary (see ?groupGeneric) --------------------------------------------- + +#' @export +Summary.mic <- function(..., na.rm = FALSE) { + # NextMethod() cannot be called from an anonymous function (`...`), so we get() the generic directly: + fn <- get(.Generic, envir = .GenericCallEnv) + fn(as.double(c(...)), + na.rm = na.rm) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mo.R + + + + +#' Transform Arbitrary Input to Valid Microbial Taxonomy +#' +#' Use this function to get a valid microorganism code ([`mo`]) based on arbitrary user input. Determination is done using intelligent rules and the complete taxonomic tree of the kingdoms `r vector_and(unique(microorganisms$kingdom[which(!grepl("(unknown|Fungi)", microorganisms$kingdom))]), quotes = FALSE)`, and most microbial species from the kingdom Fungi (see *Source*). The input can be almost anything: a full name (like `"Staphylococcus aureus"`), an abbreviated name (such as `"S. aureus"`), an abbreviation known in the field (such as `"MRSA"`), or just a genus. See *Examples*. +#' @param x a [character] vector or a [data.frame] with one or two columns +#' @param Becker a [logical] to indicate whether staphylococci should be categorised into coagulase-negative staphylococci ("CoNS") and coagulase-positive staphylococci ("CoPS") instead of their own species, according to Karsten Becker *et al.* (see *Source*). Please see *Details* for a full list of staphylococcal species that will be converted. +#' +#' This excludes *Staphylococcus aureus* at default, use `Becker = "all"` to also categorise *S. aureus* as "CoPS". +#' @param Lancefield a [logical] to indicate whether a beta-haemolytic *Streptococcus* should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield (see *Source*). These streptococci will be categorised in their first group, e.g. *Streptococcus dysgalactiae* will be group C, although officially it was also categorised into groups G and L. . Please see *Details* for a full list of streptococcal species that will be converted. +#' +#' This excludes enterococci at default (who are in group D), use `Lancefield = "all"` to also categorise all enterococci as group D. +#' @param minimum_matching_score a numeric value to set as the lower limit for the [MO matching score][mo_matching_score()]. When left blank, this will be determined automatically based on the character length of `x`, its [taxonomic kingdom][microorganisms] and [human pathogenicity][mo_matching_score()]. +#' @param keep_synonyms a [logical] to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is `FALSE`, which will return a note if old taxonomic names were processed. The default can be set with the package option [`AMR_keep_synonyms`][AMR-options], i.e. `options(AMR_keep_synonyms = TRUE)` or `options(AMR_keep_synonyms = FALSE)`. +#' @param reference_df a [data.frame] to be used for extra reference when translating `x` to a valid [`mo`]. See [set_mo_source()] and [get_mo_source()] to automate the usage of your own codes (e.g. used in your analysis or organisation). +#' @param ignore_pattern a Perl-compatible [regular expression][base::regex] (case-insensitive) of which all matches in `x` must return `NA`. This can be convenient to exclude known non-relevant input and can also be set with the package option [`AMR_ignore_pattern`][AMR-options], e.g. `options(AMR_ignore_pattern = "(not reported|contaminated flora)")`. +#' @param cleaning_regex a Perl-compatible [regular expression][base::regex] (case-insensitive) to clean the input of `x`. Every matched part in `x` will be removed. At default, this is the outcome of [mo_cleaning_regex()], which removes texts between brackets and texts such as "species" and "serovar". The default can be set with the package option [`AMR_cleaning_regex`][AMR-options]. +#' @param only_fungi a [logical] to indicate if only fungi must be found, making sure that e.g. misspellings always return records from the kingdom of Fungi. This can be set globally for [all microorganism functions][mo_property()] with the package option [`AMR_only_fungi`][AMR-options], i.e. `options(AMR_only_fungi = TRUE)`. +#' @param language language to translate text like "no growth", which defaults to the system language (see [get_AMR_locale()]) +#' @param info a [logical] to indicate that info must be printed, e.g. a progress bar when more than 25 items are to be coerced, or a list with old taxonomic names. The default is `TRUE` only in interactive mode. +#' @param ... other arguments passed on to functions +#' @rdname as.mo +#' @aliases mo +#' @details +#' A microorganism (MO) code from this package (class: [`mo`]) is human-readable and typically looks like these examples: +#' +#' ``` +#' Code Full name +#' --------------- -------------------------------------- +#' B_KLBSL Klebsiella +#' B_KLBSL_PNMN Klebsiella pneumoniae +#' B_KLBSL_PNMN_RHNS Klebsiella pneumoniae rhinoscleromatis +#' | | | | +#' | | | | +#' | | | \---> subspecies, a 3-5 letter acronym +#' | | \----> species, a 3-6 letter acronym +#' | \----> genus, a 4-8 letter acronym +#' \----> kingdom: A (Archaea), AN (Animalia), B (Bacteria), +#' C (Chromista), F (Fungi), PL (Plantae), +#' P (Protozoa) +#' ``` +#' +#' Values that cannot be coerced will be considered 'unknown' and will return the MO code `UNKNOWN` with a warning. +#' +#' Use the [`mo_*`][mo_property()] functions to get properties based on the returned code, see *Examples*. +#' +#' The [as.mo()] function uses a novel and scientifically validated (\doi{10.18637/jss.v104.i03}) matching score algorithm (see *Matching Score for Microorganisms* below) to match input against the [available microbial taxonomy][microorganisms] in this package. This implicates that e.g. `"E. coli"` (a microorganism highly prevalent in humans) will return the microbial ID of *Escherichia coli* and not *Entamoeba coli* (a microorganism less prevalent in humans), although the latter would alphabetically come first. +#' +#' ### Coping with Uncertain Results +#' +#' Results of non-exact taxonomic input are based on their [matching score][mo_matching_score()]. The lowest allowed score can be set with the `minimum_matching_score` argument. At default this will be determined based on the character length of the input, the [taxonomic kingdom][microorganisms], and the [human pathogenicity][mo_matching_score()] of the taxonomic outcome. If values are matched with uncertainty, a message will be shown to suggest the user to inspect the results with [mo_uncertainties()], which returns a [data.frame] with all specifications. +#' +#' To increase the quality of matching, the `cleaning_regex` argument is used to clean the input. This must be a [regular expression][base::regex] that matches parts of the input that should be removed before the input is matched against the [available microbial taxonomy][microorganisms]. It will be matched Perl-compatible and case-insensitive. The default value of `cleaning_regex` is the outcome of the helper function [mo_cleaning_regex()]. +#' +#' There are three helper functions that can be run after using the [as.mo()] function: +#' - Use [mo_uncertainties()] to get a [data.frame] that prints in a pretty format with all taxonomic names that were guessed. The output contains the matching score for all matches (see *Matching Score for Microorganisms* below). +#' - Use [mo_failures()] to get a [character] [vector] with all values that could not be coerced to a valid value. +#' - Use [mo_renamed()] to get a [data.frame] with all values that could be coerced based on old, previously accepted taxonomic names. +#' +#' ### For Mycologists +#' +#' The [matching score algorithm][mo_matching_score()] gives precedence to bacteria over fungi. If you are only analysing fungi, be sure to use `only_fungi = TRUE`, or better yet, add this to your code and run it once every session: +#' +#' ```r +#' options(AMR_only_fungi = TRUE) +#' ``` +#' +#' This will make sure that no bacteria or other 'non-fungi' will be returned by [as.mo()], or any of the [`mo_*`][mo_property()] functions. +#' +#' ### Coagulase-negative and Coagulase-positive Staphylococci +#' +#' With `Becker = TRUE`, the following staphylococci will be converted to their corresponding coagulase group: +#' +#' * Coagulase-negative: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_CONS[MO_CONS != "B_STPHY_CONS"], keep_synonyms = TRUE)), quotes = "*")` +#' * Coagulase-positive: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_COPS[MO_COPS != "B_STPHY_COPS"], keep_synonyms = TRUE)), quotes = "*")` +#' +#' This is based on: +#' +#' * Becker K *et al.* (2014). **Coagulase-Negative Staphylococci.** *Clin Microbiol Rev.* 27(4): 870-926; \doi{10.1128/CMR.00109-13} +#' * Becker K *et al.* (2019). **Implications of identifying the recently defined members of the *S. aureus* complex, *S. argenteus* and *S. schweitzeri*: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).** *Clin Microbiol Infect*; \doi{10.1016/j.cmi.2019.02.028} +#' * Becker K *et al.* (2020). **Emergence of coagulase-negative staphylococci.** *Expert Rev Anti Infect Ther.* 18(4):349-366; \doi{10.1080/14787210.2020.1730813} +#' +#' For newly named staphylococcal species, such as *S. brunensis* (2024) and *S. shinii* (2023), we looked up the scientific reference to make sure the species are considered for the correct coagulase group. +#' +#' ### Lancefield Groups in Streptococci +#' +#' With `Lancefield = TRUE`, the following streptococci will be converted to their corresponding Lancefield group: +#' +#' * `r paste(apply(aggregate(mo_name ~ mo_group_name, data = microorganisms.groups[microorganisms.groups$mo_group_name %like_case% "Streptococcus Group [A-Z]$", ], FUN = function(x) vector_and(gsub("Streptococcus", "S.", x, fixed = TRUE), quotes = "*", sort = TRUE)), 1, function(row) paste(row["mo_group_name"], ": ", row["mo_name"], sep = "")), collapse = "\n* ")` +#' +#' This is based on: +#' +#' * Lancefield RC (1933). **A serological differentiation of human and other groups of hemolytic streptococci.** *J Exp Med.* 57(4): 571-95; \doi{10.1084/jem.57.4.571} +#' +#' @inheritSection mo_matching_score Matching Score for Microorganisms +#' +# (source as a section here, so it can be inherited by other man pages) +#' @section Source: +#' * Berends MS *et al.* (2022). **AMR: An R Package for Working with Antimicrobial Resistance Data**. *Journal of Statistical Software*, 104(3), 1-31; \doi{10.18637/jss.v104.i03} +#' * `r TAXONOMY_VERSION$LPSN$citation` Accessed from <`r TAXONOMY_VERSION$LPSN$url`> on `r documentation_date(TAXONOMY_VERSION$LPSN$accessed_date)`. +#' * `r TAXONOMY_VERSION$MycoBank$citation` Accessed from <`r TAXONOMY_VERSION$MycoBank$url`> on `r documentation_date(TAXONOMY_VERSION$MycoBank$accessed_date)`. +#' * `r TAXONOMY_VERSION$GBIF$citation` Accessed from <`r TAXONOMY_VERSION$GBIF$url`> on `r documentation_date(TAXONOMY_VERSION$GBIF$accessed_date)`. +#' * `r TAXONOMY_VERSION$BacDive$citation` Accessed from <`r TAXONOMY_VERSION$BacDive$url`> on `r documentation_date(TAXONOMY_VERSION$BacDive$accessed_date)`. +#' * `r TAXONOMY_VERSION$SNOMED$citation` URL: <`r TAXONOMY_VERSION$SNOMED$url`> +#' * Bartlett A *et al.* (2022). **A comprehensive list of bacterial pathogens infecting humans** *Microbiology* 168:001269; \doi{10.1099/mic.0.001269} +#' @export +#' @return A [character] [vector] with additional class [`mo`] +#' @seealso [microorganisms] for the [data.frame] that is being used to determine ID's. +#' +#' The [`mo_*`][mo_property()] functions (such as [mo_genus()], [mo_gramstain()]) to get properties based on the returned code. +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' \donttest{ +#' # These examples all return "B_STPHY_AURS", the ID of S. aureus: +#' as.mo(c( +#' "sau", # WHONET code +#' "stau", +#' "STAU", +#' "staaur", +#' "S. aureus", +#' "S aureus", +#' "Sthafilokkockus aureus", # handles incorrect spelling +#' "Staphylococcus aureus (MRSA)", +#' "MRSA", # Methicillin Resistant S. aureus +#' "VISA", # Vancomycin Intermediate S. aureus +#' "VRSA", # Vancomycin Resistant S. aureus +#' 115329001 # SNOMED CT code +#' )) +#' +#' # Dyslexia is no problem - these all work: +#' as.mo(c( +#' "Ureaplasma urealyticum", +#' "Ureaplasma urealyticus", +#' "Ureaplasmium urealytica", +#' "Ureaplazma urealitycium" +#' )) +#' +#' # input will get cleaned up with the input given in the `cleaning_regex` argument, +#' # which defaults to `mo_cleaning_regex()`: +#' cat(mo_cleaning_regex(), "\n") +#' +#' as.mo("Streptococcus group A") +#' +#' as.mo("S. epidermidis") # will remain species: B_STPHY_EPDR +#' as.mo("S. epidermidis", Becker = TRUE) # will not remain species: B_STPHY_CONS +#' +#' as.mo("S. pyogenes") # will remain species: B_STRPT_PYGN +#' as.mo("S. pyogenes", Lancefield = TRUE) # will not remain species: B_STRPT_GRPA +#' +#' # All mo_* functions use as.mo() internally too (see ?mo_property): +#' mo_genus("E. coli") +#' mo_gramstain("ESCO") +#' mo_is_intrinsic_resistant("ESCCOL", ab = "vanco") +#' } +as.mo <- function(x, + Becker = FALSE, + Lancefield = FALSE, + minimum_matching_score = NULL, + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + reference_df = get_mo_source(), + ignore_pattern = getOption("AMR_ignore_pattern", NULL), + cleaning_regex = getOption("AMR_cleaning_regex", mo_cleaning_regex()), + only_fungi = getOption("AMR_only_fungi", FALSE), + language = get_AMR_locale(), + info = interactive(), + ...) { + meet_criteria(x, allow_class = c("mo", "data.frame", "list", "character", "numeric", "integer", "factor"), allow_NA = TRUE) + meet_criteria(Becker, allow_class = c("logical", "character"), has_length = 1) + meet_criteria(Lancefield, allow_class = c("logical", "character"), has_length = 1) + meet_criteria(minimum_matching_score, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + meet_criteria(reference_df, allow_class = "data.frame", allow_NULL = TRUE) + meet_criteria(ignore_pattern, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(cleaning_regex, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(only_fungi, allow_class = "logical", has_length = 1) + language <- validate_language(language) + meet_criteria(info, allow_class = "logical", has_length = 1) + + add_MO_lookup_to_AMR_env() + + if (tryCatch(all(x %in% c(AMR_env$MO_lookup$mo, NA)), error = function(e) FALSE) && + isFALSE(Becker) && + isFALSE(Lancefield) && + isTRUE(keep_synonyms)) { + # don't look into valid MO codes, just return them + # is.mo() won't work - MO codes might change between package versions + return(set_clean_class(x, new_class = c("mo", "character"))) + } + + # start off with replaced language-specific non-ASCII characters with ASCII characters + x <- parse_and_convert(x) + # replace mo codes used in older package versions + x <- replace_old_mo_codes(x, property = "mo") + # ignore cases that match the ignore pattern + x <- replace_ignore_pattern(x, ignore_pattern) + + x_lower <- tolower(x) + + # WHONET: xxx = no growth + x[x_lower %in% c("", "xxx", "na", "nan")] <- NA_character_ + + out <- rep(NA_character_, length(x)) + + # below we use base R's match(), known for powering '%in%', and incredibly fast! + + # From reference_df ---- + reference_df <- repair_reference_df(reference_df) + if (!is.null(reference_df)) { + out[x %in% reference_df[[1]]] <- reference_df[[2]][match(x[x %in% reference_df[[1]]], reference_df[[1]])] + } + # From MO code ---- + out[is.na(out) & toupper(x) %in% AMR_env$MO_lookup$mo] <- toupper(x[is.na(out) & toupper(x) %in% AMR_env$MO_lookup$mo]) + # From full name ---- + out[is.na(out) & x_lower %in% AMR_env$MO_lookup$fullname_lower] <- AMR_env$MO_lookup$mo[match(x_lower[is.na(out) & x_lower %in% AMR_env$MO_lookup$fullname_lower], AMR_env$MO_lookup$fullname_lower)] + # one exception: "Fungi" matches the kingdom, but instead it should return the 'unknown' code for fungi + out[out == "F_[KNG]_FUNGI"] <- "F_FUNGUS" + # From known codes ---- + out[is.na(out) & toupper(x) %in% AMR::microorganisms.codes$code] <- AMR::microorganisms.codes$mo[match(toupper(x)[is.na(out) & toupper(x) %in% AMR::microorganisms.codes$code], AMR::microorganisms.codes$code)] + # From SNOMED ---- + # based on this extremely fast gem: https://stackoverflow.com/a/11002456/4575331 + snomeds <- unlist(AMR_env$MO_lookup$snomed) + snomeds <- snomeds[!is.na(snomeds)] + out[is.na(out) & x %in% snomeds] <- AMR_env$MO_lookup$mo[rep(seq_along(AMR_env$MO_lookup$snomed), vapply(FUN.VALUE = double(1), AMR_env$MO_lookup$snomed, length))[match(x[is.na(out) & x %in% snomeds], snomeds)]] + # From other familiar output ---- + # such as Salmonella groups, colloquial names, etc. + out[is.na(out)] <- convert_colloquial_input(x[is.na(out)]) + # From previous hits in this session ---- + old <- out + out[is.na(out) & paste(x, minimum_matching_score, only_fungi) %in% AMR_env$mo_previously_coerced$x] <- AMR_env$mo_previously_coerced$mo[match(paste(x, minimum_matching_score, only_fungi)[is.na(out) & paste(x, minimum_matching_score, only_fungi) %in% AMR_env$mo_previously_coerced$x], AMR_env$mo_previously_coerced$x)] + new <- out + if (isTRUE(info) && message_not_thrown_before("as.mo", old, new, entire_session = TRUE) && any(is.na(old) & !is.na(new), na.rm = TRUE)) { + message_( + "Returning previously coerced value", ifelse(sum(is.na(old) & !is.na(new)) > 1, "s", ""), + " for ", vector_and(x[is.na(old) & !is.na(new)]), ". Run `mo_reset_session()` to reset this. This note will be shown once per session for this input." + ) + } + + # For all other input ---- + if (any(is.na(out) & !is.na(x))) { + # reset uncertainties + AMR_env$mo_uncertainties <- AMR_env$mo_uncertainties[0, ] + AMR_env$mo_failures <- NULL + + # Laboratory systems: remove (translated) entries like "no growth", "not E. coli", etc. + x[trimws2(x) %like% translate_into_language("no .*growth", language = language)] <- NA_character_ + x[trimws2(x) %like% paste0("^(", translate_into_language("no|not", language = language), ") ")] <- NA_character_ + + # groups are in our taxonomic table with a capital G + x <- gsub(" group( |$)", " Group\\1", x, perl = TRUE) + + # run over all unique leftovers + x_unique <- unique(x[is.na(out) & !is.na(x)]) + + # set up progress bar + progress <- progress_ticker(n = length(x_unique), n_min = 10, print = info, title = "Converting microorganism input") + on.exit(close(progress)) + + msg <- character(0) + + MO_lookup_current <- AMR_env$MO_lookup + if (isTRUE(only_fungi)) { + MO_lookup_current <- MO_lookup_current[MO_lookup_current$kingdom == "Fungi", , drop = FALSE] + } + + # run it + x_coerced <- vapply(FUN.VALUE = character(1), x_unique, function(x_search) { + progress$tick() + + # some required cleaning steps + x_out <- trimws2(x_search) + # this applies the `cleaning_regex` argument, which defaults to mo_cleaning_regex() + x_out <- gsub(cleaning_regex, " ", x_out, ignore.case = TRUE, perl = TRUE) + x_out <- trimws2(gsub(" +", " ", x_out, perl = TRUE)) + x_search_cleaned <- x_out + x_out <- tolower(x_out) + # when x_search_cleaned are only capitals (such as in codes), make them lowercase to increase matching score + x_search_cleaned[x_search_cleaned == toupper(x_search_cleaned)] <- x_out[x_search_cleaned == toupper(x_search_cleaned)] + + # first check if cleaning led to an exact result, case-insensitive + if (x_out %in% MO_lookup_current$fullname_lower) { + return(as.character(MO_lookup_current$mo[match(x_out, MO_lookup_current$fullname_lower)])) + } + + # input must not be too short + if (nchar(x_out) < 3) { + return("UNKNOWN") + } + + # take out the parts, split by space + x_parts <- strsplit(gsub("-", " ", x_out, fixed = TRUE), " ", fixed = TRUE)[[1]] + # do a pre-match on first character (and if it contains a space, first chars of first two terms) + if (length(x_parts) %in% c(2, 3)) { + # for genus + species + subspecies + if (paste(x_parts[1:2], collapse = " ") %in% MO_lookup_current$fullname_lower) { + filtr <- which(MO_lookup_current$fullname_lower %like% paste(x_parts[1:2], collapse = " ")) + } else if (x_parts[1] %in% MO_lookup_current$genus_lower && !paste(x_parts[1:2], collapse = " ") %in% MO_lookup_current$fullname_lower) { + # for a known genus, but unknown (sub)species + filtr <- which(MO_lookup_current$genus_lower == x_parts[1]) + minimum_matching_score <- 0.05 + } else if (nchar(gsub("[^a-z]", "", x_parts[1], perl = TRUE)) <= 3) { + filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) & + (MO_lookup_current$species_first == substr(x_parts[2], 1, 1) | + MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) | + MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1))) + } else { + filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) | + MO_lookup_current$species_first == substr(x_parts[2], 1, 1) | + MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) | + MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1)) + } + } else if (length(x_parts) > 3) { + first_chars <- paste0("(^| )[", paste(substr(x_parts, 1, 1), collapse = ""), "]") + filtr <- which(MO_lookup_current$full_first %like_case% first_chars) + } else if (nchar(x_out) == 3) { + # no space and 3 characters - probably a code such as SAU or ECO + msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", totitle(substr(x_out, 1, 1)), AMR_env$dots, " ", substr(x_out, 2, 3), AMR_env$dots, "\"")) + filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 1), ".* ", substr(x_out, 2, 3))) + } else if (nchar(x_out) == 4) { + # no space and 4 characters - probably a code such as STAU or ESCO + msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", totitle(substr(x_out, 1, 2)), AMR_env$dots, " ", substr(x_out, 3, 4), AMR_env$dots, "\"")) + filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 2), ".* ", substr(x_out, 3, 4))) + } else if (nchar(x_out) <= 6) { + # no space and 5-6 characters - probably a code such as STAAUR or ESCCOL + first_part <- paste0(substr(x_out, 1, 2), "[a-z]*", substr(x_out, 3, 3)) + second_part <- substr(x_out, 4, nchar(x_out)) + msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", gsub("[a-z]*", AMR_env$dots, totitle(first_part), fixed = TRUE), " ", second_part, AMR_env$dots, "\"")) + filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", first_part, ".* ", second_part)) + } else { + # for genus or species or subspecies + filtr <- which(MO_lookup_current$full_first == substr(x_parts, 1, 1) | + MO_lookup_current$species_first == substr(x_parts, 1, 1) | + MO_lookup_current$subspecies_first == substr(x_parts, 1, 1)) + } + if (length(filtr) == 0) { + mo_to_search <- MO_lookup_current$fullname + } else { + mo_to_search <- MO_lookup_current$fullname[filtr] + } + + AMR_env$mo_to_search <- mo_to_search + # determine the matching score on the original search value + m <- mo_matching_score(x = x_search_cleaned, n = mo_to_search) + if (is.null(minimum_matching_score)) { + minimum_matching_score_current <- min(0.6, min(10, nchar(x_search_cleaned)) * 0.08) + # correct back for prevalence + minimum_matching_score_current <- minimum_matching_score_current / MO_lookup_current$prevalence[match(mo_to_search, MO_lookup_current$fullname)] + # correct back for kingdom + minimum_matching_score_current <- minimum_matching_score_current / MO_lookup_current$kingdom_index[match(mo_to_search, MO_lookup_current$fullname)] + minimum_matching_score_current <- pmax(minimum_matching_score_current, m) + if (length(x_parts) > 1 && all(m <= 0.55, na.rm = TRUE)) { + # if the highest score is 0.5, we have nothing serious - 0.5 is the lowest for pathogenic group 1 + # make everything NA so the results will get removed below + # (we added length(x_parts) > 1 to exclude microbial codes from this rule, such as "STAU") + m[seq_len(length(m))] <- NA_real_ + } + } else { + # minimum_matching_score was set, so remove everything below it + m[m < minimum_matching_score] <- NA_real_ + minimum_matching_score_current <- minimum_matching_score + } + + top_hits <- mo_to_search[order(m, decreasing = TRUE, na.last = NA)] # na.last = NA will remove the NAs + if (length(top_hits) == 0) { + warning_("No hits found for \"", x_search, "\" with minimum_matching_score = ", ifelse(is.null(minimum_matching_score), paste0("NULL (=", round(min(minimum_matching_score_current, na.rm = TRUE), 3), ")"), minimum_matching_score), ". Try setting this value lower or even to 0.", call = FALSE) + result_mo <- NA_character_ + } else { + result_mo <- MO_lookup_current$mo[match(top_hits[1], MO_lookup_current$fullname)] + AMR_env$mo_uncertainties <- rbind_AMR( + AMR_env$mo_uncertainties, + data.frame( + original_input = x_search, + input = x_search_cleaned, + fullname = top_hits[1], + mo = result_mo, + candidates = ifelse(length(top_hits) > 1, paste(top_hits[2:min(99, length(top_hits))], collapse = ", "), ""), + minimum_matching_score = ifelse(is.null(minimum_matching_score), "NULL", minimum_matching_score), + keep_synonyms = keep_synonyms, + stringsAsFactors = FALSE + ) + ) + # save to package env to save time for next time + AMR_env$mo_previously_coerced <- unique(rbind_AMR( + AMR_env$mo_previously_coerced, + data.frame( + x = paste(x_search, minimum_matching_score, only_fungi), + mo = result_mo, + stringsAsFactors = FALSE + ) + )) + } + # the actual result: + as.character(result_mo) + }) + + # remove progress bar from console + close(progress) + # expand from unique again + out[is.na(out)] <- x_coerced[match(x[is.na(out)], x_unique)] + + # Throw note about uncertainties ---- + if (isTRUE(info) && NROW(AMR_env$mo_uncertainties) > 0) { + if (message_not_thrown_before("as.mo", "uncertainties", AMR_env$mo_uncertainties$original_input)) { + plural <- c("", "this") + if (length(AMR_env$mo_uncertainties$original_input) > 1) { + plural <- c("s", "these uncertainties") + } + if (length(AMR_env$mo_uncertainties$original_input) <= 3) { + examples <- vector_and( + paste0( + '"', AMR_env$mo_uncertainties$original_input, + '" (assumed ', italicise(AMR_env$mo_uncertainties$fullname), ")" + ), + quotes = FALSE + ) + } else { + examples <- paste0(nr2char(length(AMR_env$mo_uncertainties$original_input)), " microorganism", plural[1]) + } + msg <- c(msg, paste0( + "Microorganism translation was uncertain for ", examples, + ". Run `mo_uncertainties()` to review ", plural[2], ", or use `add_custom_microorganisms()` to add custom entries." + )) + + for (m in msg) { + message_(m) + } + } + } + } # end of loop over all yet unknowns + + # Keep or replace synonyms ---- + out_current <- synonym_mo_to_accepted_mo(out, fill_in_accepted = FALSE) + AMR_env$mo_renamed <- list(old = out[!is.na(out_current)]) + if (isFALSE(keep_synonyms)) { + out[!is.na(out_current)] <- out_current[!is.na(out_current)] + if (isTRUE(info) && length(AMR_env$mo_renamed$old) > 0) { + print(mo_renamed(), extra_txt = " (use `keep_synonyms = TRUE` to leave uncorrected)") + } + } else if (is.null(getOption("AMR_keep_synonyms")) && length(AMR_env$mo_renamed$old) > 0 && message_not_thrown_before("as.mo", "keep_synonyms_warning", entire_session = TRUE)) { + # keep synonyms is TRUE, so check if any do have synonyms + warning_("Function `as.mo()` returned ", nr2char(length(unique(AMR_env$mo_renamed$old))), " old taxonomic name", ifelse(length(unique(AMR_env$mo_renamed$old)) > 1, "s", ""), ". Use `as.mo(..., keep_synonyms = FALSE)` to clean the input to currently accepted taxonomic names, or set the R option `AMR_keep_synonyms` to `FALSE`. This warning will be shown once per session.", call = FALSE) + } + + # Apply Becker ---- + if (!isTRUE(only_fungi) && (isTRUE(Becker) || Becker == "all")) { + # warn when species found that are not in: + # - Becker et al. 2014, PMID 25278577 + # - Becker et al. 2019, PMID 30872103 + # - Becker et al. 2020, PMID 32056452 + + # comment below code if all staphylococcal species are categorised as CoNS/CoPS + post_Becker <- paste( + "Staphylococcus", + c("caledonicus", "canis", "durrellii", "lloydii", "ratti", "roterodami", "singaporensis", "taiwanensis") + ) + if (any(out %in% AMR_env$MO_lookup$mo[match(post_Becker, AMR_env$MO_lookup$fullname)])) { + if (message_not_thrown_before("as.mo", "becker")) { + warning_("in `as.mo()`: Becker ", font_italic("et al."), " (2014, 2019, 2020) does not contain these species named after their publication: ", + vector_and(font_italic(gsub("Staphylococcus", "S.", post_Becker, fixed = TRUE), collapse = NULL), quotes = FALSE), + ". Categorisation to CoNS/CoPS was taken from the original scientific publication(s).", + immediate = TRUE, call = FALSE + ) + } + } + + # 'MO_CONS' and 'MO_COPS' are 'mo' vectors created in R/_pre_commit_checks.R + out[out %in% MO_CONS] <- "B_STPHY_CONS" + out[out %in% MO_COPS] <- "B_STPHY_COPS" + if (Becker == "all") { + out[out == "B_STPHY_AURS"] <- "B_STPHY_COPS" + } + } + + # Apply Lancefield ---- + if (!isTRUE(only_fungi) && (isTRUE(Lancefield) || Lancefield == "all")) { + # (using `%like_case%` to also match subspecies) + + # group A - S. pyogenes + out[out %like_case% "^B_STRPT_PYGN(_|$)"] <- "B_STRPT_GRPA" + # group B - S. agalactiae + out[out %like_case% "^B_STRPT_AGLC(_|$)"] <- "B_STRPT_GRPB" + # group C - all subspecies within S. dysgalactiae and S. equi (such as S. equi zooepidemicus) + out[out %like_case% "^B_STRPT_(DYSG|EQUI)(_|$)"] <- "B_STRPT_GRPC" + if (Lancefield == "all") { + # group D - all enterococci + out[out %like_case% "^B_ENTRC(_|$)"] <- "B_STRPT_GRPD" + } + # group F - Milleri group == S. anginosus group, which incl. S. anginosus, S. constellatus, S. intermedius + out[out %like_case% "^B_STRPT_(ANGN|CNST|INTR)(_|$)"] <- "B_STRPT_GRPF" + # group G - S. dysgalactiae and S. canis (though dysgalactiae is also group C and will be matched there) + out[out %like_case% "^B_STRPT_(DYSG|CANS)(_|$)"] <- "B_STRPT_GRPG" + # group H - S. sanguinis + out[out %like_case% "^B_STRPT_SNGN(_|$)"] <- "B_STRPT_GRPH" + # group K - S. salivarius, incl. S. salivarius salivarius and S. salivarius thermophilus + out[out %like_case% "^B_STRPT_SLVR(_|$)"] <- "B_STRPT_GRPK" + # group L - only S. dysgalactiae which is also group C & G, so ignore it here + } + + # All unknowns ---- + out[is.na(out) & !is.na(x)] <- "UNKNOWN" + AMR_env$mo_failures <- unique(x[out == "UNKNOWN" & !toupper(x) %in% c("UNKNOWN", "CON", "UNK") & !x %like_case% "^[(]unknown [a-z]+[)]$" & !is.na(x)]) + if (length(AMR_env$mo_failures) > 0) { + warning_("The following input could not be coerced and was returned as \"UNKNOWN\": ", vector_and(AMR_env$mo_failures, quotes = TRUE), ".\nYou can retrieve this list with `mo_failures()`.", call = FALSE) + } + + # Return class ---- + set_clean_class(out, + new_class = c("mo", "character") + ) +} + +# OTHER DOCUMENTED FUNCTIONS ---------------------------------------------- + +#' @rdname as.mo +#' @export +is.mo <- function(x) { + inherits(x, "mo") +} + +#' @rdname as.mo +#' @export +mo_uncertainties <- function() { + set_clean_class(AMR_env$mo_uncertainties, new_class = c("mo_uncertainties", "data.frame")) +} + +#' @rdname as.mo +#' @export +mo_renamed <- function() { + add_MO_lookup_to_AMR_env() + x <- AMR_env$mo_renamed + + x$new <- synonym_mo_to_accepted_mo(x$old) + mo_old <- AMR_env$MO_lookup$fullname[match(x$old, AMR_env$MO_lookup$mo)] + mo_new <- AMR_env$MO_lookup$fullname[match(x$new, AMR_env$MO_lookup$mo)] + ref_old <- AMR_env$MO_lookup$ref[match(x$old, AMR_env$MO_lookup$mo)] + ref_new <- AMR_env$MO_lookup$ref[match(x$new, AMR_env$MO_lookup$mo)] + + df_renamed <- data.frame( + old = mo_old, + new = mo_new, + ref_old = ref_old, + ref_new = ref_new, + stringsAsFactors = FALSE + ) + df_renamed <- unique(df_renamed) + df_renamed <- df_renamed[order(df_renamed$old), , drop = FALSE] + set_clean_class(df_renamed, new_class = c("mo_renamed", "data.frame")) +} + +#' @rdname as.mo +#' @export +mo_failures <- function() { + AMR_env$mo_failures +} + +#' @rdname as.mo +#' @export +mo_reset_session <- function() { + if (NROW(AMR_env$mo_previously_coerced) > 0) { + message_("Reset ", nr2char(NROW(AMR_env$mo_previously_coerced)), " previously matched input value", ifelse(NROW(AMR_env$mo_previously_coerced) > 1, "s", ""), ".") + AMR_env$mo_previously_coerced <- AMR_env$mo_previously_coerced[0, , drop = FALSE] + AMR_env$mo_uncertainties <- AMR_env$mo_uncertainties[0, , drop = FALSE] + } else { + message_("No previously matched input values to reset.") + } +} + +#' @rdname as.mo +#' @export +mo_cleaning_regex <- function() { + parts_to_remove <- c("e?spp([^a-z]+|$)", "e?ssp([^a-z]+|$)", "e?ss([^a-z]+|$)", "e?sp([^a-z]+|$)", "e?subsp", "sube?species", "e?species", + "biovar[a-z]*", "biotype", "serovar[a-z]*", "var([^a-z]+|$)", "serogr.?up[a-z]*", + "titer", "dummy", "Ig[ADEGM]", " ?[a-z-]+[-](resistant|susceptible) ?") + + paste0( + "(", + "[^A-Za-z- \\(\\)\\[\\]{}]+", + "|", + "([({]|\\[).+([})]|\\])", + "|(^| )(", + paste0(parts_to_remove[order(1 - nchar(parts_to_remove))], collapse = "|"), + "))") +} + +# UNDOCUMENTED METHODS ---------------------------------------------------- + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.mo <- function(x, ...) { + add_MO_lookup_to_AMR_env() + out <- trimws(format(x)) + # grey out the kingdom (part until first "_") + out[!is.na(x)] <- gsub("^([A-Z]+_)(.*)", paste0(font_subtle("\\1"), "\\2"), out[!is.na(x)], perl = TRUE) + # and grey out every _ + out[!is.na(x)] <- gsub("_", font_subtle("_"), out[!is.na(x)]) + + # markup NA and UNKNOWN + out[is.na(x)] <- font_na(" NA") + out[x == "UNKNOWN"] <- font_na(" UNKNOWN") + + # markup manual codes + out[x %in% AMR_env$MO_lookup$mo & !x %in% AMR::microorganisms$mo] <- font_blue(out[x %in% AMR_env$MO_lookup$mo & !x %in% AMR::microorganisms$mo], collapse = NULL) + + df <- tryCatch(get_current_data(arg_name = "x", call = 0), + error = function(e) NULL + ) + if (!is.null(df)) { + mo_cols <- vapply(FUN.VALUE = logical(1), df, is.mo) + } else { + mo_cols <- NULL + } + + all_mos <- c(AMR_env$MO_lookup$mo, NA) + if (!all(x %in% all_mos) || + (!is.null(df) && !all(unlist(df[, which(mo_cols), drop = FALSE]) %in% all_mos))) { + # markup old mo codes + out[!x %in% all_mos] <- font_italic( + font_na(x[!x %in% all_mos], + collapse = NULL + ), + collapse = NULL + ) + # throw a warning with the affected column name(s) + if (!is.null(mo_cols)) { + col <- paste0("Column ", vector_or(colnames(df)[mo_cols], quotes = TRUE, sort = FALSE)) + } else { + col <- "The data" + } + warning_( + col, " contains old MO codes (from a previous AMR package version). ", + "Please update your MO codes with `as.mo()`.", + call = FALSE + ) + } + + # add the names to the bugs as mouse-over! + if (tryCatch(isTRUE(getExportedValue("ansi_has_hyperlink_support", ns = asNamespace("cli"))()), error = function(e) FALSE)) { + out[!x %in% c("UNKNOWN", NA)] <- font_url(url = paste0(x[!x %in% c("UNKNOWN", NA)], ": ", + mo_name(x[!x %in% c("UNKNOWN", NA)], keep_synonyms = TRUE)), + txt = out[!x %in% c("UNKNOWN", NA)]) + } + + # make it always fit exactly + max_char <- max(nchar(x)) + if (is.na(max_char)) { + max_char <- 12 + } + create_pillar_column(out, + align = "left", + width = max_char + ifelse(any(x %in% c(NA, "UNKNOWN")), 2, 0) + ) +} + +# will be exported using s3_register() in R/zzz.R +type_sum.mo <- function(x, ...) { + "mo" +} + +# will be exported using s3_register() in R/zzz.R +freq.mo <- function(x, ...) { + x_noNA <- as.mo(x[!is.na(x)]) # as.mo() to get the newest mo codes + grams <- mo_gramstain(x_noNA, language = NULL) + digits <- list(...)$digits + if (is.null(digits)) { + digits <- 2 + } + cleaner::freq.default( + x = x, + ..., + .add_header = list( + `Gram-negative` = paste0( + format(sum(grams == "Gram-negative", na.rm = TRUE), + big.mark = " ", + decimal.mark = "." + ), + " (", percentage(sum(grams == "Gram-negative", na.rm = TRUE) / length(grams), + digits = digits + ), + ")" + ), + `Gram-positive` = paste0( + format(sum(grams == "Gram-positive", na.rm = TRUE), + big.mark = " ", + decimal.mark = "." + ), + " (", percentage(sum(grams == "Gram-positive", na.rm = TRUE) / length(grams), + digits = digits + ), + ")" + ), + `Nr. of genera` = pm_n_distinct(mo_genus(x_noNA, language = NULL)), + `Nr. of species` = pm_n_distinct(paste( + mo_genus(x_noNA, language = NULL), + mo_species(x_noNA, language = NULL) + )) + ) + ) +} + +# will be exported using s3_register() in R/zzz.R +get_skimmers.mo <- function(column) { + skimr::sfl( + skim_type = "mo", + unique_total = ~ length(unique(stats::na.omit(.))), + gram_negative = ~ sum(mo_is_gram_negative(.), na.rm = TRUE), + gram_positive = ~ sum(mo_is_gram_positive(.), na.rm = TRUE), + top_genus = ~ names(sort(-table(mo_genus(stats::na.omit(.), language = NULL))))[1L], + top_species = ~ names(sort(-table(mo_name(stats::na.omit(.), language = NULL))))[1L] + ) +} + +#' @method print mo +#' @export +#' @noRd +print.mo <- function(x, print.shortnames = FALSE, ...) { + add_MO_lookup_to_AMR_env() + cat("Class 'mo'\n") + x_names <- names(x) + if (is.null(x_names) & print.shortnames == TRUE) { + x_names <- tryCatch(mo_shortname(x, ...), error = function(e) NULL) + } + x <- as.character(x) + names(x) <- x_names + if (!all(x %in% c(AMR_env$MO_lookup$mo, NA))) { + warning_( + "Some MO codes are from a previous AMR package version. ", + "Please update the MO codes with `as.mo()`.", + call = FALSE + ) + } + print.default(x, quote = FALSE) +} + +#' @method summary mo +#' @export +#' @noRd +summary.mo <- function(object, ...) { + # unique and top 1-3 + x <- object + top_3 <- names(sort(-table(x[!is.na(x)])))[1:3] + out <- c( + "Class" = "mo", + "" = length(x[is.na(x)]), + "Unique" = length(unique(x[!is.na(x)])), + "#1" = top_3[1], + "#2" = top_3[2], + "#3" = top_3[3] + ) + class(out) <- c("summaryDefault", "table") + out +} + +#' @method as.data.frame mo +#' @export +#' @noRd +as.data.frame.mo <- function(x, ...) { + add_MO_lookup_to_AMR_env() + if (!all(x %in% c(AMR_env$MO_lookup$mo, NA))) { + warning_( + "The data contains old MO codes (from a previous AMR package version). ", + "Please update your MO codes with `as.mo()`." + ) + } + nm <- deparse1(substitute(x)) + if (!"nm" %in% names(list(...))) { + as.data.frame.vector(x, ..., nm = nm) + } else { + as.data.frame.vector(x, ...) + } +} + +#' @method [ mo +#' @export +#' @noRd +"[.mo" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [[ mo +#' @export +#' @noRd +"[[.mo" <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} +#' @method [<- mo +#' @export +#' @noRd +"[<-.mo" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + # must only contain valid MOs + add_MO_lookup_to_AMR_env() + return_after_integrity_check(y, "microorganism code", as.character(AMR_env$MO_lookup$mo)) +} +#' @method [[<- mo +#' @export +#' @noRd +"[[<-.mo" <- function(i, j, ..., value) { + y <- NextMethod() + attributes(y) <- attributes(i) + # must only contain valid MOs + add_MO_lookup_to_AMR_env() + return_after_integrity_check(y, "microorganism code", as.character(AMR_env$MO_lookup$mo)) +} +#' @method c mo +#' @export +#' @noRd +c.mo <- function(...) { + x <- list(...)[[1L]] + y <- NextMethod() + attributes(y) <- attributes(x) + add_MO_lookup_to_AMR_env() + return_after_integrity_check(y, "microorganism code", as.character(AMR_env$MO_lookup$mo)) +} + +#' @method unique mo +#' @export +#' @noRd +unique.mo <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method rep mo +#' @export +#' @noRd +rep.mo <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method print mo_uncertainties +#' @export +#' @noRd +print.mo_uncertainties <- function(x, n = 10, ...) { + more_than_50 <- FALSE + if (NROW(x) == 0) { + cat(word_wrap("No uncertainties to show. Only uncertainties of the last call to `as.mo()` or any `mo_*()` function are stored.\n\n", add_fn = font_blue)) + return(invisible(NULL)) + } else if (NROW(x) > 50) { + more_than_50 <- TRUE + x <- x[1:50, , drop = FALSE] + } + + cat(word_wrap("Matching scores are based on the resemblance between the input and the full taxonomic name, and the pathogenicity in humans. See `?mo_matching_score`.\n\n", add_fn = font_blue)) + + add_MO_lookup_to_AMR_env() + + col_red <- function(x) font_rose_bg(x, collapse = NULL) + col_orange <- function(x) font_orange_bg(x, collapse = NULL) + col_yellow <- function(x) font_yellow_bg(x, collapse = NULL) + col_green <- function(x) font_green_bg(x, collapse = NULL) + + if (has_colour()) { + cat(word_wrap("Colour keys: ", + col_red(" 0.000-0.549 "), + col_orange(" 0.550-0.649 "), + col_yellow(" 0.650-0.749 "), + col_green(" 0.750-1.000"), + add_fn = font_blue + ), font_green_bg(" "), "\n", sep = "") + } + + score_set_colour <- function(text, scores) { + # set colours to scores + text[scores >= 0.75] <- col_green(text[scores >= 0.75]) + text[scores >= 0.65 & scores < 0.75] <- col_yellow(text[scores >= 0.65 & scores < 0.75]) + text[scores >= 0.55 & scores < 0.65] <- col_orange(text[scores >= 0.55 & scores < 0.65]) + text[scores < 0.55] <- col_red(text[scores < 0.55]) + text + } + + txt <- "" + any_maxed_out <- FALSE + for (i in seq_len(nrow(x))) { + if (x[i, ]$candidates != "") { + candidates <- unlist(strsplit(x[i, ]$candidates, ", ", fixed = TRUE)) + if (length(candidates) > n) { + any_maxed_out <- TRUE + candidates <- candidates[seq_len(n)] + } + scores <- mo_matching_score(x = x[i, ]$input, n = candidates) + n_candidates <- length(candidates) + + candidates_formatted <- italicise(candidates) + scores_formatted <- trimws(formatC(round(scores, 3), format = "f", digits = 3)) + scores_formatted <- score_set_colour(scores_formatted, scores) + + # sort on descending scores + candidates_formatted <- candidates_formatted[order(1 - scores)] + scores_formatted <- scores_formatted[order(1 - scores)] + + candidates <- word_wrap( + paste0( + "Also matched: ", + vector_and( + paste0( + candidates_formatted, + font_blue(paste0(" (", scores_formatted, ")"), collapse = NULL) + ), + quotes = FALSE, sort = FALSE + ) + ), + extra_indent = nchar("Also matched: "), + width = 0.9 * getOption("width", 100) + ) + } else { + candidates <- "" + } + + score <- mo_matching_score( + x = x[i, ]$input, + n = x[i, ]$fullname + ) + score_formatted <- trimws(formatC(round(score, 3), format = "f", digits = 3)) + txt <- paste(txt, + paste0( + paste0( + "", strrep(font_grey("-"), times = getOption("width", 100)), "\n", + '"', x[i, ]$original_input, '"', + " -> ", + paste0( + font_bold(italicise(x[i, ]$fullname)), + " (", x[i, ]$mo, ", ", score_set_colour(score_formatted, score), ")" + ) + ), + collapse = "\n" + ), + ifelse(x[i, ]$mo %in% AMR_env$MO_lookup$mo[which(AMR_env$MO_lookup$status == "synonym")], + paste0( + strrep(" ", nchar(x[i, ]$original_input) + 6), + ifelse(x[i, ]$keep_synonyms == FALSE, + # Add note if result was coerced to accepted taxonomic name + font_red(paste0("This outdated taxonomic name was converted to ", font_italic(AMR_env$MO_lookup$fullname[match(synonym_mo_to_accepted_mo(x[i, ]$mo), AMR_env$MO_lookup$mo)], collapse = NULL), " (", synonym_mo_to_accepted_mo(x[i, ]$mo), ")."), collapse = NULL), + # Or add note if result is currently another taxonomic name + font_red(paste0(font_bold("Note: "), "The current name is ", font_italic(AMR_env$MO_lookup$fullname[match(synonym_mo_to_accepted_mo(x[i, ]$mo), AMR_env$MO_lookup$mo)], collapse = NULL), " (", AMR_env$MO_lookup$ref[match(synonym_mo_to_accepted_mo(x[i, ]$mo), AMR_env$MO_lookup$mo)], ")."), collapse = NULL) + ) + ), + "" + ), + candidates, + sep = "\n" + ) + txt <- gsub("[\n]+", "\n", txt) + # remove first and last break + txt <- gsub("(^[\n]|[\n]$)", "", txt) + txt <- paste0("\n", txt, "\n") + } + + cat(txt) + if (isTRUE(any_maxed_out)) { + cat(font_blue(word_wrap("\nOnly the first ", n, " other matches of each record are shown. Run `print(mo_uncertainties(), n = ...)` to view more entries, or save `mo_uncertainties()` to an object."))) + } + if (isTRUE(more_than_50)) { + cat(font_blue(word_wrap("\nOnly the first 50 uncertainties are shown. Run `View(mo_uncertainties())` to view all entries, or save `mo_uncertainties()` to an object."))) + } +} + +#' @method print mo_renamed +#' @export +#' @noRd +print.mo_renamed <- function(x, extra_txt = "", n = 25, ...) { + if (NROW(x) == 0) { + cat(word_wrap("No renamed taxonomy to show. Only renamed taxonomy of the last call of `as.mo()` or any `mo_*()` function are stored.\n", add_fn = font_blue)) + return(invisible(NULL)) + } + + x$ref_old[!is.na(x$ref_old)] <- paste0(" (", gsub("et al.", font_italic("et al."), x$ref_old[!is.na(x$ref_old)], fixed = TRUE), ")") + x$ref_new[!is.na(x$ref_new)] <- paste0(" (", gsub("et al.", font_italic("et al."), x$ref_new[!is.na(x$ref_new)], fixed = TRUE), ")") + x$ref_old[is.na(x$ref_old)] <- " (author unknown)" + x$ref_new[is.na(x$ref_new)] <- " (author unknown)" + + rows <- seq_len(min(NROW(x), n)) + + message_( + "The following microorganism", ifelse(NROW(x) > 1, "s were", " was"), " taxonomically renamed", extra_txt, ":\n", + paste0(" ", AMR_env$bullet_icon, " ", font_italic(x$old[rows], collapse = NULL), x$ref_old[rows], + " -> ", font_italic(x$new[rows], collapse = NULL), x$ref_new[rows], + collapse = "\n" + ), + ifelse(NROW(x) > n, paste0("\n\nOnly the first ", n, " (out of ", NROW(x), ") are shown. Run `print(mo_renamed(), n = ...)` to view more entries (might be slow), or save `mo_renamed()` to an object."), "") + ) +} + +# UNDOCUMENTED HELPER FUNCTIONS ------------------------------------------- + +convert_colloquial_input <- function(x) { + x.bak <- trimws2(x) + x <- trimws2(tolower(x)) + out <- rep(NA_character_, length(x)) + + # Streptococci, like GBS = Group B Streptococci (B_STRPT_GRPB) + out[x %like_case% "^g[abcdefghijkl]s$"] <- gsub("g([abcdefghijkl])s", + "B_STRPT_GRP\\U\\1", + x[x %like_case% "^g[abcdefghijkl]s$"], + perl = TRUE + ) + # Streptococci in different languages, like "estreptococos grupo B" + out[x %like_case% "strepto[ck]o[ck][a-zA-Z ]* [abcdefghijkl]$"] <- gsub(".*e?strepto[ck]o[ck].* ([abcdefghijkl])$", + "B_STRPT_GRP\\U\\1", + x[x %like_case% "strepto[ck]o[ck][a-zA-Z ]* [abcdefghijkl]$"], + perl = TRUE + ) + out[x %like_case% "strep[a-z]* group [abcdefghijkl]$"] <- gsub(".* ([abcdefghijkl])$", + "B_STRPT_GRP\\U\\1", + x[x %like_case% "strep[a-z]* group [abcdefghijkl]$"], + perl = TRUE + ) + out[x %like_case% "group [abcdefghijkl] strepto[ck]o[ck]"] <- gsub(".*group ([abcdefghijkl]) strepto[ck]o[ck].*", + "B_STRPT_GRP\\U\\1", + x[x %like_case% "group [abcdefghijkl] strepto[ck]o[ck]"], + perl = TRUE + ) + out[x %like_case% "ha?emoly.*strep"] <- "B_STRPT_HAEM" + out[x %like_case% "(strepto.* [abcg, ]{2,4}$)"] <- "B_STRPT_ABCG" + out[x %like_case% "(strepto.* mil+er+i|^mgs[^a-z]*$)"] <- "B_STRPT_MILL" + out[x %like_case% "mil+er+i gr"] <- "B_STRPT_MILL" + out[x %like_case% "((strepto|^s).* viridans|^vgs[^a-z]*$)"] <- "B_STRPT_VIRI" + out[x %like_case% "(viridans.* (strepto|^s).*|^vgs[^a-z]*$)"] <- "B_STRPT_VIRI" + + # Salmonella in different languages, like "Salmonella grupo B" + out[x %like_case% "salmonella.* [abcdefgh]$"] <- gsub(".*salmonella.* ([abcdefgh])$", + "B_SLMNL_GRP\\U\\1", + x[x %like_case% "salmonella.* [abcdefgh]$"], + perl = TRUE + ) + out[x %like_case% "group [abcdefgh] salmonella"] <- gsub(".*group ([abcdefgh]) salmonella*", + "B_SLMNL_GRP\\U\\1", + x[x %like_case% "group [abcdefgh] salmonella"], + perl = TRUE + ) + + # CoNS/CoPS in different languages (support for German, Dutch, Spanish, Portuguese) + out[x %like_case% "([ck]oagulas[ea].negatie?[vf]|^[ck]o?ns[^a-z]*$)"] <- "B_STPHY_CONS" + out[x %like_case% "([ck]oagulas[ea].positie?[vf]|^[ck]o?ps[^a-z]*$)"] <- "B_STPHY_COPS" + + # Gram stains + out[x %like_case% "gram[ -]?neg.*"] <- "B_GRAMN" + out[x %like_case% "( |^)gram[-]( |$)"] <- "B_GRAMN" + out[x %like_case% "gram[ -]?pos.*"] <- "B_GRAMP" + out[x %like_case% "( |^)gram[+]( |$)"] <- "B_GRAMP" + out[x %like_case% "anaerob[a-z]+ .*gram[ -]?neg.*"] <- "B_ANAER-NEG" + out[x %like_case% "anaerob[a-z]+ .*gram[ -]?pos.*"] <- "B_ANAER-POS" + out[is.na(out) & x %like_case% "anaerob[a-z]+ (micro)?.*organism"] <- "B_ANAER" + out[is.na(out) & x %like_case% "anaerob[a-z]+ bacter"] <- "B_ANAER" + + # coryneform bacteria + out[x %like_case% "^coryneform"] <- "B_CORYNF" + + # yeasts and fungi + out[x %like_case% "(^| )yeast?"] <- "F_YEAST" + out[x %like_case% "(^| )fung(us|i)"] <- "F_FUNGUS" + + # protozoa + out[x %like_case% "protozo"] <- "P_PROTOZOAN" # to hit it with most languages, and "protozo" does not occur in the microorganisms data set for anything else + + # trivial names known to the field + out[x %like_case% "meningo[ck]o[ck]"] <- "B_NESSR_MNNG" + out[x %like_case% "gono[ck]o[ck]"] <- "B_NESSR_GNRR" + out[x %like_case% "pneumo[ck]o[ck]"] <- "B_STRPT_PNMN" + out[x %like_case% "hacek"] <- "B_HACEK" + out[x %like_case% "haemophilus" & x %like_case% "aggregatibacter" & x %like_case% "cardiobacterium" & x %like_case% "eikenella" & x %like_case% "kingella"] <- "B_HACEK" + out[x %like_case% "slow.* grow.* mycobact"] <- "B_MYCBC_SGM" + out[x %like_case% "rapid.* grow.* mycobact"] <- "B_MYCBC_RGM" + + # unexisting names (con is the WHONET code for contamination) + out[x %in% c("con", "other", "none", "unknown") | x %like_case% "virus"] <- "UNKNOWN" + + # WHONET has a lot of E. coli and Vibrio cholerae names + out[x %like_case% "escherichia coli"] <- "B_ESCHR_COLI" + out[x %like_case% "vibrio cholerae"] <- "B_VIBRI_CHLR" + + out +} + +italicise <- function(x) { + if (!has_colour()) { + return(x) + } + out <- font_italic(x, collapse = NULL) + # city-like serovars of Salmonella (start with a capital) + out[x %like_case% "Salmonella [A-Z]"] <- paste( + font_italic("Salmonella"), + gsub("Salmonella ", "", x[x %like_case% "Salmonella [A-Z]"]) + ) + # streptococcal groups + out[x %like_case% "Streptococcus [A-Z]"] <- paste( + font_italic("Streptococcus"), + gsub("Streptococcus ", "", x[x %like_case% "Streptococcus [A-Z]"]) + ) + # be sure not to make these italic + out <- gsub("([ -]*)(Group|group|Complex|complex)(\033\\[23m)?", "\033[23m\\1\\2", out, perl = TRUE) + out <- gsub("(\033\\[3m)?(Beta[-]haemolytic|Coagulase[-](postive|negative)) ", "\\2 \033[3m", out, perl = TRUE) + out +} + +nr2char <- function(x) { + if (x %in% c(1:10)) { + v <- c( + "one" = 1, "two" = 2, "three" = 3, "four" = 4, "five" = 5, + "six" = 6, "seven" = 7, "eight" = 8, "nine" = 9, "ten" = 10 + ) + names(v[x]) + } else { + x + } +} + +parse_and_convert <- function(x) { + if (tryCatch(is.character(x) && all(Encoding(x) == "unknown", na.rm = TRUE), error = function(e) FALSE)) { + out <- x + } else { + out <- tryCatch( + { + if (!is.null(dim(x))) { + if (NCOL(x) > 2) { + stop("a maximum of two columns is allowed", call. = FALSE) + } else if (NCOL(x) == 2) { + # support Tidyverse selection like: df %>% select(colA, colB) + # paste these columns together + x <- as.data.frame(x, stringsAsFactors = FALSE) + colnames(x) <- c("A", "B") + x <- paste(x$A, x$B) + } else { + # support Tidyverse selection like: df %>% select(colA) + x <- as.data.frame(x, stringsAsFactors = FALSE)[[1]] + } + } + parsed <- iconv(as.character(x), to = "UTF-8") + parsed[is.na(parsed) & !is.na(x)] <- iconv(x[is.na(parsed) & !is.na(x)], from = "Latin1", to = "ASCII//TRANSLIT") + parsed <- gsub('"', "", parsed, fixed = TRUE) + parsed + }, + error = function(e) stop(e$message, call. = FALSE) + ) # this will also be thrown when running `as.mo(no_existing_object)` + } + out <- trimws2(out) + out <- gsub(" +", " ", out, perl = TRUE) + out <- gsub(" ?/ ? ", "/", out, perl = TRUE) + out +} + +replace_old_mo_codes <- function(x, property) { + # this function transform old MO codes to current codes, such as: + # B_ESCH_COL (AMR v0.5.0) -> B_ESCHR_COLI + ind <- x %like_case% "^[A-Z]_[A-Z_]+$" & !x %in% AMR_env$MO_lookup$mo + if (any(ind, na.rm = TRUE)) { + add_MO_lookup_to_AMR_env() + # get the ones that match + affected <- x[ind] + affected_unique <- unique(affected) + all_direct_matches <- TRUE + # find their new codes, once per code + solved_unique <- unlist(lapply( + strsplit(affected_unique, ""), + function(m) { + kingdom <- paste0("^", m[1]) + name <- m[3:length(m)] + name[name == "_"] <- " " + name <- tolower(paste0(name, ".*", collapse = "")) + name <- gsub(" .*", " ", name, fixed = TRUE) + name <- paste0("^", name) + results <- AMR_env$MO_lookup$mo[AMR_env$MO_lookup$kingdom %like_case% kingdom & + AMR_env$MO_lookup$fullname_lower %like_case% name] + if (length(results) > 1) { + all_direct_matches <<- FALSE + } + results[1L] + } + ), use.names = FALSE) + solved <- solved_unique[match(affected, affected_unique)] + # assign on places where a match was found + x[ind] <- solved + n_matched <- length(affected[!is.na(affected)]) + n_solved <- length(affected[!is.na(solved)]) + n_unsolved <- length(affected[is.na(solved)]) + n_unique <- length(affected_unique[!is.na(affected_unique)]) + if (n_unique < n_matched) { + n_unique <- paste0(n_unique, " unique, ") + } else { + n_unique <- "" + } + if (property != "mo") { + warning_( + "in `mo_", property, "()`: the input contained ", n_matched, + " old MO code", ifelse(n_matched == 1, "", "s"), + " (", n_unique, "from a previous AMR package version). ", + "Please update your MO codes with `as.mo()` to increase speed." + ) + } else { + warning_( + "in `as.mo()`: the input contained ", n_matched, + " old MO code", ifelse(n_matched == 1, "", "s"), + " (", n_unique, "from a previous AMR package version). ", + n_solved, " old MO code", ifelse(n_solved == 1, "", "s"), + ifelse(n_solved == 1, " was", " were"), + ifelse(all_direct_matches, " updated ", font_bold(" guessed ")), + "to ", ifelse(n_solved == 1, "a ", ""), + "currently used MO code", ifelse(n_solved == 1, "", "s"), + ifelse(n_unsolved > 0, + paste0(" and ", n_unsolved, " old MO code", ifelse(n_unsolved == 1, "", "s"), " could not be updated."), + "." + ) + ) + } + } + x +} + +replace_ignore_pattern <- function(x, ignore_pattern) { + if (!is.null(ignore_pattern) && !identical(trimws2(ignore_pattern), "")) { + ignore_cases <- x %like% ignore_pattern + if (sum(ignore_cases) > 0) { + message_( + "The following input was ignored by `ignore_pattern = \"", ignore_pattern, "\"`: ", + vector_and(x[ignore_cases], quotes = TRUE) + ) + x[ignore_cases] <- NA_character_ + } + } + x +} + +repair_reference_df <- function(reference_df) { + if (is.null(reference_df)) { + return(NULL) + } + # has valid own reference_df + reference_df <- reference_df %pm>% + pm_filter(!is.na(mo)) + + # keep only first two columns, second must be mo + if (colnames(reference_df)[1] == "mo") { + reference_df <- reference_df %pm>% pm_select(2, "mo") + } else { + reference_df <- reference_df %pm>% pm_select(1, "mo") + } + + # remove factors, just keep characters + colnames(reference_df)[1] <- "x" + reference_df[, "x"] <- as.character(reference_df[, "x", drop = TRUE]) + reference_df[, "mo"] <- as.character(reference_df[, "mo", drop = TRUE]) + + # some MO codes might be old + reference_df[, "mo"] <- as.mo(reference_df[, "mo", drop = TRUE], reference_df = NULL) + reference_df +} + +get_mo_uncertainties <- function() { + remember <- list(uncertainties = AMR_env$mo_uncertainties, + failures = AMR_env$mo_failures) + # empty them, otherwise e.g. mo_shortname("Chlamydophila psittaci") will give 3 notes + AMR_env$mo_uncertainties <- NULL + AMR_env$mo_failures <- NULL + remember +} + +load_mo_uncertainties <- function(metadata) { + AMR_env$mo_uncertainties <- metadata$uncertainties + AMR_env$mo_failures <- metadata$failures +} + +synonym_mo_to_accepted_mo <- function(x, fill_in_accepted = FALSE, dataset = AMR_env$MO_lookup) { + # `dataset` is an argument so that it can be used in the regeneration of the microorganisms data set + if (identical(dataset, AMR_env$MO_lookup)) { + add_MO_lookup_to_AMR_env() + dataset <- AMR_env$MO_lookup + } + + out <- x + is_still_synonym <- dataset$status[match(out, dataset$mo)] == "synonym" + limit <- 0 + while(any(is_still_synonym, na.rm = TRUE) && limit < 5) { + limit <- limit + 1 + + # make sure to get the latest name, e.g. Fusarium pulicaris robiniae was first renamed to Fusarium roseum, then to Fusarium sambucinum + # we need the MO of Fusarium pulicaris robiniae to return the MO of Fusarium sambucinum + must_be_corrected <- !is.na(is_still_synonym) & is_still_synonym + x_gbif <- dataset$gbif_renamed_to[match(out, dataset$mo)] + x_mycobank <- dataset$mycobank_renamed_to[match(out, dataset$mo)] + x_lpsn <- dataset$lpsn_renamed_to[match(out, dataset$mo)] + + out[must_be_corrected & !is.na(x_gbif)] <- dataset$mo[match(x_gbif[must_be_corrected & !is.na(x_gbif)], dataset$gbif)] + out[must_be_corrected & !is.na(x_mycobank)] <- dataset$mo[match(x_mycobank[must_be_corrected & !is.na(x_mycobank)], dataset$mycobank)] + out[must_be_corrected & !is.na(x_lpsn)] <- dataset$mo[match(x_lpsn[must_be_corrected & !is.na(x_lpsn)], dataset$lpsn)] + + is_still_synonym <- dataset$status[match(out, dataset$mo)] == "synonym" + } + + x_no_synonym <- dataset$status[match(x, dataset$mo)] != "synonym" + out[x_no_synonym] <- NA_character_ + if (isTRUE(fill_in_accepted)) { + out[!is.na(x_no_synonym) & x_no_synonym] <- x[!is.na(x_no_synonym) & x_no_synonym] + } + + out[is.na(match(x, dataset$mo))] <- NA_character_ + out +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mo_matching_score.R + + + + +#' Calculate the Matching Score for Microorganisms +#' +#' This algorithm is used by [as.mo()] and all the [`mo_*`][mo_property()] functions to determine the most probable match of taxonomic records based on user input. +#' @param x Any user input value(s) +#' @param n A full taxonomic name, that exists in [`microorganisms$fullname`][microorganisms] +#' @note This algorithm was originally developed in 2018 and subsequently described in: Berends MS *et al.* (2022). **AMR: An R Package for Working with Antimicrobial Resistance Data**. *Journal of Statistical Software*, 104(3), 1-31; \doi{10.18637/jss.v104.i03}. +#' +#' Later, the work of Bartlett A *et al.* about bacterial pathogens infecting humans (2022, \doi{10.1099/mic.0.001269}) was incorporated, and optimalisations to the algorithm were made. +#' @section Matching Score for Microorganisms: +#' With ambiguous user input in [as.mo()] and all the [`mo_*`][mo_property()] functions, the returned results are chosen based on their matching score using [mo_matching_score()]. This matching score \eqn{m}, is calculated as: +#' +#' \ifelse{latex}{\deqn{m_{(x, n)} = \frac{l_{n} - 0.5 \cdot \min \begin{cases}l_{n} \\ \textrm{lev}(x, n)\end{cases}}{l_{n} \cdot p_{n} \cdot k_{n}}}}{ +#' +#' \ifelse{html}{\figure{mo_matching_score.png}{options: width="300" alt="mo matching score"}}{m(x, n) = ( l_n * min(l_n, lev(x, n) ) ) / ( l_n * p_n * k_n )}} +#' +#' where: +#' +#' * \eqn{x} is the user input; +#' * \eqn{n} is a taxonomic name (genus, species, and subspecies); +#' * \eqn{l_n} is the length of \eqn{n}; +#' * \eqn{lev} is the [Levenshtein distance function](https://en.wikipedia.org/wiki/Levenshtein_distance) (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n}; +#' * \eqn{p_n} is the human pathogenic prevalence group of \eqn{n}, as described below; +#' * \eqn{k_n} is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 1.25, Protozoa = 1.5, Chromista = 1.75, Archaea = 2, others = 3. +#' +#' The grouping into human pathogenic prevalence \eqn{p} is based on recent work from Bartlett *et al.* (2022, \doi{10.1099/mic.0.001269}) who extensively studied medical-scientific literature to categorise all bacterial species into these groups: +#' +#' - **Established**, if a taxonomic species has infected at least three persons in three or more references. These records have `prevalence = 1.15` in the [microorganisms] data set; +#' - **Putative**, if a taxonomic species has fewer than three known cases. These records have `prevalence = 1.25` in the [microorganisms] data set. +#' +#' Furthermore, +#' +#' - Genera from the World Health Organization's (WHO) Priority Pathogen List have `prevalence = 1.0` in the [microorganisms] data set; +#' - Any genus present in the **established** list also has `prevalence = 1.15` in the [microorganisms] data set; +#' - Any other genus present in the **putative** list has `prevalence = 1.25` in the [microorganisms] data set; +#' - Any other species or subspecies of which the genus is present in the two aforementioned groups, has `prevalence = 1.5` in the [microorganisms] data set; +#' - Any *non-bacterial* genus, species or subspecies of which the genus is present in the following list, has `prevalence = 1.25` in the [microorganisms] data set: `r vector_or(MO_RELEVANT_GENERA, quotes = "*")`; +#' - All other records have `prevalence = 2.0` in the [microorganisms] data set. +#' +#' When calculating the matching score, all characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses. +#' +#' All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., `"E. coli"` will return the microbial ID of *Escherichia coli* (\eqn{m = `r round(mo_matching_score("E. coli", "Escherichia coli"), 3)`}, a highly prevalent microorganism found in humans) and not *Entamoeba coli* (\eqn{m = `r round(mo_matching_score("E. coli", "Entamoeba coli"), 3)`}, a less prevalent microorganism in humans), although the latter would alphabetically come first. +#' @export +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' mo_reset_session() +#' +#' as.mo("E. coli") +#' mo_uncertainties() +#' +#' mo_matching_score( +#' x = "E. coli", +#' n = c("Escherichia coli", "Entamoeba coli") +#' ) +mo_matching_score <- function(x, n) { + meet_criteria(x, allow_class = c("character", "data.frame", "list")) + meet_criteria(n, allow_class = "character") + + add_MO_lookup_to_AMR_env() + + x <- parse_and_convert(x) + # no dots and other non-whitespace characters + x <- gsub("[^a-zA-Z0-9 \\(\\)]+", "", x) + + # only keep one space + x <- gsub(" +", " ", x) + + # force a capital letter, so this conversion will not count as a substitution + substr(x, 1, 1) <- toupper(substr(x, 1, 1)) + + # n is always a taxonomically valid full name + if (length(n) == 1) { + n <- rep(n, length(x)) + } + if (length(x) == 1) { + x <- rep(x, length(n)) + } + + # length of fullname + l_n <- nchar(n) + lev <- double(length = length(x)) + l_n.lev <- double(length = length(x)) + # get Levenshtein distance + lev <- unlist(Map(f = function(a, b) { + as.double(utils::adist(a, b, + ignore.case = FALSE, + fixed = TRUE, + costs = c(insertions = 1, deletions = 2, substitutions = 2), + counts = FALSE + )) + }, x, n, USE.NAMES = FALSE)) + + l_n.lev[l_n < lev] <- l_n[l_n < lev] + l_n.lev[lev < l_n] <- lev[lev < l_n] + l_n.lev[lev == l_n] <- lev[lev == l_n] + + # human pathogenic prevalence (1 to 3), see ?as.mo + p_n <- AMR_env$MO_lookup[match(n, AMR_env$MO_lookup$fullname), "prevalence", drop = TRUE] + # kingdom index (Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5) + k_n <- AMR_env$MO_lookup[match(n, AMR_env$MO_lookup$fullname), "kingdom_index", drop = TRUE] + + # matching score: + (l_n - 0.5 * l_n.lev) / (l_n * p_n * k_n) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mo_property.R + + + + +#' Get Properties of a Microorganism +#' +#' Use these functions to return a specific property of a microorganism based on the latest accepted taxonomy. All input values will be evaluated internally with [as.mo()], which makes it possible to use microbial abbreviations, codes and names as input. See *Examples*. +#' @param x any [character] (vector) that can be coerced to a valid microorganism code with [as.mo()]. Can be left blank for auto-guessing the column containing microorganism codes if used in a data set, see *Examples*. +#' @param property one of the column names of the [microorganisms] data set: `r vector_or(colnames(microorganisms), sort = FALSE, quotes = TRUE)`, or must be `"shortname"` +#' @inheritParams as.mo +#' @param ... other arguments passed on to [as.mo()], such as 'minimum_matching_score', 'ignore_pattern', and 'remove_from_input' +#' @param ab any (vector of) text that can be coerced to a valid antibiotic drug code with [as.ab()] +#' @param open browse the URL using [`browseURL()`][utils::browseURL()] +#' @details All functions will, at default, **not** keep old taxonomic properties, as synonyms are automatically replaced with the current taxonomy. Take for example *Enterobacter aerogenes*, which was initially named in 1960 but renamed to *Klebsiella aerogenes* in 2017: +#' - `mo_genus("Enterobacter aerogenes")` will return `"Klebsiella"` (with a note about the renaming) +#' - `mo_genus("Enterobacter aerogenes", keep_synonyms = TRUE)` will return `"Enterobacter"` (with a once-per-session warning that the name is outdated) +#' - `mo_ref("Enterobacter aerogenes")` will return `"Tindall et al., 2017"` (with a note about the renaming) +#' - `mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)` will return `"Hormaeche et al., 1960"` (with a once-per-session warning that the name is outdated) +#' +#' The short name ([mo_shortname()]) returns the first character of the genus and the full species, such as `"E. coli"`, for species and subspecies. Exceptions are abbreviations of staphylococci (such as *"CoNS"*, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as *"GBS"*, Group B Streptococci). Please bear in mind that e.g. *E. coli* could mean *Escherichia coli* (kingdom of Bacteria) as well as *Entamoeba coli* (kingdom of Protozoa). Returning to the full name will be done using [as.mo()] internally, giving priority to bacteria and human pathogens, i.e. `"E. coli"` will be considered *Escherichia coli*. As a result, `mo_fullname(mo_shortname("Entamoeba coli"))` returns `"Escherichia coli"`. +#' +#' Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions [mo_kingdom()] and [mo_domain()] return the exact same results. +#' +#' Determination of human pathogenicity ([mo_pathogenicity()]) is strongly based on Bartlett *et al.* (2022, \doi{10.1099/mic.0.001269}). This function returns a [factor] with the levels *Pathogenic*, *Potentially pathogenic*, *Non-pathogenic*, and *Unknown*. +#' +#' Determination of the Gram stain ([mo_gramstain()]) will be based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, [PMID 11837318](https://pubmed.ncbi.nlm.nih.gov/11837318/)), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, [PMID 34694987](https://pubmed.ncbi.nlm.nih.gov/34694987/)). Bacteria in these phyla are considered Gram-positive in this `AMR` package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value `NA`. Functions [mo_is_gram_negative()] and [mo_is_gram_positive()] always return `TRUE` or `FALSE` (or `NA` when the input is `NA` or the MO code is `UNKNOWN`), thus always return `FALSE` for species outside the taxonomic kingdom of Bacteria. +#' +#' Determination of yeasts ([mo_is_yeast()]) will be based on the taxonomic kingdom and class. *Budding yeasts* are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. *True yeasts* quite specifically refers to yeasts in the underlying order Saccharomycetales (such as *Saccharomyces cerevisiae*). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return `TRUE`. It returns `FALSE` otherwise (or `NA` when the input is `NA` or the MO code is `UNKNOWN`). +#' +#' Determination of intrinsic resistance ([mo_is_intrinsic_resistant()]) will be based on the [intrinsic_resistant] data set, which is based on `r format_eucast_version_nr(3.3)`. The [mo_is_intrinsic_resistant()] function can be vectorised over both argument `x` (input for microorganisms) and `ab` (input for antibiotics). +#' +#' Determination of bacterial oxygen tolerance ([mo_oxygen_tolerance()]) will be based on BacDive, see *Source*. The function [mo_is_anaerobic()] only returns `TRUE` if the oxygen tolerance is `"anaerobe"`, indicting an obligate anaerobic species or genus. It always returns `FALSE` for species outside the taxonomic kingdom of Bacteria. +#' +#' The function [mo_url()] will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. [This MycoBank URL](`r TAXONOMY_VERSION$MycoBank$url`) will be used for fungi wherever available , [this LPSN URL](`r TAXONOMY_VERSION$MycoBank$url`) for bacteria wherever available, and [this GBIF link](`r TAXONOMY_VERSION$GBIF$url`) otherwise. +#' +#' SNOMED codes ([mo_snomed()]) was last updated on `r documentation_date(TAXONOMY_VERSION$SNOMED$accessed_date)`. See *Source* and the [microorganisms] data set for more info. +#' +#' Old taxonomic names (so-called 'synonyms') can be retrieved with [mo_synonyms()] (which will have the scientific reference as [name][base::names()]), the current taxonomic name can be retrieved with [mo_current()]. Both functions return full names. +#' +#' All output [will be translated][translate] where possible. +#' @section Matching Score for Microorganisms: +#' This function uses [as.mo()] internally, which uses an advanced algorithm to translate arbitrary user input to valid taxonomy using a so-called matching score. You can read about this public algorithm on the [MO matching score page][mo_matching_score()]. +#' @inheritSection as.mo Source +#' @rdname mo_property +#' @name mo_property +#' @return +#' - An [integer] in case of [mo_year()] +#' - An [ordered factor][factor] in case of [mo_pathogenicity()] +#' - A [list] in case of [mo_taxonomy()], [mo_synonyms()], [mo_snomed()], and [mo_info()] +#' - A [logical] in case of [mo_is_anaerobic()], [mo_is_gram_negative()], [mo_is_gram_positive()], [mo_is_intrinsic_resistant()], and [mo_is_yeast()] +#' - A named [character] in case of [mo_synonyms()] and [mo_url()] +#' - A [character] in all other cases +#' @export +#' @seealso Data set [microorganisms] +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' # taxonomic tree ----------------------------------------------------------- +#' +#' mo_kingdom("Klebsiella pneumoniae") +#' mo_phylum("Klebsiella pneumoniae") +#' mo_class("Klebsiella pneumoniae") +#' mo_order("Klebsiella pneumoniae") +#' mo_family("Klebsiella pneumoniae") +#' mo_genus("Klebsiella pneumoniae") +#' mo_species("Klebsiella pneumoniae") +#' mo_subspecies("Klebsiella pneumoniae") +#' +#' +#' # full names and short names ----------------------------------------------- +#' +#' mo_name("Klebsiella pneumoniae") +#' mo_fullname("Klebsiella pneumoniae") +#' mo_shortname("Klebsiella pneumoniae") +#' +#' +#' # other properties --------------------------------------------------------- +#' +#' mo_pathogenicity("Klebsiella pneumoniae") +#' mo_gramstain("Klebsiella pneumoniae") +#' mo_snomed("Klebsiella pneumoniae") +#' mo_type("Klebsiella pneumoniae") +#' mo_rank("Klebsiella pneumoniae") +#' mo_url("Klebsiella pneumoniae") +#' mo_is_yeast(c("Candida", "Trichophyton", "Klebsiella")) +#' +#' mo_group_members(c("Streptococcus group A", +#' "Streptococcus group C", +#' "Streptococcus group G", +#' "Streptococcus group L")) +#' +#' +#' # scientific reference ----------------------------------------------------- +#' +#' mo_ref("Klebsiella aerogenes") +#' mo_authors("Klebsiella aerogenes") +#' mo_year("Klebsiella aerogenes") +#' mo_synonyms("Klebsiella aerogenes") +#' mo_lpsn("Klebsiella aerogenes") +#' mo_gbif("Klebsiella aerogenes") +#' mo_mycobank("Candida albicans") +#' mo_mycobank("Candida krusei") +#' mo_mycobank("Candida krusei", keep_synonyms = TRUE) +#' +#' +#' # abbreviations known in the field ----------------------------------------- +#' +#' mo_genus("MRSA") +#' mo_species("MRSA") +#' mo_shortname("VISA") +#' mo_gramstain("VISA") +#' +#' mo_genus("EHEC") +#' mo_species("EIEC") +#' mo_name("UPEC") +#' +#' +#' # known subspecies --------------------------------------------------------- +#' +#' mo_fullname("K. pneu rh") +#' mo_shortname("K. pneu rh") +#' +#' \donttest{ +#' # Becker classification, see ?as.mo ---------------------------------------- +#' +#' mo_fullname("Staph epidermidis") +#' mo_fullname("Staph epidermidis", Becker = TRUE) +#' mo_shortname("Staph epidermidis") +#' mo_shortname("Staph epidermidis", Becker = TRUE) +#' +#' +#' # Lancefield classification, see ?as.mo ------------------------------------ +#' +#' mo_fullname("Strep agalactiae") +#' mo_fullname("Strep agalactiae", Lancefield = TRUE) +#' mo_shortname("Strep agalactiae") +#' mo_shortname("Strep agalactiae", Lancefield = TRUE) +#' +#' +#' # language support -------------------------------------------------------- +#' +#' mo_gramstain("Klebsiella pneumoniae", language = "de") # German +#' mo_gramstain("Klebsiella pneumoniae", language = "nl") # Dutch +#' mo_gramstain("Klebsiella pneumoniae", language = "es") # Spanish +#' mo_gramstain("Klebsiella pneumoniae", language = "el") # Greek +#' mo_gramstain("Klebsiella pneumoniae", language = "uk") # Ukrainian +#' +#' # mo_type is equal to mo_kingdom, but mo_kingdom will remain untranslated +#' mo_kingdom("Klebsiella pneumoniae") +#' mo_type("Klebsiella pneumoniae") +#' mo_kingdom("Klebsiella pneumoniae", language = "zh") # Chinese, no effect +#' mo_type("Klebsiella pneumoniae", language = "zh") # Chinese, translated +#' +#' mo_fullname("S. pyogenes", Lancefield = TRUE, language = "de") +#' mo_fullname("S. pyogenes", Lancefield = TRUE, language = "uk") +#' +#' +#' # other -------------------------------------------------------------------- +#' +#' # gram stains and intrinsic resistance can be used as a filter in dplyr verbs +#' if (require("dplyr")) { +#' example_isolates %>% +#' filter(mo_is_gram_positive()) %>% +#' count(mo_genus(), sort = TRUE) +#' } +#' if (require("dplyr")) { +#' example_isolates %>% +#' filter(mo_is_intrinsic_resistant(ab = "vanco")) %>% +#' count(mo_genus(), sort = TRUE) +#' } +#' +#' # get a list with the complete taxonomy (from kingdom to subspecies) +#' mo_taxonomy("Klebsiella pneumoniae") +#' +#' # get a list with the taxonomy, the authors, Gram-stain, +#' # SNOMED codes, and URL to the online database +#' mo_info("Klebsiella pneumoniae") +#' } +mo_name <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_name") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "fullname", language = language, keep_synonyms = keep_synonyms, ...), + language = language, + only_unknown = FALSE, + only_affect_mo_names = TRUE + ) +} + +#' @rdname mo_property +#' @export +mo_fullname <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_fullname") + } + mo_name(x = x, language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_shortname <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_shortname") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + + metadata <- get_mo_uncertainties() + + replace_empty <- function(x) { + x[x == ""] <- "spp." + x + } + + # get first char of genus and complete species in English + genera <- mo_genus(x.mo, language = NULL, keep_synonyms = keep_synonyms) + shortnames <- paste0(substr(genera, 1, 1), ". ", replace_empty(mo_species(x.mo, language = NULL, keep_synonyms = keep_synonyms))) + + # exceptions for where no species is known + shortnames[shortnames %like% ".[.] spp[.]"] <- genera[shortnames %like% ".[.] spp[.]"] + # exceptions for staphylococci + shortnames[shortnames == "S. coagulase-negative"] <- "CoNS" + shortnames[shortnames == "S. coagulase-positive"] <- "CoPS" + # exceptions for streptococci: Group A Streptococcus -> GAS + shortnames[shortnames %like_case% "S. Group [ABCDFGHK]"] <- paste0("G", gsub("S. Group ([ABCDFGHK])", "\\1", shortnames[shortnames %like_case% "S. Group [ABCDFGHK]"], perl = TRUE), "S") + # unknown species etc. + shortnames[shortnames %like% "unknown"] <- paste0("(", trimws2(gsub("[^a-zA-Z -]", "", shortnames[shortnames %like% "unknown"], perl = TRUE)), ")") + + shortnames[mo_rank(x.mo) %in% c("kingdom", "phylum", "class", "order", "family")] <- mo_name(x.mo[mo_rank(x.mo) %in% c("kingdom", "phylum", "class", "order", "family")], language = NULL, keep_synonyms = keep_synonyms) + + shortnames[is.na(x.mo)] <- NA_character_ + load_mo_uncertainties(metadata) + translate_into_language(shortnames, language = language, only_unknown = FALSE, only_affect_mo_names = TRUE) +} + + + +#' @rdname mo_property +#' @export +mo_subspecies <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_subspecies") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "subspecies", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_species <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_species") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "species", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_genus <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_genus") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "genus", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_family <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_family") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "family", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_order <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_order") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "order", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_class <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_class") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "class", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_phylum <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_phylum") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "phylum", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_kingdom <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_kingdom") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "kingdom", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_domain <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_domain") + } + mo_kingdom(x = x, language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_type <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_type") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + out <- mo_kingdom(x.mo, language = NULL, keep_synonyms = keep_synonyms) + out[which(mo_is_yeast(x.mo, keep_synonyms = keep_synonyms))] <- "Yeasts" + translate_into_language(out, language = language, only_unknown = FALSE) +} + +#' @rdname mo_property +#' @export +mo_status <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_status") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = "status", language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +#' @rdname mo_property +#' @export +mo_pathogenicity <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_pathogenicity") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + add_MO_lookup_to_AMR_env() + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + prev <- AMR_env$MO_lookup$prevalence[match(x.mo, AMR_env$MO_lookup$mo)] + kngd <- AMR_env$MO_lookup$kingdom[match(x.mo, AMR_env$MO_lookup$mo)] + rank <- AMR_env$MO_lookup$rank[match(x.mo, AMR_env$MO_lookup$mo)] + + out <- factor(case_when_AMR(prev <= 1.15 & kngd == "Bacteria" & rank != "genus" ~ "Pathogenic", + prev < 2 & kngd == "Fungi" ~ "Potentially pathogenic", + prev == 2 & kngd == "Bacteria" ~ "Non-pathogenic", + kngd == "Bacteria" ~ "Potentially pathogenic", + TRUE ~ "Unknown"), + levels = c("Pathogenic", "Potentially pathogenic", "Non-pathogenic", "Unknown"), + ordered = TRUE + ) + + load_mo_uncertainties(metadata) + out +} + +#' @rdname mo_property +#' @export +mo_gramstain <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_gramstain") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + x <- rep(NA_character_, length(x)) + # make all bacteria Gram negative + x[mo_kingdom(x.mo, language = NULL, keep_synonyms = keep_synonyms) == "Bacteria"] <- "Gram-negative" + # overwrite these 4 phyla with Gram-positives + # Source: https://itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=956097 (Cavalier-Smith, 2002) + x[(mo_phylum(x.mo, language = NULL, keep_synonyms = keep_synonyms) %in% c( + # no longer in use, does not hurt to keep here: + "Actinobacteria", + "Chloroflexi", + "Firmicutes", + "Tenericutes", + "Actinomycetota", # since 2021, old name was Actinobacteria + "Chloroflexota", # since 2021, old name was Chloroflexi + "Bacillota", # since 2021, old name was Firmicutes + "Mycoplasmatota" # since 2021, old name was Tenericutes + ) & + # but class Negativicutes (of phylum Bacillota) are Gram-negative! + mo_class(x.mo, language = NULL, keep_synonyms = keep_synonyms) != "Negativicutes") + # and of course our own ID for Gram-positives + | x.mo %in% c("B_GRAMP", "B_ANAER-POS")] <- "Gram-positive" + + load_mo_uncertainties(metadata) + translate_into_language(x, language = language, only_unknown = FALSE) +} + +#' @rdname mo_property +#' @export +mo_is_gram_negative <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_is_gram_negative") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + grams <- mo_gramstain(x.mo, language = NULL, keep_synonyms = keep_synonyms) + load_mo_uncertainties(metadata) + out <- grams == "Gram-negative" & !is.na(grams) + out[x.mo %in% c(NA_character_, "UNKNOWN")] <- NA + out +} + +#' @rdname mo_property +#' @export +mo_is_gram_positive <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_is_gram_positive") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + grams <- mo_gramstain(x.mo, language = NULL, keep_synonyms = keep_synonyms) + load_mo_uncertainties(metadata) + out <- grams == "Gram-positive" & !is.na(grams) + out[x.mo %in% c(NA_character_, "UNKNOWN")] <- NA + out +} + +#' @rdname mo_property +#' @export +mo_is_yeast <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_is_yeast") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + x.kingdom <- mo_kingdom(x.mo, language = NULL, keep_synonyms = keep_synonyms) + x.class <- mo_class(x.mo, language = NULL, keep_synonyms = keep_synonyms) + + load_mo_uncertainties(metadata) + + out <- x.mo == "F_YEAST" | (x.kingdom == "Fungi" & x.class %in% c("Saccharomycetes", "Pichiomycetes")) + out[x.mo %in% c(NA_character_, "UNKNOWN")] <- NA + out +} + +#' @rdname mo_property +#' @export +mo_is_intrinsic_resistant <- function(x, ab, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_is_intrinsic_resistant") + } + meet_criteria(x, allow_NA = TRUE) + meet_criteria(ab, allow_NA = FALSE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + ab <- as.ab(ab, language = NULL, flag_multiple_results = FALSE, info = FALSE) + + if (length(x) == 1 & length(ab) > 1) { + x <- rep(x, length(ab)) + } else if (length(ab) == 1 & length(x) > 1) { + ab <- rep(ab, length(x)) + } + if (length(x) != length(ab)) { + stop_("length of `x` and `ab` must be equal, or one of them must be of length 1.") + } + + # show used version number once per session (AMR_env will reload every session) + if (message_not_thrown_before("mo_is_intrinsic_resistant", "version.mo", entire_session = TRUE)) { + message_( + "Determining intrinsic resistance based on ", + format_eucast_version_nr(3.3, markdown = FALSE), ". ", + font_red("This note will be shown once per session.") + ) + } + + # runs against internal vector: intrinsic_resistant (see zzz.R) + add_intrinsic_resistance_to_AMR_env() + paste(x, ab) %in% AMR_env$intrinsic_resistant +} + +#' @rdname mo_property +#' @export +mo_oxygen_tolerance <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_oxygen_tolerance") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "oxygen_tolerance", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_is_anaerobic <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_is_anaerobic") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + oxygen <- mo_oxygen_tolerance(x.mo, language = NULL, keep_synonyms = keep_synonyms) + load_mo_uncertainties(metadata) + out <- oxygen == "anaerobe" & !is.na(oxygen) + out[x.mo %in% c(NA_character_, "UNKNOWN")] <- NA + out +} + +#' @rdname mo_property +#' @export +mo_snomed <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_snomed") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "snomed", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_ref <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_ref") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "ref", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_authors <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_authors") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x <- mo_validate(x = x, property = "ref", language = language, keep_synonyms = keep_synonyms, ...) + # remove last 4 digits and presumably the comma and space that preceed them + x[!is.na(x)] <- gsub(",? ?[0-9]{4}", "", x[!is.na(x)], perl = TRUE) + suppressWarnings(x) +} + +#' @rdname mo_property +#' @export +mo_year <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_year") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x <- mo_validate(x = x, property = "ref", language = language, keep_synonyms = keep_synonyms, ...) + # get last 4 digits + x[!is.na(x)] <- gsub(".*([0-9]{4})$", "\\1", x[!is.na(x)], perl = TRUE) + suppressWarnings(as.integer(x)) +} + +#' @rdname mo_property +#' @export +mo_lpsn <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_lpsn") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "lpsn", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_mycobank <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_mycobank") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "mycobank", language = language, keep_synonyms = keep_synonyms, ...) +} + + +#' @rdname mo_property +#' @export +mo_gbif <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_gbif") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "gbif", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_rank <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_rank") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + mo_validate(x = x, property = "rank", language = language, keep_synonyms = keep_synonyms, ...) +} + +#' @rdname mo_property +#' @export +mo_taxonomy <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_taxonomy") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + out <- list( + kingdom = mo_kingdom(x, language = language, keep_synonyms = keep_synonyms), + phylum = mo_phylum(x, language = language, keep_synonyms = keep_synonyms), + class = mo_class(x, language = language, keep_synonyms = keep_synonyms), + order = mo_order(x, language = language, keep_synonyms = keep_synonyms), + family = mo_family(x, language = language, keep_synonyms = keep_synonyms), + genus = mo_genus(x, language = language, keep_synonyms = keep_synonyms), + species = mo_species(x, language = language, keep_synonyms = keep_synonyms), + subspecies = mo_subspecies(x, language = language, keep_synonyms = keep_synonyms) + ) + + load_mo_uncertainties(metadata) + out +} + +#' @rdname mo_property +#' @export +mo_synonyms <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_synonyms") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + add_MO_lookup_to_AMR_env() + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + syns <- lapply(x.mo, function(y) { + gbif <- AMR_env$MO_lookup$gbif[match(y, AMR_env$MO_lookup$mo)] + lpsn <- AMR_env$MO_lookup$lpsn[match(y, AMR_env$MO_lookup$mo)] + fullname <- AMR_env$MO_lookup[which(AMR_env$MO_lookup$lpsn_renamed_to == lpsn | AMR_env$MO_lookup$gbif_renamed_to == gbif), "fullname", drop = TRUE] + if (length(fullname) == 0) { + NULL + } else { + ref <- AMR_env$MO_lookup[which(AMR_env$MO_lookup$lpsn_renamed_to == lpsn | AMR_env$MO_lookup$gbif_renamed_to == gbif), "ref", drop = TRUE] + names(fullname) <- ref + fullname + } + }) + + if (length(syns) == 1) { + syns <- unlist(syns) + } + + load_mo_uncertainties(metadata) + syns +} + +#' @rdname mo_property +#' @export +mo_current <- function(x, language = get_AMR_locale(), ...) { + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + x.mo <- suppressWarnings(as.mo(x, keep_synonyms = TRUE, info = FALSE, ...)) + out <- synonym_mo_to_accepted_mo(x.mo, fill_in_accepted = TRUE) + mo_name(out, language = language) +} + + +#' @rdname mo_property +#' @export +mo_group_members <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_synonyms") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + add_MO_lookup_to_AMR_env() + + x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + members <- lapply(x.mo, function(y) { + AMR::microorganisms.groups$mo_name[which(AMR::microorganisms.groups$mo_group == y)] + }) + names(members) <- mo_name(x, keep_synonyms = TRUE, language = language) + + if (length(members) == 1) { + members <- unname(unlist(members)) + } + + load_mo_uncertainties(metadata) + members +} + + +#' @rdname mo_property +#' @export +mo_info <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_info") + } + meet_criteria(x, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + x <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + metadata <- get_mo_uncertainties() + + info <- lapply(x, function(y) { + c( + list(mo = as.character(y), + rank = mo_rank(y, language = language, keep_synonyms = keep_synonyms)), + mo_taxonomy(y, language = language, keep_synonyms = keep_synonyms), + list( + status = mo_status(y, language = language, keep_synonyms = keep_synonyms), + synonyms = mo_synonyms(y, keep_synonyms = keep_synonyms), + gramstain = mo_gramstain(y, language = language, keep_synonyms = keep_synonyms), + oxygen_tolerance = mo_oxygen_tolerance(y, language = language, keep_synonyms = keep_synonyms), + url = unname(mo_url(y, open = FALSE, keep_synonyms = keep_synonyms)), + ref = mo_ref(y, keep_synonyms = keep_synonyms), + snomed = unlist(mo_snomed(y, keep_synonyms = keep_synonyms)), + lpsn = mo_lpsn(y, language = language, keep_synonyms = keep_synonyms), + mycobank = mo_mycobank(y, language = language, keep_synonyms = keep_synonyms), + gbif = mo_gbif(y, language = language, keep_synonyms = keep_synonyms), + group_members = mo_group_members(y, language = language, keep_synonyms = keep_synonyms) + ) + ) + }) + if (length(info) > 1) { + names(info) <- mo_name(x) + result <- info + } else { + result <- info[[1L]] + } + + load_mo_uncertainties(metadata) + result +} + +#' @rdname mo_property +#' @export +mo_url <- function(x, open = FALSE, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_url") + } + meet_criteria(x, allow_NA = TRUE) + meet_criteria(open, allow_class = "logical", has_length = 1) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + add_MO_lookup_to_AMR_env() + + x.mo <- as.mo(x = x, language = language, keep_synonyms = keep_synonyms, ... = ...) + metadata <- get_mo_uncertainties() + + x.rank <- AMR_env$MO_lookup$rank[match(x.mo, AMR_env$MO_lookup$mo)] + x.name <- AMR_env$MO_lookup$fullname[match(x.mo, AMR_env$MO_lookup$mo)] + + x.lpsn <- AMR_env$MO_lookup$lpsn[match(x.mo, AMR_env$MO_lookup$mo)] + x.mycobank <- AMR_env$MO_lookup$mycobank[match(x.mo, AMR_env$MO_lookup$mo)] + x.gbif <- AMR_env$MO_lookup$gbif[match(x.mo, AMR_env$MO_lookup$mo)] + + u <- character(length(x)) + u[!is.na(x.gbif)] <- paste0(TAXONOMY_VERSION$GBIF$url, "/species/", x.gbif[!is.na(x.gbif)]) + # overwrite with LPSN: + u[!is.na(x.lpsn)] <- paste0(TAXONOMY_VERSION$LPSN$url, "/", x.rank[!is.na(x.lpsn)], "/", gsub(" ", "-", tolower(x.name[!is.na(x.lpsn)]), fixed = TRUE)) + # overwrite with MycoBank (bacteria from LPSN will not be overwritten since MycoBank has no bacteria) + u[!is.na(x.mycobank)] <- paste0(TAXONOMY_VERSION$MycoBank$url, "/mb/", gsub(" ", "%20", tolower(x.mycobank[!is.na(x.mycobank)]), fixed = TRUE)) + + names(u) <- x.name + + if (isTRUE(open)) { + if (length(u) > 1) { + warning_("in `mo_url()`: only the first URL will be opened, as R's built-in function `browseURL()` only suports one string.") + } + utils::browseURL(u[1L]) + } + + load_mo_uncertainties(metadata) + u +} + + +#' @rdname mo_property +#' @export +mo_property <- function(x, property = "fullname", language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { + if (missing(x)) { + # this tries to find the data and an 'mo' column + x <- find_mo_col(fn = "mo_property") + } + meet_criteria(x, allow_NA = TRUE) + meet_criteria(property, allow_class = "character", has_length = 1, is_in = colnames(AMR::microorganisms)) + language <- validate_language(language) + meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) + + translate_into_language(mo_validate(x = x, property = property, language = language, keep_synonyms = keep_synonyms, ...), language = language, only_unknown = TRUE) +} + +mo_validate <- function(x, property, language, keep_synonyms = keep_synonyms, ...) { + add_MO_lookup_to_AMR_env() + + # try to catch an error when inputting an invalid argument + # so the 'call.' can be set to FALSE + tryCatch(x[1L] %in% unlist(AMR_env$MO_lookup[1, property, drop = TRUE]), + error = function(e) stop(e$message, call. = FALSE) + ) + + dots <- list(...) + Becker <- dots$Becker + if (is.null(Becker) || property %in% c("kingdom", "phylum", "class", "order", "family", "genus")) { + Becker <- FALSE + } + Lancefield <- dots$Lancefield + if (is.null(Lancefield) || property %in% c("kingdom", "phylum", "class", "order", "family", "genus")) { + Lancefield <- FALSE + } + has_Becker_or_Lancefield <- Becker %in% c(TRUE, "all") || Lancefield %in% c(TRUE, "all") + + if (isFALSE(has_Becker_or_Lancefield) && isTRUE(keep_synonyms) && all(x %in% c(AMR_env$MO_lookup$mo, NA))) { + # fastest way to get properties + if (property == "snomed") { + x <- lapply(x, function(y) unlist(AMR_env$MO_lookup$snomed[match(y, AMR_env$MO_lookup$mo)])) + } else { + x <- AMR_env$MO_lookup[[property]][match(x, AMR_env$MO_lookup$mo)] + } + + } else { + # get microorganisms data set, but remove synonyms if keep_synonyms is FALSE + mo_data_check <- AMR_env$MO_lookup[which(AMR_env$MO_lookup$status %in% if (isTRUE(keep_synonyms)) c("synonym", "accepted") else "accepted"), , drop = FALSE] + + if (all(x %in% c(mo_data_check$mo, NA)) && !has_Becker_or_Lancefield) { + # do nothing, just don't run the other if-else's + } else if (all(x %in% c(unlist(mo_data_check[[property]]), NA)) && !has_Becker_or_Lancefield) { + # no need to do anything, just return it + return(x) + } else { + # we need to get MO codes now + x <- replace_old_mo_codes(x, property = property) + x <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) + } + + # get property reeaaally fast using match() + if (property == "snomed") { + x <- lapply(x, function(y) unlist(AMR_env$MO_lookup$snomed[match(y, AMR_env$MO_lookup$mo)])) + } else { + x <- AMR_env$MO_lookup[[property]][match(x, AMR_env$MO_lookup$mo)] + } + } + + if (property == "mo") { + return(set_clean_class(x, new_class = c("mo", "character"))) + } else if (property == "snomed") { + return(x) + } else if (property == "prevalence") { + return(as.double(x)) + } else { + # everything else as character + return(as.character(x)) + } +} + +find_mo_col <- function(fn) { + # this function tries to find an mo column in the data the function was called in, + # which is useful when functions are used within dplyr verbs + df <- get_current_data(arg_name = "x", call = -3) # will return an error if not found + mo <- NULL + try( + { + mo <- suppressMessages(search_type_in_df(df, "mo")) + }, + silent = TRUE + ) + if (!is.null(df) && !is.null(mo) && is.data.frame(df)) { + if (message_not_thrown_before(fn = fn)) { + message_("Using column '", font_bold(mo), "' as input for `", fn, "()`") + } + return(df[, mo, drop = TRUE]) + } else { + stop_("argument `x` is missing and no column with info about microorganisms could be found.", call = -2) + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/mo_source.R + + + + +#' User-Defined Reference Data Set for Microorganisms +#' +#' @description These functions can be used to predefine your own reference to be used in [as.mo()] and consequently all [`mo_*`][mo_property()] functions (such as [mo_genus()] and [mo_gramstain()]). +#' +#' This is **the fastest way** to have your organisation (or analysis) specific codes picked up and translated by this package, since you don't have to bother about it again after setting it up once. +#' @param path location of your reference file, this can be any text file (comma-, tab- or pipe-separated) or an Excel file (see *Details*). Can also be `""`, `NULL` or `FALSE` to delete the reference file. +#' @param destination destination of the compressed data file - the default is the user's home directory. +#' @rdname mo_source +#' @name mo_source +#' @aliases set_mo_source get_mo_source +#' @details The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an \R object file (extension '.rds'). To use an Excel file, you will need to have the `readxl` package installed. +#' +#' [set_mo_source()] will check the file for validity: it must be a [data.frame], must have a column named `"mo"` which contains values from [`microorganisms$mo`][microorganisms] or [`microorganisms$fullname`][microorganisms] and must have a reference column with your own defined values. If all tests pass, [set_mo_source()] will read the file into \R and will ask to export it to `"~/mo_source.rds"`. The CRAN policy disallows packages to write to the file system, although '*exceptions may be allowed in interactive sessions if the package obtains confirmation from the user*'. For this reason, this function only works in interactive sessions so that the user can **specifically confirm and allow** that this file will be created. The destination of this file can be set with the `destination` argument and defaults to the user's home directory. It can also be set with the package option [`AMR_mo_source`][AMR-options], e.g. `options(AMR_mo_source = "my/location/file.rds")`. +#' +#' The created compressed data file `"mo_source.rds"` will be used at default for MO determination (function [as.mo()] and consequently all `mo_*` functions like [mo_genus()] and [mo_gramstain()]). The location and timestamp of the original file will be saved as an [attribute][base::attributes()] to the compressed data file. +#' +#' The function [get_mo_source()] will return the data set by reading `"mo_source.rds"` with [readRDS()]. If the original file has changed (by checking the location and timestamp of the original file), it will call [set_mo_source()] to update the data file automatically if used in an interactive session. +#' +#' Reading an Excel file (`.xlsx`) with only one row has a size of 8-9 kB. The compressed file created with [set_mo_source()] will then have a size of 0.1 kB and can be read by [get_mo_source()] in only a couple of microseconds (millionths of a second). +#' +#' @section How to Setup: +#' +#' Imagine this data on a sheet of an Excel file. The first column contains the organisation specific codes, the second column contains valid taxonomic names: +#' +#' ``` +#' | A | B | +#' --|--------------------|-----------------------| +#' 1 | Organisation XYZ | mo | +#' 2 | lab_mo_ecoli | Escherichia coli | +#' 3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +#' 4 | | | +#' ``` +#' +#' We save it as `"home/me/ourcodes.xlsx"`. Now we have to set it as a source: +#' +#' ``` +#' set_mo_source("home/me/ourcodes.xlsx") +#' #> NOTE: Created mo_source file '/Users/me/mo_source.rds' (0.3 kB) from +#' #> '/Users/me/Documents/ourcodes.xlsx' (9 kB), columns +#' #> "Organisation XYZ" and "mo" +#' ``` +#' +#' It has now created a file `"~/mo_source.rds"` with the contents of our Excel file. Only the first column with foreign values and the 'mo' column will be kept when creating the RDS file. +#' +#' And now we can use it in our functions: +#' +#' ``` +#' as.mo("lab_mo_ecoli") +#' #> Class 'mo' +#' #> [1] B_ESCHR_COLI +#' +#' mo_genus("lab_mo_kpneumoniae") +#' #> [1] "Klebsiella" +#' +#' # other input values still work too +#' as.mo(c("Escherichia coli", "E. coli", "lab_mo_ecoli")) +#' #> NOTE: Translation to one microorganism was guessed with uncertainty. +#' #> Use mo_uncertainties() to review it. +#' #> Class 'mo' +#' #> [1] B_ESCHR_COLI B_ESCHR_COLI B_ESCHR_COLI +#' ``` +#' +#' If we edit the Excel file by, let's say, adding row 4 like this: +#' +#' ``` +#' | A | B | +#' --|--------------------|-----------------------| +#' 1 | Organisation XYZ | mo | +#' 2 | lab_mo_ecoli | Escherichia coli | +#' 3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +#' 4 | lab_Staph_aureus | Staphylococcus aureus | +#' 5 | | | +#' ``` +#' +#' ...any new usage of an MO function in this package will update your data file: +#' +#' ``` +#' as.mo("lab_mo_ecoli") +#' #> NOTE: Updated mo_source file '/Users/me/mo_source.rds' (0.3 kB) from +#' #> '/Users/me/Documents/ourcodes.xlsx' (9 kB), columns +#' #> "Organisation XYZ" and "mo" +#' #> Class 'mo' +#' #> [1] B_ESCHR_COLI +#' +#' mo_genus("lab_Staph_aureus") +#' #> [1] "Staphylococcus" +#' ``` +#' +#' To delete the reference data file, just use `""`, `NULL` or `FALSE` as input for [set_mo_source()]: +#' +#' ``` +#' set_mo_source(NULL) +#' #> Removed mo_source file '/Users/me/mo_source.rds' +#' ``` +#' +#' If the original file (in the previous case an Excel file) is moved or deleted, the `mo_source.rds` file will be removed upon the next use of [as.mo()] or any [`mo_*`][mo_property()] function. +#' @export +set_mo_source <- function(path, destination = getOption("AMR_mo_source", "~/mo_source.rds")) { + stop_ifnot(interactive(), "this function can only be used in interactive mode, since it must ask for the user's permission to write a file to their file system.") + + meet_criteria(path, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(destination, allow_class = "character", has_length = 1) + stop_ifnot(destination %like% "[.]rds$", "the `destination` must be a file location with file extension .rds.") + mo_source_destination <- path.expand(destination) + + if (is.null(path) || path %in% c(FALSE, "")) { + AMR_env$mo_source <- NULL + if (file.exists(mo_source_destination)) { + unlink(mo_source_destination) + message_("Removed mo_source file '", font_bold(mo_source_destination), "'", + add_fn = font_red, + as_note = FALSE + ) + } + return(invisible()) + } + + stop_ifnot(file.exists(path), "file not found: ", path) + + df <- NULL + if (path %like% "[.]rds$") { + df <- readRDS_AMR(path) + } else if (path %like% "[.]xlsx?$") { + # is Excel file (old or new) + stop_ifnot_installed("readxl") + df <- readxl::read_excel(path) + } else if (path %like% "[.]tsv$") { + df <- utils::read.table(file = path, header = TRUE, sep = "\t", stringsAsFactors = FALSE) + } else if (path %like% "[.]csv$") { + df <- utils::read.table(file = path, header = TRUE, sep = ",", stringsAsFactors = FALSE) + } else { + # try comma first + try( + df <- utils::read.table(file = path, header = TRUE, sep = ",", stringsAsFactors = FALSE), + silent = TRUE + ) + if (!check_validity_mo_source(df, stop_on_error = FALSE)) { + # try tab + try( + df <- utils::read.table(file = path, header = TRUE, sep = "\t", stringsAsFactors = FALSE), + silent = TRUE + ) + } + if (!check_validity_mo_source(df, stop_on_error = FALSE)) { + # try pipe + try( + df <- utils::read.table(file = path, header = TRUE, sep = "|", stringsAsFactors = FALSE), + silent = TRUE + ) + } + } + + # check integrity + if (is.null(df)) { + stop_("the path '", path, "' could not be imported as a dataset.") + } + check_validity_mo_source(df) + + df <- subset(df, !is.na(mo)) + + # keep only first two columns, second must be mo + if (colnames(df)[1] == "mo") { + df <- df[, c(colnames(df)[2], "mo")] + } else { + df <- df[, c(colnames(df)[1], "mo")] + } + + df <- as.data.frame(df, stringAsFactors = FALSE) + df[, "mo"] <- as.mo(df[, "mo", drop = TRUE]) + + # success + if (file.exists(mo_source_destination)) { + action <- "Updated" + } else { + action <- "Created" + # only ask when file is created, not when it is updated + txt <- paste0( + word_wrap(paste0( + "This will write create the new file '", + mo_source_destination, + "', for which your permission is required." + )), + "\n\n", + word_wrap("Do you agree that this file will be created?") + ) + showQuestion <- import_fn("showQuestion", "rstudioapi", error_on_fail = FALSE) + if (!is.null(showQuestion)) { + q_continue <- showQuestion("Create new file", txt) + } else { + q_continue <- utils::menu(choices = c("OK", "Cancel"), graphics = FALSE, title = txt) + } + if (q_continue %in% c(FALSE, 2)) { + return(invisible()) + } + } + attr(df, "mo_source_location") <- path + attr(df, "mo_source_destination") <- mo_source_destination + attr(df, "mo_source_timestamp") <- file.mtime(path) + saveRDS(df, mo_source_destination) + AMR_env$mo_source <- df + message_( + action, " mo_source file '", font_bold(mo_source_destination), + "' (", formatted_filesize(mo_source_destination), + ") from '", font_bold(path), + "' (", formatted_filesize(path), + '), columns "', colnames(df)[1], '" and "', colnames(df)[2], '"' + ) +} + +#' @rdname mo_source +#' @export +get_mo_source <- function(destination = getOption("AMR_mo_source", "~/mo_source.rds")) { + if (!file.exists(path.expand(destination))) { + if (interactive()) { + # source file might have been deleted, so update reference + set_mo_source("") + } + return(NULL) + } + if (destination %unlike% "[.]rds$") { + current_ext <- regexpr("\\.([[:alnum:]]+)$", destination) + current_ext <- ifelse(current_ext > -1L, substring(destination, current_ext + 1L), "") + vowel <- ifelse(current_ext %like% "^[AEFHILMNORSX]", "n", "") + stop_("The AMR mo source must be an RDS file, not a", vowel, " ", toupper(current_ext), " file. If `\"", basename(destination), "\"` was meant as your input file, use `set_mo_source()` on this file. In any case, the option `AMR_mo_source` must be set to another path.") + } + if (is.null(AMR_env$mo_source)) { + AMR_env$mo_source <- readRDS_AMR(path.expand(destination)) + } + + old_time <- attributes(AMR_env$mo_source)$mo_source_timestamp + new_time <- file.mtime(attributes(AMR_env$mo_source)$mo_source_location) + if (interactive() && !identical(old_time, new_time)) { + # source file was updated, also update reference + set_mo_source(attributes(AMR_env$mo_source)$mo_source_location) + } + AMR_env$mo_source +} + +check_validity_mo_source <- function(x, refer_to_name = "`reference_df`", stop_on_error = TRUE) { + add_MO_lookup_to_AMR_env() + + if (paste(deparse(substitute(x)), collapse = "") == "get_mo_source()") { + return(TRUE) + } + if (is.null(AMR_env$mo_source) && (identical(x, get_mo_source()))) { + return(TRUE) + } + if (is.null(x)) { + if (stop_on_error == TRUE) { + stop_(refer_to_name, " cannot be NULL", call = FALSE) + } else { + return(FALSE) + } + } + if (!is.data.frame(x)) { + if (stop_on_error == TRUE) { + stop_(refer_to_name, " must be a data.frame", call = FALSE) + } else { + return(FALSE) + } + } + if (!"mo" %in% colnames(x)) { + if (stop_on_error == TRUE) { + stop_(refer_to_name, " must contain a column 'mo'", call = FALSE) + } else { + return(FALSE) + } + } + if (!all(x$mo %in% c("", AMR_env$MO_lookup$mo, AMR_env$MO_lookup$fullname), na.rm = TRUE)) { + if (stop_on_error == TRUE) { + invalid <- x[which(!x$mo %in% c("", AMR_env$MO_lookup$mo, AMR_env$MO_lookup$fullname)), , drop = FALSE] + if (nrow(invalid) > 1) { + plural <- "s" + } else { + plural <- "" + } + stop_("Value", plural, " ", vector_and(invalid[, 1, drop = TRUE], quotes = TRUE), + " found in ", tolower(refer_to_name), + ", but with invalid microorganism code", plural, " ", vector_and(invalid$mo, quotes = TRUE), + call = FALSE + ) + } else { + return(FALSE) + } + } + if (colnames(x)[1] != "mo" && nrow(x) > length(unique(x[, 1, drop = TRUE]))) { + if (stop_on_error == TRUE) { + stop_(refer_to_name, " contains duplicate values in column '", colnames(x)[1], "'", call = FALSE) + } else { + return(FALSE) + } + } + if (colnames(x)[2] != "mo" && nrow(x) > length(unique(x[, 2, drop = TRUE]))) { + if (stop_on_error == TRUE) { + stop_(refer_to_name, " contains duplicate values in column '", colnames(x)[2], "'", call = FALSE) + } else { + return(FALSE) + } + } + return(TRUE) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/pca.R + + + + +#' Principal Component Analysis (for AMR) +#' +#' Performs a principal component analysis (PCA) based on a data set with automatic determination for afterwards plotting the groups and labels, and automatic filtering on only suitable (i.e. non-empty and numeric) variables. +#' @param x a [data.frame] containing [numeric] columns +#' @param ... columns of `x` to be selected for PCA, can be unquoted since it supports quasiquotation. +#' @inheritParams stats::prcomp +#' @details The [pca()] function takes a [data.frame] as input and performs the actual PCA with the \R function [prcomp()]. +#' +#' The result of the [pca()] function is a [prcomp] object, with an additional attribute `non_numeric_cols` which is a vector with the column names of all columns that do not contain [numeric] values. These are probably the groups and labels, and will be used by [ggplot_pca()]. +#' @return An object of classes [pca] and [prcomp] +#' @importFrom stats prcomp +#' @export +#' @examples +#' # `example_isolates` is a data set available in the AMR package. +#' # See ?example_isolates. +#' +#' \donttest{ +#' if (require("dplyr")) { +#' # calculate the resistance per group first +#' resistance_data <- example_isolates %>% +#' group_by( +#' order = mo_order(mo), # group on anything, like order +#' genus = mo_genus(mo) +#' ) %>% # and genus as we do here; +#' filter(n() >= 30) %>% # filter on only 30 results per group +#' summarise_if(is.sir, resistance) # then get resistance of all drugs +#' +#' # now conduct PCA for certain antimicrobial drugs +#' pca_result <- resistance_data %>% +#' pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT) +#' +#' pca_result +#' summary(pca_result) +#' +#' # old base R plotting method: +#' biplot(pca_result) +#' # new ggplot2 plotting method using this package: +#' if (require("ggplot2")) { +#' ggplot_pca(pca_result) +#' +#' ggplot_pca(pca_result) + +#' scale_colour_viridis_d() + +#' labs(title = "Title here") +#' } +#' } +#' } +pca <- function(x, + ..., + retx = TRUE, + center = TRUE, + scale. = TRUE, + tol = NULL, + rank. = NULL) { + meet_criteria(x, allow_class = "data.frame") + meet_criteria(retx, allow_class = "logical", has_length = 1) + meet_criteria(center, allow_class = "logical", has_length = 1) + meet_criteria(scale., allow_class = "logical", has_length = 1) + meet_criteria(tol, allow_class = "numeric", has_length = 1, allow_NULL = TRUE) + meet_criteria(rank., allow_class = "numeric", has_length = 1, allow_NULL = TRUE) + + # unset data.table, tibble, etc. + # also removes groups made by dplyr::group_by + x <- as.data.frame(x, stringsAsFactors = FALSE) + x.bak <- x + + # defuse R expressions, this replaces rlang::enquos() + dots <- substitute(list(...)) + if (length(dots) > 1) { + new_list <- list(0) + for (i in seq_len(length(dots) - 1)) { + new_list[[i]] <- tryCatch(eval(dots[[i + 1]], envir = x), + error = function(e) stop(e$message, call. = FALSE) + ) + if (length(new_list[[i]]) == 1) { + if (is.character(new_list[[i]]) && new_list[[i]] %in% colnames(x)) { + # this is to support quoted variables: df %pm>% pca("mycol1", "mycol2") + new_list[[i]] <- x[, new_list[[i]]] + } else { + # remove item - it's an argument like `center` + new_list[[i]] <- NULL + } + } + } + + x <- as.data.frame(new_list, stringsAsFactors = FALSE) + if (any(vapply(FUN.VALUE = logical(1), x, function(y) !is.numeric(y)))) { + warning_("in `pca()`: be sure to first calculate the resistance (or susceptibility) of variables with antimicrobial test results, since PCA works with numeric variables only. See Examples in `?pca`.", call = FALSE) + } + + # set column names + tryCatch(colnames(x) <- as.character(dots)[2:length(dots)], + error = function(e) warning("column names could not be set") + ) + + # keep only numeric columns + x <- x[, vapply(FUN.VALUE = logical(1), x, function(y) is.numeric(y)), drop = FALSE] + # bind the data set with the non-numeric columns + x <- cbind(x.bak[, vapply(FUN.VALUE = logical(1), x.bak, function(y) !is.numeric(y) & !all(is.na(y))), drop = FALSE], x) + } + + x <- pm_ungroup(x) # would otherwise select the grouping vars + x <- x[rowSums(is.na(x)) == 0, ] # remove columns containing NAs + + pca_data <- x[, which(vapply(FUN.VALUE = logical(1), x, function(x) is.numeric(x))), drop = FALSE] + + message_( + "Columns selected for PCA: ", vector_and(font_bold(colnames(pca_data), collapse = NULL), quotes = TRUE), + ". Total observations available: ", nrow(pca_data), "." + ) + + if (getRversion() < "3.4.0") { + # stats::prcomp prior to 3.4.0 does not have the 'rank.' argument + pca_model <- prcomp(pca_data, retx = retx, center = center, scale. = scale., tol = tol) + } else { + pca_model <- prcomp(pca_data, retx = retx, center = center, scale. = scale., tol = tol, rank. = rank.) + } + groups <- x[, vapply(FUN.VALUE = logical(1), x, function(y) !is.numeric(y) & !all(is.na(y))), drop = FALSE] + rownames(groups) <- NULL + attr(pca_model, "non_numeric_cols") <- groups + class(pca_model) <- c("pca", class(pca_model)) + pca_model +} + +#' @method print pca +#' @export +#' @noRd +print.pca <- function(x, ...) { + a <- attributes(x)$non_numeric_cols + if (!is.null(a)) { + print_pca_group(a) + class(x) <- class(x)[class(x) != "pca"] + } + print(x, ...) +} + +#' @method summary pca +#' @export +#' @noRd +summary.pca <- function(object, ...) { + a <- attributes(object)$non_numeric_cols + if (!is.null(a)) { + print_pca_group(a) + class(object) <- class(object)[class(object) != "pca"] + } + summary(object, ...) +} + +print_pca_group <- function(a) { + grps <- sort(unique(a[, 1, drop = TRUE])) + cat("Groups (n=", length(grps), ", named as '", colnames(a)[1], "'):\n", sep = "") + print(grps) + cat("\n") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/plotting.R + + + + +#' Plotting for Classes `sir`, `mic` and `disk` +#' +#' @description +#' Functions to plot classes `sir`, `mic` and `disk`, with support for base \R and `ggplot2`. +#' +#' Especially the `scale_*_mic()` functions are relevant wrappers to plot MIC values for `ggplot2`. They allows custom MIC ranges and to plot intermediate log2 levels for missing MIC values. +#' @param x,object values created with [as.mic()], [as.disk()] or [as.sir()] (or their `random_*` variants, such as [random_mic()]) +#' @param mo any (vector of) text that can be coerced to a valid microorganism code with [as.mo()] +#' @param ab any (vector of) text that can be coerced to a valid antimicrobial drug code with [as.ab()] +#' @param guideline interpretation guideline to use - the default is the latest included EUCAST guideline, see *Details* +#' @param main,title title of the plot +#' @param xlab,ylab axis title +#' @param colours_SIR colours to use for filling in the bars, must be a vector of three values (in the order S, I and R). The default colours are colour-blind friendly. +#' @param language language to be used to translate 'Susceptible', 'Increased exposure'/'Intermediate' and 'Resistant' - the default is system language (see [get_AMR_locale()]) and can be overwritten by setting the package option [`AMR_locale`][AMR-options], e.g. `options(AMR_locale = "de")`, see [translate]. Use `language = NULL` or `language = ""` to prevent translation. +#' @param expand a [logical] to indicate whether the range on the x axis should be expanded between the lowest and highest value. For MIC values, intermediate values will be factors of 2 starting from the highest MIC value. For disk diameters, the whole diameter range will be filled. +#' @inheritParams as.sir +#' @details +#' The interpretation of "I" will be named "Increased exposure" for all EUCAST guidelines since 2019, and will be named "Intermediate" in all other cases. +#' +#' For interpreting MIC values as well as disk diffusion diameters, supported guidelines to be used as input for the `guideline` argument are: `r vector_and(AMR::clinical_breakpoints$guideline, quotes = TRUE, reverse = TRUE)`. +#' +#' Simply using `"CLSI"` or `"EUCAST"` as input will automatically select the latest version of that guideline. +#' @name plot +#' @rdname plot +#' @return The `autoplot()` functions return a [`ggplot`][ggplot2::ggplot()] model that is extendible with any `ggplot2` function. +#' +#' The `fortify()` functions return a [data.frame] as an extension for usage in the [ggplot2::ggplot()] function. +#' @param ... arguments passed on to methods +#' @examples +#' some_mic_values <- random_mic(size = 100) +#' some_disk_values <- random_disk(size = 100, mo = "Escherichia coli", ab = "cipro") +#' some_sir_values <- random_sir(50, prob_SIR = c(0.55, 0.05, 0.30)) +#' +#' plot(some_mic_values) +#' plot(some_disk_values) +#' plot(some_sir_values) +#' +#' # when providing the microorganism and antibiotic, colours will show interpretations: +#' plot(some_mic_values, mo = "S. aureus", ab = "ampicillin") +#' plot(some_disk_values, mo = "Escherichia coli", ab = "cipro") +#' plot(some_disk_values, mo = "Escherichia coli", ab = "cipro", language = "nl") +#' +#' +#' # Plotting using scale_x_mic() +#' \donttest{ +#' if (require("ggplot2")) { +#' mic_plot <- ggplot(data.frame(mics = as.mic(c(0.25, "<=4", 4, 8, 32, ">=32")), +#' counts = c(1, 1, 2, 2, 3, 3)), +#' aes(mics, counts)) + +#' geom_col() +#' mic_plot + +#' labs(title = "without scale_x_mic()") +#' } +#' if (require("ggplot2")) { +#' mic_plot + +#' scale_x_mic() + +#' labs(title = "with scale_x_mic()") +#' } +#' if (require("ggplot2")) { +#' mic_plot + +#' scale_x_mic(keep_operators = "all") + +#' labs(title = "with scale_x_mic() keeping all operators") +#' } +#' if (require("ggplot2")) { +#' mic_plot + +#' scale_x_mic(mic_range = c(1, 16)) + +#' labs(title = "with scale_x_mic() using a manual 'within' range") +#' } +#' if (require("ggplot2")) { +#' mic_plot + +#' scale_x_mic(mic_range = c(0.032, 256)) + +#' labs(title = "with scale_x_mic() using a manual 'outside' range") +#' } +#' +#' if (require("ggplot2")) { +#' autoplot(some_mic_values) +#' } +#' if (require("ggplot2")) { +#' autoplot(some_disk_values, mo = "Escherichia coli", ab = "cipro") +#' } +#' if (require("ggplot2")) { +#' autoplot(some_sir_values) +#' } +#' } +NULL + +#' @export +#' @inheritParams as.mic +#' @param drop a [logical] to remove intermediate MIC values, defaults to `FALSE` +#' @rdname plot +scale_x_mic <- function(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(drop, allow_class = "logical", has_length = 1) + scale <- ggplot2::scale_x_discrete(drop = drop, ...) + scale$transform <- function(x, keep_ops = keep_operators, mic_rng = mic_range) { + rescale_mic(x = x, keep_operators = keep_ops, mic_range = mic_rng, as.mic = FALSE) + } + scale +} + +#' @export +#' @inheritParams as.mic +#' @rdname plot +scale_y_mic <- function(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(drop, allow_class = "logical", has_length = 1) + scale <- ggplot2::scale_y_discrete(drop = drop, ...) + scale$transform <- function(x, keep_ops = keep_operators, mic_rng = mic_range) { + rescale_mic(x = x, keep_operators = keep_ops, mic_range = mic_rng, as.mic = FALSE) + } + scale +} + +#' @export +#' @inheritParams as.mic +#' @rdname plot +scale_colour_mic <- function(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(drop, allow_class = "logical", has_length = 1) + scale <- ggplot2::scale_colour_discrete(drop = drop, ...) + scale$transform <- function(x, keep_ops = keep_operators, mic_rng = mic_range) { + rescale_mic(x = x, keep_operators = keep_ops, mic_range = mic_rng, as.mic = FALSE) + } + scale +} + +#' @export +#' @inheritParams as.mic +#' @rdname plot +scale_fill_mic <- function(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(drop, allow_class = "logical", has_length = 1) + scale <- ggplot2::scale_fill_discrete(drop = drop, ...) + scale$transform <- function(x, keep_ops = keep_operators, mic_rng = mic_range) { + rescale_mic(x = x, keep_operators = keep_ops, mic_range = mic_rng, as.mic = FALSE) + } + scale +} + +#' @method plot mic +#' @importFrom graphics barplot axis mtext legend +#' @export +#' @rdname plot +plot.mic <- function(x, + mo = NULL, + ab = NULL, + guideline = "EUCAST", + main = deparse(substitute(x)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Minimum Inhibitory Concentration (mg/L)", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ...) { + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + x <- as.mic(x) # make sure that currently implemented MIC levels are used + + if (length(colours_SIR) == 1) { + colours_SIR <- rep(colours_SIR, 3) + } + main <- gsub(" +", " ", paste0(main, collapse = " ")) + + x <- plotrange_as_table(x, expand = expand) + cols_sub <- plot_colours_subtitle_guideline( + x = x, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + fn = as.mic, + language = language, + method = "MIC", + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + ... + ) + barplot(x, + col = cols_sub$cols, + main = main, + ylim = c(0, max(x) * ifelse(any(colours_SIR %in% cols_sub$cols), 1.1, 1)), + ylab = ylab, + xlab = xlab, + axes = FALSE + ) + axis(2, seq(0, max(x))) + if (!is.null(cols_sub$sub)) { + mtext(side = 3, line = 0.5, adj = 0.5, cex = 0.75, cols_sub$sub) + } + + if (any(colours_SIR %in% cols_sub$cols)) { + legend_txt <- character(0) + legend_col <- character(0) + if (any(cols_sub$cols == colours_SIR[1] & cols_sub$count > 0)) { + legend_txt <- c(legend_txt, "(S) Susceptible") + legend_col <- colours_SIR[1] + } + if (any(cols_sub$cols == colours_SIR[2] & cols_sub$count > 0)) { + legend_txt <- c(legend_txt, paste("(I)", plot_name_of_I(cols_sub$guideline))) + legend_col <- c(legend_col, colours_SIR[2]) + } + if (any(cols_sub$cols == colours_SIR[3] & cols_sub$count > 0)) { + legend_txt <- c(legend_txt, "(R) Resistant") + legend_col <- c(legend_col, colours_SIR[3]) + } + + legend("top", + x.intersp = 0.5, + legend = translate_into_language(legend_txt, language = language), + fill = legend_col, + horiz = TRUE, + cex = 0.75, + box.lwd = 0, + box.col = "#FFFFFF55", + bg = "#FFFFFF55" + ) + } +} + +#' @method barplot mic +#' @export +#' @noRd +barplot.mic <- function(height, + mo = NULL, + ab = NULL, + guideline = "EUCAST", + main = deparse(substitute(height)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Minimum Inhibitory Concentration (mg/L)", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + ...) { + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + main <- gsub(" +", " ", paste0(main, collapse = " ")) + + height <- as.mic(height) # make sure that currently implemented MIC levels are used + + plot( + x = height, + main = main, + ylab = ylab, + xlab = xlab, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + ... + ) +} + +#' @method autoplot mic +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +autoplot.mic <- function(object, + mo = NULL, + ab = NULL, + guideline = "EUCAST", + title = deparse(substitute(object)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Minimum Inhibitory Concentration (mg/L)", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(title, allow_class = "character", allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + if ("main" %in% names(list(...))) { + title <- list(...)$main + } + if (!is.null(title)) { + title <- gsub(" +", " ", paste0(title, collapse = " ")) + } + + object <- as.mic(object) # make sure that currently implemented MIC levels are used + x <- plotrange_as_table(object, expand = expand) + cols_sub <- plot_colours_subtitle_guideline( + x = x, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + fn = as.mic, + language = language, + method = "MIC", + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + ... + ) + df <- as.data.frame(x, stringsAsFactors = TRUE) + colnames(df) <- c("mic", "count") + df$cols <- cols_sub$cols + df$cols[df$cols == colours_SIR[1]] <- "(S) Susceptible" + df$cols[df$cols == colours_SIR[2]] <- paste("(I)", plot_name_of_I(cols_sub$guideline)) + df$cols[df$cols == colours_SIR[3]] <- "(R) Resistant" + df$cols <- factor(translate_into_language(df$cols, language = language), + levels = translate_into_language( + c( + "(S) Susceptible", + paste("(I)", plot_name_of_I(cols_sub$guideline)), + "(R) Resistant" + ), + language = language + ), + ordered = TRUE + ) + p <- ggplot2::ggplot(df) + + if (any(colours_SIR %in% cols_sub$cols)) { + vals <- c( + "(S) Susceptible" = colours_SIR[1], + "(SDD) Susceptible dose-dependent" = colours_SIR[2], + "(I) Susceptible, incr. exp." = colours_SIR[2], + "(I) Intermediate" = colours_SIR[2], + "(R) Resistant" = colours_SIR[3] + ) + names(vals) <- translate_into_language(names(vals), language = language) + p <- p + + ggplot2::geom_col(ggplot2::aes(x = mic, y = count, fill = cols)) + + # limits = force is needed because of a ggplot2 >= 3.3.4 bug (#4511) + ggplot2::scale_fill_manual( + values = vals, + name = NULL, + limits = force + ) + } else { + p <- p + + ggplot2::geom_col(ggplot2::aes(x = mic, y = count)) + } + + p + + ggplot2::labs(title = title, x = xlab, y = ylab, subtitle = cols_sub$sub) +} + +#' @method fortify mic +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +fortify.mic <- function(object, ...) { + object <- as.mic(object) # make sure that currently implemented MIC levels are used + stats::setNames( + as.data.frame(plotrange_as_table(object, expand = FALSE)), + c("x", "y") + ) +} + + +#' @method plot disk +#' @export +#' @importFrom graphics barplot axis mtext legend +#' @rdname plot +plot.disk <- function(x, + main = deparse(substitute(x)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Disk diffusion diameter (mm)", language = language), + mo = NULL, + ab = NULL, + guideline = "EUCAST", + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ...) { + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + if (length(colours_SIR) == 1) { + colours_SIR <- rep(colours_SIR, 3) + } + main <- gsub(" +", " ", paste0(main, collapse = " ")) + + x <- plotrange_as_table(x, expand = expand) + cols_sub <- plot_colours_subtitle_guideline( + x = x, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + fn = as.disk, + language = language, + method = "disk", + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + ... + ) + + barplot(x, + col = cols_sub$cols, + main = main, + ylim = c(0, max(x) * ifelse(any(colours_SIR %in% cols_sub$cols), 1.1, 1)), + ylab = ylab, + xlab = xlab, + axes = FALSE + ) + axis(2, seq(0, max(x))) + if (!is.null(cols_sub$sub)) { + mtext(side = 3, line = 0.5, adj = 0.5, cex = 0.75, cols_sub$sub) + } + + if (any(colours_SIR %in% cols_sub$cols)) { + legend_txt <- character(0) + legend_col <- character(0) + if (any(cols_sub$cols == colours_SIR[3] & cols_sub$count > 0)) { + legend_txt <- "(R) Resistant" + legend_col <- colours_SIR[3] + } + if (any(cols_sub$cols == colours_SIR[2] & cols_sub$count > 0)) { + legend_txt <- c(legend_txt, paste("(I)", plot_name_of_I(cols_sub$guideline))) + legend_col <- c(legend_col, colours_SIR[2]) + } + if (any(cols_sub$cols == colours_SIR[1] & cols_sub$count > 0)) { + legend_txt <- c(legend_txt, "(S) Susceptible") + legend_col <- c(legend_col, colours_SIR[1]) + } + legend("top", + x.intersp = 0.5, + legend = translate_into_language(legend_txt, language = language), + fill = legend_col, + horiz = TRUE, + cex = 0.75, + box.lwd = 0, + box.col = "#FFFFFF55", + bg = "#FFFFFF55" + ) + } +} + +#' @method barplot disk +#' @export +#' @noRd +barplot.disk <- function(height, + main = deparse(substitute(height)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Disk diffusion diameter (mm)", language = language), + mo = NULL, + ab = NULL, + guideline = "EUCAST", + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + ...) { + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + main <- gsub(" +", " ", paste0(main, collapse = " ")) + + plot( + x = height, + main = main, + ylab = ylab, + xlab = xlab, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + ... + ) +} + +#' @method autoplot disk +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +autoplot.disk <- function(object, + mo = NULL, + ab = NULL, + title = deparse(substitute(object)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Disk diffusion diameter (mm)", language = language), + guideline = "EUCAST", + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(title, allow_class = "character", allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(mo, allow_class = c("mo", "character"), allow_NULL = TRUE) + meet_criteria(ab, allow_class = c("ab", "character"), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + if ("main" %in% names(list(...))) { + title <- list(...)$main + } + if (!is.null(title)) { + title <- gsub(" +", " ", paste0(title, collapse = " ")) + } + + x <- plotrange_as_table(object, expand = expand) + cols_sub <- plot_colours_subtitle_guideline( + x = x, + mo = mo, + ab = ab, + guideline = guideline, + colours_SIR = colours_SIR, + fn = as.disk, + language = language, + method = "disk", + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + ... + ) + df <- as.data.frame(x, stringsAsFactors = TRUE) + colnames(df) <- c("disk", "count") + df$cols <- cols_sub$cols + + df$cols[df$cols == colours_SIR[1]] <- "(S) Susceptible" + df$cols[df$cols == colours_SIR[2]] <- paste("(I)", plot_name_of_I(cols_sub$guideline)) + df$cols[df$cols == colours_SIR[3]] <- "(R) Resistant" + df$cols <- factor(translate_into_language(df$cols, language = language), + levels = translate_into_language( + c( + "(S) Susceptible", + paste("(I)", plot_name_of_I(cols_sub$guideline)), + "(R) Resistant" + ), + language = language + ), + ordered = TRUE + ) + p <- ggplot2::ggplot(df) + + if (any(colours_SIR %in% cols_sub$cols)) { + vals <- c( + "(S) Susceptible" = colours_SIR[1], + "(SDD) Susceptible dose-dependent" = colours_SIR[2], + "(I) Susceptible, incr. exp." = colours_SIR[2], + "(I) Intermediate" = colours_SIR[2], + "(R) Resistant" = colours_SIR[3] + ) + names(vals) <- translate_into_language(names(vals), language = language) + p <- p + + ggplot2::geom_col(ggplot2::aes(x = disk, y = count, fill = cols)) + + # limits = force is needed because of a ggplot2 >= 3.3.4 bug (#4511) + ggplot2::scale_fill_manual( + values = vals, + name = NULL, + limits = force + ) + } else { + p <- p + + ggplot2::geom_col(ggplot2::aes(x = disk, y = count)) + } + + p + + ggplot2::labs(title = title, x = xlab, y = ylab, subtitle = cols_sub$sub) +} + +#' @method fortify disk +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +fortify.disk <- function(object, ...) { + stats::setNames( + as.data.frame(plotrange_as_table(object, expand = FALSE)), + c("x", "y") + ) +} + +#' @method plot sir +#' @export +#' @importFrom graphics plot text axis +#' @rdname plot +plot.sir <- function(x, + ylab = translate_AMR("Percentage", language = language), + xlab = translate_AMR("Antimicrobial Interpretation", language = language), + main = deparse(substitute(x)), + language = get_AMR_locale(), + ...) { + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + + data <- as.data.frame(table(x), stringsAsFactors = FALSE) + colnames(data) <- c("x", "n") + data$s <- round((data$n / sum(data$n)) * 100, 1) + + if (!"S" %in% data$x) { + data <- rbind_AMR(data, data.frame(x = "S", n = 0, s = 0, stringsAsFactors = FALSE)) + } + if (!"SDD" %in% data$x) { + data <- rbind_AMR(data, data.frame(x = "SDD", n = 0, s = 0, stringsAsFactors = FALSE)) + } + if (!"I" %in% data$x) { + data <- rbind_AMR(data, data.frame(x = "I", n = 0, s = 0, stringsAsFactors = FALSE)) + } + if (!"R" %in% data$x) { + data <- rbind_AMR(data, data.frame(x = "R", n = 0, s = 0, stringsAsFactors = FALSE)) + } + if (!"NI" %in% data$x) { + data <- rbind_AMR(data, data.frame(x = "NI", n = 0, s = 0, stringsAsFactors = FALSE)) + } + + data <- data[!(data$n == 0 & data$x %in% c("SDD", "I", "NI")), , drop = FALSE] + data$x <- factor(data$x, levels = intersect(unique(data$x), c("S", "SDD", "I", "R", "NI")), ordered = TRUE) + + ymax <- pm_if_else(max(data$s) > 95, 105, 100) + + plot( + x = data$x, + y = data$s, + lwd = 2, + ylim = c(0, ymax), + ylab = ylab, + xlab = xlab, + main = main, + axes = FALSE + ) + # x axis + axis(side = 1, at = 1:pm_n_distinct(data$x), labels = levels(data$x), lwd = 0) + # y axis, 0-100% + axis(side = 2, at = seq(0, 100, 5)) + + text( + x = data$x, + y = data$s + 4, + labels = paste0(data$s, "% (n = ", data$n, ")") + ) +} + + +#' @method barplot sir +#' @importFrom graphics barplot axis +#' @export +#' @noRd +barplot.sir <- function(height, + main = deparse(substitute(height)), + xlab = translate_AMR("Antimicrobial Interpretation", language = language), + ylab = translate_AMR("Frequency", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + ...) { + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(main, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + language <- validate_language(language) + meet_criteria(expand, allow_class = "logical", has_length = 1) + + if (length(colours_SIR) == 1) { + colours_SIR <- rep(colours_SIR, 3) + } + # add SDD and N to colours + colours_SIR <- c(colours_SIR[1:2], colours_SIR[2], colours_SIR[3], "#888888") + main <- gsub(" +", " ", paste0(main, collapse = " ")) + + x <- table(height) + # remove missing I, SDD, and N + colours_SIR <- colours_SIR[!(names(x) %in% c("SDD", "I", "NI") & x == 0)] + x <- x[!(names(x) %in% c("SDD", "I", "NI") & x == 0)] + # plot it + barplot(x, + col = colours_SIR, + xlab = xlab, + main = main, + ylab = ylab, + axes = FALSE + ) + axis(2, seq(0, max(x))) +} + +#' @method autoplot sir +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +autoplot.sir <- function(object, + title = deparse(substitute(object)), + xlab = translate_AMR("Antimicrobial Interpretation", language = language), + ylab = translate_AMR("Frequency", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + ...) { + stop_ifnot_installed("ggplot2") + meet_criteria(title, allow_class = "character", allow_NULL = TRUE) + meet_criteria(ylab, allow_class = "character", has_length = 1) + meet_criteria(xlab, allow_class = "character", has_length = 1) + meet_criteria(colours_SIR, allow_class = "character", has_length = c(1, 3)) + + if ("main" %in% names(list(...))) { + title <- list(...)$main + } + if (!is.null(title)) { + title <- gsub(" +", " ", paste0(title, collapse = " ")) + } + + if (length(colours_SIR) == 1) { + colours_SIR <- rep(colours_SIR, 3) + } + + df <- as.data.frame(table(object), stringsAsFactors = TRUE) + colnames(df) <- c("x", "n") + df <- df[!(df$n == 0 & df$x %in% c("SDD", "I", "NI")), , drop = FALSE] + ggplot2::ggplot(df) + + ggplot2::geom_col(ggplot2::aes(x = x, y = n, fill = x)) + + # limits = force is needed because of a ggplot2 >= 3.3.4 bug (#4511) + ggplot2::scale_fill_manual( + values = c( + "S" = colours_SIR[1], + "SDD" = colours_SIR[2], + "I" = colours_SIR[2], + "R" = colours_SIR[3], + "NI" = "#888888" + ), + limits = force + ) + + ggplot2::labs(title = title, x = xlab, y = ylab) + + ggplot2::theme(legend.position = "none") +} + +#' @method fortify sir +#' @rdname plot +# will be exported using s3_register() in R/zzz.R +fortify.sir <- function(object, ...) { + stats::setNames( + as.data.frame(table(object)), + c("x", "y") + ) +} + +plotrange_as_table <- function(x, expand, keep_operators = "all", mic_range = NULL) { + x <- x[!is.na(x)] + if (is.mic(x)) { + x <- as.mic(x, keep_operators = keep_operators) + if (expand == TRUE) { + # expand range for MIC by adding common intermediate factors levels + if (!is.null(mic_range) && !all(is.na(mic_range))) { + # base on mic_range + `%na_or%` <- function(x, y) if (is.na(x)) y else x + extra_range <- COMMON_MIC_VALUES[COMMON_MIC_VALUES >= (mic_range[1] %na_or% min(x, na.rm = TRUE)) & COMMON_MIC_VALUES <= (mic_range[2] %na_or% max(x, na.rm = TRUE))] + } else { + # base on x + extra_range <- COMMON_MIC_VALUES[COMMON_MIC_VALUES > min(x, na.rm = TRUE) & COMMON_MIC_VALUES < max(x, na.rm = TRUE)] + } + # remove the ones that are in 25% range of user values + extra_range <- extra_range[!vapply(FUN.VALUE = logical(1), extra_range, function(r) any(abs(r - x) / x < 0.25, na.rm = TRUE))] + nms <- extra_range + extra_range <- rep(0, length(extra_range)) + names(extra_range) <- nms + x <- table(droplevels(x, as.mic = FALSE)) + extra_range <- extra_range[!names(extra_range) %in% names(x) & names(extra_range) %in% VALID_MIC_LEVELS] + x <- as.table(c(x, extra_range)) + } else { + x <- table(droplevels(x, as.mic = FALSE)) + } + x <- x[order(as.double(as.mic(names(x))))] + } else if (is.disk(x)) { + if (expand == TRUE) { + # expand range for disks from lowest to highest so all mm's in between also print + extra_range <- rep(0, max(x) - min(x) - 1) + names(extra_range) <- seq(min(x) + 1, max(x) - 1) + x <- table(x) + extra_range <- extra_range[!names(extra_range) %in% names(x)] + x <- as.table(c(x, extra_range)) + } else { + x <- table(x) + } + x <- x[order(as.double(names(x)))] + } + as.table(x) +} + +ggplot2_get_from_dots <- function(arg, default, ...) { + dots <- list(...) + if (!arg %in% names(dots)) { + default + } else { + dots[[arg]] + } +} + +plot_name_of_I <- function(guideline) { + if (guideline %unlike% "CLSI" && as.double(gsub("[^0-9]+", "", guideline)) >= 2019) { + # interpretation since 2019 + "Susceptible, incr. exp." + } else { + # interpretation until 2019 + "Intermediate" + } +} + +plot_colours_subtitle_guideline <- function(x, mo, ab, guideline, colours_SIR, fn, language, method, breakpoint_type, include_PKPD, ...) { + stop_if(length(x) == 0, "no observations to plot", call = FALSE) + + guideline <- get_guideline(guideline, AMR::clinical_breakpoints) + + # store previous interpretations to backup + sir_history <- AMR_env$sir_interpretation_history + # and clear previous interpretations + AMR_env$sir_interpretation_history <- AMR_env$sir_interpretation_history[0, , drop = FALSE] + + if (!is.null(mo) && !is.null(ab)) { + # interpret and give colour based on MIC values + mo <- as.mo(mo) + moname <- mo_name(mo, language = language) + ab <- as.ab(ab) + abname <- ab_name(ab, language = language) + + sir <- suppressWarnings(suppressMessages(as.sir(fn(names(x)), mo = mo, ab = ab, guideline = guideline, include_screening = FALSE, include_PKPD = include_PKPD, breakpoint_type = breakpoint_type, ...))) + guideline_txt <- guideline + if (all(is.na(sir))) { + sir_screening <- suppressWarnings(suppressMessages(as.sir(fn(names(x)), mo = mo, ab = ab, guideline = guideline, include_screening = TRUE, include_PKPD = include_PKPD, breakpoint_type = breakpoint_type, ...))) + if (!all(is.na(sir_screening))) { + message_( + "Only ", guideline, " ", method, " interpretations found for ", + ab_name(ab, language = NULL, tolower = TRUE), " in ", italicise(moname), " for screening" + ) + sir <- sir_screening + guideline_txt <- paste0("(Screen, ", guideline_txt, ")") + } else { + message_( + "No ", guideline, " ", method, " interpretations found for ", + ab_name(ab, language = NULL, tolower = TRUE), " in ", italicise(moname) + ) + guideline_txt <- paste0("(", guideline_txt, ")") + } + } else { + if (isTRUE(list(...)$uti)) { + guideline_txt <- paste("UTIs,", guideline_txt) + } + ref_tbl <- paste0('"', unique(AMR_env$sir_interpretation_history$ref_table), '"', collapse = "/") + guideline_txt <- paste0("(", guideline_txt, ": ", ref_tbl, ")") + } + cols <- character(length = length(sir)) + cols[is.na(sir)] <- "#BEBEBE" + cols[sir == "S"] <- colours_SIR[1] + cols[sir == "SDD"] <- colours_SIR[2] + cols[sir == "I"] <- colours_SIR[2] + cols[sir == "R"] <- colours_SIR[3] + cols[sir == "NI"] <- "#888888" + sub <- bquote(.(abname) ~ "-" ~ italic(.(moname)) ~ .(guideline_txt)) + } else { + cols <- "#BEBEBE" + sub <- NULL + } + + # restore previous interpretations to backup + AMR_env$sir_interpretation_history <- sir_history + + list(cols = cols, count = as.double(x), sub = sub, guideline = guideline) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/proportion.R + + + + +#' Calculate Antimicrobial Resistance +#' +#' @description These functions can be used to calculate the (co-)resistance or susceptibility of microbial isolates (i.e. percentage of S, SI, I, IR or R). All functions support quasiquotation with pipes, can be used in `summarise()` from the `dplyr` package and also support grouped variables, see *Examples*. +#' +#' [resistance()] should be used to calculate resistance, [susceptibility()] should be used to calculate susceptibility.\cr +#' @param ... one or more vectors (or columns) with antibiotic interpretations. They will be transformed internally with [as.sir()] if needed. Use multiple columns to calculate (the lack of) co-resistance: the probability where one of two drugs have a resistant or susceptible result. See *Examples*. +#' @param minimum the minimum allowed number of available (tested) isolates. Any isolate count lower than `minimum` will return `NA` with a warning. The default number of `30` isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see *Source*. +#' @param as_percent a [logical] to indicate whether the output must be returned as a hundred fold with % sign (a character). A value of `0.123456` will then be returned as `"12.3%"`. +#' @param only_all_tested (for combination therapies, i.e. using more than one variable for `...`): a [logical] to indicate that isolates must be tested for all antibiotics, see section *Combination Therapy* below +#' @param data a [data.frame] containing columns with class [`sir`] (see [as.sir()]) +#' @param translate_ab a column name of the [antibiotics] data set to translate the antibiotic abbreviations to, using [ab_property()] +#' @inheritParams ab_property +#' @param combine_SI a [logical] to indicate whether all values of S, SDD, and I must be merged into one, so the output only consists of S+SDD+I vs. R (susceptible vs. resistant) - the default is `TRUE` +#' @param ab_result antibiotic results to test against, must be one or more values of "S", "SDD", "I", or "R" +#' @param confidence_level the confidence level for the returned confidence interval. For the calculation, the number of S or SI isolates, and R isolates are compared with the total number of available isolates with R, S, or I by using [binom.test()], i.e., the Clopper-Pearson method. +#' @param side the side of the confidence interval to return. The default is `"both"` for a length 2 vector, but can also be (abbreviated as) `"min"`/`"left"`/`"lower"`/`"less"` or `"max"`/`"right"`/`"higher"`/`"greater"`. +#' @param collapse a [logical] to indicate whether the output values should be 'collapsed', i.e. be merged together into one value, or a character value to use for collapsing +#' @inheritSection as.sir Interpretation of SIR +#' @details +#' **Remember that you should filter your data to let it contain only first isolates!** This is needed to exclude duplicates and to reduce selection bias. Use [first_isolate()] to determine them in your data set with one of the four available algorithms. +#' +#' The function [resistance()] is equal to the function [proportion_R()]. The function [susceptibility()] is equal to the function [proportion_SI()]. Since AMR v3.0, [proportion_SI()] and [proportion_I()] include dose-dependent susceptibility ('SDD'). +#' +#' Use [sir_confidence_interval()] to calculate the confidence interval, which relies on [binom.test()], i.e., the Clopper-Pearson method. This function returns a vector of length 2 at default for antimicrobial *resistance*. Change the `side` argument to "left"/"min" or "right"/"max" to return a single value, and change the `ab_result` argument to e.g. `c("S", "I")` to test for antimicrobial *susceptibility*, see Examples. +#' +#' These functions are not meant to count isolates, but to calculate the proportion of resistance/susceptibility. Use the [`count_*()`][AMR::count()] functions to count isolates. The function [susceptibility()] is essentially equal to [count_susceptible()]` / `[count_all()]. *Low counts can influence the outcome - the `proportion_*()` functions may camouflage this, since they only return the proportion (albeit dependent on the `minimum` argument).* +#' +#' The function [proportion_df()] takes any variable from `data` that has an [`sir`] class (created with [as.sir()]) and calculates the proportions S, I, and R. It also supports grouped variables. The function [sir_df()] works exactly like [proportion_df()], but adds the number of isolates. +#' @section Combination Therapy: +#' When using more than one variable for `...` (= combination therapy), use `only_all_tested` to only count isolates that are tested for all antibiotics/variables that you test them for. See this example for two antibiotics, Drug A and Drug B, about how [susceptibility()] works to calculate the %SI: +#' +#' +#' ``` +#' -------------------------------------------------------------------- +#' only_all_tested = FALSE only_all_tested = TRUE +#' ----------------------- ----------------------- +#' Drug A Drug B include as include as include as include as +#' numerator denominator numerator denominator +#' -------- -------- ---------- ----------- ---------- ----------- +#' S or I S or I X X X X +#' R S or I X X X X +#' S or I X X - - +#' S or I R X X X X +#' R R - X - X +#' R - - - - +#' S or I X X - - +#' R - - - - +#' - - - - +#' -------------------------------------------------------------------- +#' ``` +#' +#' Please note that, in combination therapies, for `only_all_tested = TRUE` applies that: +#' +#' ``` +#' count_S() + count_I() + count_R() = count_all() +#' proportion_S() + proportion_I() + proportion_R() = 1 +#' ``` +#' +#' and that, in combination therapies, for `only_all_tested = FALSE` applies that: +#' +#' ``` +#' count_S() + count_I() + count_R() >= count_all() +#' proportion_S() + proportion_I() + proportion_R() >= 1 +#' ``` +#' +#' Using `only_all_tested` has no impact when only using one antibiotic as input. +#' @source **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition**, 2022, *Clinical and Laboratory Standards Institute (CLSI)*. . +#' @seealso [AMR::count()] to count resistant and susceptible isolates. +#' @return A [double] or, when `as_percent = TRUE`, a [character]. +#' @rdname proportion +#' @aliases portion +#' @name proportion +#' @export +#' @examples +#' # example_isolates is a data set available in the AMR package. +#' # run ?example_isolates for more info. +#' example_isolates +#' +#' +#' # base R ------------------------------------------------------------ +#' # determines %R +#' resistance(example_isolates$AMX) +#' sir_confidence_interval(example_isolates$AMX) +#' sir_confidence_interval(example_isolates$AMX, +#' confidence_level = 0.975 +#' ) +#' sir_confidence_interval(example_isolates$AMX, +#' confidence_level = 0.975, +#' collapse = ", " +#' ) +#' +#' # determines %S+I: +#' susceptibility(example_isolates$AMX) +#' sir_confidence_interval(example_isolates$AMX, +#' ab_result = c("S", "I") +#' ) +#' +#' # be more specific +#' proportion_S(example_isolates$AMX) +#' proportion_SI(example_isolates$AMX) +#' proportion_I(example_isolates$AMX) +#' proportion_IR(example_isolates$AMX) +#' proportion_R(example_isolates$AMX) +#' +#' # dplyr ------------------------------------------------------------- +#' \donttest{ +#' if (require("dplyr")) { +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise( +#' r = resistance(CIP), +#' n = n_sir(CIP) +#' ) # n_sir works like n_distinct in dplyr, see ?n_sir +#' } +#' if (require("dplyr")) { +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise( +#' cipro_R = resistance(CIP), +#' ci_min = sir_confidence_interval(CIP, side = "min"), +#' ci_max = sir_confidence_interval(CIP, side = "max"), +#' ) +#' } +#' if (require("dplyr")) { +#' # scoped dplyr verbs with antibiotic selectors +#' # (you could also use across() of course) +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise_at( +#' c(aminoglycosides(), carbapenems()), +#' resistance +#' ) +#' } +#' if (require("dplyr")) { +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise( +#' R = resistance(CIP, as_percent = TRUE), +#' SI = susceptibility(CIP, as_percent = TRUE), +#' n1 = count_all(CIP), # the actual total; sum of all three +#' n2 = n_sir(CIP), # same - analogous to n_distinct +#' total = n() +#' ) # NOT the number of tested isolates! +#' +#' # Calculate co-resistance between amoxicillin/clav acid and gentamicin, +#' # so we can see that combination therapy does a lot more than mono therapy: +#' example_isolates %>% susceptibility(AMC) # %SI = 76.3% +#' example_isolates %>% count_all(AMC) # n = 1879 +#' +#' example_isolates %>% susceptibility(GEN) # %SI = 75.4% +#' example_isolates %>% count_all(GEN) # n = 1855 +#' +#' example_isolates %>% susceptibility(AMC, GEN) # %SI = 94.1% +#' example_isolates %>% count_all(AMC, GEN) # n = 1939 +#' +#' +#' # See Details on how `only_all_tested` works. Example: +#' example_isolates %>% +#' summarise( +#' numerator = count_susceptible(AMC, GEN), +#' denominator = count_all(AMC, GEN), +#' proportion = susceptibility(AMC, GEN) +#' ) +#' +#' example_isolates %>% +#' summarise( +#' numerator = count_susceptible(AMC, GEN, only_all_tested = TRUE), +#' denominator = count_all(AMC, GEN, only_all_tested = TRUE), +#' proportion = susceptibility(AMC, GEN, only_all_tested = TRUE) +#' ) +#' +#' +#' example_isolates %>% +#' group_by(ward) %>% +#' summarise( +#' cipro_p = susceptibility(CIP, as_percent = TRUE), +#' cipro_n = count_all(CIP), +#' genta_p = susceptibility(GEN, as_percent = TRUE), +#' genta_n = count_all(GEN), +#' combination_p = susceptibility(CIP, GEN, as_percent = TRUE), +#' combination_n = count_all(CIP, GEN) +#' ) +#' +#' # Get proportions S/I/R immediately of all sir columns +#' example_isolates %>% +#' select(AMX, CIP) %>% +#' proportion_df(translate = FALSE) +#' +#' # It also supports grouping variables +#' # (use sir_df to also include the count) +#' example_isolates %>% +#' select(ward, AMX, CIP) %>% +#' group_by(ward) %>% +#' sir_df(translate = FALSE) +#' } +#' } +resistance <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = "R", + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +susceptibility <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = c("S", "SDD", "I"), + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +sir_confidence_interval <- function(..., + ab_result = "R", + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE, + confidence_level = 0.95, + side = "both", + collapse = FALSE) { + meet_criteria(ab_result, allow_class = c("character", "sir"), has_length = c(1:5), is_in = c("S", "SDD", "I", "R", "NI")) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(as_percent, allow_class = "logical", has_length = 1) + meet_criteria(only_all_tested, allow_class = "logical", has_length = 1) + meet_criteria(confidence_level, allow_class = "numeric", is_positive = TRUE, has_length = 1) + meet_criteria(side, allow_class = "character", has_length = 1, is_in = c("both", "b", "left", "l", "lower", "lowest", "less", "min", "right", "r", "higher", "highest", "greater", "g", "max")) + meet_criteria(collapse, allow_class = c("logical", "character"), has_length = 1) + + x <- tryCatch( + sir_calc(..., + ab_result = ab_result, + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) + n <- tryCatch( + sir_calc(..., + ab_result = c("S", "SDD", "I", "R", "NI"), + only_all_tested = only_all_tested, + only_count = TRUE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) + + if (x == 0) { + out <- c(0, 0) + } else { + # this applies the Clopper-Pearson method + out <- stats::binom.test(x = x, n = n, conf.level = confidence_level)$conf.int + } + out <- set_clean_class(out, "numeric") + + if (side %in% c("left", "l", "lower", "lowest", "less", "min")) { + out <- out[1] + } else if (side %in% c("right", "r", "higher", "highest", "greater", "g", "max")) { + out <- out[2] + } + if (isTRUE(as_percent)) { + out <- trimws(percentage(out, digits = 1)) + } + if (!isFALSE(collapse) && length(out) > 1) { + if (is.numeric(out)) { + out <- round(out, digits = 3) + } + # out[is.na(out)] <- 0 + out <- paste(out, collapse = ifelse(isTRUE(collapse), "-", collapse)) + } + + if (n < minimum) { + warning_("Introducing NA: ", + ifelse(n == 0, "no", paste("only", n)), + " results available for `sir_confidence_interval()` (`minimum` = ", minimum, ").", + call = FALSE + ) + if (is.character(out)) { + return(NA_character_) + } else { + return(NA_real_) + } + } + out +} + +#' @rdname proportion +#' @export +proportion_R <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = "R", + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +proportion_IR <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + if (message_not_thrown_before("proportion_IR", entire_session = TRUE)) { + message_("Note that `proportion_IR()` will also include dose-dependent susceptibility, 'SDD'. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("I", "SDD", "R"), + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +proportion_I <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + if (message_not_thrown_before("proportion_I", entire_session = TRUE)) { + message_("Note that `proportion_I()` will also include dose-dependent susceptibility, 'SDD'. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("I", "SDD"), + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +proportion_SI <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + if (message_not_thrown_before("proportion_SI", entire_session = TRUE)) { + message_("Note that `proportion_SI()` will also include dose-dependent susceptibility, 'SDD'. This note will be shown once for this session.", as_note = FALSE) + } + tryCatch( + sir_calc(..., + ab_result = c("S", "I", "SDD"), + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +proportion_S <- function(..., + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE) { + tryCatch( + sir_calc(..., + ab_result = "S", + minimum = minimum, + as_percent = as_percent, + only_all_tested = only_all_tested, + only_count = FALSE + ), + error = function(e) stop_(gsub("in sir_calc(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + +#' @rdname proportion +#' @export +proportion_df <- function(data, + translate_ab = "name", + language = get_AMR_locale(), + minimum = 30, + as_percent = FALSE, + combine_SI = TRUE, + confidence_level = 0.95) { + tryCatch( + sir_calc_df( + type = "proportion", + data = data, + translate_ab = translate_ab, + language = language, + minimum = minimum, + as_percent = as_percent, + combine_SI = combine_SI, + confidence_level = confidence_level + ), + error = function(e) stop_(gsub("in sir_calc_df(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/random.R + + + + +#' Random MIC Values/Disk Zones/SIR Generation +#' +#' These functions can be used for generating random MIC values and disk diffusion diameters, for AMR data analysis practice. By providing a microorganism and antimicrobial drug, the generated results will reflect reality as much as possible. +#' @param size desired size of the returned vector. If used in a [data.frame] call or `dplyr` verb, will get the current (group) size if left blank. +#' @param mo any [character] that can be coerced to a valid microorganism code with [as.mo()] +#' @param ab any [character] that can be coerced to a valid antimicrobial drug code with [as.ab()] +#' @param prob_SIR a vector of length 3: the probabilities for "S" (1st value), "I" (2nd value) and "R" (3rd value) +#' @param ... ignored, only in place to allow future extensions +#' @details The base \R function [sample()] is used for generating values. +#' +#' Generated values are based on the EUCAST `r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))` guideline as implemented in the [clinical_breakpoints] data set. To create specific generated values per bug or drug, set the `mo` and/or `ab` argument. +#' @return class `mic` for [random_mic()] (see [as.mic()]) and class `disk` for [random_disk()] (see [as.disk()]) +#' @name random +#' @rdname random +#' @export +#' @examples +#' random_mic(25) +#' random_disk(25) +#' random_sir(25) +#' +#' \donttest{ +#' # make the random generation more realistic by setting a bug and/or drug: +#' random_mic(25, "Klebsiella pneumoniae") # range 0.0625-64 +#' random_mic(25, "Klebsiella pneumoniae", "meropenem") # range 0.0625-16 +#' random_mic(25, "Streptococcus pneumoniae", "meropenem") # range 0.0625-4 +#' +#' random_disk(25, "Klebsiella pneumoniae") # range 8-50 +#' random_disk(25, "Klebsiella pneumoniae", "ampicillin") # range 11-17 +#' random_disk(25, "Streptococcus pneumoniae", "ampicillin") # range 12-27 +#' } +random_mic <- function(size = NULL, mo = NULL, ab = NULL, ...) { + meet_criteria(size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE, allow_NULL = TRUE) + meet_criteria(mo, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ab, allow_class = "character", has_length = 1, allow_NULL = TRUE) + if (is.null(size)) { + size <- NROW(get_current_data(arg_name = "size", call = -3)) + } + random_exec("MIC", size = size, mo = mo, ab = ab) +} + +#' @rdname random +#' @export +random_disk <- function(size = NULL, mo = NULL, ab = NULL, ...) { + meet_criteria(size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE, allow_NULL = TRUE) + meet_criteria(mo, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(ab, allow_class = "character", has_length = 1, allow_NULL = TRUE) + if (is.null(size)) { + size <- NROW(get_current_data(arg_name = "size", call = -3)) + } + random_exec("DISK", size = size, mo = mo, ab = ab) +} + +#' @rdname random +#' @export +random_sir <- function(size = NULL, prob_SIR = c(0.33, 0.33, 0.33), ...) { + meet_criteria(size, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE, allow_NULL = TRUE) + meet_criteria(prob_SIR, allow_class = c("numeric", "integer"), has_length = 3) + if (is.null(size)) { + size <- NROW(get_current_data(arg_name = "size", call = -3)) + } + sample(as.sir(c("S", "I", "R")), size = size, replace = TRUE, prob = prob_SIR) +} + +random_exec <- function(method_type, size, mo = NULL, ab = NULL) { + df <- AMR::clinical_breakpoints %pm>% + pm_filter(guideline %like% "EUCAST") %pm>% + pm_arrange(pm_desc(guideline)) %pm>% + subset(guideline == max(guideline) & + method == method_type & + type == "human") + + if (!is.null(mo)) { + mo_coerced <- as.mo(mo) + mo_include <- c( + mo_coerced, + as.mo(mo_genus(mo_coerced)), + as.mo(mo_family(mo_coerced)), + as.mo(mo_order(mo_coerced)) + ) + df_new <- df %pm>% + subset(mo %in% mo_include) + if (nrow(df_new) > 0) { + df <- df_new + } else { + warning_("in `random_", tolower(method_type), "()`: no rows found that match mo '", mo, "', ignoring argument `mo`") + } + } + + if (!is.null(ab)) { + ab_coerced <- as.ab(ab) + df_new <- df %pm>% + subset(ab %in% ab_coerced) + if (nrow(df_new) > 0) { + df <- df_new + } else { + warning_("in `random_", tolower(method_type), "()`: no rows found that match ab '", ab, "' (", ab_name(ab_coerced, tolower = TRUE, language = NULL), "), ignoring argument `ab`") + } + } + + if (method_type == "MIC") { + # set range + mic_range <- c(0.001, 0.002, 0.005, 0.010, 0.025, 0.0625, 0.125, 0.250, 0.5, 1, 2, 4, 8, 16, 32, 64, 128, 256) + + # get highest/lowest +/- random 1 to 3 higher factors of two + max_range <- mic_range[min( + length(mic_range), + which(mic_range == max(df$breakpoint_R, na.rm = TRUE)) + sample(c(1:3), 1) + )] + min_range <- mic_range[max( + 1, + which(mic_range == min(df$breakpoint_S, na.rm = TRUE)) - sample(c(1:3), 1) + )] + + mic_range_new <- mic_range[mic_range <= max_range & mic_range >= min_range] + if (length(mic_range_new) == 0) { + mic_range_new <- mic_range + } + out <- as.mic(sample(mic_range_new, size = size, replace = TRUE)) + # 50% chance that lowest will get <= and highest will get >= + if (stats::runif(1) > 0.5) { + out[out == min(out)] <- paste0("<=", out[out == min(out)]) + } + if (stats::runif(1) > 0.5) { + out[out == max(out)] <- paste0(">=", out[out == max(out)]) + } + return(out) + } else if (method_type == "DISK") { + set_range <- seq( + from = as.integer(min(df$breakpoint_R, na.rm = TRUE) / 1.25), + to = as.integer(max(df$breakpoint_S, na.rm = TRUE) * 1.25), + by = 1 + ) + out <- sample(set_range, size = size, replace = TRUE) + out[out < 6] <- sample(c(6:10), length(out[out < 6]), replace = TRUE) + out[out > 50] <- sample(c(40:50), length(out[out > 50]), replace = TRUE) + return(as.disk(out)) + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/resistance_predict.R + + + + +#' Predict Antimicrobial Resistance +#' +#' Create a prediction model to predict antimicrobial resistance for the next years on statistical solid ground. Standard errors (SE) will be returned as columns `se_min` and `se_max`. See *Examples* for a real live example. +#' @param object model data to be plotted +#' @param col_ab column name of `x` containing antimicrobial interpretations (`"R"`, `"I"` and `"S"`) +#' @param col_date column name of the date, will be used to calculate years if this column doesn't consist of years already - the default is the first column of with a date class +#' @param year_min lowest year to use in the prediction model, dafaults to the lowest year in `col_date` +#' @param year_max highest year to use in the prediction model - the default is 10 years after today +#' @param year_every unit of sequence between lowest year found in the data and `year_max` +#' @param minimum minimal amount of available isolates per year to include. Years containing less observations will be estimated by the model. +#' @param model the statistical model of choice. This could be a generalised linear regression model with binomial distribution (i.e. using `glm(..., family = binomial)`, assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance. See *Details* for all valid options. +#' @param I_as_S a [logical] to indicate whether values `"I"` should be treated as `"S"` (will otherwise be treated as `"R"`). The default, `TRUE`, follows the redefinition by EUCAST about the interpretation of I (increased exposure) in 2019, see section *Interpretation of S, I and R* below. +#' @param preserve_measurements a [logical] to indicate whether predictions of years that are actually available in the data should be overwritten by the original data. The standard errors of those years will be `NA`. +#' @param info a [logical] to indicate whether textual analysis should be printed with the name and [summary()] of the statistical model. +#' @param main title of the plot +#' @param ribbon a [logical] to indicate whether a ribbon should be shown (default) or error bars +#' @param ... arguments passed on to functions +#' @inheritSection as.sir Interpretation of SIR +#' @inheritParams first_isolate +#' @inheritParams graphics::plot +#' @details Valid options for the statistical model (argument `model`) are: +#' - `"binomial"` or `"binom"` or `"logit"`: a generalised linear regression model with binomial distribution +#' - `"loglin"` or `"poisson"`: a generalised log-linear regression model with poisson distribution +#' - `"lin"` or `"linear"`: a linear regression model +#' @return A [data.frame] with extra class [`resistance_predict`] with columns: +#' - `year` +#' - `value`, the same as `estimated` when `preserve_measurements = FALSE`, and a combination of `observed` and `estimated` otherwise +#' - `se_min`, the lower bound of the standard error with a minimum of `0` (so the standard error will never go below 0%) +#' - `se_max` the upper bound of the standard error with a maximum of `1` (so the standard error will never go above 100%) +#' - `observations`, the total number of available observations in that year, i.e. \eqn{S + I + R} +#' - `observed`, the original observed resistant percentages +#' - `estimated`, the estimated resistant percentages, calculated by the model +#' +#' Furthermore, the model itself is available as an attribute: `attributes(x)$model`, see *Examples*. +#' @seealso The [proportion()] functions to calculate resistance +#' +#' Models: [lm()] [glm()] +#' @rdname resistance_predict +#' @export +#' @importFrom stats predict glm lm +#' @examples +#' x <- resistance_predict(example_isolates, +#' col_ab = "AMX", +#' year_min = 2010, +#' model = "binomial" +#' ) +#' plot(x) +#' \donttest{ +#' if (require("ggplot2")) { +#' ggplot_sir_predict(x) +#' } +#' +#' # using dplyr: +#' if (require("dplyr")) { +#' x <- example_isolates %>% +#' filter_first_isolate() %>% +#' filter(mo_genus(mo) == "Staphylococcus") %>% +#' resistance_predict("PEN", model = "binomial") +#' print(plot(x)) +#' +#' # get the model from the object +#' mymodel <- attributes(x)$model +#' summary(mymodel) +#' } +#' +#' # create nice plots with ggplot2 yourself +#' if (require("dplyr") && require("ggplot2")) { +#' data <- example_isolates %>% +#' filter(mo == as.mo("E. coli")) %>% +#' resistance_predict( +#' col_ab = "AMX", +#' col_date = "date", +#' model = "binomial", +#' info = FALSE, +#' minimum = 15 +#' ) +#' head(data) +#' autoplot(data) +#' } +#' } +resistance_predict <- function(x, + col_ab, + col_date = NULL, + year_min = NULL, + year_max = NULL, + year_every = 1, + minimum = 30, + model = NULL, + I_as_S = TRUE, + preserve_measurements = TRUE, + info = interactive(), + ...) { + meet_criteria(x, allow_class = "data.frame") + meet_criteria(col_ab, allow_class = "character", has_length = 1, is_in = colnames(x)) + meet_criteria(col_date, allow_class = "character", has_length = 1, is_in = colnames(x), allow_NULL = TRUE) + meet_criteria(year_min, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive = TRUE, is_finite = TRUE) + meet_criteria(year_max, allow_class = c("numeric", "integer"), has_length = 1, allow_NULL = TRUE, is_positive = TRUE, is_finite = TRUE) + meet_criteria(year_every, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(model, allow_class = c("character", "function"), has_length = 1, allow_NULL = TRUE) + meet_criteria(I_as_S, allow_class = "logical", has_length = 1) + meet_criteria(preserve_measurements, allow_class = "logical", has_length = 1) + meet_criteria(info, allow_class = "logical", has_length = 1) + + stop_if(is.null(model), 'choose a regression model with the `model` argument, e.g. resistance_predict(..., model = "binomial")') + + x.bak <- x + x <- as.data.frame(x, stringsAsFactors = FALSE) + + # -- date + if (is.null(col_date)) { + col_date <- search_type_in_df(x = x, type = "date") + stop_if(is.null(col_date), "`col_date` must be set") + } + stop_ifnot( + col_date %in% colnames(x), + "column '", col_date, "' not found" + ) + + year <- function(x) { + # don't depend on lubridate or so, would be overkill for only this function + if (all(grepl("^[0-9]{4}$", x))) { + as.integer(x) + } else { + as.integer(format(as.Date(x), "%Y")) + } + } + + df <- x + df[, col_ab] <- droplevels(as.sir(df[, col_ab, drop = TRUE])) + if (I_as_S == TRUE) { + # then I as S + df[, col_ab] <- gsub("I", "S", df[, col_ab, drop = TRUE], fixed = TRUE) + } else { + # then I as R + df[, col_ab] <- gsub("I", "R", df[, col_ab, drop = TRUE], fixed = TRUE) + } + df[, col_ab] <- ifelse(is.na(df[, col_ab, drop = TRUE]), 0, df[, col_ab, drop = TRUE]) + + # remove rows with NAs + df <- subset(df, !is.na(df[, col_ab, drop = TRUE])) + df$year <- year(df[, col_date, drop = TRUE]) + df <- as.data.frame(rbind(table(df[, c("year", col_ab), drop = FALSE])), + stringsAsFactors = FALSE + ) + df$year <- as.integer(rownames(df)) + rownames(df) <- NULL + + df <- subset(df, sum(df$R + df$S, na.rm = TRUE) >= minimum) + # nolint start + df_matrix <- as.matrix(df[, c("R", "S"), drop = FALSE]) + # nolint end + + stop_if(NROW(df) == 0, "there are no observations") + + year_lowest <- min(df$year) + if (is.null(year_min)) { + year_min <- year_lowest + } else { + year_min <- max(year_min, year_lowest, na.rm = TRUE) + } + if (is.null(year_max)) { + year_max <- year(Sys.Date()) + 10 + } + + years <- list(year = seq(from = year_min, to = year_max, by = year_every)) + + if (model %in% c("binomial", "binom", "logit")) { + model <- "binomial" + model_lm <- with(df, glm(df_matrix ~ year, family = binomial)) + if (isTRUE(info)) { + cat("\nLogistic regression model (logit) with binomial distribution") + cat("\n------------------------------------------------------------\n") + print(summary(model_lm)) + } + + predictmodel <- predict(model_lm, newdata = years, type = "response", se.fit = TRUE) + prediction <- predictmodel$fit + se <- predictmodel$se.fit + } else if (model %in% c("loglin", "poisson")) { + model <- "poisson" + model_lm <- with(df, glm(R ~ year, family = poisson)) + if (isTRUE(info)) { + cat("\nLog-linear regression model (loglin) with poisson distribution") + cat("\n--------------------------------------------------------------\n") + print(summary(model_lm)) + } + + predictmodel <- predict(model_lm, newdata = years, type = "response", se.fit = TRUE) + prediction <- predictmodel$fit + se <- predictmodel$se.fit + } else if (model %in% c("lin", "linear")) { + model <- "linear" + model_lm <- with(df, lm((R / (R + S)) ~ year)) + if (isTRUE(info)) { + cat("\nLinear regression model") + cat("\n-----------------------\n") + print(summary(model_lm)) + } + + predictmodel <- predict(model_lm, newdata = years, se.fit = TRUE) + prediction <- predictmodel$fit + se <- predictmodel$se.fit + } else { + stop("no valid model selected. See `?resistance_predict`.") + } + + # prepare the output dataframe + df_prediction <- data.frame( + year = unlist(years), + value = prediction, + se_min = prediction - se, + se_max = prediction + se, + stringsAsFactors = FALSE + ) + + if (model == "poisson") { + df_prediction$value <- as.integer(format(df_prediction$value, scientific = FALSE)) + df_prediction$se_min <- as.integer(df_prediction$se_min) + df_prediction$se_max <- as.integer(df_prediction$se_max) + } else { + # se_max not above 1 + df_prediction$se_max <- pmin(df_prediction$se_max, 1) + } + # se_min not below 0 + df_prediction$se_min <- pmax(df_prediction$se_min, 0) + + df_observations <- data.frame( + year = df$year, + observations = df$R + df$S, + observed = df$R / (df$R + df$S), + stringsAsFactors = FALSE + ) + df_prediction <- df_prediction %pm>% + pm_left_join(df_observations, by = "year") + df_prediction$estimated <- df_prediction$value + + if (preserve_measurements == TRUE) { + # replace estimated data by observed data + df_prediction$value <- ifelse(!is.na(df_prediction$observed), df_prediction$observed, df_prediction$value) + df_prediction$se_min <- ifelse(!is.na(df_prediction$observed), NA, df_prediction$se_min) + df_prediction$se_max <- ifelse(!is.na(df_prediction$observed), NA, df_prediction$se_max) + } + + df_prediction$value <- ifelse(df_prediction$value > 1, 1, pmax(df_prediction$value, 0)) + df_prediction <- df_prediction[order(df_prediction$year), , drop = FALSE] + + out <- as_original_data_class(df_prediction, class(x.bak)) # will remove tibble groups + structure(out, + class = c("resistance_predict", class(out)), + I_as_S = I_as_S, + model_title = model, + model = model_lm, + ab = col_ab + ) +} + +#' @rdname resistance_predict +#' @export +sir_predict <- resistance_predict + +#' @method plot resistance_predict +#' @export +#' @importFrom graphics plot axis arrows points +#' @rdname resistance_predict +plot.resistance_predict <- function(x, main = paste("Resistance Prediction of", x_name), ...) { + x_name <- paste0(ab_name(attributes(x)$ab), " (", attributes(x)$ab, ")") + meet_criteria(main, allow_class = "character", has_length = 1) + + if (attributes(x)$I_as_S == TRUE) { + ylab <- "%R" + } else { + ylab <- "%IR" + } + + plot( + x = x$year, + y = x$value, + ylim = c(0, 1), + yaxt = "n", # no y labels + pch = 19, # closed dots + ylab = paste0("Percentage (", ylab, ")"), + xlab = "Year", + main = main, + sub = paste0( + "(n = ", sum(x$observations, na.rm = TRUE), + ", model: ", attributes(x)$model_title, ")" + ), + cex.sub = 0.75 + ) + + + axis(side = 2, at = seq(0, 1, 0.1), labels = paste0(0:10 * 10, "%")) + + # hack for error bars: https://stackoverflow.com/a/22037078/4575331 + arrows( + x0 = x$year, + y0 = x$se_min, + x1 = x$year, + y1 = x$se_max, + length = 0.05, angle = 90, code = 3, lwd = 1.5 + ) + + # overlay grey points for prediction + points( + x = subset(x, is.na(observations))$year, + y = subset(x, is.na(observations))$value, + pch = 19, + col = "grey40" + ) +} + +#' @rdname resistance_predict +#' @export +ggplot_sir_predict <- function(x, + main = paste("Resistance Prediction of", x_name), + ribbon = TRUE, + ...) { + x_name <- paste0(ab_name(attributes(x)$ab), " (", attributes(x)$ab, ")") + meet_criteria(main, allow_class = "character", has_length = 1) + meet_criteria(ribbon, allow_class = "logical", has_length = 1) + + stop_ifnot_installed("ggplot2") + stop_ifnot(inherits(x, "resistance_predict"), "`x` must be a resistance prediction model created with resistance_predict()") + + if (attributes(x)$I_as_S == TRUE) { + ylab <- "%R" + } else { + ylab <- "%IR" + } + + p <- ggplot2::ggplot( + as.data.frame(x, stringsAsFactors = FALSE), + ggplot2::aes(x = year, y = value) + ) + + ggplot2::geom_point( + data = subset(x, !is.na(observations)), + size = 2 + ) + + scale_y_percent(limits = c(0, 1)) + + ggplot2::labs( + title = main, + y = paste0("Percentage (", ylab, ")"), + x = "Year", + caption = paste0( + "(n = ", sum(x$observations, na.rm = TRUE), + ", model: ", attributes(x)$model_title, ")" + ) + ) + + if (ribbon == TRUE) { + p <- p + ggplot2::geom_ribbon(ggplot2::aes(ymin = se_min, ymax = se_max), alpha = 0.25) + } else { + p <- p + ggplot2::geom_errorbar(ggplot2::aes(ymin = se_min, ymax = se_max), na.rm = TRUE, width = 0.5) + } + p <- p + + # overlay grey points for prediction + ggplot2::geom_point( + data = subset(x, is.na(observations)), + size = 2, + colour = "grey40" + ) + p +} + +#' @method autoplot resistance_predict +#' @rdname resistance_predict +# will be exported using s3_register() in R/zzz.R +autoplot.resistance_predict <- function(object, + main = paste("Resistance Prediction of", x_name), + ribbon = TRUE, + ...) { + x_name <- paste0(ab_name(attributes(object)$ab), " (", attributes(object)$ab, ")") + meet_criteria(main, allow_class = "character", has_length = 1) + meet_criteria(ribbon, allow_class = "logical", has_length = 1) + ggplot_sir_predict(x = object, main = main, ribbon = ribbon, ...) +} + +#' @method fortify resistance_predict +#' @noRd +# will be exported using s3_register() in R/zzz.R +fortify.resistance_predict <- function(model, data, ...) { + as.data.frame(model) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/sir.R + + + + +#' Translate MIC and Disk Diffusion to SIR, or Clean Existing SIR Data +#' +#' @description Clean up existing SIR values, or interpret minimum inhibitory concentration (MIC) values and disk diffusion diameters according to EUCAST or CLSI. [as.sir()] transforms the input to a new class [`sir`], which is an ordered [factor] containing the levels `S`, `SDD`, `I`, `R`, `NI`. +#' +#' These breakpoints are currently implemented: +#' - For **clinical microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`; +#' - For **veterinary microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`; +#' - For **ECOFFs** (Epidemiological Cut-off Values): EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`. +#' +#' All breakpoints used for interpretation are available in our [clinical_breakpoints] data set. +#' @rdname as.sir +#' @param x vector of values (for class [`mic`]: MIC values in mg/L, for class [`disk`]: a disk diffusion radius in millimetres) +#' @param mo a vector (or column name) with [character]s that can be coerced to valid microorganism codes with [as.mo()], can be left empty to determine it automatically +#' @param ab a vector (or column name) with [character]s that can be coerced to a valid antimicrobial drug code with [as.ab()] +#' @param uti (Urinary Tract Infection) a vector (or column name) with [logical]s (`TRUE` or `FALSE`) to specify whether a UTI specific interpretation from the guideline should be chosen. For using [as.sir()] on a [data.frame], this can also be a column containing [logical]s or when left blank, the data set will be searched for a column 'specimen', and rows within this column containing 'urin' (such as 'urine', 'urina') will be regarded isolates from a UTI. See *Examples*. +#' @inheritParams first_isolate +#' @param guideline defaults to EUCAST `r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))` (the latest implemented EUCAST guideline in the [AMR::clinical_breakpoints] data set), but can be set with the package option [`AMR_guideline`][AMR-options]. Currently supports EUCAST (`r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`) and CLSI (`r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`), see *Details*. +#' @param conserve_capped_values a [logical] to indicate that MIC values starting with `">"` (but not `">="`) must always return "R" , and that MIC values starting with `"<"` (but not `"<="`) must always return "S" +#' @param add_intrinsic_resistance *(only useful when using a EUCAST guideline)* a [logical] to indicate whether intrinsic antibiotic resistance must also be considered for applicable bug-drug combinations, meaning that e.g. ampicillin will always return "R" in *Klebsiella* species. Determination is based on the [intrinsic_resistant] data set, that itself is based on `r format_eucast_version_nr(3.3)`. +#' @param include_screening a [logical] to indicate that clinical breakpoints for screening are allowed - the default is `FALSE`. Can also be set with the package option [`AMR_include_screening`][AMR-options]. +#' @param include_PKPD a [logical] to indicate that PK/PD clinical breakpoints must be applied as a last resort - the default is `TRUE`. Can also be set with the package option [`AMR_include_PKPD`][AMR-options]. +#' @param breakpoint_type the type of breakpoints to use, either `r vector_or(clinical_breakpoints$type)`. ECOFF stands for Epidemiological Cut-Off values. The default is `"human"`, which can also be set with the package option [`AMR_breakpoint_type`][AMR-options]. If `host` is set to values of veterinary species, this will automatically be set to `"animal"`. +#' @param host a vector (or column name) with [character]s to indicate the host. Only useful for veterinary breakpoints, as it requires `breakpoint_type = "animal"`. The values can be any text resembling the animal species, even in any of the `r length(LANGUAGES_SUPPORTED)` supported languages of this package. For foreign languages, be sure to set the language with [set_AMR_locale()] (though it will be automatically guessed based on the system language). +#' @param verbose a [logical] to indicate that all notes should be printed during interpretation of MIC values or disk diffusion values. +#' @param reference_data a [data.frame] to be used for interpretation, which defaults to the [clinical_breakpoints] data set. Changing this argument allows for using own interpretation guidelines. This argument must contain a data set that is equal in structure to the [clinical_breakpoints] data set (same column names and column types). Please note that the `guideline` argument will be ignored when `reference_data` is manually set. +#' @param threshold maximum fraction of invalid antimicrobial interpretations of `x`, see *Examples* +#' @param ... for using on a [data.frame]: names of columns to apply [as.sir()] on (supports tidy selection such as `column1:column4`). Otherwise: arguments passed on to methods. +#' @details +#' *Note: The clinical breakpoints in this package were validated through, and imported from, [WHONET](https://whonet.org). The public use of this `AMR` package has been endorsed by both CLSI and EUCAST. See [clinical_breakpoints] for more information.* +#' +#' ### How it Works +#' +#' The [as.sir()] function can work in four ways: +#' +#' 1. For **cleaning raw / untransformed data**. The data will be cleaned to only contain valid values, namely: **S** for susceptible, **I** for intermediate or 'susceptible, increased exposure', **R** for resistant, **NI** for non-interpretable, and **SDD** for susceptible dose-dependent. Each of these can be set using a [regular expression][base::regex]. Furthermore, [as.sir()] will try its best to clean with some intelligence. For example, mixed values with SIR interpretations and MIC values such as `"<0.25; S"` will be coerced to `"S"`. Combined interpretations for multiple test methods (as seen in laboratory records) such as `"S; S"` will be coerced to `"S"`, but a value like `"S; I"` will return `NA` with a warning that the input is invalid. +#' +#' 2. For **interpreting minimum inhibitory concentration (MIC) values** according to EUCAST or CLSI. You must clean your MIC values first using [as.mic()], that also gives your columns the new data class [`mic`]. Also, be sure to have a column with microorganism names or codes. It will be found automatically, but can be set manually using the `mo` argument. +#' * Using `dplyr`, SIR interpretation can be done very easily with either: +#' ```r +#' your_data %>% mutate_if(is.mic, as.sir) +#' your_data %>% mutate(across(where(is.mic), as.sir)) +#' your_data %>% mutate_if(is.mic, as.sir, ab = "column_with_antibiotics", mo = "column_with_microorganisms") +#' your_data %>% mutate_if(is.mic, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...)) +#' +#' # for veterinary breakpoints, also set `host`: +#' your_data %>% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI") +#' ``` +#' * Operators like "<=" will be stripped before interpretation. When using `conserve_capped_values = TRUE`, an MIC value of e.g. ">2" will always return "R", even if the breakpoint according to the chosen guideline is ">=4". This is to prevent that capped values from raw laboratory data would not be treated conservatively. The default behaviour (`conserve_capped_values = FALSE`) considers ">2" to be lower than ">=4" and might in this case return "S" or "I". +#' 3. For **interpreting disk diffusion diameters** according to EUCAST or CLSI. You must clean your disk zones first using [as.disk()], that also gives your columns the new data class [`disk`]. Also, be sure to have a column with microorganism names or codes. It will be found automatically, but can be set manually using the `mo` argument. +#' * Using `dplyr`, SIR interpretation can be done very easily with either: +#' ```r +#' your_data %>% mutate_if(is.disk, as.sir) +#' your_data %>% mutate(across(where(is.disk), as.sir)) +#' your_data %>% mutate_if(is.disk, as.sir, ab = "column_with_antibiotics", mo = "column_with_microorganisms") +#' your_data %>% mutate_if(is.disk, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...)) +#' +#' # for veterinary breakpoints, also set `host`: +#' your_data %>% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI") +#' ``` +#' 4. For **interpreting a complete data set**, with automatic determination of MIC values, disk diffusion diameters, microorganism names or codes, and antimicrobial test results. This is done very simply by running `as.sir(your_data)`. +#' +#' **For points 2, 3 and 4: Use [sir_interpretation_history()]** to retrieve a [data.frame] (or [tibble][tibble::tibble()] if the `tibble` package is installed) with all results of the last [as.sir()] call. +#' +#' ### Supported Guidelines +#' +#' For interpreting MIC values as well as disk diffusion diameters, currently implemented guidelines are for **clinical microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`, and for **veterinary microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`. +#' +#' Thus, the `guideline` argument must be set to e.g., ``r paste0('"', subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline[1], '"')`` or ``r paste0('"', subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline[1], '"')``. By simply using `"EUCAST"` (the default) or `"CLSI"` as input, the latest included version of that guideline will automatically be selected. You can set your own data set using the `reference_data` argument. The `guideline` argument will then be ignored. +#' +#' You can set the default guideline with the package option [`AMR_guideline`][AMR-options] (e.g. in your `.Rprofile` file), such as: +#' +#' ``` +#' options(AMR_guideline = "CLSI") +#' options(AMR_guideline = "CLSI 2018") +#' options(AMR_guideline = "EUCAST 2020") +#' # or to reset: +#' options(AMR_guideline = NULL) +#' ``` +#' +#' For veterinary guidelines, these might be the best options: +#' +#' ``` +#' options(AMR_guideline = "CLSI") +#' options(AMR_breakpoint_type = "animal") +#' ``` +#' +#' When applying veterinary breakpoints (by setting `host` or by setting `breakpoint_type = "animal"`), the [CLSI VET09 guideline](https://clsi.org/standards/products/veterinary-medicine/documents/vet09/) will be applied to cope with missing animal species-specific breakpoints. +#' +#' ### After Interpretation +#' +#' After using [as.sir()], you can use the [eucast_rules()] defined by EUCAST to (1) apply inferred susceptibility and resistance based on results of other antimicrobials and (2) apply intrinsic resistance based on taxonomic properties of a microorganism. +#' +#' To determine which isolates are multi-drug resistant, be sure to run [mdro()] (which applies the MDR/PDR/XDR guideline from 2012 at default) on a data set that contains S/I/R values. Read more about [interpreting multidrug-resistant organisms here][mdro()]. +#' +#' ### Machine-Readable Clinical Breakpoints +#' +#' The repository of this package [contains a machine-readable version](https://github.com/msberends/AMR/blob/main/data-raw/clinical_breakpoints.txt) of all guidelines. This is a CSV file consisting of `r format(nrow(AMR::clinical_breakpoints), big.mark = " ")` rows and `r ncol(AMR::clinical_breakpoints)` columns. This file is machine-readable, since it contains one row for every unique combination of the test method (MIC or disk diffusion), the antimicrobial drug and the microorganism. **This allows for easy implementation of these rules in laboratory information systems (LIS)**. Note that it only contains interpretation guidelines for humans - interpretation guidelines from CLSI for animals were removed. +#' +#' ### Other +#' +#' The function [is.sir()] detects if the input contains class `sir`. If the input is a [data.frame], it iterates over all columns and returns a [logical] vector. +#' +#' The base R function [as.double()] can be used to retrieve quantitative values from a `sir` object: `"S"` = 1, `"I"`/`"SDD"` = 2, `"R"` = 3. All other values are rendered `NA` . **Note:** Do not use `as.integer()`, since that (because of how R works internally) will return the factor level indices, and not these aforementioned quantitative values. +#' +#' The function [is_sir_eligible()] returns `TRUE` when a column contains at most 5% invalid antimicrobial interpretations (not S and/or I and/or R and/or NI and/or SDD), and `FALSE` otherwise. The threshold of 5% can be set with the `threshold` argument. If the input is a [data.frame], it iterates over all columns and returns a [logical] vector. +#' @section Interpretation of SIR: +#' In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (): +#' +#' - **S - Susceptible, standard dosing regimen**\cr +#' A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +#' - **I - Susceptible, increased exposure** *\cr +#' A microorganism is categorised as "Susceptible, Increased exposure*" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +#' - **R = Resistant**\cr +#' A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +#' +#' * *Exposure* is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +#' +#' This AMR package honours this insight. Use [susceptibility()] (equal to [proportion_SI()]) to determine antimicrobial susceptibility and [count_susceptible()] (equal to [count_SI()]) to count susceptible isolates. +#' @return Ordered [factor] with new class `sir` +#' @aliases sir +#' @export +#' @seealso [as.mic()], [as.disk()], [as.mo()] +#' @source +#' For interpretations of minimum inhibitory concentration (MIC) values and disk diffusion diameters: +#' +#' - **CLSI M39: Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). . +#' - **CLSI M100: Performance Standard for Antimicrobial Susceptibility Testing**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type != "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type != "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). . +#' - **CLSI VET01: Performance Standards for Antimicrobial Disk and Dilution Susceptibility Tests for Bacteria Isolated From Animals**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). . +#' - **CLSI VET09: Understanding Susceptibility Test Data as a Component of Antimicrobial Stewardship in Veterinary Settings**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). . +#' - **EUCAST Breakpoint tables for interpretation of MICs and zone diameters**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`, *European Committee on Antimicrobial Susceptibility Testing* (EUCAST). . +#' - **WHONET** as a source for machine-reading the clinical breakpoints ([read more here](https://msberends.github.io/AMR/reference/clinical_breakpoints.html#imported-from-whonet)), 1989-`r max(as.integer(gsub("[^0-9]", "", AMR::clinical_breakpoints$guideline)))`, *WHO Collaborating Centre for Surveillance of Antimicrobial Resistance*. . +#' +#' @inheritSection AMR Reference Data Publicly Available +#' @examples +#' example_isolates +#' summary(example_isolates) # see all SIR results at a glance +#' +#' # For INTERPRETING disk diffusion and MIC values ----------------------- +#' +#' # example data sets, with combined MIC values and disk zones +#' df_wide <- data.frame( +#' microorganism = "Escherichia coli", +#' amoxicillin = as.mic(8), +#' cipro = as.mic(0.256), +#' tobra = as.disk(16), +#' genta = as.disk(18), +#' ERY = "R" +#' ) +#' df_long <- data.frame( +#' bacteria = rep("Escherichia coli", 4), +#' antibiotic = c("amoxicillin", "cipro", "tobra", "genta"), +#' mics = as.mic(c(0.01, 1, 4, 8)), +#' disks = as.disk(c(6, 10, 14, 18)) +#' ) +#' +#' \donttest{ +#' ## Using dplyr ------------------------------------------------- +#' if (require("dplyr")) { +#' # approaches that all work without additional arguments: +#' df_wide %>% mutate_if(is.mic, as.sir) +#' df_wide %>% mutate_if(function(x) is.mic(x) | is.disk(x), as.sir) +#' df_wide %>% mutate(across(where(is.mic), as.sir)) +#' df_wide %>% mutate_at(vars(amoxicillin:tobra), as.sir) +#' df_wide %>% mutate(across(amoxicillin:tobra, as.sir)) +#' +#' # approaches that all work with additional arguments: +#' df_long %>% +#' # given a certain data type, e.g. MIC values +#' mutate_if(is.mic, as.sir, +#' mo = "bacteria", +#' ab = "antibiotic", +#' guideline = "CLSI") +#' df_long %>% +#' mutate(across(where(is.mic), +#' function(x) as.sir(x, +#' mo = "bacteria", +#' ab = "antibiotic", +#' guideline = "CLSI"))) +#' df_wide %>% +#' # given certain columns, e.g. from 'cipro' to 'genta' +#' mutate_at(vars(cipro:genta), as.sir, +#' mo = "bacteria", +#' guideline = "CLSI") +#' df_wide %>% +#' mutate(across(cipro:genta, +#' function(x) as.sir(x, +#' mo = "bacteria", +#' guideline = "CLSI"))) +#' +#' # for veterinary breakpoints, add 'host': +#' df_long$animal_species <- c("cats", "dogs", "horses", "cattle") +#' df_long %>% +#' # given a certain data type, e.g. MIC values +#' mutate_if(is.mic, as.sir, +#' mo = "bacteria", +#' ab = "antibiotic", +#' host = "animal_species", +#' guideline = "CLSI") +#' df_long %>% +#' mutate(across(where(is.mic), +#' function(x) as.sir(x, +#' mo = "bacteria", +#' ab = "antibiotic", +#' host = "animal_species", +#' guideline = "CLSI"))) +#' df_wide %>% +#' mutate_at(vars(cipro:genta), as.sir, +#' mo = "bacteria", +#' ab = "antibiotic", +#' host = "animal_species", +#' guideline = "CLSI") +#' df_wide %>% +#' mutate(across(cipro:genta, +#' function(x) as.sir(x, +#' mo = "bacteria", +#' host = "animal_species", +#' guideline = "CLSI"))) +#' +#' # to include information about urinary tract infections (UTI) +#' data.frame(mo = "E. coli", +#' nitrofuratoin = c("<= 2", 32), +#' from_the_bladder = c(TRUE, FALSE)) %>% +#' as.sir(uti = "from_the_bladder") +#' +#' data.frame(mo = "E. coli", +#' nitrofuratoin = c("<= 2", 32), +#' specimen = c("urine", "blood")) %>% +#' as.sir() # automatically determines urine isolates +#' +#' df_wide %>% +#' mutate_at(vars(cipro:genta), as.sir, mo = "E. coli", uti = TRUE) +#' } +#' +#' +#' ## Using base R ------------------------------------------------ +#' +#' as.sir(df_wide) +#' +#' # return a 'logbook' about the results: +#' sir_interpretation_history() +#' +#' # for single values +#' as.sir( +#' x = as.mic(2), +#' mo = as.mo("S. pneumoniae"), +#' ab = "AMP", +#' guideline = "EUCAST" +#' ) +#' +#' as.sir( +#' x = as.disk(18), +#' mo = "Strep pneu", # `mo` will be coerced with as.mo() +#' ab = "ampicillin", # and `ab` with as.ab() +#' guideline = "EUCAST" +#' ) +#' +#' +#' # For CLEANING existing SIR values ------------------------------------ +#' +#' as.sir(c("S", "SDD", "I", "R", "NI", "A", "B", "C")) +#' as.sir("<= 0.002; S") # will return "S" +#' sir_data <- as.sir(c(rep("S", 474), rep("I", 36), rep("R", 370))) +#' is.sir(sir_data) +#' plot(sir_data) # for percentages +#' barplot(sir_data) # for frequencies +#' +#' # as common in R, you can use as.integer() to return factor indices: +#' as.integer(as.sir(c("S", "SDD", "I", "R", "NI", NA))) +#' # but for computational use, as.double() will return 1 for S, 2 for I/SDD, and 3 for R: +#' as.double(as.sir(c("S", "SDD", "I", "R", "NI", NA))) +#' +#' # the dplyr way +#' if (require("dplyr")) { +#' example_isolates %>% +#' mutate_at(vars(PEN:RIF), as.sir) +#' # same: +#' example_isolates %>% +#' as.sir(PEN:RIF) +#' +#' # fastest way to transform all columns with already valid AMR results to class `sir`: +#' example_isolates %>% +#' mutate_if(is_sir_eligible, as.sir) +#' +#' # since dplyr 1.0.0, this can also be: +#' # example_isolates %>% +#' # mutate(across(where(is_sir_eligible), as.sir)) +#' } +#' } +as.sir <- function(x, ...) { + UseMethod("as.sir") +} + +as_sir_structure <- function(x, + guideline = NULL, + mo = NULL, + ab = NULL, + method = NULL, + ref_tbl = NULL, + ref_breakpoints = NULL) { + out <- structure(factor(as.character(unlist(unname(x))), + levels = c("S", "SDD", "I", "R", "NI"), + ordered = TRUE), + guideline = guideline, + mo = mo, + ab = ab, + method = method, + ref_tbl = ref_tbl, + ref_breakpoints = ref_breakpoints, + class = c("sir", "ordered", "factor")) +} + +#' @rdname as.sir +#' @details `NA_sir_` is a missing value of the new `sir` class, analogous to e.g. base \R's [`NA_character_`][base::NA]. +#' @format NULL +#' @export +NA_sir_ <- as_sir_structure(NA_character_) + +#' @rdname as.sir +#' @export +is.sir <- function(x) { + if (inherits(x, "data.frame")) { + unname(vapply(FUN.VALUE = logical(1), x, is.sir)) + } else { + isTRUE(inherits(x, "sir")) + } +} + +#' @rdname as.sir +#' @export +is_sir_eligible <- function(x, threshold = 0.05) { + meet_criteria(threshold, allow_class = "numeric", has_length = 1) + + if (inherits(x, "data.frame")) { + # iterate this function over all columns + return(unname(vapply(FUN.VALUE = logical(1), x, is_sir_eligible))) + } + + stop_if(NCOL(x) > 1, "`x` must be a one-dimensional vector.") + if (any(c( + "numeric", + "integer", + "mo", + "ab", + "Date", + "POSIXt", + "raw", + "hms", + "mic", + "disk" + ) + %in% class(x))) { + # no transformation needed + return(FALSE) + } else if (all(x %in% c("S", "SDD", "I", "R", "NI", NA)) & !all(is.na(x))) { + return(TRUE) + } else if (!any(c("S", "SDD", "I", "R", "NI") %in% x, na.rm = TRUE) & !all(is.na(x))) { + return(FALSE) + } else { + x <- x[!is.na(x) & !is.null(x) & !x %in% c("", "-", "NULL")] + if (length(x) == 0) { + # no other values than empty + cur_col <- get_current_column() + if (!is.null(cur_col)) { + ab <- suppressWarnings(as.ab(cur_col, fast_mode = TRUE, info = FALSE)) + if (!is.na(ab)) { + # this is a valid antibiotic drug code + message_( + "Column '", font_bold(cur_col), "' is SIR eligible (despite only having empty values), since it seems to be ", + ab_name(ab, language = NULL, tolower = TRUE), " (", ab, ")" + ) + return(TRUE) + } + } + # all values empty and no antibiotic col name - return FALSE + return(FALSE) + } + # transform all values and see if it meets the set threshold + checked <- suppressWarnings(as.sir(x)) + outcome <- sum(is.na(checked)) / length(x) + outcome <= threshold + } +} + +#' @rdname as.sir +#' @export +#' @param S,I,R,NI,SDD a case-independent [regular expression][base::regex] to translate input to this result. This regular expression will be run *after* all non-letters and whitespaces are removed from the input. +# extra param: warn (logical, to never throw a warning) +as.sir.default <- function(x, + S = "^(S|U)+$", + I = "^(I)+$", + R = "^(R)+$", + NI = "^(N|NI|V)+$", + SDD = "^(SDD|D|H)+$", + ...) { + if (inherits(x, "sir")) { + return(as_sir_structure(x)) + } + + x.bak <- x + x <- as.character(x) # this is needed to prevent the vctrs pkg from throwing an error + + if (inherits(x.bak, c("numeric", "integer")) && all(x %in% c(1:3, NA))) { + # support haven package for importing e.g., from SPSS - it adds the 'labels' attribute + lbls <- attributes(x.bak)$labels + if (!is.null(lbls) && all(c("S", "I", "R") %in% names(lbls)) && all(c(1:3) %in% lbls)) { + x[x.bak == 1] <- names(lbls[lbls == 1]) + x[x.bak == 2] <- names(lbls[lbls == 2]) + x[x.bak == 3] <- names(lbls[lbls == 3]) + } else { + x[x.bak == 1] <- "S" + x[x.bak == 2] <- "I" + x[x.bak == 3] <- "R" + } + } else if (inherits(x.bak, "character") && all(x %in% c("1", "2", "3", "S", "I", "R", NA_character_))) { + x[x.bak == "1"] <- "S" + x[x.bak == "2"] <- "I" + x[x.bak == "3"] <- "R" + } else if (inherits(x.bak, "character") && all(x %in% c("1", "2", "3", "4", "5", "S", "SDD", "I", "R", "NI", NA_character_))) { + x[x.bak == "1"] <- "S" + x[x.bak == "2"] <- "SDD" + x[x.bak == "3"] <- "I" + x[x.bak == "4"] <- "R" + x[x.bak == "5"] <- "NI" + } else if (!all(is.na(x)) && !identical(levels(x), c("S", "SDD", "I", "R", "NI")) && !all(x %in% c("S", "SDD", "I", "R", "NI", NA))) { + if (all(x %unlike% "(S|I|R)", na.rm = TRUE)) { + # check if they are actually MICs or disks + if (all_valid_mics(x)) { + warning_("in `as.sir()`: the input seems to contain MIC values. You can transform them with `as.mic()` before running `as.sir()` to interpret them.") + } else if (all_valid_disks(x)) { + warning_("in `as.sir()`: the input seems to contain disk diffusion values. You can transform them with `as.disk()` before running `as.sir()` to interpret them.") + } + } + + # trim leading and trailing spaces, new lines, etc. + x <- trimws2(as.character(unlist(x))) + x[x %in% c(NA, "", "-", "NULL")] <- NA_character_ + x.bak <- x + + na_before <- length(x[is.na(x)]) + + # correct for translations + trans_R <- unlist(TRANSLATIONS[ + which(TRANSLATIONS$pattern == "Resistant"), + LANGUAGES_SUPPORTED[LANGUAGES_SUPPORTED %in% colnames(TRANSLATIONS)] + ]) + trans_S <- unlist(TRANSLATIONS[ + which(TRANSLATIONS$pattern == "Susceptible"), + LANGUAGES_SUPPORTED[LANGUAGES_SUPPORTED %in% colnames(TRANSLATIONS)] + ]) + trans_I <- unlist(TRANSLATIONS[ + which(TRANSLATIONS$pattern %in% c("Incr. exposure", "Susceptible, incr. exp.", "Intermediate")), + LANGUAGES_SUPPORTED[LANGUAGES_SUPPORTED %in% colnames(TRANSLATIONS)] + ]) + x <- gsub(paste0(unique(trans_R[!is.na(trans_R)]), collapse = "|"), "R", x, ignore.case = TRUE) + x <- gsub(paste0(unique(trans_S[!is.na(trans_S)]), collapse = "|"), "S", x, ignore.case = TRUE) + x <- gsub(paste0(unique(trans_I[!is.na(trans_I)]), collapse = "|"), "I", x, ignore.case = TRUE) + # replace all English textual input + x[x %like% "([^a-z]|^)res(is(tant)?)?"] <- "R" + x[x %like% "([^a-z]|^)sus(cep(tible)?)?"] <- "S" + x[x %like% "not|non"] <- "NI" + x[x %like% "([^a-z]|^)int(er(mediate)?)?|incr.*exp"] <- "I" + x[x %like% "dose"] <- "SDD" + x <- gsub("[^A-Z]+", "", x, perl = TRUE) + # apply regexes set by user + x[x %like% S] <- "S" + x[x %like% I] <- "I" + x[x %like% R] <- "R" + x[x %like% NI] <- "NI" + x[x %like% SDD] <- "SDD" + x[!x %in% c("S", "SDD", "I", "R", "NI")] <- NA_character_ + na_after <- length(x[is.na(x) | x == ""]) + + if (!isFALSE(list(...)$warn)) { # so as.sir(..., warn = FALSE) will never throw a warning + if (na_before != na_after) { + list_missing <- x.bak[is.na(x) & !is.na(x.bak) & x.bak != ""] %pm>% + unique() %pm>% + sort() %pm>% + vector_and(quotes = TRUE) + cur_col <- get_current_column() + warning_("in `as.sir()`: ", na_after - na_before, " result", + ifelse(na_after - na_before > 1, "s", ""), + ifelse(is.null(cur_col), "", paste0(" in column '", cur_col, "'")), + " truncated (", + round(((na_after - na_before) / length(x)) * 100), + "%) that were invalid antimicrobial interpretations: ", + list_missing, + call = FALSE + ) + } + } + } + + as_sir_structure(x) +} + +#' @rdname as.sir +#' @export +as.sir.mic <- function(x, + mo = NULL, + ab = deparse(substitute(x)), + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + conserve_capped_values = FALSE, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE, + ...) { + as_sir_method( + method_short = "mic", + method_long = "MIC values", + x = x, + mo = mo, + ab = ab, + guideline = guideline, + uti = uti, + conserve_capped_values = conserve_capped_values, + add_intrinsic_resistance = add_intrinsic_resistance, + reference_data = reference_data, + include_screening = include_screening, + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + host = host, + verbose = verbose, + ... + ) +} + +#' @rdname as.sir +#' @export +as.sir.disk <- function(x, + mo = NULL, + ab = deparse(substitute(x)), + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE, + ...) { + as_sir_method( + method_short = "disk", + method_long = "disk diffusion zones", + x = x, + mo = mo, + ab = ab, + guideline = guideline, + uti = uti, + conserve_capped_values = FALSE, + add_intrinsic_resistance = add_intrinsic_resistance, + reference_data = reference_data, + include_screening = include_screening, + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + host = NULL, + verbose = verbose, + ... + ) +} + +#' @rdname as.sir +#' @export +as.sir.data.frame <- function(x, + ..., + col_mo = NULL, + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + conserve_capped_values = FALSE, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE) { + meet_criteria(x, allow_class = "data.frame") # will also check for dimensions > 0 + meet_criteria(col_mo, allow_class = "character", is_in = colnames(x), allow_NULL = TRUE) + meet_criteria(guideline, allow_class = "character", has_length = 1) + meet_criteria(uti, allow_class = c("logical", "character"), allow_NULL = TRUE, allow_NA = TRUE) + meet_criteria(conserve_capped_values, allow_class = "logical", has_length = 1) + meet_criteria(add_intrinsic_resistance, allow_class = "logical", has_length = 1) + meet_criteria(reference_data, allow_class = "data.frame") + meet_criteria(include_screening, allow_class = "logical", has_length = 1) + meet_criteria(include_PKPD, allow_class = "logical", has_length = 1) + meet_criteria(breakpoint_type, allow_class = "character", is_in = reference_data$type, has_length = 1) + meet_criteria(host, allow_class = c("character", "factor"), allow_NULL = TRUE, allow_NA = TRUE) + meet_criteria(verbose, allow_class = "logical", has_length = 1) + x.bak <- x + for (i in seq_len(ncol(x))) { + # don't keep factors, overwriting them is hard + if (is.factor(x[, i, drop = TRUE])) { + x[, i] <- as.character(x[, i, drop = TRUE]) + } + } + + # -- MO + col_mo.bak <- col_mo + if (is.null(col_mo)) { + col_mo <- search_type_in_df(x = x, type = "mo", info = FALSE) + } + + # -- host + if (missing(breakpoint_type) && any(host %in% clinical_breakpoints$host[!clinical_breakpoints$host %in% c("human", "ECOFF")], na.rm = TRUE)) { + message_("Assuming `breakpoint_type = \"animal\"` since `host` contains animal species.") + breakpoint_type <- "animal" + } else if (any(!suppressMessages(convert_host(host)) %in% c("human", "ECOFF"), na.rm = TRUE)) { + message_("Assuming `breakpoint_type = \"animal\"`.") + breakpoint_type <- "animal" + } + if (breakpoint_type == "animal") { + if (is.null(host)) { + host <- search_type_in_df(x = x, type = "host", add_col_prefix = FALSE) + } else if (length(host) == 1 && as.character(host) %in% colnames(x)) { + host <- x[[as.character(host)]] + } + } else { + host <- breakpoint_type + } + + # -- UTIs + col_uti <- uti + if (is.null(col_uti)) { + col_uti <- search_type_in_df(x = x, type = "uti", add_col_prefix = FALSE) + } + if (!is.null(col_uti)) { + if (is.logical(col_uti)) { + # already a logical vector as input + if (length(col_uti) == 1) { + uti <- rep(col_uti, NROW(x)) + } else { + uti <- col_uti + } + } else { + # column found, transform to logical + stop_if( + length(col_uti) != 1 | !col_uti %in% colnames(x), + "argument `uti` must be a [logical] vector, of must be a single column name of `x`" + ) + uti <- as.logical(x[, col_uti, drop = TRUE]) + } + } else { + # col_uti is still NULL - look for specimen column and make logicals of the urines + col_specimen <- suppressMessages(search_type_in_df(x = x, type = "specimen")) + if (!is.null(col_specimen)) { + uti <- x[, col_specimen, drop = TRUE] %like% "urin" + values <- sort(unique(x[uti, col_specimen, drop = TRUE])) + if (length(values) > 1) { + plural <- c("s", "", "") + } else { + plural <- c("", "s", "a ") + } + message_( + "Assuming value", plural[1], " ", + vector_and(values, quotes = TRUE), + " in column '", font_bold(col_specimen), + "' reflect", plural[2], " ", plural[3], "urinary tract infection", plural[1], + ".\n Use `as.sir(uti = FALSE)` to prevent this." + ) + } else { + # no data about UTI's found + uti <- NULL + } + } + + i <- 0 + if (tryCatch(length(list(...)) > 0, error = function(e) TRUE)) { + sel <- colnames(pm_select(x, ...)) + } else { + sel <- colnames(x) + } + if (!is.null(col_mo)) { + sel <- sel[sel != col_mo] + } + + ab_cols <- colnames(x)[vapply(FUN.VALUE = logical(1), x, function(y) { + i <<- i + 1 + check <- is.mic(y) | is.disk(y) + ab <- colnames(x)[i] + if (!is.null(col_mo) && ab == col_mo) { + return(FALSE) + } + if (!is.null(col_uti) && ab == col_uti) { + return(FALSE) + } + if (length(sel) == 0 || (length(sel) > 0 && ab %in% sel)) { + ab_coerced <- suppressWarnings(as.ab(ab)) + if (is.na(ab_coerced) || (length(sel) > 0 & !ab %in% sel)) { + # not even a valid AB code + return(FALSE) + } else { + return(TRUE) + } + } else { + return(FALSE) + } + })] + + stop_if( + length(ab_cols) == 0, + "no columns with MIC values, disk zones or antibiotic column names found in this data set. Use as.mic() or as.disk() to transform antimicrobial columns." + ) + # set type per column + types <- character(length(ab_cols)) + types[vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.disk)] <- "disk" + types[vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.mic)] <- "mic" + types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_disks)] <- "disk" + types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_mics)] <- "mic" + types[types == "" & !vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.sir)] <- "sir" + if (any(types %in% c("mic", "disk"), na.rm = TRUE)) { + # now we need an mo column + stop_if(is.null(col_mo), "`col_mo` must be set") + # if not null, we already found it, now find again so a message will show + if (is.null(col_mo.bak)) { + col_mo <- search_type_in_df(x = x, type = "mo") + } + x_mo <- as.mo(x[, col_mo, drop = TRUE]) + } + + for (i in seq_len(length(ab_cols))) { + if (types[i] == "mic") { + x[, ab_cols[i]] <- x %pm>% + pm_pull(ab_cols[i]) %pm>% + as.character() %pm>% + as.mic() %pm>% + as.sir( + mo = x_mo, + mo.bak = x[, col_mo, drop = TRUE], + ab = ab_cols[i], + guideline = guideline, + uti = uti, + conserve_capped_values = conserve_capped_values, + add_intrinsic_resistance = add_intrinsic_resistance, + reference_data = reference_data, + include_screening = include_screening, + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + host = host, + verbose = verbose, + is_data.frame = TRUE + ) + } else if (types[i] == "disk") { + x[, ab_cols[i]] <- x %pm>% + pm_pull(ab_cols[i]) %pm>% + as.character() %pm>% + as.disk() %pm>% + as.sir( + mo = x_mo, + mo.bak = x[, col_mo, drop = TRUE], + ab = ab_cols[i], + guideline = guideline, + uti = uti, + add_intrinsic_resistance = add_intrinsic_resistance, + reference_data = reference_data, + include_screening = include_screening, + include_PKPD = include_PKPD, + breakpoint_type = breakpoint_type, + host = host, + verbose = verbose, + is_data.frame = TRUE + ) + } else if (types[i] == "sir") { + show_message <- FALSE + ab <- ab_cols[i] + ab_coerced <- suppressWarnings(as.ab(ab)) + if (!all(x[, ab_cols[i], drop = TRUE] %in% c("S", "SDD", "I", "R", "NI", NA), na.rm = TRUE)) { + show_message <- TRUE + # only print message if values are not already clean + message_("Cleaning values in column '", font_bold(ab), "' (", + ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""), + ab_name(ab_coerced, tolower = TRUE), ")... ", + appendLF = FALSE, + as_note = FALSE + ) + } else if (!is.sir(x.bak[, ab_cols[i], drop = TRUE])) { + show_message <- TRUE + # only print message if class not already set + message_("Assigning class 'sir' to already clean column '", font_bold(ab), "' (", + ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""), + ab_name(ab_coerced, tolower = TRUE, language = NULL), ")... ", + appendLF = FALSE, + as_note = FALSE + ) + } + x[, ab_cols[i]] <- as.sir.default(x = as.character(x[, ab_cols[i], drop = TRUE])) + if (show_message == TRUE) { + message(font_green_bg(" OK ")) + } + } + } + + x +} + +get_guideline <- function(guideline, reference_data) { + if (!identical(reference_data, AMR::clinical_breakpoints)) { + return(guideline) + } + guideline_param <- toupper(guideline) + if (guideline_param %in% c("CLSI", "EUCAST")) { + guideline_param <- rev(sort(subset(reference_data, guideline %like% guideline_param)$guideline))[1L] + } + if (guideline_param %unlike% " ") { + # like 'EUCAST2020', should be 'EUCAST 2020' + guideline_param <- gsub("([a-z]+)([0-9]+)", "\\1 \\2", guideline_param, ignore.case = TRUE) + } + + stop_ifnot(guideline_param %in% reference_data$guideline, + "invalid guideline: '", guideline, + "'.\nValid guidelines are: ", vector_and(reference_data$guideline, quotes = TRUE, reverse = TRUE), + call = FALSE + ) + + guideline_param +} + +convert_host <- function(x, lang = get_AMR_locale()) { + x <- gsub("[^a-zA-Z ]", "", trimws2(tolower(as.character(x))), perl = TRUE) + x_out <- rep(NA_character_, length(x)) + x_out[trimws2(tolower(x)) == "human"] <- "human" + x_out[trimws2(tolower(x)) == "ecoff"] <- "ecoff" + # this order is based on: clinical_breakpoints |> filter(type == "animal") |> count(host, sort = TRUE) + x_out[is.na(x_out) & (x %like% "dog|canine|Canis lupus" | x %like% translate_AMR("dog|dogs|canine", lang))] <- "dogs" + x_out[is.na(x_out) & (x %like% "cattle|bovine|Bos taurus" | x %like% translate_AMR("cattle|bovine", lang))] <- "cattle" + x_out[is.na(x_out) & (x %like% "swine|suida(e)?|Sus scrofa" | x %like% translate_AMR("swine|swines", lang))] <- "swine" + x_out[is.na(x_out) & (x %like% "cat|feline|Felis catus" | x %like% translate_AMR("cat|cats|feline", lang))] <- "cats" + x_out[is.na(x_out) & (x %like% "horse|equine|Equus ferus" | x %like% translate_AMR("horse|horses|equine", lang))] <- "horse" + x_out[is.na(x_out) & (x %like% "aqua|fish|Pisces" | x %like% translate_AMR("aquatic|fish", lang))] <- "aquatic" + x_out[is.na(x_out) & (x %like% "bird|chicken|poultry|avia|Gallus gallus" | x %like% translate_AMR("bird|birds|poultry", lang))] <- "poultry" + + # additional animals, not necessarily currently in breakpoint guidelines: + x_out[is.na(x_out) & (x %like% "camel|camelid|Camelus dromedarius" | x %like% translate_AMR("camel|camels|camelid", lang))] <- "camels" + x_out[is.na(x_out) & (x %like% "deer|cervine|Cervidae" | x %like% translate_AMR("deer|deers|cervine", lang))] <- "deer" + x_out[is.na(x_out) & (x %like% "donkey|asinine|Equus africanus" | x %like% translate_AMR("donkey|donkeys|asinine", lang))] <- "donkeys" + x_out[is.na(x_out) & (x %like% "ferret|musteline|Mustela putorius" | x %like% translate_AMR("ferret|ferrets|musteline", lang))] <- "ferrets" + x_out[is.na(x_out) & (x %like% "goat|caprine|Capra aegagrus" | x %like% translate_AMR("goat|goats|caprine", lang))] <- "goats" + x_out[is.na(x_out) & (x %like% "guinea pig|caviine|Cavia porcellus" | x %like% translate_AMR("guinea pig|guinea pigs|caviine", lang))] <- "guinea pigs" + x_out[is.na(x_out) & (x %like% "hamster|cricetine|Cricetinae" | x %like% translate_AMR("hamster|hamsters|cricetine", lang))] <- "hamsters" + x_out[is.na(x_out) & (x %like% "monkey|simian|Simia" | x %like% translate_AMR("monkey|monkeys|simian", lang))] <- "monkeys" + x_out[is.na(x_out) & (x %like% "mouse|murine|Mus musculus" | x %like% translate_AMR("mouse|mice|murine", lang))] <- "mice" + x_out[is.na(x_out) & (x %like% "pig|porcine|Sus scrofa" | x %like% translate_AMR("pig|pigs|porcine", lang))] <- "pigs" + x_out[is.na(x_out) & (x %like% "rabbit|leporine|Oryctolagus cuniculus" | x %like% translate_AMR("rabbit|rabbits|leporine", lang))] <- "rabbits" + x_out[is.na(x_out) & (x %like% "rat|ratine|Rattus" | x %like% translate_AMR("rat|rats|ratine", lang))] <- "rats" + x_out[is.na(x_out) & (x %like% "sheep|ovine|Ovis aries" | x %like% translate_AMR("sheep|sheeps|ovine", lang))] <- "sheep" + x_out[is.na(x_out) & (x %like% "snake|serpentine|Serpentes" | x %like% translate_AMR("snake|snakes|serpentine", lang))] <- "snakes" + x_out[is.na(x_out) & (x %like% "turkey|meleagrine|Meleagris gallopavo" | x %like% translate_AMR("turkey|turkeys|meleagrine", lang))] <- "turkey" + + + x_out[x_out == "ecoff"] <- "ECOFF" + x_out +} + +as_sir_method <- function(method_short, + method_long, + x, + mo, + ab, + guideline, + uti, + conserve_capped_values, + add_intrinsic_resistance, + reference_data, + include_screening, + include_PKPD, + breakpoint_type, + host, + verbose, + ...) { + meet_criteria(x, allow_NA = TRUE, .call_depth = -2) + meet_criteria(mo, allow_class = c("mo", "character"), has_length = c(1, length(x)), allow_NULL = TRUE, .call_depth = -2) + meet_criteria(ab, allow_class = c("ab", "character"), has_length = c(1, length(x)), .call_depth = -2) + meet_criteria(guideline, allow_class = "character", has_length = 1, .call_depth = -2) + meet_criteria(uti, allow_class = c("logical", "character"), has_length = c(1, length(x)), allow_NULL = TRUE, allow_NA = TRUE, .call_depth = -2) + meet_criteria(conserve_capped_values, allow_class = "logical", has_length = 1, .call_depth = -2) + meet_criteria(add_intrinsic_resistance, allow_class = "logical", has_length = 1, .call_depth = -2) + meet_criteria(reference_data, allow_class = "data.frame", .call_depth = -2) + meet_criteria(include_screening, allow_class = "logical", has_length = 1, .call_depth = -2) + meet_criteria(include_PKPD, allow_class = "logical", has_length = 1, .call_depth = -2) + check_reference_data(reference_data, .call_depth = -2) + meet_criteria(breakpoint_type, allow_class = "character", is_in = reference_data$type, has_length = 1, .call_depth = -2) + meet_criteria(host, allow_class = c("character", "factor"), allow_NULL = TRUE, allow_NA = TRUE, .call_depth = -2) + meet_criteria(verbose, allow_class = "logical", has_length = 1, .call_depth = -2) + + # backward compatibilty + dots <- list(...) + dots <- dots[which(!names(dots) %in% c("warn", "mo.bak", "is_data.frame"))] + if (length(dots) != 0) { + warning_("These arguments in `as.sir()` are no longer used: ", vector_and(names(dots), quotes = "`"), ".", call = FALSE) + } + + guideline_coerced <- get_guideline(guideline, reference_data) + + if (message_not_thrown_before("as.sir", "sir_interpretation_history")) { + message() + message_("Run `sir_interpretation_history()` afterwards to retrieve a logbook with all the details of the breakpoint interpretations.\n\n", add_fn = font_green) + } + + current_df <- tryCatch(get_current_data(NA, 0), error = function(e) NULL) + + # get host + if (breakpoint_type == "animal") { + if (is.null(host)) { + host <- "dogs" + if (message_not_thrown_before("as.sir", "host_missing")) { + message_("Animal hosts not set in `host`, assuming `host = \"dogs\"`, since these have the highest breakpoint availability.\n\n") + } + } + } else { + if (!is.null(host) && !all(toupper(as.character(host)) %in% c("HUMAN", "ECOFF"))) { + if (message_not_thrown_before("as.sir", "assumed_breakpoint_animal")) { + message_("Assuming `breakpoint_type = \"animal\"`, since `host` is set.", ifelse(guideline_coerced %like% "EUCAST", " Do you also need to set `guideline = \"CLSI\"`?", ""), "\n\n") + } + breakpoint_type <- "animal" + } else { + host <- breakpoint_type + } + } + + if (!is.null(host) && !all(toupper(as.character(host)) %in% c("HUMAN", "ECOFF"))) { + if (!is.null(current_df) && length(host) == 1 && host %in% colnames(current_df) && any(current_df[[host]] %like% "[A-Z]", na.rm = TRUE)) { + host <- current_df[[host]] + } else if (length(host) != length(x)) { + # for dplyr's across() + cur_column_dplyr <- import_fn("cur_column", "dplyr", error_on_fail = FALSE) + if (!is.null(cur_column_dplyr) && is.data.frame(current_df)) { + # try to get current column, which will only be available when in across() + host <- tryCatch(cur_column_dplyr(), + error = function(e) host + ) + } + } + } + host.bak <- host + host <- convert_host(host) + if (any(is.na(host) & !is.na(host.bak)) && message_not_thrown_before("as.sir", "missing_hosts")) { + warning_("The following animal host(s) could not be coerced: ", vector_and(host.bak[is.na(host) & !is.na(host.bak)]), immediate = TRUE) + message() # new line + } + # TODO add a switch to turn this off? In interactive sessions perhaps ask the user. Default should be On. + if (breakpoint_type == "animal" && message_not_thrown_before("as.sir", "host_missing_breakpoints")) { + if (guideline_coerced %like% "CLSI") { + message_("Please note that in the absence of specific veterinary breakpoints for certain animal hosts, the CLSI guideline VET09 will be applied where possible.\n\n") + } + } + + # get ab + if (!is.null(current_df) && length(ab) == 1 && ab %in% colnames(current_df) && any(current_df[[ab]] %like% "[A-Z]", na.rm = TRUE)) { + ab <- current_df[[ab]] + } else if (length(ab) != length(x)) { + # for dplyr's across() + cur_column_dplyr <- import_fn("cur_column", "dplyr", error_on_fail = FALSE) + if (!is.null(cur_column_dplyr) && is.data.frame(current_df)) { + # try to get current column, which will only be available when in across() + ab <- tryCatch(cur_column_dplyr(), + error = function(e) ab + ) + } + } + + # get mo + if (!is.null(current_df) && length(mo) == 1 && mo %in% colnames(current_df)) { + mo_var_found <- paste0(" based on column '", font_bold(mo), "'") + mo <- current_df[[mo]] + } else if (length(mo) != length(x)) { + mo_var_found <- "" + if (is.null(mo)) { + tryCatch( + { + df <- get_current_data(arg_name = "mo", call = -3) # will return an error if not found + mo <- NULL + try( + { + mo <- suppressMessages(search_type_in_df(df, "mo", add_col_prefix = FALSE)) + }, + silent = TRUE + ) + if (!is.null(df) && !is.null(mo) && is.data.frame(df)) { + mo_var_found <- paste0(" based on column '", font_bold(mo), "'") + mo <- df[, mo, drop = TRUE] + } + }, + error = function(e) { + mo <- NULL + } + ) + } + } else { + mo_var_found <- "" + } + if (is.null(mo)) { + stop_("No information was supplied about the microorganisms (missing argument `mo` and no column of class 'mo' found). See ?as.sir.\n\n", + "To transform certain columns with e.g. mutate(), use `data %>% mutate(across(..., as.sir, mo = x))`, where x is your column with microorganisms.\n", + "To transform all ", method_long, " in a data set, use `data %>% as.sir()` or `data %>% mutate_if(is.", method_short, ", as.sir)`.", + call = FALSE + ) + } + + # get uti + if (!is.null(current_df) && length(uti) == 1 && uti %in% colnames(current_df)) { + uti <- current_df[[uti]] + } else if (length(uti) != length(x)) { + if (is.null(uti)) { + tryCatch( + { + df <- get_current_data(arg_name = "uti", call = -3) # will return an error if not found + uti <- NULL + try( + { + uti <- suppressMessages(search_type_in_df(df, "uti", add_col_prefix = FALSE)) + }, + silent = TRUE + ) + if (!is.null(df) && !is.null(uti) && is.data.frame(df)) { + uti <- df[, uti, drop = TRUE] + } + }, + error = function(e) { + uti <- NULL + } + ) + } + } + # TODO set uti to specimen column here + + + if (length(ab) == 1 && ab %like% paste0("as.", method_short)) { + stop_("No unambiguous name was supplied about the antibiotic (argument `ab`). See ?as.sir.", call = FALSE) + } + + ab.bak <- trimws2(ab) + ab <- suppressWarnings(as.ab(ab)) + if (!is.null(list(...)$mo.bak)) { + mo.bak <- list(...)$mo.bak + } else { + mo.bak <- mo + } + mo.bak <- trimws2(mo.bak) + # be sure to take current taxonomy, as the 'clinical_breakpoints' data set only contains current taxonomy + mo <- suppressWarnings(suppressMessages(as.mo(mo, keep_synonyms = FALSE, info = FALSE))) + if (all(is.na(ab))) { + message_("Returning NAs for unknown antibiotic: ", vector_and(ab.bak, sort = FALSE, quotes = TRUE), + ". Rename this column to a valid name or code, and check the output with `as.ab()`.", + add_fn = font_red, + as_note = FALSE + ) + return(as.sir(rep(NA, length(x)))) + } + if (length(mo) == 1) { + mo <- rep(mo, length(x)) + } + if (length(ab) == 1) { + ab <- rep(ab, length(x)) + ab.bak <- rep(ab.bak, length(ab)) + } + if (length(host) == 1) { + host <- rep(host, length(x)) + } + if (is.null(uti)) { + uti <- NA + } + if (length(uti) == 1) { + uti <- rep(uti, length(x)) + } + if (isTRUE(add_intrinsic_resistance) && guideline_coerced %unlike% "EUCAST") { + if (message_not_thrown_before("as.sir", "intrinsic")) { + warning_("in `as.sir()`: using 'add_intrinsic_resistance' is only useful when using EUCAST guidelines, since the rules for intrinsic resistance are based on EUCAST.") + } + } + + # format agents ---- + agent_formatted <- paste0("'", font_bold(ab.bak, collapse = NULL), "'") + agent_name <- ab_name(ab, tolower = TRUE, language = NULL) + same_ab <- generalise_antibiotic_name(ab) == generalise_antibiotic_name(agent_name) + same_ab.bak <- generalise_antibiotic_name(ab.bak) == generalise_antibiotic_name(agent_name) + agent_formatted[same_ab.bak] <- paste0(agent_formatted[same_ab.bak], " (", ab[same_ab.bak], ")") + agent_formatted[!same_ab.bak & !same_ab] <- paste0(agent_formatted[!same_ab.bak & !same_ab], + " (", ifelse(ab.bak[!same_ab.bak & !same_ab] == ab[!same_ab.bak & !same_ab], + "", + paste0(ab[!same_ab.bak & !same_ab], ", ")), + agent_name[!same_ab.bak & !same_ab], + ")") + # this intro text will also be printed in the progress bar if the `progress` package is installed + intro_txt <- paste0("Interpreting ", method_long, ": ", ifelse(isTRUE(list(...)$is_data.frame), "column ", ""), + ifelse(length(unique(agent_formatted)) == 1, unique(agent_formatted), paste0(vector_and(agent_formatted, quotes = FALSE, sort = FALSE))), + mo_var_found, + ifelse(identical(reference_data, AMR::clinical_breakpoints), + paste0(", ", font_bold(guideline_coerced)), + ""), + "... ") + + # prepare used arguments ---- + method <- method_short + + metadata_mo <- get_mo_uncertainties() + + rise_warning <- FALSE + rise_notes <- FALSE + method_coerced <- toupper(method) + ab_coerced <- as.ab(ab) + + if (identical(reference_data, AMR::clinical_breakpoints)) { + breakpoints <- reference_data %pm>% + subset(guideline == guideline_coerced & method == method_coerced & ab %in% ab_coerced) + if (any(ab_coerced == "AMX") && nrow(breakpoints[breakpoints$ab == "AMX", , drop = FALSE]) == 0) { + ab_coerced[ab_coerced == "AMX"] <- "AMP" + breakpoints <- reference_data %pm>% + subset(guideline == guideline_coerced & method == method_coerced & ab %in% ab_coerced) + } + } else { + breakpoints <- reference_data %pm>% + subset(method == method_coerced & ab %in% ab_coerced) + } + + # create the unique data frame to be filled to save time + df <- data.frame( + values = x, + mo = mo, + ab = ab, + result = NA_sir_, + uti = uti, + host = host, + stringsAsFactors = FALSE + ) + if (method == "mic") { + # when as.sir.mic is called directly + df$values <- as.mic(df$values) + } else if (method == "disk") { + # when as.sir.disk is called directly + df$values <- as.disk(df$values) + } + + df_unique <- unique(df[ , c("mo", "ab", "uti", "host"), drop = FALSE]) + + # get all breakpoints, use humans as backup for animals + breakpoint_type_lookup <- breakpoint_type + if (breakpoint_type == "animal") { + breakpoint_type_lookup <- c(breakpoint_type, "human") + } + breakpoints <- breakpoints %pm>% + subset(type %in% breakpoint_type_lookup) + + if (isFALSE(include_screening)) { + # remove screening rules from the breakpoints table + breakpoints <- breakpoints %pm>% + subset(site %unlike% "screen" & ref_tbl %unlike% "screen") + } + if (isFALSE(include_PKPD)) { + # remove PKPD rules from the breakpoints table + breakpoints <- breakpoints %pm>% + subset(mo != "UNKNOWN" & ref_tbl %unlike% "PK.*PD") + } + + notes <- character(0) + + if (guideline_coerced %like% "EUCAST") { + any_is_intrinsic_resistant <- FALSE + add_intrinsic_resistance_to_AMR_env() + } + + if (nrow(df_unique) < 10 || nrow(breakpoints) == 0) { + # only print intro under 10 items, otherwise progressbar will print this and then it will be printed double + message_(intro_txt, appendLF = FALSE, as_note = FALSE) + } + p <- progress_ticker(n = nrow(df_unique), n_min = 10, title = font_blue(intro_txt), only_bar_percent = TRUE) + has_progress_bar <- !is.null(import_fn("progress_bar", "progress", error_on_fail = FALSE)) && nrow(df_unique) >= 10 + on.exit(close(p)) + + if (nrow(breakpoints) == 0) { + # apparently no breakpoints found + message( + paste0(font_rose_bg(" WARNING "), "\n"), + font_black(paste0(" ", AMR_env$bullet_icon, " No ", guideline_coerced, " ", method_coerced, " breakpoints available for ", + suppressMessages(suppressWarnings(ab_name(unique(ab_coerced), language = NULL, tolower = TRUE))), + " (", unique(ab_coerced), ")."), collapse = "\n")) + + load_mo_uncertainties(metadata_mo) + return(rep(NA_sir_, nrow(df))) + } + + vectorise_log_entry <- function(x, len) { + if (length(x) == 1 && len > 1) { + rep(x, len) + } else { + x + } + } + + # run the rules (df_unique is a row combination per mo/ab/uti/host) ---- + for (i in seq_len(nrow(df_unique))) { + p$tick() + mo_current <- df_unique[i, "mo", drop = TRUE] + ab_current <- df_unique[i, "ab", drop = TRUE] + host_current <- df_unique[i, "host", drop = TRUE] + uti_current <- df_unique[i, "uti", drop = TRUE] + notes_current <- character(0) + if (is.na(uti_current)) { + # no preference, so no filter on UTIs + rows <- which(df$mo == mo_current & df$ab == ab_current & df$host == host_current) + } else { + rows <- which(df$mo == mo_current & df$ab == ab_current & df$host == host_current & df$uti == uti_current) + } + if (length(rows) == 0) { + # this can happen if a host is unavailable, just continue with the next one, since a note about hosts having NA are already given at this point + next + } + values <- df[rows, "values", drop = TRUE] + new_sir <- rep(NA_sir_, length(rows)) + + # find different mo properties, as fast as possible + # TODO in case of VET09, we need to keep E. coli, also when users have Proteus in their data set + # TODO look up which species, at least E. coli - also Staph or Strep? + mo_current_genus <- AMR_env$MO_lookup$mo[match(AMR_env$MO_lookup$genus[match(mo_current, AMR_env$MO_lookup$mo)], AMR_env$MO_lookup$fullname)] + mo_current_family <- AMR_env$MO_lookup$mo[match(AMR_env$MO_lookup$family[match(mo_current, AMR_env$MO_lookup$mo)], AMR_env$MO_lookup$fullname)] + mo_current_order <- AMR_env$MO_lookup$mo[match(AMR_env$MO_lookup$order[match(mo_current, AMR_env$MO_lookup$mo)], AMR_env$MO_lookup$fullname)] + mo_current_class <- AMR_env$MO_lookup$mo[match(AMR_env$MO_lookup$class[match(mo_current, AMR_env$MO_lookup$mo)], AMR_env$MO_lookup$fullname)] + mo_current_rank <- AMR_env$MO_lookup$rank[match(mo_current, AMR_env$MO_lookup$mo)] + mo_current_name <- AMR_env$MO_lookup$fullname[match(mo_current, AMR_env$MO_lookup$mo)] + if (mo_current %in% AMR::microorganisms.groups$mo) { + # get the species group (might be more than 1 entry) + mo_current_species_group <- AMR::microorganisms.groups$mo_group[which(AMR::microorganisms.groups$mo == mo_current)] + } else { + mo_current_species_group <- NULL + } + mo_current_other <- structure("UNKNOWN", class = c("mo", "character")) + # formatted for notes + mo_formatted <- mo_current_name + if (!mo_current_rank %in% c("kingdom", "phylum", "class", "order")) { + mo_formatted <- font_italic(mo_formatted, collapse = NULL) + } + ab_formatted <- paste0( + suppressMessages(suppressWarnings(ab_name(ab_current, language = NULL, tolower = TRUE))), + " (", ab_current, ")" + ) + + # gather all available breakpoints for current MO + # TODO for VET09 do not filter out E. coli and such + breakpoints_current <- breakpoints %pm>% + subset(ab == ab_current) %pm>% + subset(mo %in% c( + mo_current, mo_current_genus, mo_current_family, + mo_current_order, mo_current_class, + mo_current_species_group, + mo_current_other + )) + + # TODO are operators considered?? + # This seems to not work well: as.sir(as.mic(c(4, ">4", ">=4", 8, ">8", ">=8")), ab = "AMC", mo = "E. coli", breakpoint_type = "animal", host = "dogs", guideline = "CLSI 2024") + + ## fall-back methods for veterinary guidelines ---- + if (breakpoint_type == "animal" && !host_current %in% breakpoints_current$host) { + if (guideline_coerced %like% "CLSI") { + # VET09 says that staph/strep/enterococcus BP can be extrapolated to all Gr+ cocci except for intrinsic resistance, so take all Gr+ cocci: + gram_plus_cocci_vet09 <- microorganisms$mo[microorganisms$genus %in% c("Staphylococcus", "Streptococcus", "Peptostreptococcus", "Aerococcus", "Micrococcus") & microorganisms$rank == "genus"] # TODO should probably include genera that were either of these before + + # HUMAN SUBSTITUTES + if (ab_current == "AZM" && mo_current_genus %in% gram_plus_cocci_vet09 && host_current %in% c("dogs", "cats", "horse")) { + # azithro can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09.")) + } else if (ab_current == "CTX" && mo_current_order == "B_[ORD]_ENTRBCTR" && host_current %in% c("dogs", "cats", "horse")) { + # cefotax can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales based on CLSI VET09.")) + } else if (ab_current == "CAZ" && (mo_current_order == "B_[ORD]_ENTRBCTR" | mo_current == "B_PSDMN_AERG") && host_current %in% c("dogs", "cats", "horse")) { + # cefta can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales and ", font_italic("P. aeruginosa"), " based on CLSI VET09.")) + } else if (ab_current == "ERY" && mo_current_genus %in% gram_plus_cocci_vet09 && host_current %in% c("dogs", "cats", "horse")) { + # erythro can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09.")) + } else if (ab_current == "IPM" && (mo_current_order == "B_[ORD]_ENTRBCTR" | mo_current == "B_PSDMN_AERG") && host_current %in% c("dogs", "cats", "horse")) { + # imipenem can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales and ", font_italic("P. aeruginosa"), " based on CLSI VET09.")) + } else if (ab_current == "LNZ" && mo_current_genus %in% gram_plus_cocci_vet09 && host_current %in% c("dogs", "cats")) { + # linezolid can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in staphylococci/enterococci based on CLSI VET09.")) + } else if (ab_current == "NIT" && host_current %in% c("dogs", "cats")) { + # nitro can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09.")) + } else if (ab_current == "PEN" && mo_current_genus %in% gram_plus_cocci_vet09 && host_current %in% c("dogs", "cats")) { + # penicillin can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09.")) + } else if (ab_current == "RIF" && mo_current_genus %in% gram_plus_cocci_vet09 && host_current %in% c("dogs", "cats")) { + # rifampicin can take human breakpoints for staphylococci + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in staphylococci based on CLSI VET09.")) + } else if (ab_current == "SXT" && host_current %in% c("dogs", "cats", "horse")) { + # trimethoprim-sulfamethoxazole (TMS) can take human breakpoints for these agents + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09.")) + } else if (ab_current == "VAN" && host_current %in% c("dogs", "cats", "horse")) { + # vancomycin can take human breakpoints in these hosts + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09.")) + + } else if (host_current %in% c("dogs", "cats") && (mo_current_genus %in% c("B_AMYCS", "B_NOCRD", "B_CMPYL", "B_CRYNB", "B_ENTRC", "B_MYCBC", "B_PSDMN", "B_AERMN") | mo_current_class == "B_[CLS]_BTPRTBCT" | mo_current == "B_LISTR_MNCY")) { + # dog breakpoints if no canine/feline + # TODO do we still have dogs breakpoints at this point??? + breakpoints_current <- breakpoints_current %pm>% subset(host == "human") # WRONG + notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", mo_formatted, " based on CLSI VET09.")) + + } else { + # no specific CLSI solution for this, so only filter on current host (if no breakpoints available -> too bad) + breakpoints_current <- breakpoints_current %pm>% + subset(host == host_current) + } + } + + } + + if (NROW(breakpoints_current) == 0) { + AMR_env$sir_interpretation_history <- rbind_AMR( + AMR_env$sir_interpretation_history, + # recycling 1 to 2 rows does not always seem to work, which is why vectorise_log_entry() was added + data.frame( + datetime = vectorise_log_entry(Sys.time(), length(rows)), + index = rows, + ab_given = vectorise_log_entry(ab.bak[match(ab_current, df$ab)][1], length(rows)), + mo_given = vectorise_log_entry(mo.bak[match(mo_current, df$mo)][1], length(rows)), + host_given = vectorise_log_entry(host.bak[match(host_current, df$host)][1], length(rows)), + ab = vectorise_log_entry(ab_current, length(rows)), + mo = vectorise_log_entry(mo_current, length(rows)), + host = vectorise_log_entry(host_current, length(rows)), + method = vectorise_log_entry(method_coerced, length(rows)), + input = vectorise_log_entry(as.double(values), length(rows)), + outcome = vectorise_log_entry(NA_sir_, length(rows)), + notes = vectorise_log_entry("NO BREAKPOINT AVAILABLE", length(rows)), + guideline = vectorise_log_entry(guideline_coerced, length(rows)), + ref_table = vectorise_log_entry(NA_character_, length(rows)), + uti = vectorise_log_entry(uti_current, length(rows)), + breakpoint_S_R = vectorise_log_entry(NA_character_, length(rows)), + stringsAsFactors = FALSE + ) + ) + notes <- c(notes, notes_current) + next + } + + # sort on host and taxonomic rank + # (this will e.g. prefer 'species' breakpoints over 'order' breakpoints) + if (is.na(uti_current)) { + breakpoints_current <- breakpoints_current %pm>% + # `uti` is a column in the data set + # this will put UTI = FALSE first, then UTI = NA, then UTI = TRUE + pm_mutate(uti_index = ifelse(!is.na(uti) & uti == FALSE, 1, + ifelse(is.na(uti), 2, + 3))) %pm>% + # be as specific as possible (i.e. prefer species over genus): + pm_arrange(rank_index, uti_index) + } else if (uti_current == TRUE) { + breakpoints_current <- breakpoints_current %pm>% + subset(uti == TRUE) %pm>% + # be as specific as possible (i.e. prefer species over genus): + pm_arrange(rank_index) + } + + # throw messages for different body sites + site <- breakpoints_current[1L, "site", drop = FALSE] # this is the one we'll take + if (is.na(site)) { + site <- paste0("an unspecified body site") + } else { + site <- paste0("body site '", site, "'") + } + if (nrow(breakpoints_current) == 1 && all(breakpoints_current$uti == TRUE) && is.na(uti_current) && message_not_thrown_before("as.sir", "uti", ab_current)) { + # only UTI breakpoints available + notes_current <- c(notes_current, paste0("Breakpoints for ", font_bold(ab_formatted), " in ", mo_formatted, " are only available for (uncomplicated) urinary tract infections (UTI); assuming `uti = TRUE`.")) + } else if (nrow(breakpoints_current) > 1 && length(unique(breakpoints_current$site)) > 1 && any(is.na(uti_current)) && all(c(TRUE, FALSE) %in% breakpoints_current$uti, na.rm = TRUE) && message_not_thrown_before("as.sir", "siteUTI", mo_current, ab_current)) { + # both UTI and Non-UTI breakpoints available + notes_current <- c(notes_current, paste0("Breakpoints for UTI ", font_bold("and"), " non-UTI available for ", ab_formatted, " in ", mo_formatted, " - assuming ", site, ". Use argument `uti` to set which isolates are from urine. See `?as.sir`.")) + breakpoints_current <- breakpoints_current %pm>% + pm_filter(uti == FALSE) + } else if (nrow(breakpoints_current) > 1 && length(unique(breakpoints_current$site)) > 1 && all(breakpoints_current$uti == FALSE, na.rm = TRUE) && message_not_thrown_before("as.sir", "siteOther", mo_current, ab_current)) { + # breakpoints for multiple body sites available + notes_current <- c(notes_current, paste0("Multiple breakpoints available for ", font_bold(ab_formatted), " in ", mo_formatted, " - assuming ", site, ".")) + } + + # first check if mo is intrinsic resistant + if (isTRUE(add_intrinsic_resistance) && guideline_coerced %like% "EUCAST" && paste(mo_current, ab_current) %in% AMR_env$intrinsic_resistant) { + notes_current <- c(notes_current, paste0("Intrinsic resistance applied for ", ab_formatted, " in ", mo_formatted, "")) + new_sir <- rep(as.sir("R"), length(rows)) + } else if (nrow(breakpoints_current) == 0) { + # no rules available + new_sir <- rep(NA_sir_, length(rows)) + } else { + # then run the rules + breakpoints_current <- breakpoints_current[1L, , drop = FALSE] + + if (any(breakpoints_current$mo == "UNKNOWN", na.rm = TRUE) | any(breakpoints_current$ref_tbl %like% "PK.*PD", na.rm = TRUE)) { + notes_current <- c(notes_current, "Some PK/PD breakpoints were applied - use `include_PKPD = FALSE` to prevent this") + } + if (any(breakpoints_current$site %like% "screen", na.rm = TRUE) | any(breakpoints_current$ref_tbl %like% "screen", na.rm = TRUE)) { + notes_current <- c(notes_current, "Some screening breakpoints were applied - use `include_screening = FALSE` to prevent this") + } + + if (method == "mic") { + new_sir <- case_when_AMR( + is.na(values) ~ NA_sir_, + values <= breakpoints_current$breakpoint_S ~ as.sir("S"), + guideline_coerced %like% "EUCAST" & values > breakpoints_current$breakpoint_R ~ as.sir("R"), + guideline_coerced %like% "CLSI" & values >= breakpoints_current$breakpoint_R ~ as.sir("R"), + # return "I" or "SDD" when breakpoints are in the middle + !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"), + !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == FALSE ~ as.sir("I"), + # and NA otherwise + TRUE ~ NA_sir_ + ) + } else if (method == "disk") { + new_sir <- case_when_AMR( + is.na(values) ~ NA_sir_, + as.double(values) >= as.double(breakpoints_current$breakpoint_S) ~ as.sir("S"), + guideline_coerced %like% "EUCAST" & as.double(values) < as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"), + guideline_coerced %like% "CLSI" & as.double(values) <= as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"), + # return "I" or "SDD" when breakpoints are in the middle + !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"), + !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == FALSE ~ as.sir("I"), + # and NA otherwise + TRUE ~ NA_sir_ + ) + } + + # write to verbose output + AMR_env$sir_interpretation_history <- rbind_AMR( + AMR_env$sir_interpretation_history, + # recycling 1 to 2 rows does not always seem to work, which is why vectorise_log_entry() was added + data.frame( + datetime = vectorise_log_entry(Sys.time(), length(rows)), + index = rows, + ab_given = vectorise_log_entry(ab.bak[match(ab_current, df$ab)][1], length(rows)), + mo_given = vectorise_log_entry(mo.bak[match(mo_current, df$mo)][1], length(rows)), + host_given = vectorise_log_entry(host.bak[match(host_current, df$host)][1], length(rows)), + ab = vectorise_log_entry(breakpoints_current[, "ab", drop = TRUE], length(rows)), + mo = vectorise_log_entry(breakpoints_current[, "mo", drop = TRUE], length(rows)), + host = vectorise_log_entry(breakpoints_current[, "host", drop = TRUE], length(rows)), + method = vectorise_log_entry(method_coerced, length(rows)), + input = vectorise_log_entry(as.double(values), length(rows)), + outcome = vectorise_log_entry(as.sir(new_sir), length(rows)), + notes = vectorise_log_entry(paste0(font_stripstyle(notes_current), collapse = "\n"), length(rows)), + guideline = vectorise_log_entry(guideline_coerced, length(rows)), + ref_table = vectorise_log_entry(breakpoints_current[, "ref_tbl", drop = TRUE], length(rows)), + uti = vectorise_log_entry(breakpoints_current[, "uti", drop = TRUE], length(rows)), + breakpoint_S_R = vectorise_log_entry(paste0(breakpoints_current[, "breakpoint_S", drop = TRUE], "-", breakpoints_current[, "breakpoint_R", drop = TRUE]), length(rows)), + stringsAsFactors = FALSE + ) + ) + } + + notes <- c(notes, notes_current) + df[rows, "result"] <- new_sir + } + + close(p) + # printing messages + if (has_progress_bar == TRUE) { + # the progress bar has overwritten the intro text, so: + message_(intro_txt, appendLF = FALSE, as_note = FALSE) + } + if (length(notes) > 0) { + if (isTRUE(rise_warning)) { + message(font_rose_bg(" WARNING ")) + } else { + message(font_yellow_bg(" NOTE ")) + } + notes <- unique(notes) + if (isTRUE(verbose) || length(notes) == 1 || NROW(AMR_env$sir_interpretation_history) == 0) { + for (i in seq_len(length(notes))) { + message(word_wrap(" ", AMR_env$bullet_icon, " ", notes[i], add_fn = font_black)) + } + } else { + message(word_wrap(" ", AMR_env$bullet_icon, " There were multiple notes. Print or View `sir_interpretation_history()` to examine them, or use `as.sir(..., verbose = TRUE)` next time to directly print them here.", add_fn = font_black)) + } + } else { + message(font_green_bg(" OK ")) + } + + load_mo_uncertainties(metadata_mo) + + df$result +} + +#' @rdname as.sir +#' @param clean a [logical] to indicate whether previously stored results should be forgotten after returning the 'logbook' with results +#' @export +sir_interpretation_history <- function(clean = FALSE) { + meet_criteria(clean, allow_class = "logical", has_length = 1) + + out <- AMR_env$sir_interpretation_history + out$outcome <- as.sir(out$outcome) + if (NROW(out) > 0) { + # sort descending on time + out <- out[order(format(out$datetime, "%Y%m%d%H%M"), out$index, decreasing = TRUE), , drop = FALSE] + } + + if (isTRUE(clean)) { + AMR_env$sir_interpretation_history <- AMR_env$sir_interpretation_history[0, , drop = FALSE] + } + + if (pkg_is_available("tibble")) { + out <- import_fn("as_tibble", "tibble")(out) + } + structure(out, class = c("sir_log", class(out))) +} + +#' @method print sir_log +#' @export +#' @noRd +print.sir_log <- function(x, ...) { + if (NROW(x) == 0) { + message_("No results to print. Run `as.sir()` on MIC values or disk diffusion zones first to print a 'logbook' data set here.") + return(invisible(NULL)) + } + class(x) <- class(x)[class(x) != "sir_log"] + print(x, ...) +} + +# will be exported using s3_register() in R/zzz.R +pillar_shaft.sir <- function(x, ...) { + out <- trimws(format(x)) + if (has_colour()) { + # colours will anyway not work when has_colour() == FALSE, + # but then the indentation should also not be applied + out[is.na(x)] <- font_grey(" NA") + out[x == "NI"] <- font_grey_bg(" NI ") + out[x == "S"] <- font_green_bg(" S ") + out[x == "I"] <- font_orange_bg(" I ") + out[x == "SDD"] <- font_orange_bg(" SDD ") + if (is_dark()) { + out[x == "R"] <- font_red_bg(" R ") + } else { + out[x == "R"] <- font_rose_bg(" R ") + } + } + create_pillar_column(out, align = "left", width = 5) +} + +# will be exported using s3_register() in R/zzz.R +type_sum.sir <- function(x, ...) { + "sir" +} + +# will be exported using s3_register() in R/zzz.R +freq.sir <- function(x, ...) { + x_name <- deparse(substitute(x)) + x_name <- gsub(".*[$]", "", x_name) + if (x_name %in% c("x", ".")) { + # try again going through system calls + x_name <- stats::na.omit(vapply( + FUN.VALUE = character(1), + sys.calls(), + function(call) { + call_txt <- as.character(call) + ifelse(call_txt[1] %like% "freq$", call_txt[length(call_txt)], character(0)) + } + ))[1L] + } + ab <- suppressMessages(suppressWarnings(as.ab(x_name))) + digits <- list(...)$digits + if (is.null(digits)) { + digits <- 2 + } + if (!is.na(ab)) { + cleaner::freq.default( + x = x, ..., + .add_header = list( + Drug = paste0(ab_name(ab, language = NULL), " (", ab, ", ", paste(ab_atc(ab), collapse = "/"), ")"), + `Drug group` = ab_group(ab, language = NULL), + `%SI` = trimws(percentage(susceptibility(x, minimum = 0, as_percent = FALSE), + digits = digits + )) + ) + ) + } else { + cleaner::freq.default( + x = x, ..., + .add_header = list( + `%SI` = trimws(percentage(susceptibility(x, minimum = 0, as_percent = FALSE), + digits = digits + )) + ) + ) + } +} + + +# will be exported using s3_register() in R/zzz.R +get_skimmers.sir <- function(column) { + # get the variable name 'skim_variable' + name_call <- function(.data) { + calls <- sys.calls() + frms <- sys.frames() + calls_txt <- vapply(calls, function(x) paste(deparse(x), collapse = ""), FUN.VALUE = character(1)) + if (any(calls_txt %like% "skim_variable", na.rm = TRUE)) { + ind <- which(calls_txt %like% "skim_variable")[1L] + vars <- tryCatch(eval(parse(text = ".data$skim_variable$sir"), envir = frms[[ind]]), + error = function(e) NULL + ) + tryCatch(ab_name(as.character(calls[[length(calls)]][[2]]), language = NULL), + error = function(e) NA_character_ + ) + } else { + NA_character_ + } + } + + skimr::sfl( + skim_type = "sir", + ab_name = name_call, + count_R = count_R, + count_S = count_susceptible, + count_I = count_I, + prop_R = ~ proportion_R(., minimum = 0), + prop_S = ~ susceptibility(., minimum = 0), + prop_I = ~ proportion_I(., minimum = 0) + ) +} + +#' @method print sir +#' @export +#' @noRd +print.sir <- function(x, ...) { + x_name <- deparse(substitute(x)) + cat("Class 'sir'\n") + if (!is.null(attributes(x)$guideline) && !all(is.na(attributes(x)$guideline))) { + cat(font_blue(word_wrap("These values were interpreted using ", + font_bold(vector_and(attributes(x)$guideline, quotes = FALSE)), + " based on ", + vector_and(attributes(x)$method, quotes = FALSE), + " values. ", + "Use `sir_interpretation_history(", x_name, ")` to return a full logbook."))) + cat("\n") + } + print(as.character(x), quote = FALSE) +} + + +#' @method as.double sir +#' @export +as.double.sir <- function(x, ...) { + dbls <- rep(NA_real_, length(x)) + dbls[x == "S"] <- 1 + dbls[x %in% c("SDD", "I")] <- 2 + dbls[x == "R"] <- 3 + dbls +} + +#' @method droplevels sir +#' @export +#' @noRd +droplevels.sir <- function(x, exclude = if (any(is.na(levels(x)))) NULL else NA, ...) { + x <- droplevels.factor(x, exclude = exclude, ...) + class(x) <- c("sir", "ordered", "factor") + x +} + +#' @method summary sir +#' @export +#' @noRd +summary.sir <- function(object, ...) { + x <- object + n <- sum(!is.na(x)) + S <- sum(x == "S", na.rm = TRUE) + SDD <- sum(x == "SDD", na.rm = TRUE) + I <- sum(x == "I", na.rm = TRUE) + R <- sum(x == "R", na.rm = TRUE) + NI <- sum(x == "NI", na.rm = TRUE) + pad <- function(x) { + if (is.na(x)) { + return("??") + } + if (x == "0%") { + x <- " 0.0%" + } + if (nchar(x) < 5) { + x <- paste0(rep(" ", 5 - nchar(x)), x) + } + x + } + value <- c( + "Class" = "sir", + "%S" = paste0(pad(percentage(S / n, digits = 1)), " (n=", S, ")"), + "%SDD" = paste0(pad(percentage(SDD / n, digits = 1)), " (n=", SDD, ")"), + "%I" = paste0(pad(percentage(I / n, digits = 1)), " (n=", I, ")"), + "%R" = paste0(pad(percentage(R / n, digits = 1)), " (n=", R, ")"), + "%NI" = paste0(pad(percentage(NI / n, digits = 1)), " (n=", NI, ")") + ) + class(value) <- c("summaryDefault", "table") + value +} + +#' @method [<- sir +#' @export +#' @noRd +"[<-.sir" <- function(i, j, ..., value) { + value <- as.sir(value) + y <- NextMethod() + attributes(y) <- attributes(i) + y +} +#' @method [[<- sir +#' @export +#' @noRd +"[[<-.sir" <- function(i, j, ..., value) { + value <- as.sir(value) + y <- NextMethod() + attributes(y) <- attributes(i) + y +} +#' @method c sir +#' @export +#' @noRd +c.sir <- function(...) { + lst <- list(...) + + guideline <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$guideline %or% NA_character_) + mo <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$mo %or% NA_character_) + ab <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$ab %or% NA_character_) + method <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$method %or% NA_character_) + ref_tbl <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$ref_tbl %or% NA_character_) + ref_breakpoints <- vapply(FUN.VALUE = character(1), lst, function(x) attributes(x)$ref_breakpoints %or% NA_character_) + + out <- as.sir(unlist(lapply(list(...), as.character))) + + if (!all(is.na(guideline))) { + attributes(out)$guideline <- guideline + attributes(out)$mo <- mo + attributes(out)$ab <- ab + attributes(out)$method <- method + attributes(out)$ref_tbl <- ref_tbl + attributes(out)$ref_breakpoints <- ref_breakpoints + } + + out +} + +#' @method unique sir +#' @export +#' @noRd +unique.sir <- function(x, incomparables = FALSE, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +#' @method rep sir +#' @export +#' @noRd +rep.sir <- function(x, ...) { + y <- NextMethod() + attributes(y) <- attributes(x) + y +} + +check_reference_data <- function(reference_data, .call_depth) { + if (!identical(reference_data, AMR::clinical_breakpoints)) { + class_sir <- vapply(FUN.VALUE = character(1), AMR::clinical_breakpoints, function(x) paste0("<", class(x), ">", collapse = " and ")) + class_ref <- vapply(FUN.VALUE = character(1), reference_data, function(x) paste0("<", class(x), ">", collapse = " and ")) + if (!all(names(class_sir) == names(class_ref))) { + stop_("`reference_data` must have the same column names as the 'clinical_breakpoints' data set.", call = .call_depth) + } + if (!all(class_sir == class_ref)) { + stop_("`reference_data` must be the same structure as the 'clinical_breakpoints' data set. Column '", names(class_ref[class_sir != class_ref][1]), "' is of class ", class_ref[class_sir != class_ref][1], ", but should be of class ", class_sir[class_sir != class_ref][1], ".", call = .call_depth) + } + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/sir_calc.R + + + + +dots2vars <- function(...) { + # this function is to give more informative output about + # variable names in count_* and proportion_* functions + dots <- substitute(list(...)) + dots <- as.character(dots)[2:length(dots)] + paste0(dots[dots != "."], collapse = "+") +} + +sir_calc <- function(..., + ab_result, + minimum = 0, + as_percent = FALSE, + only_all_tested = FALSE, + only_count = FALSE) { + meet_criteria(ab_result, allow_class = c("character", "numeric", "integer"), has_length = c(1:5)) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(as_percent, allow_class = "logical", has_length = 1) + meet_criteria(only_all_tested, allow_class = "logical", has_length = 1) + meet_criteria(only_count, allow_class = "logical", has_length = 1) + + data_vars <- dots2vars(...) + + dots_df <- switch(1, + ... + ) + if (is.data.frame(dots_df)) { + # make sure to remove all other classes like tibbles, data.tables, etc + dots_df <- as.data.frame(dots_df, stringsAsFactors = FALSE) + } + + dots <- eval(substitute(alist(...))) + stop_if(length(dots) == 0, "no variables selected", call = -2) + + stop_if("also_single_tested" %in% names(dots), + "`also_single_tested` was replaced by `only_all_tested`.\n", + "Please read Details in the help page (`?proportion`) as this may have a considerable impact on your analysis.", + call = -2 + ) + ndots <- length(dots) + + if (is.data.frame(dots_df)) { + # data.frame passed with other columns, like: example_isolates %pm>% proportion_S(AMC, GEN) + + dots <- as.character(dots) + # remove first element, it's the data.frame + if (length(dots) == 1) { + dots <- character(0) + } else { + dots <- dots[2:length(dots)] + } + if (length(dots) == 0 || all(dots == "df")) { + # for complete data.frames, like example_isolates %pm>% select(AMC, GEN) %pm>% proportion_S() + # and the old sir function, which has "df" as name of the first argument + x <- dots_df + } else { + # get dots that are in column names already, and the ones that will be once evaluated using dots_df or global env + # this is to support susceptibility(example_isolates, AMC, any_of(some_vector_with_AB_names)) + dots <- c( + dots[dots %in% colnames(dots_df)], + eval(parse(text = dots[!dots %in% colnames(dots_df)]), envir = dots_df, enclos = globalenv()) + ) + dots_not_exist <- dots[!dots %in% colnames(dots_df)] + stop_if(length(dots_not_exist) > 0, "column(s) not found: ", vector_and(dots_not_exist, quotes = TRUE), call = -2) + x <- dots_df[, dots, drop = FALSE] + } + } else if (ndots == 1) { + # only 1 variable passed (can also be data.frame), like: proportion_S(example_isolates$AMC) and example_isolates$AMC %pm>% proportion_S() + x <- dots_df + } else { + # multiple variables passed without pipe, like: proportion_S(example_isolates$AMC, example_isolates$GEN) + x <- NULL + try(x <- as.data.frame(dots, stringsAsFactors = FALSE), silent = TRUE) + if (is.null(x)) { + # support for example_isolates %pm>% group_by(ward) %pm>% summarise(amox = susceptibility(GEN, AMX)) + x <- as.data.frame(list(...), stringsAsFactors = FALSE) + } + } + + if (is.null(x)) { + warning_("argument is NULL (check if columns exist): returning NA") + if (as_percent == TRUE) { + return(NA_character_) + } else { + return(NA_real_) + } + } + + print_warning <- FALSE + + ab_result <- as.sir(ab_result) + + if (is.data.frame(x)) { + sir_integrity_check <- character(0) + for (i in seq_len(ncol(x))) { + # check integrity of columns: force 'sir' class + if (!is.sir(x[, i, drop = TRUE])) { + sir_integrity_check <- c(sir_integrity_check, as.character(x[, i, drop = TRUE])) + x[, i] <- suppressWarnings(as.sir(x[, i, drop = TRUE])) # warning will be given later + print_warning <- TRUE + } + } + if (length(sir_integrity_check) > 0) { + # this will give a warning for invalid results, of all input columns (so only 1 warning) + sir_integrity_check <- as.sir(sir_integrity_check) + } + + x_transposed <- as.list(as.data.frame(t(x), stringsAsFactors = FALSE)) + if (isTRUE(only_all_tested)) { + # no NAs in any column + y <- apply( + X = as.data.frame(lapply(x, as.double), stringsAsFactors = FALSE), + MARGIN = 1, + FUN = min + ) + numerator <- sum(!is.na(y) & y %in% as.double(ab_result), na.rm = TRUE) + denominator <- sum(vapply(FUN.VALUE = logical(1), x_transposed, function(y) !(anyNA(y)))) + } else { + # may contain NAs in any column + other_values <- setdiff(c(NA, levels(ab_result)), ab_result) + numerator <- sum(vapply(FUN.VALUE = logical(1), x_transposed, function(y) any(y %in% ab_result, na.rm = TRUE))) + denominator <- sum(vapply(FUN.VALUE = logical(1), x_transposed, function(y) !(all(y %in% other_values) & anyNA(y)))) + } + } else { + # x is not a data.frame + if (!is.sir(x)) { + x <- as.sir(x) + print_warning <- TRUE + } + numerator <- sum(x %in% ab_result, na.rm = TRUE) + denominator <- sum(x %in% levels(ab_result), na.rm = TRUE) + } + + if (print_warning == TRUE) { + if (message_not_thrown_before("sir_calc")) { + warning_("Increase speed by transforming to class 'sir' on beforehand:\n", + " your_data %>% mutate_if(is_sir_eligible, as.sir)", + call = FALSE + ) + } + } + + if (only_count == TRUE) { + return(numerator) + } + + if (denominator < minimum) { + if (data_vars != "") { + data_vars <- paste(" for", data_vars) + # also add group name if used in dplyr::group_by() + cur_group <- import_fn("cur_group", "dplyr", error_on_fail = FALSE) + if (!is.null(cur_group)) { + group_df <- tryCatch(cur_group(), error = function(e) data.frame()) + if (NCOL(group_df) > 0) { + # transform factors to characters + group <- vapply(FUN.VALUE = character(1), group_df, function(x) { + if (is.numeric(x)) { + format(x) + } else if (is.logical(x)) { + as.character(x) + } else { + paste0('"', x, '"') + } + }) + data_vars <- paste0(data_vars, " in group: ", paste0(names(group), " = ", group, collapse = ", ")) + } + } + } + warning_("Introducing NA: ", + ifelse(denominator == 0, "no", paste("only", denominator)), + " results available", + data_vars, + " (`minimum` = ", minimum, ").", + call = FALSE + ) + fraction <- NA_real_ + } else { + fraction <- numerator / denominator + fraction[is.nan(fraction)] <- NA_real_ + } + + if (as_percent == TRUE) { + percentage(fraction, digits = 1) + } else { + fraction + } +} + +sir_calc_df <- function(type, # "proportion", "count" or "both" + data, + translate_ab = "name", + language = get_AMR_locale(), + minimum = 30, + as_percent = FALSE, + combine_SI = TRUE, + confidence_level = 0.95) { + meet_criteria(type, is_in = c("proportion", "count", "both"), has_length = 1) + meet_criteria(data, allow_class = "data.frame") + data <- ascertain_sir_classes(data, "data") + meet_criteria(translate_ab, allow_class = c("character", "logical"), has_length = 1, allow_NA = TRUE) + language <- validate_language(language) + meet_criteria(minimum, allow_class = c("numeric", "integer"), has_length = 1, is_positive_or_zero = TRUE, is_finite = TRUE) + meet_criteria(as_percent, allow_class = "logical", has_length = 1) + meet_criteria(combine_SI, allow_class = "logical", has_length = 1) + meet_criteria(confidence_level, allow_class = "numeric", has_length = 1) + + translate_ab <- get_translate_ab(translate_ab) + + data.bak <- data + # select only groups and antibiotics + if (is_null_or_grouped_tbl(data)) { + data_has_groups <- TRUE + groups <- get_group_names(data) + data <- data[, c(groups, colnames(data)[vapply(FUN.VALUE = logical(1), data, is.sir)]), drop = FALSE] + } else { + data_has_groups <- FALSE + data <- data[, colnames(data)[vapply(FUN.VALUE = logical(1), data, is.sir)], drop = FALSE] + } + + data <- as.data.frame(data, stringsAsFactors = FALSE) + if (isTRUE(combine_SI)) { + for (i in seq_len(ncol(data))) { + if (is.sir(data[, i, drop = TRUE])) { + data[, i] <- as.character(data[, i, drop = TRUE]) + if ("SDD" %in% data[, i, drop = TRUE]) { + if (message_not_thrown_before("sir_calc_df", combine_SI, entire_session = TRUE)) { + message_("Note that `sir_calc_df()` will also count dose-dependent susceptibility, 'SDD', as 'SI' when `combine_SI = TRUE`. This note will be shown once for this session.", as_note = FALSE) + } + + } + data[, i] <- gsub("(I|S|SDD)", "SI", data[, i, drop = TRUE]) + } + } + } + + sum_it <- function(.data) { + out <- data.frame( + antibiotic = character(0), + interpretation = character(0), + value = double(0), + ci_min = double(0), + ci_max = double(0), + isolates = integer(0), + stringsAsFactors = FALSE + ) + if (data_has_groups) { + group_values <- unique(.data[, which(colnames(.data) %in% groups), drop = FALSE]) + rownames(group_values) <- NULL + .data <- .data[, which(!colnames(.data) %in% groups), drop = FALSE] + } + for (i in seq_len(ncol(.data))) { + values <- .data[, i, drop = TRUE] + if (isTRUE(combine_SI)) { + values <- factor(values, levels = c("SI", "R", "NI"), ordered = TRUE) + } else { + values <- factor(values, levels = c("S", "SDD", "I", "R", "NI"), ordered = TRUE) + } + col_results <- as.data.frame(as.matrix(table(values)), stringsAsFactors = FALSE) + col_results$interpretation <- rownames(col_results) + col_results$isolates <- col_results[, 1, drop = TRUE] + if (NROW(col_results) > 0 && sum(col_results$isolates, na.rm = TRUE) > 0) { + if (sum(col_results$isolates, na.rm = TRUE) >= minimum) { + col_results$value <- col_results$isolates / sum(col_results$isolates, na.rm = TRUE) + ci <- lapply( + col_results$isolates, + function(x) { + stats::binom.test( + x = x, + n = sum(col_results$isolates, na.rm = TRUE), + conf.level = confidence_level + )$conf.int + } + ) + col_results$ci_min <- vapply(FUN.VALUE = double(1), ci, `[`, 1) + col_results$ci_max <- vapply(FUN.VALUE = double(1), ci, `[`, 2) + } else { + col_results$value <- rep(NA_real_, NROW(col_results)) + # confidence intervals also to NA + col_results$ci_min <- col_results$value + col_results$ci_max <- col_results$value + } + out_new <- data.frame( + antibiotic = ifelse(isFALSE(translate_ab), + colnames(.data)[i], + ab_property(colnames(.data)[i], property = translate_ab, language = language) + ), + interpretation = col_results$interpretation, + value = col_results$value, + ci_min = col_results$ci_min, + ci_max = col_results$ci_max, + isolates = col_results$isolates, + stringsAsFactors = FALSE + ) + if (data_has_groups) { + if (nrow(group_values) < nrow(out_new)) { + # repeat group_values for the number of rows in out_new + repeated <- rep(seq_len(nrow(group_values)), + each = nrow(out_new) / nrow(group_values) + ) + group_values <- group_values[repeated, , drop = FALSE] + } + out_new <- cbind(group_values, out_new) + } + out <- rbind_AMR(out, out_new) + } + } + out + } + + # based on pm_apply_grouped_function + apply_group <- function(.data, fn, groups, drop = FALSE, ...) { + grouped <- pm_split_into_groups(.data, groups, drop) + res <- do.call(rbind_AMR, unname(lapply(grouped, fn, ...))) + if (any(groups %in% colnames(res))) { + class(res) <- c("grouped_data", class(res)) + res <- pm_set_groups(res, groups[groups %in% colnames(res)]) + } + res + } + + if (data_has_groups) { + out <- apply_group(data, "sum_it", groups) + } else { + out <- sum_it(data) + } + + # apply factors for right sorting in interpretation + if (isTRUE(combine_SI)) { + out$interpretation <- factor(out$interpretation, levels = c("SI", "R"), ordered = TRUE) + } else { + # don't use as.sir() here, as it would add the class 'sir' and we would like + # the same data structure as output, regardless of input + if (out$value[out$interpretation == "SDD"] > 0) { + out$interpretation <- factor(out$interpretation, levels = c("S", "SDD", "I", "R"), ordered = TRUE) + } else { + out$interpretation <- factor(out$interpretation, levels = c("S", "I", "R"), ordered = TRUE) + } + } + + out <- out[!is.na(out$interpretation), , drop = FALSE] + + if (data_has_groups) { + # ordering by the groups and two more: "antibiotic" and "interpretation" + out <- pm_ungroup(out[do.call("order", out[, seq_len(length(groups) + 2), drop = FALSE]), , drop = FALSE]) + } else { + out <- out[order(out$antibiotic, out$interpretation), , drop = FALSE] + } + + if (type == "proportion") { + # remove number of isolates + out <- subset(out, select = -c(isolates)) + } else if (type == "count") { + # set value to be number of isolates + out$value <- out$isolates + # remove redundant columns + out <- subset(out, select = -c(ci_min, ci_max, isolates)) + } + + rownames(out) <- NULL + out <- as_original_data_class(out, class(data.bak)) # will remove tibble groups + structure(out, class = c("sir_df", class(out))) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/sir_df.R + + + + +#' @rdname proportion +#' @export +sir_df <- function(data, + translate_ab = "name", + language = get_AMR_locale(), + minimum = 30, + as_percent = FALSE, + combine_SI = TRUE, + confidence_level = 0.95) { + tryCatch( + sir_calc_df( + type = "both", + data = data, + translate_ab = translate_ab, + language = language, + minimum = minimum, + as_percent = as_percent, + combine_SI = combine_SI, + confidence_level = confidence_level + ), + error = function(e) stop_(gsub("in sir_calc_df(): ", "", e$message, fixed = TRUE), call = -5) + ) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/skewness.R + + + + +#' Skewness of the Sample +#' +#' @description Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. +#' +#' When negative ('left-skewed'): the left tail is longer; the mass of the distribution is concentrated on the right of a histogram. When positive ('right-skewed'): the right tail is longer; the mass of the distribution is concentrated on the left of a histogram. A normal distribution has a skewness of 0. +#' @param x a vector of values, a [matrix] or a [data.frame] +#' @param na.rm a [logical] value indicating whether `NA` values should be stripped before the computation proceeds +#' @seealso [kurtosis()] +#' @rdname skewness +#' @export +#' @examples +#' skewness(runif(1000)) +skewness <- function(x, na.rm = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + UseMethod("skewness") +} + +#' @method skewness default +#' @rdname skewness +#' @export +skewness.default <- function(x, na.rm = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + x <- as.vector(x) + if (isTRUE(na.rm)) { + x <- x[!is.na(x)] + } + n <- length(x) + (sum((x - mean(x))^3) / n) / (sum((x - mean(x))^2) / n)^(3 / 2) +} + +#' @method skewness matrix +#' @rdname skewness +#' @export +skewness.matrix <- function(x, na.rm = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + apply(x, 2, skewness.default, na.rm = na.rm) +} + +#' @method skewness data.frame +#' @rdname skewness +#' @export +skewness.data.frame <- function(x, na.rm = FALSE) { + meet_criteria(na.rm, allow_class = "logical", has_length = 1) + vapply(FUN.VALUE = double(1), x, skewness.default, na.rm = na.rm) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/translate.R + + + + +#' Translate Strings from the AMR Package +#' +#' For language-dependent output of `AMR` functions, such as [mo_name()], [mo_gramstain()], [mo_type()] and [ab_name()]. +#' @param x text to translate +#' @param language language to choose. Use one of these supported language names or ISO-639-1 codes: `r vector_or(paste0(sapply(LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), " (" , LANGUAGES_SUPPORTED, ")"), quotes = FALSE, sort = FALSE)`. +#' @details The currently `r length(LANGUAGES_SUPPORTED)` supported languages are `r vector_and(paste0(sapply(LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), " (" , LANGUAGES_SUPPORTED, ")"), quotes = FALSE, sort = FALSE)`. All these languages have translations available for all antimicrobial drugs and colloquial microorganism names. +#' +#' To permanently silence the once-per-session language note on a non-English operating system, you can set the package option [`AMR_locale`][AMR-options] in your `.Rprofile` file like this: +#' +#' ```r +#' # Open .Rprofile file +#' utils::file.edit("~/.Rprofile") +#' +#' # Then add e.g. Italian support to that file using: +#' options(AMR_locale = "Italian") +#' ``` +#' +#' And then save the file. +#' +#' Please read about adding or updating a language in [our Wiki](https://github.com/msberends/AMR/wiki/). +#' +#' ### Changing the Default Language +#' The system language will be used at default (as returned by `Sys.getenv("LANG")` or, if `LANG` is not set, [`Sys.getlocale("LC_COLLATE")`][Sys.getlocale()]), if that language is supported. But the language to be used can be overwritten in two ways and will be checked in this order: +#' +#' 1. Setting the package option [`AMR_locale`][AMR-options], either by using e.g. `set_AMR_locale("German")` or by running e.g. `options(AMR_locale = "German")`. +#' +#' Note that setting an \R option only works in the same session. Save the command `options(AMR_locale = "(your language)")` to your `.Rprofile` file to apply it for every session. Run `utils::file.edit("~/.Rprofile")` to edit your `.Rprofile` file. +#' 2. Setting the system variable `LANGUAGE` or `LANG`, e.g. by adding `LANGUAGE="de_DE.utf8"` to your `.Renviron` file in your home directory. +#' +#' Thus, if the package option [`AMR_locale`][AMR-options] is set, the system variables `LANGUAGE` and `LANG` will be ignored. +#' @rdname translate +#' @name translate +#' @export +#' @examples +#' # Current settings (based on system language) +#' ab_name("Ciprofloxacin") +#' mo_name("Coagulase-negative Staphylococcus (CoNS)") +#' +#' # setting another language +#' set_AMR_locale("Dutch") +#' ab_name("Ciprofloxacin") +#' mo_name("Coagulase-negative Staphylococcus (CoNS)") +#' +#' # setting yet another language +#' set_AMR_locale("German") +#' ab_name("Ciprofloxacin") +#' mo_name("Coagulase-negative Staphylococcus (CoNS)") +#' +#' # set_AMR_locale() understands endonyms, English exonyms, and ISO-639-1: +#' set_AMR_locale("Deutsch") +#' set_AMR_locale("German") +#' set_AMR_locale("de") +#' ab_name("amox/clav") +#' +#' # reset to system default +#' reset_AMR_locale() +#' ab_name("amox/clav") +get_AMR_locale <- function() { + # a message for this will be thrown in translate_into_language() if outcome is non-English + if (!is.null(getOption("AMR_locale"))) { + return(validate_language(getOption("AMR_locale"), extra_txt = "set with `options(AMR_locale = ...)`")) + } + lang <- "" + # now check the LANGUAGE system variable - return it if set + if (!identical("", Sys.getenv("LANGUAGE"))) { + lang <- Sys.getenv("LANGUAGE") + } + if (!identical("", Sys.getenv("LANG"))) { + lang <- Sys.getenv("LANG") + } + if (lang == "") { + lang <- Sys.getlocale("LC_COLLATE") + } + find_language(lang) +} + +#' @rdname translate +#' @export +set_AMR_locale <- function(language) { + language <- validate_language(language) + options(AMR_locale = language) + if (interactive() || identical(Sys.getenv("IN_PKGDOWN"), "true")) { + # show which language to use now + message_( + "Using ", LANGUAGES_SUPPORTED_NAMES[[language]]$exonym, + ifelse(language != "en", + paste0(" (", LANGUAGES_SUPPORTED_NAMES[[language]]$endonym, ")"), + "" + ), + " for the AMR package for this session." + ) + } +} + +#' @rdname translate +#' @export +reset_AMR_locale <- function() { + options(AMR_locale = NULL) + if (interactive() || identical(Sys.getenv("IN_PKGDOWN"), "true")) { + # show which language to use now + language <- suppressMessages(get_AMR_locale()) + message_("Using the ", LANGUAGES_SUPPORTED_NAMES[[language]]$exonym, " language (", LANGUAGES_SUPPORTED_NAMES[[language]]$endonym, ") for the AMR package for this session.") + } +} + +#' @rdname translate +#' @export +translate_AMR <- function(x, language = get_AMR_locale()) { + translate_into_language(x, + language = language, + only_unknown = FALSE, + only_affect_ab_names = FALSE, + only_affect_mo_names = FALSE + ) +} + + +validate_language <- function(language, extra_txt = character(0)) { + if (length(language) == 0 || isTRUE(trimws2(tolower(language[1])) %in% c("en", "english", "", "false", NA))) { + return("en") + } else if (language[1] %in% LANGUAGES_SUPPORTED) { + return(language[1]) + } + lang <- find_language(language[1], fallback = FALSE) + stop_ifnot(length(lang) > 0 && lang %in% LANGUAGES_SUPPORTED, + "unsupported language for AMR package", extra_txt, ": \"", language, "\". Use one of these language names or ISO-639-1 codes: ", + paste0('"', vapply(FUN.VALUE = character(1), LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), + '" ("', LANGUAGES_SUPPORTED, '")', + collapse = ", " + ), + call = FALSE + ) + lang +} + +find_language <- function(language, fallback = TRUE) { + language <- Map(LANGUAGES_SUPPORTED_NAMES, + LANGUAGES_SUPPORTED, + f = function(l, n, check = language) { + grepl( + paste0( + "^(", l[1], "|", l[2], "|", + n, "(_|$)|", toupper(n), "(_|$))" + ), + check, + ignore.case = TRUE, + perl = TRUE, + useBytes = FALSE + ) + }, + USE.NAMES = TRUE + ) + language <- names(which(language == TRUE)) + if (isTRUE(fallback) && length(language) == 0) { + # other language -> set to English + language <- "en" + } + language +} + +# translate strings based on inst/translations.tsv +translate_into_language <- function(from, + language = get_AMR_locale(), + only_unknown = FALSE, + only_affect_ab_names = FALSE, + only_affect_mo_names = FALSE) { + # get ISO-639-1 of language + lang <- validate_language(language) + if (lang == "en") { + # don' translate + return(from) + } + + df_trans <- TRANSLATIONS # internal data file + from.bak <- from + from_unique <- unique(from) + from_unique_translated <- from_unique + + # only keep lines where translation is available for this language + df_trans <- df_trans[which(!is.na(df_trans[, lang, drop = TRUE])), , drop = FALSE] + # and where the original string is not equal to the string in the target language + df_trans <- df_trans[which(df_trans[, "pattern", drop = TRUE] != df_trans[, lang, drop = TRUE]), , drop = FALSE] + if (only_unknown == TRUE) { + df_trans <- subset(df_trans, pattern %like% "unknown") + } + if (only_affect_ab_names == TRUE) { + df_trans <- subset(df_trans, affect_ab_name == TRUE) + } + if (only_affect_mo_names == TRUE) { + df_trans <- subset(df_trans, affect_mo_name == TRUE) + } + if (NROW(df_trans) == 0) { + return(from) + } + + # default: case sensitive if value if 'case_sensitive' is missing: + df_trans$case_sensitive[is.na(df_trans$case_sensitive)] <- TRUE + # default: not using regular expressions if 'regular_expr' is missing: + df_trans$regular_expr[is.na(df_trans$regular_expr)] <- FALSE + + # check if text to look for is in one of the patterns + any_form_in_patterns <- tryCatch( + any(from_unique %like% paste0("(", paste(gsub(" +\\(.*", "", df_trans$pattern), collapse = "|"), ")")), + error = function(e) { + warning_("Translation not possible. Please create an issue at ", font_url("https://github.com/msberends/AMR/issues"), ". Many thanks!") + return(FALSE) + } + ) + + if (NROW(df_trans) == 0 | !any_form_in_patterns) { + return(from) + } + + lapply( + # starting with longest pattern, since more general translations are shorter, such as 'Group' + order(nchar(df_trans$pattern), decreasing = TRUE), + function(i) { + from_unique_translated <<- gsub( + pattern = df_trans$pattern[i], + replacement = df_trans[i, lang, drop = TRUE], + x = from_unique_translated, + ignore.case = !df_trans$case_sensitive[i] & df_trans$regular_expr[i], + fixed = !df_trans$regular_expr[i], + perl = df_trans$regular_expr[i] + ) + } + ) + + # force UTF-8 for diacritics + from_unique_translated <- enc2utf8(from_unique_translated) + + # a kind of left join to get all results back + out <- from_unique_translated[match(from.bak, from_unique)] + + if (!identical(from.bak, out) && get_AMR_locale() == lang && is.null(getOption("AMR_locale", default = NULL)) && message_not_thrown_before("translation", entire_session = TRUE) && interactive()) { + message(word_wrap( + "Assuming the ", LANGUAGES_SUPPORTED_NAMES[[lang]]$exonym, " language (", + LANGUAGES_SUPPORTED_NAMES[[lang]]$endonym, ") for the AMR package. See `set_AMR_locale()` to change this or to silence this once-per-session note.", + add_fn = list(font_blue), as_note = TRUE + )) + } + + out +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/vctrs.R + + + + +# These are all S3 implementations for the vctrs package, +# that is used internally by tidyverse packages such as dplyr. +# They are to convert AMR-specific classes to bare characters and integers. +# All of them will be exported using s3_register() in R/zzz.R when loading the package. + +# see https://github.com/tidyverse/dplyr/issues/5955 why this is required + +# S3: ab_selector ---- +# this does not need a .default method since it's used internally only +vec_ptype2.character.ab_selector <- function(x, y, ...) { + x +} +vec_ptype2.ab_selector.character <- function(x, y, ...) { + y +} +vec_cast.character.ab_selector <- function(x, to, ...) { + unclass(x) +} + +# S3: ab_selector_any_all ---- +# this does not need a .default method since it's used internally only +vec_ptype2.logical.ab_selector_any_all <- function(x, y, ...) { + x +} +vec_ptype2.ab_selector_any_all.logical <- function(x, y, ...) { + y +} +vec_cast.logical.ab_selector_any_all <- function(x, to, ...) { + unclass(x) +} + +# S3: ab ---- +vec_ptype2.ab.default <- function (x, y, ..., x_arg = "", y_arg = "") { + x +} +vec_ptype2.ab.ab <- function(x, y, ...) { + x +} +vec_cast.character.ab <- function(x, to, ...) { + as.character(x) +} +vec_cast.ab.character <- function(x, to, ...) { + return_after_integrity_check(x, "antimicrobial drug code", as.character(AMR_env$AB_lookup$ab)) +} + +# S3: av ---- +vec_ptype2.av.default <- function (x, y, ..., x_arg = "", y_arg = "") { + x +} +vec_ptype2.av.av <- function(x, y, ...) { + x +} +vec_cast.character.av <- function(x, to, ...) { + as.character(x) +} +vec_cast.av.character <- function(x, to, ...) { + return_after_integrity_check(x, "antiviral drug code", as.character(AMR_env$AV_lookup$av)) +} + +# S3: mo ---- +vec_ptype2.mo.default <- function (x, y, ..., x_arg = "", y_arg = "") { + x +} +vec_ptype2.mo.mo <- function(x, y, ...) { + x +} +vec_cast.character.mo <- function(x, to, ...) { + as.character(x) +} +vec_cast.mo.character <- function(x, to, ...) { + add_MO_lookup_to_AMR_env() + return_after_integrity_check(x, "microorganism code", as.character(AMR_env$MO_lookup$mo)) +} + +# S3: disk ---- +vec_ptype_full.disk <- function(x, ...) { + "disk" +} +vec_ptype_abbr.disk <- function(x, ...) { + "dsk" +} +vec_ptype2.disk.default <- function (x, y, ..., x_arg = "", y_arg = "") { + NA_disk_[0] +} +vec_ptype2.disk.disk <- function(x, y, ...) { + NA_disk_[0] +} +vec_cast.disk.disk <- function(x, to, ...) { + as.disk(x) +} +vec_cast.integer.disk <- function(x, to, ...) { + unclass(x) +} +vec_cast.disk.integer <- function(x, to, ...) { + as.disk(x) +} +vec_cast.double.disk <- function(x, to, ...) { + unclass(x) +} +vec_cast.disk.double <- function(x, to, ...) { + as.disk(x) +} +vec_cast.character.disk <- function(x, to, ...) { + unclass(x) +} +vec_cast.disk.character <- function(x, to, ...) { + as.disk(x) +} + +# S3: mic ---- +vec_ptype2.mic.default <- function (x, y, ..., x_arg = "", y_arg = "") { + # this will make sure that currently implemented MIC levels are returned + NA_mic_[0] +} +vec_ptype2.mic.mic <- function(x, y, ...) { + # this will make sure that currently implemented MIC levels are returned + NA_mic_[0] +} +vec_cast.mic.mic <- function(x, to, ...) { + # this will make sure that currently implemented MIC levels are returned + as.mic(x) +} +vec_cast.character.mic <- function(x, to, ...) { + as.character(x) +} +vec_cast.double.mic <- function(x, to, ...) { + as.double(x) +} +vec_cast.integer.mic <- function(x, to, ...) { + as.integer(x) +} +vec_cast.factor.mic <- function(x, to, ...) { + factor(as.character(x)) +} +vec_cast.mic.double <- function(x, to, ...) { + as.mic(x) +} +vec_cast.mic.character <- function(x, to, ...) { + as.mic(x) +} +vec_cast.mic.integer <- function(x, to, ...) { + as.mic(x) +} +vec_cast.mic.factor <- function(x, to, ...) { + as.mic(x) +} +vec_math.mic <- function(.fn, x, ...) { + .fn(as.double(x), ...) +} +vec_arith.mic <- function(op, x, y, ...) { + vctrs::vec_arith(op, as.double(x), as.double(y)) +} + +# S3: sir ---- +vec_ptype2.sir.default <- function (x, y, ..., x_arg = "", y_arg = "") { + NA_sir_[0] +} +vec_ptype2.sir.sir <- function(x, y, ...) { + NA_sir_[0] +} +vec_ptype2.character.sir <- function(x, y, ...) { + NA_sir_[0] +} +vec_cast.sir.sir <- function(x, to, ...) { + # this makes sure that old SIR objects (with S/I/R) are converted to the current structure (S/SDD/I/R/NI) + as.sir(x) +} +vec_cast.character.sir <- function(x, to, ...) { + as.character(x) +} +vec_cast.sir.character <- function(x, to, ...) { + as.sir(x) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/whocc.R + + + + +#' WHOCC: WHO Collaborating Centre for Drug Statistics Methodology +#' +#' All antimicrobial drugs and their official names, ATC codes, ATC groups and defined daily dose (DDD) are included in this package, using the WHO Collaborating Centre for Drug Statistics Methodology. +#' @section WHOCC: +#' This package contains **all ~550 antibiotic, antimycotic and antiviral drugs** and their Anatomical Therapeutic Chemical (ATC) codes, ATC groups and Defined Daily Dose (DDD) from the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC, ) and the Pharmaceuticals Community Register of the European Commission (). +#' +#' These have become the gold standard for international drug utilisation monitoring and research. +#' +#' The WHOCC is located in Oslo at the Norwegian Institute of Public Health and funded by the Norwegian government. The European Commission is the executive of the European Union and promotes its general interest. +#' +#' **NOTE: The WHOCC copyright does not allow use for commercial purposes, unlike any other info from this package.** See + +#' @name WHOCC +#' @rdname WHOCC +#' @examples +#' as.ab("meropenem") +#' ab_name("J01DH02") +#' +#' ab_tradenames("flucloxacillin") +NULL + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/zz_deprecated.R + + + + +# #' Deprecated Functions +# #' +# #' These functions are so-called '[Deprecated]'. **They will be removed in a future release.** Using the functions will give a warning with the name of the function it has been replaced by (if there is one). +# #' @keywords internal +# #' @name AMR-deprecated +# #' @rdname AMR-deprecated +# #' @export +# NULL + +deprecation_warning <- function(old = NULL, new = NULL, extra_msg = NULL, is_function = TRUE) { + if (is.null(old)) { + warning_(extra_msg) + } else { + env <- paste0("deprecated_", old) + if (!env %in% names(AMR_env)) { + AMR_env[[paste0("deprecated_", old)]] <- 1 + if (isTRUE(is_function)) { + old <- paste0(old, "()") + new <- paste0(new, "()") + type <- "function" + } else { + type <- "argument" + } + warning_( + ifelse(is.null(new), + paste0("The `", old, "` ", type, " is no longer in use"), + paste0("The `", old, "` ", type, " has been replaced with `", new, "`") + ), + ifelse(type == "argument", + ". While the old argument still works, it will be removed in a future version, so please update your code.", + ", see `?AMR-deprecated`." + ), + ifelse(!is.null(extra_msg), + paste0(" ", extra_msg), + "" + ), + "\nThis warning will be shown once per session." + ) + } + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE R/zzz.R + + + + +# set up package environment, used by numerous AMR functions +AMR_env <- new.env(hash = FALSE) +AMR_env$mo_uncertainties <- data.frame( + original_input = character(0), + input = character(0), + fullname = character(0), + mo = character(0), + candidates = character(0), + minimum_matching_score = integer(0), + keep_synonyms = logical(0), + stringsAsFactors = FALSE +) +AMR_env$mo_renamed <- list() +AMR_env$mo_previously_coerced <- data.frame( + x = character(0), + mo = character(0), + stringsAsFactors = FALSE +) +AMR_env$ab_previously_coerced <- data.frame( + x = character(0), + ab = character(0), + stringsAsFactors = FALSE +) +AMR_env$av_previously_coerced <- data.frame( + x = character(0), + av = character(0), + stringsAsFactors = FALSE +) +AMR_env$sir_interpretation_history <- data.frame( + datetime = Sys.time()[0], + index = integer(0), + ab_given = character(0), + mo_given = character(0), + host_given = character(0), + ab = set_clean_class(character(0), c("ab", "character")), + mo = set_clean_class(character(0), c("mo", "character")), + host = character(0), + method = character(0), + input = double(0), + outcome = NA_sir_[0], + notes = character(0), + guideline = character(0), + ref_table = character(0), + uti = logical(0), + breakpoint_S_R = character(0), + stringsAsFactors = FALSE +) + +AMR_env$custom_ab_codes <- character(0) +AMR_env$custom_mo_codes <- character(0) +AMR_env$is_dark_theme <- NULL +AMR_env$chmatch <- import_fn("chmatch", "data.table", error_on_fail = FALSE) +AMR_env$chin <- import_fn("%chin%", "data.table", error_on_fail = FALSE) + +# take cli symbols and error function if available +AMR_env$info_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$info %or% "i" +AMR_env$bullet_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$bullet %or% "*" + +AMR_env$cross_icon <- if (isTRUE(base::l10n_info()$`UTF-8`)) "\u00d7" else "x" + +AMR_env$dots <- import_fn("symbol", "cli", error_on_fail = FALSE)$ellipsis %or% "..." +AMR_env$sup_1_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$sup_1 %or% "*" +AMR_env$cli_abort <- import_fn("cli_abort", "cli", error_on_fail = FALSE) + +.onLoad <- function(lib, pkg) { + # Support for tibble headers (type_sum) and tibble columns content (pillar_shaft) + # without the need to depend on other packages. This was suggested by the + # developers of the vctrs package: + # https://github.com/r-lib/vctrs/blob/05968ce8e669f73213e3e894b5f4424af4f46316/R/register-s3.R + s3_register("pillar::pillar_shaft", "ab") + s3_register("pillar::pillar_shaft", "av") + s3_register("pillar::pillar_shaft", "mo") + s3_register("pillar::pillar_shaft", "sir") + s3_register("pillar::pillar_shaft", "mic") + s3_register("pillar::pillar_shaft", "disk") + # no type_sum of disk, that's now in vctrs::vec_ptype_full + s3_register("pillar::type_sum", "ab") + s3_register("pillar::type_sum", "av") + s3_register("pillar::type_sum", "mo") + s3_register("pillar::type_sum", "sir") + s3_register("pillar::type_sum", "mic") + s3_register("pillar::tbl_sum", "antibiogram") + s3_register("pillar::tbl_format_footer", "antibiogram") + # Support for frequency tables from the cleaner package + s3_register("cleaner::freq", "mo") + s3_register("cleaner::freq", "sir") + # Support for skim() from the skimr package + if (pkg_is_available("skimr", min_version = "2.0.0")) { + s3_register("skimr::get_skimmers", "mo") + s3_register("skimr::get_skimmers", "sir") + s3_register("skimr::get_skimmers", "mic") + s3_register("skimr::get_skimmers", "disk") + } + # Support for autoplot() from the ggplot2 package + s3_register("ggplot2::autoplot", "sir") + s3_register("ggplot2::autoplot", "mic") + s3_register("ggplot2::autoplot", "disk") + s3_register("ggplot2::autoplot", "resistance_predict") + s3_register("ggplot2::autoplot", "antibiogram") + # Support for fortify from the ggplot2 package + s3_register("ggplot2::fortify", "sir") + s3_register("ggplot2::fortify", "mic") + s3_register("ggplot2::fortify", "disk") + # Support for knitr (R Markdown/Quarto) + s3_register("knitr::knit_print", "antibiogram") + s3_register("knitr::knit_print", "formatted_bug_drug_combinations") + # Support vctrs package for use in e.g. dplyr verbs + # NOTE 2024-02-22 this is the right way - it should be 2 S3 classes in the second argument + # S3: ab_selector + s3_register("vctrs::vec_ptype2", "character.ab_selector") + s3_register("vctrs::vec_ptype2", "ab_selector.character") + s3_register("vctrs::vec_cast", "character.ab_selector") + # S3: ab_selector_any_all + s3_register("vctrs::vec_ptype2", "logical.ab_selector_any_all") + s3_register("vctrs::vec_ptype2", "ab_selector_any_all.logical") + s3_register("vctrs::vec_cast", "logical.ab_selector_any_all") + # S3: ab + s3_register("vctrs::vec_ptype2", "ab.default") + s3_register("vctrs::vec_ptype2", "ab.ab") + s3_register("vctrs::vec_cast", "character.ab") + s3_register("vctrs::vec_cast", "ab.character") + # S3: av + s3_register("vctrs::vec_ptype2", "av.default") + s3_register("vctrs::vec_ptype2", "av.av") + s3_register("vctrs::vec_cast", "character.av") + s3_register("vctrs::vec_cast", "av.character") + # S3: mo + s3_register("vctrs::vec_ptype2", "mo.default") + s3_register("vctrs::vec_ptype2", "mo.mo") + s3_register("vctrs::vec_cast", "character.mo") + s3_register("vctrs::vec_cast", "mo.character") + # S3: disk + s3_register("vctrs::vec_ptype_full", "disk") + s3_register("vctrs::vec_ptype_abbr", "disk") + s3_register("vctrs::vec_ptype2", "disk.default") + s3_register("vctrs::vec_ptype2", "disk.disk") + s3_register("vctrs::vec_cast", "disk.disk") + s3_register("vctrs::vec_cast", "integer.disk") + s3_register("vctrs::vec_cast", "disk.integer") + s3_register("vctrs::vec_cast", "double.disk") + s3_register("vctrs::vec_cast", "disk.double") + s3_register("vctrs::vec_cast", "character.disk") + s3_register("vctrs::vec_cast", "disk.character") + # S3: mic + s3_register("vctrs::vec_ptype2", "mic.default") + s3_register("vctrs::vec_ptype2", "mic.mic") + s3_register("vctrs::vec_cast", "character.mic") + s3_register("vctrs::vec_cast", "double.mic") + s3_register("vctrs::vec_cast", "integer.mic") + s3_register("vctrs::vec_cast", "factor.mic") + s3_register("vctrs::vec_cast", "mic.character") + s3_register("vctrs::vec_cast", "mic.double") + s3_register("vctrs::vec_cast", "mic.integer") + s3_register("vctrs::vec_cast", "mic.factor") + s3_register("vctrs::vec_cast", "mic.mic") + s3_register("vctrs::vec_math", "mic") + s3_register("vctrs::vec_arith", "mic") + # S3: sir + s3_register("vctrs::vec_ptype2", "sir.default") + s3_register("vctrs::vec_ptype2", "sir.sir") + s3_register("vctrs::vec_ptype2", "character.sir") + s3_register("vctrs::vec_cast", "character.sir") + s3_register("vctrs::vec_cast", "sir.character") + s3_register("vctrs::vec_cast", "sir.sir") + + # if mo source exists, fire it up (see mo_source()) + if (tryCatch(file.exists(getOption("AMR_mo_source", "~/mo_source.rds")), error = function(e) FALSE)) { + try(invisible(get_mo_source()), silent = TRUE) + } + # be sure to print tibbles as tibbles + if (pkg_is_available("tibble")) { + try(loadNamespace("tibble"), silent = TRUE) + } + + # reference data - they have additional data to improve algorithm speed + # they cannot be part of R/sysdata.rda since CRAN thinks it would make the package too large (+3 MB) + AMR_env$AB_lookup <- cbind(AMR::antibiotics, AB_LOOKUP) + AMR_env$AV_lookup <- cbind(AMR::antivirals, AV_LOOKUP) +} + +.onAttach <- function(lib, pkg) { + # if custom ab option is available, load it + if (!is.null(getOption("AMR_custom_ab")) && file.exists(getOption("AMR_custom_ab", default = ""))) { + if (getOption("AMR_custom_ab") %unlike% "[.]rds$") { + packageStartupMessage("The file with custom antimicrobials must be an RDS file. Set the option `AMR_custom_ab` to another path.") + } else { + packageStartupMessage("Adding custom antimicrobials from '", getOption("AMR_custom_ab"), "'...", appendLF = FALSE) + x <- readRDS_AMR(getOption("AMR_custom_ab")) + tryCatch( + { + suppressWarnings(suppressMessages(add_custom_antimicrobials(x))) + packageStartupMessage("OK.") + }, + error = function(e) packageStartupMessage("Failed: ", e$message) + ) + } + } + # if custom mo option is available, load it + if (!is.null(getOption("AMR_custom_mo")) && file.exists(getOption("AMR_custom_mo", default = ""))) { + if (getOption("AMR_custom_mo") %unlike% "[.]rds$") { + packageStartupMessage("The file with custom microorganisms must be an RDS file. Set the option `AMR_custom_mo` to another path.") + } else { + packageStartupMessage("Adding custom microorganisms from '", getOption("AMR_custom_mo"), "'...", appendLF = FALSE) + x <- readRDS_AMR(getOption("AMR_custom_mo")) + tryCatch( + { + suppressWarnings(suppressMessages(add_custom_microorganisms(x))) + packageStartupMessage("OK.") + }, + error = function(e) packageStartupMessage("Failed: ", e$message) + ) + } + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/AMR.Rmd + + + +--- +title: "How to conduct AMR data analysis" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{How to conduct AMR data analysis} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 5 +) +``` + +**Note:** values on this page will change with every website update since they are based on randomly created values and the page was written in [R Markdown](https://rmarkdown.rstudio.com/). However, the methodology remains unchanged. This page was generated on `r format(Sys.Date(), "%d %B %Y")`. + +# Introduction + +Conducting AMR data analysis unfortunately requires in-depth knowledge from different scientific fields, which makes it hard to do right. At least, it requires: + +* Good questions (always start with those!) and reliable data +* A thorough understanding of (clinical) epidemiology, to understand the clinical and epidemiological relevance and possible bias of results +* A thorough understanding of (clinical) microbiology/infectious diseases, to understand which microorganisms are causal to which infections and the implications of pharmaceutical treatment, as well as understanding intrinsic and acquired microbial resistance +* Experience with data analysis with microbiological tests and their results, to understand the determination and limitations of MIC values and their interpretations to SIR values +* Availability of the biological taxonomy of microorganisms and probably normalisation factors for pharmaceuticals, such as defined daily doses (DDD) +* Available (inter-)national guidelines, and profound methods to apply them + +Of course, we cannot instantly provide you with knowledge and experience. But with this `AMR` package, we aimed at providing (1) tools to simplify antimicrobial resistance data cleaning, transformation and analysis, (2) methods to easily incorporate international guidelines and (3) scientifically reliable reference data, including the requirements mentioned above. + +The `AMR` package enables standardised and reproducible AMR data analysis, with the application of evidence-based rules, determination of first isolates, translation of various codes for microorganisms and antimicrobial agents, determination of (multi-drug) resistant microorganisms, and calculation of antimicrobial resistance, prevalence and future trends. + +# Preparation + +For this tutorial, we will create fake demonstration data to work with. + +You can skip to [Cleaning the data](#cleaning-the-data) if you already have your own data ready. If you start your analysis, try to make the structure of your data generally look like this: + +```{r example table, echo = FALSE, results = 'asis'} +knitr::kable( + data.frame( + date = Sys.Date(), + patient_id = c("abcd", "abcd", "efgh"), + mo = "Escherichia coli", + AMX = c("S", "S", "R"), + CIP = c("S", "R", "S"), + stringsAsFactors = FALSE + ), + align = "c" +) +``` + +## Needed R packages + +As with many uses in R, we need some additional packages for AMR data analysis. Our package works closely together with the [tidyverse packages](https://www.tidyverse.org) [`dplyr`](https://dplyr.tidyverse.org/) and [`ggplot2`](https://ggplot2.tidyverse.org) by RStudio. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R. + +We will also use the `cleaner` package, that can be used for cleaning data and creating frequency tables. + +```{r lib packages, message = FALSE, warning = FALSE, results = 'asis'} +library(dplyr) +library(ggplot2) +library(AMR) + +# (if not yet installed, install with:) +# install.packages(c("dplyr", "ggplot2", "AMR")) +``` + +The `AMR` package contains a data set `example_isolates_unclean`, which might look data that users have extracted from their laboratory systems: + +```{r} +example_isolates_unclean + +# we will use 'our_data' as the data set name for this tutorial +our_data <- example_isolates_unclean +``` + +For AMR data analysis, we would like the microorganism column to contain valid, up-to-date taxonomy, and the antibiotic columns to be cleaned as SIR values as well. + +## Taxonomy of microorganisms + +With `as.mo()`, users can transform arbitrary microorganism names or codes to current taxonomy. The `AMR` package contains up-to-date taxonomic data. To be specific, currently included data were retrieved on `r format(AMR:::TAXONOMY_VERSION$LPSN$accessed_date, "%d %b %Y")`. + +The codes of the AMR packages that come from `as.mo()` are short, but still human readable. More importantly, `as.mo()` supports all kinds of input: + +```{r, message = FALSE} +as.mo("Klebsiella pneumoniae") +as.mo("K. pneumoniae") +as.mo("KLEPNE") +as.mo("KLPN") +``` + +The first character in above codes denote their taxonomic kingdom, such as Bacteria (B), Fungi (F), and Protozoa (P). + +The `AMR` package also contain functions to directly retrieve taxonomic properties, such as the name, genus, species, family, order, and even Gram-stain. They all start with `mo_` and they use `as.mo()` internally, so that still any arbitrary user input can be used: + +```{r, message = FALSE} +mo_family("K. pneumoniae") +mo_genus("K. pneumoniae") +mo_species("K. pneumoniae") + +mo_gramstain("Klebsiella pneumoniae") + +mo_ref("K. pneumoniae") + +mo_snomed("K. pneumoniae") +``` + +Now we can thus clean our data: + +```{r, echo = FALSE, message = FALSE} +mo_reset_session() +``` + +```{r, message = TRUE} +our_data$bacteria <- as.mo(our_data$bacteria, info = TRUE) +``` + +Apparently, there was some uncertainty about the translation to taxonomic codes. Let's check this: + +```{r} +mo_uncertainties() +``` + +That's all good. + +## Antibiotic results + +The column with antibiotic test results must also be cleaned. The `AMR` package comes with three new data types to work with such test results: `mic` for minimal inhibitory concentrations (MIC), `disk` for disk diffusion diameters, and `sir` for SIR data that have been interpreted already. This package can also determine SIR values based on MIC or disk diffusion values, read more about that on the `as.sir()` page. + +For now, we will just clean the SIR columns in our data using dplyr: + +```{r} +# method 1, be explicit about the columns: +our_data <- our_data %>% + mutate_at(vars(AMX:GEN), as.sir) + +# method 2, let the AMR package determine the eligible columns +our_data <- our_data %>% + mutate_if(is_sir_eligible, as.sir) + +# result: +our_data +``` + +This is basically it for the cleaning, time to start the data inclusion. + +## First isolates + +We need to know which isolates we can *actually* use for analysis without repetition bias. + +To conduct an analysis of antimicrobial resistance, you must [only include the first isolate of every patient per episode](https:/pubmed.ncbi.nlm.nih.gov/17304462/) (Hindler *et al.*, Clin Infect Dis. 2007). If you would not do this, you could easily get an overestimate or underestimate of the resistance of an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following weeks (yes, some countries like the Netherlands have these blood drawing policies). The resistance percentage of oxacillin of all \emph{S. aureus} isolates would be overestimated, because you included this MRSA more than once. It would clearly be [selection bias](https://en.wikipedia.org/wiki/Selection_bias). + +The Clinical and Laboratory Standards Institute (CLSI) appoints this as follows: + +> *(...) When preparing a cumulative antibiogram to guide clinical decisions about empirical antimicrobial therapy of initial infections, **only the first isolate of a given species per patient, per analysis period (eg, one year) should be included, irrespective of body site, antimicrobial susceptibility profile, or other phenotypical characteristics (eg, biotype)**. The first isolate is easily identified, and cumulative antimicrobial susceptibility test data prepared using the first isolate are generally comparable to cumulative antimicrobial susceptibility test data calculated by other methods, providing duplicate isolates are excluded.* +
[M39-A4 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 4th Edition. CLSI, 2014. Chapter 6.4](https://clsi.org/standards/products/microbiology/documents/m39/) + +This `AMR` package includes this methodology with the `first_isolate()` function and is able to apply the four different methods as defined by [Hindler *et al.* in 2007](https://academic.oup.com/cid/article/44/6/867/364325): phenotype-based, episode-based, patient-based, isolate-based. The right method depends on your goals and analysis, but the default phenotype-based method is in any case the method to properly correct for most duplicate isolates. Read more about the methods on the `first_isolate()` page. + +The outcome of the function can easily be added to our data: + +```{r 1st isolate} +our_data <- our_data %>% + mutate(first = first_isolate(info = TRUE)) +``` + +So only `r round((sum(our_data$first) / nrow(our_data) * 100))`% is suitable for resistance analysis! We can now filter on it with the `filter()` function, also from the `dplyr` package: + +```{r 1st isolate filter} +our_data_1st <- our_data %>% + filter(first == TRUE) +``` + +For future use, the above two syntaxes can be shortened: + +```{r 1st isolate filter 2} +our_data_1st <- our_data %>% + filter_first_isolate() +``` + +So we end up with `r format(nrow(our_data_1st), big.mark = " ")` isolates for analysis. Now our data looks like: + +```{r preview data set 3} +our_data_1st +``` + +Time for the analysis. + +# Analysing the data + +The base R `summary()` function gives a good first impression, as it comes with support for the new `mo` and `sir` classes that we now have in our data set: + +```{r} +summary(our_data_1st) + +glimpse(our_data_1st) + +# number of unique values per column: +sapply(our_data_1st, n_distinct) +``` + +## Availability of species + +To just get an idea how the species are distributed, create a frequency table with `count()` based on the name of the microorganisms: + +```{r freq 1} +our_data %>% + count(mo_name(bacteria), sort = TRUE) + +our_data_1st %>% + count(mo_name(bacteria), sort = TRUE) +``` + +## Select and filter with antibiotic selectors + +Using so-called antibiotic class selectors, you can select or filter columns based on the antibiotic class that your antibiotic results are in: + +```{r bug_drg 2a} +our_data_1st %>% + select(date, aminoglycosides()) + +our_data_1st %>% + select(bacteria, betalactams()) + +our_data_1st %>% + select(bacteria, where(is.sir)) + +# filtering using AB selectors is also possible: +our_data_1st %>% + filter(any(aminoglycosides() == "R")) + +our_data_1st %>% + filter(all(betalactams() == "R")) + +# even works in base R (since R 3.0): +our_data_1st[all(betalactams() == "R"), ] +``` + +## Generate antibiograms + +Since AMR v2.0 (March 2023), it is very easy to create different types of antibiograms, with support for 20 different languages. + +There are four antibiogram types, as proposed by Klinker *et al.* (2021, [DOI 10.1177/20499361211011373](https://doi.org/10.1177/20499361211011373)), and they are all supported by the new `antibiogram()` function: + +1. **Traditional Antibiogram (TA)** e.g, for the susceptibility of *Pseudomonas aeruginosa* to piperacillin/tazobactam (TZP) +2. **Combination Antibiogram (CA)** e.g, for the sdditional susceptibility of *Pseudomonas aeruginosa* to TZP + tobramycin versus TZP alone +3. **Syndromic Antibiogram (SA)** e.g, for the susceptibility of *Pseudomonas aeruginosa* to TZP among respiratory specimens (obtained among ICU patients only) +4. **Weighted-Incidence Syndromic Combination Antibiogram (WISCA)** e.g, for the susceptibility of *Pseudomonas aeruginosa* to TZP among respiratory specimens (obtained among ICU patients only) for male patients age >=65 years with heart failure + +In this section, we show how to use the `antibiogram()` function to create any of the above antibiogram types. For starters, this is what the included `example_isolates` data set looks like: + +```{r} +example_isolates +``` + +### Traditional Antibiogram + +To create a traditional antibiogram, simply state which antibiotics should be used. The `antibiotics` argument in the `antibiogram()` function supports any (combination) of the previously mentioned antibiotic class selectors: + +```{r trad} +antibiogram(example_isolates, + antibiotics = c(aminoglycosides(), carbapenems())) +``` + +Notice that the `antibiogram()` function automatically prints in the right format when using Quarto or R Markdown (such as this page), and even applies italics for taxonomic names (by using `italicise_taxonomy()` internally). + +It also uses the language of your OS if this is either `r AMR:::vector_or(vapply(FUN.VALUE = character(1), AMR:::LANGUAGES_SUPPORTED_NAMES, function(x) x$exonym), quotes = FALSE, sort = FALSE)`. In this next example, we force the language to be Spanish using the `language` argument: + +```{r trad2} +antibiogram(example_isolates, + mo_transform = "gramstain", + antibiotics = aminoglycosides(), + ab_transform = "name", + language = "es") +``` + +### Combined Antibiogram + +To create a combined antibiogram, use antibiotic codes or names with a plus `+` character like this: + +```{r comb} +antibiogram(example_isolates, + antibiotics = c("TZP", "TZP+TOB", "TZP+GEN")) +``` + +### Syndromic Antibiogram + +To create a syndromic antibiogram, the `syndromic_group` argument must be used. This can be any column in the data, or e.g. an `ifelse()` with calculations based on certain columns: + +```{r synd} +antibiogram(example_isolates, + antibiotics = c(aminoglycosides(), carbapenems()), + syndromic_group = "ward") +``` + +### Weighted-Incidence Syndromic Combination Antibiogram (WISCA) + +To create a WISCA, you must state combination therapy in the `antibiotics` argument (similar to the Combination Antibiogram), define a syndromic group with the `syndromic_group` argument (similar to the Syndromic Antibiogram) in which cases are predefined based on clinical or demographic characteristics (e.g., endocarditis in 75+ females). This next example is a simplification without clinical characteristics, but just gives an idea of how a WISCA can be created: + +```{r wisca} +wisca <- antibiogram(example_isolates, + antibiotics = c("AMC", "AMC+CIP", "TZP", "TZP+TOB"), + mo_transform = "gramstain", + minimum = 10, # this should be >= 30, but now just as example + syndromic_group = ifelse(example_isolates$age >= 65 & + example_isolates$gender == "M", + "WISCA Group 1", "WISCA Group 2")) +wisca +``` + +### Plotting antibiograms + +Antibiograms can be plotted using `autoplot()` from the `ggplot2` packages, since this `AMR` package provides an extension to that function: + +```{r} +autoplot(wisca) +``` + +To calculate antimicrobial resistance in a more sensible way, also by correcting for too few results, we use the `resistance()` and `susceptibility()` functions. + +## Resistance percentages + +The functions `resistance()` and `susceptibility()` can be used to calculate antimicrobial resistance or susceptibility. For more specific analyses, the functions `proportion_S()`, `proportion_SI()`, `proportion_I()`, `proportion_IR()` and `proportion_R()` can be used to determine the proportion of a specific antimicrobial outcome. + +All these functions contain a `minimum` argument, denoting the minimum required number of test results for returning a value. These functions will otherwise return `NA`. The default is `minimum = 30`, following the [CLSI M39-A4 guideline](https://clsi.org/standards/products/microbiology/documents/m39/) for applying microbial epidemiology. + +As per the EUCAST guideline of 2019, we calculate resistance as the proportion of R (`proportion_R()`, equal to `resistance()`) and susceptibility as the proportion of S and I (`proportion_SI()`, equal to `susceptibility()`). These functions can be used on their own: + +```{r} +our_data_1st %>% resistance(AMX) +``` + +Or can be used in conjunction with `group_by()` and `summarise()`, both from the `dplyr` package: + +```{r} +our_data_1st %>% + group_by(hospital) %>% + summarise(amoxicillin = resistance(AMX)) +``` + +---- + +*Author: Dr. Matthijs Berends, 26th Feb 2023* + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/AMR_for_Python.Rmd + + + +--- +title: "AMR for Python" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{AMR for Python} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 5 +) +``` + +# Introduction + +The `AMR` package for R is a powerful tool for antimicrobial resistance (AMR) analysis. It provides extensive features for handling microbial and antimicrobial data. However, for those who work primarily in Python, we now have a more intuitive option available: the [`AMR` Python Package Index](https://pypi.org/project/AMR/). + +This Python package is a wrapper round the `AMR` R package. It uses the `rpy2` package internally. Despite the need to have R installed, Python users can now easily work with AMR data directly through Python code. + +# Install + +1. Since the Python package is available on the official [Python Package Index](https://pypi.org/project/AMR/), you can just run: + + ```bash + pip install AMR + ``` + +2. Make sure you have R installed. There is **no need to install the `AMR` R package**, as it will be installed automatically. + + For Linux: + + ```bash + # Ubuntu / Debian + sudo apt install r-base + # Fedora: + sudo dnf install R + # CentOS/RHEL + sudo yum install R + ``` + + For macOS (using [Homebrew](https://brew.sh)): + + ```bash + brew install r + ``` + + For Windows, visit the [CRAN download page](https://cran.r-project.org) to download and install R. + +# Examples of Usage + +## Cleaning Taxonomy + +Here’s an example that demonstrates how to clean microorganism and drug names using the `AMR` Python package: + +```python +import pandas as pd +import AMR + +# Sample data +data = { + "MOs": ['E. coli', 'ESCCOL', 'esco', 'Esche coli'], + "Drug": ['Cipro', 'CIP', 'J01MA02', 'Ciproxin'] +} +df = pd.DataFrame(data) + +# Use AMR functions to clean microorganism and drug names +df['MO_clean'] = AMR.mo_name(df['MOs']) +df['Drug_clean'] = AMR.ab_name(df['Drug']) + +# Display the results +print(df) +``` + +| MOs | Drug | MO_clean | Drug_clean | +|-------------|-----------|--------------------|---------------| +| E. coli | Cipro | Escherichia coli | Ciprofloxacin | +| ESCCOL | CIP | Escherichia coli | Ciprofloxacin | +| esco | J01MA02 | Escherichia coli | Ciprofloxacin | +| Esche coli | Ciproxin | Escherichia coli | Ciprofloxacin | + +### Explanation + +* **mo_name:** This function standardises microorganism names. Here, different variations of *Escherichia coli* (such as "E. coli", "ESCCOL", "esco", and "Esche coli") are all converted into the correct, standardised form, "Escherichia coli". + +* **ab_name**: Similarly, this function standardises antimicrobial names. The different representations of ciprofloxacin (e.g., "Cipro", "CIP", "J01MA02", and "Ciproxin") are all converted to the standard name, "Ciprofloxacin". + + +## Taxonomic Data Sets Now in Python! + +As a Python user, you might like that the most important data sets of the `AMR` R package, `microorganisms`, `antibiotics`, `clinical_breakpoints`, and `example_isolates`, are now available as regular Python data frames: + +```python +AMR.microorganisms +``` + +| mo | fullname | status | kingdom | gbif | gbif_parent | gbif_renamed_to | prevalence | +|--------------|------------------------------------|----------|----------|-----------|-------------|-----------------|------------| +| B_GRAMN | (unknown Gram-negatives) | unknown | Bacteria | None | None | None | 2.0 | +| B_GRAMP | (unknown Gram-positives) | unknown | Bacteria | None | None | None | 2.0 | +| B_ANAER-NEG | (unknown anaerobic Gram-negatives) | unknown | Bacteria | None | None | None | 2.0 | +| B_ANAER-POS | (unknown anaerobic Gram-positives) | unknown | Bacteria | None | None | None | 2.0 | +| B_ANAER | (unknown anaerobic bacteria) | unknown | Bacteria | None | None | None | 2.0 | +| ... | ... | ... | ... | ... | ... | ... | ... | +| B_ZYMMN_POMC | Zymomonas pomaceae | accepted | Bacteria | 10744418 | 3221412 | None | 2.0 | +| B_ZYMPH | Zymophilus | synonym | Bacteria | None | 9475166 | None | 2.0 | +| B_ZYMPH_PCVR | Zymophilus paucivorans | synonym | Bacteria | None | None | None | 2.0 | +| B_ZYMPH_RFFN | Zymophilus raffinosivorans | synonym | Bacteria | None | None | None | 2.0 | +| F_ZYZYG | Zyzygomyces | unknown | Fungi | None | 7581 | None | 2.0 | + +```python +AMR.antibiotics +``` + +| ab | cid | name | group | oral_ddd | oral_units | iv_ddd | iv_units | +|-----|-------------|----------------------|----------------------------|----------|------------|--------|----------| +| AMA | 4649.0 | 4-aminosalicylic acid| Antimycobacterials | 12.00 | g | NaN | None | +| ACM | 6450012.0 | Acetylmidecamycin | Macrolides/lincosamides | NaN | None | NaN | None | +| ASP | 49787020.0 | Acetylspiramycin | Macrolides/lincosamides | NaN | None | NaN | None | +| ALS | 8954.0 | Aldesulfone sodium | Other antibacterials | 0.33 | g | NaN | None | +| AMK | 37768.0 | Amikacin | Aminoglycosides | NaN | None | 1.0 | g | +| ... | ... | ... | ... | ... | ... | ... | ... | +| VIR | 11979535.0 | Virginiamycine | Other antibacterials | NaN | None | NaN | None | +| VOR | 71616.0 | Voriconazole | Antifungals/antimycotics | 0.40 | g | 0.4 | g | +| XBR | 72144.0 | Xibornol | Other antibacterials | NaN | None | NaN | None | +| ZID | 77846445.0 | Zidebactam | Other antibacterials | NaN | None | NaN | None | +| ZFD | NaN | Zoliflodacin | None | NaN | None | NaN | None | + + +## Calculating AMR + +```python +import AMR +import pandas as pd + +df = AMR.example_isolates +result = AMR.resistance(df["AMX"]) +print(result) +``` + +``` +[0.59555556] +``` + +## Generating Antibiograms + +One of the core functions of the `AMR` package is generating an antibiogram, a table that summarises the antimicrobial susceptibility of bacterial isolates. Here’s how you can generate an antibiogram from Python: + +```python +result2a = AMR.antibiogram(df[["mo", "AMX", "CIP", "TZP"]]) +print(result2a) +``` + +| Pathogen | Amoxicillin | Ciprofloxacin | Piperacillin/tazobactam | +|-----------------|-----------------|-----------------|--------------------------| +| CoNS | 7% (10/142) | 73% (183/252) | 30% (10/33) | +| E. coli | 50% (196/392) | 88% (399/456) | 94% (393/416) | +| K. pneumoniae | 0% (0/58) | 96% (53/55) | 89% (47/53) | +| P. aeruginosa | 0% (0/30) | 100% (30/30) | None | +| P. mirabilis | None | 94% (34/36) | None | +| S. aureus | 6% (8/131) | 90% (171/191) | None | +| S. epidermidis | 1% (1/91) | 64% (87/136) | None | +| S. hominis | None | 80% (56/70) | None | +| S. pneumoniae | 100% (112/112) | None | 100% (112/112) | + + +```python +result2b = AMR.antibiogram(df[["mo", "AMX", "CIP", "TZP"]], mo_transform = "gramstain") +print(result2b) +``` + +| Pathogen | Amoxicillin | Ciprofloxacin | Piperacillin/tazobactam | +|----------------|-----------------|------------------|--------------------------| +| Gram-negative | 36% (226/631) | 91% (621/684) | 88% (565/641) | +| Gram-positive | 43% (305/703) | 77% (560/724) | 86% (296/345) | + + +In this example, we generate an antibiogram by selecting various antibiotics. + +# Conclusion + +With the `AMR` Python package, Python users can now effortlessly call R functions from the `AMR` R package. This eliminates the need for complex `rpy2` configurations and provides a clean, easy-to-use interface for antimicrobial resistance analysis. The examples provided above demonstrate how this can be applied to typical workflows, such as standardising microorganism and antimicrobial names or calculating resistance. + +By just running `import AMR`, users can seamlessly integrate the robust features of the R `AMR` package into Python workflows. + +Whether you're cleaning data or analysing resistance patterns, the `AMR` Python package makes it easy to work with AMR data in Python. + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/EUCAST.Rmd + + + +--- +title: "How to apply EUCAST rules" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{How to apply EUCAST rules} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 4.5 +) +library(AMR) +``` + +## Introduction + +What are EUCAST rules? The European Committee on Antimicrobial Susceptibility Testing (EUCAST) states [on their website](https://www.eucast.org/expert_rules_and_intrinsic_resistance/): + +> *EUCAST expert rules are a tabulated collection of expert knowledge on intrinsic resistances, exceptional resistance phenotypes and interpretive rules that may be applied to antimicrobial susceptibility testing in order to reduce errors and make appropriate recommendations for reporting particular resistances.* + +In Europe, a lot of medical microbiological laboratories already apply these rules ([Brown *et al.*, 2015](https://www.eurosurveillance.org/content/10.2807/1560-7917.ES2015.20.2.21008)). Our package features their latest insights on intrinsic resistance and unusual phenotypes (`r AMR:::EUCAST_VERSION_EXPERT_RULES[[length(AMR:::EUCAST_VERSION_EXPERT_RULES)]]$version_txt`, `r AMR:::EUCAST_VERSION_EXPERT_RULES[[length(AMR:::EUCAST_VERSION_EXPERT_RULES)]]$year`). + +Moreover, the `eucast_rules()` function we use for this purpose can also apply additional rules, like forcing ampicillin = R in isolates when amoxicillin/clavulanic acid = R. + +## Examples + +These rules can be used to discard impossible bug-drug combinations in your data. For example, *Klebsiella* produces beta-lactamase that prevents ampicillin (or amoxicillin) from working against it. In other words, practically every strain of *Klebsiella* is resistant to ampicillin. + +Sometimes, laboratory data can still contain such strains with ampicillin being susceptible to ampicillin. This could be because an antibiogram is available before an identification is available, and the antibiogram is then not re-interpreted based on the identification (namely, *Klebsiella*). EUCAST expert rules solve this, that can be applied using `eucast_rules()`: + +```{r, warning = FALSE, message = FALSE} +oops <- data.frame( + mo = c( + "Klebsiella", + "Escherichia" + ), + ampicillin = "S" +) +oops + +eucast_rules(oops, info = FALSE) +``` + +A more convenient function is `mo_is_intrinsic_resistant()` that uses the same guideline, but allows to check for one or more specific microorganisms or antibiotics: + +```{r, warning = FALSE, message = FALSE} +mo_is_intrinsic_resistant( + c("Klebsiella", "Escherichia"), + "ampicillin" +) + +mo_is_intrinsic_resistant( + "Klebsiella", + c("ampicillin", "kanamycin") +) +``` + +EUCAST rules can not only be used for correction, they can also be used for filling in known resistance and susceptibility based on results of other antimicrobials drugs. This process is called *interpretive reading*, is basically a form of imputation, and is part of the `eucast_rules()` function as well: + +```{r, warning = FALSE, message = FALSE} +data <- data.frame( + mo = c( + "Staphylococcus aureus", + "Enterococcus faecalis", + "Escherichia coli", + "Klebsiella pneumoniae", + "Pseudomonas aeruginosa" + ), + VAN = "-", # Vancomycin + AMX = "-", # Amoxicillin + COL = "-", # Colistin + CAZ = "-", # Ceftazidime + CXM = "-", # Cefuroxime + PEN = "S", # Benzylenicillin + FOX = "S", # Cefoxitin + stringsAsFactors = FALSE +) +``` +```{r, eval = FALSE} +data +``` +```{r, echo = FALSE} +knitr::kable(data, align = "lccccccc") +``` +```{r, warning = FALSE, eval = FALSE} +eucast_rules(data) +``` +```{r, warning = FALSE, echo = FALSE, message = FALSE} +knitr::kable(eucast_rules(data), align = "lccccccc") +``` + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/MDR.Rmd + + + +--- +title: "How to determine multi-drug resistance (MDR)" +output: + rmarkdown::html_vignette: + toc: true +vignette: > + %\VignetteIndexEntry{How to determine multi-drug resistance (MDR)} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>" +) +library(AMR) +``` + +With the function `mdro()`, you can determine which micro-organisms are multi-drug resistant organisms (MDRO). + +### Type of input + +The `mdro()` function takes a data set as input, such as a regular `data.frame`. It tries to automatically determine the right columns for info about your isolates, such as the name of the species and all columns with results of antimicrobial agents. See the help page for more info about how to set the right settings for your data with the command `?mdro`. + +For WHONET data (and most other data), all settings are automatically set correctly. + +### Guidelines + +The `mdro()` function support multiple guidelines. You can select a guideline with the `guideline` parameter. Currently supported guidelines are (case-insensitive): + +* `guideline = "CMI2012"` (default) + + Magiorakos AP, Srinivasan A *et al.* "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." Clinical Microbiology and Infection (2012) ([link](https://www.clinicalmicrobiologyandinfection.com/article/S1198-743X(14)61632-3/fulltext)) + +* `guideline = "EUCAST3.2"` (or simply `guideline = "EUCAST"`) + + The European international guideline - EUCAST Expert Rules Version 3.2 "Intrinsic Resistance and Unusual Phenotypes" ([link](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf)) + +* `guideline = "EUCAST3.1"` + + The European international guideline - EUCAST Expert Rules Version 3.1 "Intrinsic Resistance and Exceptional Phenotypes Tables" ([link](https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf)) + +* `guideline = "TB"` + + The international guideline for multi-drug resistant tuberculosis - World Health Organization "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis" ([link](https://www.who.int/tb/publications/pmdt_companionhandbook/en/)) + +* `guideline = "MRGN"` + + The German national guideline - Mueller *et al.* (2015) Antimicrobial Resistance and Infection Control 4:7. DOI: 10.1186/s13756-015-0047-6 + +* `guideline = "BRMO"` + + The Dutch national guideline - Rijksinstituut voor Volksgezondheid en Milieu "WIP-richtlijn BRMO (Bijzonder Resistente Micro-Organismen) (ZKH)" ([link](https://www.rivm.nl/wip-richtlijn-brmo-bijzonder-resistente-micro-organismen-zkh)) + +Please suggest your own (country-specific) guidelines by letting us know: . + +#### Custom Guidelines + +You can also use your own custom guideline. Custom guidelines can be set with the `custom_mdro_guideline()` function. This is of great importance if you have custom rules to determine MDROs in your hospital, e.g., rules that are dependent on ward, state of contact isolation or other variables in your data. + +If you are familiar with `case_when()` of the `dplyr` package, you will recognise the input method to set your own rules. Rules must be set using what R considers to be the 'formula notation': + +```{r} +custom <- custom_mdro_guideline( + CIP == "R" & age > 60 ~ "Elderly Type A", + ERY == "R" & age > 60 ~ "Elderly Type B" +) +``` + +If a row/an isolate matches the first rule, the value after the first `~` (in this case *'Elderly Type A'*) will be set as MDRO value. Otherwise, the second rule will be tried and so on. The maximum number of rules is unlimited. + +You can print the rules set in the console for an overview. Colours will help reading it if your console supports colours. + +```{r} +custom +``` + +The outcome of the function can be used for the `guideline` argument in the `mdro()` function: + +```{r} +x <- mdro(example_isolates, guideline = custom) +table(x) +``` + +The rules set (the `custom` object in this case) could be exported to a shared file location using `saveRDS()` if you collaborate with multiple users. The custom rules set could then be imported using `readRDS()`. + +### Examples + +The `mdro()` function always returns an ordered `factor` for predefined guidelines. For example, the output of the default guideline by Magiorakos *et al.* returns a `factor` with levels 'Negative', 'MDR', 'XDR' or 'PDR' in that order. + +The next example uses the `example_isolates` data set. This is a data set included with this package and contains full antibiograms of 2,000 microbial isolates. It reflects reality and can be used to practise AMR data analysis. If we test the MDR/XDR/PDR guideline on this data set, we get: + +```{r, message = FALSE} +library(dplyr) # to support pipes: %>% +library(cleaner) # to create frequency tables +``` +```{r, results = 'hide'} +example_isolates %>% + mdro() %>% + freq() # show frequency table of the result +``` +```{r, echo = FALSE, results = 'asis', message = FALSE, warning = FALSE} +example_isolates %>% + mdro(info = FALSE) %>% + freq() # show frequency table of the result +``` + +For another example, I will create a data set to determine multi-drug resistant TB: + +```{r} +# random_sir() is a helper function to generate +# a random vector with values S, I and R +my_TB_data <- data.frame( + rifampicin = random_sir(5000), + isoniazid = random_sir(5000), + gatifloxacin = random_sir(5000), + ethambutol = random_sir(5000), + pyrazinamide = random_sir(5000), + moxifloxacin = random_sir(5000), + kanamycin = random_sir(5000) +) +``` + +Because all column names are automatically verified for valid drug names or codes, this would have worked exactly the same way: + +```{r, eval = FALSE} +my_TB_data <- data.frame( + RIF = random_sir(5000), + INH = random_sir(5000), + GAT = random_sir(5000), + ETH = random_sir(5000), + PZA = random_sir(5000), + MFX = random_sir(5000), + KAN = random_sir(5000) +) +``` + +The data set now looks like this: + +```{r} +head(my_TB_data) +``` + +We can now add the interpretation of MDR-TB to our data set. You can use: + +```r +mdro(my_TB_data, guideline = "TB") +``` + +or its shortcut `mdr_tb()`: + +```{r} +my_TB_data$mdr <- mdr_tb(my_TB_data) +``` + +Create a frequency table of the results: + +```{r, results = 'asis'} +freq(my_TB_data$mdr) +``` + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/PCA.Rmd + + + +--- +title: "How to conduct principal component analysis (PCA) for AMR" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{How to conduct principal component analysis (PCA) for AMR} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 4.5, + dpi = 100 +) +``` + +**NOTE: This page will be updated soon, as the pca() function is currently being developed.** + +# Introduction + +# Transforming + +For PCA, we need to transform our AMR data first. This is what the `example_isolates` data set in this package looks like: + +```{r, message = FALSE} +library(AMR) +library(dplyr) +glimpse(example_isolates) +``` + +Now to transform this to a data set with only resistance percentages per taxonomic order and genus: + +```{r, warning = FALSE} +resistance_data <- example_isolates %>% + group_by( + order = mo_order(mo), # group on anything, like order + genus = mo_genus(mo) + ) %>% # and genus as we do here + summarise_if(is.sir, resistance) %>% # then get resistance of all drugs + select( + order, genus, AMC, CXM, CTX, + CAZ, GEN, TOB, TMP, SXT + ) # and select only relevant columns + +head(resistance_data) +``` + +# Perform principal component analysis + +The new `pca()` function will automatically filter on rows that contain numeric values in all selected variables, so we now only need to do: + +```{r pca} +pca_result <- pca(resistance_data) +``` + +The result can be reviewed with the good old `summary()` function: + +```{r} +summary(pca_result) +``` + +```{r, echo = FALSE} +proportion_of_variance <- summary(pca_result)$importance[2, ] +``` + +Good news. The first two components explain a total of `r cleaner::percentage(sum(proportion_of_variance[1:2]))` of the variance (see the PC1 and PC2 values of the *Proportion of Variance*. We can create a so-called biplot with the base R `biplot()` function, to see which antimicrobial resistance per drug explain the difference per microorganism. + +# Plotting the results + +```{r} +biplot(pca_result) +``` + +But we can't see the explanation of the points. Perhaps this works better with our new `ggplot_pca()` function, that automatically adds the right labels and even groups: + +```{r} +ggplot_pca(pca_result) +``` + +You can also print an ellipse per group, and edit the appearance: + +```{r} +ggplot_pca(pca_result, ellipse = TRUE) + + ggplot2::labs(title = "An AMR/PCA biplot!") +``` + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/WHONET.Rmd + + + +--- +title: "How to work with WHONET data" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{How to work with WHONET data} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 4.5 +) +``` + +### Import of data + +This tutorial assumes you already imported the WHONET data with e.g. the [`readxl` package](https://readxl.tidyverse.org/). In RStudio, this can be done using the menu button 'Import Dataset' in the tab 'Environment'. Choose the option 'From Excel' and select your exported file. Make sure date fields are imported correctly. + +An example syntax could look like this: + +```{r, eval = FALSE} +library(readxl) +data <- read_excel(path = "path/to/your/file.xlsx") +``` + +This package comes with an [example data set `WHONET`](https://msberends.github.io/AMR/reference/WHONET.html). We will use it for this analysis. + +### Preparation + +First, load the relevant packages if you did not yet did this. I use the tidyverse for all of my analyses. All of them. If you don't know it yet, I suggest you read about it on their website: https://www.tidyverse.org/. + +```{r, message = FALSE} +library(dplyr) # part of tidyverse +library(ggplot2) # part of tidyverse +library(AMR) # this package +library(cleaner) # to create frequency tables +``` + +We will have to transform some variables to simplify and automate the analysis: + +* Microorganisms should be transformed to our own microorganism codes (called an `mo`) using [our Catalogue of Life reference data set](https://msberends.github.io/AMR/reference/catalogue_of_life), which contains all ~70,000 microorganisms from the taxonomic kingdoms Bacteria, Fungi and Protozoa. We do the tranformation with `as.mo()`. This function also recognises almost all WHONET abbreviations of microorganisms. +* Antimicrobial results or interpretations have to be clean and valid. In other words, they should only contain values `"S"`, `"I"` or `"R"`. That is exactly where the `as.sir()` function is for. + +```{r} +# transform variables +data <- WHONET %>% + # get microbial ID based on given organism + mutate(mo = as.mo(Organism)) %>% + # transform everything from "AMP_ND10" to "CIP_EE" to the new `sir` class + mutate_at(vars(AMP_ND10:CIP_EE), as.sir) +``` + +No errors or warnings, so all values are transformed succesfully. + +We also created a package dedicated to data cleaning and checking, called the `cleaner` package. Its `freq()` function can be used to create frequency tables. + +So let's check our data, with a couple of frequency tables: + +```{r, results = 'asis'} +# our newly created `mo` variable, put in the mo_name() function +data %>% freq(mo_name(mo), nmax = 10) +``` +```{r, results = 'asis'} +# our transformed antibiotic columns +# amoxicillin/clavulanic acid (J01CR02) as an example +data %>% freq(AMC_ND2) +``` + +### A first glimpse at results + +An easy `ggplot` will already give a lot of information, using the included `ggplot_sir()` function: + +```{r, eval = FALSE} +data %>% + group_by(Country) %>% + select(Country, AMP_ND2, AMC_ED20, CAZ_ED10, CIP_ED5) %>% + ggplot_sir(translate_ab = "ab", facet = "Country", datalabels = FALSE) +``` + +```{r, echo = FALSE} +# on very old and some new releases of R, this may lead to an error +tryCatch( + data %>% + group_by(Country) %>% + select(Country, AMP_ND2, AMC_ED20, CAZ_ED10, CIP_ED5) %>% + ggplot_sir(translate_ab = "ab", facet = "Country", datalabels = FALSE) %>% + print(), + error = function(e) base::invisible() +) +``` + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/datasets.Rmd + + + +--- +title: "Data sets for download / own use" +date: '`r format(Sys.Date(), "%d %B %Y")`' +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 1 +vignette: > + %\VignetteIndexEntry{Data sets for download / own use} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = "markup"} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#", + fig.width = 7.5, + fig.height = 5 +) + +library(AMR) +library(dplyr) + +options(knitr.kable.NA = "") + +structure_txt <- function(dataset) { + paste0( + "A data set with ", + format(nrow(dataset), big.mark = " "), " rows and ", + ncol(dataset), " columns, containing the following column names: \n", + AMR:::vector_or(colnames(dataset), quotes = "*", last_sep = " and ", sort = FALSE), "." + ) +} + +download_txt <- function(filename) { + msg <- paste0( + "It was last updated on ", + trimws(format(file.mtime(paste0("../data/", filename, ".rda")), "%e %B %Y %H:%M:%S %Z", tz = "UTC")), + ". Find more info about the structure of this data set [here](https://msberends.github.io/AMR/reference/", ifelse(filename == "antivirals", "antibiotics", filename), ".html).\n" + ) + github_base <- "https://github.com/msberends/AMR/raw/main/data-raw/" + filename <- paste0("../data-raw/", filename) + rds <- paste0(filename, ".rds") + txt <- paste0(filename, ".txt") + excel <- paste0(filename, ".xlsx") + feather <- paste0(filename, ".feather") + parquet <- paste0(filename, ".parquet") + xpt <- paste0(filename, ".xpt") + spss <- paste0(filename, ".sav") + stata <- paste0(filename, ".dta") + create_txt <- function(filename, type, software, exists) { + if (isTRUE(exists)) { + paste0( + "* Download as [", software, "](", github_base, filename, ") (", + AMR:::formatted_filesize(filename), ") \n" + ) + } else { + paste0("* *(unavailable as ", software, ")*\n") + } + } + + if (any( + file.exists(rds), + file.exists(txt), + file.exists(excel), + file.exists(feather), + file.exists(parquet), + file.exists(xpt), + file.exists(spss), + file.exists(stata) + )) { + msg <- c( + msg, "\n**Direct download links:**\n\n", + create_txt(rds, "rds", "original R Data Structure (RDS) file", file.exists(rds)), + create_txt(txt, "txt", "tab-separated text file", file.exists(txt)), + create_txt(excel, "xlsx", "Microsoft Excel workbook", file.exists(excel)), + create_txt(feather, "feather", "Apache Feather file", file.exists(feather)), + create_txt(parquet, "parquet", "Apache Parquet file", file.exists(parquet)), + # create_txt(xpt, "xpt", "SAS transport (XPT) file", file.exists(xpt)), + create_txt(spss, "sav", "IBM SPSS Statistics data file", file.exists(spss)), + create_txt(stata, "dta", "Stata DTA file", file.exists(stata)) + ) + } + paste0(msg, collapse = "") +} + +print_df <- function(x, rows = 6) { + x %>% + as.data.frame(stringsAsFactors = FALSE) %>% + head(n = rows) %>% + mutate_all(function(x) { + if (is.list(x)) { + sapply(x, function(y) { + if (length(y) > 3) { + paste0(paste(y[1:3], collapse = ", "), ", ...") + } else if (length(y) == 0 || all(is.na(y))) { + "" + } else { + paste(y, collapse = ", ") + } + }) + } else { + x + } + }) %>% + knitr::kable(align = "c") +} +``` + +All reference data (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) in this `AMR` package are reliable, up-to-date and freely available. We continually export our data sets to formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. + +On this page, we explain how to download them and how the structure of the data sets look like. + +## `microorganisms`: Full Microbial Taxonomy + +`r structure_txt(microorganisms)` + +This data set is in R available as `microorganisms`, after you load the `AMR` package. + +`r download_txt("microorganisms")` + +**NOTE: The exported files for SPSS and Stata contain only the first 50 SNOMED codes per record, as their file size would otherwise exceed 100 MB; the file size limit of GitHub.** Their file structures and compression techniques are very inefficient. Advice? Use R instead. It's free and much better in many ways. + +The tab-separated text file and Microsoft Excel workbook both contain all SNOMED codes as comma separated values. + +### Source + +This data set contains the full microbial taxonomy of `r AMR:::nr2char(length(unique(AMR::microorganisms$kingdom[!AMR::microorganisms$kingdom %like% "unknown"])))` kingdoms from the `r AMR:::TAXONOMY_VERSION$LPSN$name`, `r AMR:::TAXONOMY_VERSION$MycoBank$name`, and the `r AMR:::TAXONOMY_VERSION$GBIF$name`: + +* `r AMR:::TAXONOMY_VERSION$LPSN$citation` Accessed from <`r AMR:::TAXONOMY_VERSION$LPSN$url`> on `r AMR:::documentation_date(AMR:::TAXONOMY_VERSION$LPSN$accessed_date)`. +* `r AMR:::TAXONOMY_VERSION$MycoBank$citation` Accessed from <`r AMR:::TAXONOMY_VERSION$MycoBank$url`> on `r AMR:::documentation_date(AMR:::TAXONOMY_VERSION$MycoBank$accessed_date)`. +* `r AMR:::TAXONOMY_VERSION$GBIF$citation` Accessed from <`r AMR:::TAXONOMY_VERSION$GBIF$url`> on `r AMR:::documentation_date(AMR:::TAXONOMY_VERSION$GBIF$accessed_date)`. +* `r AMR:::TAXONOMY_VERSION$BacDive$citation` Accessed from <`r AMR:::TAXONOMY_VERSION$BacDive$url`> on `r AMR:::documentation_date(AMR:::TAXONOMY_VERSION$BacDive$accessed_date)`. +* `r AMR:::TAXONOMY_VERSION$SNOMED$citation` URL: <`r AMR:::TAXONOMY_VERSION$SNOMED$url`> + +### Example content + +Included (sub)species per taxonomic kingdom: + +```{r, echo = FALSE} +microorganisms %>% + count(kingdom) %>% + mutate(n = format(n, big.mark = " ")) %>% + setNames(c("Kingdom", "Number of (sub)species")) %>% + print_df() +``` + +Example rows when filtering on genus *Escherichia*: + +```{r, echo = FALSE} +microorganisms %>% + filter(genus == "Escherichia") %>% + print_df() +``` + + +## `antibiotics`: Antibiotic (+Antifungal) Drugs + +`r structure_txt(antibiotics)` + +This data set is in R available as `antibiotics`, after you load the `AMR` package. + +`r download_txt("antibiotics")` + +The tab-separated text, Microsoft Excel, SPSS, and Stata files all contain the ATC codes, common abbreviations, trade names and LOINC codes as comma separated values. + +### Source + +This data set contains all EARS-Net and ATC codes gathered from WHO and WHONET, and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration. + +* [ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology](https://atcddd.fhi.no/atc_ddd_index/) (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use) +* [PubChem by the US National Library of Medicine](https://pubchem.ncbi.nlm.nih.gov) +* [WHONET software 2019](https://whonet.org) +* [LOINC (Logical Observation Identifiers Names and Codes)](https://loinc.org) + +### Example content + +```{r, echo = FALSE} +antibiotics %>% + filter(ab %in% colnames(example_isolates)) %>% + print_df() +``` + + +## `antivirals`: Antiviral Drugs + +`r structure_txt(antivirals)` + +This data set is in R available as `antivirals`, after you load the `AMR` package. + +`r download_txt("antivirals")` + +The tab-separated text, Microsoft Excel, SPSS, and Stata files all contain the trade names and LOINC codes as comma separated values. + +### Source + +This data set contains all ATC codes gathered from WHO and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration. + +* [ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology](https://atcddd.fhi.no/atc_ddd_index/) (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use) +* [PubChem by the US National Library of Medicine](https://pubchem.ncbi.nlm.nih.gov) +* [LOINC (Logical Observation Identifiers Names and Codes)](https://loinc.org) + +### Example content + +```{r, echo = FALSE} +antivirals %>% + print_df() +``` + +## `clinical_breakpoints`: Interpretation from MIC values & disk diameters to SIR + +`r structure_txt(clinical_breakpoints)` + +This data set is in R available as `clinical_breakpoints`, after you load the `AMR` package. + +`r download_txt("clinical_breakpoints")` + +### Source + +This data set contains interpretation rules for MIC values and disk diffusion diameters. Included guidelines are CLSI (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "CLSI")$guideline)))`) and EUCAST (`r min(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`). + +Clinical breakpoints in this package were validated through and imported from [WHONET](https://whonet.org), a free desktop Windows application developed and supported by the WHO Collaborating Centre for Surveillance of Antimicrobial Resistance. More can be read on [their website](https://whonet.org). The developers of WHONET and this `AMR` package have been in contact about sharing their work. We highly appreciate their development on the WHONET software. + +The CEO of CLSI and the chairman of EUCAST have endorsed the work and public use of this `AMR` package (and consequently the use of their breakpoints) in June 2023, when future development of distributing clinical breakpoints was discussed in a meeting between CLSI, EUCAST, the WHO, and developers of WHONET and the `AMR` package. + +**NOTE:** this `AMR` package (and the WHONET software as well) contains internal methods to apply the guidelines, which is rather complex. For example, some breakpoints must be applied on certain species groups (which are in case of this package available through the `microorganisms.groups` data set). It is important that this is considered when using the breakpoints for own use. + +### Example content + +```{r, echo = FALSE} +clinical_breakpoints %>% + mutate(mo_name = mo_name(mo, language = NULL), .after = mo) %>% + mutate(ab_name = ab_name(ab, language = NULL), .after = ab) %>% + print_df() +``` + + +## `intrinsic_resistant`: Intrinsic Bacterial Resistance + +`r structure_txt(intrinsic_resistant)` + +This data set is in R available as `intrinsic_resistant`, after you load the `AMR` package. + +`r download_txt("intrinsic_resistant")` + +### Source + +This data set contains all defined intrinsic resistance by EUCAST of all bug-drug combinations, and is based on `r AMR:::format_eucast_version_nr("3.3")`. + +### Example content + +Example rows when filtering on *Enterobacter cloacae*: + +```{r, echo = FALSE} +intrinsic_resistant %>% + transmute( + microorganism = mo_name(mo), + antibiotic = ab_name(ab) + ) %>% + filter(microorganism == "Enterobacter cloacae") %>% + arrange(antibiotic) %>% + print_df(rows = Inf) +``` + + +## `dosage`: Dosage Guidelines from EUCAST + +`r structure_txt(dosage)` + +This data set is in R available as `dosage`, after you load the `AMR` package. + +`r download_txt("dosage")` + +### Source + +EUCAST breakpoints used in this package are based on the dosages in this data set. + +Currently included dosages in the data set are meant for: `r AMR:::format_eucast_version_nr(unique(dosage$eucast_version))`. + +### Example content + +```{r, echo = FALSE} +dosage %>% + print_df() +``` + + +## `example_isolates`: Example Data for Practice + +`r structure_txt(example_isolates)` + +This data set is in R available as `example_isolates`, after you load the `AMR` package. + +`r download_txt("example_isolates")` + +### Source + +This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis. + +### Example content + +```{r, echo = FALSE} +example_isolates %>% + print_df() +``` + +## `example_isolates_unclean`: Example Data for Practice + +`r structure_txt(example_isolates_unclean)` + +This data set is in R available as `example_isolates_unclean`, after you load the `AMR` package. + +`r download_txt("example_isolates_unclean")` + +### Source + +This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis. + +### Example content + +```{r, echo = FALSE} +example_isolates_unclean %>% + print_df() +``` + + +## `microorganisms.groups`: Species Groups and Microbiological Complexes + +`r structure_txt(microorganisms.groups)` + +This data set is in R available as `microorganisms.groups`, after you load the `AMR` package. + +`r download_txt("microorganisms.groups")` + +### Source + +This data set contains species groups and microbiological complexes, which are used in the `clinical_breakpoints` data set. + +### Example content + +```{r, echo = FALSE} +microorganisms.groups %>% + print_df() +``` + + +## `microorganisms.codes`: Common Laboratory Codes + +`r structure_txt(microorganisms.codes)` + +This data set is in R available as `microorganisms.codes`, after you load the `AMR` package. + +`r download_txt("microorganisms.codes")` + +### Source + +This data set contains commonly used codes for microorganisms, from laboratory systems and [WHONET](https://whonet.org). + +### Example content + +```{r, echo = FALSE} +microorganisms.codes %>% + print_df() +``` + + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/resistance_predict.Rmd + + + +--- +title: "How to predict antimicrobial resistance" +output: + rmarkdown::html_vignette: + toc: true +vignette: > + %\VignetteIndexEntry{How to predict antimicrobial resistance} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 4.75 +) +``` + +## Needed R packages +As with many uses in R, we need some additional packages for AMR data analysis. Our package works closely together with the [tidyverse packages](https://www.tidyverse.org) [`dplyr`](https://dplyr.tidyverse.org/) and [`ggplot2`](https://ggplot2.tidyverse.org). The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R. + +Our `AMR` package depends on these packages and even extends their use and functions. + +```{r lib packages, message = FALSE} +library(dplyr) +library(ggplot2) +library(AMR) + +# (if not yet installed, install with:) +# install.packages(c("tidyverse", "AMR")) +``` + +## Prediction analysis +Our package contains a function `resistance_predict()`, which takes the same input as functions for [other AMR data analysis](./AMR.html). Based on a date column, it calculates cases per year and uses a regression model to predict antimicrobial resistance. + +It is basically as easy as: +```{r, eval = FALSE} +# resistance prediction of piperacillin/tazobactam (TZP): +resistance_predict(tbl = example_isolates, col_date = "date", col_ab = "TZP", model = "binomial") + +# or: +example_isolates %>% + resistance_predict( + col_ab = "TZP", + model = "binomial" + ) + +# to bind it to object 'predict_TZP' for example: +predict_TZP <- example_isolates %>% + resistance_predict( + col_ab = "TZP", + model = "binomial" + ) +``` + +The function will look for a date column itself if `col_date` is not set. + +When running any of these commands, a summary of the regression model will be printed unless using `resistance_predict(..., info = FALSE)`. + +```{r, echo = FALSE, message = FALSE} +predict_TZP <- example_isolates %>% + resistance_predict(col_ab = "TZP", model = "binomial") +``` + +This text is only a printed summary - the actual result (output) of the function is a `data.frame` containing for each year: the number of observations, the actual observed resistance, the estimated resistance and the standard error below and above the estimation: + +```{r} +predict_TZP +``` + +The function `plot` is available in base R, and can be extended by other packages to depend the output based on the type of input. We extended its function to cope with resistance predictions: + +```{r, fig.height = 5.5} +plot(predict_TZP) +``` + +This is the fastest way to plot the result. It automatically adds the right axes, error bars, titles, number of available observations and type of model. + +We also support the `ggplot2` package with our custom function `ggplot_sir_predict()` to create more appealing plots: + +```{r} +ggplot_sir_predict(predict_TZP) + +# choose for error bars instead of a ribbon +ggplot_sir_predict(predict_TZP, ribbon = FALSE) +``` + +### Choosing the right model + +Resistance is not easily predicted; if we look at vancomycin resistance in Gram-positive bacteria, the spread (i.e. standard error) is enormous: + +```{r} +example_isolates %>% + filter(mo_gramstain(mo, language = NULL) == "Gram-positive") %>% + resistance_predict(col_ab = "VAN", year_min = 2010, info = FALSE, model = "binomial") %>% + ggplot_sir_predict() +``` + +Vancomycin resistance could be 100% in ten years, but might remain very low. + +You can define the model with the `model` parameter. The model chosen above is a generalised linear regression model using a binomial distribution, assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance. + +Valid values are: + +| Input values | Function used by R | Type of model | +|----------------------------------------|-------------------------------|-----------------------------------------------------| +| `"binomial"` or `"binom"` or `"logit"` | `glm(..., family = binomial)` | Generalised linear model with binomial distribution | +| `"loglin"` or `"poisson"` | `glm(..., family = poisson)` | Generalised linear model with poisson distribution | +| `"lin"` or `"linear"` | `lm()` | Linear model | + +For the vancomycin resistance in Gram-positive bacteria, a linear model might be more appropriate: + +```{r} +example_isolates %>% + filter(mo_gramstain(mo, language = NULL) == "Gram-positive") %>% + resistance_predict(col_ab = "VAN", year_min = 2010, info = FALSE, model = "linear") %>% + ggplot_sir_predict() +``` + +The model itself is also available from the object, as an `attribute`: +```{r} +model <- attributes(predict_TZP)$model + +summary(model)$family + +summary(model)$coefficients +``` + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE vignettes/welcome_to_AMR.Rmd + + + +--- +title: "Welcome to the `AMR` package" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{Welcome to the `AMR` package} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 5 +) +``` + +Note: to keep the package size as small as possible, we only include this vignette on CRAN. You can read more vignettes on our website about how to conduct AMR data analysis, determine MDROs, find explanation of EUCAST and CLSI breakpoints, and much more: . + +---- + +The `AMR` package is a [free and open-source](https://msberends.github.io/AMR/#copyright) R package with [zero dependencies](https://en.wikipedia.org/wiki/Dependency_hell) to simplify the analysis and prediction of Antimicrobial Resistance (AMR) and to work with microbial and antimicrobial data and properties, by using evidence-based methods. **Our aim is to provide a standard** for clean and reproducible AMR data analysis, that can therefore empower epidemiological analyses to continuously enable surveillance and treatment evaluation in any setting. [Many different researchers](https://msberends.github.io/AMR/authors.html) from around the globe are continually helping us to make this a successful and durable project! + +This work was published in the Journal of Statistical Software (Volume 104(3); [DOI 10.18637/jss.v104.i03](https://doi.org/10.18637/jss.v104.i03)) and formed the basis of two PhD theses ([DOI 10.33612/diss.177417131](https://doi.org/10.33612/diss.177417131) and [DOI 10.33612/diss.192486375](https://doi.org/10.33612/diss.192486375)). + +After installing this package, R knows `r AMR:::format_included_data_number(AMR::microorganisms)` distinct microbial species and all `r AMR:::format_included_data_number(rbind(AMR::antibiotics[, "atc", drop = FALSE], AMR::antivirals[, "atc", drop = FALSE]))` antibiotic, antimycotic and antiviral drugs by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral breakpoint guidelines from CLSI and EUCAST are included from the last 10 years. It supports and can read any data format, including WHONET data. + +With the help of contributors from all corners of the world, the `AMR` package is available in English, Czech, Chinese, Danish, Dutch, Finnish, French, German, Greek, Italian, Japanese, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish, and Ukrainian. Antimicrobial drug (group) names and colloquial microorganism names are provided in these languages. + +This package is fully independent of any other R package and works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). **It was designed to work in any setting, including those with very limited resources**. Since its first public release in early 2018, this package has been downloaded from more than 175 countries. + +This package can be used for: + + * Reference for the taxonomy of microorganisms, since the package contains all microbial (sub)species from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF) + * Interpreting raw MIC and disk diffusion values, based on the latest CLSI or EUCAST guidelines + * Retrieving antimicrobial drug names, doses and forms of administration from clinical health care records + * Determining first isolates to be used for AMR data analysis + * Calculating antimicrobial resistance + * Determining multi-drug resistance (MDR) / multi-drug resistant organisms (MDRO) + * Calculating (empirical) susceptibility of both mono therapy and combination therapies + * Predicting future antimicrobial resistance using regression models + * Getting properties for any microorganism (like Gram stain, species, genus or family) + * Getting properties for any antibiotic (like name, code of EARS-Net/ATC/LOINC/PubChem, defined daily dose or trade name) + * Plotting antimicrobial resistance + * Applying EUCAST expert rules + * Getting SNOMED codes of a microorganism, or getting properties of a microorganism based on a SNOMED code + * Getting LOINC codes of an antibiotic, or getting properties of an antibiotic based on a LOINC code + * Machine reading the EUCAST and CLSI guidelines from 2011-2020 to translate MIC values and disk diffusion diameters to SIR + * Principal component analysis for AMR + +All reference data sets (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) in this `AMR` package are publicly and freely available. We continually export our data sets to formats for use in R, SPSS, Stata and Excel. We also supply flat files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please find [all download links on our website](https://msberends.github.io/AMR/articles/datasets.html), which is automatically updated with every code change. + +This R package was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the [University of Groningen](https://www.rug.nl), in collaboration with non-profit organisations [Certe Medical Diagnostics and Advice Foundation](https://www.certe.nl) and [University Medical Center Groningen](https://www.umcg.nl), and is being [actively and durably maintained](https://msberends.github.io/AMR/news/) by two public healthcare organisations in the Netherlands. + +---- + + +This AMR package for R is free, open-source software and licensed under the [GNU General Public License v2.0 (GPL-2)](https://msberends.github.io/AMR/LICENSE-text.html). These requirements are consequently legally binding: modifications must be released under the same license when distributing the package, changes made to the code must be documented, source code must be made available when the package is distributed, and a copy of the license and copyright notice must be included with the package. + + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE DESCRIPTION + + + +Package: AMR +Version: 2.1.1.9106 +Date: 2024-11-21 +Title: Antimicrobial Resistance Data Analysis +Description: Functions to simplify and standardise antimicrobial resistance (AMR) + data analysis and to work with microbial and antimicrobial properties by + using evidence-based methods, as described in . +Authors@R: c( + person(family = "Berends", c("Matthijs", "S."), role = c("aut", "cre"), comment = c(ORCID = "0000-0001-7620-1800"), email = "m.s.berends@umcg.nl"), + person(family = "Souverein", c("Dennis"), role = c("aut", "ctb"), comment = c(ORCID = "0000-0003-0455-0336")), + person(family = "Hassing", c("Erwin", "E.", "A."), role = c("aut", "ctb")), + person(family = "Albers", c("Casper", "J."), role = "ths", comment = c(ORCID = "0000-0002-9213-6743")), + person(family = "Bolton", c("Larisse"), role = "ctb", comment = c(ORCID = "0000-0001-7879-2173")), + person(family = "Dutey-Magni", c("Peter"), role = "ctb", comment = c(ORCID = "0000-0002-8942-9836")), + person(family = "Fonville", c("Judith", "M."), role = "ctb"), + person(family = "Friedrich", c("Alex", "W."), role = "ths", comment = c(ORCID = "0000-0003-4881-038X")), + person(family = "Glasner", c("Corinna"), role = "ths", comment = c(ORCID = "0000-0003-1241-1328")), + person(family = "Hazenberg", c("Eric", "H.", "L.", "C.", "M."), role = "ctb"), + person(family = "Knight", c("Gwen"), role = "ctb", comment = c(ORCID = "0000-0002-7263-9896")), + person(family = "Lenglet", c("Annick"), role = "ctb", comment = c(ORCID = "0000-0003-2013-8405")), + person(family = "Luz", c("Christian", "F."), role = c("ctb"), comment = c(ORCID = "0000-0001-5809-5995")), + person(family = "Meijer", c("Bart", "C."), role = "ctb"), + person(family = "Mykhailenko", c("Dmytro"), role = "ctb"), + person(family = "Mymrikov", c("Anton"), role = "ctb"), + person(family = "Norgan", c("Andrew", "P."), role = "ctb", comment = c(ORCID = "0000-0002-2955-2066")), + person(family = "Ny", c("Sofia"), role = "ctb", comment = c(ORCID = "0000-0002-2017-1363")), + person(family = "Saab", c("Matthew"), role = "ctb"), + person(family = "Salm", c("Jonas"), role = "ctb"), + person(family = "Sanchez", c("Javier"), role = "ctb", comment = c(ORCID = "0000-0003-2605-8094")), + person(family = "Schade", c("Rogier", "P."), role = "ctb"), + person(family = "Sinha", c("Bhanu", "N.", "M."), role = "ths", comment = c(ORCID = "0000-0003-1634-0010")), + person(family = "Stull", c("Jason"), role = "ctb", comment = c(ORCID = "0000-0002-9028-8153")), + person(family = "Underwood", c("Anthony"), role = "ctb", comment = c(ORCID = "0000-0002-8547-4277")), + person(family = "Williams", c("Anita"), role = "ctb", comment = c(ORCID = "0000-0002-5295-8451"))) +Depends: R (>= 3.0.0) +Suggests: + cleaner, + cli, + curl, + data.table, + dplyr, + ggplot2, + knitr, + progress, + readxl, + rmarkdown, + rvest, + skimr, + tibble, + tidyselect, + tinytest, + vctrs, + xml2 +VignetteBuilder: knitr,rmarkdown +URL: https://msberends.github.io/AMR/, https://github.com/msberends/AMR +BugReports: https://github.com/msberends/AMR/issues +License: GPL-2 | file LICENSE +Encoding: UTF-8 +LazyData: true +RoxygenNote: 7.3.2 +Roxygen: list(markdown = TRUE) + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE NAMESPACE + + + +# Generated by roxygen2: do not edit by hand + +S3method("!=",ab_selector) +S3method("&",ab_selector) +S3method("==",ab_selector) +S3method("[",ab) +S3method("[",av) +S3method("[",disk) +S3method("[",mic) +S3method("[",mo) +S3method("[<-",ab) +S3method("[<-",av) +S3method("[<-",disk) +S3method("[<-",mic) +S3method("[<-",mo) +S3method("[<-",sir) +S3method("[[",ab) +S3method("[[",av) +S3method("[[",disk) +S3method("[[",mic) +S3method("[[",mo) +S3method("[[<-",ab) +S3method("[[<-",av) +S3method("[[<-",disk) +S3method("[[<-",mic) +S3method("[[<-",mo) +S3method("[[<-",sir) +S3method("|",ab_selector) +S3method(Complex,mic) +S3method(Math,mic) +S3method(Ops,mic) +S3method(Summary,mic) +S3method(all,ab_selector) +S3method(all,ab_selector_any_all) +S3method(any,ab_selector) +S3method(any,ab_selector_any_all) +S3method(as.data.frame,ab) +S3method(as.data.frame,av) +S3method(as.data.frame,mic) +S3method(as.data.frame,mo) +S3method(as.double,mic) +S3method(as.double,sir) +S3method(as.list,custom_eucast_rules) +S3method(as.list,custom_mdro_guideline) +S3method(as.list,mic) +S3method(as.matrix,mic) +S3method(as.numeric,mic) +S3method(as.sir,data.frame) +S3method(as.sir,default) +S3method(as.sir,disk) +S3method(as.sir,mic) +S3method(as.vector,mic) +S3method(barplot,antibiogram) +S3method(barplot,disk) +S3method(barplot,mic) +S3method(barplot,sir) +S3method(c,ab) +S3method(c,ab_selector) +S3method(c,av) +S3method(c,custom_eucast_rules) +S3method(c,custom_mdro_guideline) +S3method(c,disk) +S3method(c,mic) +S3method(c,mo) +S3method(c,sir) +S3method(close,progress_bar) +S3method(droplevels,mic) +S3method(droplevels,sir) +S3method(format,bug_drug_combinations) +S3method(hist,mic) +S3method(kurtosis,data.frame) +S3method(kurtosis,default) +S3method(kurtosis,matrix) +S3method(mean,mic) +S3method(mean_amr_distance,data.frame) +S3method(mean_amr_distance,default) +S3method(mean_amr_distance,disk) +S3method(mean_amr_distance,mic) +S3method(mean_amr_distance,sir) +S3method(median,mic) +S3method(plot,antibiogram) +S3method(plot,disk) +S3method(plot,mic) +S3method(plot,resistance_predict) +S3method(plot,sir) +S3method(print,ab) +S3method(print,ab_selector) +S3method(print,av) +S3method(print,bug_drug_combinations) +S3method(print,custom_eucast_rules) +S3method(print,custom_mdro_guideline) +S3method(print,disk) +S3method(print,mic) +S3method(print,mo) +S3method(print,mo_renamed) +S3method(print,mo_uncertainties) +S3method(print,pca) +S3method(print,sir) +S3method(print,sir_log) +S3method(quantile,mic) +S3method(rep,ab) +S3method(rep,av) +S3method(rep,disk) +S3method(rep,mic) +S3method(rep,mo) +S3method(rep,sir) +S3method(skewness,data.frame) +S3method(skewness,default) +S3method(skewness,matrix) +S3method(sort,mic) +S3method(summary,mic) +S3method(summary,mo) +S3method(summary,pca) +S3method(summary,sir) +S3method(unique,ab) +S3method(unique,av) +S3method(unique,disk) +S3method(unique,mic) +S3method(unique,mo) +S3method(unique,sir) +export("%like%") +export("%like_case%") +export("%unlike%") +export("%unlike_case%") +export(NA_disk_) +export(NA_mic_) +export(NA_sir_) +export(ab_atc) +export(ab_atc_group1) +export(ab_atc_group2) +export(ab_cid) +export(ab_class) +export(ab_ddd) +export(ab_ddd_units) +export(ab_from_text) +export(ab_group) +export(ab_info) +export(ab_loinc) +export(ab_name) +export(ab_property) +export(ab_selector) +export(ab_synonyms) +export(ab_tradenames) +export(ab_url) +export(add_custom_antimicrobials) +export(add_custom_microorganisms) +export(administrable_iv) +export(administrable_per_os) +export(age) +export(age_groups) +export(all_antimicrobials) +export(aminoglycosides) +export(aminopenicillins) +export(amr_distance_from_row) +export(anti_join_microorganisms) +export(antibiogram) +export(antifungals) +export(antimicrobials_equal) +export(antimycobacterials) +export(as.ab) +export(as.av) +export(as.disk) +export(as.mic) +export(as.mo) +export(as.sir) +export(atc_online_ddd) +export(atc_online_ddd_units) +export(atc_online_groups) +export(atc_online_property) +export(av_atc) +export(av_cid) +export(av_ddd) +export(av_ddd_units) +export(av_from_text) +export(av_group) +export(av_info) +export(av_loinc) +export(av_name) +export(av_property) +export(av_synonyms) +export(av_tradenames) +export(av_url) +export(availability) +export(betalactams) +export(brmo) +export(bug_drug_combinations) +export(carbapenems) +export(cephalosporins) +export(cephalosporins_1st) +export(cephalosporins_2nd) +export(cephalosporins_3rd) +export(cephalosporins_4th) +export(cephalosporins_5th) +export(clear_custom_antimicrobials) +export(clear_custom_microorganisms) +export(count_I) +export(count_IR) +export(count_R) +export(count_S) +export(count_SI) +export(count_all) +export(count_df) +export(count_resistant) +export(count_susceptible) +export(custom_eucast_rules) +export(custom_mdro_guideline) +export(eucast_dosage) +export(eucast_exceptional_phenotypes) +export(eucast_rules) +export(facet_sir) +export(filter_first_isolate) +export(first_isolate) +export(fluoroquinolones) +export(full_join_microorganisms) +export(g.test) +export(geom_sir) +export(get_AMR_locale) +export(get_episode) +export(get_mo_source) +export(ggplot_pca) +export(ggplot_sir) +export(ggplot_sir_predict) +export(glycopeptides) +export(guess_ab_col) +export(inner_join_microorganisms) +export(is.ab) +export(is.av) +export(is.disk) +export(is.mic) +export(is.mo) +export(is.sir) +export(is_new_episode) +export(is_sir_eligible) +export(italicise_taxonomy) +export(italicize_taxonomy) +export(key_antimicrobials) +export(kurtosis) +export(labels_sir_count) +export(left_join_microorganisms) +export(like) +export(lincosamides) +export(lipoglycopeptides) +export(macrolides) +export(mdr_cmi2012) +export(mdr_tb) +export(mdro) +export(mean_amr_distance) +export(mo_authors) +export(mo_class) +export(mo_cleaning_regex) +export(mo_current) +export(mo_domain) +export(mo_failures) +export(mo_family) +export(mo_fullname) +export(mo_gbif) +export(mo_genus) +export(mo_gramstain) +export(mo_group_members) +export(mo_info) +export(mo_is_anaerobic) +export(mo_is_gram_negative) +export(mo_is_gram_positive) +export(mo_is_intrinsic_resistant) +export(mo_is_yeast) +export(mo_kingdom) +export(mo_lpsn) +export(mo_matching_score) +export(mo_mycobank) +export(mo_name) +export(mo_order) +export(mo_oxygen_tolerance) +export(mo_pathogenicity) +export(mo_phylum) +export(mo_property) +export(mo_rank) +export(mo_ref) +export(mo_renamed) +export(mo_reset_session) +export(mo_shortname) +export(mo_snomed) +export(mo_species) +export(mo_status) +export(mo_subspecies) +export(mo_synonyms) +export(mo_taxonomy) +export(mo_type) +export(mo_uncertainties) +export(mo_url) +export(mo_year) +export(mrgn) +export(n_sir) +export(nitrofurans) +export(not_intrinsic_resistant) +export(oxazolidinones) +export(pca) +export(penicillins) +export(polymyxins) +export(proportion_I) +export(proportion_IR) +export(proportion_R) +export(proportion_S) +export(proportion_SI) +export(proportion_df) +export(quinolones) +export(random_disk) +export(random_mic) +export(random_sir) +export(rescale_mic) +export(reset_AMR_locale) +export(resistance) +export(resistance_predict) +export(rifamycins) +export(right_join_microorganisms) +export(scale_colour_mic) +export(scale_fill_mic) +export(scale_sir_colours) +export(scale_x_mic) +export(scale_y_mic) +export(scale_y_percent) +export(semi_join_microorganisms) +export(set_AMR_locale) +export(set_ab_names) +export(set_mo_source) +export(sir_confidence_interval) +export(sir_df) +export(sir_interpretation_history) +export(sir_predict) +export(skewness) +export(streptogramins) +export(susceptibility) +export(tetracyclines) +export(theme_sir) +export(translate_AMR) +export(trimethoprims) +export(ureidopenicillins) +importFrom(graphics,arrows) +importFrom(graphics,axis) +importFrom(graphics,barplot) +importFrom(graphics,hist) +importFrom(graphics,legend) +importFrom(graphics,mtext) +importFrom(graphics,plot) +importFrom(graphics,points) +importFrom(graphics,text) +importFrom(stats,complete.cases) +importFrom(stats,glm) +importFrom(stats,lm) +importFrom(stats,median) +importFrom(stats,pchisq) +importFrom(stats,prcomp) +importFrom(stats,predict) +importFrom(stats,qchisq) +importFrom(stats,quantile) +importFrom(stats,var) + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE index.md + + + +# The `AMR` Package for R + +* Provides an **all-in-one solution** for antimicrobial resistance (AMR) data analysis in a One Health approach +* Used in over 175 countries, available in 20 languages +* Generates **antibiograms** - traditional, combined, syndromic, and even WISCA +* Provides the **full microbiological taxonomy** and extensive info on **all antimicrobial drugs** +* Applies all recent **CLSI** and **EUCAST** clinical and veterinary breakpoints for MICs, disk zones and ECOFFs +* Corrects for duplicate isolates, **calculates** and **predicts** AMR per antibiotic class +* Integrates with **WHONET**, ATC, **EARS-Net**, PubChem, **LOINC**, **SNOMED CT**, and **NCBI** +* 100% free of costs and dependencies, highly suitable for places with **limited resources** + +> Now available for Python too! [Click here](./articles/AMR_for_Python.html) to read more. + + + + + +---- + +### Introduction + +The `AMR` package is a [free and open-source](#copyright) R package with [zero dependencies](https://en.wikipedia.org/wiki/Dependency_hell) to simplify the analysis and prediction of Antimicrobial Resistance (AMR) and to work with microbial and antimicrobial data and properties, by using evidence-based methods. **Our aim is to provide a standard** for clean and reproducible AMR data analysis, that can therefore empower epidemiological analyses to continuously enable surveillance and treatment evaluation in any setting. [Many different researchers](./authors.html) from around the globe are continually helping us to make this a successful and durable project! + +This work was published in the Journal of Statistical Software (Volume 104(3); [DOI 10.18637/jss.v104.i03](https://doi.org/10.18637/jss.v104.i03)) and formed the basis of two PhD theses ([DOI 10.33612/diss.177417131](https://doi.org/10.33612/diss.177417131) and [DOI 10.33612/diss.192486375](https://doi.org/10.33612/diss.192486375)). + +After installing this package, R knows [**~52,000 distinct microbial species**](./reference/microorganisms.html) (updated December 2022) and all [**~600 antibiotic, antimycotic and antiviral drugs**](./reference/antibiotics.html) by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral clinical breakpoint guidelines from CLSI and EUCAST are included, even with epidemiological cut-off (ECOFF) values. It supports and can read any data format, including WHONET data. This package works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). **It was designed to work in any setting, including those with very limited resources**. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the [University of Groningen](https://www.rug.nl), in collaboration with non-profit organisations [Certe Medical Diagnostics and Advice Foundation](https://www.certe.nl) and [University Medical Center Groningen](https://www.umcg.nl). + +##### Used in over 175 countries, available in 20 languages + + + +Since its first public release in early 2018, this R package has been used in almost all countries in the world. Click the map to enlarge and to see the country names. + +With the help of contributors from all corners of the world, the `AMR` package is available in English, Czech, Chinese, Danish, Dutch, Finnish, French, German, Greek, Italian, Japanese, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish, and Ukrainian. Antimicrobial drug (group) names and colloquial microorganism names are provided in these languages. + +### Practical examples + +#### Filtering and selecting data + +One of the most powerful functions of this package, aside from calculating and plotting AMR, is selecting and filtering based on antibiotic columns. This can be done using the so-called [antibiotic class selectors](https://msberends.github.io/AMR/reference/antibiotic_class_selectors.html) that work in base R, `dplyr` and `data.table`: + +```r +# AMR works great with dplyr, but it's not required or neccesary +library(AMR) +library(dplyr) + +example_isolates %>% + mutate(bacteria = mo_fullname()) %>% + # filtering functions for microorganisms: + filter(mo_is_gram_negative(), + mo_is_intrinsic_resistant(ab = "cefotax")) %>% + # antibiotic selectors: + select(bacteria, + aminoglycosides(), + carbapenems()) +``` + +With only having defined a row filter on Gram-negative bacteria with intrinsic resistance to cefotaxime (`mo_is_gram_negative()` and `mo_is_intrinsic_resistant()`) and a column selection on two antibiotic groups (`aminoglycosides()` and `carbapenems()`), the reference data about [all microorganisms](./reference/microorganisms.html) and [all antibiotics](./reference/antibiotics.html) in the `AMR` package make sure you get what you meant: + +|bacteria | GEN | TOB | AMK | KAN | IPM | MEM | +|:------------------------------|:---:|:---:|:---:|:---:|:---:|:---:| +|*Pseudomonas aeruginosa* | I | S | | R | S | | +|*Pseudomonas aeruginosa* | I | S | | R | S | | +|*Pseudomonas aeruginosa* | I | S | | R | S | | +|*Pseudomonas aeruginosa* | S | S | S | R | | S | +|*Pseudomonas aeruginosa* | S | S | S | R | S | S | +|*Pseudomonas aeruginosa* | S | S | S | R | S | S | +|*Stenotrophomonas maltophilia* | R | R | R | R | R | R | +|*Pseudomonas aeruginosa* | S | S | S | R | | S | +|*Pseudomonas aeruginosa* | S | S | S | R | | S | +|*Pseudomonas aeruginosa* | S | S | S | R | S | S | + +A base R equivalent would be: + +```r +library(AMR) +example_isolates$bacteria <- mo_fullname(example_isolates$mo) +example_isolates[which(mo_is_gram_negative() & + mo_is_intrinsic_resistant(ab = "cefotax")), + c("bacteria", aminoglycosides(), carbapenems())] +``` + +This base R code will work in any version of R since April 2013 (R-3.0). Moreover, this code works identically with the `data.table` package, only by starting with: + +```r +example_isolates <- data.table::as.data.table(example_isolates) +``` + +#### Generating antibiograms + +The `AMR` package supports generating traditional, combined, syndromic, and even weighted-incidence syndromic combination antibiograms (WISCA). + +If used inside R Markdown or Quarto, the table will be printed in the right output format automatically (such as markdown, LaTeX, HTML, etc.). + +```r +antibiogram(example_isolates, + antibiotics = c(aminoglycosides(), carbapenems())) +``` + +| Pathogen | Amikacin | Gentamicin | Imipenem | Kanamycin | Meropenem | Tobramycin | +|:------------------------|:----------------:|:---------------:|:----------------:|:----------------:|:----------------:|:----------------:| +| CoNS | 0% (0/43) | 86% (267/309) | 52% (25/48) | 0% (0/43) | 52% (25/48) | 22% (12/55) | +| *E. coli* | 100% (171/171) | 98% (451/460) | 100% (422/422) | | 100% (418/418) | 97% (450/462) | +| *E. faecalis* | 0% (0/39) | 0% (0/39) | 100% (38/38) | 0% (0/39) | | 0% (0/39) | +| *K. pneumoniae* | | 90% (52/58) | 100% (51/51) | | 100% (53/53) | 90% (52/58) | +| *P. aeruginosa* | | 100% (30/30) | | 0% (0/30) | | 100% (30/30) | +| *P. mirabilis* | | 94% (32/34) | 94% (30/32) | | | 94% (32/34) | +| *S. aureus* | | 99% (231/233) | | | | 98% (84/86) | +| *S. epidermidis* | 0% (0/44) | 79% (128/163) | | 0% (0/44) | | 51% (45/89) | +| *S. hominis* | | 92% (74/80) | | | | 85% (53/62) | +| *S. pneumoniae* | 0% (0/117) | 0% (0/117) | | 0% (0/117) | | 0% (0/117) | + +In combination antibiograms, it is clear that combined antibiotics yield higher empiric coverage: + +```r +antibiogram(example_isolates, + antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"), + mo_transform = "gramstain") +``` + +| Pathogen | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin | +|:--------------|:-----------------------:|:-------------------------------------:|:------------------------------------:| +| Gram-negative | 88% (565/641) | 99% (681/691) | 98% (679/693) | +| Gram-positive | 86% (296/345) | 98% (1018/1044) | 95% (524/550) | + +Like many other functions in this package, `antibiogram()` comes with support for 20 languages that are often detected automatically based on system language: + +```r +antibiogram(example_isolates, + antibiotics = c("cipro", "tobra", "genta"), # any arbitrary name or code will work + mo_transform = "gramstain", + ab_transform = "name", + language = "uk") # Ukrainian +``` + +| Збудник | Гентаміцин | Тобраміцин | Ципрофлоксацин | +|:---------------|:--------------------:|:-------------------:|:------------------:| +| Грамнегативні | 96% (659/684) | 96% (658/686) | 91% (621/684) | +| Грампозитивні | 63% (740/1170) | 34% (228/665) | 77% (560/724) | + +#### Calculating resistance per group + +For a manual approach, you can use the `resistance` or `susceptibility()` function: + +```r +example_isolates %>% + # group by ward: + group_by(ward) %>% + # calculate AMR using resistance() for gentamicin and tobramycin + # and get their 95% confidence intervals using sir_confidence_interval(): + summarise(across(c(GEN, TOB), + list(total_R = resistance, + conf_int = function(x) sir_confidence_interval(x, collapse = "-")))) +``` + +|ward | GEN_total_R|GEN_conf_int | TOB_total_R|TOB_conf_int | +|:---------:|:----------:|:-----------:|:----------:|:-----------:| +|Clinical | 0.229 |0.205-0.254 | 0.315 |0.284-0.347 | +|ICU | 0.290 |0.253-0.330 | 0.400 |0.353-0.449 | +|Outpatient | 0.200 |0.131-0.285 | 0.368 |0.254-0.493 | + +Or use [antibiotic class selectors](https://msberends.github.io/AMR/reference/antibiotic_class_selectors.html) to select a series of antibiotic columns: + +```r +library(AMR) +library(dplyr) + +out <- example_isolates %>% + # group by ward: + group_by(ward) %>% + # calculate AMR using resistance(), over all aminoglycosides and polymyxins: + summarise(across(c(aminoglycosides(), polymyxins()), + resistance)) +out +``` + +| ward | GEN | TOB | AMK | KAN | COL | +|:-----------|------:|------:|------:|------:|------:| +| Clinical | 0.229 | 0.315 | 0.626 | 1 | 0.780 | +| ICU | 0.290 | 0.400 | 0.662 | 1 | 0.857 | +| Outpatient | 0.200 | 0.368 | 0.605 | | 0.889 | + +```r +# transform the antibiotic columns to names: +out %>% set_ab_names() +``` + +| ward | gentamicin | tobramycin | amikacin | kanamycin | colistin | +|:-----------|-----------:|-----------:|----------|----------:|----------:| +| Clinical | 0.229 | 0.315 | 0.626 | 1 | 0.780 | +| ICU | 0.290 | 0.400 | 0.662 | 1 | 0.857 | +| Outpatient | 0.200 | 0.368 | 0.605 | | 0.889 | + +```r +# transform the antibiotic column to ATC codes: +out %>% set_ab_names(property = "atc") +``` + +| ward | J01GB03 | J01GB01 | J01GB06 | J01GB04 | J01XB01 | +|:-----------|-----------:|-----------:|----------|----------:|----------:| +| Clinical | 0.229 | 0.315 | 0.626 | 1 | 0.780 | +| ICU | 0.290 | 0.400 | 0.662 | 1 | 0.857 | +| Outpatient | 0.200 | 0.368 | 0.605 | | 0.889 | + +### What else can you do with this package? + +This package was intended as a comprehensive toolbox for integrated AMR data analysis. This package can be used for: + + * Reference for the taxonomy of microorganisms, since the package contains all microbial (sub)species from the List of Prokaryotic names with Standing in Nomenclature ([LPSN]((https://lpsn.dsmz.de))) and the Global Biodiversity Information Facility ([GBIF](https://www.gbif.org)) ([manual](./reference/mo_property.html)) + * Interpreting raw MIC and disk diffusion values, based on any CLSI or EUCAST guideline ([manual](./reference/as.sir.html)) + * Retrieving antimicrobial drug names, doses and forms of administration from clinical health care records ([manual](./reference/ab_from_text.html)) + * Determining first isolates to be used for AMR data analysis ([manual](./reference/first_isolate.html)) + * Calculating antimicrobial resistance ([tutorial](./articles/AMR.html)) + * Determining multi-drug resistance (MDR) / multi-drug resistant organisms (MDRO) ([tutorial](./articles/MDR.html)) + * Calculating (empirical) susceptibility of both mono therapy and combination therapies ([tutorial](./articles/AMR.html)) + * Predicting future antimicrobial resistance using regression models ([tutorial](./articles/resistance_predict.html)) + * Getting properties for any microorganism (like Gram stain, species, genus or family) ([manual](./reference/mo_property.html)) + * Getting properties for any antibiotic (like name, code of EARS-Net/ATC/LOINC/PubChem, defined daily dose or trade name) ([manual](./reference/ab_property.html)) + * Plotting antimicrobial resistance ([tutorial](./articles/AMR.html)) + * Applying EUCAST expert rules ([manual](./reference/eucast_rules.html)) + * Getting SNOMED codes of a microorganism, or getting properties of a microorganism based on a SNOMED code ([manual](./reference/mo_property.html)) + * Getting LOINC codes of an antibiotic, or getting properties of an antibiotic based on a LOINC code ([manual](./reference/ab_property.html)) + * Machine reading the EUCAST and CLSI guidelines from 2011-2021 to translate MIC values and disk diffusion diameters to SIR ([link](./articles/datasets.html)) + * Principal component analysis for AMR ([tutorial](./articles/PCA.html)) + +### Get this package + +#### Latest official version + +[![CRAN](https://www.r-pkg.org/badges/version-ago/AMR)](https://cran.r-project.org/package=AMR) +[![CRANlogs](https://cranlogs.r-pkg.org/badges/grand-total/AMR)](https://cran.r-project.org/package=AMR) + +This package is available [here on the official R network (CRAN)](https://cran.r-project.org/package=AMR). Install this package in R from CRAN by using the command: + +```r +install.packages("AMR") +``` + +It will be downloaded and installed automatically. For RStudio, click on the menu *Tools* > *Install Packages...* and then type in "AMR" and press Install. + +**Note:** Not all functions on this website may be available in this latest release. To use all functions and data sets mentioned on this website, install the latest development version. + +#### Latest development version + +[![check-old](https://github.com/msberends/AMR/actions/workflows/check-old.yaml/badge.svg?branch=main)](https://github.com/msberends/AMR/actions/workflows/check-old.yaml?query=branch%3Amain) +[![check-recent](https://github.com/msberends/AMR/actions/workflows/check-recent.yaml/badge.svg?branch=main)](https://github.com/msberends/AMR/actions/workflows/check-recent.yaml?query=branch%3Amain) +[![CodeFactor](https://www.codefactor.io/repository/github/msberends/amr/badge)](https://www.codefactor.io/repository/github/msberends/amr) +[![Codecov](https://codecov.io/gh/msberends/AMR/branch/main/graph/badge.svg)](https://codecov.io/gh/msberends/AMR?branch=main) + +Please read our [Developer Guideline here](https://github.com/msberends/AMR/wiki/Developer-Guideline). + +The latest and unpublished development version can be installed from GitHub in two ways: + +1. Manually, using: + + ```r + install.packages("remotes") # if you haven't already + remotes::install_github("msberends/AMR") + ``` + +2. Automatically, using the [rOpenSci R-universe platform](https://ropensci.org/r-universe/), by adding [our R-universe address](https://msberends.r-universe.dev) to your list of repositories ('repos'): + + ```r + options(repos = c(getOption("repos"), + msberends = "https://msberends.r-universe.dev")) + ``` + + After this, you can install and update this `AMR` package like any official release (e.g., using `install.packages("AMR")` or in RStudio via *Tools* > *Check for Package Updates...*). + +### Get started + +To find out how to conduct AMR data analysis, please [continue reading here to get started](./articles/AMR.html) or click a link in the ['How to' menu](https://msberends.github.io/AMR/articles/). + +### Partners + +The development of this package is part of, related to, or made possible by the following non-profit organisations and initiatives: + +
+ + + + + +
+ +### Copyright + +This R package is free, open-source software and licensed under the [GNU General Public License v2.0 (GPL-2)](./LICENSE-text.html). In a nutshell, this means that this package: + +- May be used for commercial purposes + +- May be used for private purposes + +- May **not** be used for patent purposes + +- May be modified, although: + + - Modifications **must** be released under the same license when distributing the package + - Changes made to the code **must** be documented + +- May be distributed, although: + + - Source code **must** be made available when the package is distributed + - A copy of the license and copyright notice **must** be included with the package. + +- Comes with a LIMITATION of liability + +- Comes with NO warranty + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE tests/tinytest.R + + + + +# we use {tinytest} instead of {testthat} because it does not rely on recent R versions - we want to test on R >= 3.0. + +# Run them in RStudio using: +# rstudioapi::jobRunScript("tests/tinytest.R", name = "AMR Unit Tests", workingDir = getwd(), exportEnv = "tinytest_results") + +# test only on GitHub Actions and at using RStudio jobs - not on CRAN as tests are lengthy +if (tryCatch(isTRUE(AMR:::import_fn("isJob", "rstudioapi")()), error = function(e) FALSE) || + identical(Sys.getenv("R_RUN_TINYTEST"), "true")) { + # env var 'R_LIBS_USER' got overwritten during 'R CMD check' in GitHub Actions, so: + .libPaths(c(Sys.getenv("R_LIBS_USER_GH_ACTIONS"), .libPaths())) + if (AMR:::pkg_is_available("tinytest", also_load = TRUE)) { + library(AMR) + # set language + set_AMR_locale("English") + # set some functions if on old R + if (getRversion() < "3.2.0") { + anyNA <- AMR:::anyNA + dir.exists <- AMR:::dir.exists + file.size <- AMR:::file.size + file.mtime <- AMR:::file.mtime + isNamespaceLoaded <- AMR:::isNamespaceLoaded + lengths <- AMR:::lengths + } + if (getRversion() < "3.3.0") { + strrep <- AMR:::strrep + } + if (getRversion() < "3.5.0") { + isFALSE <- AMR:::isFALSE + } + if (getRversion() < "3.6.0") { + str2lang <- AMR:::str2lang + # trims() was introduced in 3.3.0, but its argument `whitespace` only in 3.6.0 + trimws <- AMR:::trimws + } + if (getRversion() < "4.0.0") { + deparse1 <- AMR:::deparse1 + } + + # start the unit tests + suppressMessages( + out <- test_package("AMR", + testdir = ifelse(dir.exists("inst/tinytest"), + "inst/tinytest", + "tinytest" + ), + verbose = FALSE, + color = FALSE + ) + ) + cat("\n\nSUMMARY:\n") + print(summary(out)) + } +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/AMR-options.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/aa_options.R +\name{AMR-options} +\alias{AMR-options} +\title{Options for the AMR package} +\description{ +This is an overview of all the package-specific \code{\link[=options]{options()}} you can set in the \code{AMR} package. +} +\section{Options}{ + +\itemize{ +\item \code{AMR_antibiogram_formatting_type} \cr A \link{numeric} (1-12) to use in \code{\link[=antibiogram]{antibiogram()}}, to indicate which formatting type to use. +\item \code{AMR_breakpoint_type} \cr A \link{character} to use in \code{\link[=as.sir]{as.sir()}}, to indicate which breakpoint type to use. This must be either "ECOFF", "animal", or "human". +\item \code{AMR_cleaning_regex} \cr A \link[base:regex]{regular expression} (case-insensitive) to use in \code{\link[=as.mo]{as.mo()}} and all \code{\link[=mo_property]{mo_*}} functions, to clean the user input. The default is the outcome of \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}, which removes texts between brackets and texts such as "species" and "serovar". +\item \code{AMR_custom_ab} \cr A file location to an RDS file, to use custom antimicrobial drugs with this package. This is explained in \code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}}. +\item \code{AMR_custom_mo} \cr A file location to an RDS file, to use custom microorganisms with this package. This is explained in \code{\link[=add_custom_microorganisms]{add_custom_microorganisms()}}. +\item \code{AMR_eucastrules} \cr A \link{character} to set the default types of rules for \code{\link[=eucast_rules]{eucast_rules()}} function, must be one or more of: \code{"breakpoints"}, \code{"expert"}, \code{"other"}, \code{"custom"}, \code{"all"}, and defaults to \code{c("breakpoints", "expert")}. +\item \code{AMR_guideline} \cr A \link{character} to set the default guideline for interpreting MIC values and disk diffusion diameters with \code{\link[=as.sir]{as.sir()}}. Can be only the guideline name (e.g., \code{"CLSI"}) or the name with a year (e.g. \code{"CLSI 2019"}). The default to the latest implemented EUCAST guideline, currently \code{"EUCAST 2024"}. Supported guideline are currently EUCAST (2011-2024) and CLSI (2011-2024). +\item \code{AMR_ignore_pattern} \cr A \link[base:regex]{regular expression} to ignore (i.e., make \code{NA}) any match given in \code{\link[=as.mo]{as.mo()}} and all \code{\link[=mo_property]{mo_*}} functions. +\item \code{AMR_include_PKPD} \cr A \link{logical} to use in \code{\link[=as.sir]{as.sir()}}, to indicate that PK/PD clinical breakpoints must be applied as a last resort - the default is \code{TRUE}. +\item \code{AMR_include_screening} \cr A \link{logical} to use in \code{\link[=as.sir]{as.sir()}}, to indicate that clinical breakpoints for screening are allowed - the default is \code{FALSE}. +\item \code{AMR_keep_synonyms} \cr A \link{logical} to use in \code{\link[=as.mo]{as.mo()}} and all \code{\link[=mo_property]{mo_*}} functions, to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is \code{FALSE}. +\item \code{AMR_locale} \cr A \link{character} to set the language for the \code{AMR} package, can be one of these supported language names or ISO-639-1 codes: English (en), Chinese (zh), Czech (cs), Danish (da), Dutch (nl), Finnish (fi), French (fr), German (de), Greek (el), Italian (it), Japanese (ja), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Spanish (es), Swedish (sv), Turkish (tr), or Ukrainian (uk). The default is the current system language (if supported, English otherwise). +\item \code{AMR_mo_source} \cr A file location for a manual code list to be used in \code{\link[=as.mo]{as.mo()}} and all \code{\link[=mo_property]{mo_*}} functions. This is explained in \code{\link[=set_mo_source]{set_mo_source()}}. +} +} + +\section{Saving Settings Between Sessions}{ + +Settings in \R are not saved globally and are thus lost when \R is exited. You can save your options to your own \code{.Rprofile} file, which is a user-specific file. You can edit it using: + +\if{html}{\out{
}}\preformatted{ utils::file.edit("~/.Rprofile") +}\if{html}{\out{
}} + +In this file, you can set options such as... + +\if{html}{\out{
}}\preformatted{ options(AMR_locale = "pt") + options(AMR_include_PKPD = TRUE) +}\if{html}{\out{
}} + +...to add Portuguese language support of antibiotics, and allow PK/PD rules when interpreting MIC values with \code{\link[=as.sir]{as.sir()}}. +\subsection{Share Options Within Team}{ + +For a more global approach, e.g. within a (data) team, save an options file to a remote file location, such as a shared network drive, and have each user read in this file automatically at start-up. This would work in this way: +\enumerate{ +\item Save a plain text file to e.g. "X:/team_folder/R_options.R" and fill it with preferred settings. +\item For each user, open the \code{.Rprofile} file using \code{utils::file.edit("~/.Rprofile")} and put in there: + +\if{html}{\out{
}}\preformatted{ source("X:/team_folder/R_options.R") +}\if{html}{\out{
}} +\item Reload R/RStudio and check the settings with \code{\link[=getOption]{getOption()}}, e.g. \code{getOption("AMR_locale")} if you have set that value. +} + +Now the team settings are configured in only one place, and can be maintained there. +} +} + +\keyword{internal} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/AMR.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/aa_amr-package.R +\docType{package} +\name{AMR} +\alias{AMR-package} +\alias{AMR} +\title{The \code{AMR} Package} +\source{ +To cite AMR in publications use: + +Berends MS, Luz CF, Friedrich AW, Sinha BNM, Albers CJ, Glasner C (2022). "AMR: An R Package for Working with Antimicrobial Resistance Data." \emph{Journal of Statistical Software}, \emph{104}(3), 1-31. \doi{10.18637/jss.v104.i03} + +A BibTeX entry for LaTeX users is: + +\preformatted{ +@Article{, + title = {{AMR}: An {R} Package for Working with Antimicrobial Resistance Data}, + author = {Matthijs S. Berends and Christian F. Luz and Alexander W. Friedrich and Bhanu N. M. Sinha and Casper J. Albers and Corinna Glasner}, + journal = {Journal of Statistical Software}, + year = {2022}, + volume = {104}, + number = {3}, + pages = {1--31}, + doi = {10.18637/jss.v104.i03}, +} +} +} +\description{ +Welcome to the \code{AMR} package. + +The \code{AMR} package is a \href{https://msberends.github.io/AMR/#copyright}{free and open-source} R package with \href{https://en.wikipedia.org/wiki/Dependency_hell}{zero dependencies} to simplify the analysis and prediction of Antimicrobial Resistance (AMR) and to work with microbial and antimicrobial data and properties, by using evidence-based methods. \strong{Our aim is to provide a standard} for clean and reproducible AMR data analysis, that can therefore empower epidemiological analyses to continuously enable surveillance and treatment evaluation in any setting. \href{https://msberends.github.io/AMR/authors.html}{Many different researchers} from around the globe are continually helping us to make this a successful and durable project! + +This work was published in the Journal of Statistical Software (Volume 104(3); \doi{10.18637/jss.v104.i03}) and formed the basis of two PhD theses (\doi{10.33612/diss.177417131} and \doi{10.33612/diss.192486375}). + +After installing this package, R knows \href{https://msberends.github.io/AMR/reference/microorganisms.html}{\strong{~79 000 microorganisms}} (updated June 2024) and all \href{https://msberends.github.io/AMR/reference/antibiotics.html}{\strong{~600 antibiotic, antimycotic and antiviral drugs}} by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral clinical breakpoint guidelines from CLSI and EUCAST are included, even with epidemiological cut-off (ECOFF) values. It supports and can read any data format, including WHONET data. This package works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). \strong{It was designed to work in any setting, including those with very limited resources}. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the public \href{https://www.rug.nl}{University of Groningen}, in collaboration with non-profit organisations \href{https://www.certe.nl}{Certe Medical Diagnostics and Advice Foundation} and \href{https://www.umcg.nl}{University Medical Center Groningen}. + +The \code{AMR} package is available in English, Chinese, Czech, Danish, Dutch, Finnish, French, German, Greek, Italian, Japanese, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Turkish, and Ukrainian. Antimicrobial drug (group) names and colloquial microorganism names are provided in these languages. +} +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\seealso{ +Useful links: +\itemize{ + \item \url{https://msberends.github.io/AMR/} + \item \url{https://github.com/msberends/AMR} + \item Report bugs at \url{https://github.com/msberends/AMR/issues} +} + +} +\author{ +\strong{Maintainer}: Matthijs S. Berends \email{m.s.berends@umcg.nl} (\href{https://orcid.org/0000-0001-7620-1800}{ORCID}) + +Authors: +\itemize{ + \item Dennis Souverein (\href{https://orcid.org/0000-0003-0455-0336}{ORCID}) [contributor] + \item Erwin E. A. Hassing [contributor] +} + +Other contributors: +\itemize{ + \item Casper J. Albers (\href{https://orcid.org/0000-0002-9213-6743}{ORCID}) [thesis advisor] + \item Larisse Bolton (\href{https://orcid.org/0000-0001-7879-2173}{ORCID}) [contributor] + \item Peter Dutey-Magni (\href{https://orcid.org/0000-0002-8942-9836}{ORCID}) [contributor] + \item Judith M. Fonville [contributor] + \item Alex W. Friedrich (\href{https://orcid.org/0000-0003-4881-038X}{ORCID}) [thesis advisor] + \item Corinna Glasner (\href{https://orcid.org/0000-0003-1241-1328}{ORCID}) [thesis advisor] + \item Eric H. L. C. M. Hazenberg [contributor] + \item Gwen Knight (\href{https://orcid.org/0000-0002-7263-9896}{ORCID}) [contributor] + \item Annick Lenglet (\href{https://orcid.org/0000-0003-2013-8405}{ORCID}) [contributor] + \item Christian F. Luz (\href{https://orcid.org/0000-0001-5809-5995}{ORCID}) [contributor] + \item Bart C. Meijer [contributor] + \item Dmytro Mykhailenko [contributor] + \item Anton Mymrikov [contributor] + \item Andrew P. Norgan (\href{https://orcid.org/0000-0002-2955-2066}{ORCID}) [contributor] + \item Sofia Ny (\href{https://orcid.org/0000-0002-2017-1363}{ORCID}) [contributor] + \item Matthew Saab [contributor] + \item Jonas Salm [contributor] + \item Javier Sanchez (\href{https://orcid.org/0000-0003-2605-8094}{ORCID}) [contributor] + \item Rogier P. Schade [contributor] + \item Bhanu N. M. Sinha (\href{https://orcid.org/0000-0003-1634-0010}{ORCID}) [thesis advisor] + \item Jason Stull (\href{https://orcid.org/0000-0002-9028-8153}{ORCID}) [contributor] + \item Anthony Underwood (\href{https://orcid.org/0000-0002-8547-4277}{ORCID}) [contributor] + \item Anita Williams (\href{https://orcid.org/0000-0002-5295-8451}{ORCID}) [contributor] +} + +} +\keyword{internal} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/WHOCC.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/whocc.R +\name{WHOCC} +\alias{WHOCC} +\title{WHOCC: WHO Collaborating Centre for Drug Statistics Methodology} +\description{ +All antimicrobial drugs and their official names, ATC codes, ATC groups and defined daily dose (DDD) are included in this package, using the WHO Collaborating Centre for Drug Statistics Methodology. +} +\section{WHOCC}{ + +This package contains \strong{all ~550 antibiotic, antimycotic and antiviral drugs} and their Anatomical Therapeutic Chemical (ATC) codes, ATC groups and Defined Daily Dose (DDD) from the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC, \url{https://atcddd.fhi.no}) and the Pharmaceuticals Community Register of the European Commission (\url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm}). + +These have become the gold standard for international drug utilisation monitoring and research. + +The WHOCC is located in Oslo at the Norwegian Institute of Public Health and funded by the Norwegian government. The European Commission is the executive of the European Union and promotes its general interest. + +\strong{NOTE: The WHOCC copyright does not allow use for commercial purposes, unlike any other info from this package.} See \url{https://atcddd.fhi.no/copyright_disclaimer/.} +} + +\examples{ +as.ab("meropenem") +ab_name("J01DH02") + +ab_tradenames("flucloxacillin") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/WHONET.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{WHONET} +\alias{WHONET} +\title{Data Set with 500 Isolates - WHONET Example} +\format{ +A \link[tibble:tibble]{tibble} with 500 observations and 53 variables: +\itemize{ +\item \verb{Identification number}\cr ID of the sample +\item \verb{Specimen number}\cr ID of the specimen +\item \code{Organism}\cr Name of the microorganism. Before analysis, you should transform this to a valid microbial class, using \code{\link[=as.mo]{as.mo()}}. +\item \code{Country}\cr Country of origin +\item \code{Laboratory}\cr Name of laboratory +\item \verb{Last name}\cr Fictitious last name of patient +\item \verb{First name}\cr Fictitious initial of patient +\item \code{Sex}\cr Fictitious gender of patient +\item \code{Age}\cr Fictitious age of patient +\item \verb{Age category}\cr Age group, can also be looked up using \code{\link[=age_groups]{age_groups()}} +\item \verb{Date of admission}\cr \link{Date} of hospital admission +\item \verb{Specimen date}\cr \link{Date} when specimen was received at laboratory +\item \verb{Specimen type}\cr Specimen type or group +\item \verb{Specimen type (Numeric)}\cr Translation of \code{"Specimen type"} +\item \code{Reason}\cr Reason of request with Differential Diagnosis +\item \verb{Isolate number}\cr ID of isolate +\item \verb{Organism type}\cr Type of microorganism, can also be looked up using \code{\link[=mo_type]{mo_type()}} +\item \code{Serotype}\cr Serotype of microorganism +\item \code{Beta-lactamase}\cr Microorganism produces beta-lactamase? +\item \code{ESBL}\cr Microorganism produces extended spectrum beta-lactamase? +\item \code{Carbapenemase}\cr Microorganism produces carbapenemase? +\item \verb{MRSA screening test}\cr Microorganism is possible MRSA? +\item \verb{Inducible clindamycin resistance}\cr Clindamycin can be induced? +\item \code{Comment}\cr Other comments +\item \verb{Date of data entry}\cr \link{Date} this data was entered in WHONET +\item \code{AMP_ND10:CIP_EE}\cr 28 different antibiotics. You can lookup the abbreviations in the \link{antibiotics} data set, or use e.g. \code{\link[=ab_name]{ab_name("AMP")}} to get the official name immediately. Before analysis, you should transform this to a valid antibiotic class, using \code{\link[=as.sir]{as.sir()}}. +} +} +\usage{ +WHONET +} +\description{ +This example data set has the exact same structure as an export file from WHONET. Such files can be used with this package, as this example data set shows. The antibiotic results are from our \link{example_isolates} data set. All patient names were created using online surname generators and are only in place for practice purposes. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +WHONET +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/ab_from_text.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ab_from_text.R +\name{ab_from_text} +\alias{ab_from_text} +\title{Retrieve Antimicrobial Drug Names and Doses from Clinical Text} +\usage{ +ab_from_text( + text, + type = c("drug", "dose", "administration"), + collapse = NULL, + translate_ab = FALSE, + thorough_search = NULL, + info = interactive(), + ... +) +} +\arguments{ +\item{text}{text to analyse} + +\item{type}{type of property to search for, either \code{"drug"}, \code{"dose"} or \code{"administration"}, see \emph{Examples}} + +\item{collapse}{a \link{character} to pass on to \code{paste(, collapse = ...)} to only return one \link{character} per element of \code{text}, see \emph{Examples}} + +\item{translate_ab}{if \code{type = "drug"}: a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}. The default is \code{FALSE}. Using \code{TRUE} is equal to using "name".} + +\item{thorough_search}{a \link{logical} to indicate whether the input must be extensively searched for misspelling and other faulty input values. Setting this to \code{TRUE} will take considerably more time than when using \code{FALSE}. At default, it will turn \code{TRUE} when all input elements contain a maximum of three words.} + +\item{info}{a \link{logical} to indicate whether a progress bar should be printed - the default is \code{TRUE} only in interactive mode} + +\item{...}{arguments passed on to \code{\link[=as.ab]{as.ab()}}} +} +\value{ +A \link{list}, or a \link{character} if \code{collapse} is not \code{NULL} +} +\description{ +Use this function on e.g. clinical texts from health care records. It returns a \link{list} with all antimicrobial drugs, doses and forms of administration found in the texts. +} +\details{ +This function is also internally used by \code{\link[=as.ab]{as.ab()}}, although it then only searches for the first drug name and will throw a note if more drug names could have been returned. Note: the \code{\link[=as.ab]{as.ab()}} function may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. +\subsection{Argument \code{type}}{ + +At default, the function will search for antimicrobial drug names. All text elements will be searched for official names, ATC codes and brand names. As it uses \code{\link[=as.ab]{as.ab()}} internally, it will correct for misspelling. + +With \code{type = "dose"} (or similar, like "dosing", "doses"), all text elements will be searched for \link{numeric} values that are higher than 100 and do not resemble years. The output will be \link{numeric}. It supports any unit (g, mg, IE, etc.) and multiple values in one clinical text, see \emph{Examples}. + +With \code{type = "administration"} (or abbreviations, like "admin", "adm"), all text elements will be searched for a form of drug administration. It supports the following forms (including common abbreviations): buccal, implant, inhalation, instillation, intravenous, nasal, oral, parenteral, rectal, sublingual, transdermal and vaginal. Abbreviations for oral (such as 'po', 'per os') will become "oral", all values for intravenous (such as 'iv', 'intraven') will become "iv". It supports multiple values in one clinical text, see \emph{Examples}. +} + +\subsection{Argument \code{collapse}}{ + +Without using \code{collapse}, this function will return a \link{list}. This can be convenient to use e.g. inside a \code{mutate()}):\cr +\code{df \%>\% mutate(abx = ab_from_text(clinical_text))} + +The returned AB codes can be transformed to official names, groups, etc. with all \code{\link[=ab_property]{ab_*}} functions such as \code{\link[=ab_name]{ab_name()}} and \code{\link[=ab_group]{ab_group()}}, or by using the \code{translate_ab} argument. + +With using \code{collapse}, this function will return a \link{character}:\cr +\code{df \%>\% mutate(abx = ab_from_text(clinical_text, collapse = "|"))} +} +} +\examples{ +# mind the bad spelling of amoxicillin in this line, +# straight from a true health care record: +ab_from_text("28/03/2020 regular amoxicilliin 500mg po tid") + +ab_from_text("500 mg amoxi po and 400mg cipro iv") +ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "dose") +ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "admin") + +ab_from_text("500 mg amoxi po and 400mg cipro iv", collapse = ", ") +\donttest{ +# if you want to know which antibiotic groups were administered, do e.g.: +abx <- ab_from_text("500 mg amoxi po and 400mg cipro iv") +ab_group(abx[[1]]) + +if (require("dplyr")) { + tibble(clinical_text = c( + "given 400mg cipro and 500 mg amox", + "started on doxy iv today" + )) \%>\% + mutate( + abx_codes = ab_from_text(clinical_text), + abx_doses = ab_from_text(clinical_text, type = "doses"), + abx_admin = ab_from_text(clinical_text, type = "admin"), + abx_coll = ab_from_text(clinical_text, collapse = "|"), + abx_coll_names = ab_from_text(clinical_text, + collapse = "|", + translate_ab = "name" + ), + abx_coll_doses = ab_from_text(clinical_text, + type = "doses", + collapse = "|" + ), + abx_coll_admin = ab_from_text(clinical_text, + type = "admin", + collapse = "|" + ) + ) +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/ab_property.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ab_property.R +\name{ab_property} +\alias{ab_property} +\alias{ab_name} +\alias{ab_cid} +\alias{ab_synonyms} +\alias{ab_tradenames} +\alias{ab_group} +\alias{ab_atc} +\alias{ATC} +\alias{ab_atc_group1} +\alias{ab_atc_group2} +\alias{ab_loinc} +\alias{ab_ddd} +\alias{ab_ddd_units} +\alias{ab_info} +\alias{ab_url} +\alias{set_ab_names} +\title{Get Properties of an Antibiotic} +\usage{ +ab_name(x, language = get_AMR_locale(), tolower = FALSE, ...) + +ab_cid(x, ...) + +ab_synonyms(x, ...) + +ab_tradenames(x, ...) + +ab_group(x, language = get_AMR_locale(), ...) + +ab_atc(x, only_first = FALSE, ...) + +ab_atc_group1(x, language = get_AMR_locale(), ...) + +ab_atc_group2(x, language = get_AMR_locale(), ...) + +ab_loinc(x, ...) + +ab_ddd(x, administration = "oral", ...) + +ab_ddd_units(x, administration = "oral", ...) + +ab_info(x, language = get_AMR_locale(), ...) + +ab_url(x, open = FALSE, ...) + +ab_property(x, property = "name", language = get_AMR_locale(), ...) + +set_ab_names( + data, + ..., + property = "name", + language = get_AMR_locale(), + snake_case = NULL +) +} +\arguments{ +\item{x}{any (vector of) text that can be coerced to a valid antibiotic drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{language}{language of the returned text - the default is the current system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{tolower}{a \link{logical} to indicate whether the first \link{character} of every output should be transformed to a lower case \link{character}. This will lead to e.g. "polymyxin B" and not "polymyxin b".} + +\item{...}{in case of \code{\link[=set_ab_names]{set_ab_names()}} and \code{data} is a \link{data.frame}: columns to select (supports tidy selection such as \code{column1:column4}), otherwise other arguments passed on to \code{\link[=as.ab]{as.ab()}}} + +\item{only_first}{a \link{logical} to indicate whether only the first ATC code must be returned, with giving preference to J0-codes (i.e., the antimicrobial drug group)} + +\item{administration}{way of administration, either \code{"oral"} or \code{"iv"}} + +\item{open}{browse the URL using \code{\link[utils:browseURL]{utils::browseURL()}}} + +\item{property}{one of the column names of one of the \link{antibiotics} data set: \code{vector_or(colnames(antibiotics), sort = FALSE)}.} + +\item{data}{a \link{data.frame} of which the columns need to be renamed, or a \link{character} vector of column names} + +\item{snake_case}{a \link{logical} to indicate whether the names should be in so-called \href{https://en.wikipedia.org/wiki/Snake_case}{snake case}: in lower case and all spaces/slashes replaced with an underscore (\verb{_})} +} +\value{ +\itemize{ +\item An \link{integer} in case of \code{\link[=ab_cid]{ab_cid()}} +\item A named \link{list} in case of \code{\link[=ab_info]{ab_info()}} and multiple \code{\link[=ab_atc]{ab_atc()}}/\code{\link[=ab_synonyms]{ab_synonyms()}}/\code{\link[=ab_tradenames]{ab_tradenames()}} +\item A \link{double} in case of \code{\link[=ab_ddd]{ab_ddd()}} +\item A \link{data.frame} in case of \code{\link[=set_ab_names]{set_ab_names()}} +\item A \link{character} in all other cases +} +} +\description{ +Use these functions to return a specific property of an antibiotic from the \link{antibiotics} data set. All input values will be evaluated internally with \code{\link[=as.ab]{as.ab()}}. +} +\details{ +All output \link[=translate]{will be translated} where possible. + +The function \code{\link[=ab_url]{ab_url()}} will return the direct URL to the official WHO website. A warning will be returned if the required ATC code is not available. + +The function \code{\link[=set_ab_names]{set_ab_names()}} is a special column renaming function for \link{data.frame}s. It renames columns names that resemble antimicrobial drugs. It always makes sure that the new column names are unique. If \code{property = "atc"} is set, preference is given to ATC codes from the J-group. +} +\section{Source}{ + +World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} + +European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# all properties: +ab_name("AMX") +ab_atc("AMX") +ab_cid("AMX") +ab_synonyms("AMX") +ab_tradenames("AMX") +ab_group("AMX") +ab_atc_group1("AMX") +ab_atc_group2("AMX") +ab_url("AMX") + +# smart lowercase transformation +ab_name(x = c("AMC", "PLB")) +ab_name(x = c("AMC", "PLB"), tolower = TRUE) + +# defined daily doses (DDD) +ab_ddd("AMX", "oral") +ab_ddd_units("AMX", "oral") +ab_ddd("AMX", "iv") +ab_ddd_units("AMX", "iv") + +ab_info("AMX") # all properties as a list + +# all ab_* functions use as.ab() internally, so you can go from 'any' to 'any': +ab_atc("AMP") +ab_group("J01CA01") +ab_loinc("ampicillin") +ab_name("21066-6") +ab_name(6249) +ab_name("J01CA01") + +# spelling from different languages and dyslexia are no problem +ab_atc("ceftriaxon") +ab_atc("cephtriaxone") +ab_atc("cephthriaxone") +ab_atc("seephthriaaksone") + +# use set_ab_names() for renaming columns +colnames(example_isolates) +colnames(set_ab_names(example_isolates)) +colnames(set_ab_names(example_isolates, NIT:VAN)) +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + set_ab_names() + + # this does the same: + example_isolates \%>\% + rename_with(set_ab_names) + + # set_ab_names() works with any AB property: + example_isolates \%>\% + set_ab_names(property = "atc") + + example_isolates \%>\% + set_ab_names(where(is.sir)) \%>\% + colnames() + + example_isolates \%>\% + set_ab_names(NIT:VAN) \%>\% + colnames() +} +} +} +\seealso{ +\link{antibiotics} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/add_custom_antimicrobials.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/custom_antimicrobials.R +\name{add_custom_antimicrobials} +\alias{add_custom_antimicrobials} +\alias{clear_custom_antimicrobials} +\title{Add Custom Antimicrobials} +\usage{ +add_custom_antimicrobials(x) + +clear_custom_antimicrobials() +} +\arguments{ +\item{x}{a \link{data.frame} resembling the \link{antibiotics} data set, at least containing columns "ab" and "name"} +} +\description{ +With \code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}} you can add your own custom antimicrobial drug names and codes. +} +\details{ +\strong{Important:} Due to how \R works, the \code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}} function has to be run in every \R session - added antimicrobials are not stored between sessions and are thus lost when \R is exited. + +There are two ways to circumvent this and automate the process of adding antimicrobials: + +\strong{Method 1:} Using the package option \code{\link[=AMR-options]{AMR_custom_ab}}, which is the preferred method. To use this method: +\enumerate{ +\item Create a data set in the structure of the \link{antibiotics} data set (containing at the very least columns "ab" and "name") and save it with \code{\link[=saveRDS]{saveRDS()}} to a location of choice, e.g. \code{"~/my_custom_ab.rds"}, or any remote location. +\item Set the file location to the package option \code{\link[=AMR-options]{AMR_custom_ab}}: \code{options(AMR_custom_ab = "~/my_custom_ab.rds")}. This can even be a remote file location, such as an https URL. Since options are not saved between \R sessions, it is best to save this option to the \code{.Rprofile} file so that it will be loaded on start-up of \R. To do this, open the \code{.Rprofile} file using e.g. \code{utils::file.edit("~/.Rprofile")}, add this text and save the file: + +\if{html}{\out{
}}\preformatted{# Add custom antimicrobial codes: +options(AMR_custom_ab = "~/my_custom_ab.rds") +}\if{html}{\out{
}} + +Upon package load, this file will be loaded and run through the \code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}} function. +} + +\strong{Method 2:} Loading the antimicrobial additions directly from your \code{.Rprofile} file. Note that the definitions will be stored in a user-specific \R file, which is a suboptimal workflow. To use this method: +\enumerate{ +\item Edit the \code{.Rprofile} file using e.g. \code{utils::file.edit("~/.Rprofile")}. +\item Add a text like below and save the file: + +\if{html}{\out{
}}\preformatted{ # Add custom antibiotic drug codes: + AMR::add_custom_antimicrobials( + data.frame(ab = "TESTAB", + name = "Test Antibiotic", + group = "Test Group") + ) +}\if{html}{\out{
}} +} + +Use \code{\link[=clear_custom_antimicrobials]{clear_custom_antimicrobials()}} to clear the previously added antimicrobials. +} +\examples{ +\donttest{ + +# returns NA and throws a warning (which is suppressed here): +suppressWarnings( + as.ab("testab") +) + +# now add a custom entry - it will be considered by as.ab() and +# all ab_*() functions +add_custom_antimicrobials( + data.frame( + ab = "TESTAB", + name = "Test Antibiotic", + # you can add any property present in the + # 'antibiotics' data set, such as 'group': + group = "Test Group" + ) +) + +# "testab" is now a new antibiotic: +as.ab("testab") +ab_name("testab") +ab_group("testab") + +ab_info("testab") + + +# Add Co-fluampicil, which is one of the many J01CR50 codes, see +# https://atcddd.fhi.no/ddd/list_of_ddds_combined_products/ +add_custom_antimicrobials( + data.frame( + ab = "COFLU", + name = "Co-fluampicil", + atc = "J01CR50", + group = "Beta-lactams/penicillins" + ) +) +ab_atc("Co-fluampicil") +ab_name("J01CR50") + +# even antibiotic selectors work +x <- data.frame( + random_column = "some value", + coflu = as.sir("S"), + ampicillin = as.sir("R") +) +x +x[, betalactams()] +} +} +\seealso{ +\code{\link[=add_custom_microorganisms]{add_custom_microorganisms()}} to add custom microorganisms. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/add_custom_microorganisms.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/custom_microorganisms.R +\name{add_custom_microorganisms} +\alias{add_custom_microorganisms} +\alias{clear_custom_microorganisms} +\title{Add Custom Microorganisms} +\usage{ +add_custom_microorganisms(x) + +clear_custom_microorganisms() +} +\arguments{ +\item{x}{a \link{data.frame} resembling the \link{microorganisms} data set, at least containing column "genus" (case-insensitive)} +} +\description{ +With \code{\link[=add_custom_microorganisms]{add_custom_microorganisms()}} you can add your own custom microorganisms, such the non-taxonomic outcome of laboratory analysis. +} +\details{ +This function will fill in missing taxonomy for you, if specific taxonomic columns are missing, see \emph{Examples}. + +\strong{Important:} Due to how \R works, the \code{\link[=add_custom_microorganisms]{add_custom_microorganisms()}} function has to be run in every \R session - added microorganisms are not stored between sessions and are thus lost when \R is exited. + +There are two ways to circumvent this and automate the process of adding microorganisms: + +\strong{Method 1:} Using the package option \code{\link[=AMR-options]{AMR_custom_mo}}, which is the preferred method. To use this method: +\enumerate{ +\item Create a data set in the structure of the \link{microorganisms} data set (containing at the very least column "genus") and save it with \code{\link[=saveRDS]{saveRDS()}} to a location of choice, e.g. \code{"~/my_custom_mo.rds"}, or any remote location. +\item Set the file location to the package option \code{\link[=AMR-options]{AMR_custom_mo}}: \code{options(AMR_custom_mo = "~/my_custom_mo.rds")}. This can even be a remote file location, such as an https URL. Since options are not saved between \R sessions, it is best to save this option to the \code{.Rprofile} file so that it will be loaded on start-up of \R. To do this, open the \code{.Rprofile} file using e.g. \code{utils::file.edit("~/.Rprofile")}, add this text and save the file: + +\if{html}{\out{
}}\preformatted{# Add custom microorganism codes: +options(AMR_custom_mo = "~/my_custom_mo.rds") +}\if{html}{\out{
}} + +Upon package load, this file will be loaded and run through the \code{\link[=add_custom_microorganisms]{add_custom_microorganisms()}} function. +} + +\strong{Method 2:} Loading the microorganism directly from your \code{.Rprofile} file. Note that the definitions will be stored in a user-specific \R file, which is a suboptimal workflow. To use this method: +\enumerate{ +\item Edit the \code{.Rprofile} file using e.g. \code{utils::file.edit("~/.Rprofile")}. +\item Add a text like below and save the file: + +\if{html}{\out{
}}\preformatted{ # Add custom antibiotic drug codes: + AMR::add_custom_microorganisms( + data.frame(genus = "Enterobacter", + species = "asburiae/cloacae") + ) +}\if{html}{\out{
}} +} + +Use \code{\link[=clear_custom_microorganisms]{clear_custom_microorganisms()}} to clear the previously added microorganisms. +} +\examples{ +\donttest{ +# a combination of species is not formal taxonomy, so +# this will result in "Enterobacter cloacae cloacae", +# since it resembles the input best: +mo_name("Enterobacter asburiae/cloacae") + +# now add a custom entry - it will be considered by as.mo() and +# all mo_*() functions +add_custom_microorganisms( + data.frame( + genus = "Enterobacter", + species = "asburiae/cloacae" + ) +) + +# E. asburiae/cloacae is now a new microorganism: +mo_name("Enterobacter asburiae/cloacae") + +# its code: +as.mo("Enterobacter asburiae/cloacae") + +# all internal algorithms will work as well: +mo_name("Ent asburia cloacae") + +# and even the taxonomy was added based on the genus! +mo_family("E. asburiae/cloacae") +mo_gramstain("Enterobacter asburiae/cloacae") + +mo_info("Enterobacter asburiae/cloacae") + + +# the function tries to be forgiving: +add_custom_microorganisms( + data.frame( + GENUS = "BACTEROIDES / PARABACTEROIDES SLASHLINE", + SPECIES = "SPECIES" + ) +) +mo_name("BACTEROIDES / PARABACTEROIDES") +mo_rank("BACTEROIDES / PARABACTEROIDES") + +# taxonomy still works, even though a slashline genus was given as input: +mo_family("Bacteroides/Parabacteroides") + + +# for groups and complexes, set them as species or subspecies: +add_custom_microorganisms( + data.frame( + genus = "Citrobacter", + species = c("freundii", "braakii complex"), + subspecies = c("complex", "") + ) +) +mo_name(c("C. freundii complex", "C. braakii complex")) +mo_species(c("C. freundii complex", "C. braakii complex")) +mo_gramstain(c("C. freundii complex", "C. braakii complex")) +} +} +\seealso{ +\code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}} to add custom antimicrobials. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/age.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/age.R +\name{age} +\alias{age} +\title{Age in Years of Individuals} +\usage{ +age(x, reference = Sys.Date(), exact = FALSE, na.rm = FALSE, ...) +} +\arguments{ +\item{x}{date(s), \link{character} (vectors) will be coerced with \code{\link[=as.POSIXlt]{as.POSIXlt()}}} + +\item{reference}{reference date(s) (default is today), \link{character} (vectors) will be coerced with \code{\link[=as.POSIXlt]{as.POSIXlt()}}} + +\item{exact}{a \link{logical} to indicate whether age calculation should be exact, i.e. with decimals. It divides the number of days of \href{https://en.wikipedia.org/wiki/Year-to-date}{year-to-date} (YTD) of \code{x} by the number of days in the year of \code{reference} (either 365 or 366).} + +\item{na.rm}{a \link{logical} to indicate whether missing values should be removed} + +\item{...}{arguments passed on to \code{\link[=as.POSIXlt]{as.POSIXlt()}}, such as \code{origin}} +} +\value{ +An \link{integer} (no decimals) if \code{exact = FALSE}, a \link{double} (with decimals) otherwise +} +\description{ +Calculates age in years based on a reference date, which is the system date at default. +} +\details{ +Ages below 0 will be returned as \code{NA} with a warning. Ages above 120 will only give a warning. + +This function vectorises over both \code{x} and \code{reference}, meaning that either can have a length of 1 while the other argument has a larger length. +} +\examples{ +# 10 random pre-Y2K birth dates +df <- data.frame(birth_date = as.Date("2000-01-01") - runif(10) * 25000) + +# add ages +df$age <- age(df$birth_date) + +# add exact ages +df$age_exact <- age(df$birth_date, exact = TRUE) + +# add age at millenium switch +df$age_at_y2k <- age(df$birth_date, "2000-01-01") + +df +} +\seealso{ +To split ages into groups, use the \code{\link[=age_groups]{age_groups()}} function. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/age_groups.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/age.R +\name{age_groups} +\alias{age_groups} +\title{Split Ages into Age Groups} +\usage{ +age_groups(x, split_at = c(12, 25, 55, 75), na.rm = FALSE) +} +\arguments{ +\item{x}{age, e.g. calculated with \code{\link[=age]{age()}}} + +\item{split_at}{values to split \code{x} at - the default is age groups 0-11, 12-24, 25-54, 55-74 and 75+. See \emph{Details}.} + +\item{na.rm}{a \link{logical} to indicate whether missing values should be removed} +} +\value{ +Ordered \link{factor} +} +\description{ +Split ages into age groups defined by the \code{split} argument. This allows for easier demographic (antimicrobial resistance) analysis. +} +\details{ +To split ages, the input for the \code{split_at} argument can be: +\itemize{ +\item A \link{numeric} vector. A value of e.g. \code{c(10, 20)} will split \code{x} on 0-9, 10-19 and 20+. A value of only \code{50} will split \code{x} on 0-49 and 50+. +The default is to split on young children (0-11), youth (12-24), young adults (25-54), middle-aged adults (55-74) and elderly (75+). +\item A character: +\itemize{ +\item \code{"children"} or \code{"kids"}, equivalent of: \code{c(0, 1, 2, 4, 6, 13, 18)}. This will split on 0, 1, 2-3, 4-5, 6-12, 13-17 and 18+. +\item \code{"elderly"} or \code{"seniors"}, equivalent of: \code{c(65, 75, 85)}. This will split on 0-64, 65-74, 75-84, 85+. +\item \code{"fives"}, equivalent of: \code{1:20 * 5}. This will split on 0-4, 5-9, ..., 95-99, 100+. +\item \code{"tens"}, equivalent of: \code{1:10 * 10}. This will split on 0-9, 10-19, ..., 90-99, 100+. +} +} +} +\examples{ +ages <- c(3, 8, 16, 54, 31, 76, 101, 43, 21) + +# split into 0-49 and 50+ +age_groups(ages, 50) + +# split into 0-19, 20-49 and 50+ +age_groups(ages, c(20, 50)) + +# split into groups of ten years +age_groups(ages, 1:10 * 10) +age_groups(ages, split_at = "tens") + +# split into groups of five years +age_groups(ages, 1:20 * 5) +age_groups(ages, split_at = "fives") + +# split specifically for children +age_groups(ages, c(1, 2, 4, 6, 13, 18)) +age_groups(ages, "children") + +\donttest{ +# resistance of ciprofloxacin per age group +if (require("dplyr") && require("ggplot2")) { + example_isolates \%>\% + filter_first_isolate() \%>\% + filter(mo == as.mo("Escherichia coli")) \%>\% + group_by(age_group = age_groups(age)) \%>\% + select(age_group, CIP) \%>\% + ggplot_sir( + x = "age_group", + minimum = 0, + x.title = "Age Group", + title = "Ciprofloxacin resistance per age group" + ) +} +} +} +\seealso{ +To determine ages, based on one or more reference dates, use the \code{\link[=age]{age()}} function. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/antibiogram.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/antibiogram.R +\name{antibiogram} +\alias{antibiogram} +\alias{plot.antibiogram} +\alias{autoplot.antibiogram} +\alias{knit_print.antibiogram} +\title{Generate Traditional, Combination, Syndromic, or WISCA Antibiograms} +\source{ +\itemize{ +\item Bielicki JA \emph{et al.} (2016). \strong{Selecting appropriate empirical antibiotic regimens for paediatric bloodstream infections: application of a Bayesian decision model to local and pooled antimicrobial resistance surveillance data} \emph{Journal of Antimicrobial Chemotherapy} 71(3); \doi{10.1093/jac/dkv397} +\item Klinker KP \emph{et al.} (2021). \strong{Antimicrobial stewardship and antibiograms: importance of moving beyond traditional antibiograms}. \emph{Therapeutic Advances in Infectious Disease}, May 5;8:20499361211011373; \doi{10.1177/20499361211011373} +\item Barbieri E \emph{et al.} (2021). \strong{Development of a Weighted-Incidence Syndromic Combination Antibiogram (WISCA) to guide the choice of the empiric antibiotic treatment for urinary tract infection in paediatric patients: a Bayesian approach} \emph{Antimicrobial Resistance & Infection Control} May 1;10(1):74; \doi{10.1186/s13756-021-00939-2} +\item \strong{M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition}, 2022, \emph{Clinical and Laboratory Standards Institute (CLSI)}. \url{https://clsi.org/standards/products/microbiology/documents/m39/}. +} +} +\usage{ +antibiogram( + x, + antibiotics = where(is.sir), + mo_transform = "shortname", + ab_transform = "name", + syndromic_group = NULL, + add_total_n = FALSE, + only_all_tested = FALSE, + digits = 0, + formatting_type = getOption("AMR_antibiogram_formatting_type", 10), + col_mo = NULL, + language = get_AMR_locale(), + minimum = 30, + combine_SI = TRUE, + sep = " + ", + info = interactive() +) + +\method{plot}{antibiogram}(x, ...) + +\method{autoplot}{antibiogram}(object, ...) + +\method{knit_print}{antibiogram}( + x, + italicise = TRUE, + na = getOption("knitr.kable.NA", default = ""), + ... +) +} +\arguments{ +\item{x}{a \link{data.frame} containing at least a column with microorganisms and columns with antibiotic results (class 'sir', see \code{\link[=as.sir]{as.sir()}})} + +\item{antibiotics}{vector of any antibiotic name or code (will be evaluated with \code{\link[=as.ab]{as.ab()}}, column name of \code{x}, or (any combinations of) \link[=antibiotic_class_selectors]{antibiotic selectors} such as \code{\link[=aminoglycosides]{aminoglycosides()}} or \code{\link[=carbapenems]{carbapenems()}}. For combination antibiograms, this can also be set to values separated with \code{"+"}, such as "TZP+TOB" or "cipro + genta", given that columns resembling such antibiotics exist in \code{x}. See \emph{Examples}.} + +\item{mo_transform}{a character to transform microorganism input - must be \code{"name"}, \code{"shortname"} (default), \code{"gramstain"}, or one of the column names of the \link{microorganisms} data set: "mo", "fullname", "status", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "oxygen_tolerance", "source", "lpsn", "lpsn_parent", "lpsn_renamed_to", "mycobank", "mycobank_parent", "mycobank_renamed_to", "gbif", "gbif_parent", "gbif_renamed_to", "prevalence", or "snomed". Can also be \code{NULL} to not transform the input.} + +\item{ab_transform}{a character to transform antibiotic input - must be one of the column names of the \link{antibiotics} data set (defaults to \code{"name"}): "ab", "cid", "name", "group", "atc", "atc_group1", "atc_group2", "abbreviations", "synonyms", "oral_ddd", "oral_units", "iv_ddd", "iv_units", or "loinc". Can also be \code{NULL} to not transform the input.} + +\item{syndromic_group}{a column name of \code{x}, or values calculated to split rows of \code{x}, e.g. by using \code{\link[=ifelse]{ifelse()}} or \code{\link[dplyr:case_when]{case_when()}}. See \emph{Examples}.} + +\item{add_total_n}{a \link{logical} to indicate whether total available numbers per pathogen should be added to the table (default is \code{TRUE}). This will add the lowest and highest number of available isolate per antibiotic (e.g, if for \emph{E. coli} 200 isolates are available for ciprofloxacin and 150 for amoxicillin, the returned number will be "150-200").} + +\item{only_all_tested}{(for combination antibiograms): a \link{logical} to indicate that isolates must be tested for all antibiotics, see \emph{Details}} + +\item{digits}{number of digits to use for rounding the susceptibility percentage} + +\item{formatting_type}{numeric value (1–12) indicating how the 'cells' of the antibiogram table should be formatted. See \emph{Details} > \emph{Formatting Type} for a list of options.} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{language}{language to translate text, which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})} + +\item{minimum}{the minimum allowed number of available (tested) isolates. Any isolate count lower than \code{minimum} will return \code{NA} with a warning. The default number of \code{30} isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see \emph{Source}.} + +\item{combine_SI}{a \link{logical} to indicate whether all susceptibility should be determined by results of either S, SDD, or I, instead of only S (default is \code{TRUE})} + +\item{sep}{a separating character for antibiotic columns in combination antibiograms} + +\item{info}{a \link{logical} to indicate info should be printed - the default is \code{TRUE} only in interactive mode} + +\item{...}{when used in \link[knitr:kable]{R Markdown or Quarto}: arguments passed on to \code{\link[knitr:kable]{knitr::kable()}} (otherwise, has no use)} + +\item{object}{an \code{\link[=antibiogram]{antibiogram()}} object} + +\item{italicise}{a \link{logical} to indicate whether the microorganism names in the \link[knitr:kable]{knitr} table should be made italic, using \code{\link[=italicise_taxonomy]{italicise_taxonomy()}}.} + +\item{na}{character to use for showing \code{NA} values} +} +\description{ +Create detailed antibiograms with options for traditional, combination, syndromic, and Bayesian WISCA methods. Based on the approaches of Klinker \emph{et al.}, Barbieri \emph{et al.}, and the Bayesian WISCA model (Weighted-Incidence Syndromic Combination Antibiogram) by Bielicki \emph{et al.}, this function provides flexible output formats including plots and tables, ideal for integration with R Markdown and Quarto reports. +} +\details{ +This function returns a table with values between 0 and 100 for \emph{susceptibility}, not resistance. + +\strong{Remember that you should filter your data to let it contain only first isolates!} This is needed to exclude duplicates and to reduce selection bias. Use \code{\link[=first_isolate]{first_isolate()}} to determine them in your data set with one of the four available algorithms. +\subsection{Formatting Type}{ + +The formatting of the 'cells' of the table can be set with the argument \code{formatting_type}. In these examples, \code{5} is the susceptibility percentage, \code{15} the numerator, and \code{300} the denominator: +\enumerate{ +\item 5 +\item 15 +\item 300 +\item 15/300 +\item 5 (300) +\item 5\% (300) +\item 5 (N=300) +\item 5\% (N=300) +\item 5 (15/300) +\item 5\% (15/300) +\item 5 (N=15/300) +\item 5\% (N=15/300) +} + +The default is \code{10}, which can be set globally with the package option \code{\link[=AMR-options]{AMR_antibiogram_formatting_type}}, e.g. \code{options(AMR_antibiogram_formatting_type = 5)}. + +Set \code{digits} (defaults to \code{0}) to alter the rounding of the susceptibility percentage. +} + +\subsection{Antibiogram Types}{ + +There are four antibiogram types, as summarised by Klinker \emph{et al.} (2021, \doi{10.1177/20499361211011373}), and they are all supported by \code{\link[=antibiogram]{antibiogram()}}. Use WISCA whenever possible, since it provides precise coverage estimates by accounting for pathogen incidence and antimicrobial susceptibility. See the section \emph{Why Use WISCA?} on this page. + +The four antibiogram types: +\enumerate{ +\item \strong{Traditional Antibiogram} + +Case example: Susceptibility of \emph{Pseudomonas aeruginosa} to piperacillin/tazobactam (TZP) + +Code example: + +\if{html}{\out{
}}\preformatted{antibiogram(your_data, + antibiotics = "TZP") +}\if{html}{\out{
}} +\item \strong{Combination Antibiogram} + +Case example: Additional susceptibility of \emph{Pseudomonas aeruginosa} to TZP + tobramycin versus TZP alone + +Code example: + +\if{html}{\out{
}}\preformatted{antibiogram(your_data, + antibiotics = c("TZP", "TZP+TOB", "TZP+GEN")) +}\if{html}{\out{
}} +\item \strong{Syndromic Antibiogram} + +Case example: Susceptibility of \emph{Pseudomonas aeruginosa} to TZP among respiratory specimens (obtained among ICU patients only) + +Code example: + +\if{html}{\out{
}}\preformatted{antibiogram(your_data, + antibiotics = penicillins(), + syndromic_group = "ward") +}\if{html}{\out{
}} +\item \strong{Weighted-Incidence Syndromic Combination Antibiogram (WISCA)} + +WISCA enhances empirical antibiotic selection by weighting the incidence of pathogens in specific clinical syndromes and combining them with their susceptibility data. It provides an estimation of regimen coverage by aggregating pathogen incidences and susceptibilities across potential causative organisms. See also the section \emph{Why Use WISCA?} on this page. + +Case example: Susceptibility of \emph{Pseudomonas aeruginosa} to TZP among respiratory specimens (obtained among ICU patients only) for male patients age >=65 years with heart failure + +Code example: + +\if{html}{\out{
}}\preformatted{library(dplyr) +your_data \%>\% + filter(ward == "ICU" & specimen_type == "Respiratory") \%>\% + antibiogram(antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"), + syndromic_group = ifelse(.$age >= 65 & + .$gender == "Male" & + .$condition == "Heart Disease", + "Study Group", "Control Group")) +}\if{html}{\out{
}} + +WISCA uses a sophisticated Bayesian decision model to combine both local and pooled antimicrobial resistance data. This approach not only evaluates local patterns but can also draw on multi-centre datasets to improve regimen accuracy, even in low-incidence infections like paediatric bloodstream infections (BSIs). +} +} + +\subsection{Inclusion in Combination Antibiogram and Syndromic Antibiogram}{ + +Note that for types 2 and 3 (Combination Antibiogram and Syndromic Antibiogram), it is important to realise that susceptibility can be calculated in two ways, which can be set with the \code{only_all_tested} argument (default is \code{FALSE}). See this example for two antibiotics, Drug A and Drug B, about how \code{\link[=antibiogram]{antibiogram()}} works to calculate the \%SI: + +\if{html}{\out{
}}\preformatted{-------------------------------------------------------------------- + only_all_tested = FALSE only_all_tested = TRUE + ----------------------- ----------------------- + Drug A Drug B include as include as include as include as + numerator denominator numerator denominator +-------- -------- ---------- ----------- ---------- ----------- + S or I S or I X X X X + R S or I X X X X + S or I X X - - + S or I R X X X X + R R - X - X + R - - - - + S or I X X - - + R - - - - + - - - - +-------------------------------------------------------------------- +}\if{html}{\out{
}} +} + +\subsection{Plotting}{ + +All types of antibiograms as listed above can be plotted (using \code{\link[ggplot2:autoplot]{ggplot2::autoplot()}} or base \R's \code{\link[=plot]{plot()}} and \code{\link[=barplot]{barplot()}}). + +THe outcome of \code{\link[=antibiogram]{antibiogram()}} can also be used directly in R Markdown / Quarto (i.e., \code{knitr}) for reports. In this case, \code{\link[knitr:kable]{knitr::kable()}} will be applied automatically and microorganism names will even be printed in italics at default (see argument \code{italicise}). + +You can also use functions from specific 'table reporting' packages to transform the output of \code{\link[=antibiogram]{antibiogram()}} to your needs, e.g. with \code{flextable::as_flextable()} or \code{gt::gt()}. +} +} +\section{Why Use WISCA?}{ + +WISCA is a powerful tool for guiding empirical antibiotic therapy because it provides precise coverage estimates by accounting for pathogen incidence and antimicrobial susceptibility. This is particularly important in empirical treatment, where the causative pathogen is often unknown at the outset. Traditional antibiograms do not reflect the weighted likelihood of specific pathogens based on clinical syndromes, which can lead to suboptimal treatment choices. + +The Bayesian WISCA, as described by Bielicki \emph{et al.} (2016), improves on earlier methods by handling uncertainties common in smaller datasets, such as low-incidence infections. This method offers a significant advantage by: +\enumerate{ +\item Pooling Data from Multiple Sources:\cr WISCA uses pooled data from multiple hospitals or surveillance sources to overcome limitations of small sample sizes at individual institutions, allowing for more confident selection of narrow-spectrum antibiotics or combinations. +\item Bayesian Framework:\cr The Bayesian decision tree model accounts for both local data and prior knowledge (such as inherent resistance patterns) to estimate regimen coverage. It allows for a more precise estimation of coverage, even in cases where susceptibility data is missing or incomplete. +\item Incorporating Pathogen and Regimen Uncertainty:\cr WISCA allows clinicians to see the likelihood that an empirical regimen will be effective against all relevant pathogens, taking into account uncertainties related to both pathogen prevalence and antimicrobial resistance. This leads to better-informed, data-driven clinical decisions. +\item Scenarios for Optimising Treatment:\cr For hospitals or settings with low-incidence infections, WISCA helps determine whether local data is sufficient or if pooling with external data is necessary. It also identifies statistically significant differences or similarities between antibiotic regimens, enabling clinicians to choose optimal therapies with greater confidence. +} + +WISCA is essential in optimising empirical treatment by shifting away from broad-spectrum antibiotics, which are often overused in empirical settings. By offering precise estimates based on syndromic patterns and pooled data, WISCA supports antimicrobial stewardship by guiding more targeted therapy, reducing unnecessary broad-spectrum use, and combating the rise of antimicrobial resistance. +} + +\examples{ +# example_isolates is a data set available in the AMR package. +# run ?example_isolates for more info. +example_isolates + +\donttest{ +# Traditional antibiogram ---------------------------------------------- + +antibiogram(example_isolates, + antibiotics = c(aminoglycosides(), carbapenems()) +) + +antibiogram(example_isolates, + antibiotics = aminoglycosides(), + ab_transform = "atc", + mo_transform = "gramstain" +) + +antibiogram(example_isolates, + antibiotics = carbapenems(), + ab_transform = "name", + mo_transform = "name" +) + + +# Combined antibiogram ------------------------------------------------- + +# combined antibiotics yield higher empiric coverage +antibiogram(example_isolates, + antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"), + mo_transform = "gramstain" +) + +# names of antibiotics do not need to resemble columns exactly: +antibiogram(example_isolates, + antibiotics = c("Cipro", "cipro + genta"), + mo_transform = "gramstain", + ab_transform = "name", + sep = " & " +) + + +# Syndromic antibiogram ------------------------------------------------ + +# the data set could contain a filter for e.g. respiratory specimens +antibiogram(example_isolates, + antibiotics = c(aminoglycosides(), carbapenems()), + syndromic_group = "ward" +) + +# now define a data set with only E. coli +ex1 <- example_isolates[which(mo_genus() == "Escherichia"), ] + +# with a custom language, though this will be determined automatically +# (i.e., this table will be in Spanish on Spanish systems) +antibiogram(ex1, + antibiotics = aminoglycosides(), + ab_transform = "name", + syndromic_group = ifelse(ex1$ward == "ICU", + "UCI", "No UCI" + ), + language = "es" +) + + +# Weighted-incidence syndromic combination antibiogram (WISCA) --------- + +# the data set could contain a filter for e.g. respiratory specimens/ICU +antibiogram(example_isolates, + antibiotics = c("AMC", "AMC+CIP", "TZP", "TZP+TOB"), + mo_transform = "gramstain", + minimum = 10, # this should be >=30, but now just as example + syndromic_group = ifelse(example_isolates$age >= 65 & + example_isolates$gender == "M", + "WISCA Group 1", "WISCA Group 2" + ) +) + + +# Print the output for R Markdown / Quarto ----------------------------- + +ureido <- antibiogram(example_isolates, + antibiotics = ureidopenicillins(), + ab_transform = "name" +) + +# in an Rmd file, you would just need to return `ureido` in a chunk, +# but to be explicit here: +if (requireNamespace("knitr")) { + cat(knitr::knit_print(ureido)) +} + + +# Generate plots with ggplot2 or base R -------------------------------- + +ab1 <- antibiogram(example_isolates, + antibiotics = c("AMC", "CIP", "TZP", "TZP+TOB"), + mo_transform = "gramstain" +) +ab2 <- antibiogram(example_isolates, + antibiotics = c("AMC", "CIP", "TZP", "TZP+TOB"), + mo_transform = "gramstain", + syndromic_group = "ward" +) + +if (requireNamespace("ggplot2")) { + ggplot2::autoplot(ab1) +} +if (requireNamespace("ggplot2")) { + ggplot2::autoplot(ab2) +} + +plot(ab1) +plot(ab2) +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/antibiotic_class_selectors.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ab_selectors.R +\name{antibiotic_class_selectors} +\alias{antibiotic_class_selectors} +\alias{ab_class} +\alias{ab_selector} +\alias{aminoglycosides} +\alias{aminopenicillins} +\alias{antifungals} +\alias{antimycobacterials} +\alias{betalactams} +\alias{carbapenems} +\alias{cephalosporins} +\alias{cephalosporins_1st} +\alias{cephalosporins_2nd} +\alias{cephalosporins_3rd} +\alias{cephalosporins_4th} +\alias{cephalosporins_5th} +\alias{fluoroquinolones} +\alias{glycopeptides} +\alias{lincosamides} +\alias{lipoglycopeptides} +\alias{macrolides} +\alias{nitrofurans} +\alias{oxazolidinones} +\alias{penicillins} +\alias{polymyxins} +\alias{quinolones} +\alias{rifamycins} +\alias{streptogramins} +\alias{tetracyclines} +\alias{trimethoprims} +\alias{ureidopenicillins} +\alias{administrable_per_os} +\alias{administrable_iv} +\alias{not_intrinsic_resistant} +\title{Antibiotic Selectors} +\usage{ +ab_class(ab_class, only_sir_columns = FALSE, only_treatable = TRUE, ...) + +ab_selector(filter, only_sir_columns = FALSE, only_treatable = TRUE, ...) + +aminoglycosides(only_sir_columns = FALSE, only_treatable = TRUE, ...) + +aminopenicillins(only_sir_columns = FALSE, ...) + +antifungals(only_sir_columns = FALSE, ...) + +antimycobacterials(only_sir_columns = FALSE, ...) + +betalactams(only_sir_columns = FALSE, only_treatable = TRUE, ...) + +carbapenems(only_sir_columns = FALSE, only_treatable = TRUE, ...) + +cephalosporins(only_sir_columns = FALSE, ...) + +cephalosporins_1st(only_sir_columns = FALSE, ...) + +cephalosporins_2nd(only_sir_columns = FALSE, ...) + +cephalosporins_3rd(only_sir_columns = FALSE, ...) + +cephalosporins_4th(only_sir_columns = FALSE, ...) + +cephalosporins_5th(only_sir_columns = FALSE, ...) + +fluoroquinolones(only_sir_columns = FALSE, ...) + +glycopeptides(only_sir_columns = FALSE, ...) + +lincosamides(only_sir_columns = FALSE, only_treatable = TRUE, ...) + +lipoglycopeptides(only_sir_columns = FALSE, ...) + +macrolides(only_sir_columns = FALSE, ...) + +nitrofurans(only_sir_columns = FALSE, ...) + +oxazolidinones(only_sir_columns = FALSE, ...) + +penicillins(only_sir_columns = FALSE, ...) + +polymyxins(only_sir_columns = FALSE, only_treatable = TRUE, ...) + +quinolones(only_sir_columns = FALSE, ...) + +rifamycins(only_sir_columns = FALSE, ...) + +streptogramins(only_sir_columns = FALSE, ...) + +tetracyclines(only_sir_columns = FALSE, ...) + +trimethoprims(only_sir_columns = FALSE, ...) + +ureidopenicillins(only_sir_columns = FALSE, ...) + +administrable_per_os(only_sir_columns = FALSE, ...) + +administrable_iv(only_sir_columns = FALSE, ...) + +not_intrinsic_resistant( + only_sir_columns = FALSE, + col_mo = NULL, + version_expertrules = 3.3, + ... +) +} +\arguments{ +\item{ab_class}{an antimicrobial class or a part of it, such as \code{"carba"} and \code{"carbapenems"}. The columns \code{group}, \code{atc_group1} and \code{atc_group2} of the \link{antibiotics} data set will be searched (case-insensitive) for this value.} + +\item{only_sir_columns}{a \link{logical} to indicate whether only columns of class \code{sir} must be selected (default is \code{FALSE}), see \code{\link[=as.sir]{as.sir()}}} + +\item{only_treatable}{a \link{logical} to indicate whether antimicrobial drugs should be excluded that are only for laboratory tests (default is \code{TRUE}), such as gentamicin-high (\code{GEH}) and imipenem/EDTA (\code{IPE})} + +\item{...}{ignored, only in place to allow future extensions} + +\item{filter}{an \link{expression} to be evaluated in the \link{antibiotics} data set, such as \code{name \%like\% "trim"}} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{version_expertrules}{the version number to use for the EUCAST Expert Rules and Intrinsic Resistance guideline. Can be "3.3", "3.2", or "3.1".} +} +\value{ +(internally) a \link{character} vector of column names, with additional class \code{"ab_selector"} +} +\description{ +These functions allow for filtering rows and selecting columns based on antibiotic test results that are of a specific antibiotic class or group (according to the \link{antibiotics} data set), without the need to define the columns or antibiotic abbreviations. + +In short, if you have a column name that resembles an antimicrobial drug, it will be picked up by any of these functions that matches its pharmaceutical class: "cefazolin", "kefzol", "CZO" and "J01DB04" will all be picked up by \code{\link[=cephalosporins]{cephalosporins()}}. +} +\details{ +These functions can be used in data set calls for selecting columns and filtering rows. They work with base \R, the Tidyverse, and \code{data.table}. They are heavily inspired by the \link[tidyselect:language]{Tidyverse selection helpers} such as \code{\link[tidyselect:everything]{everything()}}, but are not limited to \code{dplyr} verbs. Nonetheless, they are very convenient to use with \code{dplyr} functions such as \code{\link[dplyr:select]{select()}}, \code{\link[dplyr:filter]{filter()}} and \code{\link[dplyr:summarise]{summarise()}}, see \emph{Examples}. + +All columns in the data in which these functions are called will be searched for known antibiotic names, abbreviations, brand names, and codes (ATC, EARS-Net, WHO, etc.) according to the \link{antibiotics} data set. This means that a selector such as \code{\link[=aminoglycosides]{aminoglycosides()}} will pick up column names like 'gen', 'genta', 'J01GB03', 'tobra', 'Tobracin', etc. + +The \code{\link[=ab_class]{ab_class()}} function can be used to filter/select on a manually defined antibiotic class. It searches for results in the \link{antibiotics} data set within the columns \code{group}, \code{atc_group1} and \code{atc_group2}. + +The \code{\link[=ab_selector]{ab_selector()}} function can be used to internally filter the \link{antibiotics} data set on any results, see \emph{Examples}. It allows for filtering on a (part of) a certain name, and/or a group name or even a minimum of DDDs for oral treatment. This function yields the highest flexibility, but is also the least user-friendly, since it requires a hard-coded filter to set. + +The \code{\link[=administrable_per_os]{administrable_per_os()}} and \code{\link[=administrable_iv]{administrable_iv()}} functions also rely on the \link{antibiotics} data set - antibiotic columns will be matched where a DDD (defined daily dose) for resp. oral and IV treatment is available in the \link{antibiotics} data set. + +The \code{\link[=not_intrinsic_resistant]{not_intrinsic_resistant()}} function can be used to only select antibiotic columns that pose no intrinsic resistance for the microorganisms in the data set. For example, if a data set contains only microorganism codes or names of \emph{E. coli} and \emph{K. pneumoniae} and contains a column "vancomycin", this column will be removed (or rather, unselected) using this function. It currently applies \href{https://www.eucast.org/expert_rules_and_expected_phenotypes}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021) to determine intrinsic resistance, using the \code{\link[=eucast_rules]{eucast_rules()}} function internally. Because of this determination, this function is quite slow in terms of performance. +} +\section{Full list of supported (antibiotic) classes}{ + +\itemize{ +\item \code{\link[=aminoglycosides]{aminoglycosides()}} can select: \cr amikacin (AMK), amikacin/fosfomycin (AKF), apramycin (APR), arbekacin (ARB), astromicin (AST), bekanamycin (BEK), dibekacin (DKB), framycetin (FRM), gentamicin (GEN), gentamicin-high (GEH), habekacin (HAB), hygromycin (HYG), isepamicin (ISE), kanamycin (KAN), kanamycin-high (KAH), kanamycin/cephalexin (KAC), micronomicin (MCR), neomycin (NEO), netilmicin (NET), pentisomicin (PIM), plazomicin (PLZ), propikacin (PKA), ribostamycin (RST), sisomicin (SIS), streptoduocin (STR), streptomycin (STR1), streptomycin-high (STH), tobramycin (TOB), and tobramycin-high (TOH) +\item \code{\link[=aminopenicillins]{aminopenicillins()}} can select: \cr amoxicillin (AMX) and ampicillin (AMP) +\item \code{\link[=antifungals]{antifungals()}} can select: \cr amorolfine (AMO), amphotericin B (AMB), amphotericin B-high (AMH), anidulafungin (ANI), butoconazole (BUT), caspofungin (CAS), ciclopirox (CIX), clotrimazole (CTR), econazole (ECO), fluconazole (FLU), flucytosine (FCT), fosfluconazole (FFL), griseofulvin (GRI), hachimycin (HCH), ibrexafungerp (IBX), isavuconazole (ISV), isoconazole (ISO), itraconazole (ITR), ketoconazole (KET), manogepix (MGX), micafungin (MIF), miconazole (MCZ), nystatin (NYS), oteseconazole (OTE), pimaricin (PMR), posaconazole (POS), rezafungin (RZF), ribociclib (RBC), sulconazole (SUC), terbinafine (TRB), terconazole (TRC), and voriconazole (VOR) +\item \code{\link[=antimycobacterials]{antimycobacterials()}} can select: \cr 4-aminosalicylic acid (AMA), calcium aminosalicylate (CLA), capreomycin (CAP), clofazimine (CLF), delamanid (DLM), enviomycin (ENV), ethambutol (ETH), ethambutol/isoniazid (ETI), ethionamide (ETI1), isoniazid (INH), isoniazid/sulfamethoxazole/trimethoprim/pyridoxine (IST), morinamide (MRN), p-aminosalicylic acid (PAS), pretomanid (PMD), protionamide (PTH), pyrazinamide (PZA), rifabutin (RIB), rifampicin (RIF), rifampicin/ethambutol/isoniazid (REI), rifampicin/isoniazid (RFI), rifampicin/pyrazinamide/ethambutol/isoniazid (RPEI), rifampicin/pyrazinamide/isoniazid (RPI), rifamycin (RFM), rifapentine (RFP), simvastatin/fenofibrate (SMF), sodium aminosalicylate (SDA), streptomycin/isoniazid (STI), terizidone (TRZ), thioacetazone (TAT), thioacetazone/isoniazid (THI1), tiocarlide (TCR), and viomycin (VIO) +\item \code{\link[=betalactams]{betalactams()}} can select: \cr amoxicillin (AMX), amoxicillin/clavulanic acid (AMC), amoxicillin/sulbactam (AXS), ampicillin (AMP), ampicillin/sulbactam (SAM), apalcillin (APL), aspoxicillin (APX), avibactam (AVB), azidocillin (AZD), azlocillin (AZL), aztreonam (ATM), aztreonam/avibactam (AZA), aztreonam/nacubactam (ANC), bacampicillin (BAM), benzathine benzylpenicillin (BNB), benzathine phenoxymethylpenicillin (BNP), benzylpenicillin (PEN), biapenem (BIA), carbenicillin (CRB), carindacillin (CRN), cefacetrile (CAC), cefaclor (CEC), cefadroxil (CFR), cefalexin (LEX), cefaloridine (RID), cefalotin (CEP), cefamandole (MAN), cefapirin (HAP), cefatrizine (CTZ), cefazedone (CZD), cefazolin (CZO), cefcapene (CCP), cefcapene pivoxil (CCX), cefdinir (CDR), cefditoren (DIT), cefditoren pivoxil (DIX), cefepime (FEP), cefepime/clavulanic acid (CPC), cefepime/nacubactam (FNC), cefepime/tazobactam (FPT), cefetamet (CAT), cefetamet pivoxil (CPI), cefetecol (CCL), cefetrizole (CZL), cefiderocol (FDC), cefixime (CFM), cefmenoxime (CMX), cefmetazole (CMZ), cefodizime (DIZ), cefonicid (CID), cefoperazone (CFP), cefoperazone/sulbactam (CSL), ceforanide (CND), cefoselis (CSE), cefotaxime (CTX), cefotaxime/clavulanic acid (CTC), cefotaxime/sulbactam (CTS), cefotetan (CTT), cefotiam (CTF), cefotiam hexetil (CHE), cefovecin (FOV), cefoxitin (FOX), cefoxitin screening (FOX1), cefozopran (ZOP), cefpimizole (CFZ), cefpiramide (CPM), cefpirome (CPO), cefpodoxime (CPD), cefpodoxime proxetil (CPX), cefpodoxime/clavulanic acid (CDC), cefprozil (CPR), cefquinome (CEQ), cefroxadine (CRD), cefsulodin (CFS), cefsumide (CSU), ceftaroline (CPT), ceftaroline/avibactam (CPA), ceftazidime (CAZ), ceftazidime/avibactam (CZA), ceftazidime/clavulanic acid (CCV), cefteram (CEM), cefteram pivoxil (CPL), ceftezole (CTL), ceftibuten (CTB), ceftiofur (TIO), ceftizoxime (CZX), ceftizoxime alapivoxil (CZP), ceftobiprole (BPR), ceftobiprole medocaril (CFM1), ceftolozane/tazobactam (CZT), ceftriaxone (CRO), ceftriaxone/beta-lactamase inhibitor (CEB), cefuroxime (CXM), cefuroxime axetil (CXA), cephradine (CED), ciclacillin (CIC), clometocillin (CLM), cloxacillin (CLO), dicloxacillin (DIC), doripenem (DOR), epicillin (EPC), ertapenem (ETP), flucloxacillin (FLC), hetacillin (HET), imipenem (IPM), imipenem/EDTA (IPE), imipenem/relebactam (IMR), latamoxef (LTM), lenampicillin (LEN), loracarbef (LOR), mecillinam (MEC), meropenem (MEM), meropenem/nacubactam (MNC), meropenem/vaborbactam (MEV), metampicillin (MTM), meticillin (MET), mezlocillin (MEZ), mezlocillin/sulbactam (MSU), nacubactam (NAC), nafcillin (NAF), oxacillin (OXA), panipenem (PAN), penamecillin (PNM), penicillin/novobiocin (PNO), penicillin/sulbactam (PSU), pheneticillin (PHE), phenoxymethylpenicillin (PHN), piperacillin (PIP), piperacillin/sulbactam (PIS), piperacillin/tazobactam (TZP), piridicillin (PRC), pivampicillin (PVM), pivmecillinam (PME), procaine benzylpenicillin (PRB), propicillin (PRP), razupenem (RZM), ritipenem (RIT), ritipenem acoxil (RIA), sarmoxicillin (SRX), sulbactam (SUL), sulbenicillin (SBC), sultamicillin (SLT6), talampicillin (TAL), tazobactam (TAZ), tebipenem (TBP), temocillin (TEM), ticarcillin (TIC), and ticarcillin/clavulanic acid (TCC) +\item \code{\link[=carbapenems]{carbapenems()}} can select: \cr biapenem (BIA), doripenem (DOR), ertapenem (ETP), imipenem (IPM), imipenem/EDTA (IPE), imipenem/relebactam (IMR), meropenem (MEM), meropenem/nacubactam (MNC), meropenem/vaborbactam (MEV), panipenem (PAN), razupenem (RZM), ritipenem (RIT), ritipenem acoxil (RIA), and tebipenem (TBP) +\item \code{\link[=cephalosporins]{cephalosporins()}} can select: \cr cefacetrile (CAC), cefaclor (CEC), cefadroxil (CFR), cefalexin (LEX), cefaloridine (RID), cefalotin (CEP), cefamandole (MAN), cefapirin (HAP), cefatrizine (CTZ), cefazedone (CZD), cefazolin (CZO), cefcapene (CCP), cefcapene pivoxil (CCX), cefdinir (CDR), cefditoren (DIT), cefditoren pivoxil (DIX), cefepime (FEP), cefepime/clavulanic acid (CPC), cefepime/tazobactam (FPT), cefetamet (CAT), cefetamet pivoxil (CPI), cefetecol (CCL), cefetrizole (CZL), cefiderocol (FDC), cefixime (CFM), cefmenoxime (CMX), cefmetazole (CMZ), cefodizime (DIZ), cefonicid (CID), cefoperazone (CFP), cefoperazone/sulbactam (CSL), ceforanide (CND), cefoselis (CSE), cefotaxime (CTX), cefotaxime/clavulanic acid (CTC), cefotaxime/sulbactam (CTS), cefotetan (CTT), cefotiam (CTF), cefotiam hexetil (CHE), cefovecin (FOV), cefoxitin (FOX), cefoxitin screening (FOX1), cefozopran (ZOP), cefpimizole (CFZ), cefpiramide (CPM), cefpirome (CPO), cefpodoxime (CPD), cefpodoxime proxetil (CPX), cefpodoxime/clavulanic acid (CDC), cefprozil (CPR), cefquinome (CEQ), cefroxadine (CRD), cefsulodin (CFS), cefsumide (CSU), ceftaroline (CPT), ceftaroline/avibactam (CPA), ceftazidime (CAZ), ceftazidime/avibactam (CZA), ceftazidime/clavulanic acid (CCV), cefteram (CEM), cefteram pivoxil (CPL), ceftezole (CTL), ceftibuten (CTB), ceftiofur (TIO), ceftizoxime (CZX), ceftizoxime alapivoxil (CZP), ceftobiprole (BPR), ceftobiprole medocaril (CFM1), ceftolozane/tazobactam (CZT), ceftriaxone (CRO), ceftriaxone/beta-lactamase inhibitor (CEB), cefuroxime (CXM), cefuroxime axetil (CXA), cephradine (CED), latamoxef (LTM), and loracarbef (LOR) +\item \code{\link[=cephalosporins_1st]{cephalosporins_1st()}} can select: \cr cefacetrile (CAC), cefadroxil (CFR), cefalexin (LEX), cefaloridine (RID), cefalotin (CEP), cefapirin (HAP), cefatrizine (CTZ), cefazedone (CZD), cefazolin (CZO), cefroxadine (CRD), ceftezole (CTL), and cephradine (CED) +\item \code{\link[=cephalosporins_2nd]{cephalosporins_2nd()}} can select: \cr cefaclor (CEC), cefamandole (MAN), cefmetazole (CMZ), cefonicid (CID), ceforanide (CND), cefotetan (CTT), cefotiam (CTF), cefoxitin (FOX), cefoxitin screening (FOX1), cefprozil (CPR), cefuroxime (CXM), cefuroxime axetil (CXA), and loracarbef (LOR) +\item \code{\link[=cephalosporins_3rd]{cephalosporins_3rd()}} can select: \cr cefcapene (CCP), cefcapene pivoxil (CCX), cefdinir (CDR), cefditoren (DIT), cefditoren pivoxil (DIX), cefetamet (CAT), cefetamet pivoxil (CPI), cefixime (CFM), cefmenoxime (CMX), cefodizime (DIZ), cefoperazone (CFP), cefoperazone/sulbactam (CSL), cefotaxime (CTX), cefotaxime/clavulanic acid (CTC), cefotaxime/sulbactam (CTS), cefotiam hexetil (CHE), cefovecin (FOV), cefpimizole (CFZ), cefpiramide (CPM), cefpodoxime (CPD), cefpodoxime proxetil (CPX), cefpodoxime/clavulanic acid (CDC), cefsulodin (CFS), ceftazidime (CAZ), ceftazidime/avibactam (CZA), ceftazidime/clavulanic acid (CCV), cefteram (CEM), cefteram pivoxil (CPL), ceftibuten (CTB), ceftiofur (TIO), ceftizoxime (CZX), ceftizoxime alapivoxil (CZP), ceftriaxone (CRO), ceftriaxone/beta-lactamase inhibitor (CEB), and latamoxef (LTM) +\item \code{\link[=cephalosporins_4th]{cephalosporins_4th()}} can select: \cr cefepime (FEP), cefepime/clavulanic acid (CPC), cefepime/tazobactam (FPT), cefetecol (CCL), cefoselis (CSE), cefozopran (ZOP), cefpirome (CPO), and cefquinome (CEQ) +\item \code{\link[=cephalosporins_5th]{cephalosporins_5th()}} can select: \cr ceftaroline (CPT), ceftaroline/avibactam (CPA), ceftobiprole (BPR), ceftobiprole medocaril (CFM1), and ceftolozane/tazobactam (CZT) +\item \code{\link[=fluoroquinolones]{fluoroquinolones()}} can select: \cr besifloxacin (BES), ciprofloxacin (CIP), clinafloxacin (CLX), danofloxacin (DAN), delafloxacin (DFX), difloxacin (DIF), enoxacin (ENX), enrofloxacin (ENR), finafloxacin (FIN), fleroxacin (FLE), garenoxacin (GRN), gatifloxacin (GAT), gemifloxacin (GEM), grepafloxacin (GRX), lascufloxacin (LSC), levofloxacin (LVX), levonadifloxacin (LND), lomefloxacin (LOM), marbofloxacin (MAR), metioxate (MXT), miloxacin (MIL), moxifloxacin (MFX), nadifloxacin (NAD), nifuroquine (NIF), norfloxacin (NOR), ofloxacin (OFX), orbifloxacin (ORB), pazufloxacin (PAZ), pefloxacin (PEF), pradofloxacin (PRA), premafloxacin (PRX), prulifloxacin (PRU), rufloxacin (RFL), sarafloxacin (SAR), sitafloxacin (SIT), sparfloxacin (SPX), temafloxacin (TMX), tilbroquinol (TBQ), tioxacin (TXC), tosufloxacin (TFX), and trovafloxacin (TVA) +\item \code{\link[=glycopeptides]{glycopeptides()}} can select: \cr avoparcin (AVO), dalbavancin (DAL), norvancomycin (NVA), oritavancin (ORI), ramoplanin (RAM), teicoplanin (TEC), teicoplanin-macromethod (TCM), telavancin (TLV), vancomycin (VAN), and vancomycin-macromethod (VAM) +\item \code{\link[=lincosamides]{lincosamides()}} can select: \cr acetylmidecamycin (ACM), acetylspiramycin (ASP), clindamycin (CLI), clindamycin inducible screening (CLI1), gamithromycin (GAM), kitasamycin (KIT), lincomycin (LIN), meleumycin (MEL), nafithromycin (ZWK), pirlimycin (PRL), primycin (PRM), solithromycin (SOL), tildipirosin (TIP), tilmicosin (TIL), tulathromycin (TUL), tylosin (TYL), and tylvalosin (TYL1) +\item \code{\link[=lipoglycopeptides]{lipoglycopeptides()}} can select: \cr dalbavancin (DAL), oritavancin (ORI), and telavancin (TLV) +\item \code{\link[=macrolides]{macrolides()}} can select: \cr acetylmidecamycin (ACM), acetylspiramycin (ASP), azithromycin (AZM), clarithromycin (CLR), dirithromycin (DIR), erythromycin (ERY), flurithromycin (FLR1), gamithromycin (GAM), josamycin (JOS), kitasamycin (KIT), meleumycin (MEL), midecamycin (MID), miocamycin (MCM), nafithromycin (ZWK), oleandomycin (OLE), pirlimycin (PRL), primycin (PRM), rokitamycin (ROK), roxithromycin (RXT), solithromycin (SOL), spiramycin (SPI), telithromycin (TLT), tildipirosin (TIP), tilmicosin (TIL), troleandomycin (TRL), tulathromycin (TUL), tylosin (TYL), and tylvalosin (TYL1) +\item \code{\link[=nitrofurans]{nitrofurans()}} can select: \cr furazidin (FUR), furazolidone (FRZ), nifurtoinol (NFR), nitrofurantoin (NIT), and nitrofurazone (NIZ) +\item \code{\link[=oxazolidinones]{oxazolidinones()}} can select: \cr cadazolid (CDZ), cycloserine (CYC), linezolid (LNZ), tedizolid (TZD), and thiacetazone (THA) +\item \code{\link[=penicillins]{penicillins()}} can select: \cr amoxicillin (AMX), amoxicillin/clavulanic acid (AMC), amoxicillin/sulbactam (AXS), ampicillin (AMP), ampicillin/sulbactam (SAM), apalcillin (APL), aspoxicillin (APX), avibactam (AVB), azidocillin (AZD), azlocillin (AZL), aztreonam (ATM), aztreonam/avibactam (AZA), aztreonam/nacubactam (ANC), bacampicillin (BAM), benzathine benzylpenicillin (BNB), benzathine phenoxymethylpenicillin (BNP), benzylpenicillin (PEN), carbenicillin (CRB), carindacillin (CRN), cefepime/nacubactam (FNC), ciclacillin (CIC), clometocillin (CLM), cloxacillin (CLO), dicloxacillin (DIC), epicillin (EPC), flucloxacillin (FLC), hetacillin (HET), lenampicillin (LEN), mecillinam (MEC), metampicillin (MTM), meticillin (MET), mezlocillin (MEZ), mezlocillin/sulbactam (MSU), nacubactam (NAC), nafcillin (NAF), oxacillin (OXA), penamecillin (PNM), penicillin/novobiocin (PNO), penicillin/sulbactam (PSU), pheneticillin (PHE), phenoxymethylpenicillin (PHN), piperacillin (PIP), piperacillin/sulbactam (PIS), piperacillin/tazobactam (TZP), piridicillin (PRC), pivampicillin (PVM), pivmecillinam (PME), procaine benzylpenicillin (PRB), propicillin (PRP), sarmoxicillin (SRX), sulbactam (SUL), sulbenicillin (SBC), sultamicillin (SLT6), talampicillin (TAL), tazobactam (TAZ), temocillin (TEM), ticarcillin (TIC), and ticarcillin/clavulanic acid (TCC) +\item \code{\link[=polymyxins]{polymyxins()}} can select: \cr colistin (COL), polymyxin B (PLB), and polymyxin B/polysorbate 80 (POP) +\item \code{\link[=quinolones]{quinolones()}} can select: \cr besifloxacin (BES), cinoxacin (CIN), ciprofloxacin (CIP), ciprofloxacin/metronidazole (CIM), ciprofloxacin/ornidazole (CIO), ciprofloxacin/tinidazole (CIT), clinafloxacin (CLX), danofloxacin (DAN), delafloxacin (DFX), difloxacin (DIF), enoxacin (ENX), enrofloxacin (ENR), finafloxacin (FIN), fleroxacin (FLE), flumequine (FLM), garenoxacin (GRN), gatifloxacin (GAT), gemifloxacin (GEM), grepafloxacin (GRX), lascufloxacin (LSC), levofloxacin (LVX), levonadifloxacin (LND), lomefloxacin (LOM), marbofloxacin (MAR), metioxate (MXT), miloxacin (MIL), moxifloxacin (MFX), nadifloxacin (NAD), nalidixic acid (NAL), nemonoxacin (NEM), nifuroquine (NIF), nitroxoline (NTR), norfloxacin (NOR), ofloxacin (OFX), orbifloxacin (ORB), oxolinic acid (OXO), pazufloxacin (PAZ), pefloxacin (PEF), pipemidic acid (PPA), piromidic acid (PIR), pradofloxacin (PRA), premafloxacin (PRX), prulifloxacin (PRU), rosoxacin (ROS), rufloxacin (RFL), sarafloxacin (SAR), sitafloxacin (SIT), sparfloxacin (SPX), temafloxacin (TMX), tilbroquinol (TBQ), tioxacin (TXC), tosufloxacin (TFX), and trovafloxacin (TVA) +\item \code{\link[=rifamycins]{rifamycins()}} can select: \cr rifabutin (RIB), rifampicin (RIF), rifampicin/ethambutol/isoniazid (REI), rifampicin/isoniazid (RFI), rifampicin/pyrazinamide/ethambutol/isoniazid (RPEI), rifampicin/pyrazinamide/isoniazid (RPI), rifamycin (RFM), and rifapentine (RFP) +\item \code{\link[=streptogramins]{streptogramins()}} can select: \cr pristinamycin (PRI) and quinupristin/dalfopristin (QDA) +\item \code{\link[=tetracyclines]{tetracyclines()}} can select: \cr cetocycline (CTO), chlortetracycline (CTE), clomocycline (CLM1), demeclocycline (DEM), doxycycline (DOX), eravacycline (ERV), lymecycline (LYM), metacycline (MTC), minocycline (MNO), omadacycline (OMC), oxytetracycline (OXY), penimepicycline (PNM1), rolitetracycline (RLT), sarecycline (SRC), tetracycline (TCY), and tigecycline (TGC) +\item \code{\link[=trimethoprims]{trimethoprims()}} can select: \cr brodimoprim (BDP), sulfadiazine (SDI), sulfadiazine/tetroxoprim (SLT), sulfadiazine/trimethoprim (SLT1), sulfadimethoxine (SUD), sulfadimidine (SDM), sulfadimidine/trimethoprim (SLT2), sulfafurazole (SLF), sulfaisodimidine (SLF1), sulfalene (SLF2), sulfamazone (SZO), sulfamerazine (SLF3), sulfamerazine/trimethoprim (SLT3), sulfamethizole (SLF4), sulfamethoxazole (SMX), sulfamethoxypyridazine (SLF5), sulfametomidine (SLF6), sulfametoxydiazine (SLF7), sulfametrole/trimethoprim (SLT4), sulfamoxole (SLF8), sulfamoxole/trimethoprim (SLT5), sulfanilamide (SLF9), sulfaperin (SLF10), sulfaphenazole (SLF11), sulfapyridine (SLF12), sulfathiazole (SUT), sulfathiourea (SLF13), trimethoprim (TMP), and trimethoprim/sulfamethoxazole (SXT) +\item \code{\link[=ureidopenicillins]{ureidopenicillins()}} can select: \cr azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), and piperacillin/tazobactam (TZP) +} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates. +example_isolates + + +# Examples sections below are split into 'dplyr', 'base R', and 'data.table': + +\donttest{ +# dplyr ------------------------------------------------------------------- + +if (require("dplyr")) { + example_isolates \%>\% select(carbapenems()) +} + +if (require("dplyr")) { + # select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB' + example_isolates \%>\% select(mo, aminoglycosides()) +} + +if (require("dplyr")) { + # select only antibiotic columns with DDDs for oral treatment + example_isolates \%>\% select(administrable_per_os()) +} + +if (require("dplyr")) { + # get AMR for all aminoglycosides e.g., per ward: + example_isolates \%>\% + group_by(ward) \%>\% + summarise(across(aminoglycosides(), + resistance)) +} +if (require("dplyr")) { + # You can combine selectors with '&' to be more specific: + example_isolates \%>\% + select(penicillins() & administrable_per_os()) +} +if (require("dplyr")) { + # get AMR for only drugs that matter - no intrinsic resistance: + example_isolates \%>\% + filter(mo_genus() \%in\% c("Escherichia", "Klebsiella")) \%>\% + group_by(ward) \%>\% + summarise_at(not_intrinsic_resistant(), + resistance) +} +if (require("dplyr")) { + # get susceptibility for antibiotics whose name contains "trim": + example_isolates \%>\% + filter(first_isolate()) \%>\% + group_by(ward) \%>\% + summarise(across(ab_selector(name \%like\% "trim"), susceptibility)) +} +if (require("dplyr")) { + # this will select columns 'IPM' (imipenem) and 'MEM' (meropenem): + example_isolates \%>\% + select(carbapenems()) +} +if (require("dplyr")) { + # this will select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB': + example_isolates \%>\% + select(mo, aminoglycosides()) +} +if (require("dplyr")) { + # any() and all() work in dplyr's filter() too: + example_isolates \%>\% + filter( + any(aminoglycosides() == "R"), + all(cephalosporins_2nd() == "R") + ) +} +if (require("dplyr")) { + # also works with c(): + example_isolates \%>\% + filter(any(c(carbapenems(), aminoglycosides()) == "R")) +} +if (require("dplyr")) { + # not setting any/all will automatically apply all(): + example_isolates \%>\% + filter(aminoglycosides() == "R") +} +if (require("dplyr")) { + # this will select columns 'mo' and all antimycobacterial drugs ('RIF'): + example_isolates \%>\% + select(mo, ab_class("mycobact")) +} +if (require("dplyr")) { + # get bug/drug combinations for only glycopeptides in Gram-positives: + example_isolates \%>\% + filter(mo_is_gram_positive()) \%>\% + select(mo, glycopeptides()) \%>\% + bug_drug_combinations() \%>\% + format() +} +if (require("dplyr")) { + data.frame( + some_column = "some_value", + J01CA01 = "S" + ) \%>\% # ATC code of ampicillin + select(penicillins()) # only the 'J01CA01' column will be selected +} +if (require("dplyr")) { + # with recent versions of dplyr, this is all equal: + x <- example_isolates[carbapenems() == "R", ] + y <- example_isolates \%>\% filter(carbapenems() == "R") + z <- example_isolates \%>\% filter(if_all(carbapenems(), ~ .x == "R")) + identical(x, y) && identical(y, z) +} + + +# base R ------------------------------------------------------------------ + +# select columns 'IPM' (imipenem) and 'MEM' (meropenem) +example_isolates[, carbapenems()] + +# select columns 'mo', 'AMK', 'GEN', 'KAN' and 'TOB' +example_isolates[, c("mo", aminoglycosides())] + +# select only antibiotic columns with DDDs for oral treatment +example_isolates[, administrable_per_os()] + +# filter using any() or all() +example_isolates[any(carbapenems() == "R"), ] +subset(example_isolates, any(carbapenems() == "R")) + +# filter on any or all results in the carbapenem columns (i.e., IPM, MEM): +example_isolates[any(carbapenems()), ] +example_isolates[all(carbapenems()), ] + +# filter with multiple antibiotic selectors using c() +example_isolates[all(c(carbapenems(), aminoglycosides()) == "R"), ] + +# filter + select in one go: get penicillins in carbapenem-resistant strains +example_isolates[any(carbapenems() == "R"), penicillins()] + +# You can combine selectors with '&' to be more specific. For example, +# penicillins() would select benzylpenicillin ('peni G') and +# administrable_per_os() would select erythromycin. Yet, when combined these +# drugs are both omitted since benzylpenicillin is not administrable per os +# and erythromycin is not a penicillin: +example_isolates[, penicillins() & administrable_per_os()] + +# ab_selector() applies a filter in the `antibiotics` data set and is thus +# very flexible. For instance, to select antibiotic columns with an oral DDD +# of at least 1 gram: +example_isolates[, ab_selector(oral_ddd > 1 & oral_units == "g")] + + +# data.table -------------------------------------------------------------- + +# data.table is supported as well, just use it in the same way as with +# base R, but add `with = FALSE` if using a single AB selector. + +if (require("data.table")) { + dt <- as.data.table(example_isolates) + + # this does not work, it returns column *names* + dt[, carbapenems()] +} +if (require("data.table")) { + # so `with = FALSE` is required + dt[, carbapenems(), with = FALSE] +} + +# for multiple selections or AB selectors, `with = FALSE` is not needed: +if (require("data.table")) { + dt[, c("mo", aminoglycosides())] +} +if (require("data.table")) { + dt[, c(carbapenems(), aminoglycosides())] +} + +# row filters are also supported: +if (require("data.table")) { + dt[any(carbapenems() == "S"), ] +} +if (require("data.table")) { + dt[any(carbapenems() == "S"), penicillins(), with = FALSE] +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/antibiotics.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{antibiotics} +\alias{antibiotics} +\alias{antivirals} +\title{Data Sets with 605 Antimicrobial Drugs} +\format{ +\subsection{For the \link{antibiotics} data set: a \link[tibble:tibble]{tibble} with 485 observations and 14 variables:}{ +\itemize{ +\item \code{ab}\cr Antibiotic ID as used in this package (such as \code{AMC}), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available. \emph{\strong{This is a unique identifier.}} +\item \code{cid}\cr Compound ID as found in PubChem. \emph{\strong{This is a unique identifier.}} +\item \code{name}\cr Official name as used by WHONET/EARS-Net or the WHO. \emph{\strong{This is a unique identifier.}} +\item \code{group}\cr A short and concise group name, based on WHONET and WHOCC definitions +\item \code{atc}\cr ATC codes (Anatomical Therapeutic Chemical) as defined by the WHOCC, like \code{J01CR02} +\item \code{atc_group1}\cr Official pharmacological subgroup (3rd level ATC code) as defined by the WHOCC, like \code{"Macrolides, lincosamides and streptogramins"} +\item \code{atc_group2}\cr Official chemical subgroup (4th level ATC code) as defined by the WHOCC, like \code{"Macrolides"} +\item \code{abbr}\cr List of abbreviations as used in many countries, also for antibiotic susceptibility testing (AST) +\item \code{synonyms}\cr Synonyms (often trade names) of a drug, as found in PubChem based on their compound ID +\item \code{oral_ddd}\cr Defined Daily Dose (DDD), oral treatment, currently available for 179 drugs +\item \code{oral_units}\cr Units of \code{oral_ddd} +\item \code{iv_ddd}\cr Defined Daily Dose (DDD), parenteral (intravenous) treatment, currently available for 153 drugs +\item \code{iv_units}\cr Units of \code{iv_ddd} +\item \code{loinc}\cr All codes associated with the name of the antimicrobial drug from Logical Observation Identifiers Names and Codes (LOINC), Version 2.76 (18 September, 2023). Use \code{\link[=ab_loinc]{ab_loinc()}} to retrieve them quickly, see \code{\link[=ab_property]{ab_property()}}. +} +} + +\subsection{For the \link{antivirals} data set: a \link[tibble:tibble]{tibble} with 120 observations and 11 variables:}{ +\itemize{ +\item \code{av}\cr Antiviral ID as used in this package (such as \code{ACI}), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available. \emph{\strong{This is a unique identifier.}} Combinations are codes that contain a \code{+} to indicate this, such as \code{ATA+COBI} for atazanavir/cobicistat. +\item \code{name}\cr Official name as used by WHONET/EARS-Net or the WHO. \emph{\strong{This is a unique identifier.}} +\item \code{atc}\cr ATC codes (Anatomical Therapeutic Chemical) as defined by the WHOCC +\item \code{cid}\cr Compound ID as found in PubChem. \emph{\strong{This is a unique identifier.}} +\item \code{atc_group}\cr Official pharmacological subgroup (3rd level ATC code) as defined by the WHOCC +\item \code{synonyms}\cr Synonyms (often trade names) of a drug, as found in PubChem based on their compound ID +\item \code{oral_ddd}\cr Defined Daily Dose (DDD), oral treatment +\item \code{oral_units}\cr Units of \code{oral_ddd} +\item \code{iv_ddd}\cr Defined Daily Dose (DDD), parenteral treatment +\item \code{iv_units}\cr Units of \code{iv_ddd} +\item \code{loinc}\cr All codes associated with the name of the antiviral drug from Logical Observation Identifiers Names and Codes (LOINC), Version 2.76 (18 September, 2023). Use \code{\link[=av_loinc]{av_loinc()}} to retrieve them quickly, see \code{\link[=av_property]{av_property()}}. +} +} + +An object of class \code{tbl_df} (inherits from \code{tbl}, \code{data.frame}) with 120 rows and 11 columns. +} +\source{ +\itemize{ +\item World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology (WHOCC): \url{https://atcddd.fhi.no/atc_ddd_index/} +\item Logical Observation Identifiers Names and Codes (LOINC), Version 2.76 (18 September, 2023). Accessed from \url{https://loinc.org} on October 19th, 2023. +\item European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +} +} +\usage{ +antibiotics + +antivirals +} +\description{ +Two data sets containing all antibiotics/antimycotics and antivirals. Use \code{\link[=as.ab]{as.ab()}} or one of the \code{\link[=ab_property]{ab_*}} functions to retrieve values from the \link{antibiotics} data set. Three identifiers are included in this data set: an antibiotic ID (\code{ab}, primarily used in this package) as defined by WHONET/EARS-Net, an ATC code (\code{atc}) as defined by the WHO, and a Compound ID (\code{cid}) as found in PubChem. Other properties in this data set are derived from one or more of these codes. Note that some drugs have multiple ATC codes. +} +\details{ +Properties that are based on an ATC code are only available when an ATC is available. These properties are: \code{atc_group1}, \code{atc_group2}, \code{oral_ddd}, \code{oral_units}, \code{iv_ddd} and \code{iv_units}. + +Synonyms (i.e. trade names) were derived from the PubChem Compound ID (column \code{cid}) and consequently only available where a CID is available. +\subsection{Direct download}{ + +Like all data sets in this package, these data sets are publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +} +\section{WHOCC}{ + +This package contains \strong{all ~550 antibiotic, antimycotic and antiviral drugs} and their Anatomical Therapeutic Chemical (ATC) codes, ATC groups and Defined Daily Dose (DDD) from the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC, \url{https://atcddd.fhi.no}) and the Pharmaceuticals Community Register of the European Commission (\url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm}). + +These have become the gold standard for international drug utilisation monitoring and research. + +The WHOCC is located in Oslo at the Norwegian Institute of Public Health and funded by the Norwegian government. The European Commission is the executive of the European Union and promotes its general interest. + +\strong{NOTE: The WHOCC copyright does not allow use for commercial purposes, unlike any other info from this package.} See \url{https://atcddd.fhi.no/copyright_disclaimer/.} +} + +\examples{ +antibiotics +antivirals +} +\seealso{ +\link{microorganisms}, \link{intrinsic_resistant} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.ab.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ab.R +\name{as.ab} +\alias{as.ab} +\alias{ab} +\alias{is.ab} +\title{Transform Input to an Antibiotic ID} +\usage{ +as.ab(x, flag_multiple_results = TRUE, info = interactive(), ...) + +is.ab(x) +} +\arguments{ +\item{x}{a \link{character} vector to determine to antibiotic ID} + +\item{flag_multiple_results}{a \link{logical} to indicate whether a note should be printed to the console that probably more than one antibiotic drug code or name can be retrieved from a single input value.} + +\item{info}{a \link{logical} to indicate whether a progress bar should be printed - the default is \code{TRUE} only in interactive mode} + +\item{...}{arguments passed on to internal functions} +} +\value{ +A \link{character} \link{vector} with additional class \code{\link{ab}} +} +\description{ +Use this function to determine the antibiotic drug code of one or more antibiotics. The data set \link{antibiotics} will be searched for abbreviations, official names and synonyms (brand names). +} +\details{ +All entries in the \link{antibiotics} data set have three different identifiers: a human readable EARS-Net code (column \code{ab}, used by ECDC and WHONET), an ATC code (column \code{atc}, used by WHO), and a CID code (column \code{cid}, Compound ID, used by PubChem). The data set contains more than 5,000 official brand names from many different countries, as found in PubChem. Not that some drugs contain multiple ATC codes. + +All these properties will be searched for the user input. The \code{\link[=as.ab]{as.ab()}} can correct for different forms of misspelling: +\itemize{ +\item Wrong spelling of drug names (such as "tobramicin" or "gentamycin"), which corrects for most audible similarities such as f/ph, x/ks, c/z/s, t/th, etc. +\item Too few or too many vowels or consonants +\item Switching two characters (such as "mreopenem", often the case in clinical data, when doctors typed too fast) +\item Digitalised paper records, leaving artefacts like 0/o/O (zero and O's), B/8, n/r, etc. +} + +Use the \code{\link[=ab_property]{ab_*}} functions to get properties based on the returned antibiotic ID, see \emph{Examples}. + +Note: the \code{\link[=as.ab]{as.ab()}} and \code{\link[=ab_property]{ab_*}} functions may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. + +You can add your own manual codes to be considered by \code{\link[=as.ab]{as.ab()}} and all \code{\link[=ab_property]{ab_*}} functions, see \code{\link[=add_custom_antimicrobials]{add_custom_antimicrobials()}}. +} +\section{Source}{ + +World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} + +European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +} + +\section{WHOCC}{ + +This package contains \strong{all ~550 antibiotic, antimycotic and antiviral drugs} and their Anatomical Therapeutic Chemical (ATC) codes, ATC groups and Defined Daily Dose (DDD) from the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC, \url{https://atcddd.fhi.no}) and the Pharmaceuticals Community Register of the European Commission (\url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm}). + +These have become the gold standard for international drug utilisation monitoring and research. + +The WHOCC is located in Oslo at the Norwegian Institute of Public Health and funded by the Norwegian government. The European Commission is the executive of the European Union and promotes its general interest. + +\strong{NOTE: The WHOCC copyright does not allow use for commercial purposes, unlike any other info from this package.} See \url{https://atcddd.fhi.no/copyright_disclaimer/.} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# these examples all return "ERY", the ID of erythromycin: +as.ab("J01FA01") +as.ab("J 01 FA 01") +as.ab("Erythromycin") +as.ab("eryt") +as.ab(" eryt 123") +as.ab("ERYT") +as.ab("ERY") +as.ab("eritromicine") # spelled wrong, yet works +as.ab("Erythrocin") # trade name +as.ab("Romycin") # trade name + +# spelling from different languages and dyslexia are no problem +ab_atc("ceftriaxon") +ab_atc("cephtriaxone") # small spelling error +ab_atc("cephthriaxone") # or a bit more severe +ab_atc("seephthriaaksone") # and even this works + +# use ab_* functions to get a specific properties (see ?ab_property); +# they use as.ab() internally: +ab_name("J01FA01") +ab_name("eryt") + +\donttest{ +if (require("dplyr")) { + # you can quickly rename 'sir' columns using set_ab_names() with dplyr: + example_isolates \%>\% + set_ab_names(where(is.sir), property = "atc") +} +} +} +\seealso{ +\itemize{ +\item \link{antibiotics} for the \link{data.frame} that is being used to determine ATCs +\item \code{\link[=ab_from_text]{ab_from_text()}} for a function to retrieve antimicrobial drugs from clinical text (from health care records) +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.av.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/av.R +\name{as.av} +\alias{as.av} +\alias{av} +\alias{is.av} +\title{Transform Input to an Antiviral Drug ID} +\usage{ +as.av(x, flag_multiple_results = TRUE, info = interactive(), ...) + +is.av(x) +} +\arguments{ +\item{x}{a \link{character} vector to determine to antiviral drug ID} + +\item{flag_multiple_results}{a \link{logical} to indicate whether a note should be printed to the console that probably more than one antiviral drug code or name can be retrieved from a single input value.} + +\item{info}{a \link{logical} to indicate whether a progress bar should be printed - the default is \code{TRUE} only in interactive mode} + +\item{...}{arguments passed on to internal functions} +} +\value{ +A \link{character} \link{vector} with additional class \code{\link{ab}} +} +\description{ +Use this function to determine the antiviral drug code of one or more antiviral drugs. The data set \link{antivirals} will be searched for abbreviations, official names and synonyms (brand names). +} +\details{ +All entries in the \link{antivirals} data set have three different identifiers: a human readable EARS-Net code (column \code{ab}, used by ECDC and WHONET), an ATC code (column \code{atc}, used by WHO), and a CID code (column \code{cid}, Compound ID, used by PubChem). The data set contains more than 5,000 official brand names from many different countries, as found in PubChem. Not that some drugs contain multiple ATC codes. + +All these properties will be searched for the user input. The \code{\link[=as.av]{as.av()}} can correct for different forms of misspelling: +\itemize{ +\item Wrong spelling of drug names (such as "acyclovir"), which corrects for most audible similarities such as f/ph, x/ks, c/z/s, t/th, etc. +\item Too few or too many vowels or consonants +\item Switching two characters (such as "aycclovir", often the case in clinical data, when doctors typed too fast) +\item Digitalised paper records, leaving artefacts like 0/o/O (zero and O's), B/8, n/r, etc. +} + +Use the \code{\link[=av_property]{av_*}} functions to get properties based on the returned antiviral drug ID, see \emph{Examples}. + +Note: the \code{\link[=as.av]{as.av()}} and \code{\link[=av_property]{av_*}} functions may use very long regular expression to match brand names of antimicrobial drugs. This may fail on some systems. +} +\section{Source}{ + +World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} + +European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +} + +\section{WHOCC}{ + +This package contains \strong{all ~550 antibiotic, antimycotic and antiviral drugs} and their Anatomical Therapeutic Chemical (ATC) codes, ATC groups and Defined Daily Dose (DDD) from the World Health Organization Collaborating Centre for Drug Statistics Methodology (WHOCC, \url{https://atcddd.fhi.no}) and the Pharmaceuticals Community Register of the European Commission (\url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm}). + +These have become the gold standard for international drug utilisation monitoring and research. + +The WHOCC is located in Oslo at the Norwegian Institute of Public Health and funded by the Norwegian government. The European Commission is the executive of the European Union and promotes its general interest. + +\strong{NOTE: The WHOCC copyright does not allow use for commercial purposes, unlike any other info from this package.} See \url{https://atcddd.fhi.no/copyright_disclaimer/.} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# these examples all return "ACI", the ID of aciclovir: +as.av("J05AB01") +as.av("J 05 AB 01") +as.av("Aciclovir") +as.av("aciclo") +as.av(" aciclo 123") +as.av("ACICL") +as.av("ACI") +as.av("Virorax") # trade name +as.av("Zovirax") # trade name + +as.av("acyklofir") # severe spelling error, yet works + +# use av_* functions to get a specific properties (see ?av_property); +# they use as.av() internally: +av_name("J05AB01") +av_name("acicl") +} +\seealso{ +\itemize{ +\item \link{antivirals} for the \link{data.frame} that is being used to determine ATCs +\item \code{\link[=av_from_text]{av_from_text()}} for a function to retrieve antimicrobial drugs from clinical text (from health care records) +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.disk.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/disk.R +\docType{data} +\name{as.disk} +\alias{as.disk} +\alias{disk} +\alias{NA_disk_} +\alias{is.disk} +\title{Transform Input to Disk Diffusion Diameters} +\format{ +An object of class \code{disk} (inherits from \code{integer}) of length 1. +} +\usage{ +as.disk(x, na.rm = FALSE) + +NA_disk_ + +is.disk(x) +} +\arguments{ +\item{x}{vector} + +\item{na.rm}{a \link{logical} indicating whether missing values should be removed} +} +\value{ +An \link{integer} with additional class \code{\link{disk}} +} +\description{ +This transforms a vector to a new class \code{\link{disk}}, which is a disk diffusion growth zone size (around an antibiotic disk) in millimetres between 0 and 50. +} +\details{ +Interpret disk values as SIR values with \code{\link[=as.sir]{as.sir()}}. It supports guidelines from EUCAST and CLSI. + +Disk diffusion growth zone sizes must be between 0 and 50 millimetres. Values higher than 50 but lower than 100 will be maximised to 50. All others input values outside the 0-50 range will return \code{NA}. + +\code{NA_disk_} is a missing value of the new \code{disk} class. +} +\examples{ +# transform existing disk zones to the `disk` class (using base R) +df <- data.frame( + microorganism = "Escherichia coli", + AMP = 20, + CIP = 14, + GEN = 18, + TOB = 16 +) +df[, 2:5] <- lapply(df[, 2:5], as.disk) +str(df) + +\donttest{ +# transforming is easier with dplyr: +if (require("dplyr")) { + df \%>\% mutate(across(AMP:TOB, as.disk)) +} +} + +# interpret disk values, see ?as.sir +as.sir( + x = as.disk(18), + mo = "Strep pneu", # `mo` will be coerced with as.mo() + ab = "ampicillin", # and `ab` with as.ab() + guideline = "EUCAST" +) + +# interpret whole data set, pretend to be all from urinary tract infections: +as.sir(df, uti = TRUE) +} +\seealso{ +\code{\link[=as.sir]{as.sir()}} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.mic.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mic.R +\docType{data} +\name{as.mic} +\alias{as.mic} +\alias{mic} +\alias{is.mic} +\alias{NA_mic_} +\alias{rescale_mic} +\alias{droplevels.mic} +\title{Transform Input to Minimum Inhibitory Concentrations (MIC)} +\usage{ +as.mic(x, na.rm = FALSE, keep_operators = "all") + +is.mic(x) + +NA_mic_ + +rescale_mic(x, mic_range, keep_operators = "edges", as.mic = TRUE) + +\method{droplevels}{mic}(x, as.mic = FALSE, ...) +} +\arguments{ +\item{x}{a \link{character} or \link{numeric} vector} + +\item{na.rm}{a \link{logical} indicating whether missing values should be removed} + +\item{keep_operators}{a \link{character} specifying how to handle operators (such as \code{>} and \code{<=}) in the input. Accepts one of three values: \code{"all"} (or \code{TRUE}) to keep all operators, \code{"none"} (or \code{FALSE}) to remove all operators, or \code{"edges"} to keep operators only at both ends of the range.} + +\item{mic_range}{a manual range to limit the MIC values, e.g., \code{mic_range = c(0.001, 32)}. Use \code{NA} to set no limit on one side, e.g., \code{mic_range = c(NA, 32)}.} + +\item{as.mic}{a \link{logical} to indicate whether the \code{mic} class should be kept - the default is \code{FALSE}} + +\item{...}{arguments passed on to methods} +} +\value{ +Ordered \link{factor} with additional class \code{\link{mic}}, that in mathematical operations acts as a \link{numeric} vector. Bear in mind that the outcome of any mathematical operation on MICs will return a \link{numeric} value. +} +\description{ +This transforms vectors to a new class \code{\link{mic}}, which treats the input as decimal numbers, while maintaining operators (such as ">=") and only allowing valid MIC values known to the field of (medical) microbiology. +} +\details{ +To interpret MIC values as SIR values, use \code{\link[=as.sir]{as.sir()}} on MIC values. It supports guidelines from EUCAST (2011-2024) and CLSI (2011-2024). + +This class for MIC values is a quite a special data type: formally it is an ordered \link{factor} with valid MIC values as \link{factor} levels (to make sure only valid MIC values are retained), but for any mathematical operation it acts as decimal numbers: + +\if{html}{\out{
}}\preformatted{x <- random_mic(10) +x +#> Class 'mic' +#> [1] 16 1 8 8 64 >=128 0.0625 32 32 16 + +is.factor(x) +#> [1] TRUE + +x[1] * 2 +#> [1] 32 + +median(x) +#> [1] 26 +}\if{html}{\out{
}} + +This makes it possible to maintain operators that often come with MIC values, such ">=" and "<=", even when filtering using \link{numeric} values in data analysis, e.g.: + +\if{html}{\out{
}}\preformatted{x[x > 4] +#> Class 'mic' +#> [1] 16 8 8 64 >=128 32 32 16 + +df <- data.frame(x, hospital = "A") +subset(df, x > 4) # or with dplyr: df \%>\% filter(x > 4) +#> x hospital +#> 1 16 A +#> 5 64 A +#> 6 >=128 A +#> 8 32 A +#> 9 32 A +#> 10 16 A +}\if{html}{\out{
}} + +All so-called \link[=groupGeneric]{group generic functions} are implemented for the MIC class (such as \code{!}, \code{!=}, \code{<}, \code{>=}, \code{\link[=exp]{exp()}}, \code{\link[=log2]{log2()}}). Some functions of the \code{stats} package are also implemented (such as \code{\link[=quantile]{quantile()}}, \code{\link[=median]{median()}}, \code{\link[=fivenum]{fivenum()}}). Since \code{\link[=sd]{sd()}} and \code{\link[=var]{var()}} are non-generic functions, these could not be extended. Use \code{\link[=mad]{mad()}} as an alternative, or use e.g. \code{sd(as.numeric(x))} where \code{x} is your vector of MIC values. + +Using \code{\link[=as.double]{as.double()}} or \code{\link[=as.numeric]{as.numeric()}} on MIC values will remove the operators and return a numeric vector. Do \strong{not} use \code{\link[=as.integer]{as.integer()}} on MIC values as by the \R convention on \link{factor}s, it will return the index of the factor levels (which is often useless for regular users). + +Use \code{\link[=droplevels]{droplevels()}} to drop unused levels. At default, it will return a plain factor. Use \code{droplevels(..., as.mic = TRUE)} to maintain the \code{mic} class. + +With \code{\link[=rescale_mic]{rescale_mic()}}, existing MIC ranges can be limited to a defined range of MIC values. This can be useful to better compare MIC distributions. + +For \code{ggplot2}, use one of the \code{\link[=scale_x_mic]{scale_*_mic()}} functions to plot MIC values. They allows custom MIC ranges and to plot intermediate log2 levels for missing MIC values. + +\code{NA_mic_} is a missing value of the new \code{mic} class, analogous to e.g. base \R's \code{\link[base:NA]{NA_character_}}. +} +\examples{ +mic_data <- as.mic(c(">=32", "1.0", "1", "1.00", 8, "<=0.128", "8", "16", "16")) +mic_data +is.mic(mic_data) + +# this can also coerce combined MIC/SIR values: +as.mic("<=0.002; S") + +# mathematical processing treats MICs as numeric values +fivenum(mic_data) +quantile(mic_data) +all(mic_data < 512) + +# rescale MICs using rescale_mic() +rescale_mic(mic_data, mic_range = c(4, 16)) + +# interpret MIC values +as.sir( + x = as.mic(2), + mo = as.mo("Streptococcus pneumoniae"), + ab = "AMX", + guideline = "EUCAST" +) +as.sir( + x = as.mic(c(0.01, 2, 4, 8)), + mo = as.mo("Streptococcus pneumoniae"), + ab = "AMX", + guideline = "EUCAST" +) + +# plot MIC values, see ?plot +plot(mic_data) +plot(mic_data, mo = "E. coli", ab = "cipro") + +if (require("ggplot2")) { + autoplot(mic_data, mo = "E. coli", ab = "cipro") +} +if (require("ggplot2")) { + autoplot(mic_data, mo = "E. coli", ab = "cipro", language = "nl") # Dutch +} +} +\seealso{ +\code{\link[=as.sir]{as.sir()}} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.mo.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mo.R +\name{as.mo} +\alias{as.mo} +\alias{mo} +\alias{is.mo} +\alias{mo_uncertainties} +\alias{mo_renamed} +\alias{mo_failures} +\alias{mo_reset_session} +\alias{mo_cleaning_regex} +\title{Transform Arbitrary Input to Valid Microbial Taxonomy} +\usage{ +as.mo( + x, + Becker = FALSE, + Lancefield = FALSE, + minimum_matching_score = NULL, + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + reference_df = get_mo_source(), + ignore_pattern = getOption("AMR_ignore_pattern", NULL), + cleaning_regex = getOption("AMR_cleaning_regex", mo_cleaning_regex()), + only_fungi = getOption("AMR_only_fungi", FALSE), + language = get_AMR_locale(), + info = interactive(), + ... +) + +is.mo(x) + +mo_uncertainties() + +mo_renamed() + +mo_failures() + +mo_reset_session() + +mo_cleaning_regex() +} +\arguments{ +\item{x}{a \link{character} vector or a \link{data.frame} with one or two columns} + +\item{Becker}{a \link{logical} to indicate whether staphylococci should be categorised into coagulase-negative staphylococci ("CoNS") and coagulase-positive staphylococci ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} (see \emph{Source}). Please see \emph{Details} for a full list of staphylococcal species that will be converted. + +This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".} + +\item{Lancefield}{a \link{logical} to indicate whether a beta-haemolytic \emph{Streptococcus} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield (see \emph{Source}). These streptococci will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. . Please see \emph{Details} for a full list of streptococcal species that will be converted. + +This excludes enterococci at default (who are in group D), use \code{Lancefield = "all"} to also categorise all enterococci as group D.} + +\item{minimum_matching_score}{a numeric value to set as the lower limit for the \link[=mo_matching_score]{MO matching score}. When left blank, this will be determined automatically based on the character length of \code{x}, its \link[=microorganisms]{taxonomic kingdom} and \link[=mo_matching_score]{human pathogenicity}.} + +\item{keep_synonyms}{a \link{logical} to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is \code{FALSE}, which will return a note if old taxonomic names were processed. The default can be set with the package option \code{\link[=AMR-options]{AMR_keep_synonyms}}, i.e. \code{options(AMR_keep_synonyms = TRUE)} or \code{options(AMR_keep_synonyms = FALSE)}.} + +\item{reference_df}{a \link{data.frame} to be used for extra reference when translating \code{x} to a valid \code{\link{mo}}. See \code{\link[=set_mo_source]{set_mo_source()}} and \code{\link[=get_mo_source]{get_mo_source()}} to automate the usage of your own codes (e.g. used in your analysis or organisation).} + +\item{ignore_pattern}{a Perl-compatible \link[base:regex]{regular expression} (case-insensitive) of which all matches in \code{x} must return \code{NA}. This can be convenient to exclude known non-relevant input and can also be set with the package option \code{\link[=AMR-options]{AMR_ignore_pattern}}, e.g. \code{options(AMR_ignore_pattern = "(not reported|contaminated flora)")}.} + +\item{cleaning_regex}{a Perl-compatible \link[base:regex]{regular expression} (case-insensitive) to clean the input of \code{x}. Every matched part in \code{x} will be removed. At default, this is the outcome of \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}, which removes texts between brackets and texts such as "species" and "serovar". The default can be set with the package option \code{\link[=AMR-options]{AMR_cleaning_regex}}.} + +\item{only_fungi}{a \link{logical} to indicate if only fungi must be found, making sure that e.g. misspellings always return records from the kingdom of Fungi. This can be set globally for \link[=mo_property]{all microorganism functions} with the package option \code{\link[=AMR-options]{AMR_only_fungi}}, i.e. \code{options(AMR_only_fungi = TRUE)}.} + +\item{language}{language to translate text like "no growth", which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})} + +\item{info}{a \link{logical} to indicate that info must be printed, e.g. a progress bar when more than 25 items are to be coerced, or a list with old taxonomic names. The default is \code{TRUE} only in interactive mode.} + +\item{...}{other arguments passed on to functions} +} +\value{ +A \link{character} \link{vector} with additional class \code{\link{mo}} +} +\description{ +Use this function to get a valid microorganism code (\code{\link{mo}}) based on arbitrary user input. Determination is done using intelligent rules and the complete taxonomic tree of the kingdoms Animalia, Archaea, Bacteria, Chromista, and Protozoa, and most microbial species from the kingdom Fungi (see \emph{Source}). The input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (such as \code{"S. aureus"}), an abbreviation known in the field (such as \code{"MRSA"}), or just a genus. See \emph{Examples}. +} +\details{ +A microorganism (MO) code from this package (class: \code{\link{mo}}) is human-readable and typically looks like these examples: + +\if{html}{\out{
}}\preformatted{ Code Full name + --------------- -------------------------------------- + B_KLBSL Klebsiella + B_KLBSL_PNMN Klebsiella pneumoniae + B_KLBSL_PNMN_RHNS Klebsiella pneumoniae rhinoscleromatis + | | | | + | | | | + | | | \\---> subspecies, a 3-5 letter acronym + | | \\----> species, a 3-6 letter acronym + | \\----> genus, a 4-8 letter acronym + \\----> kingdom: A (Archaea), AN (Animalia), B (Bacteria), + C (Chromista), F (Fungi), PL (Plantae), + P (Protozoa) +}\if{html}{\out{
}} + +Values that cannot be coerced will be considered 'unknown' and will return the MO code \code{UNKNOWN} with a warning. + +Use the \code{\link[=mo_property]{mo_*}} functions to get properties based on the returned code, see \emph{Examples}. + +The \code{\link[=as.mo]{as.mo()}} function uses a novel and scientifically validated (\doi{10.18637/jss.v104.i03}) matching score algorithm (see \emph{Matching Score for Microorganisms} below) to match input against the \link[=microorganisms]{available microbial taxonomy} in this package. This implicates that e.g. \code{"E. coli"} (a microorganism highly prevalent in humans) will return the microbial ID of \emph{Escherichia coli} and not \emph{Entamoeba coli} (a microorganism less prevalent in humans), although the latter would alphabetically come first. +\subsection{Coping with Uncertain Results}{ + +Results of non-exact taxonomic input are based on their \link[=mo_matching_score]{matching score}. The lowest allowed score can be set with the \code{minimum_matching_score} argument. At default this will be determined based on the character length of the input, the \link[=microorganisms]{taxonomic kingdom}, and the \link[=mo_matching_score]{human pathogenicity} of the taxonomic outcome. If values are matched with uncertainty, a message will be shown to suggest the user to inspect the results with \code{\link[=mo_uncertainties]{mo_uncertainties()}}, which returns a \link{data.frame} with all specifications. + +To increase the quality of matching, the \code{cleaning_regex} argument is used to clean the input. This must be a \link[base:regex]{regular expression} that matches parts of the input that should be removed before the input is matched against the \link[=microorganisms]{available microbial taxonomy}. It will be matched Perl-compatible and case-insensitive. The default value of \code{cleaning_regex} is the outcome of the helper function \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}. + +There are three helper functions that can be run after using the \code{\link[=as.mo]{as.mo()}} function: +\itemize{ +\item Use \code{\link[=mo_uncertainties]{mo_uncertainties()}} to get a \link{data.frame} that prints in a pretty format with all taxonomic names that were guessed. The output contains the matching score for all matches (see \emph{Matching Score for Microorganisms} below). +\item Use \code{\link[=mo_failures]{mo_failures()}} to get a \link{character} \link{vector} with all values that could not be coerced to a valid value. +\item Use \code{\link[=mo_renamed]{mo_renamed()}} to get a \link{data.frame} with all values that could be coerced based on old, previously accepted taxonomic names. +} +} + +\subsection{For Mycologists}{ + +The \link[=mo_matching_score]{matching score algorithm} gives precedence to bacteria over fungi. If you are only analysing fungi, be sure to use \code{only_fungi = TRUE}, or better yet, add this to your code and run it once every session: + +\if{html}{\out{
}}\preformatted{options(AMR_only_fungi = TRUE) +}\if{html}{\out{
}} + +This will make sure that no bacteria or other 'non-fungi' will be returned by \code{\link[=as.mo]{as.mo()}}, or any of the \code{\link[=mo_property]{mo_*}} functions. +} + +\subsection{Coagulase-negative and Coagulase-positive Staphylococci}{ + +With \code{Becker = TRUE}, the following staphylococci will be converted to their corresponding coagulase group: +\itemize{ +\item Coagulase-negative: \emph{S. americanisciuri}, \emph{S. argensis}, \emph{S. arlettae}, \emph{S. auricularis}, \emph{S. borealis}, \emph{S. brunensis}, \emph{S. caeli}, \emph{S. caledonicus}, \emph{S. canis}, \emph{S. capitis}, \emph{S. capitis capitis}, \emph{S. capitis urealyticus}, \emph{S. capitis ureolyticus}, \emph{S. caprae}, \emph{S. carnosus}, \emph{S. carnosus carnosus}, \emph{S. carnosus utilis}, \emph{S. casei}, \emph{S. caseolyticus}, \emph{S. chromogenes}, \emph{S. cohnii}, \emph{S. cohnii cohnii}, \emph{S. cohnii urealyticum}, \emph{S. cohnii urealyticus}, \emph{S. condimenti}, \emph{S. croceilyticus}, \emph{S. debuckii}, \emph{S. devriesei}, \emph{S. durrellii}, \emph{S. edaphicus}, \emph{S. epidermidis}, \emph{S. equorum}, \emph{S. equorum equorum}, \emph{S. equorum linens}, \emph{S. felis}, \emph{S. fleurettii}, \emph{S. gallinarum}, \emph{S. haemolyticus}, \emph{S. hominis}, \emph{S. hominis hominis}, \emph{S. hominis novobiosepticus}, \emph{S. jettensis}, \emph{S. kloosii}, \emph{S. lentus}, \emph{S. lloydii}, \emph{S. lugdunensis}, \emph{S. marylandisciuri}, \emph{S. massiliensis}, \emph{S. microti}, \emph{S. muscae}, \emph{S. nepalensis}, \emph{S. pasteuri}, \emph{S. petrasii}, \emph{S. petrasii croceilyticus}, \emph{S. petrasii jettensis}, \emph{S. petrasii petrasii}, \emph{S. petrasii pragensis}, \emph{S. pettenkoferi}, \emph{S. piscifermentans}, \emph{S. pragensis}, \emph{S. pseudoxylosus}, \emph{S. pulvereri}, \emph{S. ratti}, \emph{S. rostri}, \emph{S. saccharolyticus}, \emph{S. saprophyticus}, \emph{S. saprophyticus bovis}, \emph{S. saprophyticus saprophyticus}, \emph{S. schleiferi}, \emph{S. schleiferi schleiferi}, \emph{S. sciuri}, \emph{S. sciuri carnaticus}, \emph{S. sciuri lentus}, \emph{S. sciuri rodentium}, \emph{S. sciuri sciuri}, \emph{S. shinii}, \emph{S. simulans}, \emph{S. stepanovicii}, \emph{S. succinus}, \emph{S. succinus casei}, \emph{S. succinus succinus}, \emph{S. taiwanensis}, \emph{S. urealyticus}, \emph{S. ureilyticus}, \emph{S. veratri}, \emph{S. vitulinus}, \emph{S. vitulus}, \emph{S. warneri}, and \emph{S. xylosus} +\item Coagulase-positive: \emph{S. agnetis}, \emph{S. argenteus}, \emph{S. coagulans}, \emph{S. cornubiensis}, \emph{S. delphini}, \emph{S. hyicus}, \emph{S. hyicus chromogenes}, \emph{S. hyicus hyicus}, \emph{S. intermedius}, \emph{S. lutrae}, \emph{S. pseudintermedius}, \emph{S. roterodami}, \emph{S. schleiferi coagulans}, \emph{S. schweitzeri}, \emph{S. simiae}, and \emph{S. singaporensis} +} + +This is based on: +\itemize{ +\item Becker K \emph{et al.} (2014). \strong{Coagulase-Negative Staphylococci.} \emph{Clin Microbiol Rev.} 27(4): 870-926; \doi{10.1128/CMR.00109-13} +\item Becker K \emph{et al.} (2019). \strong{Implications of identifying the recently defined members of the \emph{S. aureus} complex, \emph{S. argenteus} and \emph{S. schweitzeri}: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).} \emph{Clin Microbiol Infect}; \doi{10.1016/j.cmi.2019.02.028} +\item Becker K \emph{et al.} (2020). \strong{Emergence of coagulase-negative staphylococci.} \emph{Expert Rev Anti Infect Ther.} 18(4):349-366; \doi{10.1080/14787210.2020.1730813} +} + +For newly named staphylococcal species, such as \emph{S. brunensis} (2024) and \emph{S. shinii} (2023), we looked up the scientific reference to make sure the species are considered for the correct coagulase group. +} + +\subsection{Lancefield Groups in Streptococci}{ + +With \code{Lancefield = TRUE}, the following streptococci will be converted to their corresponding Lancefield group: +\itemize{ +\item Streptococcus Group A: \emph{S. pyogenes} +\item Streptococcus Group B: \emph{S. agalactiae} +\item Streptococcus Group C: \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, \emph{S. dysgalactiae equisimilis}, \emph{S. equi}, \emph{S. equi equi}, \emph{S. equi ruminatorum}, and \emph{S. equi zooepidemicus} +\item Streptococcus Group F: \emph{S. anginosus}, \emph{S. anginosus anginosus}, \emph{S. anginosus whileyi}, \emph{S. constellatus}, \emph{S. constellatus constellatus}, \emph{S. constellatus pharyngis}, \emph{S. constellatus viborgensis}, and \emph{S. intermedius} +\item Streptococcus Group G: \emph{S. canis}, \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, and \emph{S. dysgalactiae equisimilis} +\item Streptococcus Group H: \emph{S. sanguinis} +\item Streptococcus Group K: \emph{S. salivarius}, \emph{S. salivarius salivarius}, and \emph{S. salivarius thermophilus} +\item Streptococcus Group L: \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, and \emph{S. dysgalactiae equisimilis} +} + +This is based on: +\itemize{ +\item Lancefield RC (1933). \strong{A serological differentiation of human and other groups of hemolytic streptococci.} \emph{J Exp Med.} 57(4): 571-95; \doi{10.1084/jem.57.4.571} +} +} +} +\section{Source}{ + +\itemize{ +\item Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03} +\item Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on June 24th, 2024. +\item Vincent, R \emph{et al} (2013). \strong{MycoBank gearing up for new horizons.} IMA Fungus, 4(2), 371-9; \doi{10.5598/imafungus.2013.04.02.16}. Accessed from \url{https://www.mycobank.org} on June 24th, 2024. +\item GBIF Secretariat (2023). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on June 24th, 2024. +\item Reimer, LC \emph{et al.} (2022). \strong{\emph{BacDive} in 2022: the knowledge base for standardized bacterial and archaeal data.} Nucleic Acids Res., 50(D1):D741-D74; \doi{10.1093/nar/gkab961}. Accessed from \url{https://bacdive.dsmz.de} on July 16th, 2024. +\item Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microorganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: \url{https://phinvads.cdc.gov} +\item Bartlett A \emph{et al.} (2022). \strong{A comprehensive list of bacterial pathogens infecting humans} \emph{Microbiology} 168:001269; \doi{10.1099/mic.0.001269} +} +} + +\section{Matching Score for Microorganisms}{ + +With ambiguous user input in \code{\link[=as.mo]{as.mo()}} and all the \code{\link[=mo_property]{mo_*}} functions, the returned results are chosen based on their matching score using \code{\link[=mo_matching_score]{mo_matching_score()}}. This matching score \eqn{m}, is calculated as: + +\ifelse{latex}{\deqn{m_{(x, n)} = \frac{l_{n} - 0.5 \cdot \min \begin{cases}l_{n} \\ \textrm{lev}(x, n)\end{cases}}{l_{n} \cdot p_{n} \cdot k_{n}}}}{ + +\ifelse{html}{\figure{mo_matching_score.png}{options: width="300" alt="mo matching score"}}{m(x, n) = ( l_n * min(l_n, lev(x, n) ) ) / ( l_n * p_n * k_n )}} + +where: +\itemize{ +\item \eqn{x} is the user input; +\item \eqn{n} is a taxonomic name (genus, species, and subspecies); +\item \eqn{l_n} is the length of \eqn{n}; +\item \eqn{lev} is the \href{https://en.wikipedia.org/wiki/Levenshtein_distance}{Levenshtein distance function} (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n}; +\item \eqn{p_n} is the human pathogenic prevalence group of \eqn{n}, as described below; +\item \eqn{k_n} is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 1.25, Protozoa = 1.5, Chromista = 1.75, Archaea = 2, others = 3. +} + +The grouping into human pathogenic prevalence \eqn{p} is based on recent work from Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}) who extensively studied medical-scientific literature to categorise all bacterial species into these groups: +\itemize{ +\item \strong{Established}, if a taxonomic species has infected at least three persons in three or more references. These records have \code{prevalence = 1.15} in the \link{microorganisms} data set; +\item \strong{Putative}, if a taxonomic species has fewer than three known cases. These records have \code{prevalence = 1.25} in the \link{microorganisms} data set. +} + +Furthermore, +\itemize{ +\item Genera from the World Health Organization's (WHO) Priority Pathogen List have \code{prevalence = 1.0} in the \link{microorganisms} data set; +\item Any genus present in the \strong{established} list also has \code{prevalence = 1.15} in the \link{microorganisms} data set; +\item Any other genus present in the \strong{putative} list has \code{prevalence = 1.25} in the \link{microorganisms} data set; +\item Any other species or subspecies of which the genus is present in the two aforementioned groups, has \code{prevalence = 1.5} in the \link{microorganisms} data set; +\item Any \emph{non-bacterial} genus, species or subspecies of which the genus is present in the following list, has \code{prevalence = 1.25} in the \link{microorganisms} data set: \emph{Absidia}, \emph{Acanthamoeba}, \emph{Acremonium}, \emph{Actinomucor}, \emph{Aedes}, \emph{Alternaria}, \emph{Amoeba}, \emph{Ancylostoma}, \emph{Angiostrongylus}, \emph{Anisakis}, \emph{Anopheles}, \emph{Apophysomyces}, \emph{Arthroderma}, \emph{Aspergillus}, \emph{Aureobasidium}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Bipolaris}, \emph{Blastobotrys}, \emph{Blastocystis}, \emph{Blastomyces}, \emph{Candida}, \emph{Capillaria}, \emph{Chaetomium}, \emph{Chilomastix}, \emph{Chrysonilia}, \emph{Chrysosporium}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Clavispora}, \emph{Coccidioides}, \emph{Cokeromyces}, \emph{Conidiobolus}, \emph{Coniochaeta}, \emph{Contracaecum}, \emph{Cordylobia}, \emph{Cryptococcus}, \emph{Cryptosporidium}, \emph{Cunninghamella}, \emph{Curvularia}, \emph{Cyberlindnera}, \emph{Debaryozyma}, \emph{Demodex}, \emph{Dermatobia}, \emph{Dientamoeba}, \emph{Diphyllobothrium}, \emph{Dirofilaria}, \emph{Echinostoma}, \emph{Entamoeba}, \emph{Enterobius}, \emph{Epidermophyton}, \emph{Exidia}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Fasciola}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Geotrichum}, \emph{Giardia}, \emph{Graphium}, \emph{Haloarcula}, \emph{Halobacterium}, \emph{Halococcus}, \emph{Hansenula}, \emph{Hendersonula}, \emph{Heterophyes}, \emph{Histomonas}, \emph{Histoplasma}, \emph{Hortaea}, \emph{Hymenolepis}, \emph{Hypomyces}, \emph{Hysterothylacium}, \emph{Kloeckera}, \emph{Kluyveromyces}, \emph{Kodamaea}, \emph{Lacazia}, \emph{Leishmania}, \emph{Lichtheimia}, \emph{Lodderomyces}, \emph{Lomentospora}, \emph{Madurella}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Metagonimus}, \emph{Meyerozyma}, \emph{Microsporidium}, \emph{Microsporum}, \emph{Millerozyma}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Nannizzia}, \emph{Necator}, \emph{Nectria}, \emph{Ochroconis}, \emph{Oesophagostomum}, \emph{Oidiodendron}, \emph{Opisthorchis}, \emph{Paecilomyces}, \emph{Paracoccidioides}, \emph{Pediculus}, \emph{Penicillium}, \emph{Phaeoacremonium}, \emph{Phaeomoniella}, \emph{Phialophora}, \emph{Phlebotomus}, \emph{Phoma}, \emph{Pichia}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Pneumocystis}, \emph{Pseudallescheria}, \emph{Pseudoscopulariopsis}, \emph{Pseudoterranova}, \emph{Pulex}, \emph{Purpureocillium}, \emph{Quambalaria}, \emph{Rhinocladiella}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Saccharomyces}, \emph{Saksenaea}, \emph{Saprochaete}, \emph{Sarcoptes}, \emph{Scedosporium}, \emph{Schistosoma}, \emph{Schizosaccharomyces}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Spirometra}, \emph{Sporobolomyces}, \emph{Sporopachydermia}, \emph{Sporothrix}, \emph{Sporotrichum}, \emph{Stachybotrys}, \emph{Strongyloides}, \emph{Syncephalastrum}, \emph{Syngamus}, \emph{Taenia}, \emph{Talaromyces}, \emph{Teleomorph}, \emph{Toxocara}, \emph{Trichinella}, \emph{Trichobilharzia}, \emph{Trichoderma}, \emph{Trichomonas}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Trichostrongylus}, \emph{Trichuris}, \emph{Tritirachium}, \emph{Trombicula}, \emph{Trypanosoma}, \emph{Tunga}, \emph{Ulocladium}, \emph{Ustilago}, \emph{Verticillium}, \emph{Wallemia}, \emph{Wangiella}, \emph{Wickerhamomyces}, \emph{Wuchereria}, \emph{Yarrowia}, or \emph{Zygosaccharomyces}; +\item All other records have \code{prevalence = 2.0} in the \link{microorganisms} data set. +} + +When calculating the matching score, all characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses. + +All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.381}, a less prevalent microorganism in humans), although the latter would alphabetically come first. +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +\donttest{ +# These examples all return "B_STPHY_AURS", the ID of S. aureus: +as.mo(c( + "sau", # WHONET code + "stau", + "STAU", + "staaur", + "S. aureus", + "S aureus", + "Sthafilokkockus aureus", # handles incorrect spelling + "Staphylococcus aureus (MRSA)", + "MRSA", # Methicillin Resistant S. aureus + "VISA", # Vancomycin Intermediate S. aureus + "VRSA", # Vancomycin Resistant S. aureus + 115329001 # SNOMED CT code +)) + +# Dyslexia is no problem - these all work: +as.mo(c( + "Ureaplasma urealyticum", + "Ureaplasma urealyticus", + "Ureaplasmium urealytica", + "Ureaplazma urealitycium" +)) + +# input will get cleaned up with the input given in the `cleaning_regex` argument, +# which defaults to `mo_cleaning_regex()`: +cat(mo_cleaning_regex(), "\n") + +as.mo("Streptococcus group A") + +as.mo("S. epidermidis") # will remain species: B_STPHY_EPDR +as.mo("S. epidermidis", Becker = TRUE) # will not remain species: B_STPHY_CONS + +as.mo("S. pyogenes") # will remain species: B_STRPT_PYGN +as.mo("S. pyogenes", Lancefield = TRUE) # will not remain species: B_STRPT_GRPA + +# All mo_* functions use as.mo() internally too (see ?mo_property): +mo_genus("E. coli") +mo_gramstain("ESCO") +mo_is_intrinsic_resistant("ESCCOL", ab = "vanco") +} +} +\seealso{ +\link{microorganisms} for the \link{data.frame} that is being used to determine ID's. + +The \code{\link[=mo_property]{mo_*}} functions (such as \code{\link[=mo_genus]{mo_genus()}}, \code{\link[=mo_gramstain]{mo_gramstain()}}) to get properties based on the returned code. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/as.sir.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/sir.R +\docType{data} +\name{as.sir} +\alias{as.sir} +\alias{sir} +\alias{NA_sir_} +\alias{is.sir} +\alias{is_sir_eligible} +\alias{as.sir.default} +\alias{as.sir.mic} +\alias{as.sir.disk} +\alias{as.sir.data.frame} +\alias{sir_interpretation_history} +\title{Translate MIC and Disk Diffusion to SIR, or Clean Existing SIR Data} +\source{ +For interpretations of minimum inhibitory concentration (MIC) values and disk diffusion diameters: +\itemize{ +\item \strong{CLSI M39: Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data}, 2011-2024, \emph{Clinical and Laboratory Standards Institute} (CLSI). \url{https://clsi.org/standards/products/microbiology/documents/m39/}. +\item \strong{CLSI M100: Performance Standard for Antimicrobial Susceptibility Testing}, 2011-2024, \emph{Clinical and Laboratory Standards Institute} (CLSI). \url{https://clsi.org/standards/products/microbiology/documents/m100/}. +\item \strong{CLSI VET01: Performance Standards for Antimicrobial Disk and Dilution Susceptibility Tests for Bacteria Isolated From Animals}, 2019-2024, \emph{Clinical and Laboratory Standards Institute} (CLSI). \url{https://clsi.org/standards/products/veterinary-medicine/documents/vet01/}. +\item \strong{CLSI VET09: Understanding Susceptibility Test Data as a Component of Antimicrobial Stewardship in Veterinary Settings}, 2019-2024, \emph{Clinical and Laboratory Standards Institute} (CLSI). \url{https://clsi.org/standards/products/veterinary-medicine/documents/vet09/}. +\item \strong{EUCAST Breakpoint tables for interpretation of MICs and zone diameters}, 2011-2024, \emph{European Committee on Antimicrobial Susceptibility Testing} (EUCAST). \url{https://www.eucast.org/clinical_breakpoints}. +\item \strong{WHONET} as a source for machine-reading the clinical breakpoints (\href{https://msberends.github.io/AMR/reference/clinical_breakpoints.html#imported-from-whonet}{read more here}), 1989-2024, \emph{WHO Collaborating Centre for Surveillance of Antimicrobial Resistance}. \url{https://whonet.org/}. +} +} +\usage{ +as.sir(x, ...) + +NA_sir_ + +is.sir(x) + +is_sir_eligible(x, threshold = 0.05) + +\method{as.sir}{default}( + x, + S = "^(S|U)+$", + I = "^(I)+$", + R = "^(R)+$", + NI = "^(N|NI|V)+$", + SDD = "^(SDD|D|H)+$", + ... +) + +\method{as.sir}{mic}( + x, + mo = NULL, + ab = deparse(substitute(x)), + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + conserve_capped_values = FALSE, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE, + ... +) + +\method{as.sir}{disk}( + x, + mo = NULL, + ab = deparse(substitute(x)), + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE, + ... +) + +\method{as.sir}{data.frame}( + x, + ..., + col_mo = NULL, + guideline = getOption("AMR_guideline", "EUCAST"), + uti = NULL, + conserve_capped_values = FALSE, + add_intrinsic_resistance = FALSE, + reference_data = AMR::clinical_breakpoints, + include_screening = getOption("AMR_include_screening", FALSE), + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + host = NULL, + verbose = FALSE +) + +sir_interpretation_history(clean = FALSE) +} +\arguments{ +\item{x}{vector of values (for class \code{\link{mic}}: MIC values in mg/L, for class \code{\link{disk}}: a disk diffusion radius in millimetres)} + +\item{...}{for using on a \link{data.frame}: names of columns to apply \code{\link[=as.sir]{as.sir()}} on (supports tidy selection such as \code{column1:column4}). Otherwise: arguments passed on to methods.} + +\item{threshold}{maximum fraction of invalid antimicrobial interpretations of \code{x}, see \emph{Examples}} + +\item{S, I, R, NI, SDD}{a case-independent \link[base:regex]{regular expression} to translate input to this result. This regular expression will be run \emph{after} all non-letters and whitespaces are removed from the input.} + +\item{mo}{a vector (or column name) with \link{character}s that can be coerced to valid microorganism codes with \code{\link[=as.mo]{as.mo()}}, can be left empty to determine it automatically} + +\item{ab}{a vector (or column name) with \link{character}s that can be coerced to a valid antimicrobial drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{guideline}{defaults to EUCAST 2024 (the latest implemented EUCAST guideline in the \link{clinical_breakpoints} data set), but can be set with the package option \code{\link[=AMR-options]{AMR_guideline}}. Currently supports EUCAST (2011-2024) and CLSI (2011-2024), see \emph{Details}.} + +\item{uti}{(Urinary Tract Infection) a vector (or column name) with \link{logical}s (\code{TRUE} or \code{FALSE}) to specify whether a UTI specific interpretation from the guideline should be chosen. For using \code{\link[=as.sir]{as.sir()}} on a \link{data.frame}, this can also be a column containing \link{logical}s or when left blank, the data set will be searched for a column 'specimen', and rows within this column containing 'urin' (such as 'urine', 'urina') will be regarded isolates from a UTI. See \emph{Examples}.} + +\item{conserve_capped_values}{a \link{logical} to indicate that MIC values starting with \code{">"} (but not \code{">="}) must always return "R" , and that MIC values starting with \code{"<"} (but not \code{"<="}) must always return "S"} + +\item{add_intrinsic_resistance}{\emph{(only useful when using a EUCAST guideline)} a \link{logical} to indicate whether intrinsic antibiotic resistance must also be considered for applicable bug-drug combinations, meaning that e.g. ampicillin will always return "R" in \emph{Klebsiella} species. Determination is based on the \link{intrinsic_resistant} data set, that itself is based on \href{https://www.eucast.org/expert_rules_and_expected_phenotypes}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021).} + +\item{reference_data}{a \link{data.frame} to be used for interpretation, which defaults to the \link{clinical_breakpoints} data set. Changing this argument allows for using own interpretation guidelines. This argument must contain a data set that is equal in structure to the \link{clinical_breakpoints} data set (same column names and column types). Please note that the \code{guideline} argument will be ignored when \code{reference_data} is manually set.} + +\item{include_screening}{a \link{logical} to indicate that clinical breakpoints for screening are allowed - the default is \code{FALSE}. Can also be set with the package option \code{\link[=AMR-options]{AMR_include_screening}}.} + +\item{include_PKPD}{a \link{logical} to indicate that PK/PD clinical breakpoints must be applied as a last resort - the default is \code{TRUE}. Can also be set with the package option \code{\link[=AMR-options]{AMR_include_PKPD}}.} + +\item{breakpoint_type}{the type of breakpoints to use, either "ECOFF", "animal", or "human". ECOFF stands for Epidemiological Cut-Off values. The default is \code{"human"}, which can also be set with the package option \code{\link[=AMR-options]{AMR_breakpoint_type}}. If \code{host} is set to values of veterinary species, this will automatically be set to \code{"animal"}.} + +\item{host}{a vector (or column name) with \link{character}s to indicate the host. Only useful for veterinary breakpoints, as it requires \code{breakpoint_type = "animal"}. The values can be any text resembling the animal species, even in any of the 20 supported languages of this package. For foreign languages, be sure to set the language with \code{\link[=set_AMR_locale]{set_AMR_locale()}} (though it will be automatically guessed based on the system language).} + +\item{verbose}{a \link{logical} to indicate that all notes should be printed during interpretation of MIC values or disk diffusion values.} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{clean}{a \link{logical} to indicate whether previously stored results should be forgotten after returning the 'logbook' with results} +} +\value{ +Ordered \link{factor} with new class \code{sir} +} +\description{ +Clean up existing SIR values, or interpret minimum inhibitory concentration (MIC) values and disk diffusion diameters according to EUCAST or CLSI. \code{\link[=as.sir]{as.sir()}} transforms the input to a new class \code{\link{sir}}, which is an ordered \link{factor} containing the levels \code{S}, \code{SDD}, \code{I}, \code{R}, \code{NI}. + +These breakpoints are currently implemented: +\itemize{ +\item For \strong{clinical microbiology}: EUCAST 2011-2024 and CLSI 2011-2024; +\item For \strong{veterinary microbiology}: EUCAST 2021-2024 and CLSI 2019-2024; +\item For \strong{ECOFFs} (Epidemiological Cut-off Values): EUCAST 2020-2024 and CLSI 2022-2024. +} + +All breakpoints used for interpretation are available in our \link{clinical_breakpoints} data set. +} +\details{ +\emph{Note: The clinical breakpoints in this package were validated through, and imported from, \href{https://whonet.org}{WHONET}. The public use of this \code{AMR} package has been endorsed by both CLSI and EUCAST. See \link{clinical_breakpoints} for more information.} +\subsection{How it Works}{ + +The \code{\link[=as.sir]{as.sir()}} function can work in four ways: +\enumerate{ +\item For \strong{cleaning raw / untransformed data}. The data will be cleaned to only contain valid values, namely: \strong{S} for susceptible, \strong{I} for intermediate or 'susceptible, increased exposure', \strong{R} for resistant, \strong{NI} for non-interpretable, and \strong{SDD} for susceptible dose-dependent. Each of these can be set using a \link[base:regex]{regular expression}. Furthermore, \code{\link[=as.sir]{as.sir()}} will try its best to clean with some intelligence. For example, mixed values with SIR interpretations and MIC values such as \code{"<0.25; S"} will be coerced to \code{"S"}. Combined interpretations for multiple test methods (as seen in laboratory records) such as \code{"S; S"} will be coerced to \code{"S"}, but a value like \code{"S; I"} will return \code{NA} with a warning that the input is invalid. +\item For \strong{interpreting minimum inhibitory concentration (MIC) values} according to EUCAST or CLSI. You must clean your MIC values first using \code{\link[=as.mic]{as.mic()}}, that also gives your columns the new data class \code{\link{mic}}. Also, be sure to have a column with microorganism names or codes. It will be found automatically, but can be set manually using the \code{mo} argument. +\itemize{ +\item Using \code{dplyr}, SIR interpretation can be done very easily with either: + +\if{html}{\out{
}}\preformatted{your_data \%>\% mutate_if(is.mic, as.sir) +your_data \%>\% mutate(across(where(is.mic), as.sir)) +your_data \%>\% mutate_if(is.mic, as.sir, ab = "column_with_antibiotics", mo = "column_with_microorganisms") +your_data \%>\% mutate_if(is.mic, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...)) + +# for veterinary breakpoints, also set `host`: +your_data \%>\% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI") +}\if{html}{\out{
}} +\item Operators like "<=" will be stripped before interpretation. When using \code{conserve_capped_values = TRUE}, an MIC value of e.g. ">2" will always return "R", even if the breakpoint according to the chosen guideline is ">=4". This is to prevent that capped values from raw laboratory data would not be treated conservatively. The default behaviour (\code{conserve_capped_values = FALSE}) considers ">2" to be lower than ">=4" and might in this case return "S" or "I". +} +\item For \strong{interpreting disk diffusion diameters} according to EUCAST or CLSI. You must clean your disk zones first using \code{\link[=as.disk]{as.disk()}}, that also gives your columns the new data class \code{\link{disk}}. Also, be sure to have a column with microorganism names or codes. It will be found automatically, but can be set manually using the \code{mo} argument. +\itemize{ +\item Using \code{dplyr}, SIR interpretation can be done very easily with either: + +\if{html}{\out{
}}\preformatted{your_data \%>\% mutate_if(is.disk, as.sir) +your_data \%>\% mutate(across(where(is.disk), as.sir)) +your_data \%>\% mutate_if(is.disk, as.sir, ab = "column_with_antibiotics", mo = "column_with_microorganisms") +your_data \%>\% mutate_if(is.disk, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...)) + +# for veterinary breakpoints, also set `host`: +your_data \%>\% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI") +}\if{html}{\out{
}} +} +\item For \strong{interpreting a complete data set}, with automatic determination of MIC values, disk diffusion diameters, microorganism names or codes, and antimicrobial test results. This is done very simply by running \code{as.sir(your_data)}. +} + +\strong{For points 2, 3 and 4: Use \code{\link[=sir_interpretation_history]{sir_interpretation_history()}}} to retrieve a \link{data.frame} (or \link[tibble:tibble]{tibble} if the \code{tibble} package is installed) with all results of the last \code{\link[=as.sir]{as.sir()}} call. +} + +\subsection{Supported Guidelines}{ + +For interpreting MIC values as well as disk diffusion diameters, currently implemented guidelines are for \strong{clinical microbiology}: EUCAST 2011-2024 and CLSI 2011-2024, and for \strong{veterinary microbiology}: EUCAST 2021-2024 and CLSI 2019-2024. + +Thus, the \code{guideline} argument must be set to e.g., \code{"EUCAST 2024"} or \code{"CLSI 2024"}. By simply using \code{"EUCAST"} (the default) or \code{"CLSI"} as input, the latest included version of that guideline will automatically be selected. You can set your own data set using the \code{reference_data} argument. The \code{guideline} argument will then be ignored. + +You can set the default guideline with the package option \code{\link[=AMR-options]{AMR_guideline}} (e.g. in your \code{.Rprofile} file), such as: + +\if{html}{\out{
}}\preformatted{ options(AMR_guideline = "CLSI") + options(AMR_guideline = "CLSI 2018") + options(AMR_guideline = "EUCAST 2020") + # or to reset: + options(AMR_guideline = NULL) +}\if{html}{\out{
}} + +For veterinary guidelines, these might be the best options: + +\if{html}{\out{
}}\preformatted{ options(AMR_guideline = "CLSI") + options(AMR_breakpoint_type = "animal") +}\if{html}{\out{
}} + +When applying veterinary breakpoints (by setting \code{host} or by setting \code{breakpoint_type = "animal"}), the \href{https://clsi.org/standards/products/veterinary-medicine/documents/vet09/}{CLSI VET09 guideline} will be applied to cope with missing animal species-specific breakpoints. +} + +\subsection{After Interpretation}{ + +After using \code{\link[=as.sir]{as.sir()}}, you can use the \code{\link[=eucast_rules]{eucast_rules()}} defined by EUCAST to (1) apply inferred susceptibility and resistance based on results of other antimicrobials and (2) apply intrinsic resistance based on taxonomic properties of a microorganism. + +To determine which isolates are multi-drug resistant, be sure to run \code{\link[=mdro]{mdro()}} (which applies the MDR/PDR/XDR guideline from 2012 at default) on a data set that contains S/I/R values. Read more about \link[=mdro]{interpreting multidrug-resistant organisms here}. +} + +\subsection{Machine-Readable Clinical Breakpoints}{ + +The repository of this package \href{https://github.com/msberends/AMR/blob/main/data-raw/clinical_breakpoints.txt}{contains a machine-readable version} of all guidelines. This is a CSV file consisting of 34 063 rows and 14 columns. This file is machine-readable, since it contains one row for every unique combination of the test method (MIC or disk diffusion), the antimicrobial drug and the microorganism. \strong{This allows for easy implementation of these rules in laboratory information systems (LIS)}. Note that it only contains interpretation guidelines for humans - interpretation guidelines from CLSI for animals were removed. +} + +\subsection{Other}{ + +The function \code{\link[=is.sir]{is.sir()}} detects if the input contains class \code{sir}. If the input is a \link{data.frame}, it iterates over all columns and returns a \link{logical} vector. + +The base R function \code{\link[=as.double]{as.double()}} can be used to retrieve quantitative values from a \code{sir} object: \code{"S"} = 1, \code{"I"}/\code{"SDD"} = 2, \code{"R"} = 3. All other values are rendered \code{NA} . \strong{Note:} Do not use \code{as.integer()}, since that (because of how R works internally) will return the factor level indices, and not these aforementioned quantitative values. + +The function \code{\link[=is_sir_eligible]{is_sir_eligible()}} returns \code{TRUE} when a column contains at most 5\% invalid antimicrobial interpretations (not S and/or I and/or R and/or NI and/or SDD), and \code{FALSE} otherwise. The threshold of 5\% can be set with the \code{threshold} argument. If the input is a \link{data.frame}, it iterates over all columns and returns a \link{logical} vector. +} + +\code{NA_sir_} is a missing value of the new \code{sir} class, analogous to e.g. base \R's \code{\link[base:NA]{NA_character_}}. +} +\section{Interpretation of SIR}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (\url{https://www.eucast.org/newsiandr}): +\itemize{ +\item \strong{S - Susceptible, standard dosing regimen}\cr +A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +\item \strong{I - Susceptible, increased exposure} \emph{\cr +A microorganism is categorised as "Susceptible, Increased exposure}" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +\item \strong{R = Resistant}\cr +A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +\itemize{ +\item \emph{Exposure} is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +} +} + +This AMR package honours this insight. Use \code{\link[=susceptibility]{susceptibility()}} (equal to \code{\link[=proportion_SI]{proportion_SI()}}) to determine antimicrobial susceptibility and \code{\link[=count_susceptible]{count_susceptible()}} (equal to \code{\link[=count_SI]{count_SI()}}) to count susceptible isolates. +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +example_isolates +summary(example_isolates) # see all SIR results at a glance + +# For INTERPRETING disk diffusion and MIC values ----------------------- + +# example data sets, with combined MIC values and disk zones +df_wide <- data.frame( + microorganism = "Escherichia coli", + amoxicillin = as.mic(8), + cipro = as.mic(0.256), + tobra = as.disk(16), + genta = as.disk(18), + ERY = "R" +) +df_long <- data.frame( + bacteria = rep("Escherichia coli", 4), + antibiotic = c("amoxicillin", "cipro", "tobra", "genta"), + mics = as.mic(c(0.01, 1, 4, 8)), + disks = as.disk(c(6, 10, 14, 18)) +) + +\donttest{ +## Using dplyr ------------------------------------------------- +if (require("dplyr")) { + # approaches that all work without additional arguments: + df_wide \%>\% mutate_if(is.mic, as.sir) + df_wide \%>\% mutate_if(function(x) is.mic(x) | is.disk(x), as.sir) + df_wide \%>\% mutate(across(where(is.mic), as.sir)) + df_wide \%>\% mutate_at(vars(amoxicillin:tobra), as.sir) + df_wide \%>\% mutate(across(amoxicillin:tobra, as.sir)) + + # approaches that all work with additional arguments: + df_long \%>\% + # given a certain data type, e.g. MIC values + mutate_if(is.mic, as.sir, + mo = "bacteria", + ab = "antibiotic", + guideline = "CLSI") + df_long \%>\% + mutate(across(where(is.mic), + function(x) as.sir(x, + mo = "bacteria", + ab = "antibiotic", + guideline = "CLSI"))) + df_wide \%>\% + # given certain columns, e.g. from 'cipro' to 'genta' + mutate_at(vars(cipro:genta), as.sir, + mo = "bacteria", + guideline = "CLSI") + df_wide \%>\% + mutate(across(cipro:genta, + function(x) as.sir(x, + mo = "bacteria", + guideline = "CLSI"))) + + # for veterinary breakpoints, add 'host': + df_long$animal_species <- c("cats", "dogs", "horses", "cattle") + df_long \%>\% + # given a certain data type, e.g. MIC values + mutate_if(is.mic, as.sir, + mo = "bacteria", + ab = "antibiotic", + host = "animal_species", + guideline = "CLSI") + df_long \%>\% + mutate(across(where(is.mic), + function(x) as.sir(x, + mo = "bacteria", + ab = "antibiotic", + host = "animal_species", + guideline = "CLSI"))) + df_wide \%>\% + mutate_at(vars(cipro:genta), as.sir, + mo = "bacteria", + ab = "antibiotic", + host = "animal_species", + guideline = "CLSI") + df_wide \%>\% + mutate(across(cipro:genta, + function(x) as.sir(x, + mo = "bacteria", + host = "animal_species", + guideline = "CLSI"))) + + # to include information about urinary tract infections (UTI) + data.frame(mo = "E. coli", + nitrofuratoin = c("<= 2", 32), + from_the_bladder = c(TRUE, FALSE)) \%>\% + as.sir(uti = "from_the_bladder") + + data.frame(mo = "E. coli", + nitrofuratoin = c("<= 2", 32), + specimen = c("urine", "blood")) \%>\% + as.sir() # automatically determines urine isolates + + df_wide \%>\% + mutate_at(vars(cipro:genta), as.sir, mo = "E. coli", uti = TRUE) +} + + +## Using base R ------------------------------------------------ + +as.sir(df_wide) + +# return a 'logbook' about the results: +sir_interpretation_history() + +# for single values +as.sir( + x = as.mic(2), + mo = as.mo("S. pneumoniae"), + ab = "AMP", + guideline = "EUCAST" +) + +as.sir( + x = as.disk(18), + mo = "Strep pneu", # `mo` will be coerced with as.mo() + ab = "ampicillin", # and `ab` with as.ab() + guideline = "EUCAST" +) + + +# For CLEANING existing SIR values ------------------------------------ + +as.sir(c("S", "SDD", "I", "R", "NI", "A", "B", "C")) +as.sir("<= 0.002; S") # will return "S" +sir_data <- as.sir(c(rep("S", 474), rep("I", 36), rep("R", 370))) +is.sir(sir_data) +plot(sir_data) # for percentages +barplot(sir_data) # for frequencies + +# as common in R, you can use as.integer() to return factor indices: +as.integer(as.sir(c("S", "SDD", "I", "R", "NI", NA))) +# but for computational use, as.double() will return 1 for S, 2 for I/SDD, and 3 for R: +as.double(as.sir(c("S", "SDD", "I", "R", "NI", NA))) + +# the dplyr way +if (require("dplyr")) { + example_isolates \%>\% + mutate_at(vars(PEN:RIF), as.sir) + # same: + example_isolates \%>\% + as.sir(PEN:RIF) + + # fastest way to transform all columns with already valid AMR results to class `sir`: + example_isolates \%>\% + mutate_if(is_sir_eligible, as.sir) + + # since dplyr 1.0.0, this can also be: + # example_isolates \%>\% + # mutate(across(where(is_sir_eligible), as.sir)) +} +} +} +\seealso{ +\code{\link[=as.mic]{as.mic()}}, \code{\link[=as.disk]{as.disk()}}, \code{\link[=as.mo]{as.mo()}} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/atc_online.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/atc_online.R +\name{atc_online_property} +\alias{atc_online_property} +\alias{atc_online_groups} +\alias{atc_online_ddd} +\alias{atc_online_ddd_units} +\title{Get ATC Properties from WHOCC Website} +\source{ +\url{https://atcddd.fhi.no/atc_ddd_alterations__cumulative/ddd_alterations/abbrevations/} +} +\usage{ +atc_online_property( + atc_code, + property, + administration = "O", + url = "https://atcddd.fhi.no/atc_ddd_index/?code=\%s&showdescription=no", + url_vet = "https://atcddd.fhi.no/atcvet/atcvet_index/?code=\%s&showdescription=no" +) + +atc_online_groups(atc_code, ...) + +atc_online_ddd(atc_code, ...) + +atc_online_ddd_units(atc_code, ...) +} +\arguments{ +\item{atc_code}{a \link{character} (vector) with ATC code(s) of antibiotics, will be coerced with \code{\link[=as.ab]{as.ab()}} and \code{\link[=ab_atc]{ab_atc()}} internally if not a valid ATC code} + +\item{property}{property of an ATC code. Valid values are \code{"ATC"}, \code{"Name"}, \code{"DDD"}, \code{"U"} (\code{"unit"}), \code{"Adm.R"}, \code{"Note"} and \code{groups}. For this last option, all hierarchical groups of an ATC code will be returned, see \emph{Examples}.} + +\item{administration}{type of administration when using \code{property = "Adm.R"}, see \emph{Details}} + +\item{url}{url of website of the WHOCC. The sign \verb{\%s} can be used as a placeholder for ATC codes.} + +\item{url_vet}{url of website of the WHOCC for veterinary medicine. The sign \verb{\%s} can be used as a placeholder for ATC_vet codes (that all start with "Q").} + +\item{...}{arguments to pass on to \code{atc_property}} +} +\description{ +Gets data from the WHOCC website to determine properties of an Anatomical Therapeutic Chemical (ATC) (e.g. an antibiotic), such as the name, defined daily dose (DDD) or standard unit. +} +\details{ +Options for argument \code{administration}: +\itemize{ +\item \code{"Implant"} = Implant +\item \code{"Inhal"} = Inhalation +\item \code{"Instill"} = Instillation +\item \code{"N"} = nasal +\item \code{"O"} = oral +\item \code{"P"} = parenteral +\item \code{"R"} = rectal +\item \code{"SL"} = sublingual/buccal +\item \code{"TD"} = transdermal +\item \code{"V"} = vaginal +} + +Abbreviations of return values when using \code{property = "U"} (unit): +\itemize{ +\item \code{"g"} = gram +\item \code{"mg"} = milligram +\item \code{"mcg"} = microgram +\item \code{"U"} = unit +\item \code{"TU"} = thousand units +\item \code{"MU"} = million units +\item \code{"mmol"} = millimole +\item \code{"ml"} = millilitre (e.g. eyedrops) +} + +\strong{N.B. This function requires an internet connection and only works if the following packages are installed: \code{curl}, \code{rvest}, \code{xml2}.} +} +\examples{ +\donttest{ +if (requireNamespace("curl") && requireNamespace("rvest") && requireNamespace("xml2")) { + # oral DDD (Defined Daily Dose) of amoxicillin + atc_online_property("J01CA04", "DDD", "O") + atc_online_ddd(ab_atc("amox")) + + # parenteral DDD (Defined Daily Dose) of amoxicillin + atc_online_property("J01CA04", "DDD", "P") + + atc_online_property("J01CA04", property = "groups") # search hierarchical groups of amoxicillin +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/av_from_text.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/av_from_text.R +\name{av_from_text} +\alias{av_from_text} +\title{Retrieve Antiviral Drug Names and Doses from Clinical Text} +\usage{ +av_from_text( + text, + type = c("drug", "dose", "administration"), + collapse = NULL, + translate_av = FALSE, + thorough_search = NULL, + info = interactive(), + ... +) +} +\arguments{ +\item{text}{text to analyse} + +\item{type}{type of property to search for, either \code{"drug"}, \code{"dose"} or \code{"administration"}, see \emph{Examples}} + +\item{collapse}{a \link{character} to pass on to \code{paste(, collapse = ...)} to only return one \link{character} per element of \code{text}, see \emph{Examples}} + +\item{translate_av}{if \code{type = "drug"}: a column name of the \link{antivirals} data set to translate the antibiotic abbreviations to, using \code{\link[=av_property]{av_property()}}. The default is \code{FALSE}. Using \code{TRUE} is equal to using "name".} + +\item{thorough_search}{a \link{logical} to indicate whether the input must be extensively searched for misspelling and other faulty input values. Setting this to \code{TRUE} will take considerably more time than when using \code{FALSE}. At default, it will turn \code{TRUE} when all input elements contain a maximum of three words.} + +\item{info}{a \link{logical} to indicate whether a progress bar should be printed - the default is \code{TRUE} only in interactive mode} + +\item{...}{arguments passed on to \code{\link[=as.av]{as.av()}}} +} +\value{ +A \link{list}, or a \link{character} if \code{collapse} is not \code{NULL} +} +\description{ +Use this function on e.g. clinical texts from health care records. It returns a \link{list} with all antiviral drugs, doses and forms of administration found in the texts. +} +\details{ +This function is also internally used by \code{\link[=as.av]{as.av()}}, although it then only searches for the first drug name and will throw a note if more drug names could have been returned. Note: the \code{\link[=as.av]{as.av()}} function may use very long regular expression to match brand names of antiviral drugs. This may fail on some systems. +\subsection{Argument \code{type}}{ + +At default, the function will search for antiviral drug names. All text elements will be searched for official names, ATC codes and brand names. As it uses \code{\link[=as.av]{as.av()}} internally, it will correct for misspelling. + +With \code{type = "dose"} (or similar, like "dosing", "doses"), all text elements will be searched for \link{numeric} values that are higher than 100 and do not resemble years. The output will be \link{numeric}. It supports any unit (g, mg, IE, etc.) and multiple values in one clinical text, see \emph{Examples}. + +With \code{type = "administration"} (or abbreviations, like "admin", "adm"), all text elements will be searched for a form of drug administration. It supports the following forms (including common abbreviations): buccal, implant, inhalation, instillation, intravenous, nasal, oral, parenteral, rectal, sublingual, transdermal and vaginal. Abbreviations for oral (such as 'po', 'per os') will become "oral", all values for intravenous (such as 'iv', 'intraven') will become "iv". It supports multiple values in one clinical text, see \emph{Examples}. +} + +\subsection{Argument \code{collapse}}{ + +Without using \code{collapse}, this function will return a \link{list}. This can be convenient to use e.g. inside a \code{mutate()}):\cr +\code{df \%>\% mutate(avx = av_from_text(clinical_text))} + +The returned AV codes can be transformed to official names, groups, etc. with all \code{\link[=av_property]{av_*}} functions such as \code{\link[=av_name]{av_name()}} and \code{\link[=av_group]{av_group()}}, or by using the \code{translate_av} argument. + +With using \code{collapse}, this function will return a \link{character}:\cr +\code{df \%>\% mutate(avx = av_from_text(clinical_text, collapse = "|"))} +} +} +\examples{ +av_from_text("28/03/2020 valaciclovir po tid") +av_from_text("28/03/2020 valaciclovir po tid", type = "admin") +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/av_property.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/av_property.R +\name{av_property} +\alias{av_property} +\alias{av_name} +\alias{av_cid} +\alias{av_synonyms} +\alias{av_tradenames} +\alias{av_group} +\alias{av_atc} +\alias{av_loinc} +\alias{av_ddd} +\alias{av_ddd_units} +\alias{av_info} +\alias{av_url} +\title{Get Properties of an Antiviral Drug} +\usage{ +av_name(x, language = get_AMR_locale(), tolower = FALSE, ...) + +av_cid(x, ...) + +av_synonyms(x, ...) + +av_tradenames(x, ...) + +av_group(x, language = get_AMR_locale(), ...) + +av_atc(x, ...) + +av_loinc(x, ...) + +av_ddd(x, administration = "oral", ...) + +av_ddd_units(x, administration = "oral", ...) + +av_info(x, language = get_AMR_locale(), ...) + +av_url(x, open = FALSE, ...) + +av_property(x, property = "name", language = get_AMR_locale(), ...) +} +\arguments{ +\item{x}{any (vector of) text that can be coerced to a valid antiviral drug code with \code{\link[=as.av]{as.av()}}} + +\item{language}{language of the returned text - the default is system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{tolower}{a \link{logical} to indicate whether the first \link{character} of every output should be transformed to a lower case \link{character}.} + +\item{...}{other arguments passed on to \code{\link[=as.av]{as.av()}}} + +\item{administration}{way of administration, either \code{"oral"} or \code{"iv"}} + +\item{open}{browse the URL using \code{\link[utils:browseURL]{utils::browseURL()}}} + +\item{property}{one of the column names of one of the \link{antivirals} data set: \code{vector_or(colnames(antivirals), sort = FALSE)}.} +} +\value{ +\itemize{ +\item An \link{integer} in case of \code{\link[=av_cid]{av_cid()}} +\item A named \link{list} in case of \code{\link[=av_info]{av_info()}} and multiple \code{\link[=av_atc]{av_atc()}}/\code{\link[=av_synonyms]{av_synonyms()}}/\code{\link[=av_tradenames]{av_tradenames()}} +\item A \link{double} in case of \code{\link[=av_ddd]{av_ddd()}} +\item A \link{character} in all other cases +} +} +\description{ +Use these functions to return a specific property of an antiviral drug from the \link{antivirals} data set. All input values will be evaluated internally with \code{\link[=as.av]{as.av()}}. +} +\details{ +All output \link[=translate]{will be translated} where possible. + +The function \code{\link[=av_url]{av_url()}} will return the direct URL to the official WHO website. A warning will be returned if the required ATC code is not available. +} +\section{Source}{ + +World Health Organization (WHO) Collaborating Centre for Drug Statistics Methodology: \url{https://atcddd.fhi.no/atc_ddd_index/} + +European Commission Public Health PHARMACEUTICALS - COMMUNITY REGISTER: \url{https://ec.europa.eu/health/documents/community-register/html/reg_hum_atc.htm} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# all properties: +av_name("ACI") +av_atc("ACI") +av_cid("ACI") +av_synonyms("ACI") +av_tradenames("ACI") +av_group("ACI") +av_url("ACI") + +# lowercase transformation +av_name(x = c("ACI", "VALA")) +av_name(x = c("ACI", "VALA"), tolower = TRUE) + +# defined daily doses (DDD) +av_ddd("ACI", "oral") +av_ddd_units("ACI", "oral") +av_ddd("ACI", "iv") +av_ddd_units("ACI", "iv") + +av_info("ACI") # all properties as a list + +# all av_* functions use as.av() internally, so you can go from 'any' to 'any': +av_atc("ACI") +av_group("J05AB01") +av_loinc("abacavir") +av_name("29113-8") +av_name(135398513) +av_name("J05AB01") +} +\seealso{ +\link{antivirals} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/availability.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/availability.R +\name{availability} +\alias{availability} +\title{Check Availability of Columns} +\usage{ +availability(tbl, width = NULL) +} +\arguments{ +\item{tbl}{a \link{data.frame} or \link{list}} + +\item{width}{number of characters to present the visual availability - the default is filling the width of the console} +} +\value{ +\link{data.frame} with column names of \code{tbl} as row names +} +\description{ +Easy check for data availability of all columns in a data set. This makes it easy to get an idea of which antimicrobial combinations can be used for calculation with e.g. \code{\link[=susceptibility]{susceptibility()}} and \code{\link[=resistance]{resistance()}}. +} +\details{ +The function returns a \link{data.frame} with columns \code{"resistant"} and \code{"visual_resistance"}. The values in that columns are calculated with \code{\link[=resistance]{resistance()}}. +} +\examples{ +availability(example_isolates) +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + filter(mo == as.mo("Escherichia coli")) \%>\% + select_if(is.sir) \%>\% + availability() +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/bug_drug_combinations.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/bug_drug_combinations.R +\name{bug_drug_combinations} +\alias{bug_drug_combinations} +\alias{format.bug_drug_combinations} +\title{Determine Bug-Drug Combinations} +\usage{ +bug_drug_combinations(x, col_mo = NULL, FUN = mo_shortname, ...) + +\method{format}{bug_drug_combinations}( + x, + translate_ab = "name (ab, atc)", + language = get_AMR_locale(), + minimum = 30, + combine_SI = TRUE, + add_ab_group = TRUE, + remove_intrinsic_resistant = FALSE, + decimal.mark = getOption("OutDec"), + big.mark = ifelse(decimal.mark == ",", ".", ","), + ... +) +} +\arguments{ +\item{x}{a data set with antibiotic columns, such as \code{amox}, \code{AMX} and \code{AMC}} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{FUN}{the function to call on the \code{mo} column to transform the microorganism codes - the default is \code{\link[=mo_shortname]{mo_shortname()}}} + +\item{...}{arguments passed on to \code{FUN}} + +\item{translate_ab}{a \link{character} of length 1 containing column names of the \link{antibiotics} data set} + +\item{language}{language of the returned text - the default is the current system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{minimum}{the minimum allowed number of available (tested) isolates. Any isolate count lower than \code{minimum} will return \code{NA} with a warning. The default number of \code{30} isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see \emph{Source}.} + +\item{combine_SI}{a \link{logical} to indicate whether values S, SDD, and I should be summed, so resistance will be based on only R - the default is \code{TRUE}} + +\item{add_ab_group}{a \link{logical} to indicate where the group of the antimicrobials must be included as a first column} + +\item{remove_intrinsic_resistant}{\link{logical} to indicate that rows and columns with 100\% resistance for all tested antimicrobials must be removed from the table} + +\item{decimal.mark}{the character to be used to indicate the numeric + decimal point.} + +\item{big.mark}{character; if not empty used as mark between every + \code{big.interval} decimals \emph{before} (hence \code{big}) the + decimal point.} +} +\value{ +The function \code{\link[=bug_drug_combinations]{bug_drug_combinations()}} returns a \link{data.frame} with columns "mo", "ab", "S", "SDD", "I", "R", and "total". +} +\description{ +Determine antimicrobial resistance (AMR) of all bug-drug combinations in your data set where at least 30 (default) isolates are available per species. Use \code{\link[=format]{format()}} on the result to prettify it to a publishable/printable format, see \emph{Examples}. +} +\details{ +The function \code{\link[=format]{format()}} calculates the resistance per bug-drug combination and returns a table ready for reporting/publishing. Use \code{combine_SI = TRUE} (default) to test R vs. S+I and \code{combine_SI = FALSE} to test R+I vs. S. This table can also directly be used in R Markdown / Quarto without the need for e.g. \code{\link[knitr:kable]{knitr::kable()}}. +} +\examples{ +# example_isolates is a data set available in the AMR package. +# run ?example_isolates for more info. +example_isolates + +\donttest{ +x <- bug_drug_combinations(example_isolates) +head(x) +format(x, translate_ab = "name (atc)") + +# Use FUN to change to transformation of microorganism codes +bug_drug_combinations(example_isolates, + FUN = mo_gramstain +) + +bug_drug_combinations(example_isolates, + FUN = function(x) { + ifelse(x == as.mo("Escherichia coli"), + "E. coli", + "Others" + ) + } +) +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/clinical_breakpoints.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{clinical_breakpoints} +\alias{clinical_breakpoints} +\title{Data Set with Clinical Breakpoints for SIR Interpretation} +\format{ +A \link[tibble:tibble]{tibble} with 34 063 observations and 14 variables: +\itemize{ +\item \code{guideline}\cr Name of the guideline +\item \code{type}\cr Breakpoint type, either "ECOFF", "animal", or "human" +\item \code{host}\cr Host of infectious agent. This is mostly useful for veterinary breakpoints and is either "ECOFF", "aquatic", "cats", "cattle", "dogs", "horse", "human", "poultry", or "swine" +\item \code{method}\cr Testing method, either "DISK" or "MIC" +\item \code{site}\cr Body site for which the breakpoint must be applied, e.g. "Oral" or "Respiratory" +\item \code{mo}\cr Microbial ID, see \code{\link[=as.mo]{as.mo()}} +\item \code{rank_index}\cr Taxonomic rank index of \code{mo} from 1 (subspecies/infraspecies) to 5 (unknown microorganism) +\item \code{ab}\cr Antibiotic code as used by this package, EARS-Net and WHONET, see \code{\link[=as.ab]{as.ab()}} +\item \code{ref_tbl}\cr Info about where the guideline rule can be found +\item \code{disk_dose}\cr Dose of the used disk diffusion method +\item \code{breakpoint_S}\cr Lowest MIC value or highest number of millimetres that leads to "S" +\item \code{breakpoint_R}\cr Highest MIC value or lowest number of millimetres that leads to "R" +\item \code{uti}\cr A \link{logical} value (\code{TRUE}/\code{FALSE}) to indicate whether the rule applies to a urinary tract infection (UTI) +\item \code{is_SDD}\cr A \link{logical} value (\code{TRUE}/\code{FALSE}) to indicate whether the intermediate range between "S" and "R" should be interpreted as "SDD", instead of "I". This currently applies to 24 breakpoints. +} +} +\usage{ +clinical_breakpoints +} +\description{ +Data set containing clinical breakpoints to interpret MIC and disk diffusion to SIR values, according to international guidelines. This dataset contain breakpoints for humans, 7 different animal groups, and ECOFFs. + +These breakpoints are currently implemented: +\itemize{ +\item For \strong{clinical microbiology}: EUCAST 2011-2024 and CLSI 2011-2024; +\item For \strong{veterinary microbiology}: EUCAST 2021-2024 and CLSI 2019-2024; +\item For \strong{ECOFFs} (Epidemiological Cut-off Values): EUCAST 2020-2024 and CLSI 2022-2024. +} + +Use \code{\link[=as.sir]{as.sir()}} to transform MICs or disks measurements to SIR values. +} +\details{ +\subsection{Different types of breakpoints}{ + +Supported types of breakpoints are ECOFF, animal, and human. ECOFF (Epidemiological cut-off) values are used in antimicrobial susceptibility testing to differentiate between wild-type and non-wild-type strains of bacteria or fungi. + +The default is \code{"human"}, which can also be set with the package option \code{\link[=AMR-options]{AMR_breakpoint_type}}. Use \code{\link[=as.sir]{as.sir(..., breakpoint_type = ...)}} to interpret raw data using a specific breakpoint type, e.g. \code{as.sir(..., breakpoint_type = "ECOFF")} to use ECOFFs. +} + +\subsection{Imported from WHONET}{ + +Clinical breakpoints in this package were validated through and imported from \href{https://whonet.org}{WHONET}, a free desktop Windows application developed and supported by the WHO Collaborating Centre for Surveillance of Antimicrobial Resistance. More can be read on \href{https://whonet.org}{their website}. The developers of WHONET and this \code{AMR} package have been in contact about sharing their work. We highly appreciate their great development on the WHONET software. +} + +\subsection{Response from CLSI and EUCAST}{ + +The CEO of CLSI and the chairman of EUCAST have endorsed the work and public use of this \code{AMR} package (and consequently the use of their breakpoints) in June 2023, when future development of distributing clinical breakpoints was discussed in a meeting between CLSI, EUCAST, WHO, developers of WHONET software, and developers of this \code{AMR} package. +} + +\subsection{Download}{ + +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. They allow for machine reading EUCAST and CLSI guidelines, which is almost impossible with the MS Excel and PDF files distributed by EUCAST and CLSI, though initiatives have started to overcome these burdens. + +\strong{NOTE:} this \code{AMR} package (and the WHONET software as well) contains rather complex internal methods to apply the guidelines. For example, some breakpoints must be applied on certain species groups (which are in case of this package available through the \link{microorganisms.groups} data set). It is important that this is considered when using the breakpoints for own use. +} +} +\examples{ +clinical_breakpoints +} +\seealso{ +\link{intrinsic_resistant} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/count.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/count.R +\name{count} +\alias{count} +\alias{count_resistant} +\alias{count_susceptible} +\alias{count_S} +\alias{count_SI} +\alias{count_I} +\alias{count_IR} +\alias{count_R} +\alias{count_all} +\alias{n_sir} +\alias{count_df} +\title{Count Available Isolates} +\usage{ +count_resistant(..., only_all_tested = FALSE) + +count_susceptible(..., only_all_tested = FALSE) + +count_S(..., only_all_tested = FALSE) + +count_SI(..., only_all_tested = FALSE) + +count_I(..., only_all_tested = FALSE) + +count_IR(..., only_all_tested = FALSE) + +count_R(..., only_all_tested = FALSE) + +count_all(..., only_all_tested = FALSE) + +n_sir(..., only_all_tested = FALSE) + +count_df( + data, + translate_ab = "name", + language = get_AMR_locale(), + combine_SI = TRUE +) +} +\arguments{ +\item{...}{one or more vectors (or columns) with antibiotic interpretations. They will be transformed internally with \code{\link[=as.sir]{as.sir()}} if needed.} + +\item{only_all_tested}{(for combination therapies, i.e. using more than one variable for \code{...}): a \link{logical} to indicate that isolates must be tested for all antibiotics, see section \emph{Combination Therapy} below} + +\item{data}{a \link{data.frame} containing columns with class \code{\link{sir}} (see \code{\link[=as.sir]{as.sir()}})} + +\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}} + +\item{language}{language of the returned text - the default is the current system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{combine_SI}{a \link{logical} to indicate whether all values of S, SDD, and I must be merged into one, so the output only consists of S+SDD+I vs. R (susceptible vs. resistant) - the default is \code{TRUE}} +} +\value{ +An \link{integer} +} +\description{ +These functions can be used to count resistant/susceptible microbial isolates. All functions support quasiquotation with pipes, can be used in \code{summarise()} from the \code{dplyr} package and also support grouped variables, see \emph{Examples}. + +\code{\link[=count_resistant]{count_resistant()}} should be used to count resistant isolates, \code{\link[=count_susceptible]{count_susceptible()}} should be used to count susceptible isolates. +} +\details{ +These functions are meant to count isolates. Use the \code{\link[=resistance]{resistance()}}/\code{\link[=susceptibility]{susceptibility()}} functions to calculate microbial resistance/susceptibility. + +The function \code{\link[=count_resistant]{count_resistant()}} is equal to the function \code{\link[=count_R]{count_R()}}. The function \code{\link[=count_susceptible]{count_susceptible()}} is equal to the function \code{\link[=count_SI]{count_SI()}}. + +The function \code{\link[=n_sir]{n_sir()}} is an alias of \code{\link[=count_all]{count_all()}}. They can be used to count all available isolates, i.e. where all input antibiotics have an available result (S, I or R). Their use is equal to \code{n_distinct()}. Their function is equal to \code{count_susceptible(...) + count_resistant(...)}. + +The function \code{\link[=count_df]{count_df()}} takes any variable from \code{data} that has an \code{\link{sir}} class (created with \code{\link[=as.sir]{as.sir()}}) and counts the number of S's, I's and R's. It also supports grouped variables. The function \code{\link[=sir_df]{sir_df()}} works exactly like \code{\link[=count_df]{count_df()}}, but adds the percentage of S, I and R. +} +\section{Interpretation of SIR}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (\url{https://www.eucast.org/newsiandr}): +\itemize{ +\item \strong{S - Susceptible, standard dosing regimen}\cr +A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +\item \strong{I - Susceptible, increased exposure} \emph{\cr +A microorganism is categorised as "Susceptible, Increased exposure}" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +\item \strong{R = Resistant}\cr +A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +\itemize{ +\item \emph{Exposure} is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +} +} + +This AMR package honours this insight. Use \code{\link[=susceptibility]{susceptibility()}} (equal to \code{\link[=proportion_SI]{proportion_SI()}}) to determine antimicrobial susceptibility and \code{\link[=count_susceptible]{count_susceptible()}} (equal to \code{\link[=count_SI]{count_SI()}}) to count susceptible isolates. +} + +\section{Combination Therapy}{ + +When using more than one variable for \code{...} (= combination therapy), use \code{only_all_tested} to only count isolates that are tested for all antibiotics/variables that you test them for. See this example for two antibiotics, Drug A and Drug B, about how \code{\link[=susceptibility]{susceptibility()}} works to calculate the \%SI: + +\if{html}{\out{
}}\preformatted{-------------------------------------------------------------------- + only_all_tested = FALSE only_all_tested = TRUE + ----------------------- ----------------------- + Drug A Drug B include as include as include as include as + numerator denominator numerator denominator +-------- -------- ---------- ----------- ---------- ----------- + S or I S or I X X X X + R S or I X X X X + S or I X X - - + S or I R X X X X + R R - X - X + R - - - - + S or I X X - - + R - - - - + - - - - +-------------------------------------------------------------------- +}\if{html}{\out{
}} + +Please note that, in combination therapies, for \code{only_all_tested = TRUE} applies that: + +\if{html}{\out{
}}\preformatted{ count_S() + count_I() + count_R() = count_all() + proportion_S() + proportion_I() + proportion_R() = 1 +}\if{html}{\out{
}} + +and that, in combination therapies, for \code{only_all_tested = FALSE} applies that: + +\if{html}{\out{
}}\preformatted{ count_S() + count_I() + count_R() >= count_all() + proportion_S() + proportion_I() + proportion_R() >= 1 +}\if{html}{\out{
}} + +Using \code{only_all_tested} has no impact when only using one antibiotic as input. +} + +\examples{ +# example_isolates is a data set available in the AMR package. +# run ?example_isolates for more info. + +# base R ------------------------------------------------------------ +count_resistant(example_isolates$AMX) # counts "R" +count_susceptible(example_isolates$AMX) # counts "S" and "I" +count_all(example_isolates$AMX) # counts "S", "I" and "R" + +# be more specific +count_S(example_isolates$AMX) +count_SI(example_isolates$AMX) +count_I(example_isolates$AMX) +count_IR(example_isolates$AMX) +count_R(example_isolates$AMX) + +# Count all available isolates +count_all(example_isolates$AMX) +n_sir(example_isolates$AMX) + +# n_sir() is an alias of count_all(). +# Since it counts all available isolates, you can +# calculate back to count e.g. susceptible isolates. +# These results are the same: +count_susceptible(example_isolates$AMX) +susceptibility(example_isolates$AMX) * n_sir(example_isolates$AMX) + +# dplyr ------------------------------------------------------------- +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + group_by(ward) \%>\% + summarise( + R = count_R(CIP), + I = count_I(CIP), + S = count_S(CIP), + n1 = count_all(CIP), # the actual total; sum of all three + n2 = n_sir(CIP), # same - analogous to n_distinct + total = n() + ) # NOT the number of tested isolates! + + # Number of available isolates for a whole antibiotic class + # (i.e., in this data set columns GEN, TOB, AMK, KAN) + example_isolates \%>\% + group_by(ward) \%>\% + summarise(across(aminoglycosides(), n_sir)) + + # Count co-resistance between amoxicillin/clav acid and gentamicin, + # so we can see that combination therapy does a lot more than mono therapy. + # Please mind that `susceptibility()` calculates percentages right away instead. + example_isolates \%>\% count_susceptible(AMC) # 1433 + example_isolates \%>\% count_all(AMC) # 1879 + + example_isolates \%>\% count_susceptible(GEN) # 1399 + example_isolates \%>\% count_all(GEN) # 1855 + + example_isolates \%>\% count_susceptible(AMC, GEN) # 1764 + example_isolates \%>\% count_all(AMC, GEN) # 1936 + + # Get number of S+I vs. R immediately of selected columns + example_isolates \%>\% + select(AMX, CIP) \%>\% + count_df(translate = FALSE) + + # It also supports grouping variables + example_isolates \%>\% + select(ward, AMX, CIP) \%>\% + group_by(ward) \%>\% + count_df(translate = FALSE) +} +} +} +\seealso{ +\code{\link[=proportion]{proportion_*}} to calculate microbial resistance and susceptibility. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/custom_eucast_rules.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/custom_eucast_rules.R +\name{custom_eucast_rules} +\alias{custom_eucast_rules} +\title{Define Custom EUCAST Rules} +\usage{ +custom_eucast_rules(...) +} +\arguments{ +\item{...}{rules in \link[base:tilde]{formula} notation, see below for instructions, and in \emph{Examples}} +} +\value{ +A \link{list} containing the custom rules +} +\description{ +Define custom EUCAST rules for your organisation or specific analysis and use the output of this function in \code{\link[=eucast_rules]{eucast_rules()}}. +} +\details{ +Some organisations have their own adoption of EUCAST rules. This function can be used to define custom EUCAST rules to be used in the \code{\link[=eucast_rules]{eucast_rules()}} function. +} +\section{How it works}{ + +\subsection{Basics}{ + +If you are familiar with the \code{\link[dplyr:case_when]{case_when()}} function of the \code{dplyr} package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written \emph{before} the tilde (\code{~}) and the consequence of the rule is written \emph{after} the tilde: + +\if{html}{\out{
}}\preformatted{x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S", + TZP == "R" ~ aminopenicillins == "R") +}\if{html}{\out{
}} + +These are two custom EUCAST rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work: + +\if{html}{\out{
}}\preformatted{x +#> A set of custom EUCAST rules: +#> +#> 1. If TZP is "S" then set to S : +#> amoxicillin (AMX), ampicillin (AMP) +#> +#> 2. If TZP is "R" then set to R : +#> amoxicillin (AMX), ampicillin (AMP) +}\if{html}{\out{
}} + +The rules (the part \emph{before} the tilde, in above example \code{TZP == "S"} and \code{TZP == "R"}) must be evaluable in your data set: it should be able to run as a filter in your data set without errors. This means for the above example that the column \code{TZP} must exist. We will create a sample data set and test the rules set: + +\if{html}{\out{
}}\preformatted{df <- data.frame(mo = c("Escherichia coli", "Klebsiella pneumoniae"), + TZP = as.sir("R"), + ampi = as.sir("S"), + cipro = as.sir("S")) +df +#> mo TZP ampi cipro +#> 1 Escherichia coli R S S +#> 2 Klebsiella pneumoniae R S S + +eucast_rules(df, rules = "custom", custom_rules = x, info = FALSE) +#> mo TZP ampi cipro +#> 1 Escherichia coli R R S +#> 2 Klebsiella pneumoniae R R S +}\if{html}{\out{
}} +} + +\subsection{Using taxonomic properties in rules}{ + +There is one exception in columns used for the rules: all column names of the \link{microorganisms} data set can also be used, but do not have to exist in the data set. These column names are: "mo", "fullname", "status", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "oxygen_tolerance", "source", "lpsn", "lpsn_parent", "lpsn_renamed_to", "mycobank", "mycobank_parent", "mycobank_renamed_to", "gbif", "gbif_parent", "gbif_renamed_to", "prevalence", and "snomed". Thus, this next example will work as well, despite the fact that the \code{df} data set does not contain a column \code{genus}: + +\if{html}{\out{
}}\preformatted{y <- custom_eucast_rules(TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S", + TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R") + +eucast_rules(df, rules = "custom", custom_rules = y, info = FALSE) +#> mo TZP ampi cipro +#> 1 Escherichia coli R S S +#> 2 Klebsiella pneumoniae R R S +}\if{html}{\out{
}} +} + +\subsection{Usage of multiple antibiotics and antibiotic group names}{ + +You can define antibiotic groups instead of single antibiotics for the rule consequence, which is the part \emph{after} the tilde (~). In the examples above, the antibiotic group \code{aminopenicillins} includes both ampicillin and amoxicillin. + +Rules can also be applied to multiple antibiotics and antibiotic groups simultaneously. Use the \code{c()} function to combine multiple antibiotics. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R": + +\if{html}{\out{
}}\preformatted{x <- custom_eucast_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R") +x +#> A set of custom EUCAST rules: +#> +#> 1. If TZP is "R" then set to "R": +#> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP) +}\if{html}{\out{
}} + +These 30 antibiotic groups are allowed in the rules (case-insensitive) and can be used in any combination: +\itemize{ +\item aminoglycosides\cr(amikacin, amikacin/fosfomycin, apramycin, arbekacin, astromicin, bekanamycin, dibekacin, framycetin, gentamicin, gentamicin-high, habekacin, hygromycin, isepamicin, kanamycin, kanamycin-high, kanamycin/cephalexin, micronomicin, neomycin, netilmicin, pentisomicin, plazomicin, propikacin, ribostamycin, sisomicin, streptoduocin, streptomycin, streptomycin-high, tobramycin, and tobramycin-high) +\item aminopenicillins\cr(amoxicillin and ampicillin) +\item antifungals\cr(amorolfine, amphotericin B, amphotericin B-high, anidulafungin, butoconazole, caspofungin, ciclopirox, clotrimazole, econazole, fluconazole, flucytosine, fosfluconazole, griseofulvin, hachimycin, ibrexafungerp, isavuconazole, isoconazole, itraconazole, ketoconazole, manogepix, micafungin, miconazole, nystatin, oteseconazole, pimaricin, posaconazole, rezafungin, ribociclib, sulconazole, terbinafine, terconazole, and voriconazole) +\item antimycobacterials\cr(4-aminosalicylic acid, calcium aminosalicylate, capreomycin, clofazimine, delamanid, enviomycin, ethambutol, ethambutol/isoniazid, ethionamide, isoniazid, isoniazid/sulfamethoxazole/trimethoprim/pyridoxine, morinamide, p-aminosalicylic acid, pretomanid, protionamide, pyrazinamide, rifabutin, rifampicin, rifampicin/ethambutol/isoniazid, rifampicin/isoniazid, rifampicin/pyrazinamide/ethambutol/isoniazid, rifampicin/pyrazinamide/isoniazid, rifamycin, rifapentine, simvastatin/fenofibrate, sodium aminosalicylate, streptomycin/isoniazid, terizidone, thioacetazone, thioacetazone/isoniazid, tiocarlide, and viomycin) +\item betalactams\cr(amoxicillin, amoxicillin/clavulanic acid, amoxicillin/sulbactam, ampicillin, ampicillin/sulbactam, apalcillin, aspoxicillin, avibactam, azidocillin, azlocillin, aztreonam, aztreonam/avibactam, aztreonam/nacubactam, bacampicillin, benzathine benzylpenicillin, benzathine phenoxymethylpenicillin, benzylpenicillin, biapenem, carbenicillin, carindacillin, cefacetrile, cefaclor, cefadroxil, cefalexin, cefaloridine, cefalotin, cefamandole, cefapirin, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/nacubactam, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefiderocol, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/tazobactam, ceftriaxone, ceftriaxone/beta-lactamase inhibitor, cefuroxime, cefuroxime axetil, cephradine, ciclacillin, clometocillin, cloxacillin, dicloxacillin, doripenem, epicillin, ertapenem, flucloxacillin, hetacillin, imipenem, imipenem/EDTA, imipenem/relebactam, latamoxef, lenampicillin, loracarbef, mecillinam, meropenem, meropenem/nacubactam, meropenem/vaborbactam, metampicillin, meticillin, mezlocillin, mezlocillin/sulbactam, nacubactam, nafcillin, oxacillin, panipenem, penamecillin, penicillin/novobiocin, penicillin/sulbactam, pheneticillin, phenoxymethylpenicillin, piperacillin, piperacillin/sulbactam, piperacillin/tazobactam, piridicillin, pivampicillin, pivmecillinam, procaine benzylpenicillin, propicillin, razupenem, ritipenem, ritipenem acoxil, sarmoxicillin, sulbactam, sulbenicillin, sultamicillin, talampicillin, tazobactam, tebipenem, temocillin, ticarcillin, and ticarcillin/clavulanic acid) +\item carbapenems\cr(biapenem, doripenem, ertapenem, imipenem, imipenem/EDTA, imipenem/relebactam, meropenem, meropenem/nacubactam, meropenem/vaborbactam, panipenem, razupenem, ritipenem, ritipenem acoxil, and tebipenem) +\item cephalosporins\cr(cefacetrile, cefaclor, cefadroxil, cefalexin, cefaloridine, cefalotin, cefamandole, cefapirin, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefiderocol, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/tazobactam, ceftriaxone, ceftriaxone/beta-lactamase inhibitor, cefuroxime, cefuroxime axetil, cephradine, latamoxef, and loracarbef) +\item cephalosporins_1st\cr(cefacetrile, cefadroxil, cefalexin, cefaloridine, cefalotin, cefapirin, cefatrizine, cefazedone, cefazolin, cefroxadine, ceftezole, and cephradine) +\item cephalosporins_2nd\cr(cefaclor, cefamandole, cefmetazole, cefonicid, ceforanide, cefotetan, cefotiam, cefoxitin, cefoxitin screening, cefprozil, cefuroxime, cefuroxime axetil, and loracarbef) +\item cephalosporins_3rd\cr(cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefetamet, cefetamet pivoxil, cefixime, cefmenoxime, cefodizime, cefoperazone, cefoperazone/sulbactam, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotiam hexetil, cefovecin, cefpimizole, cefpiramide, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefsulodin, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftriaxone, ceftriaxone/beta-lactamase inhibitor, and latamoxef) +\item cephalosporins_4th\cr(cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetecol, cefoselis, cefozopran, cefpirome, and cefquinome) +\item cephalosporins_5th\cr(ceftaroline, ceftaroline/avibactam, ceftobiprole, ceftobiprole medocaril, and ceftolozane/tazobactam) +\item cephalosporins_except_caz\cr(cefacetrile, cefaclor, cefadroxil, cefalexin, cefaloridine, cefalotin, cefamandole, cefapirin, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefiderocol, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/tazobactam, ceftriaxone, ceftriaxone/beta-lactamase inhibitor, cefuroxime, cefuroxime axetil, cephradine, latamoxef, and loracarbef) +\item fluoroquinolones\cr(besifloxacin, ciprofloxacin, clinafloxacin, danofloxacin, delafloxacin, difloxacin, enoxacin, enrofloxacin, finafloxacin, fleroxacin, garenoxacin, gatifloxacin, gemifloxacin, grepafloxacin, lascufloxacin, levofloxacin, levonadifloxacin, lomefloxacin, marbofloxacin, metioxate, miloxacin, moxifloxacin, nadifloxacin, nifuroquine, norfloxacin, ofloxacin, orbifloxacin, pazufloxacin, pefloxacin, pradofloxacin, premafloxacin, prulifloxacin, rufloxacin, sarafloxacin, sitafloxacin, sparfloxacin, temafloxacin, tilbroquinol, tioxacin, tosufloxacin, and trovafloxacin) +\item glycopeptides\cr(avoparcin, dalbavancin, norvancomycin, oritavancin, ramoplanin, teicoplanin, teicoplanin-macromethod, telavancin, vancomycin, and vancomycin-macromethod) +\item glycopeptides_except_lipo\cr(avoparcin, norvancomycin, ramoplanin, teicoplanin, teicoplanin-macromethod, vancomycin, and vancomycin-macromethod) +\item lincosamides\cr(acetylmidecamycin, acetylspiramycin, clindamycin, clindamycin inducible screening, gamithromycin, kitasamycin, lincomycin, meleumycin, nafithromycin, pirlimycin, primycin, solithromycin, tildipirosin, tilmicosin, tulathromycin, tylosin, and tylvalosin) +\item lipoglycopeptides\cr(dalbavancin, oritavancin, and telavancin) +\item macrolides\cr(acetylmidecamycin, acetylspiramycin, azithromycin, clarithromycin, dirithromycin, erythromycin, flurithromycin, gamithromycin, josamycin, kitasamycin, meleumycin, midecamycin, miocamycin, nafithromycin, oleandomycin, pirlimycin, primycin, rokitamycin, roxithromycin, solithromycin, spiramycin, telithromycin, tildipirosin, tilmicosin, troleandomycin, tulathromycin, tylosin, and tylvalosin) +\item nitrofurans\cr(furazidin, furazolidone, nifurtoinol, nitrofurantoin, and nitrofurazone) +\item oxazolidinones\cr(cadazolid, cycloserine, linezolid, tedizolid, and thiacetazone) +\item penicillins\cr(amoxicillin, amoxicillin/clavulanic acid, amoxicillin/sulbactam, ampicillin, ampicillin/sulbactam, apalcillin, aspoxicillin, avibactam, azidocillin, azlocillin, aztreonam, aztreonam/avibactam, aztreonam/nacubactam, bacampicillin, benzathine benzylpenicillin, benzathine phenoxymethylpenicillin, benzylpenicillin, carbenicillin, carindacillin, cefepime/nacubactam, ciclacillin, clometocillin, cloxacillin, dicloxacillin, epicillin, flucloxacillin, hetacillin, lenampicillin, mecillinam, metampicillin, meticillin, mezlocillin, mezlocillin/sulbactam, nacubactam, nafcillin, oxacillin, penamecillin, penicillin/novobiocin, penicillin/sulbactam, pheneticillin, phenoxymethylpenicillin, piperacillin, piperacillin/sulbactam, piperacillin/tazobactam, piridicillin, pivampicillin, pivmecillinam, procaine benzylpenicillin, propicillin, sarmoxicillin, sulbactam, sulbenicillin, sultamicillin, talampicillin, tazobactam, temocillin, ticarcillin, and ticarcillin/clavulanic acid) +\item polymyxins\cr(colistin, polymyxin B, and polymyxin B/polysorbate 80) +\item quinolones\cr(besifloxacin, cinoxacin, ciprofloxacin, ciprofloxacin/metronidazole, ciprofloxacin/ornidazole, ciprofloxacin/tinidazole, clinafloxacin, danofloxacin, delafloxacin, difloxacin, enoxacin, enrofloxacin, finafloxacin, fleroxacin, flumequine, garenoxacin, gatifloxacin, gemifloxacin, grepafloxacin, lascufloxacin, levofloxacin, levonadifloxacin, lomefloxacin, marbofloxacin, metioxate, miloxacin, moxifloxacin, nadifloxacin, nalidixic acid, nemonoxacin, nifuroquine, nitroxoline, norfloxacin, ofloxacin, orbifloxacin, oxolinic acid, pazufloxacin, pefloxacin, pipemidic acid, piromidic acid, pradofloxacin, premafloxacin, prulifloxacin, rosoxacin, rufloxacin, sarafloxacin, sitafloxacin, sparfloxacin, temafloxacin, tilbroquinol, tioxacin, tosufloxacin, and trovafloxacin) +\item rifamycins\cr(rifabutin, rifampicin, rifampicin/ethambutol/isoniazid, rifampicin/isoniazid, rifampicin/pyrazinamide/ethambutol/isoniazid, rifampicin/pyrazinamide/isoniazid, rifamycin, and rifapentine) +\item streptogramins\cr(pristinamycin and quinupristin/dalfopristin) +\item tetracyclines\cr(cetocycline, chlortetracycline, clomocycline, demeclocycline, doxycycline, eravacycline, lymecycline, metacycline, minocycline, omadacycline, oxytetracycline, penimepicycline, rolitetracycline, sarecycline, tetracycline, and tigecycline) +\item tetracyclines_except_tgc\cr(cetocycline, chlortetracycline, clomocycline, demeclocycline, doxycycline, eravacycline, lymecycline, metacycline, minocycline, omadacycline, oxytetracycline, penimepicycline, rolitetracycline, sarecycline, and tetracycline) +\item trimethoprims\cr(brodimoprim, sulfadiazine, sulfadiazine/tetroxoprim, sulfadiazine/trimethoprim, sulfadimethoxine, sulfadimidine, sulfadimidine/trimethoprim, sulfafurazole, sulfaisodimidine, sulfalene, sulfamazone, sulfamerazine, sulfamerazine/trimethoprim, sulfamethizole, sulfamethoxazole, sulfamethoxypyridazine, sulfametomidine, sulfametoxydiazine, sulfametrole/trimethoprim, sulfamoxole, sulfamoxole/trimethoprim, sulfanilamide, sulfaperin, sulfaphenazole, sulfapyridine, sulfathiazole, sulfathiourea, trimethoprim, and trimethoprim/sulfamethoxazole) +\item ureidopenicillins\cr(azlocillin, mezlocillin, piperacillin, and piperacillin/tazobactam) +} +} +} + +\examples{ +x <- custom_eucast_rules( + AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", + AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I" +) +x + +# run the custom rule set (verbose = TRUE will return a logbook instead of the data set): +eucast_rules(example_isolates, + rules = "custom", + custom_rules = x, + info = FALSE, + verbose = TRUE +) + +# combine rule sets +x2 <- c( + x, + custom_eucast_rules(TZP == "R" ~ carbapenems == "R") +) +x2 +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/dosage.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{dosage} +\alias{dosage} +\title{Data Set with Treatment Dosages as Defined by EUCAST} +\format{ +A \link[tibble:tibble]{tibble} with 503 observations and 9 variables: +\itemize{ +\item \code{ab}\cr Antibiotic ID as used in this package (such as \code{AMC}), using the official EARS-Net (European Antimicrobial Resistance Surveillance Network) codes where available +\item \code{name}\cr Official name of the antimicrobial drug as used by WHONET/EARS-Net or the WHO +\item \code{type}\cr Type of the dosage, either "high_dosage", "standard_dosage", or "uncomplicated_uti" +\item \code{dose}\cr Dose, such as "2 g" or "25 mg/kg" +\item \code{dose_times}\cr Number of times a dose must be administered +\item \code{administration}\cr Route of administration, either "im", "iv", or "oral" +\item \code{notes}\cr Additional dosage notes +\item \code{original_txt}\cr Original text in the PDF file of EUCAST +\item \code{eucast_version}\cr Version number of the EUCAST Clinical Breakpoints guideline to which these dosages apply, either 13, 12, or 11 +} +} +\usage{ +dosage +} +\description{ +EUCAST breakpoints used in this package are based on the dosages in this data set. They can be retrieved with \code{\link[=eucast_dosage]{eucast_dosage()}}. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +dosage +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/eucast_rules.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/eucast_rules.R +\name{eucast_rules} +\alias{eucast_rules} +\alias{EUCAST} +\alias{eucast_dosage} +\title{Apply EUCAST Rules} +\source{ +\itemize{ +\item EUCAST Expert Rules. Version 2.0, 2012.\cr +Leclercq et al. \strong{EUCAST expert rules in antimicrobial susceptibility testing.} \emph{Clin Microbiol Infect.} 2013;19(2):141-60; \doi{https://doi.org/10.1111/j.1469-0691.2011.03703.x} +\item EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes Tables. Version 3.1, 2016. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf}{(link)} +\item EUCAST Intrinsic Resistance and Unusual Phenotypes. Version 3.2, 2020. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf}{(link)} +\item EUCAST Intrinsic Resistance and Unusual Phenotypes. Version 3.3, 2021. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2021/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.3_20211018.pdf}{(link)} +\item EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 9.0, 2019. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_9.0_Breakpoint_Tables.xlsx}{(link)} +\item EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 10.0, 2020. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_10.0_Breakpoint_Tables.xlsx}{(link)} +\item EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 11.0, 2021. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_11.0_Breakpoint_Tables.xlsx}{(link)} +\item EUCAST Breakpoint tables for interpretation of MICs and zone diameters. Version 12.0, 2022. \href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Breakpoint_tables/v_12.0_Breakpoint_Tables.xlsx}{(link)} +} +} +\usage{ +eucast_rules( + x, + col_mo = NULL, + info = interactive(), + rules = getOption("AMR_eucastrules", default = c("breakpoints", "expert")), + verbose = FALSE, + version_breakpoints = 12, + version_expertrules = 3.3, + ampc_cephalosporin_resistance = NA, + only_sir_columns = FALSE, + custom_rules = NULL, + ... +) + +eucast_dosage(ab, administration = "iv", version_breakpoints = 12) +} +\arguments{ +\item{x}{a data set with antibiotic columns, such as \code{amox}, \code{AMX} and \code{AMC}} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{info}{a \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions} + +\item{rules}{a \link{character} vector that specifies which rules should be applied. Must be one or more of \code{"breakpoints"}, \code{"expert"}, \code{"other"}, \code{"custom"}, \code{"all"}, and defaults to \code{c("breakpoints", "expert")}. The default value can be set to another value using the package option \code{\link[=AMR-options]{AMR_eucastrules}}: \code{options(AMR_eucastrules = "all")}. If using \code{"custom"}, be sure to fill in argument \code{custom_rules} too. Custom rules can be created with \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}.} + +\item{verbose}{a \link{logical} to turn Verbose mode on and off (default is off). In Verbose mode, the function does not apply rules to the data, but instead returns a data set in logbook form with extensive info about which rows and columns would be effected and in which way. Using Verbose mode takes a lot more time.} + +\item{version_breakpoints}{the version number to use for the EUCAST Clinical Breakpoints guideline. Can be "12.0", "11.0", or "10.0".} + +\item{version_expertrules}{the version number to use for the EUCAST Expert Rules and Intrinsic Resistance guideline. Can be "3.3", "3.2", or "3.1".} + +\item{ampc_cephalosporin_resistance}{a \link{character} value that should be applied to cefotaxime, ceftriaxone and ceftazidime for AmpC de-repressed cephalosporin-resistant mutants - the default is \code{NA}. Currently only works when \code{version_expertrules} is \code{3.2} and higher; these version of '\emph{EUCAST Expert Rules on Enterobacterales}' state that results of cefotaxime, ceftriaxone and ceftazidime should be reported with a note, or results should be suppressed (emptied) for these three drugs. A value of \code{NA} (the default) for this argument will remove results for these three drugs, while e.g. a value of \code{"R"} will make the results for these drugs resistant. Use \code{NULL} or \code{FALSE} to not alter results for these three drugs of AmpC de-repressed cephalosporin-resistant mutants. Using \code{TRUE} is equal to using \code{"R"}. \cr For \emph{EUCAST Expert Rules} v3.2, this rule applies to: \emph{Citrobacter braakii}, \emph{Citrobacter freundii}, \emph{Citrobacter gillenii}, \emph{Citrobacter murliniae}, \emph{Citrobacter rodenticum}, \emph{Citrobacter sedlakii}, \emph{Citrobacter werkmanii}, \emph{Citrobacter youngae}, \emph{Enterobacter}, \emph{Hafnia alvei}, \emph{Klebsiella aerogenes}, \emph{Morganella morganii}, \emph{Providencia}, and \emph{Serratia}.} + +\item{only_sir_columns}{a \link{logical} to indicate whether only antibiotic columns must be detected that were transformed to class \code{sir} (see \code{\link[=as.sir]{as.sir()}}) on beforehand (default is \code{FALSE})} + +\item{custom_rules}{custom rules to apply, created with \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}} + +\item{...}{column name of an antibiotic, see section \emph{Antibiotics} below} + +\item{ab}{any (vector of) text that can be coerced to a valid antibiotic drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{administration}{route of administration, either "im", "iv", or "oral"} +} +\value{ +The input of \code{x}, possibly with edited values of antibiotics. Or, if \code{verbose = TRUE}, a \link{data.frame} with all original and new values of the affected bug-drug combinations. +} +\description{ +Apply rules for clinical breakpoints and intrinsic resistance as defined by the European Committee on Antimicrobial Susceptibility Testing (EUCAST, \url{https://www.eucast.org}), see \emph{Source}. Use \code{\link[=eucast_dosage]{eucast_dosage()}} to get a \link{data.frame} with advised dosages of a certain bug-drug combination, which is based on the \link{dosage} data set. + +To improve the interpretation of the antibiogram before EUCAST rules are applied, some non-EUCAST rules can applied at default, see \emph{Details}. +} +\details{ +\strong{Note:} This function does not translate MIC values to SIR values. Use \code{\link[=as.sir]{as.sir()}} for that. \cr +\strong{Note:} When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr + +The file containing all EUCAST rules is located here: \url{https://github.com/msberends/AMR/blob/main/data-raw/eucast_rules.tsv}. \strong{Note:} Old taxonomic names are replaced with the current taxonomy where applicable. For example, \emph{Ochrobactrum anthropi} was renamed to \emph{Brucella anthropi} in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The \code{AMR} package contains the full microbial taxonomy updated until June 24th, 2024, see \link{microorganisms}. +\subsection{Custom Rules}{ + +Custom rules can be created using \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}, e.g.: + +\if{html}{\out{
}}\preformatted{x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", + AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I") + +eucast_rules(example_isolates, rules = "custom", custom_rules = x) +}\if{html}{\out{
}} +} + +\subsection{'Other' Rules}{ + +Before further processing, two non-EUCAST rules about drug combinations can be applied to improve the efficacy of the EUCAST rules, and the reliability of your data (analysis). These rules are: +\enumerate{ +\item A drug \strong{with} enzyme inhibitor will be set to S if the same drug \strong{without} enzyme inhibitor is S +\item A drug \strong{without} enzyme inhibitor will be set to R if the same drug \strong{with} enzyme inhibitor is R +} + +Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set. + +Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include \code{"other"} to the \code{rules} argument, or use \code{eucast_rules(..., rules = "all")}. You can also set the package option \code{\link[=AMR-options]{AMR_eucastrules}}, i.e. run \code{options(AMR_eucastrules = "all")}. +} +} +\section{Antibiotics}{ + +To define antibiotics column names, leave as it is to determine it automatically with \code{\link[=guess_ab_col]{guess_ab_col()}} or input a text (case-insensitive), or use \code{NULL} to skip a column (e.g. \code{TIC = NULL} to skip ticarcillin). Manually defined but non-existing columns will be skipped with a warning. + +The following antibiotics are eligible for the functions \code{\link[=eucast_rules]{eucast_rules()}} and \code{\link[=mdro]{mdro()}}. These are shown below in the format 'name (\verb{antimicrobial ID}, \href{https://atcddd.fhi.no/atc/structure_and_principles/}{ATC code})', sorted alphabetically: + +Amikacin (\code{AMK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB06&showdescription=no}{J01GB06}), amoxicillin (\code{AMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA04&showdescription=no}{J01CA04}), amoxicillin/clavulanic acid (\code{AMC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR02&showdescription=no}{J01CR02}), ampicillin (\code{AMP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA01&showdescription=no}{J01CA01}), ampicillin/sulbactam (\code{SAM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR01&showdescription=no}{J01CR01}), apramycin (\code{APR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QA07AA92&showdescription=no}{QA07AA92}), arbekacin (\code{ARB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB12&showdescription=no}{J01GB12}), aspoxicillin (\code{APX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA19&showdescription=no}{J01CA19}), azidocillin (\code{AZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE04&showdescription=no}{J01CE04}), azithromycin (\code{AZM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA10&showdescription=no}{J01FA10}), azlocillin (\code{AZL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA09&showdescription=no}{J01CA09}), aztreonam (\code{ATM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DF01&showdescription=no}{J01DF01}), bacampicillin (\code{BAM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA06&showdescription=no}{J01CA06}), bekanamycin (\code{BEK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB13&showdescription=no}{J01GB13}), benzathine benzylpenicillin (\code{BNB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE08&showdescription=no}{J01CE08}), benzathine phenoxymethylpenicillin (\code{BNP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE10&showdescription=no}{J01CE10}), benzylpenicillin (\code{PEN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE01&showdescription=no}{J01CE01}), besifloxacin (\code{BES}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=S01AE08&showdescription=no}{S01AE08}), biapenem (\code{BIA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH05&showdescription=no}{J01DH05}), carbenicillin (\code{CRB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA03&showdescription=no}{J01CA03}), carindacillin (\code{CRN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA05&showdescription=no}{J01CA05}), cefacetrile (\code{CAC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB10&showdescription=no}{J01DB10}), cefaclor (\code{CEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC04&showdescription=no}{J01DC04}), cefadroxil (\code{CFR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB05&showdescription=no}{J01DB05}), cefalexin (\code{LEX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB01&showdescription=no}{J01DB01}), cefaloridine (\code{RID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB02&showdescription=no}{J01DB02}), cefalotin (\code{CEP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB03&showdescription=no}{J01DB03}), cefamandole (\code{MAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC03&showdescription=no}{J01DC03}), cefapirin (\code{HAP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB08&showdescription=no}{J01DB08}), cefatrizine (\code{CTZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB07&showdescription=no}{J01DB07}), cefazedone (\code{CZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB06&showdescription=no}{J01DB06}), cefazolin (\code{CZO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB04&showdescription=no}{J01DB04}), cefcapene (\code{CCP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD17&showdescription=no}{J01DD17}), cefdinir (\code{CDR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD15&showdescription=no}{J01DD15}), cefditoren (\code{DIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD16&showdescription=no}{J01DD16}), cefepime (\code{FEP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE01&showdescription=no}{J01DE01}), cefetamet (\code{CAT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD10&showdescription=no}{J01DD10}), cefiderocol (\code{FDC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI04&showdescription=no}{J01DI04}), cefixime (\code{CFM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD08&showdescription=no}{J01DD08}), cefmenoxime (\code{CMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD05&showdescription=no}{J01DD05}), cefmetazole (\code{CMZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC09&showdescription=no}{J01DC09}), cefodizime (\code{DIZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD09&showdescription=no}{J01DD09}), cefonicid (\code{CID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC06&showdescription=no}{J01DC06}), cefoperazone (\code{CFP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD12&showdescription=no}{J01DD12}), cefoperazone/sulbactam (\code{CSL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD62&showdescription=no}{J01DD62}), ceforanide (\code{CND}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC11&showdescription=no}{J01DC11}), cefotaxime (\code{CTX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD01&showdescription=no}{J01DD01}), cefotaxime/clavulanic acid (\code{CTC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD51&showdescription=no}{J01DD51}), cefotetan (\code{CTT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC05&showdescription=no}{J01DC05}), cefotiam (\code{CTF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC07&showdescription=no}{J01DC07}), cefovecin (\code{FOV}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01DD91&showdescription=no}{QJ01DD91}), cefoxitin (\code{FOX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC01&showdescription=no}{J01DC01}), cefozopran (\code{ZOP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE03&showdescription=no}{J01DE03}), cefpiramide (\code{CPM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD11&showdescription=no}{J01DD11}), cefpirome (\code{CPO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE02&showdescription=no}{J01DE02}), cefpodoxime (\code{CPD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD13&showdescription=no}{J01DD13}), cefprozil (\code{CPR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC10&showdescription=no}{J01DC10}), cefquinome (\code{CEQ}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QG51AA07&showdescription=no}{QG51AA07}), cefroxadine (\code{CRD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB11&showdescription=no}{J01DB11}), cefsulodin (\code{CFS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD03&showdescription=no}{J01DD03}), ceftaroline (\code{CPT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI02&showdescription=no}{J01DI02}), ceftazidime (\code{CAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD02&showdescription=no}{J01DD02}), ceftazidime/clavulanic acid (\code{CCV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD52&showdescription=no}{J01DD52}), cefteram (\code{CEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD18&showdescription=no}{J01DD18}), ceftezole (\code{CTL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB12&showdescription=no}{J01DB12}), ceftibuten (\code{CTB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD14&showdescription=no}{J01DD14}), ceftiofur (\code{TIO}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01DD90&showdescription=no}{QJ01DD90}), ceftizoxime (\code{CZX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD07&showdescription=no}{J01DD07}), ceftobiprole medocaril (\code{CFM1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI01&showdescription=no}{J01DI01}), ceftolozane/tazobactam (\code{CZT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI54&showdescription=no}{J01DI54}), ceftriaxone (\code{CRO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD04&showdescription=no}{J01DD04}), ceftriaxone/beta-lactamase inhibitor (\code{CEB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD63&showdescription=no}{J01DD63}), cefuroxime (\code{CXM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC02&showdescription=no}{J01DC02}), cephradine (\code{CED}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB09&showdescription=no}{J01DB09}), chloramphenicol (\code{CHL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01BA01&showdescription=no}{J01BA01}), ciprofloxacin (\code{CIP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA02&showdescription=no}{J01MA02}), clarithromycin (\code{CLR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA09&showdescription=no}{J01FA09}), clindamycin (\code{CLI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FF01&showdescription=no}{J01FF01}), clometocillin (\code{CLM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE07&showdescription=no}{J01CE07}), cloxacillin (\code{CLO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF02&showdescription=no}{J01CF02}), colistin (\code{COL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XB01&showdescription=no}{J01XB01}), cycloserine (\code{CYC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J04AB01&showdescription=no}{J04AB01}), dalbavancin (\code{DAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA04&showdescription=no}{J01XA04}), danofloxacin (\code{DAN}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA92&showdescription=no}{QJ01MA92}), daptomycin (\code{DAP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX09&showdescription=no}{J01XX09}), delafloxacin (\code{DFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA23&showdescription=no}{J01MA23}), dibekacin (\code{DKB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB09&showdescription=no}{J01GB09}), dicloxacillin (\code{DIC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF01&showdescription=no}{J01CF01}), difloxacin (\code{DIF}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA94&showdescription=no}{QJ01MA94}), dirithromycin (\code{DIR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA13&showdescription=no}{J01FA13}), doripenem (\code{DOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH04&showdescription=no}{J01DH04}), doxycycline (\code{DOX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA02&showdescription=no}{J01AA02}), enoxacin (\code{ENX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA04&showdescription=no}{J01MA04}), enrofloxacin (\code{ENR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA90&showdescription=no}{QJ01MA90}), epicillin (\code{EPC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA07&showdescription=no}{J01CA07}), ertapenem (\code{ETP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH03&showdescription=no}{J01DH03}), erythromycin (\code{ERY}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA01&showdescription=no}{J01FA01}), fleroxacin (\code{FLE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA08&showdescription=no}{J01MA08}), flucloxacillin (\code{FLC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF05&showdescription=no}{J01CF05}), flurithromycin (\code{FLR1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA14&showdescription=no}{J01FA14}), fosfomycin (\code{FOS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX01&showdescription=no}{J01XX01}), framycetin (\code{FRM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=D09AA01&showdescription=no}{D09AA01}), fusidic acid (\code{FUS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XC01&showdescription=no}{J01XC01}), gamithromycin (\code{GAM}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA95&showdescription=no}{QJ01FA95}), garenoxacin (\code{GRN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA19&showdescription=no}{J01MA19}), gatifloxacin (\code{GAT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA16&showdescription=no}{J01MA16}), gemifloxacin (\code{GEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA15&showdescription=no}{J01MA15}), gentamicin (\code{GEN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB03&showdescription=no}{J01GB03}), grepafloxacin (\code{GRX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA11&showdescription=no}{J01MA11}), hetacillin (\code{HET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA18&showdescription=no}{J01CA18}), imipenem (\code{IPM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH51&showdescription=no}{J01DH51}), imipenem/relebactam (\code{IMR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH56&showdescription=no}{J01DH56}), isepamicin (\code{ISE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB11&showdescription=no}{J01GB11}), josamycin (\code{JOS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA07&showdescription=no}{J01FA07}), kanamycin (\code{KAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB04&showdescription=no}{J01GB04}), kitasamycin (\code{KIT}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA93&showdescription=no}{QJ01FA93}), lascufloxacin (\code{LSC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA25&showdescription=no}{J01MA25}), latamoxef (\code{LTM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD06&showdescription=no}{J01DD06}), levofloxacin (\code{LVX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA12&showdescription=no}{J01MA12}), levonadifloxacin (\code{LND}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA24&showdescription=no}{J01MA24}), lincomycin (\code{LIN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FF02&showdescription=no}{J01FF02}), linezolid (\code{LNZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX08&showdescription=no}{J01XX08}), lomefloxacin (\code{LOM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA07&showdescription=no}{J01MA07}), loracarbef (\code{LOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC08&showdescription=no}{J01DC08}), marbofloxacin (\code{MAR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA93&showdescription=no}{QJ01MA93}), mecillinam (\code{MEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA11&showdescription=no}{J01CA11}), meropenem (\code{MEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH02&showdescription=no}{J01DH02}), meropenem/vaborbactam (\code{MEV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH52&showdescription=no}{J01DH52}), metampicillin (\code{MTM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA14&showdescription=no}{J01CA14}), meticillin (\code{MET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF03&showdescription=no}{J01CF03}), mezlocillin (\code{MEZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA10&showdescription=no}{J01CA10}), micronomicin (\code{MCR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=S01AA22&showdescription=no}{S01AA22}), midecamycin (\code{MID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA03&showdescription=no}{J01FA03}), minocycline (\code{MNO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA08&showdescription=no}{J01AA08}), miocamycin (\code{MCM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA11&showdescription=no}{J01FA11}), moxifloxacin (\code{MFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA14&showdescription=no}{J01MA14}), nadifloxacin (\code{NAD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=D10AF05&showdescription=no}{D10AF05}), nafcillin (\code{NAF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF06&showdescription=no}{J01CF06}), nalidixic acid (\code{NAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MB02&showdescription=no}{J01MB02}), neomycin (\code{NEO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB05&showdescription=no}{J01GB05}), netilmicin (\code{NET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB07&showdescription=no}{J01GB07}), nitrofurantoin (\code{NIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XE01&showdescription=no}{J01XE01}), norfloxacin (\code{NOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA06&showdescription=no}{J01MA06}), novobiocin (\code{NOV}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01XX95&showdescription=no}{QJ01XX95}), ofloxacin (\code{OFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA01&showdescription=no}{J01MA01}), oleandomycin (\code{OLE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA05&showdescription=no}{J01FA05}), orbifloxacin (\code{ORB}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA95&showdescription=no}{QJ01MA95}), oritavancin (\code{ORI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA05&showdescription=no}{J01XA05}), oxacillin (\code{OXA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF04&showdescription=no}{J01CF04}), panipenem (\code{PAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH55&showdescription=no}{J01DH55}), pazufloxacin (\code{PAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA18&showdescription=no}{J01MA18}), pefloxacin (\code{PEF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA03&showdescription=no}{J01MA03}), penamecillin (\code{PNM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE06&showdescription=no}{J01CE06}), pheneticillin (\code{PHE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE05&showdescription=no}{J01CE05}), phenoxymethylpenicillin (\code{PHN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE02&showdescription=no}{J01CE02}), piperacillin (\code{PIP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA12&showdescription=no}{J01CA12}), piperacillin/tazobactam (\code{TZP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR05&showdescription=no}{J01CR05}), pirlimycin (\code{PRL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ51FF90&showdescription=no}{QJ51FF90}), pivampicillin (\code{PVM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA02&showdescription=no}{J01CA02}), pivmecillinam (\code{PME}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA08&showdescription=no}{J01CA08}), plazomicin (\code{PLZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB14&showdescription=no}{J01GB14}), polymyxin B (\code{PLB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XB02&showdescription=no}{J01XB02}), pradofloxacin (\code{PRA}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA97&showdescription=no}{QJ01MA97}), pristinamycin (\code{PRI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FG01&showdescription=no}{J01FG01}), procaine benzylpenicillin (\code{PRB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE09&showdescription=no}{J01CE09}), propicillin (\code{PRP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE03&showdescription=no}{J01CE03}), prulifloxacin (\code{PRU}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA17&showdescription=no}{J01MA17}), quinupristin/dalfopristin (\code{QDA}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FG02&showdescription=no}{QJ01FG02}), ribostamycin (\code{RST}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB10&showdescription=no}{J01GB10}), rifampicin (\code{RIF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J04AB02&showdescription=no}{J04AB02}), rokitamycin (\code{ROK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA12&showdescription=no}{J01FA12}), roxithromycin (\code{RXT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA06&showdescription=no}{J01FA06}), rufloxacin (\code{RFL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA10&showdescription=no}{J01MA10}), sarafloxacin (\code{SAR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA98&showdescription=no}{QJ01MA98}), sisomicin (\code{SIS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB08&showdescription=no}{J01GB08}), sitafloxacin (\code{SIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA21&showdescription=no}{J01MA21}), solithromycin (\code{SOL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA16&showdescription=no}{J01FA16}), sparfloxacin (\code{SPX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA09&showdescription=no}{J01MA09}), spiramycin (\code{SPI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA02&showdescription=no}{J01FA02}), streptoduocin (\code{STR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GA02&showdescription=no}{J01GA02}), streptomycin (\code{STR1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GA01&showdescription=no}{J01GA01}), sulbactam (\code{SUL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CG01&showdescription=no}{J01CG01}), sulbenicillin (\code{SBC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA16&showdescription=no}{J01CA16}), sulfadiazine (\code{SDI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC02&showdescription=no}{J01EC02}), sulfadiazine/trimethoprim (\code{SLT1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE02&showdescription=no}{J01EE02}), sulfadimethoxine (\code{SUD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED01&showdescription=no}{J01ED01}), sulfadimidine (\code{SDM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB03&showdescription=no}{J01EB03}), sulfadimidine/trimethoprim (\code{SLT2}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE05&showdescription=no}{J01EE05}), sulfafurazole (\code{SLF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB05&showdescription=no}{J01EB05}), sulfaisodimidine (\code{SLF1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB01&showdescription=no}{J01EB01}), sulfalene (\code{SLF2}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED02&showdescription=no}{J01ED02}), sulfamazone (\code{SZO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED09&showdescription=no}{J01ED09}), sulfamerazine (\code{SLF3}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED07&showdescription=no}{J01ED07}), sulfamerazine/trimethoprim (\code{SLT3}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE07&showdescription=no}{J01EE07}), sulfamethizole (\code{SLF4}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB02&showdescription=no}{J01EB02}), sulfamethoxazole (\code{SMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC01&showdescription=no}{J01EC01}), sulfamethoxypyridazine (\code{SLF5}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED05&showdescription=no}{J01ED05}), sulfametomidine (\code{SLF6}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED03&showdescription=no}{J01ED03}), sulfametoxydiazine (\code{SLF7}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED04&showdescription=no}{J01ED04}), sulfametrole/trimethoprim (\code{SLT4}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE03&showdescription=no}{J01EE03}), sulfamoxole (\code{SLF8}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC03&showdescription=no}{J01EC03}), sulfamoxole/trimethoprim (\code{SLT5}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE04&showdescription=no}{J01EE04}), sulfanilamide (\code{SLF9}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB06&showdescription=no}{J01EB06}), sulfaperin (\code{SLF10}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED06&showdescription=no}{J01ED06}), sulfaphenazole (\code{SLF11}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED08&showdescription=no}{J01ED08}), sulfapyridine (\code{SLF12}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB04&showdescription=no}{J01EB04}), sulfathiazole (\code{SUT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB07&showdescription=no}{J01EB07}), sulfathiourea (\code{SLF13}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB08&showdescription=no}{J01EB08}), sultamicillin (\code{SLT6}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR04&showdescription=no}{J01CR04}), talampicillin (\code{TAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA15&showdescription=no}{J01CA15}), tazobactam (\code{TAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CG02&showdescription=no}{J01CG02}), tebipenem (\code{TBP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH06&showdescription=no}{J01DH06}), tedizolid (\code{TZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX11&showdescription=no}{J01XX11}), teicoplanin (\code{TEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA02&showdescription=no}{J01XA02}), telavancin (\code{TLV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA03&showdescription=no}{J01XA03}), telithromycin (\code{TLT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA15&showdescription=no}{J01FA15}), temafloxacin (\code{TMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA05&showdescription=no}{J01MA05}), temocillin (\code{TEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA17&showdescription=no}{J01CA17}), tetracycline (\code{TCY}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA07&showdescription=no}{J01AA07}), ticarcillin (\code{TIC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA13&showdescription=no}{J01CA13}), ticarcillin/clavulanic acid (\code{TCC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR03&showdescription=no}{J01CR03}), tigecycline (\code{TGC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA12&showdescription=no}{J01AA12}), tilbroquinol (\code{TBQ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=P01AA05&showdescription=no}{P01AA05}), tildipirosin (\code{TIP}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA96&showdescription=no}{QJ01FA96}), tilmicosin (\code{TIL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA91&showdescription=no}{QJ01FA91}), tobramycin (\code{TOB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB01&showdescription=no}{J01GB01}), tosufloxacin (\code{TFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA22&showdescription=no}{J01MA22}), trimethoprim (\code{TMP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EA01&showdescription=no}{J01EA01}), trimethoprim/sulfamethoxazole (\code{SXT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE01&showdescription=no}{J01EE01}), troleandomycin (\code{TRL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA08&showdescription=no}{J01FA08}), trovafloxacin (\code{TVA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA13&showdescription=no}{J01MA13}), tulathromycin (\code{TUL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA94&showdescription=no}{QJ01FA94}), tylosin (\code{TYL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA90&showdescription=no}{QJ01FA90}), tylvalosin (\code{TYL1}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA92&showdescription=no}{QJ01FA92}), vancomycin (\code{VAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA01&showdescription=no}{J01XA01}) +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +\donttest{ +a <- data.frame( + mo = c( + "Staphylococcus aureus", + "Enterococcus faecalis", + "Escherichia coli", + "Klebsiella pneumoniae", + "Pseudomonas aeruginosa" + ), + VAN = "-", # Vancomycin + AMX = "-", # Amoxicillin + COL = "-", # Colistin + CAZ = "-", # Ceftazidime + CXM = "-", # Cefuroxime + PEN = "S", # Benzylpenicillin + FOX = "S", # Cefoxitin + stringsAsFactors = FALSE +) + +head(a) + + +# apply EUCAST rules: some results wil be changed +b <- eucast_rules(a) + +head(b) + + +# do not apply EUCAST rules, but rather get a data.frame +# containing all details about the transformations: +c <- eucast_rules(a, verbose = TRUE) +head(c) +} + +# Dosage guidelines: + +eucast_dosage(c("tobra", "genta", "cipro"), "iv") + +eucast_dosage(c("tobra", "genta", "cipro"), "iv", version_breakpoints = 10) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/example_isolates.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{example_isolates} +\alias{example_isolates} +\title{Data Set with 2 000 Example Isolates} +\format{ +A \link[tibble:tibble]{tibble} with 2 000 observations and 46 variables: +\itemize{ +\item \code{date}\cr Date of receipt at the laboratory +\item \code{patient}\cr ID of the patient +\item \code{age}\cr Age of the patient +\item \code{gender}\cr Gender of the patient, either "F" or "M" +\item \code{ward}\cr Ward type where the patient was admitted, either "Clinical", "ICU", or "Outpatient" +\item \code{mo}\cr ID of microorganism created with \code{\link[=as.mo]{as.mo()}}, see also the \link{microorganisms} data set +\item \code{PEN:RIF}\cr 40 different antibiotics with class \code{\link{sir}} (see \code{\link[=as.sir]{as.sir()}}); these column names occur in the \link{antibiotics} data set and can be translated with \code{\link[=set_ab_names]{set_ab_names()}} or \code{\link[=ab_name]{ab_name()}} +} +} +\usage{ +example_isolates +} +\description{ +A data set containing 2 000 microbial isolates with their full antibiograms. This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis. For examples, please read \href{https://msberends.github.io/AMR/articles/AMR.html}{the tutorial on our website}. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +example_isolates +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/example_isolates_unclean.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{example_isolates_unclean} +\alias{example_isolates_unclean} +\title{Data Set with Unclean Data} +\format{ +A \link[tibble:tibble]{tibble} with 3 000 observations and 8 variables: +\itemize{ +\item \code{patient_id}\cr ID of the patient +\item \code{date}\cr date of receipt at the laboratory +\item \code{hospital}\cr ID of the hospital, from A to C +\item \code{bacteria}\cr info about microorganism that can be transformed with \code{\link[=as.mo]{as.mo()}}, see also \link{microorganisms} +\item \code{AMX:GEN}\cr 4 different antibiotics that have to be transformed with \code{\link[=as.sir]{as.sir()}} +} +} +\usage{ +example_isolates_unclean +} +\description{ +A data set containing 3 000 microbial isolates that are not cleaned up and consequently not ready for AMR data analysis. This data set can be used for practice. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +example_isolates_unclean +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/export_ncbi_biosample.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/export_biosample.R +\name{export_ncbi_biosample} +\alias{export_ncbi_biosample} +\title{Export Data Set as NCBI BioSample Antibiogram} +\usage{ +export_ncbi_biosample( + x, + filename = paste0("biosample_", format(Sys.time(), "\%Y-\%m-\%d-\%H\%M\%S"), ".xlsx"), + type = "pathogen MIC", + columns = where(is.mic), + save_as_xlsx = TRUE +) +} +\arguments{ +\item{x}{a data set} + +\item{filename}{a character string specifying the file name} + +\item{type}{a character string specifying the type of data set, either "pathogen MIC" or "beta-lactamase MIC", see \url{https://www.ncbi.nlm.nih.gov/biosample/docs/}} +} +\description{ +Export Data Set as NCBI BioSample Antibiogram +} +\keyword{internal} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/first_isolate.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/first_isolate.R +\name{first_isolate} +\alias{first_isolate} +\alias{filter_first_isolate} +\title{Determine First Isolates} +\source{ +Methodology of this function is strictly based on: +\itemize{ +\item \strong{M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition}, 2022, \emph{Clinical and Laboratory Standards Institute (CLSI)}. \url{https://clsi.org/standards/products/microbiology/documents/m39/}. +\item Hindler JF and Stelling J (2007). \strong{Analysis and Presentation of Cumulative Antibiograms: A New Consensus Guideline from the Clinical and Laboratory Standards Institute.} Clinical Infectious Diseases, 44(6), 867-873. \doi{10.1086/511864} +} +} +\usage{ +first_isolate( + x = NULL, + col_date = NULL, + col_patient_id = NULL, + col_mo = NULL, + col_testcode = NULL, + col_specimen = NULL, + col_icu = NULL, + col_keyantimicrobials = NULL, + episode_days = 365, + testcodes_exclude = NULL, + icu_exclude = FALSE, + specimen_group = NULL, + type = "points", + method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"), + ignore_I = TRUE, + points_threshold = 2, + info = interactive(), + include_unknown = FALSE, + include_untested_sir = TRUE, + ... +) + +filter_first_isolate( + x = NULL, + col_date = NULL, + col_patient_id = NULL, + col_mo = NULL, + episode_days = 365, + method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"), + ... +) +} +\arguments{ +\item{x}{a \link{data.frame} containing isolates. Can be left blank for automatic determination, see \emph{Examples}.} + +\item{col_date}{column name of the result date (or date that is was received on the lab) - the default is the first column with a date class} + +\item{col_patient_id}{column name of the unique IDs of the patients - the default is the first column that starts with 'patient' or 'patid' (case insensitive)} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{col_testcode}{column name of the test codes. Use \code{col_testcode = NULL} to \strong{not} exclude certain test codes (such as test codes for screening). In that case \code{testcodes_exclude} will be ignored.} + +\item{col_specimen}{column name of the specimen type or group} + +\item{col_icu}{column name of the logicals (\code{TRUE}/\code{FALSE}) whether a ward or department is an Intensive Care Unit (ICU). This can also be a \link{logical} vector with the same length as rows in \code{x}.} + +\item{col_keyantimicrobials}{(only useful when \code{method = "phenotype-based"}) column name of the key antimicrobials to determine first isolates, see \code{\link[=key_antimicrobials]{key_antimicrobials()}}. The default is the first column that starts with 'key' followed by 'ab' or 'antibiotics' or 'antimicrobials' (case insensitive). Use \code{col_keyantimicrobials = FALSE} to prevent this. Can also be the output of \code{\link[=key_antimicrobials]{key_antimicrobials()}}.} + +\item{episode_days}{episode in days after which a genus/species combination will be determined as 'first isolate' again. The default of 365 days is based on the guideline by CLSI, see \emph{Source}.} + +\item{testcodes_exclude}{a \link{character} vector with test codes that should be excluded (case-insensitive)} + +\item{icu_exclude}{a \link{logical} to indicate whether ICU isolates should be excluded (rows with value \code{TRUE} in the column set with \code{col_icu})} + +\item{specimen_group}{value in the column set with \code{col_specimen} to filter on} + +\item{type}{type to determine weighed isolates; can be \code{"keyantimicrobials"} or \code{"points"}, see \emph{Details}} + +\item{method}{the method to apply, either \code{"phenotype-based"}, \code{"episode-based"}, \code{"patient-based"} or \code{"isolate-based"} (can be abbreviated), see \emph{Details}. The default is \code{"phenotype-based"} if antimicrobial test results are present in the data, and \code{"episode-based"} otherwise.} + +\item{ignore_I}{\link{logical} to indicate whether antibiotic interpretations with \code{"I"} will be ignored when \code{type = "keyantimicrobials"}, see \emph{Details}} + +\item{points_threshold}{minimum number of points to require before differences in the antibiogram will lead to inclusion of an isolate when \code{type = "points"}, see \emph{Details}} + +\item{info}{a \link{logical} to indicate info should be printed - the default is \code{TRUE} only in interactive mode} + +\item{include_unknown}{a \link{logical} to indicate whether 'unknown' microorganisms should be included too, i.e. microbial code \code{"UNKNOWN"}, which defaults to \code{FALSE}. For WHONET users, this means that all records with organism code \code{"con"} (\emph{contamination}) will be excluded at default. Isolates with a microbial ID of \code{NA} will always be excluded as first isolate.} + +\item{include_untested_sir}{a \link{logical} to indicate whether also rows without antibiotic results are still eligible for becoming a first isolate. Use \code{include_untested_sir = FALSE} to always return \code{FALSE} for such rows. This checks the data set for columns of class \code{sir} and consequently requires transforming columns with antibiotic results using \code{\link[=as.sir]{as.sir()}} first.} + +\item{...}{arguments passed on to \code{\link[=first_isolate]{first_isolate()}} when using \code{\link[=filter_first_isolate]{filter_first_isolate()}}, otherwise arguments passed on to \code{\link[=key_antimicrobials]{key_antimicrobials()}} (such as \code{universal}, \code{gram_negative}, \code{gram_positive})} +} +\value{ +A \link{logical} vector +} +\description{ +Determine first isolates of all microorganisms of every patient per episode and (if needed) per specimen type. These functions support all four methods as summarised by Hindler \emph{et al.} in 2007 (\doi{10.1086/511864}). To determine patient episodes not necessarily based on microorganisms, use \code{\link[=is_new_episode]{is_new_episode()}} that also supports grouping with the \code{dplyr} package. +} +\details{ +To conduct epidemiological analyses on antimicrobial resistance data, only so-called first isolates should be included to prevent overestimation and underestimation of antimicrobial resistance. Different methods can be used to do so, see below. + +These functions are context-aware. This means that the \code{x} argument can be left blank if used inside a \link{data.frame} call, see \emph{Examples}. + +The \code{\link[=first_isolate]{first_isolate()}} function is a wrapper around the \code{\link[=is_new_episode]{is_new_episode()}} function, but more efficient for data sets containing microorganism codes or names. + +All isolates with a microbial ID of \code{NA} will be excluded as first isolate. +\subsection{Different methods}{ + +According to Hindler \emph{et al.} (2007, \doi{10.1086/511864}), there are different methods (algorithms) to select first isolates with increasing reliability: isolate-based, patient-based, episode-based and phenotype-based. All methods select on a combination of the taxonomic genus and species (not subspecies). + +All mentioned methods are covered in the \code{\link[=first_isolate]{first_isolate()}} function:\tabular{ll}{ + \strong{Method} \tab \strong{Function to apply} \cr + \strong{Isolate-based} \tab \code{first_isolate(x, method = "isolate-based")} \cr + \emph{(= all isolates)} \tab \cr + \tab \cr + \tab \cr + \strong{Patient-based} \tab \code{first_isolate(x, method = "patient-based")} \cr + \emph{(= first isolate per patient)} \tab \cr + \tab \cr + \tab \cr + \strong{Episode-based} \tab \code{first_isolate(x, method = "episode-based")}, or: \cr + \emph{(= first isolate per episode)} \tab \cr + - 7-Day interval from initial isolate \tab - \code{first_isolate(x, method = "e", episode_days = 7)} \cr + - 30-Day interval from initial isolate \tab - \code{first_isolate(x, method = "e", episode_days = 30)} \cr + \tab \cr + \tab \cr + \strong{Phenotype-based} \tab \code{first_isolate(x, method = "phenotype-based")}, or: \cr + \emph{(= first isolate per phenotype)} \tab \cr + - Major difference in any antimicrobial result \tab - \code{first_isolate(x, type = "points")} \cr + - Any difference in key antimicrobial results \tab - \code{first_isolate(x, type = "keyantimicrobials")} \cr +} + +} + +\subsection{Isolate-based}{ + +This method does not require any selection, as all isolates should be included. It does, however, respect all arguments set in the \code{\link[=first_isolate]{first_isolate()}} function. For example, the default setting for \code{include_unknown} (\code{FALSE}) will omit selection of rows without a microbial ID. +} + +\subsection{Patient-based}{ + +To include every genus-species combination per patient once, set the \code{episode_days} to \code{Inf}. Although often inappropriate, this method makes sure that no duplicate isolates are selected from the same patient. In a large longitudinal data set, this could mean that isolates are \emph{excluded} that were found years after the initial isolate. +} + +\subsection{Episode-based}{ + +To include every genus-species combination per patient episode once, set the \code{episode_days} to a sensible number of days. Depending on the type of analysis, this could be 14, 30, 60 or 365. Short episodes are common for analysing specific hospital or ward data, long episodes are common for analysing regional and national data. + +This is the most common method to correct for duplicate isolates. Patients are categorised into episodes based on their ID and dates (e.g., the date of specimen receipt or laboratory result). While this is a common method, it does not take into account antimicrobial test results. This means that e.g. a methicillin-resistant \emph{Staphylococcus aureus} (MRSA) isolate cannot be differentiated from a wildtype \emph{Staphylococcus aureus} isolate. +} + +\subsection{Phenotype-based}{ + +This is a more reliable method, since it also \emph{weighs} the antibiogram (antimicrobial test results) yielding so-called 'first weighted isolates'. There are two different methods to weigh the antibiogram: +\enumerate{ +\item Using \code{type = "points"} and argument \code{points_threshold} (default) + +This method weighs \emph{all} antimicrobial drugs available in the data set. Any difference from I to S or R (or vice versa) counts as \code{0.5} points, a difference from S to R (or vice versa) counts as \code{1} point. When the sum of points exceeds \code{points_threshold}, which defaults to \code{2}, an isolate will be selected as a first weighted isolate. + +All antimicrobials are internally selected using the \code{\link[=all_antimicrobials]{all_antimicrobials()}} function. The output of this function does not need to be passed to the \code{\link[=first_isolate]{first_isolate()}} function. +\item Using \code{type = "keyantimicrobials"} and argument \code{ignore_I} + +This method only weighs specific antimicrobial drugs, called \emph{key antimicrobials}. Any difference from S to R (or vice versa) in these key antimicrobials will select an isolate as a first weighted isolate. With \code{ignore_I = FALSE}, also differences from I to S or R (or vice versa) will lead to this. + +Key antimicrobials are internally selected using the \code{\link[=key_antimicrobials]{key_antimicrobials()}} function, but can also be added manually as a variable to the data and set in the \code{col_keyantimicrobials} argument. Another option is to pass the output of the \code{\link[=key_antimicrobials]{key_antimicrobials()}} function directly to the \code{col_keyantimicrobials} argument. +} + +The default method is phenotype-based (using \code{type = "points"}) and episode-based (using \code{episode_days = 365}). This makes sure that every genus-species combination is selected per patient once per year, while taking into account all antimicrobial test results. If no antimicrobial test results are available in the data set, only the episode-based method is applied at default. +} +} +\examples{ +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates. + +example_isolates[first_isolate(info = TRUE), ] +\donttest{ +# get all first Gram-negatives +example_isolates[which(first_isolate(info = FALSE) & mo_is_gram_negative()), ] + +if (require("dplyr")) { + # filter on first isolates using dplyr: + example_isolates \%>\% + filter(first_isolate(info = TRUE)) +} +if (require("dplyr")) { + # short-hand version: + example_isolates \%>\% + filter_first_isolate(info = FALSE) +} +if (require("dplyr")) { + # flag the first isolates per group: + example_isolates \%>\% + group_by(ward) \%>\% + mutate(first = first_isolate(info = TRUE)) \%>\% + select(ward, date, patient, mo, first) +} +} +} +\seealso{ +\code{\link[=key_antimicrobials]{key_antimicrobials()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/g.test.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/g.test.R +\name{g.test} +\alias{g.test} +\title{\emph{G}-test for Count Data} +\source{ +The code for this function is identical to that of \code{\link[=chisq.test]{chisq.test()}}, except that: +\itemize{ +\item The calculation of the statistic was changed to \eqn{2 * sum(x * log(x / E))} +\item Yates' continuity correction was removed as it does not apply to a \emph{G}-test +\item The possibility to simulate p values with \code{simulate.p.value} was removed +} +} +\usage{ +g.test(x, y = NULL, p = rep(1/length(x), length(x)), rescale.p = FALSE) +} +\arguments{ +\item{x}{a numeric vector or matrix. \code{x} and \code{y} can also + both be factors.} + +\item{y}{a numeric vector; ignored if \code{x} is a matrix. If + \code{x} is a factor, \code{y} should be a factor of the same length.} + +\item{p}{a vector of probabilities of the same length as \code{x}. + An error is given if any entry of \code{p} is negative.} + +\item{rescale.p}{a logical scalar; if TRUE then \code{p} is rescaled + (if necessary) to sum to 1. If \code{rescale.p} is FALSE, and + \code{p} does not sum to 1, an error is given.} +} +\value{ +A list with class \code{"htest"} containing the following + components: + \item{statistic}{the value the chi-squared test statistic.} + \item{parameter}{the degrees of freedom of the approximate + chi-squared distribution of the test statistic, \code{NA} if the + p-value is computed by Monte Carlo simulation.} + \item{p.value}{the p-value for the test.} + \item{method}{a character string indicating the type of test + performed, and whether Monte Carlo simulation or continuity + correction was used.} + \item{data.name}{a character string giving the name(s) of the data.} + \item{observed}{the observed counts.} + \item{expected}{the expected counts under the null hypothesis.} + \item{residuals}{the Pearson residuals, + \code{(observed - expected) / sqrt(expected)}.} + \item{stdres}{standardized residuals, + \code{(observed - expected) / sqrt(V)}, where \code{V} is the + residual cell variance (Agresti, 2007, section 2.4.5 + for the case where \code{x} is a matrix, \code{n * p * (1 - p)} otherwise).} +} +\description{ +\code{\link[=g.test]{g.test()}} performs chi-squared contingency table tests and goodness-of-fit tests, just like \code{\link[=chisq.test]{chisq.test()}} but is more reliable (1). A \emph{G}-test can be used to see whether the number of observations in each category fits a theoretical expectation (called a \strong{\emph{G}-test of goodness-of-fit}), or to see whether the proportions of one variable are different for different values of the other variable (called a \strong{\emph{G}-test of independence}). +} +\details{ +If \code{x} is a \link{matrix} with one row or column, or if \code{x} is a vector and \code{y} is not given, then a \emph{goodness-of-fit test} is performed (\code{x} is treated as a one-dimensional contingency table). The entries of \code{x} must be non-negative integers. In this case, the hypothesis tested is whether the population probabilities equal those in \code{p}, or are all equal if \code{p} is not given. + +If \code{x} is a \link{matrix} with at least two rows and columns, it is taken as a two-dimensional contingency table: the entries of \code{x} must be non-negative integers. Otherwise, \code{x} and \code{y} must be vectors or factors of the same length; cases with missing values are removed, the objects are coerced to factors, and the contingency table is computed from these. Then Pearson's chi-squared test is performed of the null hypothesis that the joint distribution of the cell counts in a 2-dimensional contingency table is the product of the row and column marginals. + +The p-value is computed from the asymptotic chi-squared distribution of the test statistic. + +In the contingency table case simulation is done by random sampling from the set of all contingency tables with given marginals, and works only if the marginals are strictly positive. Note that this is not the usual sampling situation assumed for a chi-squared test (such as the \emph{G}-test) but rather that for Fisher's exact test. + +In the goodness-of-fit case simulation is done by random sampling from the discrete distribution specified by \code{p}, each sample being of size \code{n = sum(x)}. This simulation is done in \R and may be slow. +\subsection{\emph{G}-test Of Goodness-of-Fit (Likelihood Ratio Test)}{ + +Use the \emph{G}-test of goodness-of-fit when you have one nominal variable with two or more values (such as male and female, or red, pink and white flowers). You compare the observed counts of numbers of observations in each category with the expected counts, which you calculate using some kind of theoretical expectation (such as a 1:1 sex ratio or a 1:2:1 ratio in a genetic cross). + +If the expected number of observations in any category is too small, the \emph{G}-test may give inaccurate results, and you should use an exact test instead (\code{\link[=fisher.test]{fisher.test()}}). + +The \emph{G}-test of goodness-of-fit is an alternative to the chi-square test of goodness-of-fit (\code{\link[=chisq.test]{chisq.test()}}); each of these tests has some advantages and some disadvantages, and the results of the two tests are usually very similar. +} + +\subsection{\emph{G}-test of Independence}{ + +Use the \emph{G}-test of independence when you have two nominal variables, each with two or more possible values. You want to know whether the proportions for one variable are different among values of the other variable. + +It is also possible to do a \emph{G}-test of independence with more than two nominal variables. For example, Jackson et al. (2013) also had data for children under 3, so you could do an analysis of old vs. young, thigh vs. arm, and reaction vs. no reaction, all analyzed together. + +Fisher's exact test (\code{\link[=fisher.test]{fisher.test()}}) is an \strong{exact} test, where the \emph{G}-test is still only an \strong{approximation}. For any 2x2 table, Fisher's Exact test may be slower but will still run in seconds, even if the sum of your observations is multiple millions. + +The \emph{G}-test of independence is an alternative to the chi-square test of independence (\code{\link[=chisq.test]{chisq.test()}}), and they will give approximately the same results. +} + +\subsection{How the Test Works}{ + +Unlike the exact test of goodness-of-fit (\code{\link[=fisher.test]{fisher.test()}}), the \emph{G}-test does not directly calculate the probability of obtaining the observed results or something more extreme. Instead, like almost all statistical tests, the \emph{G}-test has an intermediate step; it uses the data to calculate a test statistic that measures how far the observed data are from the null expectation. You then use a mathematical relationship, in this case the chi-square distribution, to estimate the probability of obtaining that value of the test statistic. + +The \emph{G}-test uses the log of the ratio of two likelihoods as the test statistic, which is why it is also called a likelihood ratio test or log-likelihood ratio test. The formula to calculate a \emph{G}-statistic is: + +\eqn{G = 2 * sum(x * log(x / E))} + +where \code{E} are the expected values. Since this is chi-square distributed, the p value can be calculated in \R with: + +\if{html}{\out{
}}\preformatted{p <- stats::pchisq(G, df, lower.tail = FALSE) +}\if{html}{\out{
}} + +where \code{df} are the degrees of freedom. + +If there are more than two categories and you want to find out which ones are significantly different from their null expectation, you can use the same method of testing each category vs. the sum of all categories, with the Bonferroni correction. You use \emph{G}-tests for each category, of course. +} +} +\examples{ +# = EXAMPLE 1 = +# Shivrain et al. (2006) crossed clearfield rice (which are resistant +# to the herbicide imazethapyr) with red rice (which are susceptible to +# imazethapyr). They then crossed the hybrid offspring and examined the +# F2 generation, where they found 772 resistant plants, 1611 moderately +# resistant plants, and 737 susceptible plants. If resistance is controlled +# by a single gene with two co-dominant alleles, you would expect a 1:2:1 +# ratio. + +x <- c(772, 1611, 737) +g.test(x, p = c(1, 2, 1) / 4) + +# There is no significant difference from a 1:2:1 ratio. +# Meaning: resistance controlled by a single gene with two co-dominant +# alleles, is plausible. + + +# = EXAMPLE 2 = +# Red crossbills (Loxia curvirostra) have the tip of the upper bill either +# right or left of the lower bill, which helps them extract seeds from pine +# cones. Some have hypothesized that frequency-dependent selection would +# keep the number of right and left-billed birds at a 1:1 ratio. Groth (1992) +# observed 1752 right-billed and 1895 left-billed crossbills. + +x <- c(1752, 1895) +g.test(x) + +# There is a significant difference from a 1:1 ratio. +# Meaning: there are significantly more left-billed birds. +} +\references{ +\enumerate{ +\item McDonald, J.H. 2014. \strong{Handbook of Biological Statistics (3rd ed.)}. Sparky House Publishing, Baltimore, Maryland. \url{http://www.biostathandbook.com/gtestgof.html}. +} +} +\seealso{ +\code{\link[=chisq.test]{chisq.test()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/get_episode.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/get_episode.R +\name{get_episode} +\alias{get_episode} +\alias{is_new_episode} +\title{Determine Clinical or Epidemic Episodes} +\usage{ +get_episode(x, episode_days = NULL, case_free_days = NULL, ...) + +is_new_episode(x, episode_days = NULL, case_free_days = NULL, ...) +} +\arguments{ +\item{x}{vector of dates (class \code{Date} or \code{POSIXt}), will be sorted internally to determine episodes} + +\item{episode_days}{episode length in days to specify the time period after which a new episode begins, can also be less than a day or \code{Inf}, see \emph{Details}} + +\item{case_free_days}{(inter-epidemic) interval length in days after which a new episode will start, can also be less than a day or \code{Inf}, see \emph{Details}} + +\item{...}{ignored, only in place to allow future extensions} +} +\value{ +\itemize{ +\item \code{\link[=get_episode]{get_episode()}}: an \link{integer} vector +\item \code{\link[=is_new_episode]{is_new_episode()}}: a \link{logical} vector +} +} +\description{ +These functions determine which items in a vector can be considered (the start of) a new episode. This can be used to determine clinical episodes for any epidemiological analysis. The \code{\link[=get_episode]{get_episode()}} function returns the index number of the episode per group, while the \code{\link[=is_new_episode]{is_new_episode()}} function returns \code{TRUE} for every new \code{\link[=get_episode]{get_episode()}} index. Both absolute and relative episode determination are supported. +} +\details{ +Episodes can be determined in two ways: absolute and relative. +\enumerate{ +\item Absolute + +This method uses \code{episode_days} to define an episode length in days, after which a new episode will start. A common use case in AMR data analysis is microbial epidemiology: episodes of \emph{S. aureus} bacteraemia in ICU patients for example. The episode length could then be 30 days, so that new \emph{S. aureus} isolates after an ICU episode of 30 days will be considered a different (or new) episode. + +Thus, this method counts \strong{since the start of the previous episode}. +\item Relative + +This method uses \code{case_free_days} to quantify the duration of case-free days (the inter-epidemic interval), after which a new episode will start. A common use case is infectious disease epidemiology: episodes of norovirus outbreaks in a hospital for example. The case-free period could then be 14 days, so that new norovirus cases after that time will be considered a different (or new) episode. + +Thus, this methods counts \strong{since the last case in the previous episode}. +} + +In a table:\tabular{ccc}{ + Date \tab Using \code{episode_days = 7} \tab Using \code{case_free_days = 7} \cr + 2023-01-01 \tab 1 \tab 1 \cr + 2023-01-02 \tab 1 \tab 1 \cr + 2023-01-05 \tab 1 \tab 1 \cr + 2023-01-08 \tab 2** \tab 1 \cr + 2023-02-21 \tab 3 \tab 2*** \cr + 2023-02-22 \tab 3 \tab 2 \cr + 2023-02-23 \tab 3 \tab 2 \cr + 2023-02-24 \tab 3 \tab 2 \cr + 2023-03-01 \tab 4 \tab 2 \cr +} + + +** This marks the start of a new episode, because 8 January 2023 is more than 7 days since the start of the previous episode (1 January 2023). \cr +*** This marks the start of a new episode, because 21 January 2023 is more than 7 days since the last case in the previous episode (8 January 2023). + +Either \code{episode_days} or \code{case_free_days} must be provided in the function. +\subsection{Difference between \code{get_episode()} and \code{is_new_episode()}}{ + +The \code{\link[=get_episode]{get_episode()}} function returns the index number of the episode, so all cases/patients/isolates in the first episode will have the number 1, all cases/patients/isolates in the second episode will have the number 2, etc. + +The \code{\link[=is_new_episode]{is_new_episode()}} function on the other hand, returns \code{TRUE} for every new \code{\link[=get_episode]{get_episode()}} index. + +To specify, when setting \code{episode_days = 365} (using method 1 as explained above), this is how the two functions differ:\tabular{cccc}{ + patient \tab date \tab \code{get_episode()} \tab \code{is_new_episode()} \cr + A \tab 2019-01-01 \tab 1 \tab TRUE \cr + A \tab 2019-03-01 \tab 1 \tab FALSE \cr + A \tab 2021-01-01 \tab 2 \tab TRUE \cr + B \tab 2008-01-01 \tab 1 \tab TRUE \cr + B \tab 2008-01-01 \tab 1 \tab FALSE \cr + C \tab 2020-01-01 \tab 1 \tab TRUE \cr +} + +} + +\subsection{Other}{ + +The \code{\link[=first_isolate]{first_isolate()}} function is a wrapper around the \code{\link[=is_new_episode]{is_new_episode()}} function, but is more efficient for data sets containing microorganism codes or names and allows for different isolate selection methods. + +The \code{dplyr} package is not required for these functions to work, but these episode functions do support \link[dplyr:group_by]{variable grouping} and work conveniently inside \code{dplyr} verbs such as \code{\link[dplyr:filter]{filter()}}, \code{\link[dplyr:mutate]{mutate()}} and \code{\link[dplyr:summarise]{summarise()}}. +} +} +\examples{ +# difference between absolute and relative determination of episodes: +x <- data.frame(dates = as.Date(c( + "2021-01-01", + "2021-01-02", + "2021-01-05", + "2021-01-08", + "2021-02-21", + "2021-02-22", + "2021-02-23", + "2021-02-24", + "2021-03-01", + "2021-03-01" +))) +x$absolute <- get_episode(x$dates, episode_days = 7) +x$relative <- get_episode(x$dates, case_free_days = 7) +x + + +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates +df <- example_isolates[sample(seq_len(2000), size = 100), ] + +get_episode(df$date, episode_days = 60) # indices +is_new_episode(df$date, episode_days = 60) # TRUE/FALSE + +# filter on results from the third 60-day episode only, using base R +df[which(get_episode(df$date, 60) == 3), ] + +# the functions also work for less than a day, e.g. to include one per hour: +get_episode( + c( + Sys.time(), + Sys.time() + 60 * 60 + ), + episode_days = 1 / 24 +) + +\donttest{ +if (require("dplyr")) { + # is_new_episode() can also be used in dplyr verbs to determine patient + # episodes based on any (combination of) grouping variables: + df \%>\% + mutate(condition = sample( + x = c("A", "B", "C"), + size = 100, + replace = TRUE + )) \%>\% + group_by(patient, condition) \%>\% + mutate(new_episode = is_new_episode(date, 365)) \%>\% + select(patient, date, condition, new_episode) \%>\% + arrange(patient, condition, date) +} + +if (require("dplyr")) { + df \%>\% + group_by(ward, patient) \%>\% + transmute(date, + patient, + new_index = get_episode(date, 60), + new_logical = is_new_episode(date, 60) + ) \%>\% + arrange(patient, ward, date) +} + +if (require("dplyr")) { + df \%>\% + group_by(ward) \%>\% + summarise( + n_patients = n_distinct(patient), + n_episodes_365 = sum(is_new_episode(date, episode_days = 365)), + n_episodes_60 = sum(is_new_episode(date, episode_days = 60)), + n_episodes_30 = sum(is_new_episode(date, episode_days = 30)) + ) +} + +# grouping on patients and microorganisms leads to the same +# results as first_isolate() when using 'episode-based': +if (require("dplyr")) { + x <- df \%>\% + filter_first_isolate( + include_unknown = TRUE, + method = "episode-based" + ) + + y <- df \%>\% + group_by(patient, mo) \%>\% + filter(is_new_episode(date, 365)) \%>\% + ungroup() + + identical(x, y) +} + +# but is_new_episode() has a lot more flexibility than first_isolate(), +# since you can now group on anything that seems relevant: +if (require("dplyr")) { + df \%>\% + group_by(patient, mo, ward) \%>\% + mutate(flag_episode = is_new_episode(date, 365)) \%>\% + select(group_vars(.), flag_episode) +} +} +} +\seealso{ +\code{\link[=first_isolate]{first_isolate()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/ggplot_pca.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ggplot_pca.R +\name{ggplot_pca} +\alias{ggplot_pca} +\title{PCA Biplot with \code{ggplot2}} +\source{ +The \code{\link[=ggplot_pca]{ggplot_pca()}} function is based on the \code{ggbiplot()} function from the \code{ggbiplot} package by Vince Vu, as found on GitHub: \url{https://github.com/vqv/ggbiplot} (retrieved: 2 March 2020, their latest commit: \href{https://github.com/vqv/ggbiplot/commit/7325e880485bea4c07465a0304c470608fffb5d9}{\code{7325e88}}; 12 February 2015). + +As per their GPL-2 licence that demands documentation of code changes, the changes made based on the source code were: +\enumerate{ +\item Rewritten code to remove the dependency on packages \code{plyr}, \code{scales} and \code{grid} +\item Parametrised more options, like arrow and ellipse settings +\item Hardened all input possibilities by defining the exact type of user input for every argument +\item Added total amount of explained variance as a caption in the plot +\item Cleaned all syntax based on the \code{lintr} package, fixed grammatical errors and added integrity checks +\item Updated documentation +} +} +\usage{ +ggplot_pca( + x, + choices = 1:2, + scale = 1, + pc.biplot = TRUE, + labels = NULL, + labels_textsize = 3, + labels_text_placement = 1.5, + groups = NULL, + ellipse = TRUE, + ellipse_prob = 0.68, + ellipse_size = 0.5, + ellipse_alpha = 0.5, + points_size = 2, + points_alpha = 0.25, + arrows = TRUE, + arrows_colour = "darkblue", + arrows_size = 0.5, + arrows_textsize = 3, + arrows_textangled = TRUE, + arrows_alpha = 0.75, + base_textsize = 10, + ... +) +} +\arguments{ +\item{x}{an object returned by \code{\link[=pca]{pca()}}, \code{\link[=prcomp]{prcomp()}} or \code{\link[=princomp]{princomp()}}} + +\item{choices}{ + length 2 vector specifying the components to plot. Only the default + is a biplot in the strict sense. + } + +\item{scale}{ + The variables are scaled by \code{lambda ^ scale} and the + observations are scaled by \code{lambda ^ (1-scale)} where + \code{lambda} are the singular values as computed by + \code{\link[stats]{princomp}}. Normally \code{0 <= scale <= 1}, and a warning + will be issued if the specified \code{scale} is outside this range. + } + +\item{pc.biplot}{ + If true, use what Gabriel (1971) refers to as a "principal component + biplot", with \code{lambda = 1} and observations scaled up by sqrt(n) and + variables scaled down by sqrt(n). Then inner products between + variables approximate covariances and distances between observations + approximate Mahalanobis distance. + } + +\item{labels}{an optional vector of labels for the observations. If set, the labels will be placed below their respective points. When using the \code{\link[=pca]{pca()}} function as input for \code{x}, this will be determined automatically based on the attribute \code{non_numeric_cols}, see \code{\link[=pca]{pca()}}.} + +\item{labels_textsize}{the size of the text used for the labels} + +\item{labels_text_placement}{adjustment factor the placement of the variable names (\verb{>=1} means further away from the arrow head)} + +\item{groups}{an optional vector of groups for the labels, with the same length as \code{labels}. If set, the points and labels will be coloured according to these groups. When using the \code{\link[=pca]{pca()}} function as input for \code{x}, this will be determined automatically based on the attribute \code{non_numeric_cols}, see \code{\link[=pca]{pca()}}.} + +\item{ellipse}{a \link{logical} to indicate whether a normal data ellipse should be drawn for each group (set with \code{groups})} + +\item{ellipse_prob}{statistical size of the ellipse in normal probability} + +\item{ellipse_size}{the size of the ellipse line} + +\item{ellipse_alpha}{the alpha (transparency) of the ellipse line} + +\item{points_size}{the size of the points} + +\item{points_alpha}{the alpha (transparency) of the points} + +\item{arrows}{a \link{logical} to indicate whether arrows should be drawn} + +\item{arrows_colour}{the colour of the arrow and their text} + +\item{arrows_size}{the size (thickness) of the arrow lines} + +\item{arrows_textsize}{the size of the text at the end of the arrows} + +\item{arrows_textangled}{a \link{logical} whether the text at the end of the arrows should be angled} + +\item{arrows_alpha}{the alpha (transparency) of the arrows and their text} + +\item{base_textsize}{the text size for all plot elements except the labels and arrows} + +\item{...}{arguments passed on to functions} +} +\description{ +Produces a \code{ggplot2} variant of a so-called \href{https://en.wikipedia.org/wiki/Biplot}{biplot} for PCA (principal component analysis), but is more flexible and more appealing than the base \R \code{\link[=biplot]{biplot()}} function. +} +\details{ +The colours for labels and points can be changed by adding another scale layer for colour, such as \code{scale_colour_viridis_d()} and \code{scale_colour_brewer()}. +} +\examples{ +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates. + +\donttest{ +if (require("dplyr")) { + # calculate the resistance per group first + resistance_data <- example_isolates \%>\% + group_by( + order = mo_order(mo), # group on anything, like order + genus = mo_genus(mo) + ) \%>\% # and genus as we do here; + filter(n() >= 30) \%>\% # filter on only 30 results per group + summarise_if(is.sir, resistance) # then get resistance of all drugs + + # now conduct PCA for certain antimicrobial drugs + pca_result <- resistance_data \%>\% + pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT) + + summary(pca_result) + + # old base R plotting method: + biplot(pca_result, main = "Base R biplot") + + # new ggplot2 plotting method using this package: + if (require("ggplot2")) { + ggplot_pca(pca_result) + + labs(title = "ggplot2 biplot") + } + if (require("ggplot2")) { + # still extendible with any ggplot2 function + ggplot_pca(pca_result) + + scale_colour_viridis_d() + + labs(title = "ggplot2 biplot") + } +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/ggplot_sir.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/ggplot_sir.R +\name{ggplot_sir} +\alias{ggplot_sir} +\alias{geom_sir} +\alias{facet_sir} +\alias{scale_y_percent} +\alias{scale_sir_colours} +\alias{theme_sir} +\alias{labels_sir_count} +\title{AMR Plots with \code{ggplot2}} +\usage{ +ggplot_sir( + data, + position = NULL, + x = "antibiotic", + fill = "interpretation", + facet = NULL, + breaks = seq(0, 1, 0.1), + limits = NULL, + translate_ab = "name", + combine_SI = TRUE, + minimum = 30, + language = get_AMR_locale(), + nrow = NULL, + colours = c(S = "#3CAEA3", SI = "#3CAEA3", I = "#F6D55C", IR = "#ED553B", R = + "#ED553B"), + datalabels = TRUE, + datalabels.size = 2.5, + datalabels.colour = "grey15", + title = NULL, + subtitle = NULL, + caption = NULL, + x.title = "Antimicrobial", + y.title = "Proportion", + ... +) + +geom_sir( + position = NULL, + x = c("antibiotic", "interpretation"), + fill = "interpretation", + translate_ab = "name", + minimum = 30, + language = get_AMR_locale(), + combine_SI = TRUE, + ... +) + +facet_sir(facet = c("interpretation", "antibiotic"), nrow = NULL) + +scale_y_percent(breaks = seq(0, 1, 0.1), limits = NULL) + +scale_sir_colours(..., aesthetics = "fill") + +theme_sir() + +labels_sir_count( + position = NULL, + x = "antibiotic", + translate_ab = "name", + minimum = 30, + language = get_AMR_locale(), + combine_SI = TRUE, + datalabels.size = 3, + datalabels.colour = "grey15" +) +} +\arguments{ +\item{data}{a \link{data.frame} with column(s) of class \code{\link{sir}} (see \code{\link[=as.sir]{as.sir()}})} + +\item{position}{position adjustment of bars, either \code{"fill"}, \code{"stack"} or \code{"dodge"}} + +\item{x}{variable to show on x axis, either \code{"antibiotic"} (default) or \code{"interpretation"} or a grouping variable} + +\item{fill}{variable to categorise using the plots legend, either \code{"antibiotic"} (default) or \code{"interpretation"} or a grouping variable} + +\item{facet}{variable to split plots by, either \code{"interpretation"} (default) or \code{"antibiotic"} or a grouping variable} + +\item{breaks}{a \link{numeric} vector of positions} + +\item{limits}{a \link{numeric} vector of length two providing limits of the scale, use \code{NA} to refer to the existing minimum or maximum} + +\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}} + +\item{combine_SI}{a \link{logical} to indicate whether all values of S, SDD, and I must be merged into one, so the output only consists of S+SDD+I vs. R (susceptible vs. resistant) - the default is \code{TRUE}} + +\item{minimum}{the minimum allowed number of available (tested) isolates. Any isolate count lower than \code{minimum} will return \code{NA} with a warning. The default number of \code{30} isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see \emph{Source}.} + +\item{language}{language of the returned text - the default is the current system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{nrow}{(when using \code{facet}) number of rows} + +\item{colours}{a named vactor with colour to be used for filling. The default colours are colour-blind friendly.} + +\item{datalabels}{show datalabels using \code{\link[=labels_sir_count]{labels_sir_count()}}} + +\item{datalabels.size}{size of the datalabels} + +\item{datalabels.colour}{colour of the datalabels} + +\item{title}{text to show as title of the plot} + +\item{subtitle}{text to show as subtitle of the plot} + +\item{caption}{text to show as caption of the plot} + +\item{x.title}{text to show as x axis description} + +\item{y.title}{text to show as y axis description} + +\item{...}{other arguments passed on to \code{\link[=geom_sir]{geom_sir()}} or, in case of \code{\link[=scale_sir_colours]{scale_sir_colours()}}, named values to set colours. The default colours are colour-blind friendly, while maintaining the convention that e.g. 'susceptible' should be green and 'resistant' should be red. See \emph{Examples}.} + +\item{aesthetics}{aesthetics to apply the colours to - the default is "fill" but can also be (a combination of) "alpha", "colour", "fill", "linetype", "shape" or "size"} +} +\description{ +Use these functions to create bar plots for AMR data analysis. All functions rely on \link[ggplot2:ggplot]{ggplot2} functions. +} +\details{ +At default, the names of antibiotics will be shown on the plots using \code{\link[=ab_name]{ab_name()}}. This can be set with the \code{translate_ab} argument. See \code{\link[=count_df]{count_df()}}. +\subsection{The Functions}{ + +\code{\link[=geom_sir]{geom_sir()}} will take any variable from the data that has an \code{\link{sir}} class (created with \code{\link[=as.sir]{as.sir()}}) using \code{\link[=sir_df]{sir_df()}} and will plot bars with the percentage S, I, and R. The default behaviour is to have the bars stacked and to have the different antibiotics on the x axis. + +\code{\link[=facet_sir]{facet_sir()}} creates 2d plots (at default based on S/I/R) using \code{\link[ggplot2:facet_wrap]{ggplot2::facet_wrap()}}. + +\code{\link[=scale_y_percent]{scale_y_percent()}} transforms the y axis to a 0 to 100\% range using \code{\link[ggplot2:scale_continuous]{ggplot2::scale_y_continuous()}}. + +\code{\link[=scale_sir_colours]{scale_sir_colours()}} sets colours to the bars (green for S, yellow for I, and red for R). with multilingual support. The default colours are colour-blind friendly, while maintaining the convention that e.g. 'susceptible' should be green and 'resistant' should be red. + +\code{\link[=theme_sir]{theme_sir()}} is a [ggplot2 theme][\code{\link[ggplot2:theme]{ggplot2::theme()}} with minimal distraction. + +\code{\link[=labels_sir_count]{labels_sir_count()}} print datalabels on the bars with percentage and amount of isolates using \code{\link[ggplot2:geom_text]{ggplot2::geom_text()}}. + +\code{\link[=ggplot_sir]{ggplot_sir()}} is a wrapper around all above functions that uses data as first input. This makes it possible to use this function after a pipe (\verb{\%>\%}). See \emph{Examples}. +} +} +\examples{ +\donttest{ +if (require("ggplot2") && require("dplyr")) { + # get antimicrobial results for drugs against a UTI: + ggplot(example_isolates \%>\% select(AMX, NIT, FOS, TMP, CIP)) + + geom_sir() +} +if (require("ggplot2") && require("dplyr")) { + # prettify the plot using some additional functions: + df <- example_isolates \%>\% select(AMX, NIT, FOS, TMP, CIP) + ggplot(df) + + geom_sir() + + scale_y_percent() + + scale_sir_colours() + + labels_sir_count() + + theme_sir() +} +if (require("ggplot2") && require("dplyr")) { + # or better yet, simplify this using the wrapper function - a single command: + example_isolates \%>\% + select(AMX, NIT, FOS, TMP, CIP) \%>\% + ggplot_sir() +} +if (require("ggplot2") && require("dplyr")) { + # get only proportions and no counts: + example_isolates \%>\% + select(AMX, NIT, FOS, TMP, CIP) \%>\% + ggplot_sir(datalabels = FALSE) +} +if (require("ggplot2") && require("dplyr")) { + # add other ggplot2 arguments as you like: + example_isolates \%>\% + select(AMX, NIT, FOS, TMP, CIP) \%>\% + ggplot_sir( + width = 0.5, + colour = "black", + size = 1, + linetype = 2, + alpha = 0.25 + ) +} +if (require("ggplot2") && require("dplyr")) { + # you can alter the colours with colour names: + example_isolates \%>\% + select(AMX) \%>\% + ggplot_sir(colours = c(SI = "yellow")) +} +if (require("ggplot2") && require("dplyr")) { + # but you can also use the built-in colour-blind friendly colours for + # your plots, where "S" is green, "I" is yellow and "R" is red: + data.frame( + x = c("Value1", "Value2", "Value3"), + y = c(1, 2, 3), + z = c("Value4", "Value5", "Value6") + ) \%>\% + ggplot() + + geom_col(aes(x = x, y = y, fill = z)) + + scale_sir_colours(Value4 = "S", Value5 = "I", Value6 = "R") +} +if (require("ggplot2") && require("dplyr")) { + # resistance of ciprofloxacine per age group + example_isolates \%>\% + mutate(first_isolate = first_isolate()) \%>\% + filter( + first_isolate == TRUE, + mo == as.mo("Escherichia coli") + ) \%>\% + # age_groups() is also a function in this AMR package: + group_by(age_group = age_groups(age)) \%>\% + select(age_group, CIP) \%>\% + ggplot_sir(x = "age_group") +} +if (require("ggplot2") && require("dplyr")) { + # a shorter version which also adjusts data label colours: + example_isolates \%>\% + select(AMX, NIT, FOS, TMP, CIP) \%>\% + ggplot_sir(colours = FALSE) +} +if (require("ggplot2") && require("dplyr")) { + # it also supports groups (don't forget to use the group var on `x` or `facet`): + example_isolates \%>\% + filter(mo_is_gram_negative(), ward != "Outpatient") \%>\% + # select only UTI-specific drugs + select(ward, AMX, NIT, FOS, TMP, CIP) \%>\% + group_by(ward) \%>\% + ggplot_sir( + x = "ward", + facet = "antibiotic", + nrow = 1, + title = "AMR of Anti-UTI Drugs Per Ward", + x.title = "Ward", + datalabels = FALSE + ) +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/guess_ab_col.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/guess_ab_col.R +\name{guess_ab_col} +\alias{guess_ab_col} +\title{Guess Antibiotic Column} +\usage{ +guess_ab_col( + x = NULL, + search_string = NULL, + verbose = FALSE, + only_sir_columns = FALSE +) +} +\arguments{ +\item{x}{a \link{data.frame}} + +\item{search_string}{a text to search \code{x} for, will be checked with \code{\link[=as.ab]{as.ab()}} if this value is not a column in \code{x}} + +\item{verbose}{a \link{logical} to indicate whether additional info should be printed} + +\item{only_sir_columns}{a \link{logical} to indicate whether only antibiotic columns must be detected that were transformed to class \code{sir} (see \code{\link[=as.sir]{as.sir()}}) on beforehand (default is \code{FALSE})} +} +\value{ +A column name of \code{x}, or \code{NULL} when no result is found. +} +\description{ +This tries to find a column name in a data set based on information from the \link{antibiotics} data set. Also supports WHONET abbreviations. +} +\details{ +You can look for an antibiotic (trade) name or abbreviation and it will search \code{x} and the \link{antibiotics} data set for any column containing a name or code of that antibiotic. +} +\examples{ +df <- data.frame( + amox = "S", + tetr = "R" +) + +guess_ab_col(df, "amoxicillin") +guess_ab_col(df, "J01AA07") # ATC code of tetracycline + +guess_ab_col(df, "J01AA07", verbose = TRUE) +# NOTE: Using column 'tetr' as input for J01AA07 (tetracycline). + +# WHONET codes +df <- data.frame( + AMP_ND10 = "R", + AMC_ED20 = "S" +) +guess_ab_col(df, "ampicillin") +guess_ab_col(df, "J01CR02") +guess_ab_col(df, as.ab("augmentin")) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/intrinsic_resistant.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{intrinsic_resistant} +\alias{intrinsic_resistant} +\title{Data Set with Bacterial Intrinsic Resistance} +\format{ +A \link[tibble:tibble]{tibble} with 301 583 observations and 2 variables: +\itemize{ +\item \code{mo}\cr Microorganism ID +\item \code{ab}\cr Antibiotic ID +} +} +\usage{ +intrinsic_resistant +} +\description{ +Data set containing defined intrinsic resistance by EUCAST of all bug-drug combinations. +} +\details{ +This data set is based on \href{https://www.eucast.org/expert_rules_and_expected_phenotypes}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021). +\subsection{Direct download}{ + +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. + +They \strong{allow for machine reading EUCAST and CLSI guidelines}, which is almost impossible with the MS Excel and PDF files distributed by EUCAST and CLSI. +} +} +\examples{ +intrinsic_resistant +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/italicise_taxonomy.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/italicise_taxonomy.R +\name{italicise_taxonomy} +\alias{italicise_taxonomy} +\alias{italicize_taxonomy} +\title{Italicise Taxonomic Families, Genera, Species, Subspecies} +\usage{ +italicise_taxonomy(string, type = c("markdown", "ansi", "html")) + +italicize_taxonomy(string, type = c("markdown", "ansi", "html")) +} +\arguments{ +\item{string}{a \link{character} (vector)} + +\item{type}{type of conversion of the taxonomic names, either "markdown", "html" or "ansi", see \emph{Details}} +} +\description{ +According to the binomial nomenclature, the lowest four taxonomic levels (family, genus, species, subspecies) should be printed in italics. This function finds taxonomic names within strings and makes them italic. +} +\details{ +This function finds the taxonomic names and makes them italic based on the \link{microorganisms} data set. + +The taxonomic names can be italicised using markdown (the default) by adding \code{*} before and after the taxonomic names, or \verb{} and \verb{} when using html. When using 'ansi', ANSI colours will be added using \verb{\\033[3m} before and \verb{\\033[23m} after the taxonomic names. If multiple ANSI colours are not available, no conversion will occur. + +This function also supports abbreviation of the genus if it is followed by a species, such as "E. coli" and "K. pneumoniae ozaenae". +} +\examples{ +italicise_taxonomy("An overview of Staphylococcus aureus isolates") +italicise_taxonomy("An overview of S. aureus isolates") + +cat(italicise_taxonomy("An overview of S. aureus isolates", type = "ansi")) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/join.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/join_microorganisms.R +\name{join} +\alias{join} +\alias{inner_join_microorganisms} +\alias{inner_join} +\alias{left_join_microorganisms} +\alias{right_join_microorganisms} +\alias{full_join_microorganisms} +\alias{semi_join_microorganisms} +\alias{anti_join_microorganisms} +\title{Join \link{microorganisms} to a Data Set} +\usage{ +inner_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...) + +left_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...) + +right_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...) + +full_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...) + +semi_join_microorganisms(x, by = NULL, ...) + +anti_join_microorganisms(x, by = NULL, ...) +} +\arguments{ +\item{x}{existing data set to join, or \link{character} vector. In case of a \link{character} vector, the resulting \link{data.frame} will contain a column 'x' with these values.} + +\item{by}{a variable to join by - if left empty will search for a column with class \code{\link{mo}} (created with \code{\link[=as.mo]{as.mo()}}) or will be \code{"mo"} if that column name exists in \code{x}, could otherwise be a column name of \code{x} with values that exist in \code{microorganisms$mo} (such as \code{by = "bacteria_id"}), or another column in \link{microorganisms} (but then it should be named, like \code{by = c("bacteria_id" = "fullname")})} + +\item{suffix}{if there are non-joined duplicate variables in \code{x} and \code{y}, these suffixes will be added to the output to disambiguate them. Should be a \link{character} vector of length 2.} + +\item{...}{ignored, only in place to allow future extensions} +} +\value{ +a \link{data.frame} +} +\description{ +Join the data set \link{microorganisms} easily to an existing data set or to a \link{character} vector. +} +\details{ +\strong{Note:} As opposed to the \code{join()} functions of \code{dplyr}, \link{character} vectors are supported and at default existing columns will get a suffix \code{"2"} and the newly joined columns will not get a suffix. + +If the \code{dplyr} package is installed, their join functions will be used. Otherwise, the much slower \code{\link[=merge]{merge()}} and \code{\link[=interaction]{interaction()}} functions from base \R will be used. +} +\examples{ +left_join_microorganisms(as.mo("K. pneumoniae")) +left_join_microorganisms("B_KLBSL_PNMN") + +df <- data.frame( + date = seq( + from = as.Date("2018-01-01"), + to = as.Date("2018-01-07"), + by = 1 + ), + bacteria = as.mo(c( + "S. aureus", "MRSA", "MSSA", "STAAUR", + "E. coli", "E. coli", "E. coli" + )), + stringsAsFactors = FALSE +) +colnames(df) + +df_joined <- left_join_microorganisms(df, "bacteria") +colnames(df_joined) + +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + left_join_microorganisms() \%>\% + colnames() +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/key_antimicrobials.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/key_antimicrobials.R +\name{key_antimicrobials} +\alias{key_antimicrobials} +\alias{all_antimicrobials} +\alias{antimicrobials_equal} +\title{(Key) Antimicrobials for First Weighted Isolates} +\usage{ +key_antimicrobials( + x = NULL, + col_mo = NULL, + universal = c("ampicillin", "amoxicillin/clavulanic acid", "cefuroxime", + "piperacillin/tazobactam", "ciprofloxacin", "trimethoprim/sulfamethoxazole"), + gram_negative = c("gentamicin", "tobramycin", "colistin", "cefotaxime", "ceftazidime", + "meropenem"), + gram_positive = c("vancomycin", "teicoplanin", "tetracycline", "erythromycin", + "oxacillin", "rifampin"), + antifungal = c("anidulafungin", "caspofungin", "fluconazole", "miconazole", "nystatin", + "voriconazole"), + only_sir_columns = FALSE, + ... +) + +all_antimicrobials(x = NULL, only_sir_columns = FALSE, ...) + +antimicrobials_equal( + y, + z, + type = c("points", "keyantimicrobials"), + ignore_I = TRUE, + points_threshold = 2, + ... +) +} +\arguments{ +\item{x}{a \link{data.frame} with antibiotics columns, like \code{AMX} or \code{amox}. Can be left blank to determine automatically} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{universal}{names of \strong{broad-spectrum} antimicrobial drugs, case-insensitive. Set to \code{NULL} to ignore. See \emph{Details} for the default antimicrobial drugs} + +\item{gram_negative}{names of antibiotic drugs for \strong{Gram-positives}, case-insensitive. Set to \code{NULL} to ignore. See \emph{Details} for the default antibiotic drugs} + +\item{gram_positive}{names of antibiotic drugs for \strong{Gram-negatives}, case-insensitive. Set to \code{NULL} to ignore. See \emph{Details} for the default antibiotic drugs} + +\item{antifungal}{names of antifungal drugs for \strong{fungi}, case-insensitive. Set to \code{NULL} to ignore. See \emph{Details} for the default antifungal drugs} + +\item{only_sir_columns}{a \link{logical} to indicate whether only columns must be included that were transformed to class \code{sir} (see \code{\link[=as.sir]{as.sir()}}) on beforehand (default is \code{FALSE})} + +\item{...}{ignored, only in place to allow future extensions} + +\item{y, z}{\link{character} vectors to compare} + +\item{type}{type to determine weighed isolates; can be \code{"keyantimicrobials"} or \code{"points"}, see \emph{Details}} + +\item{ignore_I}{\link{logical} to indicate whether antibiotic interpretations with \code{"I"} will be ignored when \code{type = "keyantimicrobials"}, see \emph{Details}} + +\item{points_threshold}{minimum number of points to require before differences in the antibiogram will lead to inclusion of an isolate when \code{type = "points"}, see \emph{Details}} +} +\description{ +These functions can be used to determine first weighted isolates by considering the phenotype for isolate selection (see \code{\link[=first_isolate]{first_isolate()}}). Using a phenotype-based method to determine first isolates is more reliable than methods that disregard phenotypes. +} +\details{ +The \code{\link[=key_antimicrobials]{key_antimicrobials()}} and \code{\link[=all_antimicrobials]{all_antimicrobials()}} functions are context-aware. This means that the \code{x} argument can be left blank if used inside a \link{data.frame} call, see \emph{Examples}. + +The function \code{\link[=key_antimicrobials]{key_antimicrobials()}} returns a \link{character} vector with 12 antimicrobial results for every isolate. The function \code{\link[=all_antimicrobials]{all_antimicrobials()}} returns a \link{character} vector with all antimicrobial drug results for every isolate. These vectors can then be compared using \code{\link[=antimicrobials_equal]{antimicrobials_equal()}}, to check if two isolates have generally the same antibiogram. Missing and invalid values are replaced with a dot (\code{"."}) by \code{\link[=key_antimicrobials]{key_antimicrobials()}} and ignored by \code{\link[=antimicrobials_equal]{antimicrobials_equal()}}. + +Please see the \code{\link[=first_isolate]{first_isolate()}} function how these important functions enable the 'phenotype-based' method for determination of first isolates. + +The default antimicrobial drugs used for \strong{all rows} (set in \code{universal}) are: +\itemize{ +\item Ampicillin +\item Amoxicillin/clavulanic acid +\item Cefuroxime +\item Ciprofloxacin +\item Piperacillin/tazobactam +\item Trimethoprim/sulfamethoxazole +} + +The default antimicrobial drugs used for \strong{Gram-negative bacteria} (set in \code{gram_negative}) are: +\itemize{ +\item Cefotaxime +\item Ceftazidime +\item Colistin +\item Gentamicin +\item Meropenem +\item Tobramycin +} + +The default antimicrobial drugs used for \strong{Gram-positive bacteria} (set in \code{gram_positive}) are: +\itemize{ +\item Erythromycin +\item Oxacillin +\item Rifampin +\item Teicoplanin +\item Tetracycline +\item Vancomycin +} + +The default antimicrobial drugs used for \strong{fungi} (set in \code{antifungal}) are: +\itemize{ +\item Anidulafungin +\item Caspofungin +\item Fluconazole +\item Miconazole +\item Nystatin +\item Voriconazole +} +} +\examples{ +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates. + +# output of the `key_antimicrobials()` function could be like this: +strainA <- "SSSRR.S.R..S" +strainB <- "SSSIRSSSRSSS" + +# those strings can be compared with: +antimicrobials_equal(strainA, strainB, type = "keyantimicrobials") +# TRUE, because I is ignored (as well as missing values) + +antimicrobials_equal(strainA, strainB, type = "keyantimicrobials", ignore_I = FALSE) +# FALSE, because I is not ignored and so the 4th [character] differs + +\donttest{ +if (require("dplyr")) { + # set key antibiotics to a new variable + my_patients <- example_isolates \%>\% + mutate(keyab = key_antimicrobials(antifungal = NULL)) \%>\% # no need to define `x` + mutate( + # now calculate first isolates + first_regular = first_isolate(col_keyantimicrobials = FALSE), + # and first WEIGHTED isolates + first_weighted = first_isolate(col_keyantimicrobials = "keyab") + ) + + # Check the difference in this data set, 'weighted' results in more isolates: + sum(my_patients$first_regular, na.rm = TRUE) + sum(my_patients$first_weighted, na.rm = TRUE) +} +} +} +\seealso{ +\code{\link[=first_isolate]{first_isolate()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/kurtosis.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/kurtosis.R +\name{kurtosis} +\alias{kurtosis} +\alias{kurtosis.default} +\alias{kurtosis.matrix} +\alias{kurtosis.data.frame} +\title{Kurtosis of the Sample} +\usage{ +kurtosis(x, na.rm = FALSE, excess = FALSE) + +\method{kurtosis}{default}(x, na.rm = FALSE, excess = FALSE) + +\method{kurtosis}{matrix}(x, na.rm = FALSE, excess = FALSE) + +\method{kurtosis}{data.frame}(x, na.rm = FALSE, excess = FALSE) +} +\arguments{ +\item{x}{a vector of values, a \link{matrix} or a \link{data.frame}} + +\item{na.rm}{a \link{logical} to indicate whether \code{NA} values should be stripped before the computation proceeds} + +\item{excess}{a \link{logical} to indicate whether the \emph{excess kurtosis} should be returned, defined as the kurtosis minus 3.} +} +\description{ +Kurtosis is a measure of the "tailedness" of the probability distribution of a real-valued random variable. A normal distribution has a kurtosis of 3 and a excess kurtosis of 0. +} +\examples{ +kurtosis(rnorm(10000)) +kurtosis(rnorm(10000), excess = TRUE) +} +\seealso{ +\code{\link[=skewness]{skewness()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/like.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/like.R +\name{like} +\alias{like} +\alias{\%like\%} +\alias{\%unlike\%} +\alias{\%like_case\%} +\alias{\%unlike_case\%} +\title{Vectorised Pattern Matching with Keyboard Shortcut} +\source{ +Idea from the \href{https://github.com/Rdatatable/data.table/blob/ec1259af1bf13fc0c96a1d3f9e84d55d8106a9a4/R/like.R}{\code{like} function from the \code{data.table} package}, although altered as explained in \emph{Details}. +} +\usage{ +like(x, pattern, ignore.case = TRUE) + +x \%like\% pattern + +x \%unlike\% pattern + +x \%like_case\% pattern + +x \%unlike_case\% pattern +} +\arguments{ +\item{x}{a \link{character} vector where matches are sought, or an object which can be coerced by \code{\link[=as.character]{as.character()}} to a \link{character} vector.} + +\item{pattern}{a \link{character} vector containing regular expressions (or a \link{character} string for \code{fixed = TRUE}) to be matched in the given \link{character} vector. Coerced by \code{\link[=as.character]{as.character()}} to a \link{character} string if possible.} + +\item{ignore.case}{if \code{FALSE}, the pattern matching is \emph{case sensitive} and if \code{TRUE}, case is ignored during matching.} +} +\value{ +A \link{logical} vector +} +\description{ +Convenient wrapper around \code{\link[=grepl]{grepl()}} to match a pattern: \code{x \%like\% pattern}. It always returns a \code{\link{logical}} vector and is always case-insensitive (use \code{x \%like_case\% pattern} for case-sensitive matching). Also, \code{pattern} can be as long as \code{x} to compare items of each index in both vectors, or they both can have the same length to iterate over all cases. +} +\details{ +These \code{\link[=like]{like()}} and \verb{\%like\%}/\verb{\%unlike\%} functions: +\itemize{ +\item Are case-insensitive (use \verb{\%like_case\%}/\verb{\%unlike_case\%} for case-sensitive matching) +\item Support multiple patterns +\item Check if \code{pattern} is a valid regular expression and sets \code{fixed = TRUE} if not, to greatly improve speed (vectorised over \code{pattern}) +\item Always use compatibility with Perl unless \code{fixed = TRUE}, to greatly improve speed +} + +Using RStudio? The \verb{\%like\%}/\verb{\%unlike\%} functions can also be directly inserted in your code from the Addins menu and can have its own keyboard shortcut like \code{Shift+Ctrl+L} or \code{Shift+Cmd+L} (see menu \code{Tools} > \verb{Modify Keyboard Shortcuts...}). If you keep pressing your shortcut, the inserted text will be iterated over \verb{\%like\%} -> \verb{\%unlike\%} -> \verb{\%like_case\%} -> \verb{\%unlike_case\%}. +} +\examples{ +# data.table has a more limited version of \%like\%, so unload it: +try(detach("package:data.table", unload = TRUE), silent = TRUE) + +a <- "This is a test" +b <- "TEST" +a \%like\% b +b \%like\% a + +# also supports multiple patterns +a <- c("Test case", "Something different", "Yet another thing") +b <- c("case", "diff", "yet") +a \%like\% b +a \%unlike\% b + +a[1] \%like\% b +a \%like\% b[1] + +\donttest{ +# get isolates whose name start with 'Entero' (case-insensitive) +example_isolates[which(mo_name() \%like\% "^entero"), ] + +if (require("dplyr")) { + example_isolates \%>\% + filter(mo_name() \%like\% "^ent") +} +} +} +\seealso{ +\code{\link[=grepl]{grepl()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/mdro.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mdro.R +\name{mdro} +\alias{mdro} +\alias{MDR} +\alias{XDR} +\alias{PDR} +\alias{BRMO} +\alias{3MRGN} +\alias{4MRGN} +\alias{custom_mdro_guideline} +\alias{brmo} +\alias{mrgn} +\alias{mdr_tb} +\alias{mdr_cmi2012} +\alias{eucast_exceptional_phenotypes} +\title{Determine Multidrug-Resistant Organisms (MDRO)} +\source{ +See the supported guidelines above for the \link{list} of publications used for this function. +} +\usage{ +mdro( + x = NULL, + guideline = "CMI2012", + col_mo = NULL, + info = interactive(), + pct_required_classes = 0.5, + combine_SI = TRUE, + verbose = FALSE, + only_sir_columns = FALSE, + ... +) + +custom_mdro_guideline(..., as_factor = TRUE) + +brmo(x = NULL, only_sir_columns = FALSE, ...) + +mrgn(x = NULL, only_sir_columns = FALSE, ...) + +mdr_tb(x = NULL, only_sir_columns = FALSE, ...) + +mdr_cmi2012(x = NULL, only_sir_columns = FALSE, ...) + +eucast_exceptional_phenotypes(x = NULL, only_sir_columns = FALSE, ...) +} +\arguments{ +\item{x}{a \link{data.frame} with antibiotics columns, like \code{AMX} or \code{amox}. Can be left blank for automatic determination.} + +\item{guideline}{a specific guideline to follow, see sections \emph{Supported international / national guidelines} and \emph{Using Custom Guidelines} below. When left empty, the publication by Magiorakos \emph{et al.} (see below) will be followed.} + +\item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} + +\item{info}{a \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions} + +\item{pct_required_classes}{minimal required percentage of antimicrobial classes that must be available per isolate, rounded down. For example, with the default guideline, 17 antimicrobial classes must be available for \emph{S. aureus}. Setting this \code{pct_required_classes} argument to \code{0.5} (default) means that for every \emph{S. aureus} isolate at least 8 different classes must be available. Any lower number of available classes will return \code{NA} for that isolate.} + +\item{combine_SI}{a \link{logical} to indicate whether all values of S and I must be merged into one, so resistance is only considered when isolates are R, not I. As this is the default behaviour of the \code{\link[=mdro]{mdro()}} function, it follows the redefinition by EUCAST about the interpretation of I (increased exposure) in 2019, see section 'Interpretation of S, I and R' below. When using \code{combine_SI = FALSE}, resistance is considered when isolates are R or I.} + +\item{verbose}{a \link{logical} to turn Verbose mode on and off (default is off). In Verbose mode, the function does not return the MDRO results, but instead returns a data set in logbook form with extensive info about which isolates would be MDRO-positive, or why they are not.} + +\item{only_sir_columns}{a \link{logical} to indicate whether only antibiotic columns must be detected that were transformed to class \code{sir} (see \code{\link[=as.sir]{as.sir()}}) on beforehand (default is \code{FALSE})} + +\item{...}{in case of \code{\link[=custom_mdro_guideline]{custom_mdro_guideline()}}: a set of rules, see section \emph{Using Custom Guidelines} below. Otherwise: column name of an antibiotic, see section \emph{Antibiotics} below.} + +\item{as_factor}{a \link{logical} to indicate whether the returned value should be an ordered \link{factor} (\code{TRUE}, default), or otherwise a \link{character} vector} +} +\value{ +\itemize{ +\item CMI 2012 paper - function \code{\link[=mdr_cmi2012]{mdr_cmi2012()}} or \code{\link[=mdro]{mdro()}}:\cr +Ordered \link{factor} with levels \code{Negative} < \code{Multi-drug-resistant (MDR)} < \verb{Extensively drug-resistant (XDR)} < \code{Pandrug-resistant (PDR)} +\item TB guideline - function \code{\link[=mdr_tb]{mdr_tb()}} or \code{\link[=mdro]{mdro(..., guideline = "TB")}}:\cr +Ordered \link{factor} with levels \code{Negative} < \code{Mono-resistant} < \code{Poly-resistant} < \code{Multi-drug-resistant} < \verb{Extensively drug-resistant} +\item German guideline - function \code{\link[=mrgn]{mrgn()}} or \code{\link[=mdro]{mdro(..., guideline = "MRGN")}}:\cr +Ordered \link{factor} with levels \code{Negative} < \verb{3MRGN} < \verb{4MRGN} +\item Everything else, except for custom guidelines:\cr +Ordered \link{factor} with levels \code{Negative} < \verb{Positive, unconfirmed} < \code{Positive}. The value \code{"Positive, unconfirmed"} means that, according to the guideline, it is not entirely sure if the isolate is multi-drug resistant and this should be confirmed with additional (e.g. molecular) tests +} +} +\description{ +Determine which isolates are multidrug-resistant organisms (MDRO) according to international, national and custom guidelines. +} +\details{ +These functions are context-aware. This means that the \code{x} argument can be left blank if used inside a \link{data.frame} call, see \emph{Examples}. + +For the \code{pct_required_classes} argument, values above 1 will be divided by 100. This is to support both fractions (\code{0.75} or \code{3/4}) and percentages (\code{75}). + +\strong{Note:} Every test that involves the Enterobacteriaceae family, will internally be performed using its newly named \emph{order} Enterobacterales, since the Enterobacteriaceae family has been taxonomically reclassified by Adeolu \emph{et al.} in 2016. Before that, Enterobacteriaceae was the only family under the Enterobacteriales (with an i) order. All species under the old Enterobacteriaceae family are still under the new Enterobacterales (without an i) order, but divided into multiple families. The way tests are performed now by this \code{\link[=mdro]{mdro()}} function makes sure that results from before 2016 and after 2016 are identical. +} +\section{Supported International / National Guidelines}{ + + +Currently supported guidelines are (case-insensitive): +\itemize{ +\item \code{guideline = "CMI2012"} (default) + +Magiorakos AP, Srinivasan A \emph{et al.} "Multidrug-resistant, extensively drug-resistant and pandrug-resistant bacteria: an international expert proposal for interim standard definitions for acquired resistance." Clinical Microbiology and Infection (2012) (\doi{10.1111/j.1469-0691.2011.03570.x}) +\item \code{guideline = "EUCAST3.3"} (or simply \code{guideline = "EUCAST"}) + +The European international guideline - EUCAST Expert Rules Version 3.3 "Intrinsic Resistance and Unusual Phenotypes" (\href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2021/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.3_20211018.pdf}{link}) +\item \code{guideline = "EUCAST3.2"} + +The European international guideline - EUCAST Expert Rules Version 3.2 "Intrinsic Resistance and Unusual Phenotypes" (\href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/2020/Intrinsic_Resistance_and_Unusual_Phenotypes_Tables_v3.2_20200225.pdf}{link}) +\item \code{guideline = "EUCAST3.1"} + +The European international guideline - EUCAST Expert Rules Version 3.1 "Intrinsic Resistance and Exceptional Phenotypes Tables" (\href{https://www.eucast.org/fileadmin/src/media/PDFs/EUCAST_files/Expert_Rules/Expert_rules_intrinsic_exceptional_V3.1.pdf}{link}) +\item \code{guideline = "TB"} + +The international guideline for multi-drug resistant tuberculosis - World Health Organization "Companion handbook to the WHO guidelines for the programmatic management of drug-resistant tuberculosis" (\href{https://www.who.int/publications/i/item/9789241548809}{link}) +\item \code{guideline = "MRGN"} + +The German national guideline - Mueller et al. (2015) Antimicrobial Resistance and Infection Control 4:7; \doi{10.1186/s13756-015-0047-6} +\item \code{guideline = "BRMO"} + +The Dutch national guideline - Rijksinstituut voor Volksgezondheid en Milieu "WIP-richtlijn BRMO (Bijzonder Resistente Micro-Organismen) (ZKH)" (\href{https://www.rivm.nl/wip-richtlijn-brmo-bijzonder-resistente-micro-organismen-zkh}{link}) +} + +Please suggest your own (country-specific) guidelines by letting us know: \url{https://github.com/msberends/AMR/issues/new}. +} + +\section{Using Custom Guidelines}{ + + +Custom guidelines can be set with the \code{\link[=custom_mdro_guideline]{custom_mdro_guideline()}} function. This is of great importance if you have custom rules to determine MDROs in your hospital, e.g., rules that are dependent on ward, state of contact isolation or other variables in your data. + +If you are familiar with the \code{\link[dplyr:case_when]{case_when()}} function of the \code{dplyr} package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule is written \emph{before} the tilde (\code{~}) and the consequence of the rule is written \emph{after} the tilde: + +\if{html}{\out{
}}\preformatted{custom <- custom_mdro_guideline(CIP == "R" & age > 60 ~ "Elderly Type A", + ERY == "R" & age > 60 ~ "Elderly Type B") +}\if{html}{\out{
}} + +If a row/an isolate matches the first rule, the value after the first \code{~} (in this case \emph{'Elderly Type A'}) will be set as MDRO value. Otherwise, the second rule will be tried and so on. The number of rules is unlimited. + +You can print the rules set in the console for an overview. Colours will help reading it if your console supports colours. + +\if{html}{\out{
}}\preformatted{custom +#> A set of custom MDRO rules: +#> 1. CIP is "R" and age is higher than 60 -> Elderly Type A +#> 2. ERY is "R" and age is higher than 60 -> Elderly Type B +#> 3. Otherwise -> Negative +#> +#> Unmatched rows will return NA. +}\if{html}{\out{
}} + +The outcome of the function can be used for the \code{guideline} argument in the \code{\link[=mdro]{mdro()}} function: + +\if{html}{\out{
}}\preformatted{x <- mdro(example_isolates, + guideline = custom) +table(x) +#> Negative Elderly Type A Elderly Type B +#> 1070 198 732 +}\if{html}{\out{
}} + +Rules can also be combined with other custom rules by using \code{\link[=c]{c()}}: + +\if{html}{\out{
}}\preformatted{x <- mdro(example_isolates, + guideline = c(custom, + custom_mdro_guideline(ERY == "R" & age > 50 ~ "Elderly Type C"))) +table(x) +#> Negative Elderly Type A Elderly Type B Elderly Type C +#> 961 198 732 109 +}\if{html}{\out{
}} + +The rules set (the \code{custom} object in this case) could be exported to a shared file location using \code{\link[=saveRDS]{saveRDS()}} if you collaborate with multiple users. The custom rules set could then be imported using \code{\link[=readRDS]{readRDS()}}. +} + +\section{Antibiotics}{ + +To define antibiotics column names, leave as it is to determine it automatically with \code{\link[=guess_ab_col]{guess_ab_col()}} or input a text (case-insensitive), or use \code{NULL} to skip a column (e.g. \code{TIC = NULL} to skip ticarcillin). Manually defined but non-existing columns will be skipped with a warning. + +The following antibiotics are eligible for the functions \code{\link[=eucast_rules]{eucast_rules()}} and \code{\link[=mdro]{mdro()}}. These are shown below in the format 'name (\verb{antimicrobial ID}, \href{https://atcddd.fhi.no/atc/structure_and_principles/}{ATC code})', sorted alphabetically: + +Amikacin (\code{AMK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB06&showdescription=no}{J01GB06}), amoxicillin (\code{AMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA04&showdescription=no}{J01CA04}), amoxicillin/clavulanic acid (\code{AMC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR02&showdescription=no}{J01CR02}), ampicillin (\code{AMP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA01&showdescription=no}{J01CA01}), ampicillin/sulbactam (\code{SAM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR01&showdescription=no}{J01CR01}), apramycin (\code{APR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QA07AA92&showdescription=no}{QA07AA92}), arbekacin (\code{ARB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB12&showdescription=no}{J01GB12}), aspoxicillin (\code{APX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA19&showdescription=no}{J01CA19}), azidocillin (\code{AZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE04&showdescription=no}{J01CE04}), azithromycin (\code{AZM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA10&showdescription=no}{J01FA10}), azlocillin (\code{AZL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA09&showdescription=no}{J01CA09}), aztreonam (\code{ATM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DF01&showdescription=no}{J01DF01}), bacampicillin (\code{BAM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA06&showdescription=no}{J01CA06}), bekanamycin (\code{BEK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB13&showdescription=no}{J01GB13}), benzathine benzylpenicillin (\code{BNB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE08&showdescription=no}{J01CE08}), benzathine phenoxymethylpenicillin (\code{BNP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE10&showdescription=no}{J01CE10}), benzylpenicillin (\code{PEN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE01&showdescription=no}{J01CE01}), besifloxacin (\code{BES}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=S01AE08&showdescription=no}{S01AE08}), biapenem (\code{BIA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH05&showdescription=no}{J01DH05}), carbenicillin (\code{CRB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA03&showdescription=no}{J01CA03}), carindacillin (\code{CRN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA05&showdescription=no}{J01CA05}), cefacetrile (\code{CAC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB10&showdescription=no}{J01DB10}), cefaclor (\code{CEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC04&showdescription=no}{J01DC04}), cefadroxil (\code{CFR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB05&showdescription=no}{J01DB05}), cefalexin (\code{LEX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB01&showdescription=no}{J01DB01}), cefaloridine (\code{RID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB02&showdescription=no}{J01DB02}), cefalotin (\code{CEP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB03&showdescription=no}{J01DB03}), cefamandole (\code{MAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC03&showdescription=no}{J01DC03}), cefapirin (\code{HAP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB08&showdescription=no}{J01DB08}), cefatrizine (\code{CTZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB07&showdescription=no}{J01DB07}), cefazedone (\code{CZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB06&showdescription=no}{J01DB06}), cefazolin (\code{CZO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB04&showdescription=no}{J01DB04}), cefcapene (\code{CCP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD17&showdescription=no}{J01DD17}), cefdinir (\code{CDR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD15&showdescription=no}{J01DD15}), cefditoren (\code{DIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD16&showdescription=no}{J01DD16}), cefepime (\code{FEP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE01&showdescription=no}{J01DE01}), cefetamet (\code{CAT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD10&showdescription=no}{J01DD10}), cefiderocol (\code{FDC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI04&showdescription=no}{J01DI04}), cefixime (\code{CFM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD08&showdescription=no}{J01DD08}), cefmenoxime (\code{CMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD05&showdescription=no}{J01DD05}), cefmetazole (\code{CMZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC09&showdescription=no}{J01DC09}), cefodizime (\code{DIZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD09&showdescription=no}{J01DD09}), cefonicid (\code{CID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC06&showdescription=no}{J01DC06}), cefoperazone (\code{CFP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD12&showdescription=no}{J01DD12}), cefoperazone/sulbactam (\code{CSL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD62&showdescription=no}{J01DD62}), ceforanide (\code{CND}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC11&showdescription=no}{J01DC11}), cefotaxime (\code{CTX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD01&showdescription=no}{J01DD01}), cefotaxime/clavulanic acid (\code{CTC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD51&showdescription=no}{J01DD51}), cefotetan (\code{CTT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC05&showdescription=no}{J01DC05}), cefotiam (\code{CTF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC07&showdescription=no}{J01DC07}), cefovecin (\code{FOV}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01DD91&showdescription=no}{QJ01DD91}), cefoxitin (\code{FOX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC01&showdescription=no}{J01DC01}), cefozopran (\code{ZOP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE03&showdescription=no}{J01DE03}), cefpiramide (\code{CPM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD11&showdescription=no}{J01DD11}), cefpirome (\code{CPO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DE02&showdescription=no}{J01DE02}), cefpodoxime (\code{CPD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD13&showdescription=no}{J01DD13}), cefprozil (\code{CPR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC10&showdescription=no}{J01DC10}), cefquinome (\code{CEQ}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QG51AA07&showdescription=no}{QG51AA07}), cefroxadine (\code{CRD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB11&showdescription=no}{J01DB11}), cefsulodin (\code{CFS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD03&showdescription=no}{J01DD03}), ceftaroline (\code{CPT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI02&showdescription=no}{J01DI02}), ceftazidime (\code{CAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD02&showdescription=no}{J01DD02}), ceftazidime/clavulanic acid (\code{CCV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD52&showdescription=no}{J01DD52}), cefteram (\code{CEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD18&showdescription=no}{J01DD18}), ceftezole (\code{CTL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB12&showdescription=no}{J01DB12}), ceftibuten (\code{CTB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD14&showdescription=no}{J01DD14}), ceftiofur (\code{TIO}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01DD90&showdescription=no}{QJ01DD90}), ceftizoxime (\code{CZX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD07&showdescription=no}{J01DD07}), ceftobiprole medocaril (\code{CFM1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI01&showdescription=no}{J01DI01}), ceftolozane/tazobactam (\code{CZT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DI54&showdescription=no}{J01DI54}), ceftriaxone (\code{CRO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD04&showdescription=no}{J01DD04}), ceftriaxone/beta-lactamase inhibitor (\code{CEB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD63&showdescription=no}{J01DD63}), cefuroxime (\code{CXM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC02&showdescription=no}{J01DC02}), cephradine (\code{CED}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DB09&showdescription=no}{J01DB09}), chloramphenicol (\code{CHL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01BA01&showdescription=no}{J01BA01}), ciprofloxacin (\code{CIP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA02&showdescription=no}{J01MA02}), clarithromycin (\code{CLR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA09&showdescription=no}{J01FA09}), clindamycin (\code{CLI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FF01&showdescription=no}{J01FF01}), clometocillin (\code{CLM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE07&showdescription=no}{J01CE07}), cloxacillin (\code{CLO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF02&showdescription=no}{J01CF02}), colistin (\code{COL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XB01&showdescription=no}{J01XB01}), cycloserine (\code{CYC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J04AB01&showdescription=no}{J04AB01}), dalbavancin (\code{DAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA04&showdescription=no}{J01XA04}), danofloxacin (\code{DAN}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA92&showdescription=no}{QJ01MA92}), daptomycin (\code{DAP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX09&showdescription=no}{J01XX09}), delafloxacin (\code{DFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA23&showdescription=no}{J01MA23}), dibekacin (\code{DKB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB09&showdescription=no}{J01GB09}), dicloxacillin (\code{DIC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF01&showdescription=no}{J01CF01}), difloxacin (\code{DIF}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA94&showdescription=no}{QJ01MA94}), dirithromycin (\code{DIR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA13&showdescription=no}{J01FA13}), doripenem (\code{DOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH04&showdescription=no}{J01DH04}), doxycycline (\code{DOX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA02&showdescription=no}{J01AA02}), enoxacin (\code{ENX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA04&showdescription=no}{J01MA04}), enrofloxacin (\code{ENR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA90&showdescription=no}{QJ01MA90}), epicillin (\code{EPC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA07&showdescription=no}{J01CA07}), ertapenem (\code{ETP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH03&showdescription=no}{J01DH03}), erythromycin (\code{ERY}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA01&showdescription=no}{J01FA01}), fleroxacin (\code{FLE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA08&showdescription=no}{J01MA08}), flucloxacillin (\code{FLC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF05&showdescription=no}{J01CF05}), flurithromycin (\code{FLR1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA14&showdescription=no}{J01FA14}), fosfomycin (\code{FOS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX01&showdescription=no}{J01XX01}), framycetin (\code{FRM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=D09AA01&showdescription=no}{D09AA01}), fusidic acid (\code{FUS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XC01&showdescription=no}{J01XC01}), gamithromycin (\code{GAM}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA95&showdescription=no}{QJ01FA95}), garenoxacin (\code{GRN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA19&showdescription=no}{J01MA19}), gatifloxacin (\code{GAT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA16&showdescription=no}{J01MA16}), gemifloxacin (\code{GEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA15&showdescription=no}{J01MA15}), gentamicin (\code{GEN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB03&showdescription=no}{J01GB03}), grepafloxacin (\code{GRX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA11&showdescription=no}{J01MA11}), hetacillin (\code{HET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA18&showdescription=no}{J01CA18}), imipenem (\code{IPM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH51&showdescription=no}{J01DH51}), imipenem/relebactam (\code{IMR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH56&showdescription=no}{J01DH56}), isepamicin (\code{ISE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB11&showdescription=no}{J01GB11}), josamycin (\code{JOS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA07&showdescription=no}{J01FA07}), kanamycin (\code{KAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB04&showdescription=no}{J01GB04}), kitasamycin (\code{KIT}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA93&showdescription=no}{QJ01FA93}), lascufloxacin (\code{LSC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA25&showdescription=no}{J01MA25}), latamoxef (\code{LTM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DD06&showdescription=no}{J01DD06}), levofloxacin (\code{LVX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA12&showdescription=no}{J01MA12}), levonadifloxacin (\code{LND}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA24&showdescription=no}{J01MA24}), lincomycin (\code{LIN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FF02&showdescription=no}{J01FF02}), linezolid (\code{LNZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX08&showdescription=no}{J01XX08}), lomefloxacin (\code{LOM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA07&showdescription=no}{J01MA07}), loracarbef (\code{LOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DC08&showdescription=no}{J01DC08}), marbofloxacin (\code{MAR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA93&showdescription=no}{QJ01MA93}), mecillinam (\code{MEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA11&showdescription=no}{J01CA11}), meropenem (\code{MEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH02&showdescription=no}{J01DH02}), meropenem/vaborbactam (\code{MEV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH52&showdescription=no}{J01DH52}), metampicillin (\code{MTM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA14&showdescription=no}{J01CA14}), meticillin (\code{MET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF03&showdescription=no}{J01CF03}), mezlocillin (\code{MEZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA10&showdescription=no}{J01CA10}), micronomicin (\code{MCR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=S01AA22&showdescription=no}{S01AA22}), midecamycin (\code{MID}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA03&showdescription=no}{J01FA03}), minocycline (\code{MNO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA08&showdescription=no}{J01AA08}), miocamycin (\code{MCM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA11&showdescription=no}{J01FA11}), moxifloxacin (\code{MFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA14&showdescription=no}{J01MA14}), nadifloxacin (\code{NAD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=D10AF05&showdescription=no}{D10AF05}), nafcillin (\code{NAF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF06&showdescription=no}{J01CF06}), nalidixic acid (\code{NAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MB02&showdescription=no}{J01MB02}), neomycin (\code{NEO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB05&showdescription=no}{J01GB05}), netilmicin (\code{NET}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB07&showdescription=no}{J01GB07}), nitrofurantoin (\code{NIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XE01&showdescription=no}{J01XE01}), norfloxacin (\code{NOR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA06&showdescription=no}{J01MA06}), novobiocin (\code{NOV}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01XX95&showdescription=no}{QJ01XX95}), ofloxacin (\code{OFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA01&showdescription=no}{J01MA01}), oleandomycin (\code{OLE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA05&showdescription=no}{J01FA05}), orbifloxacin (\code{ORB}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA95&showdescription=no}{QJ01MA95}), oritavancin (\code{ORI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA05&showdescription=no}{J01XA05}), oxacillin (\code{OXA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CF04&showdescription=no}{J01CF04}), panipenem (\code{PAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH55&showdescription=no}{J01DH55}), pazufloxacin (\code{PAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA18&showdescription=no}{J01MA18}), pefloxacin (\code{PEF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA03&showdescription=no}{J01MA03}), penamecillin (\code{PNM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE06&showdescription=no}{J01CE06}), pheneticillin (\code{PHE}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE05&showdescription=no}{J01CE05}), phenoxymethylpenicillin (\code{PHN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE02&showdescription=no}{J01CE02}), piperacillin (\code{PIP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA12&showdescription=no}{J01CA12}), piperacillin/tazobactam (\code{TZP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR05&showdescription=no}{J01CR05}), pirlimycin (\code{PRL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ51FF90&showdescription=no}{QJ51FF90}), pivampicillin (\code{PVM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA02&showdescription=no}{J01CA02}), pivmecillinam (\code{PME}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA08&showdescription=no}{J01CA08}), plazomicin (\code{PLZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB14&showdescription=no}{J01GB14}), polymyxin B (\code{PLB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XB02&showdescription=no}{J01XB02}), pradofloxacin (\code{PRA}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA97&showdescription=no}{QJ01MA97}), pristinamycin (\code{PRI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FG01&showdescription=no}{J01FG01}), procaine benzylpenicillin (\code{PRB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE09&showdescription=no}{J01CE09}), propicillin (\code{PRP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CE03&showdescription=no}{J01CE03}), prulifloxacin (\code{PRU}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA17&showdescription=no}{J01MA17}), quinupristin/dalfopristin (\code{QDA}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FG02&showdescription=no}{QJ01FG02}), ribostamycin (\code{RST}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB10&showdescription=no}{J01GB10}), rifampicin (\code{RIF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J04AB02&showdescription=no}{J04AB02}), rokitamycin (\code{ROK}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA12&showdescription=no}{J01FA12}), roxithromycin (\code{RXT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA06&showdescription=no}{J01FA06}), rufloxacin (\code{RFL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA10&showdescription=no}{J01MA10}), sarafloxacin (\code{SAR}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01MA98&showdescription=no}{QJ01MA98}), sisomicin (\code{SIS}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB08&showdescription=no}{J01GB08}), sitafloxacin (\code{SIT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA21&showdescription=no}{J01MA21}), solithromycin (\code{SOL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA16&showdescription=no}{J01FA16}), sparfloxacin (\code{SPX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA09&showdescription=no}{J01MA09}), spiramycin (\code{SPI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA02&showdescription=no}{J01FA02}), streptoduocin (\code{STR}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GA02&showdescription=no}{J01GA02}), streptomycin (\code{STR1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GA01&showdescription=no}{J01GA01}), sulbactam (\code{SUL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CG01&showdescription=no}{J01CG01}), sulbenicillin (\code{SBC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA16&showdescription=no}{J01CA16}), sulfadiazine (\code{SDI}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC02&showdescription=no}{J01EC02}), sulfadiazine/trimethoprim (\code{SLT1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE02&showdescription=no}{J01EE02}), sulfadimethoxine (\code{SUD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED01&showdescription=no}{J01ED01}), sulfadimidine (\code{SDM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB03&showdescription=no}{J01EB03}), sulfadimidine/trimethoprim (\code{SLT2}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE05&showdescription=no}{J01EE05}), sulfafurazole (\code{SLF}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB05&showdescription=no}{J01EB05}), sulfaisodimidine (\code{SLF1}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB01&showdescription=no}{J01EB01}), sulfalene (\code{SLF2}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED02&showdescription=no}{J01ED02}), sulfamazone (\code{SZO}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED09&showdescription=no}{J01ED09}), sulfamerazine (\code{SLF3}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED07&showdescription=no}{J01ED07}), sulfamerazine/trimethoprim (\code{SLT3}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE07&showdescription=no}{J01EE07}), sulfamethizole (\code{SLF4}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB02&showdescription=no}{J01EB02}), sulfamethoxazole (\code{SMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC01&showdescription=no}{J01EC01}), sulfamethoxypyridazine (\code{SLF5}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED05&showdescription=no}{J01ED05}), sulfametomidine (\code{SLF6}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED03&showdescription=no}{J01ED03}), sulfametoxydiazine (\code{SLF7}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED04&showdescription=no}{J01ED04}), sulfametrole/trimethoprim (\code{SLT4}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE03&showdescription=no}{J01EE03}), sulfamoxole (\code{SLF8}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EC03&showdescription=no}{J01EC03}), sulfamoxole/trimethoprim (\code{SLT5}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE04&showdescription=no}{J01EE04}), sulfanilamide (\code{SLF9}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB06&showdescription=no}{J01EB06}), sulfaperin (\code{SLF10}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED06&showdescription=no}{J01ED06}), sulfaphenazole (\code{SLF11}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01ED08&showdescription=no}{J01ED08}), sulfapyridine (\code{SLF12}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB04&showdescription=no}{J01EB04}), sulfathiazole (\code{SUT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB07&showdescription=no}{J01EB07}), sulfathiourea (\code{SLF13}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EB08&showdescription=no}{J01EB08}), sultamicillin (\code{SLT6}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR04&showdescription=no}{J01CR04}), talampicillin (\code{TAL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA15&showdescription=no}{J01CA15}), tazobactam (\code{TAZ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CG02&showdescription=no}{J01CG02}), tebipenem (\code{TBP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01DH06&showdescription=no}{J01DH06}), tedizolid (\code{TZD}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XX11&showdescription=no}{J01XX11}), teicoplanin (\code{TEC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA02&showdescription=no}{J01XA02}), telavancin (\code{TLV}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA03&showdescription=no}{J01XA03}), telithromycin (\code{TLT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA15&showdescription=no}{J01FA15}), temafloxacin (\code{TMX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA05&showdescription=no}{J01MA05}), temocillin (\code{TEM}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA17&showdescription=no}{J01CA17}), tetracycline (\code{TCY}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA07&showdescription=no}{J01AA07}), ticarcillin (\code{TIC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CA13&showdescription=no}{J01CA13}), ticarcillin/clavulanic acid (\code{TCC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01CR03&showdescription=no}{J01CR03}), tigecycline (\code{TGC}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01AA12&showdescription=no}{J01AA12}), tilbroquinol (\code{TBQ}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=P01AA05&showdescription=no}{P01AA05}), tildipirosin (\code{TIP}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA96&showdescription=no}{QJ01FA96}), tilmicosin (\code{TIL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA91&showdescription=no}{QJ01FA91}), tobramycin (\code{TOB}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01GB01&showdescription=no}{J01GB01}), tosufloxacin (\code{TFX}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA22&showdescription=no}{J01MA22}), trimethoprim (\code{TMP}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EA01&showdescription=no}{J01EA01}), trimethoprim/sulfamethoxazole (\code{SXT}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01EE01&showdescription=no}{J01EE01}), troleandomycin (\code{TRL}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01FA08&showdescription=no}{J01FA08}), trovafloxacin (\code{TVA}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01MA13&showdescription=no}{J01MA13}), tulathromycin (\code{TUL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA94&showdescription=no}{QJ01FA94}), tylosin (\code{TYL}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA90&showdescription=no}{QJ01FA90}), tylvalosin (\code{TYL1}, \href{https://atcddd.fhi.no/atcvet/atcvet_index/?code=QJ01FA92&showdescription=no}{QJ01FA92}), vancomycin (\code{VAN}, \href{https://atcddd.fhi.no/atc_ddd_index//?code=J01XA01&showdescription=no}{J01XA01}) +} + +\section{Interpretation of SIR}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (\url{https://www.eucast.org/newsiandr}): +\itemize{ +\item \strong{S - Susceptible, standard dosing regimen}\cr +A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +\item \strong{I - Susceptible, increased exposure} \emph{\cr +A microorganism is categorised as "Susceptible, Increased exposure}" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +\item \strong{R = Resistant}\cr +A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +\itemize{ +\item \emph{Exposure} is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +} +} + +This AMR package honours this insight. Use \code{\link[=susceptibility]{susceptibility()}} (equal to \code{\link[=proportion_SI]{proportion_SI()}}) to determine antimicrobial susceptibility and \code{\link[=count_susceptible]{count_susceptible()}} (equal to \code{\link[=count_SI]{count_SI()}}) to count susceptible isolates. +} + +\examples{ +out <- mdro(example_isolates, guideline = "EUCAST") +str(out) +table(out) + +out <- mdro(example_isolates, + guideline = custom_mdro_guideline( + AMX == "R" ~ "Custom MDRO 1", + VAN == "R" ~ "Custom MDRO 2" + ) +) +table(out) + +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + mdro() \%>\% + table() + + # no need to define `x` when used inside dplyr verbs: + example_isolates \%>\% + mutate(MDRO = mdro()) \%>\% + pull(MDRO) \%>\% + table() +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/mean_amr_distance.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mean_amr_distance.R +\name{mean_amr_distance} +\alias{mean_amr_distance} +\alias{mean_amr_distance.sir} +\alias{mean_amr_distance.data.frame} +\alias{amr_distance_from_row} +\title{Calculate the Mean AMR Distance} +\usage{ +mean_amr_distance(x, ...) + +\method{mean_amr_distance}{sir}(x, ..., combine_SI = TRUE) + +\method{mean_amr_distance}{data.frame}(x, ..., combine_SI = TRUE) + +amr_distance_from_row(amr_distance, row) +} +\arguments{ +\item{x}{a vector of class \link[=as.sir]{sir}, \link[=as.mic]{mic} or \link[=as.disk]{disk}, or a \link{data.frame} containing columns of any of these classes} + +\item{...}{variables to select (supports \link[tidyselect:language]{tidyselect language} such as \code{column1:column4} and \code{where(is.mic)}, and can thus also be \link[=ab_selector]{antibiotic selectors}} + +\item{combine_SI}{a \link{logical} to indicate whether all values of S, SDD, and I must be merged into one, so the input only consists of S+I vs. R (susceptible vs. resistant) - the default is \code{TRUE}} + +\item{amr_distance}{the outcome of \code{\link[=mean_amr_distance]{mean_amr_distance()}}} + +\item{row}{an index, such as a row number} +} +\description{ +Calculates a normalised mean for antimicrobial resistance between multiple observations, to help to identify similar isolates without comparing antibiograms by hand. +} +\details{ +The mean AMR distance is effectively \href{https://en.wikipedia.org/wiki/Standard_score}{the Z-score}; a normalised numeric value to compare AMR test results which can help to identify similar isolates, without comparing antibiograms by hand. + +MIC values (see \code{\link[=as.mic]{as.mic()}}) are transformed with \code{\link[=log2]{log2()}} first; their distance is thus calculated as \code{(log2(x) - mean(log2(x))) / sd(log2(x))}. + +SIR values (see \code{\link[=as.sir]{as.sir()}}) are transformed using \code{"S"} = 1, \code{"I"} = 2, and \code{"R"} = 3. If \code{combine_SI} is \code{TRUE} (default), the \code{"I"} will be considered to be 1. + +For data sets, the mean AMR distance will be calculated per column, after which the mean per row will be returned, see \emph{Examples}. + +Use \code{\link[=amr_distance_from_row]{amr_distance_from_row()}} to subtract distances from the distance of one row, see \emph{Examples}. +} +\section{Interpretation}{ + +Isolates with distances less than 0.01 difference from each other should be considered similar. Differences lower than 0.025 should be considered suspicious. +} + +\examples{ +sir <- random_sir(10) +sir +mean_amr_distance(sir) + +mic <- random_mic(10) +mic +mean_amr_distance(mic) +# equal to the Z-score of their log2: +(log2(mic) - mean(log2(mic))) / sd(log2(mic)) + +disk <- random_disk(10) +disk +mean_amr_distance(disk) + +y <- data.frame( + id = LETTERS[1:10], + amox = random_sir(10, ab = "amox", mo = "Escherichia coli"), + cipr = random_disk(10, ab = "cipr", mo = "Escherichia coli"), + gent = random_mic(10, ab = "gent", mo = "Escherichia coli"), + tobr = random_mic(10, ab = "tobr", mo = "Escherichia coli") +) +y +mean_amr_distance(y) +y$amr_distance <- mean_amr_distance(y, where(is.mic)) +y[order(y$amr_distance), ] + +if (require("dplyr")) { + y \%>\% + mutate( + amr_distance = mean_amr_distance(y), + check_id_C = amr_distance_from_row(amr_distance, id == "C") + ) \%>\% + arrange(check_id_C) +} +if (require("dplyr")) { + # support for groups + example_isolates \%>\% + filter(mo_genus() == "Enterococcus" & mo_species() != "") \%>\% + select(mo, TCY, carbapenems()) \%>\% + group_by(mo) \%>\% + mutate(dist = mean_amr_distance(.)) \%>\% + arrange(mo, dist) +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/microorganisms.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{microorganisms} +\alias{microorganisms} +\title{Data Set with 78 678 Taxonomic Records of Microorganisms} +\format{ +A \link[tibble:tibble]{tibble} with 78 678 observations and 26 variables: +\itemize{ +\item \code{mo}\cr ID of microorganism as used by this package. \emph{\strong{This is a unique identifier.}} +\item \code{fullname}\cr Full name, like \code{"Escherichia coli"}. For the taxonomic ranks genus, species and subspecies, this is the 'pasted' text of genus, species, and subspecies. For all taxonomic ranks higher than genus, this is the name of the taxon. \emph{\strong{This is a unique identifier.}} +\item \code{status} \cr Status of the taxon, either "accepted", "not validly published", "synonym", or "unknown" +\item \code{kingdom}, \code{phylum}, \code{class}, \code{order}, \code{family}, \code{genus}, \code{species}, \code{subspecies}\cr Taxonomic rank of the microorganism. Note that for fungi, \emph{phylum} is equal to their taxonomic \emph{division}. Also, for fungi, \emph{subkingdom} and \emph{subdivision} were left out since they do not occur in the bacterial taxonomy. +\item \code{rank}\cr Text of the taxonomic rank of the microorganism, such as \code{"species"} or \code{"genus"} +\item \code{ref}\cr Author(s) and year of related scientific publication. This contains only the \emph{first surname} and year of the \emph{latest} authors, e.g. "Wallis \emph{et al.} 2006 \emph{emend.} Smith and Jones 2018" becomes "Smith \emph{et al.}, 2018". This field is directly retrieved from the source specified in the column \code{source}. Moreover, accents were removed to comply with CRAN that only allows ASCII characters. +\item \code{oxygen_tolerance} \cr Oxygen tolerance, either "aerobe", "anaerobe", "anaerobe/microaerophile", "facultative anaerobe", "likely facultative anaerobe", or "microaerophile". These data were retrieved from BacDive (see \emph{Source}). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently 68.3\% of all ~39 000 bacteria in the data set contain an oxygen tolerance. +\item \code{source}\cr Either "GBIF", "LPSN", "MycoBank", or "manually added" (see \emph{Source}) +\item \code{lpsn}\cr Identifier ('Record number') of List of Prokaryotic names with Standing in Nomenclature (LPSN). This will be the first/highest LPSN identifier to keep one identifier per row. For example, \emph{Acetobacter ascendens} has LPSN Record number 7864 and 11011. Only the first is available in the \code{microorganisms} data set. \emph{\strong{This is a unique identifier}}, though available for only ~33 000 records. +\item \code{lpsn_parent}\cr LPSN identifier of the parent taxon +\item \code{lpsn_renamed_to}\cr LPSN identifier of the currently valid taxon +\item \code{mycobank}\cr Identifier ('MycoBank #') of MycoBank. \emph{\strong{This is a unique identifier}}, though available for only ~18 000 records. +\item \code{mycobank_parent}\cr MycoBank identifier of the parent taxon +\item \code{mycobank_renamed_to}\cr MycoBank identifier of the currently valid taxon +\item \code{gbif}\cr Identifier ('taxonID') of Global Biodiversity Information Facility (GBIF). \emph{\strong{This is a unique identifier}}, though available for only ~49 000 records. +\item \code{gbif_parent}\cr GBIF identifier of the parent taxon +\item \code{gbif_renamed_to}\cr GBIF identifier of the currently valid taxon +\item \code{prevalence}\cr Prevalence of the microorganism based on Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}), see \code{\link[=mo_matching_score]{mo_matching_score()}} for the full explanation +\item \code{snomed}\cr Systematized Nomenclature of Medicine (SNOMED) code of the microorganism, version of July 16th, 2024 (see \emph{Source}). Use \code{\link[=mo_snomed]{mo_snomed()}} to retrieve it quickly, see \code{\link[=mo_property]{mo_property()}}. +} +} +\source{ +Taxonomic entries were imported in this order of importance: +\enumerate{ +\item List of Prokaryotic names with Standing in Nomenclature (LPSN):\cr\cr +Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on June 24th, 2024. +\item MycoBank:\cr\cr +Vincent, R \emph{et al} (2013). \strong{MycoBank gearing up for new horizons.} IMA Fungus, 4(2), 371-9; \doi{10.5598/imafungus.2013.04.02.16}. Accessed from \url{https://www.mycobank.org} on June 24th, 2024. +\item Global Biodiversity Information Facility (GBIF):\cr\cr +GBIF Secretariat (2023). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on June 24th, 2024. +} + +Furthermore, these sources were used for additional details: +\itemize{ +\item BacDive:\cr\cr +Reimer, LC \emph{et al.} (2022). \strong{\emph{BacDive} in 2022: the knowledge base for standardized bacterial and archaeal data.} Nucleic Acids Res., 50(D1):D741-D74; \doi{10.1093/nar/gkab961}. Accessed from \url{https://bacdive.dsmz.de} on July 16th, 2024. +\item Systematized Nomenclature of Medicine - Clinical Terms (SNOMED-CT):\cr\cr +Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microorganism', OID 2.16.840.1.114222.4.11.1009 (v12). Accessed from \url{https://phinvads.cdc.gov} on July 16th, 2024. +\item Grimont \emph{et al.} (2007). Antigenic Formulae of the Salmonella Serovars, 9th Edition. WHO Collaborating Centre for Reference and Research on \emph{Salmonella} (WHOCC-SALM). +\item Bartlett \emph{et al.} (2022). \strong{A comprehensive list of bacterial pathogens infecting humans} \emph{Microbiology} 168:001269; \doi{10.1099/mic.0.001269} +} +} +\usage{ +microorganisms +} +\description{ +A data set containing the full microbial taxonomy (\strong{last updated: June 24th, 2024}) of six kingdoms. This data set is the backbone of this \code{AMR} package. MO codes can be looked up using \code{\link[=as.mo]{as.mo()}} and microorganism properties can be looked up using any of the \code{\link[=mo_property]{mo_*}} functions. + +This data set is carefully crafted, yet made 100\% reproducible from public and authoritative taxonomic sources (using \href{https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R}{this script}), namely: \emph{List of Prokaryotic names with Standing in Nomenclature (LPSN)} for bacteria, \emph{MycoBank} for fungi, and \emph{Global Biodiversity Information Facility (GBIF)} for all others taxons. +} +\details{ +Please note that entries are only based on LPSN, MycoBank, and GBIF (see below). Since these sources incorporate entries based on (recent) publications in the International Journal of Systematic and Evolutionary Microbiology (IJSEM), it can happen that the year of publication is sometimes later than one might expect. + +For example, \emph{Staphylococcus pettenkoferi} was described for the first time in Diagnostic Microbiology and Infectious Disease in 2002 (\doi{10.1016/s0732-8893(02)00399-1}), but it was not until 2007 that a publication in IJSEM followed (\doi{10.1099/ijs.0.64381-0}). Consequently, the \code{AMR} package returns 2007 for \code{mo_year("S. pettenkoferi")}. +} +\section{Included Taxa}{ + +Included taxonomic data from \href{https://lpsn.dsmz.de}{LPSN}, \href{https://www.mycobank.org}{MycoBank}, and \href{https://www.gbif.org}{GBIF} are: +\itemize{ +\item All ~39 000 (sub)species from the kingdoms of Archaea and Bacteria +\item ~28 000 species from the kingdom of Fungi. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package. Only relevant fungi are covered (such as all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}). +\item ~8 100 (sub)species from the kingdom of Protozoa +\item ~1 600 (sub)species from 39 other relevant genera from the kingdom of Animalia (such as \emph{Strongyloides} and \emph{Taenia}) +\item All ~22 000 previously accepted names of all included (sub)species (these were taxonomically renamed) +\item The complete taxonomic tree of all included (sub)species: from kingdom to subspecies +\item The identifier of the parent taxons +\item The year and first author of the related scientific publication +} +\subsection{Manual additions}{ + +For convenience, some entries were added manually: +\itemize{ +\item ~1 500 entries of \emph{Salmonella}, such as the city-like serovars and groups A to H +\item 36 species groups (such as the beta-haemolytic \emph{Streptococcus} groups A to K, coagulase-negative \emph{Staphylococcus} (CoNS), \emph{Mycobacterium tuberculosis} complex, etc.), of which the group compositions are stored in the \link{microorganisms.groups} data set +\item 1 entry of \emph{Blastocystis} (\emph{B. hominis}), although it officially does not exist (Noel \emph{et al.} 2005, PMID 15634993) +\item 1 entry of \emph{Moraxella} (\emph{M. catarrhalis}), which was formally named \emph{Branhamella catarrhalis} (Catlin, 1970) though this change was never accepted within the field of clinical microbiology +\item 8 other 'undefined' entries (unknown, unknown Gram-negatives, unknown Gram-positives, unknown yeast, unknown fungus, and unknown anaerobic Gram-pos/Gram-neg bacteria) +} + +The syntax used to transform the original data to a cleansed \R format, can be \href{https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R}{found here}. +} + +\subsection{Direct download}{ + +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +} + +\examples{ +microorganisms +} +\seealso{ +\code{\link[=as.mo]{as.mo()}}, \code{\link[=mo_property]{mo_property()}}, \link{microorganisms.groups}, \link{microorganisms.codes}, \link{intrinsic_resistant} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/microorganisms.codes.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{microorganisms.codes} +\alias{microorganisms.codes} +\title{Data Set with 4 971 Common Microorganism Codes} +\format{ +A \link[tibble:tibble]{tibble} with 4 971 observations and 2 variables: +\itemize{ +\item \code{code}\cr Commonly used code of a microorganism. \emph{\strong{This is a unique identifier.}} +\item \code{mo}\cr ID of the microorganism in the \link{microorganisms} data set +} +} +\usage{ +microorganisms.codes +} +\description{ +A data set containing commonly used codes for microorganisms, from laboratory systems and \href{https://whonet.org}{WHONET}. Define your own with \code{\link[=set_mo_source]{set_mo_source()}}. They will all be searched when using \code{\link[=as.mo]{as.mo()}} and consequently all the \code{\link[=mo_property]{mo_*}} functions. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +microorganisms.codes + +# 'ECO' or 'eco' is the WHONET code for E. coli: +microorganisms.codes[microorganisms.codes$code == "ECO", ] + +# and therefore, 'eco' will be understood as E. coli in this package: +mo_info("eco") + +# works for all AMR functions: +mo_is_intrinsic_resistant("eco", ab = "vancomycin") +} +\seealso{ +\code{\link[=as.mo]{as.mo()}} \link{microorganisms} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/microorganisms.groups.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/data.R +\docType{data} +\name{microorganisms.groups} +\alias{microorganisms.groups} +\title{Data Set with 521 Microorganisms In Species Groups} +\format{ +A \link[tibble:tibble]{tibble} with 521 observations and 4 variables: +\itemize{ +\item \code{mo_group}\cr ID of the species group / microbiological complex +\item \code{mo}\cr ID of the microorganism belonging in the species group / microbiological complex +\item \code{mo_group_name}\cr Name of the species group / microbiological complex, as retrieved with \code{\link[=mo_name]{mo_name()}} +\item \code{mo_name}\cr Name of the microorganism belonging in the species group / microbiological complex, as retrieved with \code{\link[=mo_name]{mo_name()}} +} +} +\usage{ +microorganisms.groups +} +\description{ +A data set containing species groups and microbiological complexes, which are used in \link[=clinical_breakpoints]{the clinical breakpoints table}. +} +\details{ +Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} +\examples{ +microorganisms.groups + +# these are all species in the Bacteroides fragilis group, as per WHONET: +microorganisms.groups[microorganisms.groups$mo_group == "B_BCTRD_FRGL-C", ] +} +\seealso{ +\code{\link[=as.mo]{as.mo()}} \link{microorganisms} +} +\keyword{datasets} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/mo_matching_score.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mo_matching_score.R +\name{mo_matching_score} +\alias{mo_matching_score} +\title{Calculate the Matching Score for Microorganisms} +\usage{ +mo_matching_score(x, n) +} +\arguments{ +\item{x}{Any user input value(s)} + +\item{n}{A full taxonomic name, that exists in \code{\link[=microorganisms]{microorganisms$fullname}}} +} +\description{ +This algorithm is used by \code{\link[=as.mo]{as.mo()}} and all the \code{\link[=mo_property]{mo_*}} functions to determine the most probable match of taxonomic records based on user input. +} +\note{ +This algorithm was originally developed in 2018 and subsequently described in: Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03}. + +Later, the work of Bartlett A \emph{et al.} about bacterial pathogens infecting humans (2022, \doi{10.1099/mic.0.001269}) was incorporated, and optimalisations to the algorithm were made. +} +\section{Matching Score for Microorganisms}{ + +With ambiguous user input in \code{\link[=as.mo]{as.mo()}} and all the \code{\link[=mo_property]{mo_*}} functions, the returned results are chosen based on their matching score using \code{\link[=mo_matching_score]{mo_matching_score()}}. This matching score \eqn{m}, is calculated as: + +\ifelse{latex}{\deqn{m_{(x, n)} = \frac{l_{n} - 0.5 \cdot \min \begin{cases}l_{n} \\ \textrm{lev}(x, n)\end{cases}}{l_{n} \cdot p_{n} \cdot k_{n}}}}{ + +\ifelse{html}{\figure{mo_matching_score.png}{options: width="300" alt="mo matching score"}}{m(x, n) = ( l_n * min(l_n, lev(x, n) ) ) / ( l_n * p_n * k_n )}} + +where: +\itemize{ +\item \eqn{x} is the user input; +\item \eqn{n} is a taxonomic name (genus, species, and subspecies); +\item \eqn{l_n} is the length of \eqn{n}; +\item \eqn{lev} is the \href{https://en.wikipedia.org/wiki/Levenshtein_distance}{Levenshtein distance function} (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n}; +\item \eqn{p_n} is the human pathogenic prevalence group of \eqn{n}, as described below; +\item \eqn{k_n} is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 1.25, Protozoa = 1.5, Chromista = 1.75, Archaea = 2, others = 3. +} + +The grouping into human pathogenic prevalence \eqn{p} is based on recent work from Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}) who extensively studied medical-scientific literature to categorise all bacterial species into these groups: +\itemize{ +\item \strong{Established}, if a taxonomic species has infected at least three persons in three or more references. These records have \code{prevalence = 1.15} in the \link{microorganisms} data set; +\item \strong{Putative}, if a taxonomic species has fewer than three known cases. These records have \code{prevalence = 1.25} in the \link{microorganisms} data set. +} + +Furthermore, +\itemize{ +\item Genera from the World Health Organization's (WHO) Priority Pathogen List have \code{prevalence = 1.0} in the \link{microorganisms} data set; +\item Any genus present in the \strong{established} list also has \code{prevalence = 1.15} in the \link{microorganisms} data set; +\item Any other genus present in the \strong{putative} list has \code{prevalence = 1.25} in the \link{microorganisms} data set; +\item Any other species or subspecies of which the genus is present in the two aforementioned groups, has \code{prevalence = 1.5} in the \link{microorganisms} data set; +\item Any \emph{non-bacterial} genus, species or subspecies of which the genus is present in the following list, has \code{prevalence = 1.25} in the \link{microorganisms} data set: \emph{Absidia}, \emph{Acanthamoeba}, \emph{Acremonium}, \emph{Actinomucor}, \emph{Aedes}, \emph{Alternaria}, \emph{Amoeba}, \emph{Ancylostoma}, \emph{Angiostrongylus}, \emph{Anisakis}, \emph{Anopheles}, \emph{Apophysomyces}, \emph{Arthroderma}, \emph{Aspergillus}, \emph{Aureobasidium}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Bipolaris}, \emph{Blastobotrys}, \emph{Blastocystis}, \emph{Blastomyces}, \emph{Candida}, \emph{Capillaria}, \emph{Chaetomium}, \emph{Chilomastix}, \emph{Chrysonilia}, \emph{Chrysosporium}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Clavispora}, \emph{Coccidioides}, \emph{Cokeromyces}, \emph{Conidiobolus}, \emph{Coniochaeta}, \emph{Contracaecum}, \emph{Cordylobia}, \emph{Cryptococcus}, \emph{Cryptosporidium}, \emph{Cunninghamella}, \emph{Curvularia}, \emph{Cyberlindnera}, \emph{Debaryozyma}, \emph{Demodex}, \emph{Dermatobia}, \emph{Dientamoeba}, \emph{Diphyllobothrium}, \emph{Dirofilaria}, \emph{Echinostoma}, \emph{Entamoeba}, \emph{Enterobius}, \emph{Epidermophyton}, \emph{Exidia}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Fasciola}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Geotrichum}, \emph{Giardia}, \emph{Graphium}, \emph{Haloarcula}, \emph{Halobacterium}, \emph{Halococcus}, \emph{Hansenula}, \emph{Hendersonula}, \emph{Heterophyes}, \emph{Histomonas}, \emph{Histoplasma}, \emph{Hortaea}, \emph{Hymenolepis}, \emph{Hypomyces}, \emph{Hysterothylacium}, \emph{Kloeckera}, \emph{Kluyveromyces}, \emph{Kodamaea}, \emph{Lacazia}, \emph{Leishmania}, \emph{Lichtheimia}, \emph{Lodderomyces}, \emph{Lomentospora}, \emph{Madurella}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Metagonimus}, \emph{Meyerozyma}, \emph{Microsporidium}, \emph{Microsporum}, \emph{Millerozyma}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Nannizzia}, \emph{Necator}, \emph{Nectria}, \emph{Ochroconis}, \emph{Oesophagostomum}, \emph{Oidiodendron}, \emph{Opisthorchis}, \emph{Paecilomyces}, \emph{Paracoccidioides}, \emph{Pediculus}, \emph{Penicillium}, \emph{Phaeoacremonium}, \emph{Phaeomoniella}, \emph{Phialophora}, \emph{Phlebotomus}, \emph{Phoma}, \emph{Pichia}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Pneumocystis}, \emph{Pseudallescheria}, \emph{Pseudoscopulariopsis}, \emph{Pseudoterranova}, \emph{Pulex}, \emph{Purpureocillium}, \emph{Quambalaria}, \emph{Rhinocladiella}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Saccharomyces}, \emph{Saksenaea}, \emph{Saprochaete}, \emph{Sarcoptes}, \emph{Scedosporium}, \emph{Schistosoma}, \emph{Schizosaccharomyces}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Spirometra}, \emph{Sporobolomyces}, \emph{Sporopachydermia}, \emph{Sporothrix}, \emph{Sporotrichum}, \emph{Stachybotrys}, \emph{Strongyloides}, \emph{Syncephalastrum}, \emph{Syngamus}, \emph{Taenia}, \emph{Talaromyces}, \emph{Teleomorph}, \emph{Toxocara}, \emph{Trichinella}, \emph{Trichobilharzia}, \emph{Trichoderma}, \emph{Trichomonas}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Trichostrongylus}, \emph{Trichuris}, \emph{Tritirachium}, \emph{Trombicula}, \emph{Trypanosoma}, \emph{Tunga}, \emph{Ulocladium}, \emph{Ustilago}, \emph{Verticillium}, \emph{Wallemia}, \emph{Wangiella}, \emph{Wickerhamomyces}, \emph{Wuchereria}, \emph{Yarrowia}, or \emph{Zygosaccharomyces}; +\item All other records have \code{prevalence = 2.0} in the \link{microorganisms} data set. +} + +When calculating the matching score, all characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses. + +All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.381}, a less prevalent microorganism in humans), although the latter would alphabetically come first. +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +mo_reset_session() + +as.mo("E. coli") +mo_uncertainties() + +mo_matching_score( + x = "E. coli", + n = c("Escherichia coli", "Entamoeba coli") +) +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/mo_property.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mo_property.R +\name{mo_property} +\alias{mo_property} +\alias{mo_name} +\alias{mo_fullname} +\alias{mo_shortname} +\alias{mo_subspecies} +\alias{mo_species} +\alias{mo_genus} +\alias{mo_family} +\alias{mo_order} +\alias{mo_class} +\alias{mo_phylum} +\alias{mo_kingdom} +\alias{mo_domain} +\alias{mo_type} +\alias{mo_status} +\alias{mo_pathogenicity} +\alias{mo_gramstain} +\alias{mo_is_gram_negative} +\alias{mo_is_gram_positive} +\alias{mo_is_yeast} +\alias{mo_is_intrinsic_resistant} +\alias{mo_oxygen_tolerance} +\alias{mo_is_anaerobic} +\alias{mo_snomed} +\alias{mo_ref} +\alias{mo_authors} +\alias{mo_year} +\alias{mo_lpsn} +\alias{mo_mycobank} +\alias{mo_gbif} +\alias{mo_rank} +\alias{mo_taxonomy} +\alias{mo_synonyms} +\alias{mo_current} +\alias{mo_group_members} +\alias{mo_info} +\alias{mo_url} +\title{Get Properties of a Microorganism} +\usage{ +mo_name( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_fullname( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_shortname( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_subspecies( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_species( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_genus( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_family( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_order( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_class( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_phylum( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_kingdom( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_domain( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_type( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_status( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_pathogenicity( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_gramstain( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_is_gram_negative( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_is_gram_positive( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_is_yeast( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_is_intrinsic_resistant( + x, + ab, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_oxygen_tolerance( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_is_anaerobic( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_snomed( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_ref( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_authors( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_year( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_lpsn( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_mycobank( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_gbif( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_rank( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_taxonomy( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_synonyms( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_current(x, language = get_AMR_locale(), ...) + +mo_group_members( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_info( + x, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_url( + x, + open = FALSE, + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) + +mo_property( + x, + property = "fullname", + language = get_AMR_locale(), + keep_synonyms = getOption("AMR_keep_synonyms", FALSE), + ... +) +} +\arguments{ +\item{x}{any \link{character} (vector) that can be coerced to a valid microorganism code with \code{\link[=as.mo]{as.mo()}}. Can be left blank for auto-guessing the column containing microorganism codes if used in a data set, see \emph{Examples}.} + +\item{language}{language to translate text like "no growth", which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})} + +\item{keep_synonyms}{a \link{logical} to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is \code{FALSE}, which will return a note if old taxonomic names were processed. The default can be set with the package option \code{\link[=AMR-options]{AMR_keep_synonyms}}, i.e. \code{options(AMR_keep_synonyms = TRUE)} or \code{options(AMR_keep_synonyms = FALSE)}.} + +\item{...}{other arguments passed on to \code{\link[=as.mo]{as.mo()}}, such as 'minimum_matching_score', 'ignore_pattern', and 'remove_from_input'} + +\item{ab}{any (vector of) text that can be coerced to a valid antibiotic drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{open}{browse the URL using \code{\link[utils:browseURL]{browseURL()}}} + +\item{property}{one of the column names of the \link{microorganisms} data set: "mo", "fullname", "status", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "oxygen_tolerance", "source", "lpsn", "lpsn_parent", "lpsn_renamed_to", "mycobank", "mycobank_parent", "mycobank_renamed_to", "gbif", "gbif_parent", "gbif_renamed_to", "prevalence", or "snomed", or must be \code{"shortname"}} +} +\value{ +\itemize{ +\item An \link{integer} in case of \code{\link[=mo_year]{mo_year()}} +\item An \link[=factor]{ordered factor} in case of \code{\link[=mo_pathogenicity]{mo_pathogenicity()}} +\item A \link{list} in case of \code{\link[=mo_taxonomy]{mo_taxonomy()}}, \code{\link[=mo_synonyms]{mo_synonyms()}}, \code{\link[=mo_snomed]{mo_snomed()}}, and \code{\link[=mo_info]{mo_info()}} +\item A \link{logical} in case of \code{\link[=mo_is_anaerobic]{mo_is_anaerobic()}}, \code{\link[=mo_is_gram_negative]{mo_is_gram_negative()}}, \code{\link[=mo_is_gram_positive]{mo_is_gram_positive()}}, \code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}}, and \code{\link[=mo_is_yeast]{mo_is_yeast()}} +\item A named \link{character} in case of \code{\link[=mo_synonyms]{mo_synonyms()}} and \code{\link[=mo_url]{mo_url()}} +\item A \link{character} in all other cases +} +} +\description{ +Use these functions to return a specific property of a microorganism based on the latest accepted taxonomy. All input values will be evaluated internally with \code{\link[=as.mo]{as.mo()}}, which makes it possible to use microbial abbreviations, codes and names as input. See \emph{Examples}. +} +\details{ +All functions will, at default, \strong{not} keep old taxonomic properties, as synonyms are automatically replaced with the current taxonomy. Take for example \emph{Enterobacter aerogenes}, which was initially named in 1960 but renamed to \emph{Klebsiella aerogenes} in 2017: +\itemize{ +\item \code{mo_genus("Enterobacter aerogenes")} will return \code{"Klebsiella"} (with a note about the renaming) +\item \code{mo_genus("Enterobacter aerogenes", keep_synonyms = TRUE)} will return \code{"Enterobacter"} (with a once-per-session warning that the name is outdated) +\item \code{mo_ref("Enterobacter aerogenes")} will return \code{"Tindall et al., 2017"} (with a note about the renaming) +\item \code{mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)} will return \code{"Hormaeche et al., 1960"} (with a once-per-session warning that the name is outdated) +} + +The short name (\code{\link[=mo_shortname]{mo_shortname()}}) returns the first character of the genus and the full species, such as \code{"E. coli"}, for species and subspecies. Exceptions are abbreviations of staphylococci (such as \emph{"CoNS"}, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as \emph{"GBS"}, Group B Streptococci). Please bear in mind that e.g. \emph{E. coli} could mean \emph{Escherichia coli} (kingdom of Bacteria) as well as \emph{Entamoeba coli} (kingdom of Protozoa). Returning to the full name will be done using \code{\link[=as.mo]{as.mo()}} internally, giving priority to bacteria and human pathogens, i.e. \code{"E. coli"} will be considered \emph{Escherichia coli}. As a result, \code{mo_fullname(mo_shortname("Entamoeba coli"))} returns \code{"Escherichia coli"}. + +Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions \code{\link[=mo_kingdom]{mo_kingdom()}} and \code{\link[=mo_domain]{mo_domain()}} return the exact same results. + +Determination of human pathogenicity (\code{\link[=mo_pathogenicity]{mo_pathogenicity()}}) is strongly based on Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}). This function returns a \link{factor} with the levels \emph{Pathogenic}, \emph{Potentially pathogenic}, \emph{Non-pathogenic}, and \emph{Unknown}. + +Determination of the Gram stain (\code{\link[=mo_gramstain]{mo_gramstain()}}) will be based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, \href{https://pubmed.ncbi.nlm.nih.gov/11837318/}{PMID 11837318}), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, \href{https://pubmed.ncbi.nlm.nih.gov/34694987/}{PMID 34694987}). Bacteria in these phyla are considered Gram-positive in this \code{AMR} package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value \code{NA}. Functions \code{\link[=mo_is_gram_negative]{mo_is_gram_negative()}} and \code{\link[=mo_is_gram_positive]{mo_is_gram_positive()}} always return \code{TRUE} or \code{FALSE} (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}), thus always return \code{FALSE} for species outside the taxonomic kingdom of Bacteria. + +Determination of yeasts (\code{\link[=mo_is_yeast]{mo_is_yeast()}}) will be based on the taxonomic kingdom and class. \emph{Budding yeasts} are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. \emph{True yeasts} quite specifically refers to yeasts in the underlying order Saccharomycetales (such as \emph{Saccharomyces cerevisiae}). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return \code{TRUE}. It returns \code{FALSE} otherwise (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}). + +Determination of intrinsic resistance (\code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}}) will be based on the \link{intrinsic_resistant} data set, which is based on \href{https://www.eucast.org/expert_rules_and_expected_phenotypes}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021). The \code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}} function can be vectorised over both argument \code{x} (input for microorganisms) and \code{ab} (input for antibiotics). + +Determination of bacterial oxygen tolerance (\code{\link[=mo_oxygen_tolerance]{mo_oxygen_tolerance()}}) will be based on BacDive, see \emph{Source}. The function \code{\link[=mo_is_anaerobic]{mo_is_anaerobic()}} only returns \code{TRUE} if the oxygen tolerance is \code{"anaerobe"}, indicting an obligate anaerobic species or genus. It always returns \code{FALSE} for species outside the taxonomic kingdom of Bacteria. + +The function \code{\link[=mo_url]{mo_url()}} will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. \href{https://www.mycobank.org}{This MycoBank URL} will be used for fungi wherever available , \href{https://www.mycobank.org}{this LPSN URL} for bacteria wherever available, and \href{https://www.gbif.org}{this GBIF link} otherwise. + +SNOMED codes (\code{\link[=mo_snomed]{mo_snomed()}}) was last updated on July 16th, 2024. See \emph{Source} and the \link{microorganisms} data set for more info. + +Old taxonomic names (so-called 'synonyms') can be retrieved with \code{\link[=mo_synonyms]{mo_synonyms()}} (which will have the scientific reference as \link[base:names]{name}), the current taxonomic name can be retrieved with \code{\link[=mo_current]{mo_current()}}. Both functions return full names. + +All output \link[=translate]{will be translated} where possible. +} +\section{Matching Score for Microorganisms}{ + +This function uses \code{\link[=as.mo]{as.mo()}} internally, which uses an advanced algorithm to translate arbitrary user input to valid taxonomy using a so-called matching score. You can read about this public algorithm on the \link[=mo_matching_score]{MO matching score page}. +} + +\section{Source}{ + +\itemize{ +\item Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03} +\item Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on June 24th, 2024. +\item Vincent, R \emph{et al} (2013). \strong{MycoBank gearing up for new horizons.} IMA Fungus, 4(2), 371-9; \doi{10.5598/imafungus.2013.04.02.16}. Accessed from \url{https://www.mycobank.org} on June 24th, 2024. +\item GBIF Secretariat (2023). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on June 24th, 2024. +\item Reimer, LC \emph{et al.} (2022). \strong{\emph{BacDive} in 2022: the knowledge base for standardized bacterial and archaeal data.} Nucleic Acids Res., 50(D1):D741-D74; \doi{10.1093/nar/gkab961}. Accessed from \url{https://bacdive.dsmz.de} on July 16th, 2024. +\item Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microorganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: \url{https://phinvads.cdc.gov} +\item Bartlett A \emph{et al.} (2022). \strong{A comprehensive list of bacterial pathogens infecting humans} \emph{Microbiology} 168:001269; \doi{10.1099/mic.0.001269} +} +} + +\section{Reference Data Publicly Available}{ + +All data sets in this \code{AMR} package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}. +} + +\examples{ +# taxonomic tree ----------------------------------------------------------- + +mo_kingdom("Klebsiella pneumoniae") +mo_phylum("Klebsiella pneumoniae") +mo_class("Klebsiella pneumoniae") +mo_order("Klebsiella pneumoniae") +mo_family("Klebsiella pneumoniae") +mo_genus("Klebsiella pneumoniae") +mo_species("Klebsiella pneumoniae") +mo_subspecies("Klebsiella pneumoniae") + + +# full names and short names ----------------------------------------------- + +mo_name("Klebsiella pneumoniae") +mo_fullname("Klebsiella pneumoniae") +mo_shortname("Klebsiella pneumoniae") + + +# other properties --------------------------------------------------------- + +mo_pathogenicity("Klebsiella pneumoniae") +mo_gramstain("Klebsiella pneumoniae") +mo_snomed("Klebsiella pneumoniae") +mo_type("Klebsiella pneumoniae") +mo_rank("Klebsiella pneumoniae") +mo_url("Klebsiella pneumoniae") +mo_is_yeast(c("Candida", "Trichophyton", "Klebsiella")) + +mo_group_members(c("Streptococcus group A", + "Streptococcus group C", + "Streptococcus group G", + "Streptococcus group L")) + + +# scientific reference ----------------------------------------------------- + +mo_ref("Klebsiella aerogenes") +mo_authors("Klebsiella aerogenes") +mo_year("Klebsiella aerogenes") +mo_synonyms("Klebsiella aerogenes") +mo_lpsn("Klebsiella aerogenes") +mo_gbif("Klebsiella aerogenes") +mo_mycobank("Candida albicans") +mo_mycobank("Candida krusei") +mo_mycobank("Candida krusei", keep_synonyms = TRUE) + + +# abbreviations known in the field ----------------------------------------- + +mo_genus("MRSA") +mo_species("MRSA") +mo_shortname("VISA") +mo_gramstain("VISA") + +mo_genus("EHEC") +mo_species("EIEC") +mo_name("UPEC") + + +# known subspecies --------------------------------------------------------- + +mo_fullname("K. pneu rh") +mo_shortname("K. pneu rh") + +\donttest{ +# Becker classification, see ?as.mo ---------------------------------------- + +mo_fullname("Staph epidermidis") +mo_fullname("Staph epidermidis", Becker = TRUE) +mo_shortname("Staph epidermidis") +mo_shortname("Staph epidermidis", Becker = TRUE) + + +# Lancefield classification, see ?as.mo ------------------------------------ + +mo_fullname("Strep agalactiae") +mo_fullname("Strep agalactiae", Lancefield = TRUE) +mo_shortname("Strep agalactiae") +mo_shortname("Strep agalactiae", Lancefield = TRUE) + + +# language support -------------------------------------------------------- + +mo_gramstain("Klebsiella pneumoniae", language = "de") # German +mo_gramstain("Klebsiella pneumoniae", language = "nl") # Dutch +mo_gramstain("Klebsiella pneumoniae", language = "es") # Spanish +mo_gramstain("Klebsiella pneumoniae", language = "el") # Greek +mo_gramstain("Klebsiella pneumoniae", language = "uk") # Ukrainian + +# mo_type is equal to mo_kingdom, but mo_kingdom will remain untranslated +mo_kingdom("Klebsiella pneumoniae") +mo_type("Klebsiella pneumoniae") +mo_kingdom("Klebsiella pneumoniae", language = "zh") # Chinese, no effect +mo_type("Klebsiella pneumoniae", language = "zh") # Chinese, translated + +mo_fullname("S. pyogenes", Lancefield = TRUE, language = "de") +mo_fullname("S. pyogenes", Lancefield = TRUE, language = "uk") + + +# other -------------------------------------------------------------------- + +# gram stains and intrinsic resistance can be used as a filter in dplyr verbs +if (require("dplyr")) { + example_isolates \%>\% + filter(mo_is_gram_positive()) \%>\% + count(mo_genus(), sort = TRUE) +} +if (require("dplyr")) { + example_isolates \%>\% + filter(mo_is_intrinsic_resistant(ab = "vanco")) \%>\% + count(mo_genus(), sort = TRUE) +} + +# get a list with the complete taxonomy (from kingdom to subspecies) +mo_taxonomy("Klebsiella pneumoniae") + +# get a list with the taxonomy, the authors, Gram-stain, +# SNOMED codes, and URL to the online database +mo_info("Klebsiella pneumoniae") +} +} +\seealso{ +Data set \link{microorganisms} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/mo_source.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/mo_source.R +\name{mo_source} +\alias{mo_source} +\alias{set_mo_source} +\alias{get_mo_source} +\title{User-Defined Reference Data Set for Microorganisms} +\usage{ +set_mo_source( + path, + destination = getOption("AMR_mo_source", "~/mo_source.rds") +) + +get_mo_source(destination = getOption("AMR_mo_source", "~/mo_source.rds")) +} +\arguments{ +\item{path}{location of your reference file, this can be any text file (comma-, tab- or pipe-separated) or an Excel file (see \emph{Details}). Can also be \code{""}, \code{NULL} or \code{FALSE} to delete the reference file.} + +\item{destination}{destination of the compressed data file - the default is the user's home directory.} +} +\description{ +These functions can be used to predefine your own reference to be used in \code{\link[=as.mo]{as.mo()}} and consequently all \code{\link[=mo_property]{mo_*}} functions (such as \code{\link[=mo_genus]{mo_genus()}} and \code{\link[=mo_gramstain]{mo_gramstain()}}). + +This is \strong{the fastest way} to have your organisation (or analysis) specific codes picked up and translated by this package, since you don't have to bother about it again after setting it up once. +} +\details{ +The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an \R object file (extension '.rds'). To use an Excel file, you will need to have the \code{readxl} package installed. + +\code{\link[=set_mo_source]{set_mo_source()}} will check the file for validity: it must be a \link{data.frame}, must have a column named \code{"mo"} which contains values from \code{\link[=microorganisms]{microorganisms$mo}} or \code{\link[=microorganisms]{microorganisms$fullname}} and must have a reference column with your own defined values. If all tests pass, \code{\link[=set_mo_source]{set_mo_source()}} will read the file into \R and will ask to export it to \code{"~/mo_source.rds"}. The CRAN policy disallows packages to write to the file system, although '\emph{exceptions may be allowed in interactive sessions if the package obtains confirmation from the user}'. For this reason, this function only works in interactive sessions so that the user can \strong{specifically confirm and allow} that this file will be created. The destination of this file can be set with the \code{destination} argument and defaults to the user's home directory. It can also be set with the package option \code{\link[=AMR-options]{AMR_mo_source}}, e.g. \code{options(AMR_mo_source = "my/location/file.rds")}. + +The created compressed data file \code{"mo_source.rds"} will be used at default for MO determination (function \code{\link[=as.mo]{as.mo()}} and consequently all \verb{mo_*} functions like \code{\link[=mo_genus]{mo_genus()}} and \code{\link[=mo_gramstain]{mo_gramstain()}}). The location and timestamp of the original file will be saved as an \link[base:attributes]{attribute} to the compressed data file. + +The function \code{\link[=get_mo_source]{get_mo_source()}} will return the data set by reading \code{"mo_source.rds"} with \code{\link[=readRDS]{readRDS()}}. If the original file has changed (by checking the location and timestamp of the original file), it will call \code{\link[=set_mo_source]{set_mo_source()}} to update the data file automatically if used in an interactive session. + +Reading an Excel file (\code{.xlsx}) with only one row has a size of 8-9 kB. The compressed file created with \code{\link[=set_mo_source]{set_mo_source()}} will then have a size of 0.1 kB and can be read by \code{\link[=get_mo_source]{get_mo_source()}} in only a couple of microseconds (millionths of a second). +} +\section{How to Setup}{ + + +Imagine this data on a sheet of an Excel file. The first column contains the organisation specific codes, the second column contains valid taxonomic names: + +\if{html}{\out{
}}\preformatted{ | A | B | +--|--------------------|-----------------------| +1 | Organisation XYZ | mo | +2 | lab_mo_ecoli | Escherichia coli | +3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +4 | | | +}\if{html}{\out{
}} + +We save it as \code{"home/me/ourcodes.xlsx"}. Now we have to set it as a source: + +\if{html}{\out{
}}\preformatted{set_mo_source("home/me/ourcodes.xlsx") +#> NOTE: Created mo_source file '/Users/me/mo_source.rds' (0.3 kB) from +#> '/Users/me/Documents/ourcodes.xlsx' (9 kB), columns +#> "Organisation XYZ" and "mo" +}\if{html}{\out{
}} + +It has now created a file \code{"~/mo_source.rds"} with the contents of our Excel file. Only the first column with foreign values and the 'mo' column will be kept when creating the RDS file. + +And now we can use it in our functions: + +\if{html}{\out{
}}\preformatted{as.mo("lab_mo_ecoli") +#> Class 'mo' +#> [1] B_ESCHR_COLI + +mo_genus("lab_mo_kpneumoniae") +#> [1] "Klebsiella" + +# other input values still work too +as.mo(c("Escherichia coli", "E. coli", "lab_mo_ecoli")) +#> NOTE: Translation to one microorganism was guessed with uncertainty. +#> Use mo_uncertainties() to review it. +#> Class 'mo' +#> [1] B_ESCHR_COLI B_ESCHR_COLI B_ESCHR_COLI +}\if{html}{\out{
}} + +If we edit the Excel file by, let's say, adding row 4 like this: + +\if{html}{\out{
}}\preformatted{ | A | B | +--|--------------------|-----------------------| +1 | Organisation XYZ | mo | +2 | lab_mo_ecoli | Escherichia coli | +3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +4 | lab_Staph_aureus | Staphylococcus aureus | +5 | | | +}\if{html}{\out{
}} + +...any new usage of an MO function in this package will update your data file: + +\if{html}{\out{
}}\preformatted{as.mo("lab_mo_ecoli") +#> NOTE: Updated mo_source file '/Users/me/mo_source.rds' (0.3 kB) from +#> '/Users/me/Documents/ourcodes.xlsx' (9 kB), columns +#> "Organisation XYZ" and "mo" +#> Class 'mo' +#> [1] B_ESCHR_COLI + +mo_genus("lab_Staph_aureus") +#> [1] "Staphylococcus" +}\if{html}{\out{
}} + +To delete the reference data file, just use \code{""}, \code{NULL} or \code{FALSE} as input for \code{\link[=set_mo_source]{set_mo_source()}}: + +\if{html}{\out{
}}\preformatted{set_mo_source(NULL) +#> Removed mo_source file '/Users/me/mo_source.rds' +}\if{html}{\out{
}} + +If the original file (in the previous case an Excel file) is moved or deleted, the \code{mo_source.rds} file will be removed upon the next use of \code{\link[=as.mo]{as.mo()}} or any \code{\link[=mo_property]{mo_*}} function. +} + + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/pca.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/pca.R +\name{pca} +\alias{pca} +\title{Principal Component Analysis (for AMR)} +\usage{ +pca( + x, + ..., + retx = TRUE, + center = TRUE, + scale. = TRUE, + tol = NULL, + rank. = NULL +) +} +\arguments{ +\item{x}{a \link{data.frame} containing \link{numeric} columns} + +\item{...}{columns of \code{x} to be selected for PCA, can be unquoted since it supports quasiquotation.} + +\item{retx}{a logical value indicating whether the rotated variables + should be returned.} + +\item{center}{a logical value indicating whether the variables + should be shifted to be zero centered. Alternately, a vector of + length equal the number of columns of \code{x} can be supplied. + The value is passed to \code{scale}.} + +\item{scale.}{a logical value indicating whether the variables should + be scaled to have unit variance before the analysis takes + place. The default is \code{FALSE} for consistency with S, but + in general scaling is advisable. Alternatively, a vector of length + equal the number of columns of \code{x} can be supplied. The + value is passed to \code{\link{scale}}.} + +\item{tol}{a value indicating the magnitude below which components + should be omitted. (Components are omitted if their + standard deviations are less than or equal to \code{tol} times the + standard deviation of the first component.) With the default null + setting, no components are omitted (unless \code{rank.} is specified + less than \code{min(dim(x))}.). Other settings for \code{tol} could be + \code{tol = 0} or \code{tol = sqrt(.Machine$double.eps)}, which + would omit essentially constant components.} + +\item{rank.}{optionally, a number specifying the maximal rank, i.e., + maximal number of principal components to be used. Can be set as + alternative or in addition to \code{tol}, useful notably when the + desired rank is considerably smaller than the dimensions of the matrix.} +} +\value{ +An object of classes \link{pca} and \link{prcomp} +} +\description{ +Performs a principal component analysis (PCA) based on a data set with automatic determination for afterwards plotting the groups and labels, and automatic filtering on only suitable (i.e. non-empty and numeric) variables. +} +\details{ +The \code{\link[=pca]{pca()}} function takes a \link{data.frame} as input and performs the actual PCA with the \R function \code{\link[=prcomp]{prcomp()}}. + +The result of the \code{\link[=pca]{pca()}} function is a \link{prcomp} object, with an additional attribute \code{non_numeric_cols} which is a vector with the column names of all columns that do not contain \link{numeric} values. These are probably the groups and labels, and will be used by \code{\link[=ggplot_pca]{ggplot_pca()}}. +} +\examples{ +# `example_isolates` is a data set available in the AMR package. +# See ?example_isolates. + +\donttest{ +if (require("dplyr")) { + # calculate the resistance per group first + resistance_data <- example_isolates \%>\% + group_by( + order = mo_order(mo), # group on anything, like order + genus = mo_genus(mo) + ) \%>\% # and genus as we do here; + filter(n() >= 30) \%>\% # filter on only 30 results per group + summarise_if(is.sir, resistance) # then get resistance of all drugs + + # now conduct PCA for certain antimicrobial drugs + pca_result <- resistance_data \%>\% + pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT) + + pca_result + summary(pca_result) + + # old base R plotting method: + biplot(pca_result) + # new ggplot2 plotting method using this package: + if (require("ggplot2")) { + ggplot_pca(pca_result) + + ggplot_pca(pca_result) + + scale_colour_viridis_d() + + labs(title = "Title here") + } +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/plot.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/plotting.R +\name{plot} +\alias{plot} +\alias{scale_x_mic} +\alias{scale_y_mic} +\alias{scale_colour_mic} +\alias{scale_fill_mic} +\alias{plot.mic} +\alias{autoplot.mic} +\alias{fortify.mic} +\alias{plot.disk} +\alias{autoplot.disk} +\alias{fortify.disk} +\alias{plot.sir} +\alias{autoplot.sir} +\alias{fortify.sir} +\title{Plotting for Classes \code{sir}, \code{mic} and \code{disk}} +\usage{ +scale_x_mic(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) + +scale_y_mic(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) + +scale_colour_mic(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) + +scale_fill_mic(keep_operators = "edges", mic_range = NULL, drop = FALSE, ...) + +\method{plot}{mic}( + x, + mo = NULL, + ab = NULL, + guideline = "EUCAST", + main = deparse(substitute(x)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Minimum Inhibitory Concentration (mg/L)", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ... +) + +\method{autoplot}{mic}( + object, + mo = NULL, + ab = NULL, + guideline = "EUCAST", + title = deparse(substitute(object)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Minimum Inhibitory Concentration (mg/L)", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ... +) + +\method{fortify}{mic}(object, ...) + +\method{plot}{disk}( + x, + main = deparse(substitute(x)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Disk diffusion diameter (mm)", language = language), + mo = NULL, + ab = NULL, + guideline = "EUCAST", + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ... +) + +\method{autoplot}{disk}( + object, + mo = NULL, + ab = NULL, + title = deparse(substitute(object)), + ylab = translate_AMR("Frequency", language = language), + xlab = translate_AMR("Disk diffusion diameter (mm)", language = language), + guideline = "EUCAST", + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + expand = TRUE, + include_PKPD = getOption("AMR_include_PKPD", TRUE), + breakpoint_type = getOption("AMR_breakpoint_type", "human"), + ... +) + +\method{fortify}{disk}(object, ...) + +\method{plot}{sir}( + x, + ylab = translate_AMR("Percentage", language = language), + xlab = translate_AMR("Antimicrobial Interpretation", language = language), + main = deparse(substitute(x)), + language = get_AMR_locale(), + ... +) + +\method{autoplot}{sir}( + object, + title = deparse(substitute(object)), + xlab = translate_AMR("Antimicrobial Interpretation", language = language), + ylab = translate_AMR("Frequency", language = language), + colours_SIR = c("#3CAEA3", "#F6D55C", "#ED553B"), + language = get_AMR_locale(), + ... +) + +\method{fortify}{sir}(object, ...) +} +\arguments{ +\item{keep_operators}{a \link{character} specifying how to handle operators (such as \code{>} and \code{<=}) in the input. Accepts one of three values: \code{"all"} (or \code{TRUE}) to keep all operators, \code{"none"} (or \code{FALSE}) to remove all operators, or \code{"edges"} to keep operators only at both ends of the range.} + +\item{mic_range}{a manual range to limit the MIC values, e.g., \code{mic_range = c(0.001, 32)}. Use \code{NA} to set no limit on one side, e.g., \code{mic_range = c(NA, 32)}.} + +\item{drop}{a \link{logical} to remove intermediate MIC values, defaults to \code{FALSE}} + +\item{...}{arguments passed on to methods} + +\item{x, object}{values created with \code{\link[=as.mic]{as.mic()}}, \code{\link[=as.disk]{as.disk()}} or \code{\link[=as.sir]{as.sir()}} (or their \verb{random_*} variants, such as \code{\link[=random_mic]{random_mic()}})} + +\item{mo}{any (vector of) text that can be coerced to a valid microorganism code with \code{\link[=as.mo]{as.mo()}}} + +\item{ab}{any (vector of) text that can be coerced to a valid antimicrobial drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{guideline}{interpretation guideline to use - the default is the latest included EUCAST guideline, see \emph{Details}} + +\item{main, title}{title of the plot} + +\item{xlab, ylab}{axis title} + +\item{colours_SIR}{colours to use for filling in the bars, must be a vector of three values (in the order S, I and R). The default colours are colour-blind friendly.} + +\item{language}{language to be used to translate 'Susceptible', 'Increased exposure'/'Intermediate' and 'Resistant' - the default is system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can be overwritten by setting the package option \code{\link[=AMR-options]{AMR_locale}}, e.g. \code{options(AMR_locale = "de")}, see \link{translate}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{expand}{a \link{logical} to indicate whether the range on the x axis should be expanded between the lowest and highest value. For MIC values, intermediate values will be factors of 2 starting from the highest MIC value. For disk diameters, the whole diameter range will be filled.} + +\item{include_PKPD}{a \link{logical} to indicate that PK/PD clinical breakpoints must be applied as a last resort - the default is \code{TRUE}. Can also be set with the package option \code{\link[=AMR-options]{AMR_include_PKPD}}.} + +\item{breakpoint_type}{the type of breakpoints to use, either "ECOFF", "animal", or "human". ECOFF stands for Epidemiological Cut-Off values. The default is \code{"human"}, which can also be set with the package option \code{\link[=AMR-options]{AMR_breakpoint_type}}. If \code{host} is set to values of veterinary species, this will automatically be set to \code{"animal"}.} +} +\value{ +The \code{autoplot()} functions return a \code{\link[ggplot2:ggplot]{ggplot}} model that is extendible with any \code{ggplot2} function. + +The \code{fortify()} functions return a \link{data.frame} as an extension for usage in the \code{\link[ggplot2:ggplot]{ggplot2::ggplot()}} function. +} +\description{ +Functions to plot classes \code{sir}, \code{mic} and \code{disk}, with support for base \R and \code{ggplot2}. + +Especially the \verb{scale_*_mic()} functions are relevant wrappers to plot MIC values for \code{ggplot2}. They allows custom MIC ranges and to plot intermediate log2 levels for missing MIC values. +} +\details{ +The interpretation of "I" will be named "Increased exposure" for all EUCAST guidelines since 2019, and will be named "Intermediate" in all other cases. + +For interpreting MIC values as well as disk diffusion diameters, supported guidelines to be used as input for the \code{guideline} argument are: "EUCAST 2024", "EUCAST 2023", "EUCAST 2022", "EUCAST 2021", "EUCAST 2020", "EUCAST 2019", "EUCAST 2018", "EUCAST 2017", "EUCAST 2016", "EUCAST 2015", "EUCAST 2014", "EUCAST 2013", "EUCAST 2012", "EUCAST 2011", "CLSI 2024", "CLSI 2023", "CLSI 2022", "CLSI 2021", "CLSI 2020", "CLSI 2019", "CLSI 2018", "CLSI 2017", "CLSI 2016", "CLSI 2015", "CLSI 2014", "CLSI 2013", "CLSI 2012", and "CLSI 2011". + +Simply using \code{"CLSI"} or \code{"EUCAST"} as input will automatically select the latest version of that guideline. +} +\examples{ +some_mic_values <- random_mic(size = 100) +some_disk_values <- random_disk(size = 100, mo = "Escherichia coli", ab = "cipro") +some_sir_values <- random_sir(50, prob_SIR = c(0.55, 0.05, 0.30)) + +plot(some_mic_values) +plot(some_disk_values) +plot(some_sir_values) + +# when providing the microorganism and antibiotic, colours will show interpretations: +plot(some_mic_values, mo = "S. aureus", ab = "ampicillin") +plot(some_disk_values, mo = "Escherichia coli", ab = "cipro") +plot(some_disk_values, mo = "Escherichia coli", ab = "cipro", language = "nl") + + +# Plotting using scale_x_mic() +\donttest{ +if (require("ggplot2")) { + mic_plot <- ggplot(data.frame(mics = as.mic(c(0.25, "<=4", 4, 8, 32, ">=32")), + counts = c(1, 1, 2, 2, 3, 3)), + aes(mics, counts)) + + geom_col() + mic_plot + + labs(title = "without scale_x_mic()") +} +if (require("ggplot2")) { + mic_plot + + scale_x_mic() + + labs(title = "with scale_x_mic()") +} +if (require("ggplot2")) { + mic_plot + + scale_x_mic(keep_operators = "all") + + labs(title = "with scale_x_mic() keeping all operators") +} +if (require("ggplot2")) { + mic_plot + + scale_x_mic(mic_range = c(1, 16)) + + labs(title = "with scale_x_mic() using a manual 'within' range") +} +if (require("ggplot2")) { + mic_plot + + scale_x_mic(mic_range = c(0.032, 256)) + + labs(title = "with scale_x_mic() using a manual 'outside' range") +} + +if (require("ggplot2")) { + autoplot(some_mic_values) +} +if (require("ggplot2")) { + autoplot(some_disk_values, mo = "Escherichia coli", ab = "cipro") +} +if (require("ggplot2")) { + autoplot(some_sir_values) +} +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/proportion.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/proportion.R, R/sir_df.R +\name{proportion} +\alias{proportion} +\alias{resistance} +\alias{portion} +\alias{susceptibility} +\alias{sir_confidence_interval} +\alias{proportion_R} +\alias{proportion_IR} +\alias{proportion_I} +\alias{proportion_SI} +\alias{proportion_S} +\alias{proportion_df} +\alias{sir_df} +\title{Calculate Antimicrobial Resistance} +\source{ +\strong{M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 5th Edition}, 2022, \emph{Clinical and Laboratory Standards Institute (CLSI)}. \url{https://clsi.org/standards/products/microbiology/documents/m39/}. +} +\usage{ +resistance(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +susceptibility(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +sir_confidence_interval( + ..., + ab_result = "R", + minimum = 30, + as_percent = FALSE, + only_all_tested = FALSE, + confidence_level = 0.95, + side = "both", + collapse = FALSE +) + +proportion_R(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +proportion_IR(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +proportion_I(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +proportion_SI(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +proportion_S(..., minimum = 30, as_percent = FALSE, only_all_tested = FALSE) + +proportion_df( + data, + translate_ab = "name", + language = get_AMR_locale(), + minimum = 30, + as_percent = FALSE, + combine_SI = TRUE, + confidence_level = 0.95 +) + +sir_df( + data, + translate_ab = "name", + language = get_AMR_locale(), + minimum = 30, + as_percent = FALSE, + combine_SI = TRUE, + confidence_level = 0.95 +) +} +\arguments{ +\item{...}{one or more vectors (or columns) with antibiotic interpretations. They will be transformed internally with \code{\link[=as.sir]{as.sir()}} if needed. Use multiple columns to calculate (the lack of) co-resistance: the probability where one of two drugs have a resistant or susceptible result. See \emph{Examples}.} + +\item{minimum}{the minimum allowed number of available (tested) isolates. Any isolate count lower than \code{minimum} will return \code{NA} with a warning. The default number of \code{30} isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see \emph{Source}.} + +\item{as_percent}{a \link{logical} to indicate whether the output must be returned as a hundred fold with \% sign (a character). A value of \code{0.123456} will then be returned as \code{"12.3\%"}.} + +\item{only_all_tested}{(for combination therapies, i.e. using more than one variable for \code{...}): a \link{logical} to indicate that isolates must be tested for all antibiotics, see section \emph{Combination Therapy} below} + +\item{ab_result}{antibiotic results to test against, must be one or more values of "S", "SDD", "I", or "R"} + +\item{confidence_level}{the confidence level for the returned confidence interval. For the calculation, the number of S or SI isolates, and R isolates are compared with the total number of available isolates with R, S, or I by using \code{\link[=binom.test]{binom.test()}}, i.e., the Clopper-Pearson method.} + +\item{side}{the side of the confidence interval to return. The default is \code{"both"} for a length 2 vector, but can also be (abbreviated as) \code{"min"}/\code{"left"}/\code{"lower"}/\code{"less"} or \code{"max"}/\code{"right"}/\code{"higher"}/\code{"greater"}.} + +\item{collapse}{a \link{logical} to indicate whether the output values should be 'collapsed', i.e. be merged together into one value, or a character value to use for collapsing} + +\item{data}{a \link{data.frame} containing columns with class \code{\link{sir}} (see \code{\link[=as.sir]{as.sir()}})} + +\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}} + +\item{language}{language of the returned text - the default is the current system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can also be set with the package option \code{\link[=AMR-options]{AMR_locale}}. Use \code{language = NULL} or \code{language = ""} to prevent translation.} + +\item{combine_SI}{a \link{logical} to indicate whether all values of S, SDD, and I must be merged into one, so the output only consists of S+SDD+I vs. R (susceptible vs. resistant) - the default is \code{TRUE}} +} +\value{ +A \link{double} or, when \code{as_percent = TRUE}, a \link{character}. +} +\description{ +These functions can be used to calculate the (co-)resistance or susceptibility of microbial isolates (i.e. percentage of S, SI, I, IR or R). All functions support quasiquotation with pipes, can be used in \code{summarise()} from the \code{dplyr} package and also support grouped variables, see \emph{Examples}. + +\code{\link[=resistance]{resistance()}} should be used to calculate resistance, \code{\link[=susceptibility]{susceptibility()}} should be used to calculate susceptibility.\cr +} +\details{ +\strong{Remember that you should filter your data to let it contain only first isolates!} This is needed to exclude duplicates and to reduce selection bias. Use \code{\link[=first_isolate]{first_isolate()}} to determine them in your data set with one of the four available algorithms. + +The function \code{\link[=resistance]{resistance()}} is equal to the function \code{\link[=proportion_R]{proportion_R()}}. The function \code{\link[=susceptibility]{susceptibility()}} is equal to the function \code{\link[=proportion_SI]{proportion_SI()}}. Since AMR v3.0, \code{\link[=proportion_SI]{proportion_SI()}} and \code{\link[=proportion_I]{proportion_I()}} include dose-dependent susceptibility ('SDD'). + +Use \code{\link[=sir_confidence_interval]{sir_confidence_interval()}} to calculate the confidence interval, which relies on \code{\link[=binom.test]{binom.test()}}, i.e., the Clopper-Pearson method. This function returns a vector of length 2 at default for antimicrobial \emph{resistance}. Change the \code{side} argument to "left"/"min" or "right"/"max" to return a single value, and change the \code{ab_result} argument to e.g. \code{c("S", "I")} to test for antimicrobial \emph{susceptibility}, see Examples. + +These functions are not meant to count isolates, but to calculate the proportion of resistance/susceptibility. Use the \code{\link[=count]{count_*()}} functions to count isolates. The function \code{\link[=susceptibility]{susceptibility()}} is essentially equal to \code{\link[=count_susceptible]{count_susceptible()}}\code{/}\code{\link[=count_all]{count_all()}}. \emph{Low counts can influence the outcome - the \verb{proportion_*()} functions may camouflage this, since they only return the proportion (albeit dependent on the \code{minimum} argument).} + +The function \code{\link[=proportion_df]{proportion_df()}} takes any variable from \code{data} that has an \code{\link{sir}} class (created with \code{\link[=as.sir]{as.sir()}}) and calculates the proportions S, I, and R. It also supports grouped variables. The function \code{\link[=sir_df]{sir_df()}} works exactly like \code{\link[=proportion_df]{proportion_df()}}, but adds the number of isolates. +} +\section{Combination Therapy}{ + +When using more than one variable for \code{...} (= combination therapy), use \code{only_all_tested} to only count isolates that are tested for all antibiotics/variables that you test them for. See this example for two antibiotics, Drug A and Drug B, about how \code{\link[=susceptibility]{susceptibility()}} works to calculate the \%SI: + +\if{html}{\out{
}}\preformatted{-------------------------------------------------------------------- + only_all_tested = FALSE only_all_tested = TRUE + ----------------------- ----------------------- + Drug A Drug B include as include as include as include as + numerator denominator numerator denominator +-------- -------- ---------- ----------- ---------- ----------- + S or I S or I X X X X + R S or I X X X X + S or I X X - - + S or I R X X X X + R R - X - X + R - - - - + S or I X X - - + R - - - - + - - - - +-------------------------------------------------------------------- +}\if{html}{\out{
}} + +Please note that, in combination therapies, for \code{only_all_tested = TRUE} applies that: + +\if{html}{\out{
}}\preformatted{ count_S() + count_I() + count_R() = count_all() + proportion_S() + proportion_I() + proportion_R() = 1 +}\if{html}{\out{
}} + +and that, in combination therapies, for \code{only_all_tested = FALSE} applies that: + +\if{html}{\out{
}}\preformatted{ count_S() + count_I() + count_R() >= count_all() + proportion_S() + proportion_I() + proportion_R() >= 1 +}\if{html}{\out{
}} + +Using \code{only_all_tested} has no impact when only using one antibiotic as input. +} + +\section{Interpretation of SIR}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (\url{https://www.eucast.org/newsiandr}): +\itemize{ +\item \strong{S - Susceptible, standard dosing regimen}\cr +A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +\item \strong{I - Susceptible, increased exposure} \emph{\cr +A microorganism is categorised as "Susceptible, Increased exposure}" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +\item \strong{R = Resistant}\cr +A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +\itemize{ +\item \emph{Exposure} is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +} +} + +This AMR package honours this insight. Use \code{\link[=susceptibility]{susceptibility()}} (equal to \code{\link[=proportion_SI]{proportion_SI()}}) to determine antimicrobial susceptibility and \code{\link[=count_susceptible]{count_susceptible()}} (equal to \code{\link[=count_SI]{count_SI()}}) to count susceptible isolates. +} + +\examples{ +# example_isolates is a data set available in the AMR package. +# run ?example_isolates for more info. +example_isolates + + +# base R ------------------------------------------------------------ +# determines \%R +resistance(example_isolates$AMX) +sir_confidence_interval(example_isolates$AMX) +sir_confidence_interval(example_isolates$AMX, + confidence_level = 0.975 +) +sir_confidence_interval(example_isolates$AMX, + confidence_level = 0.975, + collapse = ", " +) + +# determines \%S+I: +susceptibility(example_isolates$AMX) +sir_confidence_interval(example_isolates$AMX, + ab_result = c("S", "I") +) + +# be more specific +proportion_S(example_isolates$AMX) +proportion_SI(example_isolates$AMX) +proportion_I(example_isolates$AMX) +proportion_IR(example_isolates$AMX) +proportion_R(example_isolates$AMX) + +# dplyr ------------------------------------------------------------- +\donttest{ +if (require("dplyr")) { + example_isolates \%>\% + group_by(ward) \%>\% + summarise( + r = resistance(CIP), + n = n_sir(CIP) + ) # n_sir works like n_distinct in dplyr, see ?n_sir +} +if (require("dplyr")) { + example_isolates \%>\% + group_by(ward) \%>\% + summarise( + cipro_R = resistance(CIP), + ci_min = sir_confidence_interval(CIP, side = "min"), + ci_max = sir_confidence_interval(CIP, side = "max"), + ) +} +if (require("dplyr")) { + # scoped dplyr verbs with antibiotic selectors + # (you could also use across() of course) + example_isolates \%>\% + group_by(ward) \%>\% + summarise_at( + c(aminoglycosides(), carbapenems()), + resistance + ) +} +if (require("dplyr")) { + example_isolates \%>\% + group_by(ward) \%>\% + summarise( + R = resistance(CIP, as_percent = TRUE), + SI = susceptibility(CIP, as_percent = TRUE), + n1 = count_all(CIP), # the actual total; sum of all three + n2 = n_sir(CIP), # same - analogous to n_distinct + total = n() + ) # NOT the number of tested isolates! + + # Calculate co-resistance between amoxicillin/clav acid and gentamicin, + # so we can see that combination therapy does a lot more than mono therapy: + example_isolates \%>\% susceptibility(AMC) # \%SI = 76.3\% + example_isolates \%>\% count_all(AMC) # n = 1879 + + example_isolates \%>\% susceptibility(GEN) # \%SI = 75.4\% + example_isolates \%>\% count_all(GEN) # n = 1855 + + example_isolates \%>\% susceptibility(AMC, GEN) # \%SI = 94.1\% + example_isolates \%>\% count_all(AMC, GEN) # n = 1939 + + + # See Details on how `only_all_tested` works. Example: + example_isolates \%>\% + summarise( + numerator = count_susceptible(AMC, GEN), + denominator = count_all(AMC, GEN), + proportion = susceptibility(AMC, GEN) + ) + + example_isolates \%>\% + summarise( + numerator = count_susceptible(AMC, GEN, only_all_tested = TRUE), + denominator = count_all(AMC, GEN, only_all_tested = TRUE), + proportion = susceptibility(AMC, GEN, only_all_tested = TRUE) + ) + + + example_isolates \%>\% + group_by(ward) \%>\% + summarise( + cipro_p = susceptibility(CIP, as_percent = TRUE), + cipro_n = count_all(CIP), + genta_p = susceptibility(GEN, as_percent = TRUE), + genta_n = count_all(GEN), + combination_p = susceptibility(CIP, GEN, as_percent = TRUE), + combination_n = count_all(CIP, GEN) + ) + + # Get proportions S/I/R immediately of all sir columns + example_isolates \%>\% + select(AMX, CIP) \%>\% + proportion_df(translate = FALSE) + + # It also supports grouping variables + # (use sir_df to also include the count) + example_isolates \%>\% + select(ward, AMX, CIP) \%>\% + group_by(ward) \%>\% + sir_df(translate = FALSE) +} +} +} +\seealso{ +\code{\link[=count]{count()}} to count resistant and susceptible isolates. +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/random.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/random.R +\name{random} +\alias{random} +\alias{random_mic} +\alias{random_disk} +\alias{random_sir} +\title{Random MIC Values/Disk Zones/SIR Generation} +\usage{ +random_mic(size = NULL, mo = NULL, ab = NULL, ...) + +random_disk(size = NULL, mo = NULL, ab = NULL, ...) + +random_sir(size = NULL, prob_SIR = c(0.33, 0.33, 0.33), ...) +} +\arguments{ +\item{size}{desired size of the returned vector. If used in a \link{data.frame} call or \code{dplyr} verb, will get the current (group) size if left blank.} + +\item{mo}{any \link{character} that can be coerced to a valid microorganism code with \code{\link[=as.mo]{as.mo()}}} + +\item{ab}{any \link{character} that can be coerced to a valid antimicrobial drug code with \code{\link[=as.ab]{as.ab()}}} + +\item{...}{ignored, only in place to allow future extensions} + +\item{prob_SIR}{a vector of length 3: the probabilities for "S" (1st value), "I" (2nd value) and "R" (3rd value)} +} +\value{ +class \code{mic} for \code{\link[=random_mic]{random_mic()}} (see \code{\link[=as.mic]{as.mic()}}) and class \code{disk} for \code{\link[=random_disk]{random_disk()}} (see \code{\link[=as.disk]{as.disk()}}) +} +\description{ +These functions can be used for generating random MIC values and disk diffusion diameters, for AMR data analysis practice. By providing a microorganism and antimicrobial drug, the generated results will reflect reality as much as possible. +} +\details{ +The base \R function \code{\link[=sample]{sample()}} is used for generating values. + +Generated values are based on the EUCAST 2024 guideline as implemented in the \link{clinical_breakpoints} data set. To create specific generated values per bug or drug, set the \code{mo} and/or \code{ab} argument. +} +\examples{ +random_mic(25) +random_disk(25) +random_sir(25) + +\donttest{ +# make the random generation more realistic by setting a bug and/or drug: +random_mic(25, "Klebsiella pneumoniae") # range 0.0625-64 +random_mic(25, "Klebsiella pneumoniae", "meropenem") # range 0.0625-16 +random_mic(25, "Streptococcus pneumoniae", "meropenem") # range 0.0625-4 + +random_disk(25, "Klebsiella pneumoniae") # range 8-50 +random_disk(25, "Klebsiella pneumoniae", "ampicillin") # range 11-17 +random_disk(25, "Streptococcus pneumoniae", "ampicillin") # range 12-27 +} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/resistance_predict.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/resistance_predict.R +\name{resistance_predict} +\alias{resistance_predict} +\alias{sir_predict} +\alias{plot.resistance_predict} +\alias{ggplot_sir_predict} +\alias{autoplot.resistance_predict} +\title{Predict Antimicrobial Resistance} +\usage{ +resistance_predict( + x, + col_ab, + col_date = NULL, + year_min = NULL, + year_max = NULL, + year_every = 1, + minimum = 30, + model = NULL, + I_as_S = TRUE, + preserve_measurements = TRUE, + info = interactive(), + ... +) + +sir_predict( + x, + col_ab, + col_date = NULL, + year_min = NULL, + year_max = NULL, + year_every = 1, + minimum = 30, + model = NULL, + I_as_S = TRUE, + preserve_measurements = TRUE, + info = interactive(), + ... +) + +\method{plot}{resistance_predict}(x, main = paste("Resistance Prediction of", x_name), ...) + +ggplot_sir_predict( + x, + main = paste("Resistance Prediction of", x_name), + ribbon = TRUE, + ... +) + +\method{autoplot}{resistance_predict}( + object, + main = paste("Resistance Prediction of", x_name), + ribbon = TRUE, + ... +) +} +\arguments{ +\item{x}{a \link{data.frame} containing isolates. Can be left blank for automatic determination, see \emph{Examples}.} + +\item{col_ab}{column name of \code{x} containing antimicrobial interpretations (\code{"R"}, \code{"I"} and \code{"S"})} + +\item{col_date}{column name of the date, will be used to calculate years if this column doesn't consist of years already - the default is the first column of with a date class} + +\item{year_min}{lowest year to use in the prediction model, dafaults to the lowest year in \code{col_date}} + +\item{year_max}{highest year to use in the prediction model - the default is 10 years after today} + +\item{year_every}{unit of sequence between lowest year found in the data and \code{year_max}} + +\item{minimum}{minimal amount of available isolates per year to include. Years containing less observations will be estimated by the model.} + +\item{model}{the statistical model of choice. This could be a generalised linear regression model with binomial distribution (i.e. using \code{glm(..., family = binomial)}, assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance. See \emph{Details} for all valid options.} + +\item{I_as_S}{a \link{logical} to indicate whether values \code{"I"} should be treated as \code{"S"} (will otherwise be treated as \code{"R"}). The default, \code{TRUE}, follows the redefinition by EUCAST about the interpretation of I (increased exposure) in 2019, see section \emph{Interpretation of S, I and R} below.} + +\item{preserve_measurements}{a \link{logical} to indicate whether predictions of years that are actually available in the data should be overwritten by the original data. The standard errors of those years will be \code{NA}.} + +\item{info}{a \link{logical} to indicate whether textual analysis should be printed with the name and \code{\link[=summary]{summary()}} of the statistical model.} + +\item{...}{arguments passed on to functions} + +\item{main}{title of the plot} + +\item{ribbon}{a \link{logical} to indicate whether a ribbon should be shown (default) or error bars} + +\item{object}{model data to be plotted} +} +\value{ +A \link{data.frame} with extra class \code{\link{resistance_predict}} with columns: +\itemize{ +\item \code{year} +\item \code{value}, the same as \code{estimated} when \code{preserve_measurements = FALSE}, and a combination of \code{observed} and \code{estimated} otherwise +\item \code{se_min}, the lower bound of the standard error with a minimum of \code{0} (so the standard error will never go below 0\%) +\item \code{se_max} the upper bound of the standard error with a maximum of \code{1} (so the standard error will never go above 100\%) +\item \code{observations}, the total number of available observations in that year, i.e. \eqn{S + I + R} +\item \code{observed}, the original observed resistant percentages +\item \code{estimated}, the estimated resistant percentages, calculated by the model +} + +Furthermore, the model itself is available as an attribute: \code{attributes(x)$model}, see \emph{Examples}. +} +\description{ +Create a prediction model to predict antimicrobial resistance for the next years on statistical solid ground. Standard errors (SE) will be returned as columns \code{se_min} and \code{se_max}. See \emph{Examples} for a real live example. +} +\details{ +Valid options for the statistical model (argument \code{model}) are: +\itemize{ +\item \code{"binomial"} or \code{"binom"} or \code{"logit"}: a generalised linear regression model with binomial distribution +\item \code{"loglin"} or \code{"poisson"}: a generalised log-linear regression model with poisson distribution +\item \code{"lin"} or \code{"linear"}: a linear regression model +} +} +\section{Interpretation of SIR}{ + +In 2019, the European Committee on Antimicrobial Susceptibility Testing (EUCAST) has decided to change the definitions of susceptibility testing categories S, I, and R as shown below (\url{https://www.eucast.org/newsiandr}): +\itemize{ +\item \strong{S - Susceptible, standard dosing regimen}\cr +A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent. +\item \strong{I - Susceptible, increased exposure} \emph{\cr +A microorganism is categorised as "Susceptible, Increased exposure}" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection. +\item \strong{R = Resistant}\cr +A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure. +\itemize{ +\item \emph{Exposure} is a function of how the mode of administration, dose, dosing interval, infusion time, as well as distribution and excretion of the antimicrobial agent will influence the infecting organism at the site of infection. +} +} + +This AMR package honours this insight. Use \code{\link[=susceptibility]{susceptibility()}} (equal to \code{\link[=proportion_SI]{proportion_SI()}}) to determine antimicrobial susceptibility and \code{\link[=count_susceptible]{count_susceptible()}} (equal to \code{\link[=count_SI]{count_SI()}}) to count susceptible isolates. +} + +\examples{ +x <- resistance_predict(example_isolates, + col_ab = "AMX", + year_min = 2010, + model = "binomial" +) +plot(x) +\donttest{ +if (require("ggplot2")) { + ggplot_sir_predict(x) +} + +# using dplyr: +if (require("dplyr")) { + x <- example_isolates \%>\% + filter_first_isolate() \%>\% + filter(mo_genus(mo) == "Staphylococcus") \%>\% + resistance_predict("PEN", model = "binomial") + print(plot(x)) + + # get the model from the object + mymodel <- attributes(x)$model + summary(mymodel) +} + +# create nice plots with ggplot2 yourself +if (require("dplyr") && require("ggplot2")) { + data <- example_isolates \%>\% + filter(mo == as.mo("E. coli")) \%>\% + resistance_predict( + col_ab = "AMX", + col_date = "date", + model = "binomial", + info = FALSE, + minimum = 15 + ) + head(data) + autoplot(data) +} +} +} +\seealso{ +The \code{\link[=proportion]{proportion()}} functions to calculate resistance + +Models: \code{\link[=lm]{lm()}} \code{\link[=glm]{glm()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/skewness.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/skewness.R +\name{skewness} +\alias{skewness} +\alias{skewness.default} +\alias{skewness.matrix} +\alias{skewness.data.frame} +\title{Skewness of the Sample} +\usage{ +skewness(x, na.rm = FALSE) + +\method{skewness}{default}(x, na.rm = FALSE) + +\method{skewness}{matrix}(x, na.rm = FALSE) + +\method{skewness}{data.frame}(x, na.rm = FALSE) +} +\arguments{ +\item{x}{a vector of values, a \link{matrix} or a \link{data.frame}} + +\item{na.rm}{a \link{logical} value indicating whether \code{NA} values should be stripped before the computation proceeds} +} +\description{ +Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. + +When negative ('left-skewed'): the left tail is longer; the mass of the distribution is concentrated on the right of a histogram. When positive ('right-skewed'): the right tail is longer; the mass of the distribution is concentrated on the left of a histogram. A normal distribution has a skewness of 0. +} +\examples{ +skewness(runif(1000)) +} +\seealso{ +\code{\link[=kurtosis]{kurtosis()}} +} + + + +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE man/translate.Rd + + + +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/translate.R +\name{translate} +\alias{translate} +\alias{get_AMR_locale} +\alias{set_AMR_locale} +\alias{reset_AMR_locale} +\alias{translate_AMR} +\title{Translate Strings from the AMR Package} +\usage{ +get_AMR_locale() + +set_AMR_locale(language) + +reset_AMR_locale() + +translate_AMR(x, language = get_AMR_locale()) +} +\arguments{ +\item{language}{language to choose. Use one of these supported language names or ISO-639-1 codes: English (en), Chinese (zh), Czech (cs), Danish (da), Dutch (nl), Finnish (fi), French (fr), German (de), Greek (el), Italian (it), Japanese (ja), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Spanish (es), Swedish (sv), Turkish (tr), or Ukrainian (uk).} + +\item{x}{text to translate} +} +\description{ +For language-dependent output of \code{AMR} functions, such as \code{\link[=mo_name]{mo_name()}}, \code{\link[=mo_gramstain]{mo_gramstain()}}, \code{\link[=mo_type]{mo_type()}} and \code{\link[=ab_name]{ab_name()}}. +} +\details{ +The currently 20 supported languages are English (en), Chinese (zh), Czech (cs), Danish (da), Dutch (nl), Finnish (fi), French (fr), German (de), Greek (el), Italian (it), Japanese (ja), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Russian (ru), Spanish (es), Swedish (sv), Turkish (tr), and Ukrainian (uk). All these languages have translations available for all antimicrobial drugs and colloquial microorganism names. + +To permanently silence the once-per-session language note on a non-English operating system, you can set the package option \code{\link[=AMR-options]{AMR_locale}} in your \code{.Rprofile} file like this: + +\if{html}{\out{
}}\preformatted{# Open .Rprofile file +utils::file.edit("~/.Rprofile") + +# Then add e.g. Italian support to that file using: +options(AMR_locale = "Italian") +}\if{html}{\out{
}} + +And then save the file. + +Please read about adding or updating a language in \href{https://github.com/msberends/AMR/wiki/}{our Wiki}. +\subsection{Changing the Default Language}{ + +The system language will be used at default (as returned by \code{Sys.getenv("LANG")} or, if \code{LANG} is not set, \code{\link[=Sys.getlocale]{Sys.getlocale("LC_COLLATE")}}), if that language is supported. But the language to be used can be overwritten in two ways and will be checked in this order: +\enumerate{ +\item Setting the package option \code{\link[=AMR-options]{AMR_locale}}, either by using e.g. \code{set_AMR_locale("German")} or by running e.g. \code{options(AMR_locale = "German")}. + +Note that setting an \R option only works in the same session. Save the command \code{options(AMR_locale = "(your language)")} to your \code{.Rprofile} file to apply it for every session. Run \code{utils::file.edit("~/.Rprofile")} to edit your \code{.Rprofile} file. +\item Setting the system variable \code{LANGUAGE} or \code{LANG}, e.g. by adding \code{LANGUAGE="de_DE.utf8"} to your \code{.Renviron} file in your home directory. +} + +Thus, if the package option \code{\link[=AMR-options]{AMR_locale}} is set, the system variables \code{LANGUAGE} and \code{LANG} will be ignored. +} +} +\examples{ +# Current settings (based on system language) +ab_name("Ciprofloxacin") +mo_name("Coagulase-negative Staphylococcus (CoNS)") + +# setting another language +set_AMR_locale("Dutch") +ab_name("Ciprofloxacin") +mo_name("Coagulase-negative Staphylococcus (CoNS)") + +# setting yet another language +set_AMR_locale("German") +ab_name("Ciprofloxacin") +mo_name("Coagulase-negative Staphylococcus (CoNS)") + +# set_AMR_locale() understands endonyms, English exonyms, and ISO-639-1: +set_AMR_locale("Deutsch") +set_AMR_locale("German") +set_AMR_locale("de") +ab_name("amox/clav") + +# reset to system default +reset_AMR_locale() +ab_name("amox/clav") +} + + +