diff --git a/DESCRIPTION b/DESCRIPTION index 0993a6a9..f913bc12 100755 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 0.3.0.9006 -Date: 2018-09-01 +Version: 0.3.0.9007 +Date: 2018-09-04 Title: Antimicrobial Resistance Analysis Authors@R: c( person( diff --git a/NAMESPACE b/NAMESPACE index bf01e726..80bfba22 100755 --- a/NAMESPACE +++ b/NAMESPACE @@ -94,12 +94,10 @@ export(mo_family) export(mo_fullname) export(mo_genus) export(mo_gramstain) -export(mo_gramstain_nl) export(mo_property) export(mo_species) export(mo_subspecies) export(mo_type) -export(mo_type_nl) export(n_rsi) export(p.symbol) export(portion_I) diff --git a/NEWS.md b/NEWS.md index 4c25f10c..ddc4b17a 100755 --- a/NEWS.md +++ b/NEWS.md @@ -10,7 +10,15 @@ * Column names of datasets `microorganisms` and `septic_patients` * All old syntaxes will still work with this version, but will throw warnings * Functions `as.atc` and `is.atc` to transform/look up antibiotic ATC codes as defined by the WHO. The existing function `guess_atc` is now an alias of `as.atc`. -* Aliases for existing function `mo_property`: `mo_family`, `mo_genus`, `mo_species`, `mo_subspecies`, `mo_fullname`, `mo_type`, `mo_gramstain`, `mo_aerobic`, `mo_type_nl` and `mo_gramstain_nl` +* Aliases for existing function `mo_property`: `mo_family`, `mo_genus`, `mo_species`, `mo_subspecies`, `mo_fullname`, `mo_aerobic`, `mo_type`, `mo_gramstain`. The last two functions have a `language` parameter, with support for Spanish, German and Dutch: + ```r + mo_gramstain("E. coli") + # [1] "Negative rods" + mo_gramstain("E. coli", language = "de") # "de" = Deutsch / German + # [1] "Negative Staebchen" + mo_gramstain("E. 
coli", language = "es") # "es" = Español / Spanish + # [1] "Bacilos negativos" + ``` * Function `ab_property` and its aliases: `ab_official`, `ab_tradenames`, `ab_certe`, `ab_umcg`, `ab_official_nl` and `ab_trivial_nl` * Introduction to AMR as a vignette diff --git a/R/ab_property.R b/R/ab_property.R index a29ca321..ac5a1fee 100644 --- a/R/ab_property.R +++ b/R/ab_property.R @@ -36,7 +36,7 @@ ab_property <- function(x, property = 'official') { property <- property[1] if (!property %in% colnames(antibiotics)) { - stop("invalid property: ", property, " - use a column name of `antibiotics`") + stop("invalid property: ", property, " - use a column name of the `antibiotics` data set") } if (!is.atc(x)) { x <- as.atc(x) # this will give a warning if x cannot be coerced diff --git a/R/atc.R b/R/atc.R index 7ce60f74..d27e408f 100755 --- a/R/atc.R +++ b/R/atc.R @@ -17,9 +17,9 @@ # ==================================================================== # -#' Find ATC code based on antibiotic property +#' Transform to ATC code #' -#' Use this function to determine the ATC code of one or more antibiotics. The dataset \code{\link{antibiotics}} will be searched for abbreviations, official names and trade names. +#' Use this function to determine the ATC code of one or more antibiotics. The data set \code{\link{antibiotics}} will be searched for abbreviations, official names and trade names. #' @param x character vector to determine \code{ATC} code #' @rdname as.atc #' @aliases atc diff --git a/R/data.R b/R/data.R index d667332d..d6505cb2 100755 --- a/R/data.R +++ b/R/data.R @@ -123,7 +123,7 @@ #' Data set with human pathogenic microorganisms #' #' A data set containing 2,664 (potential) human pathogenic microorganisms. MO codes can be looked up using \code{\link{guess_mo}}. 
-#' @format A \code{\link{tibble}} with 2,664 observations and 12 variables: +#' @format A \code{\link{tibble}} with 2,664 observations and 16 variables: #' \describe{ #' \item{\code{mo}}{ID of microorganism} #' \item{\code{bactsys}}{Bactsyscode of microorganism} @@ -132,11 +132,15 @@ #' \item{\code{species}}{Species name of microorganism, like \code{"coli"}} #' \item{\code{subspecies}}{Subspecies name of bio-/serovar of microorganism, like \code{"EHEC"}} #' \item{\code{fullname}}{Full name, like \code{"Escherichia coli (EHEC)"}} +#' \item{\code{aerobic}}{Logical whether bacteria is aerobic} #' \item{\code{type}}{Type of microorganism, like \code{"Bacteria"} and \code{"Fungus/yeast"}} #' \item{\code{gramstain}}{Gram of microorganism, like \code{"Negative rods"}} -#' \item{\code{aerobic}}{Logical whether bacteria is aerobic} +#' \item{\code{type_de}}{Type of microorganism in German, like \code{"Bakterien"} and \code{"Pilz/Hefe"}} +#' \item{\code{gramstain_de}}{Gram of microorganism in German, like \code{"Negative Staebchen"}} #' \item{\code{type_nl}}{Type of microorganism in Dutch, like \code{"Bacterie"} and \code{"Schimmel/gist"}} #' \item{\code{gramstain_nl}}{Gram of microorganism in Dutch, like \code{"Negatieve staven"}} +#' \item{\code{type_es}}{Type of microorganism in Spanish, like \code{"Bacteria"} and \code{"Hongo/levadura"}} +#' \item{\code{gramstain_es}}{Gram of microorganism in Spanish, like \code{"Bacilos negativos"}} #' } # source MOLIS (LIS of Certe) - \url{https://www.certe.nl} # new <- microorganisms %>% filter(genus == "Bacteroides") %>% .[1,] diff --git a/R/mo.R b/R/mo.R index 55b05097..80e0a345 100644 --- a/R/mo.R +++ b/R/mo.R @@ -19,15 +19,19 @@ #' Transform to microorganism ID #' #' Use this function to determine a valid ID based on a genus (and species). This input can be a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), or just a genus. 
You could also \code{\link{select}} a genus and species column, see Examples. -#' @param x a character vector or a dataframe with one or two columns -#' @param Becker a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1]. This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS". -#' @param Lancefield a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. Groups D and E will be ignored, since they are \emph{Enterococci}. +#' @param x a character vector or a \code{data.frame} with one or two columns +#' @param Becker a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1]. +#' +#' This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS". +#' @param Lancefield a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. 
+#' +#' This excludes \emph{Enterococci} at default (who are in group D), use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D. #' @rdname as.mo #' @aliases mo #' @keywords mo Becker becker Lancefield lancefield guess #' @details \code{guess_mo} is an alias of \code{as.mo}. #' -#' Use the \code{\link{mo_property}} functions to get properties based on the returned mo, see Examples. +#' Use the \code{\link{mo_property}} functions to get properties based on the returned code, see Examples. #' #' Some exceptions have been built in to get more logical results, based on prevalence of human pathogens. These are: #' \itemize{ @@ -39,10 +43,9 @@ #' Moreover, this function also supports ID's based on only Gram stain, when the species is not known. \cr #' For example, \code{"Gram negative rods"} and \code{"GNR"} will both return the ID of a Gram negative rod: \code{GNR}. #' @source -#' [1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \cr -#' \url{https://dx.doi.org/10.1128/CMR.00109-13} \cr -#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \cr -#' \url{https://dx.doi.org/10.1084/jem.57.4.571} +#' [1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13} +#' +#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571} #' @export #' @importFrom dplyr %>% pull left_join #' @return Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}. @@ -63,7 +66,7 @@ #' guess_mo("S. epidermidis") # will remain species: STAEPI #' guess_mo("S. epidermidis", Becker = TRUE) # will not remain species: STACNS #' -#' guess_mo("S. 
pyogenes") # will remain species: STCAGA +#' guess_mo("S. pyogenes") # will remain species: STCPYO #' guess_mo("S. pyogenes", Lancefield = TRUE) # will not remain species: STCGRA #' #' # Use mo_* functions to get a specific property based on `mo` @@ -177,10 +180,17 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) { if (tolower(x[i]) %like% 'coagulase negative' | tolower(x[i]) %like% 'cns' | tolower(x[i]) %like% 'cons') { - # coerce S. coagulase negative, also as CNS and CoNS + # coerce S. coagulase negative x[i] <- 'STACNS' next } + if (tolower(x[i]) %like% 'coagulase positive' + | tolower(x[i]) %like% 'cps' + | tolower(x[i]) %like% 'cops') { + # coerce S. coagulase positive + x[i] <- 'STACPS' + next + } # translate known trivial names to genus+species if (!is.na(x_trimmed[i])) { @@ -204,7 +214,7 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) { next } if (toupper(x_trimmed[i]) %in% c('PISP', 'PRSP', 'VISP', 'VRSP')) { - # peni R, peni I, vanco I, vanco R: S. pneumoniae + # peni I, peni R, vanco I, vanco R: S. pneumoniae x[i] <- 'STCPNE' next } @@ -327,7 +337,7 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) { } } - if (Lancefield == TRUE) { + if (Lancefield == TRUE | Lancefield == "all") { # group A x[x == "STCPYO"] <- "STCGRA" # S. pyogenes # group B @@ -338,6 +348,9 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) { "zooepidemicus", "dysgalactiae")) %>% pull(mo) x[x %in% S_groupC] <- "STCGRC" # S. agalactiae + if (Lancefield == "all") { + x[substr(x, 1, 3) == "ENC"] <- "STCGRD" # all Enterococci + } # group F x[x == "STCANG"] <- "STCGRF" # S. anginosus # group H diff --git a/R/mo_property.R b/R/mo_property.R index a0459a59..52bfeb14 100644 --- a/R/mo_property.R +++ b/R/mo_property.R @@ -18,59 +18,80 @@ #' Property of a microorganism #' -#' Use these functions to return a specific property of a microorganism from the \code{\link{microorganisms}} data set, based on their \code{mo}. 
Get such an ID with \code{\link{as.mo}}. -#' @param x a (vector of a) valid \code{\link{mo}} or any text that can be coerced to a valid microorganism code with \code{\link{as.mo}} +#' Use these functions to return a specific property of a microorganism from the \code{\link{microorganisms}} data set. All input values will be evaluated internally with \code{\link{as.mo}}. +#' @param x any (vector of) text that can be coerced to a valid microorganism code with \code{\link{as.mo}} #' @param property one of the column names of one of the \code{\link{microorganisms}} data set, like \code{"mo"}, \code{"bactsys"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"fullname"}, \code{"gramstain"} and \code{"aerobic"} +#' @inheritParams as.mo +#' @param language language of the returned text, either one of \code{"en"} (English), \code{"de"} (German), \code{"nl"} (Dutch) or \code{"es"} (Spanish) +#' @source +#' [1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13} +#' +#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571} #' @rdname mo_property #' @export #' @importFrom dplyr %>% left_join pull #' @seealso \code{\link{microorganisms}} #' @examples #' # All properties -#' mo_family("E. coli") # Enterobacteriaceae -#' mo_genus("E. coli") # Escherichia -#' mo_species("E. coli") # coli -#' mo_subspecies("E. coli") # -#' mo_fullname("E. coli") # Escherichia coli -#' mo_type("E. coli") # Bacteria -#' mo_gramstain("E. coli") # Negative rods -#' mo_aerobic("E. coli") # TRUE -#' mo_type_nl("E. coli") # Bacterie -#' mo_gramstain_nl("E. coli") # Negatieve staven +#' mo_family("E. coli") # "Enterobacteriaceae" +#' mo_genus("E. coli") # "Escherichia" +#' mo_species("E. coli") # "coli" +#' mo_subspecies("E. coli") # +#' mo_fullname("E. 
coli") # "Escherichia coli" +#' mo_type("E. coli") # "Bacteria" +#' mo_gramstain("E. coli") # "Negative rods" +#' mo_aerobic("E. coli") # TRUE +#' +#' # language support for Spanish, German and Dutch +#' mo_type("E. coli", "es") # "Bacteria" +#' mo_type("E. coli", "de") # "Bakterien" +#' mo_type("E. coli", "nl") # "Bacterie" +#' mo_gramstain("E. coli", "es") # "Bacilos negativos" +#' mo_gramstain("E. coli", "de") # "Negative Staebchen" +#' mo_gramstain("E. coli", "nl") # "Negatieve staven" #' #' #' # Abbreviations known in the field -#' mo_genus("EHEC") # Escherichia -#' mo_species("EHEC") # coli -#' mo_subspecies("EHEC") # EHEC -#' mo_fullname("EHEC") # Escherichia coli (EHEC) +#' mo_genus("MRSA") # "Staphylococcus" +#' mo_species("MRSA") # "aureus" +#' mo_gramstain("MRSA") # "Positive cocci" #' -#' mo_genus("MRSA") # Staphylococcus -#' mo_species("MRSA") # aureus -#' mo_gramstain("MRSA") # Positive cocci -#' -#' mo_genus("VISA") # Staphylococcus -#' mo_species("VISA") # aureus +#' mo_genus("VISA") # "Staphylococcus" +#' mo_species("VISA") # "aureus" #' #' #' # Known subspecies -#' mo_genus("doylei") # Campylobacter -#' mo_species("doylei") # jejuni -#' mo_fullname("doylei") # Campylobacter jejuni (doylei) +#' mo_genus("EHEC") # "Escherichia" +#' mo_species("EHEC") # "coli" +#' mo_subspecies("EHEC") # "EHEC" +#' mo_fullname("EHEC") # "Escherichia coli (EHEC)" +#' +#' mo_genus("doylei") # "Campylobacter" +#' mo_species("doylei") # "jejuni" +#' mo_fullname("doylei") # "Campylobacter jejuni (doylei)" +#' +#' mo_fullname("K. pneu rh") # "Klebsiella pneumoniae (rhinoscleromatis)" #' #' #' # Anaerobic bacteria -#' mo_genus("B. fragilis") # Bacteroides -#' mo_species("B. fragilis") # fragilis -#' mo_aerobic("B. fragilis") # FALSE -mo_property <- function(x, property = 'fullname') { - property <- property[1] +#' mo_genus("B. fragilis") # "Bacteroides" +#' mo_species("B. fragilis") # "fragilis" +#' mo_aerobic("B. 
fragilis") # FALSE +#' +#' +#' # Becker classification, see ?as.mo +#' mo_fullname("S. epidermidis") # "Staphylococcus epidermidis" +#' mo_fullname("S. epidermidis", Becker = TRUE) # "Coagulase Negative Staphylococcus (CoNS)" +#' +#' # Lancefield classification, see ?as.mo +#' mo_fullname("S. pyogenes") # "Streptococcus pyogenes" +#' mo_fullname("S. pyogenes", Lancefield = TRUE) # "Streptococcus group A" +mo_property <- function(x, property = 'fullname', Becker = FALSE, Lancefield = FALSE) { + property <- tolower(property[1]) if (!property %in% colnames(microorganisms)) { - stop("invalid property: ", property, " - use a column name of `microorganisms`") - } - if (!is.mo(x)) { - x <- as.mo(x) # this will give a warning if x cannot be coerced + stop("invalid property: ", property, " - use a column name of the `microorganisms` data set") } + x <- as.mo(x = x, Becker = Becker, Lancefield = Lancefield) # this will give a warning if x cannot be coerced suppressWarnings( data.frame(mo = x, stringsAsFactors = FALSE) %>% left_join(AMR::microorganisms, by = "mo") %>% @@ -92,32 +113,32 @@ mo_genus <- function(x) { #' @rdname mo_property #' @export -mo_species <- function(x) { - mo_property(x, "species") +mo_species <- function(x, Becker = FALSE, Lancefield = FALSE) { + mo_property(x, "species", Becker = Becker, Lancefield = Lancefield) } #' @rdname mo_property #' @export -mo_subspecies <- function(x) { - mo_property(x, "subspecies") +mo_subspecies <- function(x, Becker = FALSE, Lancefield = FALSE) { + mo_property(x, "subspecies", Becker = Becker, Lancefield = Lancefield) } #' @rdname mo_property #' @export -mo_fullname <- function(x) { - mo_property(x, "fullname") +mo_fullname <- function(x, Becker = FALSE, Lancefield = FALSE) { + mo_property(x, "fullname", Becker = Becker, Lancefield = Lancefield) } #' @rdname mo_property #' @export -mo_type <- function(x) { - mo_property(x, "type") +mo_type <- function(x, language = "en") { + mo_property(x, paste0("type", 
checklang(language))) } #' @rdname mo_property #' @export -mo_gramstain <- function(x) { - mo_property(x, "gramstain") +mo_gramstain <- function(x, language = "en") { + mo_property(x, paste0("gramstain", checklang(language))) } #' @rdname mo_property @@ -126,14 +147,15 @@ mo_aerobic <- function(x) { mo_property(x, "aerobic") } -#' @rdname mo_property -#' @export -mo_type_nl <- function(x) { - mo_property(x, "type_nl") -} - -#' @rdname mo_property -#' @export -mo_gramstain_nl <- function(x) { - mo_property(x, "gramstain_nl") +checklang <- function(language) { + language <- tolower(language[1]) + supported <- c("en", "de", "nl", "es") + if (!language %in% c(NULL, "", supported)) { + stop("invalid language: ", language, " - use one of ", paste0("'", sort(supported), "'", collapse = ", "), call. = FALSE) + } + if (language %in% c(NULL, "", "en")) { + "" + } else { + paste0("_", language) + } } diff --git a/README.md b/README.md index 223a57a9..ba748292 100755 --- a/README.md +++ b/README.md @@ -55,7 +55,7 @@ This `AMR` package basically does four important things: * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute). * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them. * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported. - * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). This enables resistance analysis of e.g. 
different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.mo` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data. + * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. Some functions can return results in Spanish, German and Dutch. These functions can be used to add new variables to your data. * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_official` and `ab_tradenames` to look up values. As the `mo_*` functions use `as.mo` internally, the `ab_*` functions use `as.atc` internally so it uses AI to guess your expected result. For example, `ab_official("Fluclox")`, `ab_official("Floxapen")` and `ab_official("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data. 3. It **analyses the data** with convenient functions that use well-known methods. 
diff --git a/data/microorganisms.rda b/data/microorganisms.rda index d939951a..1ab46d3b 100755 Binary files a/data/microorganisms.rda and b/data/microorganisms.rda differ diff --git a/man/as.atc.Rd b/man/as.atc.Rd index cf0f053a..445be02f 100644 --- a/man/as.atc.Rd +++ b/man/as.atc.Rd @@ -5,7 +5,7 @@ \alias{atc} \alias{guess_atc} \alias{is.atc} -\title{Find ATC code based on antibiotic property} +\title{Transform to ATC code} \usage{ as.atc(x) @@ -20,7 +20,7 @@ is.atc(x) Character (vector) with class \code{"atc"}. Unknown values will return \code{NA}. } \description{ -Use this function to determine the ATC code of one or more antibiotics. The dataset \code{\link{antibiotics}} will be searched for abbreviations, official names and trade names. +Use this function to determine the ATC code of one or more antibiotics. The data set \code{\link{antibiotics}} will be searched for abbreviations, official names and trade names. } \details{ Use the \code{\link{ab_property}} functions to get properties based on the returned ATC code, see Examples. diff --git a/man/as.mo.Rd b/man/as.mo.Rd index fcaa458a..8a10f8e3 100644 --- a/man/as.mo.Rd +++ b/man/as.mo.Rd @@ -7,10 +7,9 @@ \alias{guess_mo} \title{Transform to microorganism ID} \source{ -[1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \cr - \url{https://dx.doi.org/10.1128/CMR.00109-13} \cr -[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \cr - \url{https://dx.doi.org/10.1084/jem.57.4.571} +[1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13} + +[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. 
\url{https://dx.doi.org/10.1084/jem.57.4.571} } \usage{ as.mo(x, Becker = FALSE, Lancefield = FALSE) @@ -20,11 +19,15 @@ is.mo(x) guess_mo(x, Becker = FALSE, Lancefield = FALSE) } \arguments{ -\item{x}{a character vector or a dataframe with one or two columns} +\item{x}{a character vector or a \code{data.frame} with one or two columns} -\item{Becker}{a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1]. This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".} +\item{Becker}{a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1]. -\item{Lancefield}{a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. Groups D and E will be ignored, since they are \emph{Enterococci}.} + This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".} + +\item{Lancefield}{a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. 
+ + This excludes \emph{Enterococci} at default (who are in group D), use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D.} } \value{ Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}. @@ -35,7 +38,7 @@ Use this function to determine a valid ID based on a genus (and species). This i \details{ \code{guess_mo} is an alias of \code{as.mo}. -Use the \code{\link{mo_property}} functions to get properties based on the returned mo, see Examples. +Use the \code{\link{mo_property}} functions to get properties based on the returned code, see Examples. Some exceptions have been built in to get more logical results, based on prevalence of human pathogens. These are: \itemize{ @@ -63,7 +66,7 @@ as.mo("VRSA") # Vancomycin Resistant S. aureus guess_mo("S. epidermidis") # will remain species: STAEPI guess_mo("S. epidermidis", Becker = TRUE) # will not remain species: STACNS -guess_mo("S. pyogenes") # will remain species: STCAGA +guess_mo("S. pyogenes") # will remain species: STCPYO guess_mo("S. 
pyogenes", Lancefield = TRUE) # will not remain species: STCGRA # Use mo_* functions to get a specific property based on `mo` diff --git a/man/microorganisms.Rd b/man/microorganisms.Rd index 4a3bf3a8..c6a2796c 100755 --- a/man/microorganisms.Rd +++ b/man/microorganisms.Rd @@ -4,7 +4,7 @@ \name{microorganisms} \alias{microorganisms} \title{Data set with human pathogenic microorganisms} -\format{A \code{\link{tibble}} with 2,664 observations and 12 variables: +\format{A \code{\link{tibble}} with 2,664 observations and 16 variables: \describe{ \item{\code{mo}}{ID of microorganism} \item{\code{bactsys}}{Bactsyscode of microorganism} @@ -13,11 +13,15 @@ \item{\code{species}}{Species name of microorganism, like \code{"coli"}} \item{\code{subspecies}}{Subspecies name of bio-/serovar of microorganism, like \code{"EHEC"}} \item{\code{fullname}}{Full name, like \code{"Escherichia coli (EHEC)"}} + \item{\code{aerobic}}{Logical whether bacteria is aerobic} \item{\code{type}}{Type of microorganism, like \code{"Bacteria"} and \code{"Fungus/yeast"}} \item{\code{gramstain}}{Gram of microorganism, like \code{"Negative rods"}} - \item{\code{aerobic}}{Logical whether bacteria is aerobic} + \item{\code{type_de}}{Type of microorganism in German, like \code{"Bakterien"} and \code{"Pilz/Hefe"}} + \item{\code{gramstain_de}}{Gram of microorganism in German, like \code{"Negative Staebchen"}} \item{\code{type_nl}}{Type of microorganism in Dutch, like \code{"Bacterie"} and \code{"Schimmel/gist"}} \item{\code{gramstain_nl}}{Gram of microorganism in Dutch, like \code{"Negatieve staven"}} + \item{\code{type_es}}{Type of microorganism in Spanish, like \code{"Bacteria"} and \code{"Hongo/levadura"}} + \item{\code{gramstain_es}}{Gram of microorganism in Spanish, like \code{"Bacilos negativos"}} }} \usage{ microorganisms diff --git a/man/mo_property.Rd b/man/mo_property.Rd index cd8caabe..76574879 100644 --- a/man/mo_property.Rd +++ b/man/mo_property.Rd @@ -10,78 +10,105 @@ \alias{mo_type} 
\alias{mo_gramstain} \alias{mo_aerobic} -\alias{mo_type_nl} -\alias{mo_gramstain_nl} \title{Property of a microorganism} +\source{ +[1] Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870–926. \url{https://dx.doi.org/10.1128/CMR.00109-13} + +[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571} +} \usage{ -mo_property(x, property = "fullname") +mo_property(x, property = "fullname", Becker = FALSE, + Lancefield = FALSE) mo_family(x) mo_genus(x) -mo_species(x) +mo_species(x, Becker = FALSE, Lancefield = FALSE) -mo_subspecies(x) +mo_subspecies(x, Becker = FALSE, Lancefield = FALSE) -mo_fullname(x) +mo_fullname(x, Becker = FALSE, Lancefield = FALSE) -mo_type(x) +mo_type(x, language = "en") -mo_gramstain(x) +mo_gramstain(x, language = "en") mo_aerobic(x) - -mo_type_nl(x) - -mo_gramstain_nl(x) } \arguments{ -\item{x}{a (vector of a) valid \code{\link{mo}} or any text that can be coerced to a valid microorganism code with \code{\link{as.mo}}} +\item{x}{any (vector of) text that can be coerced to a valid microorganism code with \code{\link{as.mo}}} \item{property}{one of the column names of one of the \code{\link{microorganisms}} data set, like \code{"mo"}, \code{"bactsys"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"fullname"}, \code{"gramstain"} and \code{"aerobic"}} + +\item{Becker}{a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1]. + + This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. 
aureus} as "CoPS".} + +\item{Lancefield}{a logical to indicate whether beta-haemolytic \emph{Streptococci} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield [2]. These \emph{Streptococci} will be categorised in their first group, i.e. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L. + + This excludes \emph{Enterococci} at default (who are in group D), use \code{Lancefield = "all"} to also categorise all \emph{Enterococci} as group D.} + +\item{language}{language of the returned text, either one of \code{"en"} (English), \code{"de"} (German), \code{"nl"} (Dutch) or \code{"es"} (Spanish)} } \description{ -Use these functions to return a specific property of a microorganism from the \code{\link{microorganisms}} data set, based on their \code{mo}. Get such an ID with \code{\link{as.mo}}. +Use these functions to return a specific property of a microorganism from the \code{\link{microorganisms}} data set. All input values will be evaluated internally with \code{\link{as.mo}}. } \examples{ # All properties -mo_family("E. coli") # Enterobacteriaceae -mo_genus("E. coli") # Escherichia -mo_species("E. coli") # coli -mo_subspecies("E. coli") # -mo_fullname("E. coli") # Escherichia coli -mo_type("E. coli") # Bacteria -mo_gramstain("E. coli") # Negative rods -mo_aerobic("E. coli") # TRUE -mo_type_nl("E. coli") # Bacterie -mo_gramstain_nl("E. coli") # Negatieve staven +mo_family("E. coli") # "Enterobacteriaceae" +mo_genus("E. coli") # "Escherichia" +mo_species("E. coli") # "coli" +mo_subspecies("E. coli") # +mo_fullname("E. coli") # "Escherichia coli" +mo_type("E. coli") # "Bacteria" +mo_gramstain("E. coli") # "Negative rods" +mo_aerobic("E. coli") # TRUE + +# language support for Spanish, German and Dutch +mo_type("E. coli", "es") # "Bacteria" +mo_type("E. coli", "de") # "Bakterien" +mo_type("E. coli", "nl") # "Bacterie" +mo_gramstain("E. 
coli", "es") # "Bacilos negativos" +mo_gramstain("E. coli", "de") # "Negative Staebchen" +mo_gramstain("E. coli", "nl") # "Negatieve staven" # Abbreviations known in the field -mo_genus("EHEC") # Escherichia -mo_species("EHEC") # coli -mo_subspecies("EHEC") # EHEC -mo_fullname("EHEC") # Escherichia coli (EHEC) +mo_genus("MRSA") # "Staphylococcus" +mo_species("MRSA") # "aureus" +mo_gramstain("MRSA") # "Positive cocci" -mo_genus("MRSA") # Staphylococcus -mo_species("MRSA") # aureus -mo_gramstain("MRSA") # Positive cocci - -mo_genus("VISA") # Staphylococcus -mo_species("VISA") # aureus +mo_genus("VISA") # "Staphylococcus" +mo_species("VISA") # "aureus" # Known subspecies -mo_genus("doylei") # Campylobacter -mo_species("doylei") # jejuni -mo_fullname("doylei") # Campylobacter jejuni (doylei) +mo_genus("EHEC") # "Escherichia" +mo_species("EHEC") # "coli" +mo_subspecies("EHEC") # "EHEC" +mo_fullname("EHEC") # "Escherichia coli (EHEC)" + +mo_genus("doylei") # "Campylobacter" +mo_species("doylei") # "jejuni" +mo_fullname("doylei") # "Campylobacter jejuni (doylei)" + +mo_fullname("K. pneu rh") # "Klebsiella pneumoniae (rhinoscleromatis)" # Anaerobic bacteria -mo_genus("B. fragilis") # Bacteroides -mo_species("B. fragilis") # fragilis -mo_aerobic("B. fragilis") # FALSE +mo_genus("B. fragilis") # "Bacteroides" +mo_species("B. fragilis") # "fragilis" +mo_aerobic("B. fragilis") # FALSE + + +# Becker classification, see ?as.mo +mo_fullname("S. epidermidis") # "Staphylococcus epidermidis" +mo_fullname("S. epidermidis", Becker = TRUE) # "Coagulase Negative Staphylococcus (CoNS)" + +# Lancefield classification, see ?as.mo +mo_fullname("S. pyogenes") # "Streptococcus pyogenes" +mo_fullname("S. 
pyogenes", Lancefield = TRUE) # "Streptococcus group A" } \seealso{ \code{\link{microorganisms}} diff --git a/tests/testthat/test-ab_property.R b/tests/testthat/test-ab_property.R index 9088cfbf..fb32a170 100644 --- a/tests/testthat/test-ab_property.R +++ b/tests/testthat/test-ab_property.R @@ -8,4 +8,7 @@ test_that("ab_property works", { expect_equal(ab_umcg("amox"), "AMOX") expect_equal(class(ab_tradenames("amox")), "character") expect_equal(class(ab_tradenames(c("amox", "amox"))), "list") + expect_equal(ab_atc("amox"), as.character(as.atc("amox"))) + + expect_error(ab_property("amox", "invalid property")) }) diff --git a/tests/testthat/test-deprecated.R b/tests/testthat/test-deprecated.R index 353242bd..12df2550 100644 --- a/tests/testthat/test-deprecated.R +++ b/tests/testthat/test-deprecated.R @@ -16,9 +16,11 @@ test_that("deprecated functions work", { old_mo <- "ESCCOL" class(old_mo) <- "bactid" + df_oldmo <- data.frame(test = old_mo) # print expect_output(print(old_mo)) - # test data.frame and pull - expect_equal(as.character(dplyr::pull(data.frame(test = old_mo), test)), "ESCCOL") + # test pull + library(dplyr) + expect_identical(df_oldmo %>% pull(test), old_mo) }) diff --git a/tests/testthat/test-mo.R b/tests/testthat/test-mo.R index 36185168..afbe1237 100644 --- a/tests/testthat/test-mo.R +++ b/tests/testthat/test-mo.R @@ -12,7 +12,7 @@ test_that("as.mo works", { expect_equal(as.character(as.mo("klpn")), "KLEPNE") expect_equal(as.character(as.mo("Klebsiella")), "KLE") expect_equal(as.character(as.mo("K. pneu rhino")), "KLEPNERH") # K. pneumoniae subspp. rhinoscleromatis - expect_equal(as.character(as.mo("coagulase negative")), "STACNS") + expect_equal(as.character(as.mo("Bartonella")), "BAR") expect_equal(as.character(as.mo("P. 
aer")), "PSEAER") # not Pasteurella aerogenes @@ -30,16 +30,21 @@ test_that("as.mo works", { expect_equal(as.character(as.mo("VISP")), "STCPNE") expect_equal(as.character(as.mo("VRSP")), "STCPNE") + expect_equal(as.character(as.mo("CNS")), "STACNS") + expect_equal(as.character(as.mo("CoNS")), "STACNS") + expect_equal(as.character(as.mo("CPS")), "STACPS") + expect_equal(as.character(as.mo("CoPS")), "STACPS") + expect_identical( as.character( as.mo(c("stau", - "STAU", - "staaur", - "S. aureus", - "S aureus", - "Staphylococcus aureus", - "MRSA", - "VISA"))), + "STAU", + "staaur", + "S. aureus", + "S aureus", + "Staphylococcus aureus", + "MRSA", + "VISA"))), rep("STAAUR", 8)) # check for Becker classification @@ -55,19 +60,23 @@ test_that("as.mo works", { expect_identical(as.character(guess_mo("STAAUR", Becker = "all")), "STACPS") # check for Lancefield classification - expect_identical(as.character(guess_mo("S. pyogenes", Lancefield = FALSE)), "STCPYO") - expect_identical(as.character(guess_mo("S. pyogenes", Lancefield = TRUE)), "STCGRA") - expect_identical(as.character(guess_mo("STCPYO", Lancefield = TRUE)), "STCGRA") - expect_identical(as.character(guess_mo("S. agalactiae", Lancefield = FALSE)), "STCAGA") - expect_identical(as.character(guess_mo("S. agalactiae", Lancefield = TRUE)), "STCGRB") # group B - expect_identical(as.character(guess_mo("S. equisimilis", Lancefield = FALSE)), "STCEQS") - expect_identical(as.character(guess_mo("S. equisimilis", Lancefield = TRUE)), "STCGRC") # group C - expect_identical(as.character(guess_mo("S. anginosus", Lancefield = FALSE)), "STCANG") - expect_identical(as.character(guess_mo("S. anginosus", Lancefield = TRUE)), "STCGRF") # group F - expect_identical(as.character(guess_mo("S. sanguis", Lancefield = FALSE)), "STCSAN") - expect_identical(as.character(guess_mo("S. sanguis", Lancefield = TRUE)), "STCGRH") # group H - expect_identical(as.character(guess_mo("S. 
salivarius", Lancefield = FALSE)), "STCSAL") - expect_identical(as.character(guess_mo("S. salivarius", Lancefield = TRUE)), "STCGRK") # group K + expect_identical(as.character(guess_mo("S. pyogenes", Lancefield = FALSE)), "STCPYO") + expect_identical(as.character(guess_mo("S. pyogenes", Lancefield = TRUE)), "STCGRA") + expect_identical(as.character(guess_mo("STCPYO", Lancefield = TRUE)), "STCGRA") # group A + expect_identical(as.character(guess_mo("S. agalactiae", Lancefield = FALSE)), "STCAGA") + expect_identical(as.character(guess_mo("S. agalactiae", Lancefield = TRUE)), "STCGRB") # group B + expect_identical(as.character(guess_mo("S. equisimilis", Lancefield = FALSE)), "STCEQS") + expect_identical(as.character(guess_mo("S. equisimilis", Lancefield = TRUE)), "STCGRC") # group C + # Enterococci must only be influenced if Lancefield = "all" + expect_identical(as.character(guess_mo("E. faecium", Lancefield = FALSE)), "ENCFAC") + expect_identical(as.character(guess_mo("E. faecium", Lancefield = TRUE)), "ENCFAC") + expect_identical(as.character(guess_mo("E. faecium", Lancefield = "all")), "STCGRD") # group D + expect_identical(as.character(guess_mo("S. anginosus", Lancefield = FALSE)), "STCANG") + expect_identical(as.character(guess_mo("S. anginosus", Lancefield = TRUE)), "STCGRF") # group F + expect_identical(as.character(guess_mo("S. sanguis", Lancefield = FALSE)), "STCSAN") + expect_identical(as.character(guess_mo("S. sanguis", Lancefield = TRUE)), "STCGRH") # group H + expect_identical(as.character(guess_mo("S. salivarius", Lancefield = FALSE)), "STCSAL") + expect_identical(as.character(guess_mo("S. salivarius", Lancefield = TRUE)), "STCGRK") # group K library(dplyr) diff --git a/tests/testthat/test-mo_property.R b/tests/testthat/test-mo_property.R index b78b6a2f..782f6aef 100644 --- a/tests/testthat/test-mo_property.R +++ b/tests/testthat/test-mo_property.R @@ -9,6 +9,12 @@ test_that("mo_property works", { expect_equal(mo_type("E. 
coli"), "Bacteria") expect_equal(mo_gramstain("E. coli"), "Negative rods") expect_equal(mo_aerobic("E. coli"), TRUE) - expect_equal(mo_type_nl("E. coli"), "Bacterie") - expect_equal(mo_gramstain_nl("E. coli"), "Negatieve staven") + + expect_equal(mo_type("E. coli", language = "de"), "Bakterien") + expect_equal(mo_gramstain("E. coli", language = "de"), "Negative Staebchen") + + expect_equal(mo_type("E. coli", language = "nl"), "Bacterie") + expect_equal(mo_gramstain("E. coli", language = "nl"), "Negatieve staven") + + expect_error(mo_type("E. coli", language = "INVALID")) }) diff --git a/vignettes/.gitignore b/vignettes/.gitignore index 5a5283f9..efb49e92 100755 --- a/vignettes/.gitignore +++ b/vignettes/.gitignore @@ -1,4 +1,5 @@ figure *.html *.md +*.R rsconnect diff --git a/vignettes/AMR.Rmd b/vignettes/AMR.Rmd index cf1e2b96..c785a189 100755 --- a/vignettes/AMR.Rmd +++ b/vignettes/AMR.Rmd @@ -34,7 +34,7 @@ This `AMR` package basically does four important things: * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute). * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them. * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported. - * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 2,650 microorganisms (2,207 bacteria, 285 fungi/yeasts, 153 parasites, 1 other). This enables resistance analysis of e.g. different antibiotics per Gram stain. 
The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.mo` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data. + * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. Some functions can return results in Spanish, German and Dutch. These functions can be used to add new variables to your data. * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_official` and `ab_tradenames` to look up values. As the `mo_*` functions use `as.mo` internally, the `ab_*` functions use `as.atc` internally so it uses AI to guess your expected result. For example, `ab_official("Fluclox")`, `ab_official("Floxapen")` and `ab_official("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data. 3. It **analyses the data** with convenient functions that use well-known methods. 
@@ -52,7 +52,6 @@ This `AMR` package basically does four important things: * Results of 40 antibiotics (each antibiotic in its own column) with a total of 38,414 antimicrobial results * Real and genuine data - ---- ```{r, echo = FALSE} # this will print "2018" in 2018, and "2018-yyyy" after 2018.