diff --git a/DESCRIPTION b/DESCRIPTION
index 25a95240..0993a6a9 100755
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: AMR
 Version: 0.3.0.9006
-Date: 2018-08-31
+Date: 2018-09-01
 Title: Antimicrobial Resistance Analysis
 Authors@R: c(
     person(
diff --git a/NEWS.md b/NEWS.md
index f8ae0d5c..3f43e475 100755
--- a/NEWS.md
+++ b/NEWS.md
@@ -15,7 +15,7 @@
 * Introduction to AMR as a vignette
 
 #### Changed
-* Added 182 microorganisms to the `microorganisms` data set, now *n* = 2,646 (2,207 bacteria, 285 fungi/yeasts, 153 parasites, 1 other)
+* Added 226 microorganisms to the `microorganisms` data set and removed the few viruses it contained, now *n* = 2,664 (2,225 bacteria, 285 fungi/yeasts, 153 parasites, 1 other)
 * Added three antimicrobial agents to the `antibiotics` data set: Terbinafine (D01BA02), Rifaximin (A07AA11) and Isoconazole (D01AC05)
 * Added 163 trade names to the `antibiotics` data set, it now contains 298 different trade names in total, e.g.:
   ```r
@@ -28,7 +28,7 @@
   ```
 * Function `ratio` is now deprecated and will be removed in a future release, as it is not really the scope of this package
 * Fix for `as.mic` for values ending in zeroes after a real number
-* Huge speed improvement for `as.bactid` (now `as.mo`)
+* Tremendous speed improvement for `as.bactid` (now `as.mo`)
 * Added parameters `minimum` and `as_percent` to `portion_df`
 * Support for quasiquotation in the functions series `count_*` and `portions_*`, and `n_rsi`. This allows to check for more than 2 vectors or columns.
   ```r
diff --git a/R/atc.R b/R/atc.R
index dd1a41a9..7ce60f74 100755
--- a/R/atc.R
+++ b/R/atc.R
@@ -31,7 +31,7 @@
 #' In the ATC classification system, the active substances are classified in a hierarchy with five different levels. The system has fourteen main anatomical/pharmacological groups or 1st levels. Each ATC main group is divided into 2nd levels which could be either pharmacological or therapeutic groups. The 3rd and 4th levels are chemical, pharmacological or therapeutic subgroups and the 5th level is the chemical substance. The 2nd, 3rd and 4th levels are often used to identify pharmacological subgroups when that is considered more appropriate than therapeutic or chemical subgroups.
 #'   Source: \url{https://www.whocc.no/atc/structure_and_principles/}
 #' @return Character (vector) with class \code{"act"}. Unknown values will return \code{NA}.
-#' @seealso \code{\link{antibiotics}} for the dataframe that is being used to determine ATC's.
+#' @seealso \code{\link{antibiotics}} for the dataframe that is being used to determine ATCs.
 #' @examples
 #' # These examples all return "J01FA01", the ATC code of Erythromycin:
 #' as.atc("J01FA01")
diff --git a/R/data.R b/R/data.R
index 8a480a0f..d667332d 100755
--- a/R/data.R
+++ b/R/data.R
@@ -16,10 +16,10 @@
 # GNU General Public License for more details.                         #
 # ==================================================================== #
 
-#' Dataset with 423 antibiotics
+#' Data set with 423 antibiotics
 #'
-#' A dataset containing all antibiotics with a J0 code and some other antimicrobial agents, with their DDD's. Except for trade names and abbreviations, all properties were downloaded from the WHO, see Source.
-#' @format A data.frame with 423 observations and 18 variables:
+#' A data set containing all antibiotics with a J0 code and some other antimicrobial agents, with their DDDs. Except for trade names and abbreviations, all properties were downloaded from the WHO, see Source.
+#' @format A \code{\link{tibble}} with 423 observations and 18 variables:
 #' \describe{
 #'   \item{\code{atc}}{ATC code, like \code{J01CR02}}
 #'   \item{\code{certe}}{Certe code, like \code{amcl}}
@@ -120,10 +120,10 @@
 #
 "antibiotics"
 
-#' Dataset with ~2650 microorganisms
+#' Data set with human pathogenic microorganisms
 #'
-#' A dataset containing 2,646 microorganisms. MO codes of the UMCG can be looked up using \code{\link{microorganisms.umcg}}.
-#' @format A data.frame with 2,646 observations and 12 variables:
+#' A data set containing 2,664 (potential) human pathogenic microorganisms. MO codes can be looked up using \code{\link{guess_mo}}.
+#' @format A \code{\link{tibble}} with 2,664 observations and 12 variables:
 #' \describe{
 #'   \item{\code{mo}}{ID of microorganism}
 #'   \item{\code{bactsys}}{Bactsyscode of microorganism}
@@ -151,10 +151,10 @@
 #' @seealso \code{\link{guess_mo}} \code{\link{antibiotics}} \code{\link{microorganisms.umcg}}
 "microorganisms"
 
-#' Translation table for UMCG with ~1100 microorganisms
+#' Translation table for UMCG with ~1,100 microorganisms
 #'
-#' A dataset containing all bacteria codes of UMCG MMB. These codes can be joined to data with an ID from \code{\link{microorganisms}$mo} (using \code{\link{left_join_microorganisms}}). GLIMS codes can also be translated to valid \code{mo}'s with \code{\link{guess_mo}}.
-#' @format A data.frame with 1090 observations and 2 variables:
+#' A data set containing all bacteria codes of UMCG MMB. These codes can be joined to data with an ID from \code{\link{microorganisms}$mo} (using \code{\link{left_join_microorganisms}}). GLIMS codes can also be translated to valid \code{MO}s with \code{\link{guess_mo}}.
+#' @format A \code{\link{tibble}} with 1,090 observations and 2 variables:
 #' \describe{
 #'   \item{\code{umcg}}{Code of microorganism according to UMCG MMB}
 #'   \item{\code{mo}}{Code of microorganism in \code{\link{microorganisms}}}
@@ -163,10 +163,10 @@
 #' @seealso \code{\link{guess_mo}} \code{\link{microorganisms}}
 "microorganisms.umcg"
 
-#' Dataset with 2000 blood culture isolates of septic patients
+#' Data set with 2000 blood culture isolates of septic patients
 #'
-#' An anonymised dataset containing 2000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. It is true, genuine data. This \code{data.frame} can be used to practice AMR analysis. For examples, press F1.
-#' @format A data.frame with 2000 observations and 49 variables:
+#' An anonymised data set containing 2,000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. It is true, genuine data. This \code{data.frame} can be used to practice AMR analysis. For examples, press F1.
+#' @format A \code{\link{tibble}} with 2,000 observations and 49 variables:
 #' \describe{
 #'   \item{\code{date}}{date of receipt at the laboratory}
 #'   \item{\code{hospital_id}}{ID of the hospital, from A to D}
@@ -185,13 +185,13 @@
 #' # PREPARATION #
 #' # ----------- #
 #'
-#' # Save this example dataset to an object, so we can edit it:
+#' # Save this example data set to an object, so we can edit it:
 #' my_data <- septic_patients
 #'
 #' # load the dplyr package to make data science A LOT easier
 #' library(dplyr)
 #'
-#' # Add first isolates to our dataset:
+#' # Add first isolates to our data set:
 #' my_data <- my_data %>%
 #'   mutate(first_isolates = first_isolate(my_data, "date", "patient_id", "mo"))
 #'
diff --git a/R/eucast.R b/R/eucast.R
index c23ca0fe..9d794f40 100755
--- a/R/eucast.R
+++ b/R/eucast.R
@@ -280,8 +280,10 @@ EUCAST_rules <- function(tbl,
   }
 
   # join to microorganisms data set
+  col_mo_original <- NULL
   if (!tbl %>% pull(col_mo) %>% is.mo()) {
-    warning("Improve integrity of the `", col_mo, "` column by transforming it with 'as.mo'.")
+    col_mo_original <- tbl %>% pull(col_mo)
+    tbl[, col_mo] <- as.mo(tbl[, col_mo])
   }
   tbl <- tbl %>% left_join_microorganisms(by = col_mo, suffix = c("_tempmicroorganisms", ""))
 
@@ -685,6 +687,10 @@ EUCAST_rules <- function(tbl,
   tbl <- tbl %>% select(-c((tbl.ncol - microorganisms.ncol):tbl.ncol))
   # and remove added suffices
   colnames(tbl) <- gsub("_tempmicroorganisms", "", colnames(tbl))
+  # restore old col_mo values if needed
+  if (!is.null(col_mo_original)) {
+    tbl[, col_mo] <- col_mo_original
+  }
 
   if (info == TRUE) {
     cat('Done.\n\nEUCAST Expert rules applied to',
diff --git a/R/first_isolate.R b/R/first_isolate.R
index a1a65e9a..ac25ff9f 100755
--- a/R/first_isolate.R
+++ b/R/first_isolate.R
@@ -178,7 +178,7 @@ first_isolate <- function(tbl,
 
   if (!is.na(col_mo)) {
     if (!tbl %>% pull(col_mo) %>% is.mo()) {
-      warning("Improve integrity of the `", col_mo, "` column by transforming it with 'as.mo'.")
+      tbl[, col_mo] <- as.mo(tbl[, col_mo])
     }
     # join to microorganisms data set
     tbl <- tbl %>% left_join_microorganisms(by = col_mo)
@@ -311,7 +311,7 @@ first_isolate <- function(tbl,
     if (info == TRUE) {
       message('No isolates found.')
     }
-    # NA's where genus is unavailable
+    # NAs where genus is unavailable
     tbl <- tbl %>%
       mutate(real_first_isolate = if_else(genus == '', NA, FALSE))
     if (output_logical == FALSE) {
@@ -406,7 +406,7 @@ first_isolate <- function(tbl,
     all_first[which(all_first[, col_icu] == TRUE), 'real_first_isolate'] <- FALSE
   }
 
-  # NA's where genus is unavailable
+  # NAs where genus is unavailable
   all_first <- all_first %>%
     mutate(real_first_isolate = if_else(genus %in% c('', '(no MO)', NA), NA, real_first_isolate))
 
diff --git a/R/globals.R b/R/globals.R
index aa0ffdb4..268258d2 100755
--- a/R/globals.R
+++ b/R/globals.R
@@ -16,61 +16,41 @@
 # GNU General Public License for more details.                         #
 # ==================================================================== #
 
-globalVariables(c('abname',
-                  'Antibiotic',
-                  'Interpretation',
-                  'Percentage',
-                  'bind_rows',
-                  'element_blank',
-                  'element_line',
-                  'theme',
-                  'theme_minimal',
-                  'antibiotic',
-                  'antibiotics',
-                  'atc',
-                  'bactid',
-                  'C_chisq_sim',
-                  'certe',
-                  'cnt',
-                  'count',
-                  'Count',
-                  'counts',
-                  'cum_count',
-                  'cum_percent',
-                  'date_lab',
-                  'days_diff',
-                  'fctlvl',
-                  'first_isolate_row_index',
-                  'Freq',
-                  'fullname',
-                  'genus',
-                  'gramstain',
-                  'item',
-                  'key_ab',
-                  'key_ab_lag',
-                  'key_ab_other',
-                  'labs',
-                  'median',
-                  'mic',
-                  'MIC',
-                  'microorganisms',
-                  'mocode',
-                  'n',
-                  'na.omit',
-                  'observations',
-                  'official',
-                  'other_pat_or_mo',
-                  'Pasted',
-                  'patient_id',
-                  'quantile',
-                  'R',
-                  'real_first_isolate',
-                  'S',
-                  'septic_patients',
-                  'species',
-                  'umcg',
-                  'value',
-                  'values',
-                  'View',
-                  'y',
-                  '.'))
+globalVariables(c(".",
+                  "antibiotic",
+                  "Antibiotic",
+                  "antibiotics",
+                  "cnt",
+                  "count",
+                  "Count",
+                  "cum_count",
+                  "cum_percent",
+                  "date_lab",
+                  "days_diff",
+                  "fctlvl",
+                  "first_isolate_row_index",
+                  "Freq",
+                  "genus",
+                  "gramstain",
+                  "Interpretation",
+                  "item",
+                  "key_ab",
+                  "key_ab_lag",
+                  "key_ab_other",
+                  "median",
+                  "mic",
+                  "microorganisms",
+                  "mo",
+                  "n",
+                  "observations",
+                  "other_pat_or_mo",
+                  "Pasted",
+                  "patient_id",
+                  "Percentage",
+                  "R",
+                  "real_first_isolate",
+                  "S",
+                  "septic_patients",
+                  "species",
+                  "value",
+                  "y"))
diff --git a/R/key_antibiotics.R b/R/key_antibiotics.R
index 8db9b8a9..818e7386 100644
--- a/R/key_antibiotics.R
+++ b/R/key_antibiotics.R
@@ -140,6 +140,9 @@ key_antibiotics <- function(tbl,
                      GramNeg_4, GramNeg_5, GramNeg_6)
   gram_negative <- gram_negative[!is.na(gram_negative)]
 
+  if (!tbl %>% pull(col_mo) %>% is.mo()) {
+    tbl[, col_mo] <- as.mo(tbl[, col_mo])
+  }
   # join microorganisms
   tbl <- tbl %>% left_join_microorganisms(col_mo)
 
diff --git a/R/mo.R b/R/mo.R
index 9b1990d7..55b05097 100644
--- a/R/mo.R
+++ b/R/mo.R
@@ -91,7 +91,6 @@
 #' }
 as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
-
   if (NCOL(x) == 2) {
     # support tidyverse selection like: df %>% select(colA, colB)
     # paste these columns together
@@ -131,74 +130,11 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
   x_species <- paste(x, 'species')
   # add start en stop regex
   x <- paste0('^', x, '$')
+  x_withspaces_all <- x_withspaces
   x_withspaces <- paste0('^', x_withspaces, '$')
 
   for (i in 1:length(x)) {
 
-    if (Becker == TRUE | Becker == "all") {
-      mo <- suppressWarnings(guess_mo(x_backup[i]))
-      if (mo %like% '^STA') {
-        # See Source. It's this figure:
-        # https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4187637/figure/F3/
-        species <- left_join_microorganisms(mo)$species
-        if (species %in% c("arlettae", "auricularis", "capitis",
-                           "caprae", "carnosus", "cohnii", "condimenti",
-                           "devriesei", "epidermidis", "equorum",
-                           "fleurettii", "gallinarum", "haemolyticus",
-                           "hominis", "jettensis", "kloosii", "lentus",
-                           "lugdunensis", "massiliensis", "microti",
-                           "muscae", "nepalensis", "pasteuri", "petrasii",
-                           "pettenkoferi", "piscifermentans", "rostri",
-                           "saccharolyticus", "saprophyticus", "sciuri",
-                           "stepanovicii", "simulans", "succinus",
-                           "vitulinus", "warneri", "xylosus")) {
-          x[i] <- "STACNS"
-          next
-        } else if ((Becker == "all" & species == "aureus")
-                   | species %in% c("simiae", "agnetis", "chromogenes",
-                                    "delphini", "felis", "lutrae",
-                                    "hyicus", "intermedius",
-                                    "pseudintermedius", "pseudointermedius",
-                                    "schleiferi")) {
-          x[i] <- "STACPS"
-          next
-        }
-      }
-    }
-
-    if (Lancefield == TRUE) {
-      mo <- suppressWarnings(guess_mo(x_backup[i]))
-      if (mo %like% '^STC') {
-        # See Source
-        species <- left_join_microorganisms(mo)$species
-        if (species == "pyogenes") {
-          x[i] <- "STCGRA"
-          next
-        }
-        if (species == "agalactiae") {
-          x[i] <- "STCGRB"
-          next
-        }
-        if (species %in% c("equisimilis", "equi",
-                           "zooepidemicus", "dysgalactiae")) {
-          x[i] <- "STCGRC"
-          next
-        }
-        if (species == "anginosus") {
-          x[i] <- "STCGRF"
-          next
-        }
-        if (species == "sanguis") {
-          x[i] <- "STCGRH"
-          next
-        }
-        if (species == "salivarius") {
-          x[i] <- "STCGRK"
-          next
-        }
-      }
-    }
-
     if (identical(x_trimmed[i], "")) {
       # empty values
       x[i] <- NA
@@ -206,12 +142,12 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
       next
     }
     if (x_backup[i] %in% AMR::microorganisms$mo) {
-      # is already a valid mo
+      # is already a valid MO code
       x[i] <- x_backup[i]
       next
     }
     if (x_trimmed[i] %in% AMR::microorganisms$mo) {
-      # is already a valid mo
+      # is already a valid MO code
       x[i] <- x_trimmed[i]
       next
     }
@@ -303,6 +239,13 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
       next
     }
 
+    # try fullname without start and stop regex, to also find subspecies, like "K. pneu rhino"
+    found <- MOs[which(gsub("[\\(\\)]", "", MOs$fullname) %like% x_withspaces_all[i]),]$mo
+    if (length(found) > 0) {
+      x[i] <- found[1L]
+      next
+    }
+
     # search for GLIMS code
     found <- AMR::microorganisms.umcg[which(toupper(AMR::microorganisms.umcg$umcg) == toupper(x_trimmed[i])),]$mo
     if (length(found) > 0) {
@@ -352,6 +295,57 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE) {
             call. = FALSE)
   }
 
+  if (Becker == TRUE | Becker == "all") {
+    # See Source. It's this figure:
+    # https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4187637/figure/F3/
+    CoNS <- MOs %>%
+      filter(genus == "Staphylococcus",
+             species %in% c("arlettae", "auricularis", "capitis",
+                            "caprae", "carnosus", "cohnii", "condimenti",
+                            "devriesei", "epidermidis", "equorum",
+                            "fleurettii", "gallinarum", "haemolyticus",
+                            "hominis", "jettensis", "kloosii", "lentus",
+                            "lugdunensis", "massiliensis", "microti",
+                            "muscae", "nepalensis", "pasteuri", "petrasii",
+                            "pettenkoferi", "piscifermentans", "rostri",
+                            "saccharolyticus", "saprophyticus", "sciuri",
+                            "stepanovicii", "simulans", "succinus",
+                            "vitulinus", "warneri", "xylosus")) %>%
+      pull(mo)
+    CoPS <- MOs %>%
+      filter(genus == "Staphylococcus",
+             species %in% c("simiae", "agnetis", "chromogenes",
+                            "delphini", "felis", "lutrae",
+                            "hyicus", "intermedius",
+                            "pseudintermedius", "pseudointermedius",
+                            "schleiferi")) %>%
+      pull(mo)
+    x[x %in% CoNS] <- "STACNS"
+    x[x %in% CoPS] <- "STACPS"
+    if (Becker == "all") {
+      x[x == "STAAUR"] <- "STACPS"
+    }
+  }
+
+  if (Lancefield == TRUE) {
+    # group A
+    x[x == "STCPYO"] <- "STCGRA" # S. pyogenes
+    # group B
+    x[x == "STCAGA"] <- "STCGRB" # S. agalactiae
+    # group C
+    S_groupC <- MOs %>% filter(genus == "Streptococcus",
+                               species %in% c("equisimilis", "equi",
+                                              "zooepidemicus", "dysgalactiae")) %>%
+      pull(mo)
+    x[x %in% S_groupC] <- "STCGRC" # S. agalactiae
+    # group F
+    x[x == "STCANG"] <- "STCGRF" # S. anginosus
+    # group H
+    x[x == "STCSAN"] <- "STCGRH" # S. sanguis
+    # group K
+    x[x == "STCSAL"] <- "STCGRK" # S. salivarius
+  }
+
   # left join the found results to the original input values (x_input)
   df_found <- data.frame(input = as.character(unique(x_input)),
                          found = x,
diff --git a/README.md b/README.md
index 882a0fbc..223a57a9 100755
--- a/README.md
+++ b/README.md
@@ -55,7 +55,7 @@ This `AMR` package basically does four important things:
    * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute).
      * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them.
    * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported.
-   * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 2,650 microorganisms (2,207 bacteria, 285 fungi/yeasts, 153 parasites, 1 other). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.mo` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data.
+   * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.mo` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data.
    * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_official` and `ab_tradenames` to look up values. As the `mo_*` functions use `as.mo` internally, the `ab_*` functions use `as.atc` internally so it uses AI to guess your expected result. For example, `ab_official("Fluclox")`, `ab_official("Floxapen")` and `ab_official("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.
 
 3. It **analyses the data** with convenient functions that use well-known methods.
@@ -74,7 +74,7 @@ This `AMR` package basically does four important things:
    * Real and genuine data
 
 ## How to get it?
-All versions of this package [are published on CRAN](http://cran.r-project.org/package=AMR), the official R network with a peer-reviewed submission process.
+All stable versions of this package [are published on CRAN](http://cran.r-project.org/package=AMR), the official R network with a peer-reviewed submission process.
 
 ### Install from CRAN
 [![CRAN_Badge](https://www.r-pkg.org/badges/version/AMR)](http://cran.r-project.org/package=AMR) [![CRAN_Downloads](https://cranlogs.r-pkg.org/badges/grand-total/AMR)](http://cran.r-project.org/package=AMR)
@@ -89,15 +89,14 @@ All versions of this package [are published on CRAN](http://cran.r-project.org/p
   - `install.packages("AMR")`
 
 ### Install from GitHub
-[![Last_Commit](https://img.shields.io/github/last-commit/msberends/AMR.svg)](https://github.com/msberends/AMR/commits/master)
-This is the latest development version. Although it may contain bugfixes and even new functions compared to the latest released version on CRAN, it is also subject to change and may be unstable or behave unexpectedly. Always consider this a beta version. All below 'badges' should be green.
+This is the latest **development version**. Although it may contain bugfixes and even new functions compared to the latest released version on CRAN, it is also subject to change and may be unstable or behave unexpectedly. Always consider this a beta version. All below 'badges' should be green:
 
-Development Test | Result
---- | :---:
-Works on Linux and macOS | [![Travis_Build](https://travis-ci.org/msberends/AMR.svg?branch=master)](https://travis-ci.org/msberends/AMR)
-Works on Windows | [![AppVeyor_Build](https://ci.appveyor.com/api/projects/status/github/msberends/AMR?branch=master&svg=true)](https://ci.appveyor.com/project/msberends/AMR)
-Syntax lines checked | [![Code_Coverage](https://codecov.io/gh/msberends/AMR/branch/master/graph/badge.svg)](https://codecov.io/gh/msberends/AMR)
+Development Test | Result | Reference
+--- | :---: | ---
+Works on Linux and macOS | [![Travis_Build](https://travis-ci.org/msberends/AMR.svg?branch=master)](https://travis-ci.org/msberends/AMR) | Checked by Travis CI, GmbH [[ref 1]](https://travis-ci.org/msberends/AMR)
+Works on Windows | [![AppVeyor_Build](https://ci.appveyor.com/api/projects/status/github/msberends/AMR?branch=master&svg=true)](https://ci.appveyor.com/project/msberends/AMR) | Checked by Appveyor Systems Inc. [[ref 2]](https://ci.appveyor.com/project/msberends/AMR)
+Syntax lines checked | [![Code_Coverage](https://codecov.io/gh/msberends/AMR/branch/master/graph/badge.svg)](https://codecov.io/gh/msberends/AMR) | Checked by Codecov LLC [[ref 3]](https://codecov.io/gh/msberends/AMR)
 
 If so, try it with:
 ```r
diff --git a/data/microorganisms.rda b/data/microorganisms.rda
index 094dc216..d939951a 100755
Binary files a/data/microorganisms.rda and b/data/microorganisms.rda differ
diff --git a/man/antibiotics.Rd b/man/antibiotics.Rd
index 598497fd..c101908b 100755
--- a/man/antibiotics.Rd
+++ b/man/antibiotics.Rd
@@ -3,8 +3,8 @@
 \docType{data}
 \name{antibiotics}
 \alias{antibiotics}
-\title{Dataset with 423 antibiotics}
-\format{A data.frame with 423 observations and 18 variables:
+\title{Data set with 423 antibiotics}
+\format{A \code{\link{tibble}} with 423 observations and 18 variables:
 \describe{
   \item{\code{atc}}{ATC code, like \code{J01CR02}}
   \item{\code{certe}}{Certe code, like \code{amcl}}
@@ -32,7 +32,7 @@
 antibiotics
 }
 \description{
-A dataset containing all antibiotics with a J0 code and some other antimicrobial agents, with their DDD's. Except for trade names and abbreviations, all properties were downloaded from the WHO, see Source.
+A data set containing all antibiotics with a J0 code and some other antimicrobial agents, with their DDDs. Except for trade names and abbreviations, all properties were downloaded from the WHO, see Source.
 }
 \seealso{
 \code{\link{microorganisms}}
diff --git a/man/as.atc.Rd b/man/as.atc.Rd
index 9a6aa6c4..cf0f053a 100644
--- a/man/as.atc.Rd
+++ b/man/as.atc.Rd
@@ -45,6 +45,6 @@ ab_official(Cipro) # returns "Ciprofloxacin"
 ab_umcg(Cipro) # returns "CIPR", the code used in the UMCG
 }
 \seealso{
-\code{\link{antibiotics}} for the dataframe that is being used to determine ATC's.
+\code{\link{antibiotics}} for the dataframe that is being used to determine ATCs.
 }
 \keyword{atc}
diff --git a/man/microorganisms.Rd b/man/microorganisms.Rd
index dd57a3d2..4a3bf3a8 100755
--- a/man/microorganisms.Rd
+++ b/man/microorganisms.Rd
@@ -3,8 +3,8 @@
 \docType{data}
 \name{microorganisms}
 \alias{microorganisms}
-\title{Dataset with ~2650 microorganisms}
-\format{A data.frame with 2,646 observations and 12 variables:
+\title{Data set with human pathogenic microorganisms}
+\format{A \code{\link{tibble}} with 2,664 observations and 12 variables:
 \describe{
   \item{\code{mo}}{ID of microorganism}
   \item{\code{bactsys}}{Bactsyscode of microorganism}
@@ -23,7 +23,7 @@
 microorganisms
 }
 \description{
-A dataset containing 2,646 microorganisms. MO codes of the UMCG can be looked up using \code{\link{microorganisms.umcg}}.
+A data set containing 2,664 (potential) human pathogenic microorganisms. MO codes can be looked up using \code{\link{guess_mo}}.
 }
 \seealso{
 \code{\link{guess_mo}} \code{\link{antibiotics}} \code{\link{microorganisms.umcg}}
diff --git a/man/microorganisms.umcg.Rd b/man/microorganisms.umcg.Rd
index 517fadf0..edf38ee0 100755
--- a/man/microorganisms.umcg.Rd
+++ b/man/microorganisms.umcg.Rd
@@ -3,8 +3,8 @@
 \docType{data}
 \name{microorganisms.umcg}
 \alias{microorganisms.umcg}
-\title{Translation table for UMCG with ~1100 microorganisms}
-\format{A data.frame with 1090 observations and 2 variables:
+\title{Translation table for UMCG with ~1,100 microorganisms}
+\format{A \code{\link{tibble}} with 1,090 observations and 2 variables:
 \describe{
   \item{\code{umcg}}{Code of microorganism according to UMCG MMB}
   \item{\code{mo}}{Code of microorganism in \code{\link{microorganisms}}}
@@ -13,7 +13,7 @@
 microorganisms.umcg
 }
 \description{
-A dataset containing all bacteria codes of UMCG MMB. These codes can be joined to data with an ID from \code{\link{microorganisms}$mo} (using \code{\link{left_join_microorganisms}}). GLIMS codes can also be translated to valid \code{mo}'s with \code{\link{guess_mo}}.
+A data set containing all bacteria codes of UMCG MMB. These codes can be joined to data with an ID from \code{\link{microorganisms}$mo} (using \code{\link{left_join_microorganisms}}). GLIMS codes can also be translated to valid \code{MO}s with \code{\link{guess_mo}}.
 }
 \seealso{
 \code{\link{guess_mo}} \code{\link{microorganisms}}
diff --git a/man/septic_patients.Rd b/man/septic_patients.Rd
index 0e6645c0..d63b449d 100755
--- a/man/septic_patients.Rd
+++ b/man/septic_patients.Rd
@@ -3,8 +3,8 @@
 \docType{data}
 \name{septic_patients}
 \alias{septic_patients}
-\title{Dataset with 2000 blood culture isolates of septic patients}
-\format{A data.frame with 2000 observations and 49 variables:
+\title{Data set with 2000 blood culture isolates of septic patients}
+\format{A \code{\link{tibble}} with 2,000 observations and 49 variables:
 \describe{
   \item{\code{date}}{date of receipt at the laboratory}
   \item{\code{hospital_id}}{ID of the hospital, from A to D}
@@ -21,20 +21,20 @@
 septic_patients
 }
 \description{
-An anonymised dataset containing 2000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. It is true, genuine data. This \code{data.frame} can be used to practice AMR analysis. For examples, press F1.
+An anonymised data set containing 2,000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. It is true, genuine data. This \code{data.frame} can be used to practice AMR analysis. For examples, press F1.
 }
 \examples{
 # ----------- #
 # PREPARATION #
 # ----------- #
 
-# Save this example dataset to an object, so we can edit it:
+# Save this example data set to an object, so we can edit it:
 my_data <- septic_patients
 
 # load the dplyr package to make data science A LOT easier
 library(dplyr)
 
-# Add first isolates to our dataset:
+# Add first isolates to our data set:
 my_data <- my_data \%>\%
   mutate(first_isolates = first_isolate(my_data, "date", "patient_id", "mo"))
 
diff --git a/tests/testthat/test-eucast.R b/tests/testthat/test-eucast.R
index c2e40425..2978a093 100755
--- a/tests/testthat/test-eucast.R
+++ b/tests/testthat/test-eucast.R
@@ -20,7 +20,6 @@ test_that("EUCAST rules work", {
                         "ENTAER"), # Enterobacter aerogenes
                  amox = "R", # Amoxicillin
                  stringsAsFactors = FALSE)
 
-  expect_warning(EUCAST_rules(a, info = FALSE))
   expect_identical(suppressWarnings(EUCAST_rules(a, info = FALSE)), b)
   expect_identical(suppressWarnings(interpretive_reading(a, info = TRUE)), b)
diff --git a/tests/testthat/test-first_isolate.R b/tests/testthat/test-first_isolate.R
index bb18789d..7f3fe893 100755
--- a/tests/testthat/test-first_isolate.R
+++ b/tests/testthat/test-first_isolate.R
@@ -124,10 +124,15 @@ test_that("first isolates work", {
                              col_date = "non-existing col",
                              col_mo = "mo"))
 
-  expect_warning(septic_patients %>%
+  # if mo is not an mo class, result should be the same
+  expect_identical(septic_patients %>%
                      mutate(mo = as.character(mo)) %>%
                      first_isolate(col_date = "date",
                                    col_mo = "mo",
-                                   col_patient_id = "patient_id"))
+                                   col_patient_id = "patient_id"),
+                   septic_patients %>%
+                     first_isolate(col_date = "date",
+                                   col_mo = "mo",
+                                   col_patient_id = "patient_id"))
 
 })
diff --git a/tests/testthat/test-mo.R b/tests/testthat/test-mo.R
index 47b5baeb..36185168 100644
--- a/tests/testthat/test-mo.R
+++ b/tests/testthat/test-mo.R
@@ -11,6 +11,7 @@ test_that("as.mo works", {
   expect_equal(as.character(as.mo(" ESCCOL ")), "ESCCOL")
   expect_equal(as.character(as.mo("klpn")), "KLEPNE")
   expect_equal(as.character(as.mo("Klebsiella")), "KLE")
+  expect_equal(as.character(as.mo("K. pneu rhino")), "KLEPNERH") # K. pneumoniae subspp. rhinoscleromatis
   expect_equal(as.character(as.mo("coagulase negative")), "STACNS")
 
   expect_equal(as.character(as.mo("P. aer")), "PSEAER") # not Pasteurella aerogenes
diff --git a/vignettes/AMR.Rmd b/vignettes/AMR.Rmd
index 2469a031..cf1e2b96 100755
--- a/vignettes/AMR.Rmd
+++ b/vignettes/AMR.Rmd
@@ -5,7 +5,7 @@ output:
   rmarkdown::html_vignette:
     toc: true
 vignette: >
-  %\VignetteIndexEntry{Creating Frequency Tables}
+  %\VignetteIndexEntry{Introduction to the AMR package}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
 ---
@@ -23,10 +23,10 @@ This `AMR` package basically does four important things:
 
 1. It **cleanses existing data**, by transforming it to reproducible and profound *classes*, making the most efficient use of R. These functions all use artificial intelligence to guess results that you would expect:
 
-   * Use `as.bactid` to get an ID of a microorganism. The IDs are quite obvious - the ID of *E. coli* is "ESCCOL" and the ID of *S. aureus* is "STAAUR". This `as.bactid` function takes almost any text as input that looks like the name or code of a microorganism like "E. coli", "esco" and "esccol". Even `as.bactid("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase negative and positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, this package contains a freely available database of ~2,650 different (potential) human pathogenic microorganisms.
+   * Use `as.mo` to get an ID of a microorganism. The IDs are quite obvious - the ID of *E. coli* is "ESCCOL" and the ID of *S. aureus* is "STAAUR". The function takes almost any text as input that looks like the name or code of a microorganism like "E. coli", "esco" and "esccol". Even `as.mo("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase negative and positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, this package contains a freely available database of ~2,650 different (potential) human pathogenic microorganisms.
    * Use `as.rsi` to transform values to valid antimicrobial results. It produces just S, I or R based on your input and warns about invalid values. Even values like "<=0.002; S" (combined MIC/RSI) will result in "S".
    * Use `as.mic` to cleanse your MIC values. It produces a so-called factor (called *ordinal* in SPSS) with valid MIC values as levels. A value like "<=0.002; S" (combined MIC/RSI) will result in "<=0.002".
-   * Use `as.atc` to get the ATC code of an antibiotic as defined by the WHO. This package contains a database with most LIS codes, official names, DDDs and even trade names of antibiotics. For example, the values "Furabid", "Furadantine", "nitro" all return the ATC code of Nitrofurantoine.
+   * Use `as.atc` to get the ATC code of an antibiotic as defined by the WHO. This package contains a database with most LIS codes, official names, DDDs and even trade names of antibiotics. For example, the values "Furabid", "Furadantin", "nitro" all return the ATC code of Nitrofurantoine.
 
 2. It **enhances existing data** and **adds new data** from data sets included in this package.
 
@@ -34,8 +34,8 @@ This `AMR` package basically does four important things:
    * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute).
      * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them.
    * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported.
-   * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 2,650 microorganisms (2,207 bacteria, 285 fungi/yeasts, 153 parasites, 1 other). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.bactid` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data.
-   * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_official` and `ab_tradenames` to look up values. As the `mo_*` functions use `as.bactid` internally, the `ab_*` functions use `as.atc` internally so it uses AI to guess your expected result. For example, `ab_official("Fluclox")`, `ab_official("Floxapen")` and `ab_official("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.
+   * The data set `microorganisms` contains the family, genus, species, subspecies, colloquial name and Gram stain of almost 2,650 microorganisms (2,207 bacteria, 285 fungi/yeasts, 153 parasites, 1 other). This enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family` or `mo_gramstain`. Since it uses `as.mo` internally, AI is supported. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. These functions can be used to add new variables to your data.
+   * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_official` and `ab_tradenames` to look up values. As the `mo_*` functions use `as.mo` internally, the `ab_*` functions use `as.atc` internally so it uses AI to guess your expected result. For example, `ab_official("Fluclox")`, `ab_official("Floxapen")` and `ab_official("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.
 
 3. It **analyses the data** with convenient functions that use well-known methods.