(v1.2.0.9019) ab_from_text() dose and administration

2026-02-27 01:18:46 +01:00 · 2020-07-01 11:07:01 +02:00
parent 240d817b4e
commit 329a0eb0b6
40 changed files with 350 additions and 170 deletions
--- a/R/aa_helper_functions.R
+++ b/R/aa_helper_functions.R
@@ -201,35 +201,43 @@ import_fn <- function(name, pkg) {
  get(name, envir = asNamespace(pkg))
 }

+stop_ <- function(..., call = TRUE) {
+  msg <- paste0(c(...), collapse = "")
+  if (!isFALSE(call)) {
+    if (isTRUE(call)) {
+      call <- as.character(sys.call(-1)[1])
+    } else {
+      # so you can go back more than 1 call, as used in rsi_calc(), that now throws a reference to e.g. n_rsi()
+      call <- as.character(sys.call(call)[1])
+    }
+    msg <- paste0("in ", call, "(): ", msg)
+  }
+  stop(msg, call. = FALSE)
+}
+
 stop_if <- function(expr, ..., call = TRUE) {
  if (isTRUE(expr)) {
-    msg <- paste0(c(...), collapse = "")
-    if (!isFALSE(call)) {
-      if (isTRUE(call)) {
-        call <- as.character(sys.call(-1)[1])
-      } else {
-        # so you can go back more than 1 call, as used in rsi_calc(), that now throws a reference to e.g. n_rsi()
-        call <- as.character(sys.call(call)[1])
-      }
-      msg <- paste0("in ", call, "(): ", msg)
+    if (isTRUE(call)) {
+      call <- -1
    }
-    stop(msg, call. = FALSE)
+    if (!isFALSE(call)) {
+      # since we're calling stop_(), which is another call
+      call <- call - 1
+    }
+    stop_(..., call = call)
  }
 }

 stop_ifnot <- function(expr, ..., call = TRUE) {
  if (!isTRUE(expr)) {
-    msg <- paste0(c(...), collapse = "")
-    if (!isFALSE(call)) {
-      if (isTRUE(call)) {
-        call <- as.character(sys.call(-1)[1])
-      } else {
-        # so you can go back more than 1 call, as used in rsi_calc(), that now throws a reference to e.g. n_rsi()
-        call <- as.character(sys.call(call)[1])
-      }
-      msg <- paste0("in ", call, "(): ", msg)
+    if (isTRUE(call)) {
+      call <- -1
    }
-    stop(msg, call. = FALSE)
+    if (!isFALSE(call)) {
+      # since we're calling stop_(), which is another call
+      call <- call - 1
+    }
+    stop_(..., call = call)
  }
 }
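
The call-offset logic of `stop_()` and `stop_if()` in this hunk can be tried in isolation. Below is a minimal sketch that copies the two helpers as they appear in the added lines. The functions `check_positive()`, `inner()` and `outer()` are hypothetical and exist only for illustration. It shows that the default `call = TRUE` names the direct caller of `stop_if()`, while `call = -2` names the function one level further up.

```r
# Helpers copied from the added lines of this hunk
stop_ <- function(..., call = TRUE) {
  msg <- paste0(c(...), collapse = "")
  if (!isFALSE(call)) {
    if (isTRUE(call)) {
      call <- as.character(sys.call(-1)[1])
    } else {
      call <- as.character(sys.call(call)[1])
    }
    msg <- paste0("in ", call, "(): ", msg)
  }
  stop(msg, call. = FALSE)
}

stop_if <- function(expr, ..., call = TRUE) {
  if (isTRUE(expr)) {
    if (isTRUE(call)) {
      call <- -1
    }
    if (!isFALSE(call)) {
      # one extra frame, because stop_() is itself another call
      call <- call - 1
    }
    stop_(..., call = call)
  }
}

# Hypothetical caller: the default refers to the function that called stop_if()
check_positive <- function(x) {
  stop_if(x <= 0, "`x` must be positive")
  x
}

# Hypothetical nested callers: call = -2 skips inner() and refers to outer()
inner <- function(x) {
  stop_if(x <= 0, "`x` must be positive", call = -2)
  x
}
outer <- function(x) inner(x)

msg_default <- tryCatch(check_positive(-1), error = function(e) conditionMessage(e))
msg_outer <- tryCatch(outer(-1), error = function(e) conditionMessage(e))
print(msg_default)
print(msg_outer)
```

The subtraction in `stop_if()` is what keeps the user-facing meaning of `call` unchanged after the refactoring: callers still count frames relative to `stop_if()`, not relative to the new inner helper.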

--- a/R/ab_from_text.R
+++ b/R/ab_from_text.R
@@ -19,86 +19,162 @@
 # Visit our website for more info: https://msberends.gitlab.io/AMR.    #
 # ==================================================================== #

-#' Retrieve antimicrobial drugs from clinical text
+#' Retrieve antimicrobial drug names and doses from clinical text
 #' 
-#' Use this function on e.g. clinical texts from health care records. It returns a [list] with all antimicrobial drugs found in the texts.
+#' Use this function on e.g. clinical texts from health care records. It returns a [list] with all antimicrobial drugs, doses and forms of administration found in the texts.
+#' @inheritSection lifecycle Maturing lifecycle
 #' @param text text to analyse
-#' @param collapse character to pass on to `paste(..., collapse = ...)` to only return one character per element of `text`, see Examples
-#' @param translate_ab a column name of the [antibiotics] data set to translate the antibiotic abbreviations to, using [ab_property()]. Defaults to `FALSE`. Using `TRUE` is equal to using "name".
+#' @param type type of property to search for, either `"drug"`, `"dose"` or `"administration"`, see *Examples*
+#' @param collapse character to pass on to `paste(..., collapse = ...)` to only return one character per element of `text`, see *Examples*
+#' @param translate_ab if `type = "drug"`: a column name of the [antibiotics] data set to translate the antibiotic abbreviations to, using [ab_property()]. Defaults to `FALSE`. Using `TRUE` is equal to using "name".
 #' @param ... parameters passed on to [as.ab()]
-#' @details Without using `collapse`, this function will return a [list]. This can be convenient to use e.g. inside a `mutate()`):\cr
+#' @details This function is also internally used by [as.ab()], although it then only searches for the first drug name and will throw a note if more drug names could have been returned.
+#' 
+#' ## Parameter `type`
+#' By default, the function will search for antimicrobial drug names. All text elements will be searched for official names, ATC codes and brand names. As it uses [as.ab()] internally, it will correct for misspelling.
+#' 
+#' With `type = "dose"` (or similar, like "dosing", "doses"), all text elements will be searched for numeric values that are at least 100 and do not resemble years. The output will be numeric. It supports any unit (g, mg, IE, etc.) and multiple values in one clinical text, see *Examples*.
+#' 
+#' With `type = "administration"` (or abbreviations, like "admin", "adm"), all text elements will be searched for a form of drug administration. It supports the following forms (including common abbreviations): buccal, implant, inhalation, instillation, intravenous, nasal, oral, parenteral, rectal, sublingual, transdermal and vaginal. Abbreviations for oral (such as 'po', 'per os') will become "oral", all values for intravenous (such as 'iv', 'intraven') will become "iv". It supports multiple values in one clinical text, see *Examples*.
+#' 
+#' ## Parameter `collapse`
+#' Without using `collapse`, this function will return a [list]. This can be convenient to use e.g. inside a `mutate()`:\cr
 #' `df %>% mutate(abx = ab_from_text(clinical_text))` 
 #' 
 #' The returned AB codes can be transformed to official names, groups, etc. with all [ab_property()] functions like [ab_name()] and [ab_group()], or by using the `translate_ab` parameter.
 #' 
 #' When using `collapse`, this function will return a [character]:\cr
 #' `df %>% mutate(abx = ab_from_text(clinical_text, collapse = "|"))` 
-#' 
-#' This function is also internally used by [as.ab()], although it then only returns the first hit and will throw a note if more results could have been returned.
 #' @export
 #' @return A [list], or a [character] if `collapse` is not `NULL`
+#' @inheritSection AMR Read more on our website!
 #' @examples 
 #' # mind the bad spelling of amoxicillin in this line, 
 #' # straight from a true health care record:
 #' ab_from_text("28/03/2020 regular amoxicilliin 500mg po tds")
 #' 
-#' ab_from_text("administered amoxi/clav and cipro")
-#' ab_from_text("administered amoxi/clav and cipro", collapse = ", ")
+#' ab_from_text("500 mg amoxi po and 400mg cipro iv")
+#' ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "dose")
+#' ab_from_text("500 mg amoxi po and 400mg cipro iv", type = "admin")
 #' 
-#' # if you want to know which antibiotic groups were administered, check it:
-#' abx <- ab_from_text("administered amoxi/clav and cipro")
+#' ab_from_text("500 mg amoxi po and 400mg cipro iv", collapse = ", ")
+#' 
+#' # if you want to know which antibiotic groups were administered, do e.g.:
+#' abx <- ab_from_text("500 mg amoxi po and 400mg cipro iv")
 #' ab_group(abx[[1]])
 #' 
 #' if (require(dplyr)) {
-#'   tibble(clinical_text = c("given cipro and mero",
-#'                            "started on doxy today")) %>% 
-#'     mutate(abx = ab_from_text(clinical_text),
-#'            abx2 = ab_from_text(clinical_text,
-#'                                collapse = "|"),
-#'            abx3 = ab_from_text(clinical_text,
-#'                                collapse = "|",
-#'                                translate_ab = "name"))
+#'   tibble(clinical_text = c("given 400mg cipro and 500 mg amox",
+#'                            "started on doxy iv today")) %>% 
+#'     mutate(abx_codes = ab_from_text(clinical_text),
+#'            abx_doses = ab_from_text(clinical_text, type = "doses"),
+#'            abx_admin = ab_from_text(clinical_text, type = "admin"),
+#'            abx_coll = ab_from_text(clinical_text, collapse = "|"),
+#'            abx_coll_names = ab_from_text(clinical_text,
+#'                                          collapse = "|",
+#'                                          translate_ab = "name"),
+#'            abx_coll_doses = ab_from_text(clinical_text,
+#'                                          type = "doses",
+#'                                          collapse = "|"),
+#'            abx_coll_admin = ab_from_text(clinical_text,
+#'                                          type = "admin",
+#'                                          collapse = "|"))
 #' 
 #' }
-ab_from_text <- function(text, collapse = NULL, translate_ab = FALSE, ...) {
+ab_from_text <- function(text,
+                         type = c("drug", "dose", "administration"),
+                         collapse = NULL,
+                         translate_ab = FALSE,
+                         ...) {
+  
+  if (missing(type)) {
+    type <- type[1L]
+  }
+  type <- tolower(trimws(type))
+  stop_if(length(type) != 1, "`type` must be of length 1")
  
  text <- tolower(as.character(text))
-  translate_ab <- get_translate_ab(translate_ab)
-  
-  abbr <- unlist(antibiotics$abbreviations)
-  abbr <- abbr[nchar(abbr) >= 4]
-  names <- substr(antibiotics$name, 1, 5)
-  synonyms <- unlist(antibiotics$synonyms)
-  synonyms <- synonyms[nchar(synonyms) >= 4]
-  to_regex <- function(x) {
-    paste0("^(",
-           paste0(unique(gsub("[^a-z0-9]", ".*", sort(tolower(x)))), collapse = "|"),
-           ").*")
-  }
-  
  text_split_all <- strsplit(text, "[ ;.,:/\\|-]")
-  result <- lapply(text_split_all, function(text_split) {
-    suppressWarnings(
-      out <- as.ab(unique(c(text_split[grep(to_regex(abbr), text_split)],
-                     text_split[grep(to_regex(names), text_split)],
-                     # regular expression must not be too long, so split synonyms in two:
-                     text_split[grep(to_regex(synonyms[c(1:0.5 * length(synonyms))]), text_split)],
-                     text_split[grep(to_regex(synonyms[c(0.5 * length(synonyms):length(synonyms))]), text_split)])),
-            ...))
-    out <- out[!is.na(out)]
-    if (length(out) == 0) {
-      as.ab(NA)
-    } else {
-      if (!isFALSE(translate_ab)) {
-        out <- ab_property(out, property = translate_ab, initial = FALSE)
-      }
-      out
+  
+  if (type %like% "(drug|ab|anti)") {
+    
+    translate_ab <- get_translate_ab(translate_ab)
+    
+    abbr <- unlist(antibiotics$abbreviations)
+    abbr <- abbr[nchar(abbr) >= 4]
+    names_atc <- substr(c(antibiotics$name, antibiotics$atc), 1, 5)
+    synonyms <- unlist(antibiotics$synonyms)
+    synonyms <- synonyms[nchar(synonyms) >= 4]
+    to_regex <- function(x) {
+      paste0("^(",
+             paste0(unique(gsub("[^a-z0-9]", ".*", sort(tolower(x)))), collapse = "|"),
+             ").*")
    }
-  })
-
-  if (!is.null(collapse)) {
-    result <- sapply(result, function(x) paste0(x, collapse = collapse))
+    
+    result <- lapply(text_split_all, function(text_split) {
+      suppressWarnings(
+        out <- as.ab(unique(c(text_split[grep(to_regex(abbr), text_split)],
+                              text_split[grep(to_regex(names_atc), text_split)],
+                              # regular expression must not be too long, so split synonyms in two:
+                              text_split[grep(to_regex(synonyms[seq_len(floor(0.5 * length(synonyms)))]), text_split)],
+                              text_split[grep(to_regex(synonyms[seq(floor(0.5 * length(synonyms)) + 1, length(synonyms))]), text_split)])),
+                     ...))
+      out <- out[!is.na(out)]
+      if (length(out) == 0) {
+        as.ab(NA)
+      } else {
+        if (!isFALSE(translate_ab)) {
+          out <- ab_property(out, property = translate_ab, initial = FALSE)
+        }
+        out
+      }
+    })
+    
+  } else if (type %like% "dos") {
+    text_split_all <- strsplit(text, " ")
+    result <- lapply(text_split_all, function(text_split) {
+      text_split <- text_split[text_split %like% "^[0-9]{2,}(/[0-9]+)?[a-z]*$"]
+      # only left part of "/", like 500 in  "500/125"
+      text_split <-  gsub("/.*", "", text_split)
+      text_split <- gsub(",", ".", text_split, fixed = TRUE) # foreign system using comma as decimal sep
+      text_split <- as.double(gsub("[^0-9.]", "", text_split))
+      # keep values of at least 100 units/mg, and drop values that look like years (unlikely to be doses)
+      text_split <- text_split[text_split >= 100 & !text_split %in% c(1951:1999, 2001:2049)]
+      
+      if (length(text_split) > 0) {
+        text_split
+      } else {
+        NA_real_
+      }
+    })
+    
+  } else if (type %like% "adm") {
+    result <- lapply(text_split_all, function(text_split) {
+      text_split <- text_split[text_split %like% "(^iv$|intraven|^po$|per os|oral|implant|inhal|instill|nasal|paren|rectal|sublingual|buccal|trans.*dermal|vaginal)"]
+      if (length(text_split) > 0) {
+        text_split <- gsub("(^po$|.*per os.*)", "oral", text_split)
+        text_split <- gsub("(^iv$|.*intraven.*)", "iv", text_split)
+        text_split
+      } else {
+        NA_character_
+      }
+    })
+    
+  } else {
+    stop_("`type` must be either 'drug', 'dose' or 'administration'")
  }
-
+  
+  # collapse text if needed
+  if (!is.null(collapse)) {
+    result <- sapply(result, function(x) {
+      if (length(x) == 1 && all(is.na(x))) {
+        NA_character_
+      } else {
+        paste0(x, collapse = collapse)
+      }
+    })
+  }
+  
  result
+  
 }
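
The `"dose"` and `"administration"` branches added above can be explored without installing the package. The following is a rough standalone sketch, not the package's own code: it uses base R `grepl()` in place of the internal `%like%` operator, and the names `extract_doses()` and `extract_admin()` are hypothetical. The regular expressions and thresholds are taken from the added lines.

```r
# Sketch of the "dose" branch: keep numeric tokens >= 100 that do not look like years
extract_doses <- function(text) {
  tokens_per_text <- strsplit(tolower(as.character(text)), " ", fixed = TRUE)
  lapply(tokens_per_text, function(tokens) {
    # tokens such as "500", "400mg" or "500/125"
    tokens <- tokens[grepl("^[0-9]{2,}(/[0-9]+)?[a-z]*$", tokens)]
    # only the part left of "/", like 500 in "500/125"
    tokens <- sub("/.*", "", tokens)
    values <- as.double(gsub("[^0-9.]", "", tokens))
    values <- values[values >= 100 & !values %in% c(1951:1999, 2001:2049)]
    if (length(values) > 0) values else NA_real_
  })
}

# Sketch of the "administration" branch: keep known forms, normalise po and iv
extract_admin <- function(text) {
  tokens_per_text <- strsplit(tolower(as.character(text)), "[ ;.,:/\\|-]")
  forms <- paste0("(^iv$|intraven|^po$|per os|oral|implant|inhal|instill|nasal|",
                  "paren|rectal|sublingual|buccal|trans.*dermal|vaginal)")
  lapply(tokens_per_text, function(tokens) {
    tokens <- tokens[grepl(forms, tokens)]
    if (length(tokens) > 0) {
      tokens <- gsub("(^po$|.*per os.*)", "oral", tokens)
      tokens <- gsub("(^iv$|.*intraven.*)", "iv", tokens)
      tokens
    } else {
      NA_character_
    }
  })
}

texts <- c("500 mg amoxi po and 400mg cipro iv", "started on doxy today")
doses <- extract_doses(texts)
admin <- extract_admin(texts)
print(doses)
print(admin)
```

Both helpers return a list with one element per input text, which mirrors how the function behaves when `collapse` is left at `NULL`. Texts without a match yield a single `NA` of the matching type.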
--- a/R/amr.R
+++ b/R/amr.R
@@ -30,6 +30,7 @@
 #' This package can be used for:
 #' - Reference for the taxonomy of microorganisms, since the package contains all microbial (sub)species from the [Catalogue of Life](http://www.catalogueoflife.org)
 #' - Interpreting raw MIC and disk diffusion values, based on the latest CLSI or EUCAST guidelines
+#' - Retrieving antimicrobial drug names, doses and forms of administration from clinical health care records
 #' - Determining first isolates to be used for AMR analysis
 #' - Calculating antimicrobial resistance
 #' - Determining multi-drug resistance (MDR) / multi-drug resistant organisms (MDRO)
@@ -45,12 +46,13 @@

 #' @section Read more on our website!:
 #' On our website <https://msberends.gitlab.io/AMR> you can find [a comprehensive tutorial](https://msberends.gitlab.io/AMR/articles/AMR.html) about how to conduct AMR analysis, the [complete documentation of all functions](https://msberends.gitlab.io/AMR/reference) (which reads a lot easier than here in R) and [an example analysis using WHONET data](https://msberends.gitlab.io/AMR/articles/WHONET.html).
-#' @section Contact us:
+#' @section Contact Us:
 #' For suggestions, comments or questions, please contact us at:
 #'
 #' Matthijs S. Berends \cr
 #' m.s.berends \[at\] umcg \[dot\] nl \cr
-#' Department of Medical Microbiology, University of Groningen \cr
+#' University of Groningen \cr
+#' Department of Medical Microbiology \cr
 #' University Medical Center Groningen \cr
 #' Post Office Box 30001 \cr
 #' 9700 RB Groningen \cr
--- a/R/lifecycle.R
+++ b/R/lifecycle.R
@@ -35,7 +35,7 @@
 #' The [lifecycle][AMR::lifecycle] of this function is **experimental**. An experimental function is in early stages of development. The underlying code might be changing frequently. Experimental functions might be removed without deprecation, so you are generally best off waiting until a function is more mature before you use it in production code. Experimental functions are only available in development versions of this `AMR` package and will thus not be included in releases that are submitted to CRAN, since such functions have not yet matured enough.
 #' @section Maturing lifecycle:
 #' \if{html}{\figure{lifecycle_maturing.svg}{options: style=margin-bottom:5px} \cr}
-#' The [lifecycle][AMR::lifecycle] of this function is **maturing**. The unlying code of a maturing function has been roughed out, but finer details might still change. This function needs wider usage and more extensive testing in order to optimise the unlying code.
+#' The [lifecycle][AMR::lifecycle] of this function is **maturing**. The underlying code of a maturing function has been roughed out, but finer details might still change. Since this function needs wider usage and more extensive testing, you are very welcome [to suggest changes at our repository](https://gitlab.com/msberends/AMR/-/issues) or [write us an email (see section 'Contact Us')][AMR::AMR].
 #' @section Stable lifecycle:
 #' \if{html}{\figure{lifecycle_stable.svg}{options: style=margin-bottom:5px} \cr}
 #' The [lifecycle][AMR::lifecycle] of this function is **stable**. In a stable function, major changes are unlikely. This means that the underlying code will generally evolve by adding new arguments; removing arguments or changing the meaning of existing arguments will be avoided.
--- a/R/rsi_calc.R
+++ b/R/rsi_calc.R
@@ -174,7 +174,9 @@ rsi_calc_df <- function(type, # "proportion", "count" or "both"
    combine_SI <- FALSE
  }
  stop_if(isTRUE(combine_SI) & isTRUE(combine_IR), "either `combine_SI` or `combine_IR` can be TRUE, not both", call = -2)
-
+  stop_ifnot(is.numeric(minimum), "`minimum` must be numeric", call = -2)
+  stop_ifnot(is.logical(as_percent), "`as_percent` must be logical", call = -2)
+  
  translate_ab <- get_translate_ab(translate_ab)

  # select only groups and antibiotics
--- a/R/sysdata.rda
+++ b/R/sysdata.rda