AMR/R/first_isolate.R

# ==================================================================== #
# TITLE                                                                #
# Antimicrobial Resistance (AMR) Data Analysis for R                   #
#                                                                      #
# SOURCE                                                               #
# https://github.com/msberends/AMR                                     #
#                                                                      #
# LICENCE                                                              #
# (c) 2018-2021 Berends MS, Luz CF et al.                              #
# Developed at the University of Groningen, the Netherlands, in        #
# collaboration with non-profit organisations Certe Medical            #
# Diagnostics & Advice, and University Medical Center Groningen.       # 
#                                                                      #
# This R package is free software; you can freely use and distribute   #
# it for both personal and commercial purposes under the terms of the  #
# GNU General Public License version 2.0 (GNU GPL-2), as published by  #
# the Free Software Foundation.                                        #
# We created this package for both routine data analysis and academic  #
# research and it was publicly released in the hope that it will be    #
# useful, but it comes WITHOUT ANY WARRANTY OR LIABILITY.              #
#                                                                      #
# Visit our website for the full manual and a complete tutorial about  #
# how to conduct AMR data analysis: https://msberends.github.io/AMR/   #
# ==================================================================== #

#' Determine First (Weighted) Isolates
#'
#' Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. These functions support all four methods as summarised by Hindler *et al.* in 2007 (\doi{10.1086/511864}). To determine patient episodes not necessarily based on microorganisms, use [is_new_episode()], which also supports grouping with the `dplyr` package.
#' @inheritSection lifecycle Stable Lifecycle
#' @param x a [data.frame] containing isolates. Can be left blank for automatic determination, see *Examples*.
#' @param col_date column name of the result date (or date that it was received at the lab), defaults to the first column with a date class
#' @param col_patient_id column name of the unique IDs of the patients, defaults to the first column that starts with 'patient' or 'patid' (case insensitive)
#' @param col_mo column name of the IDs of the microorganisms (see [as.mo()]), defaults to the first column of class [`mo`]. Values will be coerced using [as.mo()].
#' @param col_testcode column name of the test codes. Use `col_testcode = NULL` to **not** exclude certain test codes (such as test codes for screening). In that case `testcodes_exclude` will be ignored.
#' @param col_specimen column name of the specimen type or group
#' @param col_icu column name of the logicals (`TRUE`/`FALSE`) whether a ward or department is an Intensive Care Unit (ICU)
#' @param col_keyantimicrobials (only useful when `method = "phenotype-based"`) column name of the key antimicrobials to determine first (weighted) isolates, see [key_antimicrobials()]. Defaults to the first column that starts with 'key' followed by 'ab' or 'antibiotics' or 'antimicrobials' (case insensitive). Use `col_keyantimicrobials = FALSE` to prevent this. Can also be the output of [key_antimicrobials()].
#' @param episode_days episode in days after which a genus/species combination will be determined as 'first isolate' again. The default of 365 days is based on the guideline by CLSI, see *Source*. 
#' @param testcodes_exclude a [character] vector with test codes that should be excluded (case-insensitive)
#' @param icu_exclude a [logical] to indicate whether ICU isolates should be excluded (rows with value `TRUE` in the column set with `col_icu`)
#' @param specimen_group value in the column set with `col_specimen` to filter on
#' @param type type to determine weighted isolates; can be `"keyantimicrobials"` or `"points"`, see *Details*
#' @param method the method to apply, either `"phenotype-based"`, `"episode-based"`, `"patient-based"` or `"isolate-based"` (can be abbreviated), see *Details*. The default is `"phenotype-based"` if antimicrobial test results are present in the data, and `"episode-based"` otherwise.
#' @param ignore_I a [logical] to indicate whether antibiotic interpretations with `"I"` will be ignored when `type = "keyantimicrobials"`, see *Details*
#' @param points_threshold minimum number of points to require before differences in the antibiogram will lead to inclusion of an isolate when `type = "points"`, see *Details*
#' @param info a [logical] to indicate whether info should be printed, defaults to `TRUE` only in interactive mode
#' @param include_unknown a [logical] to indicate whether 'unknown' microorganisms should be included too, i.e. microbial code `"UNKNOWN"`, which defaults to `FALSE`. For WHONET users, this means that all records with organism code `"con"` (*contamination*) will be excluded at default. Isolates with a microbial ID of `NA` will always be excluded as first isolate.
#' @param include_untested_rsi a [logical] to indicate whether rows without antibiotic results are still eligible for becoming a first isolate. Use `include_untested_rsi = FALSE` to always return `FALSE` for such rows. This checks the data set for columns of class `<rsi>` and consequently requires transforming columns with antibiotic results using [as.rsi()] first.
#' @param ... arguments passed on to [first_isolate()] when using [filter_first_isolate()], otherwise arguments passed on to [key_antimicrobials()] (such as `universal`, `gram_negative`, `gram_positive`)
#' @details 
#' To conduct epidemiological analyses on antimicrobial resistance data, only so-called first isolates should be included to prevent overestimation and underestimation of antimicrobial resistance. Different methods can be used to do so, see below.
#' 
#' These functions are context-aware. This means that the `x` argument can be left blank, see *Examples*.
#' 
#' The [first_isolate()] function is a wrapper around the [is_new_episode()] function, but more efficient for data sets containing microorganism codes or names.
#' 
#' All isolates with a microbial ID of `NA` will be excluded as first isolate.
#' 
#' ## Different methods
#' 
#' According to Hindler *et al.* (2007, \doi{10.1086/511864}), there are different methods (algorithms) to select first isolates with increasing reliability: isolate-based, patient-based, episode-based and phenotype-based. All methods select on a combination of the taxonomic genus and species (not subspecies). 
#' 
#' All mentioned methods are covered in the [first_isolate()] function:
#' 
#' 
#' | **Method**                                       | **Function to apply**                                 |
#' |--------------------------------------------------|-------------------------------------------------------|
#' | **Isolate-based**                                | `first_isolate(x, method = "isolate-based")`          |
#' | *(= all isolates)*                               |                                                       |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Patient-based**                                | `first_isolate(x, method = "patient-based")`          |
#' | *(= first isolate per patient)*                  |                                                       |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Episode-based**                                | `first_isolate(x, method = "episode-based")`, or:     |
#' | *(= first isolate per episode)*                  |                                                       |
#' | - 7-Day interval from initial isolate            | - `first_isolate(x, method = "e", episode_days = 7)`  |
#' | - 30-Day interval from initial isolate           | - `first_isolate(x, method = "e", episode_days = 30)` |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Phenotype-based**                              | `first_isolate(x, method = "phenotype-based")`, or:   |
#' | *(= first isolate per phenotype)*                |                                                       |
#' | - Major difference in any antimicrobial result   | - `first_isolate(x, type = "points")`                 |
#' | - Any difference in key antimicrobial results    | - `first_isolate(x, type = "keyantimicrobials")`      |
#' 
#' ### Isolate-based
#' 
#' This method does not require any selection, as all isolates should be included. It does, however, respect all arguments set in the [first_isolate()] function. For example, the default setting for `include_unknown` (`FALSE`) will omit selection of rows without a microbial ID.
#' 
#' ### Patient-based
#' 
#' To include every genus-species combination per patient once, set the `episode_days` to `Inf`. Although often inappropriate, this method makes sure that no duplicate isolates are selected from the same patient. In a large longitudinal data set, this could mean that isolates are *excluded* that were found years after the initial isolate.
#' 
#' ### Episode-based
#' 
#' To include every genus-species combination per patient episode once, set the `episode_days` to a sensible number of days. Depending on the type of analysis, this could be 14, 30, 60 or 365. Short episodes are common for analysing specific hospital or ward data, long episodes are common for analysing regional and national data.
#' 
#' This is the most common method to correct for duplicate isolates. Patients are categorised into episodes based on their ID and dates (e.g., the date of specimen receipt or laboratory result). While this is a common method, it does not take into account antimicrobial test results. This means that e.g. a methicillin-resistant *Staphylococcus aureus* (MRSA) isolate cannot be differentiated from a wildtype *Staphylococcus aureus* isolate.
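#' 
#' As a rough illustration of the patient-based and episode-based selection described above, the base R sketch below flags the first isolate of each episode for a single patient and a single genus-species combination. The helper `new_episode_sketch()` is hypothetical and not part of this package. It assumes that a new episode starts once at least `episode_days` days have passed since the start of the current episode; see [is_new_episode()] for the actual implementation.
#' 
#' ```r
#' # hypothetical helper for illustration only, not the package implementation
#' new_episode_sketch <- function(dates, episode_days) {
#'   dates <- sort(as.Date(dates))
#'   out <- logical(length(dates))
#'   if (length(dates) == 0) {
#'     return(out)
#'   }
#'   out[1] <- TRUE # the earliest isolate always starts an episode
#'   episode_start <- dates[1]
#'   for (i in seq_along(dates)[-1]) {
#'     # assumption: a new episode starts after at least `episode_days` days
#'     if (as.numeric(dates[i] - episode_start) >= episode_days) {
#'       out[i] <- TRUE
#'       episode_start <- dates[i]
#'     }
#'   }
#'   out
#' }
#' 
#' dates <- as.Date(c("2021-01-01", "2021-01-10", "2021-02-15", "2021-03-01"))
#' new_episode_sketch(dates, episode_days = 30)  # TRUE FALSE TRUE FALSE
#' new_episode_sketch(dates, episode_days = Inf) # TRUE FALSE FALSE FALSE
#' ```
#' 
#' With `episode_days = Inf` no later isolate can start a new episode, which is why the patient-based method can be seen as a special case of the episode-based method.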
#' 
#' ### Phenotype-based
#' 
#' This is a more reliable method, since it also *weighs* the antibiogram (antimicrobial test results) yielding so-called 'first weighted isolates'. There are two different methods to weigh the antibiogram:
#' 
#' 1. Using `type = "points"` and argument `points_threshold`
#' 
#'    This method weighs *all* antimicrobial agents available in the data set. Any difference from I to S or R (or vice versa) counts as 0.5 points, a difference from S to R (or vice versa) counts as 1 point. When the sum of points exceeds `points_threshold`, which defaults to `2`, an isolate will be selected as a first weighted isolate.
#'    
#'    All antimicrobials are internally selected using the [all_antimicrobials()] function. The output of this function does not need to be passed to the [first_isolate()] function.
#' 
#'       
#' 2. Using `type = "keyantimicrobials"` and argument `ignore_I`
#' 
#'    This method only weighs specific antimicrobial agents, called *key antimicrobials*. Any difference from S to R (or vice versa) in these key antimicrobials will select an isolate as a first weighted isolate. With `ignore_I = FALSE`, also differences from I to S or R (or vice versa) will lead to this. 
#'    
#'    Key antimicrobials are internally selected using the [key_antimicrobials()] function, but can also be added manually as a variable to the data and set in the `col_keyantimicrobials` argument. Another option is to pass the output of the [key_antimicrobials()] function directly to the `col_keyantimicrobials` argument.
#'    
#'    
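#' As a rough illustration of the two weighing methods, the base R sketch below compares two antibiograms, written here as strings with one S/I/R character per antimicrobial agent (an assumption made for this example). The helper `compare_sketch()` is hypothetical and not part of this package. Whether the sum of points has to reach or to exceed `points_threshold` is deliberately left open: the example totals are chosen so that both readings lead to the same conclusion with the default threshold of `2`.
#' 
#' ```r
#' # hypothetical helper for illustration only, not the package implementation
#' compare_sketch <- function(a, b) {
#'   # map S/I/R to 1/2/3, so that S <-> R differs by 2 and I <-> S/R by 1
#'   lvl <- c(S = 1, I = 2, R = 3)
#'   a <- lvl[strsplit(a, "")[[1]]]
#'   b <- lvl[strsplit(b, "")[[1]]]
#'   list(
#'     # type = "points": 1 point per S <-> R change, 0.5 per change to/from I
#'     points = sum(abs(a - b)) / 2,
#'     # type = "keyantimicrobials": any S <-> R change counts
#'     any_S_R_change = any(abs(a - b) == 2),
#'     # with ignore_I = FALSE, changes to or from I count as well
#'     any_change = any(a != b)
#'   )
#' }
#' 
#' compare_sketch("SSSRI", "RSRRS")
#' # points = 2.5, any_S_R_change = TRUE, any_change = TRUE
#' compare_sketch("SSSRI", "SSSRS")
#' # points = 0.5, any_S_R_change = FALSE, any_change = TRUE
#' ```
#' 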
#' The default method is phenotype-based (using `type = "points"`) and episode-based (using `episode_days = 365`). This makes sure that every genus-species combination is selected per patient once per year, while taking into account all antimicrobial test results. If no antimicrobial test results are available in the data set, only the episode-based method is applied at default.
#' @rdname first_isolate
#' @seealso [key_antimicrobials()]
#' @export
#' @return A [`logical`] vector
#' @source Methodology of this function is strictly based on:
#' 
#' - **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 4th Edition**, 2014, *Clinical and Laboratory Standards Institute (CLSI)*. <https://clsi.org/standards/products/microbiology/documents/m39/>.
#' 
#' - Hindler JF and Stelling J (2007). **Analysis and Presentation of Cumulative Antibiograms: A New Consensus Guideline from the Clinical and Laboratory Standards Institute.** Clinical Infectious Diseases, 44(6), 867–873. \doi{10.1086/511864}
#' @inheritSection AMR Read more on Our Website!
#' @examples
#' # `example_isolates` is a data set available in the AMR package.
#' # See ?example_isolates.
#' 
#' example_isolates[first_isolate(example_isolates), ]
#' \donttest{
#' # faster way, only works in R 3.2 and later:
#' example_isolates[first_isolate(), ]
#' 
#' # get all first Gram-negatives
#' example_isolates[which(first_isolate() & mo_is_gram_negative()), ]
#'
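#' # A sketch of two alternatives listed in the table under 'Details',
#' # with `x` passed explicitly: episode-based selection with 30-day
#' # episodes, and phenotype-based selection on key antimicrobials.
#' example_isolates[first_isolate(example_isolates,
#'                                method = "episode-based",
#'                                episode_days = 30), ]
#' example_isolates[first_isolate(example_isolates,
#'                                type = "keyantimicrobials"), ]
#'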
#' if (require("dplyr")) {
#'   # filter on first isolates using dplyr:
#'   example_isolates %>%
#'     filter(first_isolate())
#'  
#'   # short-hand version:
#'   example_isolates %>%
#'     filter_first_isolate()
#'     
#'   # grouped determination of first isolates (also prints group names):
#'   example_isolates %>%
#'     group_by(hospital_id) %>%
#'     mutate(first = first_isolate())
#'   
#'   # now let's see if first isolates matter:
#'   A <- example_isolates %>%
#'     group_by(hospital_id) %>%
#'     summarise(count = n_rsi(GEN),            # gentamicin availability
#'               resistance = resistance(GEN))  # gentamicin resistance
#'  
#'   B <- example_isolates %>%
#'     filter_first_isolate() %>%               # the 1st isolate filter
#'     group_by(hospital_id) %>%
#'     summarise(count = n_rsi(GEN),            # gentamicin availability
#'               resistance = resistance(GEN))  # gentamicin resistance
#'  
#'   # Have a look at A and B.
#'   # B is more reliable because every isolate is counted only once.
#'   # Gentamicin resistance in hospital D appears to be 4.2% higher than
#'   # when you (erroneously) would have used all isolates for analysis.
#' }
#' }
first_isolate <- function(x = NULL,
                          col_date = NULL,
                          col_patient_id = NULL,
                          col_mo = NULL,
                          col_testcode = NULL,
                          col_specimen = NULL,
                          col_icu = NULL,
                          col_keyantimicrobials = NULL,
                          episode_days = 365,
                          testcodes_exclude = NULL,
                          icu_exclude = FALSE,
                          specimen_group = NULL,
                          type = "points",
                          method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"),
                          ignore_I = TRUE,
                          points_threshold = 2,
                          info = interactive(),
                          include_unknown = FALSE,
                          include_untested_rsi = TRUE,
                          ...) {
  
  dots <- unlist(list(...))
  if (length(dots) != 0) {
    # backwards compatibility with old arguments
    dots.names <- names(dots)
    if ("filter_specimen" %in% dots.names) {
      specimen_group <- dots[which(dots.names == "filter_specimen")]
    }
    if ("col_keyantibiotics" %in% dots.names) {
      col_keyantimicrobials <- dots[which(dots.names == "col_keyantibiotics")]
    }
  }
  
  if (is_null_or_grouped_tbl(x)) {
    # when `x` is left blank, determine it automatically (get_current_data() also covers dplyr::cur_data_all());
    # this is also a fix for using a grouped data frame as input (a dot as first argument)
    x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x)
  }
  meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0
  meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_testcode, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  if (isFALSE(col_specimen)) {
    col_specimen <- NULL
  }
  meet_criteria(col_specimen, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_icu, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  # method
  method <- coerce_method(method)
  meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based", "p", "e", "i"))
  # key antimicrobials
  if (length(col_keyantimicrobials) > 1) {
    meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = nrow(x))
    x$keyabcol <- col_keyantimicrobials
    col_keyantimicrobials <- "keyabcol"
  } else {
    if (isFALSE(col_keyantimicrobials)) {
      col_keyantimicrobials <- NULL
      # method cannot be phenotype-based anymore
      if (method == "phenotype-based") {
        method <- "episode-based"
      }
    }
    meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  }
  meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE)
  meet_criteria(testcodes_exclude, allow_class = "character", allow_NULL = TRUE)
  meet_criteria(icu_exclude, allow_class = "logical", has_length = 1)
  meet_criteria(specimen_group, allow_class = "character", has_length = 1, allow_NULL = TRUE)
  meet_criteria(type, allow_class = "character", has_length = 1)
  meet_criteria(ignore_I, allow_class = "logical", has_length = 1)
  meet_criteria(points_threshold, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE)
  meet_criteria(info, allow_class = "logical", has_length = 1)
  meet_criteria(include_unknown, allow_class = "logical", has_length = 1)
  meet_criteria(include_untested_rsi, allow_class = "logical", has_length = 1)
  
  # remove data.table, grouping from tibbles, etc.
  x <- as.data.frame(x, stringsAsFactors = FALSE)
  
  any_col_contains_rsi <- any(vapply(FUN.VALUE = logical(1), 
                                     X = x, 
                                     FUN = function(x) any(as.character(x) %in% c("R", "S", "I"), na.rm = TRUE),
                                     USE.NAMES = FALSE))
  if (method == "phenotype-based" & !any_col_contains_rsi) {
    method <- "episode-based"
  }
  if (info == TRUE & message_not_thrown_before("first_isolate.method")) {
    message_(paste0("Determining first isolates using the '", font_bold(method), "' method",
                    ifelse(method %in% c("episode-based", "phenotype-based"),
                           ifelse(is.infinite(episode_days),
                                  " without a specified episode length",
                                  paste(" and an episode length of", episode_days, "days")),
                           "")),
             as_note = FALSE,
             add_fn = font_black)
    remember_thrown_message("first_isolate.method")
  }
  
  # try to find columns based on type
  # -- mo
  if (is.null(col_mo)) {
    col_mo <- search_type_in_df(x = x, type = "mo")
    stop_if(is.null(col_mo), "`col_mo` must be set")
  }
  
  # methods ----
  if (method == "isolate-based") {
    episode_days <- Inf
    col_keyantimicrobials <- NULL
    x$dummy_dates <- Sys.Date()
    col_date <- "dummy_dates"
    x$dummy_patients <- paste("dummy", seq_len(nrow(x))) # all 'patients' must be unique
    col_patient_id <- "dummy_patients"
  } else if (method == "patient-based") {
    episode_days <- Inf
    col_keyantimicrobials <- NULL
  } else if (method == "episode-based") {
    col_keyantimicrobials <- NULL
  } else if (method == "phenotype-based") {
    if (missing(type) & !is.null(col_keyantimicrobials)) {
      # type = "points" is default, but not set explicitly, while col_keyantimicrobials is
      type <- "keyantimicrobials"
    }
    if (type == "points") {
      x$keyantimicrobials <- all_antimicrobials(x, only_rsi_columns = FALSE)
      col_keyantimicrobials <- "keyantimicrobials"
    } else if (type == "keyantimicrobials" & is.null(col_keyantimicrobials)) {
      col_keyantimicrobials <- search_type_in_df(x = x, type = "keyantibiotics")
      if (is.null(col_keyantimicrobials)) {
        # still not found as a column, create it ourselves
        x$keyantimicrobials <- key_antimicrobials(x, only_rsi_columns = FALSE, col_mo = col_mo, ...)
        col_keyantimicrobials <- "keyantimicrobials"
      }
    }
  }
  
  # -- date
  if (is.null(col_date)) {
    col_date <- search_type_in_df(x = x, type = "date")
    stop_if(is.null(col_date), "`col_date` must be set")
  }
  
  # -- patient id
  if (is.null(col_patient_id)) {
    if (all(c("First name", "Last name", "Sex") %in% colnames(x))) {
      # WHONET support
      x$patient_id <- paste(x$`First name`, x$`Last name`, x$Sex)
      col_patient_id <- "patient_id"
      message_("Using combined columns '", font_bold("First name"), "', '", font_bold("Last name"), "' and '", font_bold("Sex"), "' as input for `col_patient_id`")
    } else {
      col_patient_id <- search_type_in_df(x = x, type = "patient_id")
    }
    stop_if(is.null(col_patient_id), "`col_patient_id` must be set")
  }

  # -- specimen
  if (is.null(col_specimen) & !is.null(specimen_group)) {
    col_specimen <- search_type_in_df(x = x, type = "specimen")
  }
  
  # check if columns exist
  check_columns_existance <- function(column, tblname = x) {
    if (!is.null(column)) {
      stop_ifnot(column %in% colnames(tblname),
                 "Column '", column, "' not found.", call = FALSE)
    }
  }
  
  check_columns_existance(col_date)
  check_columns_existance(col_patient_id)
  check_columns_existance(col_mo)
  check_columns_existance(col_testcode)
  check_columns_existance(col_icu)
  check_columns_existance(col_keyantimicrobials)
  
  # convert dates to Date
  dates <- as.Date(x[, col_date, drop = TRUE])
  dates[is.na(dates)] <- as.Date("1970-01-01")
  x[, col_date] <- dates
  
  # create original row index
  x$newvar_row_index <- seq_len(nrow(x))
  x$newvar_mo <- as.mo(x[, col_mo, drop = TRUE])
  x$newvar_genus_species <- paste(mo_genus(x$newvar_mo), mo_species(x$newvar_mo))
  x$newvar_date <- x[, col_date, drop = TRUE]
  x$newvar_patient_id <- x[, col_patient_id, drop = TRUE]
  
  if (is.null(col_testcode)) {
    testcodes_exclude <- NULL
  }
  # remove testcodes
  if (!is.null(testcodes_exclude) & info == TRUE & message_not_thrown_before("first_isolate.excludingtestcodes")) {
    message_("Excluding test codes: ", toString(paste0("'", testcodes_exclude, "'")),
             add_fn = font_black,
             as_note = FALSE)
    remember_thrown_message("first_isolate.excludingtestcodes")
  }
  
  if (is.null(col_specimen)) {
    specimen_group <- NULL
  }
  
  # filter on specimen group and keyantibiotics when they are filled in
  if (!is.null(specimen_group)) {
    check_columns_existance(col_specimen, x)
    if (info == TRUE & message_not_thrown_before("first_isolate.excludingspecimen")) {
      message_("Excluding other than specimen group '", specimen_group, "'",
               add_fn = font_black,
               as_note = FALSE)
      remember_thrown_message("first_isolate.excludingspecimen")
    }
  }
  if (!is.null(col_keyantimicrobials)) {
    x$newvar_key_ab <- x[, col_keyantimicrobials, drop = TRUE]
  }
  
  if (is.null(testcodes_exclude)) {
    testcodes_exclude <- ""
  }
  
  # arrange data to the right sorting
  if (is.null(specimen_group)) {
    x <- x[order(x$newvar_patient_id, 
                 x$newvar_genus_species,
                 x$newvar_date), ]
    rownames(x) <- NULL
    row.start <- 1
    row.end <- nrow(x)
  } else {
    # filtering on specimen and only analyse these rows to save time
    x <- x[order(pm_pull(x, col_specimen),
                 x$newvar_patient_id, 
                 x$newvar_genus_species,
                 x$newvar_date), ]
    rownames(x) <- NULL
    suppressWarnings(
      row.start <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% min(na.rm = TRUE)
    )
    suppressWarnings(
      row.end <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% max(na.rm = TRUE)
    )
  }
  
  # speed up - return immediately if obvious
  if (abs(row.start) == Inf | abs(row.end) == Inf) {
    if (info == TRUE) {
      message_("=> Found ", font_bold("no isolates"),
               add_fn = font_black, 
               as_note = FALSE)
    }
    return(rep(FALSE, nrow(x)))
  }
  if (row.start == row.end) {
    if (info == TRUE) {
      message_("=> Found ", font_bold("1 first isolate"), ", as the data only contained 1 row", 
               add_fn = font_black,
               as_note = FALSE)
    }
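    # return a vector as long as the input, in the original row order
    first_isolates <- rep(FALSE, nrow(x))
    first_isolates[x$newvar_row_index[row.start]] <- TRUE
    return(first_isolates)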
  }
  if (length(c(row.start:row.end)) == pm_n_distinct(x[c(row.start:row.end), col_mo, drop = TRUE])) {
    if (info == TRUE) {
      message_("=> Found ", font_bold(paste(length(c(row.start:row.end)), "first isolates")),
               ", as all isolates were different microorganisms",
               add_fn = font_black,
               as_note = FALSE)
    }
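    # return a vector as long as the input, in the original row order
    first_isolates <- rep(FALSE, nrow(x))
    first_isolates[x$newvar_row_index[c(row.start:row.end)]] <- TRUE
    return(first_isolates)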
  }
  
  # did find some isolates - add new index numbers of rows
  x$newvar_row_index_sorted <- seq_len(nrow(x))
  
  # note: `row.start + 1:row.end` would be parsed as `row.start + (1:row.end)`
  scope.size <- nrow(x[which(x$newvar_row_index_sorted %in% c(row.start:row.end) &
                               !is.na(x$newvar_mo)), , drop = FALSE])
  
  # Analysis of first isolate ----
  x$other_pat_or_mo <- ifelse(x$newvar_patient_id == pm_lag(x$newvar_patient_id) &
                                x$newvar_genus_species == pm_lag(x$newvar_genus_species),
                              FALSE,
                              TRUE)
  x$episode_group <- paste(x$newvar_patient_id, x$newvar_genus_species)
  x$more_than_episode_ago <- unlist(lapply(split(x$newvar_date,
                                                 x$episode_group), 
                                           is_new_episode,
                                           episode_days = episode_days),
                                    use.names = FALSE)
  
  weighted.notice <- ""
  if (!is.null(col_keyantimicrobials)) {
    weighted.notice <- "weighted "
    if (info == TRUE & message_not_thrown_before("first_isolate.type")) {
      if (type == "keyantimicrobials") {
        message_("Basing inclusion on key antimicrobials, ",
                 ifelse(ignore_I == FALSE, "not ", ""),
                 "ignoring I",
                 add_fn = font_black,
                 as_note = FALSE)
      }
      if (type == "points") {
        message_("Basing inclusion on all antimicrobial results, using a points threshold of "
                 , points_threshold,
                 add_fn = font_black,
                 as_note = FALSE)
      }
      remember_thrown_message("first_isolate.type")
    }
    type_param <- type
    
    x$other_key_ab <- !antimicrobials_equal(y = x$newvar_key_ab,
                                            z = pm_lag(x$newvar_key_ab),
                                            type = type_param,
                                            ignore_I = ignore_I,
                                            points_threshold = points_threshold)
    # with key antibiotics
    x$newvar_first_isolate <- pm_if_else(x$newvar_row_index_sorted >= row.start &
                                           x$newvar_row_index_sorted <= row.end &
                                           x$newvar_genus_species != "" & 
                                           (x$other_pat_or_mo | x$more_than_episode_ago | x$other_key_ab),
                                         TRUE,
                                         FALSE)
    
  } else {
    # no key antibiotics
    x$newvar_first_isolate <- pm_if_else(x$newvar_row_index_sorted >= row.start &
                                           x$newvar_row_index_sorted <= row.end &
                                           x$newvar_genus_species != "" & 
                                           (x$other_pat_or_mo | x$more_than_episode_ago),
                                         TRUE,
                                         FALSE)
  }
  
  # first one as TRUE
  x[row.start, "newvar_first_isolate"] <- TRUE
  # exclude test codes that should not be included (case-insensitive), and ICU isolates
  if (!is.null(col_testcode)) {
    x[which(tolower(x[, col_testcode, drop = TRUE]) %in% tolower(testcodes_exclude)), "newvar_first_isolate"] <- FALSE
  }
  if (!is.null(col_icu)) {
    if (icu_exclude == TRUE) {
      # only print when `info` is set, like the other messages in this function
      if (info == TRUE) {
        message_("Excluding isolates from ICU.",
                 add_fn = font_black,
                 as_note = FALSE)
      }
      x[which(as.logical(x[, col_icu, drop = TRUE])), "newvar_first_isolate"] <- FALSE
    } else if (info == TRUE) {
      message_("Including isolates from ICU.",
               add_fn = font_black,
               as_note = FALSE)
    }
  }
  
  decimal.mark <- getOption("OutDec")
  big.mark <- ifelse(decimal.mark != ",", ",", ".")
  
  if (info == TRUE) {
    # print group name if used in dplyr::group_by()
    cur_group <- import_fn("cur_group", "dplyr", error_on_fail = FALSE)
    if (!is.null(cur_group)) {
      group_df <- tryCatch(cur_group(), error = function(e) data.frame())
      if (NCOL(group_df) > 0) {
        # transform factors to characters
        group <- vapply(FUN.VALUE = character(1), group_df, function(x) {
          if (is.numeric(x)) {
            format(x)
          } else if (is.logical(x)) {
            as.character(x)
          } else {
            paste0('"', x, '"')
          }
        })
        message_("\nGroup: ", paste0(names(group), " = ", group, collapse = ", "), "\n",
                 as_note = FALSE,
                 add_fn = font_red)
      }
    }
  }
  
  # handle empty microorganisms
  if (any(x$newvar_mo == "UNKNOWN", na.rm = TRUE) & info == TRUE) {
    message_(ifelse(include_unknown == TRUE, "Included ", "Excluded "), 
             format(sum(x$newvar_mo == "UNKNOWN", na.rm = TRUE),
                    decimal.mark = decimal.mark, big.mark = big.mark), 
             " isolates with a microbial ID 'UNKNOWN' (in column '", font_bold(col_mo), "')")
  }
  x[which(x$newvar_mo == "UNKNOWN"), "newvar_first_isolate"] <- include_unknown
  
  # exclude all NAs
  if (any(is.na(x$newvar_mo)) & info == TRUE) {
    message_("Excluded ", format(sum(is.na(x$newvar_mo), na.rm = TRUE),
                                 decimal.mark = decimal.mark, big.mark = big.mark), 
             " isolates with a microbial ID 'NA' (in column '", font_bold(col_mo), "')")
  }
  x[which(is.na(x$newvar_mo)), "newvar_first_isolate"] <- FALSE
  
  # handle isolates without antibiogram
  if (include_untested_rsi == FALSE && any(is.rsi(x))) {
    rsi_all_NA <- which(unname(vapply(FUN.VALUE = logical(1), 
                                      as.data.frame(t(x[, is.rsi(x), drop = FALSE])),
                                      function(rsi_values) all(is.na(rsi_values)))))
    x[rsi_all_NA, "newvar_first_isolate"] <- FALSE
  }
  
  # restore the original row order
  x <- x[order(x$newvar_row_index), ]
  rownames(x) <- NULL
  
  if (info == TRUE) {
    n_found <- sum(x$newvar_first_isolate, na.rm = TRUE)
    p_found_total <- percentage(n_found / sum(!is.na(x$newvar_mo)), digits = 1)
    p_found_scope <- percentage(n_found / scope.size, digits = 1)
    if (p_found_total %unlike% "[.]") {
      p_found_total <- gsub("%", ".0%", p_found_total, fixed = TRUE)
    }
    if (p_found_scope %unlike% "[.]") {
      p_found_scope <- gsub("%", ".0%", p_found_scope, fixed = TRUE)
    }
    # format the number of first isolates found
    n_found <- format(n_found, big.mark = big.mark, decimal.mark = decimal.mark)
    if (p_found_total != p_found_scope) {
      msg_txt <- paste0("=> Found ",
                        font_bold(paste0(n_found, " first ", weighted.notice, "isolates")),
                        " (", method, ", ", p_found_scope, " within scope and ", p_found_total, " of total where a microbial ID was available)")
    } else {
      msg_txt <- paste0("=> Found ",
                        font_bold(paste0(n_found, " first ", weighted.notice, "isolates")),
                        " (", method, ", ", p_found_total, " of total where a microbial ID was available)")
    }
    message_(msg_txt, add_fn = font_black, as_note = FALSE)
  }
  
  x$newvar_first_isolate
  
}

#' @rdname first_isolate
#' @export
filter_first_isolate <- function(x = NULL,
                                 col_date = NULL,
                                 col_patient_id = NULL,
                                 col_mo = NULL,
                                 episode_days = 365,
                                 method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"),
                                 ...) {
  if (is_null_or_grouped_tbl(x)) {
    # when `x` is left blank, determine it automatically (get_current_data() also covers dplyr::cur_data_all());
    # this also handles a grouped data frame passed as input (a dot as first argument)
    x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x)
  }
  meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0
  meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE)
  method <- coerce_method(method)
  meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based", "p", "e", "i"))
  
  subset(x, first_isolate(x = x,
                          col_date = col_date,
                          col_patient_id = col_patient_id,
                          col_mo = col_mo,
                          episode_days = episode_days,
                          method = method,
                          ...))
}

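# normalise `method` to its full name; accepts abbreviations such as "p", "e" and "i"
# (note that "p" expands to "phenotype-based", not to "patient-based")
coerce_method <- function(method) {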
  if (is.null(method)) {
    return(method)
  }
  method <- tolower(as.character(method[1L]))
  method[method %like% "^(p$|pheno)"] <- "phenotype-based"
  method[method %like% "^(e$|episode)"] <- "episode-based"
  method[method %like% "^patient"] <- "patient-based"
  method[method %like% "^(i$|iso)"] <- "isolate-based"
  method
}
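
# A minimal, standalone sketch of the normalisation done by coerce_method() above,
# using only base R. `normalise_method_sketch` is a hypothetical name used purely for
# illustration, and grepl() is assumed here as a stand-in for the `%like%` operator.
normalise_method_sketch <- function(method) {
  if (is.null(method)) {
    return(NULL)
  }
  # only the first element is used, compared case-insensitively
  method <- tolower(as.character(method[1L]))
  patterns <- c("phenotype-based" = "^(p$|pheno)",
                "episode-based"   = "^(e$|episode)",
                "patient-based"   = "^patient",
                "isolate-based"   = "^(i$|iso)")
  for (full_name in names(patterns)) {
    if (grepl(patterns[[full_name]], method)) {
      return(full_name)
    }
  }
  # anything unrecognised is returned as-is, to be rejected by later validation
  method
}
# Usage:
#   normalise_method_sketch("P")        # "phenotype-based"
#   normalise_method_sketch("patient")  # "patient-based"
#   normalise_method_sketch("ISO")      # "isolate-based"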

#' Determine First (Weighted) Isolates
#'
#' Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. These functions support all four methods as summarised by Hindler *et al.* in 2007 (\doi{10.1086/511864}). To determine patient episodes not necessarily based on microorganisms, use [is_new_episode()] that also supports grouping with the `dplyr` package.
#' @inheritSection lifecycle Stable Lifecycle
#' @param x a [data.frame] containing isolates. Can be left blank for automatic determination, see *Examples*.
#' @param col_date column name of the result date (or the date it was received at the lab), defaults to the first column with a date class
#' @param col_patient_id column name of the unique IDs of the patients, defaults to the first column that starts with 'patient' or 'patid' (case insensitive)
#' @param col_mo column name of the IDs of the microorganisms (see [as.mo()]), defaults to the first column of class [`mo`]. Values will be coerced using [as.mo()].
#' @param col_testcode column name of the test codes. Use `col_testcode = NULL` to **not** exclude certain test codes (such as test codes for screening). In that case `testcodes_exclude` will be ignored.
#' @param col_specimen column name of the specimen type or group
#' @param col_icu column name of the logicals (`TRUE`/`FALSE`) whether a ward or department is an Intensive Care Unit (ICU)
#' @param col_keyantimicrobials (only useful when `method = "phenotype-based"`) column name of the key antimicrobials to determine first (weighted) isolates, see [key_antimicrobials()]. Defaults to the first column that starts with 'key' followed by 'ab' or 'antibiotics' or 'antimicrobials' (case insensitive). Use `col_keyantimicrobials = FALSE` to prevent this. Can also be the output of [key_antimicrobials()].
#' @param episode_days episode in days after which a genus/species combination will be determined as 'first isolate' again. The default of 365 days is based on the guideline by CLSI, see *Source*.
#' @param testcodes_exclude a [character] vector with test codes that should be excluded (case-insensitive)
#' @param icu_exclude a [logical] to indicate whether ICU isolates should be excluded (rows with value `TRUE` in the column set with `col_icu`)
#' @param specimen_group value in the column set with `col_specimen` to filter on
#' @param type type to determine weighted isolates; can be `"keyantimicrobials"` or `"points"`, see *Details*
#' @param method the method to apply, either `"phenotype-based"`, `"episode-based"`, `"patient-based"` or `"isolate-based"` (can be abbreviated), see *Details*. The default is `"phenotype-based"` if antimicrobial test results are present in the data, and `"episode-based"` otherwise.
#' @param ignore_I [logical] to indicate whether antibiotic interpretations with `"I"` will be ignored when `type = "keyantimicrobials"`, see *Details*
#' @param points_threshold minimum number of points to require before differences in the antibiogram will lead to inclusion of an isolate when `type = "points"`, see *Details*
#' @param info a [logical] to indicate whether info should be printed, defaults to `TRUE` only in interactive mode
#' @param include_unknown a [logical] to indicate whether 'unknown' microorganisms should be included too, i.e. microbial code `"UNKNOWN"`, which defaults to `FALSE`. For WHONET users, this means that all records with organism code `"con"` (*contamination*) will be excluded by default. Isolates with a microbial ID of `NA` will always be excluded as first isolate.
#' @param include_untested_rsi a [logical] to indicate whether rows without antibiotic results are also eligible for becoming a first isolate. Use `include_untested_rsi = FALSE` to always return `FALSE` for such rows. This checks the data set for columns of class `<rsi>` and consequently requires transforming columns with antibiotic results using [as.rsi()] first.
#' @param ... arguments passed on to [first_isolate()] when using [filter_first_isolate()], otherwise arguments passed on to [key_antimicrobials()] (such as `universal`, `gram_negative`, `gram_positive`)
#' @details
#' To conduct epidemiological analyses on antimicrobial resistance data, only so-called first isolates should be included to prevent overestimation and underestimation of antimicrobial resistance. Different methods can be used to do so, see below.
#'
#' These functions are context-aware. This means that the `x` argument can be left blank, see *Examples*.
#'
#' The [first_isolate()] function is a wrapper around the [is_new_episode()] function, but more efficient for data sets containing microorganism codes or names.
#'
#' All isolates with a microbial ID of `NA` will be excluded as first isolate.
#'
#' ## Different methods
#'
#' According to Hindler *et al.* (2007, \doi{10.1086/511864}), there are different methods (algorithms) to select first isolates with increasing reliability: isolate-based, patient-based, episode-based and phenotype-based. All methods select on a combination of the taxonomic genus and species (not subspecies).
#'
#' All mentioned methods are covered in the [first_isolate()] function:
#'
#'
#' | **Method**                                       | **Function to apply**                                 |
#' |--------------------------------------------------|-------------------------------------------------------|
#' | **Isolate-based**                                | `first_isolate(x, method = "isolate-based")`          |
#' | *(= all isolates)*                               |                                                       |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Patient-based**                                | `first_isolate(x, method = "patient-based")`          |
#' | *(= first isolate per patient)*                  |                                                       |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Episode-based**                                | `first_isolate(x, method = "episode-based")`, or:     |
#' | *(= first isolate per episode)*                  |                                                       |
#' | - 7-Day interval from initial isolate            | - `first_isolate(x, method = "e", episode_days = 7)`  |
#' | - 30-Day interval from initial isolate           | - `first_isolate(x, method = "e", episode_days = 30)` |
#' |                                                  |                                                       |
#' |                                                  |                                                       |
#' | **Phenotype-based**                              | `first_isolate(x, method = "phenotype-based")`, or:   |
#' | *(= first isolate per phenotype)*                |                                                       |
#' | - Major difference in any antimicrobial result   | - `first_isolate(x, type = "points")`                 |
#' | - Any difference in key antimicrobial results    | - `first_isolate(x, type = "keyantimicrobials")`      |
#'
#' ### Isolate-based
#'
#' This method does not require any selection, as all isolates should be included. It does, however, respect all arguments set in the [first_isolate()] function. For example, the default setting for `include_unknown` (`FALSE`) will omit selection of rows without a microbial ID.
#'
#' ### Patient-based
#'
#' To include every genus-species combination per patient once, set the `episode_days` to `Inf`. Although often inappropriate, this method makes sure that no duplicate isolates are selected from the same patient. In a large longitudinal data set, this could mean that isolates are *excluded* that were found years after the initial isolate.
#'
#' ### Episode-based
#'
#' To include every genus-species combination per patient episode once, set the `episode_days` to a sensible number of days. Depending on the type of analysis, this could be 14, 30, 60 or 365. Short episodes are common for analysing specific hospital or ward data, long episodes are common for analysing regional and national data.
#'
#' This is the most common method to correct for duplicate isolates. Patients are categorised into episodes based on their ID and dates (e.g., the date of specimen receipt or laboratory result). While this is a common method, it does not take into account antimicrobial test results. This means that e.g. a methicillin-resistant *Staphylococcus aureus* (MRSA) isolate cannot be differentiated from a wildtype *Staphylococcus aureus* isolate.
#'
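#' As a rough illustration of the episode logic described above, the base R sketch below flags the first isolate of each episode for the dates of a single patient and a single genus-species combination. `first_per_episode()` is a hypothetical helper, not part of this package, and it assumes that a new episode starts once at least `episode_days` have passed since the start of the current episode; the exact boundary rule used by [first_isolate()] may differ.
#'
#' ```r
#' first_per_episode <- function(dates, episode_days) {
#'   dates <- as.Date(dates)
#'   is_first <- logical(length(dates))
#'   episode_start <- as.Date(NA)
#'   # walk through the isolates in chronological order
#'   for (i in order(dates)) {
#'     days_since_start <- as.numeric(difftime(dates[i], episode_start, units = "days"))
#'     if (is.na(episode_start) || days_since_start >= episode_days) {
#'       is_first[i] <- TRUE        # this isolate opens a new episode
#'       episode_start <- dates[i]
#'     }
#'   }
#'   is_first
#' }
#'
#' dates <- c("2021-01-01", "2021-01-05", "2021-01-20", "2021-03-01")
#' first_per_episode(dates, episode_days = 14)   # TRUE FALSE TRUE TRUE
#' first_per_episode(dates, episode_days = Inf)  # TRUE FALSE FALSE FALSE (patient-based)
#' ```
#'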
#' ### Phenotype-based
#'
#' This is a more reliable method, since it also *weighs* the antibiogram (antimicrobial test results) yielding so-called 'first weighted isolates'. There are two different methods to weigh the antibiogram:
#'
#' 1. Using `type = "points"` and argument `points_threshold`
#'
#'    This method weighs *all* antimicrobial agents available in the data set. Any difference from I to S or R (or vice versa) counts as 0.5 points, a difference from S to R (or vice versa) counts as 1 point. When the sum of points exceeds `points_threshold`, which defaults to `2`, an isolate will be selected as a first weighted isolate.
#'
#'    All antimicrobials are internally selected using the [all_antimicrobials()] function. The output of this function does not need to be passed to the [first_isolate()] function.
#'
#'
#' 2. Using `type = "keyantimicrobials"` and argument `ignore_I`
#'
#'    This method only weighs specific antimicrobial agents, called *key antimicrobials*. Any difference from S to R (or vice versa) in these key antimicrobials will select an isolate as a first weighted isolate. With `ignore_I = FALSE`, also differences from I to S or R (or vice versa) will lead to this.
#'
#'    Key antimicrobials are internally selected using the [key_antimicrobials()] function, but can also be added manually as a variable to the data and set in the `col_keyantimicrobials` argument. Another option is to pass the output of the [key_antimicrobials()] function directly to the `col_keyantimicrobials` argument.
#'
#'
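#' As a rough illustration of the point weighing under `type = "points"`, the base R sketch below scores the difference between two antibiograms of the same patient and species. `antibiogram_points()` is a hypothetical helper, not part of this package, and it only computes the sum of points; the comparison against `points_threshold` is left to [first_isolate()].
#'
#' ```r
#' antibiogram_points <- function(previous, current) {
#'   # S and R are one point apart, I sits halfway between them
#'   score <- c(S = 0, I = 0.5, R = 1)
#'   # untested agents (NA) do not contribute any points
#'   sum(abs(score[current] - score[previous]), na.rm = TRUE)
#' }
#'
#' # two S/R switches, two unchanged results
#' antibiogram_points(c("S", "S", "R", "I"), c("R", "S", "S", "I"))  # 2
#' # two half-point differences involving I
#' antibiogram_points(c("S", "I"), c("I", "R"))                      # 1
#' ```
#'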
#' The default method is phenotype-based (using `type = "points"`) and episode-based (using `episode_days = 365`). This makes sure that every genus-species combination is selected per patient once per year, while taking into account all antimicrobial test results. If no antimicrobial test results are available in the data set, only the episode-based method is applied by default.
#' @rdname first_isolate
#' @seealso [key_antimicrobials()]
#' @export
#' @return A [`logical`] vector
#' @source Methodology of this function is strictly based on:
#'
#' - **M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 4th Edition**, 2014, *Clinical and Laboratory Standards Institute (CLSI)*. <https://clsi.org/standards/products/microbiology/documents/m39/>.
#'
#' - Hindler JF and Stelling J (2007). **Analysis and Presentation of Cumulative Antibiograms: A New Consensus Guideline from the Clinical and Laboratory Standards Institute.** Clinical Infectious Diseases, 44(6), 867–873. \doi{10.1086/511864}
#' @inheritSection AMR Read more on Our Website!
#' @examples
#' # `example_isolates` is a data set available in the AMR package.
#' # See ?example_isolates.
#'
#' example_isolates[first_isolate(example_isolates), ]
#' \donttest{
#' # faster way, only works in R 3.2 and later:
#' example_isolates[first_isolate(), ]
#'
#' # get all first Gram-negatives
#' example_isolates[which(first_isolate() & mo_is_gram_negative()), ]
#'
#' if (require("dplyr")) {
#'   # filter on first isolates using dplyr:
#'   example_isolates %>%
#'     filter(first_isolate())
#'
#'   # short-hand version:
#'   example_isolates %>%
#'     filter_first_isolate()
#'
#'   # grouped determination of first isolates (also prints group names):
#'   example_isolates %>%
#'     group_by(hospital_id) %>%
#'     mutate(first = first_isolate())
#'
#'   # now let's see if first isolates matter:
#'   A <- example_isolates %>%
#'     group_by(hospital_id) %>%
#'     summarise(count = n_rsi(GEN),            # gentamicin availability
#'               resistance = resistance(GEN))  # gentamicin resistance
#'
#'   B <- example_isolates %>%
#'     filter_first_isolate() %>%               # the 1st isolate filter
#'     group_by(hospital_id) %>%
#'     summarise(count = n_rsi(GEN),            # gentamicin availability
#'               resistance = resistance(GEN))  # gentamicin resistance
#'
#'   # Have a look at A and B.
#'   # B is more reliable because every isolate is counted only once.
#'   # Gentamicin resistance in hospital D appears to be 4.2% higher than
#'   # when you would (erroneously) have used all isolates for analysis.
 								#' }
-												first commit

											
										
										
											2018-02-21 11:52:31 +01:00
+								#' }
-												(v1.5.0.9016) only_rsi_columns update, documentation

											
										
										
first_isolate <- function(x = NULL,
                          col_date = NULL,
                          col_patient_id = NULL,
                          col_mo = NULL,
                          col_testcode = NULL,
                          col_specimen = NULL,
                          col_icu = NULL,
                          col_keyantimicrobials = NULL,
                          episode_days = 365,
                          testcodes_exclude = NULL,
                          icu_exclude = FALSE,
                          specimen_group = NULL,
                          type = "points",
                          method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"),
                          ignore_I = TRUE,
                          points_threshold = 2,
                          info = interactive(),
                          include_unknown = FALSE,
                          include_untested_rsi = TRUE,
                          ...) {
  dots <- unlist(list(...))
  if (length(dots) != 0) {
    # backwards compatibility with old arguments
    dots.names <- names(dots)
    if ("filter_specimen" %in% dots.names) {
      specimen_group <- dots[which(dots.names == "filter_specimen")]
    }
    if ("col_keyantibiotics" %in% dots.names) {
      col_keyantimicrobials <- dots[which(dots.names == "col_keyantibiotics")]
    }
  }
  if (is_null_or_grouped_tbl(x)) {
    # when `x` is left blank, determine it automatically (get_current_data() also contains a call to dplyr::cur_data_all());
    # this is also a fix for using a grouped data frame as input (a dot as first argument)
    x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x)
  }
  meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0
  meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_testcode, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  if (isFALSE(col_specimen)) {
    col_specimen <- NULL
  }
  meet_criteria(col_specimen, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_icu, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  # method
  method <- coerce_method(method)
  meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based", "p", "e", "i"))
  # key antimicrobials
  if (length(col_keyantimicrobials) > 1) {
    meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = nrow(x))
    x$keyabcol <- col_keyantimicrobials
    col_keyantimicrobials <- "keyabcol"
  } else {
    if (isFALSE(col_keyantimicrobials)) {
      col_keyantimicrobials <- NULL
      # method cannot be phenotype-based anymore
      if (method == "phenotype-based") {
        method <- "episode-based"
      }
    }
    meet_criteria(col_keyantimicrobials, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  }
  meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE)
  meet_criteria(testcodes_exclude, allow_class = "character", allow_NULL = TRUE)
  meet_criteria(icu_exclude, allow_class = "logical", has_length = 1)
  meet_criteria(specimen_group, allow_class = "character", has_length = 1, allow_NULL = TRUE)
  meet_criteria(type, allow_class = "character", has_length = 1)
  meet_criteria(ignore_I, allow_class = "logical", has_length = 1)
  meet_criteria(points_threshold, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = TRUE)
  meet_criteria(info, allow_class = "logical", has_length = 1)
  meet_criteria(include_unknown, allow_class = "logical", has_length = 1)
  meet_criteria(include_untested_rsi, allow_class = "logical", has_length = 1)
  # remove data.table, grouping from tibbles, etc.
  x <- as.data.frame(x, stringsAsFactors = FALSE)
  any_col_contains_rsi <- any(vapply(FUN.VALUE = logical(1),
                                     X = x,
                                     FUN = function(x) any(as.character(x) %in% c("R", "S", "I"), na.rm = TRUE),
                                     USE.NAMES = FALSE))
  if (method == "phenotype-based" & !any_col_contains_rsi) {
    method <- "episode-based"
  }
  if (info == TRUE & message_not_thrown_before("first_isolate.method")) {
    message_(paste0("Determining first isolates using the '", font_bold(method), "' method",
                    ifelse(method %in% c("episode-based", "phenotype-based"),
                           ifelse(is.infinite(episode_days),
                                  " without a specified episode length",
                                  paste(" and an episode length of", episode_days, "days")),
                           "")),
             as_note = FALSE,
             add_fn = font_black)
    remember_thrown_message("first_isolate.method")
  }
  # try to find columns based on type
  # -- mo
  if (is.null(col_mo)) {
    col_mo <- search_type_in_df(x = x, type = "mo")
    stop_if(is.null(col_mo), "`col_mo` must be set")
  }
  # methods ----
  if (method == "isolate-based") {
    episode_days <- Inf
    col_keyantimicrobials <- NULL
    x$dummy_dates <- Sys.Date()
    col_date <- "dummy_dates"
    x$dummy_patients <- paste("dummy", seq_len(nrow(x))) # all 'patients' must be unique
    col_patient_id <- "dummy_patients"
  } else if (method == "patient-based") {
    episode_days <- Inf
    col_keyantimicrobials <- NULL
  } else if (method == "episode-based") {
    col_keyantimicrobials <- NULL
  } else if (method == "phenotype-based") {
    if (missing(type) & !is.null(col_keyantimicrobials)) {
      # `type` was left at its default ("points"), while `col_keyantimicrobials` was set explicitly
      type <- "keyantimicrobials"
    }
    if (type == "points") {
      x$keyantimicrobials <- all_antimicrobials(x, only_rsi_columns = FALSE)
      col_keyantimicrobials <- "keyantimicrobials"
    } else if (type == "keyantimicrobials" & is.null(col_keyantimicrobials)) {
      col_keyantimicrobials <- search_type_in_df(x = x, type = "keyantibiotics")
      if (is.null(col_keyantimicrobials)) {
        # still not found as a column, create it ourselves
        x$keyantimicrobials <- key_antimicrobials(x, only_rsi_columns = FALSE, col_mo = col_mo, ...)
        col_keyantimicrobials <- "keyantimicrobials"
      }
    }
  }
  # -- date
  if (is.null(col_date)) {
    col_date <- search_type_in_df(x = x, type = "date")
    stop_if(is.null(col_date), "`col_date` must be set")
  }
  # -- patient id
  if (is.null(col_patient_id)) {
    if (all(c("First name", "Last name", "Sex") %in% colnames(x))) {
      # WHONET support
      x$patient_id <- paste(x$`First name`, x$`Last name`, x$Sex)
      col_patient_id <- "patient_id"
      message_("Using combined columns '", font_bold("First name"), "', '", font_bold("Last name"), "' and '", font_bold("Sex"), "' as input for `col_patient_id`")
    } else {
      col_patient_id <- search_type_in_df(x = x, type = "patient_id")
    }
    stop_if(is.null(col_patient_id), "`col_patient_id` must be set")
  }
  # -- specimen
  if (is.null(col_specimen) & !is.null(specimen_group)) {
    col_specimen <- search_type_in_df(x = x, type = "specimen")
  }
  # check if columns exist
  check_columns_existance <- function(column, tblname = x) {
    if (!is.null(column)) {
      stop_ifnot(column %in% colnames(tblname),
                 "Column '", column, "' not found.", call = FALSE)
    }
  }

  check_columns_existance(col_date)
  check_columns_existance(col_patient_id)
  check_columns_existance(col_mo)
  check_columns_existance(col_testcode)
  check_columns_existance(col_icu)
  check_columns_existance(col_keyantimicrobials)
  # convert dates to Date
  dates <- as.Date(x[, col_date, drop = TRUE])
  dates[is.na(dates)] <- as.Date("1970-01-01")
  x[, col_date] <- dates

  # create original row index
  x$newvar_row_index <- seq_len(nrow(x))
  x$newvar_mo <- as.mo(x[, col_mo, drop = TRUE])
  x$newvar_genus_species <- paste(mo_genus(x$newvar_mo), mo_species(x$newvar_mo))
  x$newvar_date <- x[, col_date, drop = TRUE]
  x$newvar_patient_id <- x[, col_patient_id, drop = TRUE]
  if (is.null(col_testcode)) {
    testcodes_exclude <- NULL
  }
  # remove testcodes
  if (!is.null(testcodes_exclude) & info == TRUE & message_not_thrown_before("first_isolate.excludingtestcodes")) {
    message_("Excluding test codes: ", toString(paste0("'", testcodes_exclude, "'")),
             add_fn = font_black,
             as_note = FALSE)
    remember_thrown_message("first_isolate.excludingtestcodes")
  }
  if (is.null(col_specimen)) {
    specimen_group <- NULL
  }
  # filter on specimen group and key antimicrobials when they are set
  if (!is.null(specimen_group)) {
    check_columns_existance(col_specimen, x)
    if (info == TRUE & message_not_thrown_before("first_isolate.excludingspecimen")) {
      message_("Excluding other than specimen group '", specimen_group, "'",
               add_fn = font_black,
               as_note = FALSE)
      remember_thrown_message("first_isolate.excludingspecimen")
    }
  }
  if (!is.null(col_keyantimicrobials)) {
    x$newvar_key_ab <- x[, col_keyantimicrobials, drop = TRUE]
  }
  if (is.null(testcodes_exclude)) {
    testcodes_exclude <- ""
  }
  # sort the data in the required order
  if (is.null(specimen_group)) {
    x <- x[order(x$newvar_patient_id,
                 x$newvar_genus_species,
                 x$newvar_date), ]
    rownames(x) <- NULL
    row.start <- 1
    row.end <- nrow(x)
  } else {
    # sort on specimen first and only analyse the rows of this specimen group to save time
    x <- x[order(pm_pull(x, col_specimen),
                 x$newvar_patient_id,
                 x$newvar_genus_species,
                 x$newvar_date), ]
    rownames(x) <- NULL
    suppressWarnings(
      row.start <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% min(na.rm = TRUE)
    )
    suppressWarnings(
      row.end <- which(x %pm>% pm_pull(col_specimen) == specimen_group) %pm>% max(na.rm = TRUE)
    )
  }
  # speed up - return immediately if obvious
  if (abs(row.start) == Inf | abs(row.end) == Inf) {
    if (info == TRUE) {
      message_("=> Found ", font_bold("no isolates"),
               add_fn = font_black,
               as_note = FALSE)
    }
    return(rep(FALSE, nrow(x)))
  }
  if (row.start == row.end) {
    if (info == TRUE) {
      message_("=> Found ", font_bold("1 first isolate"), ", as the data only contained 1 row",
               add_fn = font_black,
               as_note = FALSE)
    }
    return(TRUE)
  }
  if (length(c(row.start:row.end)) == pm_n_distinct(x[c(row.start:row.end), col_mo, drop = TRUE])) {
    if (info == TRUE) {
      message_("=> Found ", font_bold(paste(length(c(row.start:row.end)), "first isolates")),
               ", as all isolates were different microorganisms",
               add_fn = font_black,
               as_note = FALSE)
    }
    return(rep(TRUE, length(c(row.start:row.end))))
  }
											2019-08-08 22:39:42 +02:00
  # did find some isolates - add new index numbers of rows
  x$newvar_row_index_sorted <- seq_len(nrow(x))

  # note: `row.start + 1:row.end` is evaluated as `row.start + (1:row.end)`,
  # since `:` binds more tightly than `+` in R
  scope.size <- nrow(x[which(x$newvar_row_index_sorted %in% c(row.start + 1:row.end) &
                               !is.na(x$newvar_mo)), , drop = FALSE])

  # Analysis of first isolate ----
  x$other_pat_or_mo <- ifelse(x$newvar_patient_id == pm_lag(x$newvar_patient_id) &
                                x$newvar_genus_species == pm_lag(x$newvar_genus_species),
                              FALSE,
                              TRUE)
  x$episode_group <- paste(x$newvar_patient_id, x$newvar_genus_species)
  x$more_than_episode_ago <- unlist(lapply(split(x$newvar_date,
                                                 x$episode_group),
                                           is_new_episode,
                                           episode_days = episode_days),
                                    use.names = FALSE)

  weighted.notice <- ""
  if (!is.null(col_keyantimicrobials)) {
    weighted.notice <- "weighted "
    if (info == TRUE & message_not_thrown_before("first_isolate.type")) {
      if (type == "keyantimicrobials") {
        message_("Basing inclusion on key antimicrobials, ",
                 ifelse(ignore_I == FALSE, "not ", ""),
                 "ignoring I",
                 add_fn = font_black,
                 as_note = FALSE)
      }
      if (type == "points") {
        message_("Basing inclusion on all antimicrobial results, using a points threshold of ",
                 points_threshold,
                 add_fn = font_black,
                 as_note = FALSE)
      }
      remember_thrown_message("first_isolate.type")
    }
    type_param <- type

    x$other_key_ab <- !antimicrobials_equal(y = x$newvar_key_ab,
                                            z = pm_lag(x$newvar_key_ab),
                                            type = type_param,
                                            ignore_I = ignore_I,
                                            points_threshold = points_threshold)
    # with key antibiotics
    x$newvar_first_isolate <- pm_if_else(x$newvar_row_index_sorted >= row.start &
                                           x$newvar_row_index_sorted <= row.end &
                                           x$newvar_genus_species != "" &
                                           (x$other_pat_or_mo | x$more_than_episode_ago | x$other_key_ab),
                                         TRUE,
                                         FALSE)

  } else {
    # no key antibiotics
    x$newvar_first_isolate <- pm_if_else(x$newvar_row_index_sorted >= row.start &
                                           x$newvar_row_index_sorted <= row.end &
                                           x$newvar_genus_species != "" &
                                           (x$other_pat_or_mo | x$more_than_episode_ago),
                                         TRUE,
                                         FALSE)
  }

  # the first row within scope is always a first isolate
  x[row.start, "newvar_first_isolate"] <- TRUE

  # exclude isolates with an excluded test code and, if requested, isolates from the ICU
  if (!is.null(col_testcode)) {
    x[which(x[, col_testcode] %in% tolower(testcodes_exclude)), "newvar_first_isolate"] <- FALSE
  }
  if (!is.null(col_icu)) {
    if (icu_exclude == TRUE) {
      message_("Excluding isolates from ICU.",
               add_fn = font_black,
               as_note = FALSE)
      x[which(as.logical(x[, col_icu, drop = TRUE])), "newvar_first_isolate"] <- FALSE
    } else {
      message_("Including isolates from ICU.",
               add_fn = font_black,
               as_note = FALSE)
    }
  }

  decimal.mark <- getOption("OutDec")
  big.mark <- ifelse(decimal.mark != ",", ",", ".")

  if (info == TRUE) {
    # print group name if used in dplyr::group_by()
    cur_group <- import_fn("cur_group", "dplyr", error_on_fail = FALSE)
    if (!is.null(cur_group)) {
      group_df <- tryCatch(cur_group(), error = function(e) data.frame())
      if (NCOL(group_df) > 0) {
        # transform factors to characters
        group <- vapply(FUN.VALUE = character(1), group_df, function(x) {
          if (is.numeric(x)) {
            format(x)
          } else if (is.logical(x)) {
            as.character(x)
          } else {
            paste0('"', x, '"')
          }
        })
        message_("\nGroup: ", paste0(names(group), " = ", group, collapse = ", "), "\n",
                 as_note = FALSE,
                 add_fn = font_red)
      }
    }
  }

  # handle unknown microorganisms (microbial ID 'UNKNOWN')
  if (any(x$newvar_mo == "UNKNOWN", na.rm = TRUE) & info == TRUE) {
    message_(ifelse(include_unknown == TRUE, "Included ", "Excluded "),
             format(sum(x$newvar_mo == "UNKNOWN", na.rm = TRUE),
                    decimal.mark = decimal.mark, big.mark = big.mark),
             " isolates with a microbial ID 'UNKNOWN' (in column '", font_bold(col_mo), "')")
  }
  x[which(x$newvar_mo == "UNKNOWN"), "newvar_first_isolate"] <- include_unknown

  # exclude all NAs
  if (any(is.na(x$newvar_mo)) & info == TRUE) {
    message_("Excluded ", format(sum(is.na(x$newvar_mo), na.rm = TRUE),
                                 decimal.mark = decimal.mark, big.mark = big.mark),
             " isolates with a microbial ID 'NA' (in column '", font_bold(col_mo), "')")
  }
  x[which(is.na(x$newvar_mo)), "newvar_first_isolate"] <- FALSE

  # handle isolates without antibiogram
  if (include_untested_rsi == FALSE && any(is.rsi(x))) {
    rsi_all_NA <- which(unname(vapply(FUN.VALUE = logical(1),
                                      as.data.frame(t(x[, is.rsi(x), drop = FALSE])),
                                      function(rsi_values) all(is.na(rsi_values)))))
    x[rsi_all_NA, "newvar_first_isolate"] <- FALSE
  }

  # restore the original row order
  x <- x[order(x$newvar_row_index), ]
  rownames(x) <- NULL

  if (info == TRUE) {
    n_found <- sum(x$newvar_first_isolate, na.rm = TRUE)
    p_found_total <- percentage(n_found / nrow(x[which(!is.na(x$newvar_mo)), , drop = FALSE]), digits = 1)
    p_found_scope <- percentage(n_found / scope.size, digits = 1)
    if (p_found_total %unlike% "[.]") {
      p_found_total <- gsub("%", ".0%", p_found_total, fixed = TRUE)
    }
    if (p_found_scope %unlike% "[.]") {
      p_found_scope <- gsub("%", ".0%", p_found_scope, fixed = TRUE)
    }
    # mark up number of found
    n_found <- format(n_found, big.mark = big.mark, decimal.mark = decimal.mark)
    if (p_found_total != p_found_scope) {
      msg_txt <- paste0("=> Found ",
                        font_bold(paste0(n_found, " first ", weighted.notice, "isolates")),
                        " (", method, ", ", p_found_scope, " within scope and ", p_found_total, " of total where a microbial ID was available)")
    } else {
      msg_txt <- paste0("=> Found ",
                        font_bold(paste0(n_found, " first ", weighted.notice, "isolates")),
                        " (", method, ", ", p_found_total, " of total where a microbial ID was available)")
    }
    message_(msg_txt, add_fn = font_black, as_note = FALSE)
  }

  x$newvar_first_isolate

}
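
# Illustrative sketch (not part of the package API): the `more_than_episode_ago`
# step in first_isolate() above delegates to is_new_episode(). The hypothetical
# helper `sketch_new_episode()` below shows one way such a flag could be computed
# for the sorted dates of a single patient/microorganism group. It ASSUMES that a
# new episode starts once at least `episode_days` have passed since the start of
# the current episode; the real is_new_episode() may differ in its details.
sketch_new_episode <- function(dates, episode_days = 365) {
  dates <- as.Date(dates)
  if (length(dates) == 0) {
    return(logical(0))
  }
  out <- logical(length(dates))
  # the first date of a group always opens an episode
  out[1] <- TRUE
  episode_start <- dates[1]
  for (i in seq_along(dates)[-1]) {
    # compare against the start of the current episode, not the previous date
    if (as.numeric(dates[i] - episode_start) >= episode_days) {
      out[i] <- TRUE
      episode_start <- dates[i]
    }
  }
  out
}
# Usage (dates must already be sorted within the group):
#   sketch_new_episode(c("2020-01-01", "2020-06-01", "2021-01-01"), episode_days = 365)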

#' @rdname first_isolate
#' @export
filter_first_isolate <- function(x = NULL,
                                 col_date = NULL,
                                 col_patient_id = NULL,
                                 col_mo = NULL,
                                 episode_days = 365,
                                 method = c("phenotype-based", "episode-based", "patient-based", "isolate-based"),
                                 ...) {
  if (is_null_or_grouped_tbl(x)) {
    # when `x` is left blank, determine it automatically (get_current_data() also covers dplyr::cur_data_all());
    # this also fixes using a grouped data frame as input (a dot as first argument)
    x <- tryCatch(get_current_data(arg_name = "x", call = -2), error = function(e) x)
  }
  meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0
  meet_criteria(col_date, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_patient_id, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
  meet_criteria(episode_days, allow_class = c("numeric", "integer"), has_length = 1, is_positive = TRUE, is_finite = FALSE)
  method <- coerce_method(method)
  meet_criteria(method, allow_class = "character", has_length = 1, is_in = c("phenotype-based", "episode-based", "patient-based", "isolate-based", "p", "e", "i"))

  subset(x, first_isolate(x = x,
                          col_date = col_date,
                          col_patient_id = col_patient_id,
                          col_mo = col_mo,
                          episode_days = episode_days,
                          method = method,
                          ...))
}

coerce_method <- function(method) {
  if (is.null(method)) {
    return(method)
  }
  method <- tolower(as.character(method[1L]))
  method[method %like% "^(p$|pheno)"] <- "phenotype-based"
  method[method %like% "^(e$|episode)"] <- "episode-based"
  method[method %like% "^patient"] <- "patient-based"
  method[method %like% "^(i$|iso)"] <- "isolate-based"
  method
}
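
# Illustrative sketch (not part of the package API): coerce_method() above relies
# on the package's own %like% operator. The hypothetical helper
# `sketch_coerce_method()` below mirrors the same abbreviation rules with base R
# grepl(), ASSUMING that %like% acts as a regular expression match on the
# lower-cased input. Only the first element of `method` is used, as above.
sketch_coerce_method <- function(method) {
  if (is.null(method)) {
    return(method)
  }
  method <- tolower(as.character(method[1L]))
  patterns <- c("phenotype-based" = "^(p$|pheno)",
                "episode-based" = "^(e$|episode)",
                "patient-based" = "^patient",
                "isolate-based" = "^(i$|iso)")
  # apply the patterns in the same order as coerce_method()
  for (full_name in names(patterns)) {
    if (grepl(patterns[[full_name]], method)) {
      method <- full_name
    }
  }
  method
}
# Usage: sketch_coerce_method("p") gives the full name for the phenotype-based
# method; values that match no pattern are returned lower-cased and unchanged.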