Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes that are not necessarily based on microorganisms, use is_new_episode(), which also supports grouping with the dplyr package; see Examples.

first_isolate(
  x,
  col_date = NULL,
  col_patient_id = NULL,
  col_mo = NULL,
  col_testcode = NULL,
  col_specimen = NULL,
  col_icu = NULL,
  col_keyantibiotics = NULL,
  episode_days = 365,
  testcodes_exclude = NULL,
  icu_exclude = FALSE,
  specimen_group = NULL,
  type = "keyantibiotics",
  ignore_I = TRUE,
  points_threshold = 2,
  info = interactive(),
  include_unknown = FALSE,
  ...
)

filter_first_isolate(
  x,
  col_date = NULL,
  col_patient_id = NULL,
  col_mo = NULL,
  ...
)

filter_first_weighted_isolate(
  x,
  col_date = NULL,
  col_patient_id = NULL,
  col_mo = NULL,
  col_keyantibiotics = NULL,
  ...
)

is_new_episode(
  .data,
  episode_days = 365,
  col_date = NULL,
  col_patient_id = NULL
)

Arguments

x, .data

a data.frame containing isolates.

col_date

column name of the result date (or the date on which it was received at the lab), defaults to the first column with a date class

col_patient_id

column name of the unique IDs of the patients, defaults to the first column that starts with 'patient' or 'patid' (case insensitive)

col_mo

column name of the IDs of the microorganisms (see as.mo()), defaults to the first column of class mo. Values will be coerced using as.mo().

col_testcode

column name of the test codes. Use col_testcode = NULL to not exclude certain test codes (like test codes for screening). In that case testcodes_exclude will be ignored.

col_specimen

column name of the specimen type or group

col_icu

column name of the logicals (TRUE/FALSE) whether a ward or department is an Intensive Care Unit (ICU)

col_keyantibiotics

column name of the key antibiotics to determine first weighted isolates, see key_antibiotics(). Defaults to the first column that starts with 'key' followed by 'ab' or 'antibiotics' (case insensitive). Use col_keyantibiotics = FALSE to prevent this.

episode_days

episode in days after which a genus/species combination will be determined as 'first isolate' again. The default of 365 days is based on the guideline by CLSI, see Source.

testcodes_exclude

character vector with test codes that should be excluded (case-insensitive)

icu_exclude

logical whether ICU isolates should be excluded (rows with value TRUE in column col_icu)

specimen_group

value in column col_specimen to filter on

type

type to determine weighted isolates; can be "keyantibiotics" or "points", see Details

ignore_I

logical to determine whether antibiotic interpretations with "I" will be ignored when type = "keyantibiotics", see Details

points_threshold

points until the comparison of key antibiotics will lead to inclusion of an isolate when type = "points", see Details

info

logical to indicate whether progress should be printed, defaults to TRUE only in interactive sessions

include_unknown

logical to determine whether 'unknown' microorganisms should be included too, i.e. microbial code "UNKNOWN", which defaults to FALSE. For WHONET users, this means that all records with organism code "con" (contamination) will be excluded by default. Isolates with a microbial ID of NA will always be excluded as first isolate.

...

parameters passed on to the first_isolate() function

Source

Methodology of this function is strictly based on:

M39 Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data, 4th Edition, 2014, Clinical and Laboratory Standards Institute (CLSI). https://clsi.org/standards/products/microbiology/documents/m39/.

Value

A logical vector

Details

The is_new_episode() function is a wrapper around the first_isolate() function. It can be used for data sets without isolates, to determine patient episodes based on any combination of grouping variables (using dplyr); see Examples. Since it runs first_isolate() for every group, it is quite slow.

All isolates with a microbial ID of NA will be excluded as first isolate.
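The episode logic described above can be sketched in base R. This is a minimal illustration and not the package's implementation: flag_new_episode() is a hypothetical helper, and it assumes that a new episode starts once at least episode_days days have passed since the start of the current episode and that no dates are missing.

```r
# Hypothetical helper (not part of the AMR package): flags the first record
# of every episode for the dates of a single patient.
# Assumption: a new episode starts when at least `episode_days` days have
# passed since the start of the current episode. Dates must not be NA.
flag_new_episode <- function(dates, episode_days = 365) {
  dates <- as.Date(dates)
  flags <- logical(length(dates))
  episode_start <- as.Date(NA)
  # walk through the records in chronological order
  for (i in order(dates)) {
    if (is.na(episode_start) ||
        as.numeric(dates[i] - episode_start) >= episode_days) {
      flags[i] <- TRUE
      episode_start <- dates[i]
    }
  }
  flags
}

# one patient, four records
dates <- as.Date(c("2020-01-01", "2020-03-01", "2021-02-01", "2021-06-01"))
flags_365 <- flag_new_episode(dates)
flags_30  <- flag_new_episode(dates, episode_days = 30)

# several patients: apply the helper per group and restore the row order
df <- data.frame(
  patient_id = c("A", "B", "A", "B"),
  date = as.Date(c("2020-01-01", "2020-01-05", "2020-02-01", "2021-06-01")),
  stringsAsFactors = FALSE
)
df$new_episode <- unsplit(
  lapply(split(df$date, df$patient_id), flag_new_episode),
  df$patient_id
)
```

Grouping on other variables (such as a hospital or ward column) would follow the same split-apply pattern, which is what grouping with dplyr does for is_new_episode().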

Why this is so important

To conduct an analysis of antimicrobial resistance, you should only include the first isolate of every patient per episode (ref). If you do not, you could easily overestimate or underestimate the resistance to an antibiotic. Imagine that a patient was admitted with an MRSA and that it was found in 5 different blood cultures the following week. The resistance percentage of oxacillin of all S. aureus isolates would be overestimated, because you included this MRSA more than once. This would be selection bias.
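The size of this bias can be illustrated with a small calculation in base R. The data below are invented purely for illustration: one patient with six oxacillin-resistant isolates (the admission culture plus five blood cultures) and nine other patients with one susceptible isolate each.

```r
# Invented example data, not taken from any real data set
isolates <- data.frame(
  patient_id = c(rep("P01", 6), sprintf("P%02d", 2:10)),
  oxacillin  = c(rep("R", 6), rep("S", 9)),
  stringsAsFactors = FALSE
)

# proportion resistant when every isolate is counted
resistance_all <- mean(isolates$oxacillin == "R")       # 6 / 15

# proportion resistant when only the first isolate per patient is counted
first_only <- isolates[!duplicated(isolates$patient_id), ]
resistance_first <- mean(first_only$oxacillin == "R")   # 1 / 10
```

In this made-up scenario, counting all isolates gives 40% resistance, while counting one isolate per patient gives 10%.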

filter_*() shortcuts

The functions filter_first_isolate() and filter_first_weighted_isolate() are helper functions to quickly filter on first isolates.

The function filter_first_isolate() is essentially equal to either:

  x[first_isolate(x, ...), ]
  x %>% filter(first_isolate(x, ...))

The function filter_first_weighted_isolate() is essentially equal to:

  x %>%
    mutate(keyab = key_antibiotics(.)) %>%
    mutate(only_weighted_firsts = first_isolate(x,
                                                col_keyantibiotics = "keyab", ...)) %>%
    filter(only_weighted_firsts == TRUE) %>%
    select(-only_weighted_firsts, -keyab)

Key antibiotics

There are two ways to determine whether isolates can be included as first weighted isolates, which generally give the same results:

  1. Using type = "keyantibiotics" and parameter ignore_I

    Any difference from S to R (or vice versa) will (re)select an isolate as a first weighted isolate. With ignore_I = FALSE, also differences from I to S|R (or vice versa) will lead to this. This is a reliable method and 30-35 times faster than method 2. Read more about this in the key_antibiotics() function.

  2. Using type = "points" and parameter points_threshold

    A difference from I to S|R (or vice versa) means 0.5 points, a difference from S to R (or vice versa) means 1 point. When the sum of points exceeds points_threshold, which defaults to 2, an isolate will be (re)selected as a first weighted isolate.
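The two comparison methods above can be sketched in base R as follows. This is an illustration only and not the package's code: compare_key_antibiotics() is a hypothetical helper, and it assumes that key antibiotics are stored as strings with one character (S, I or R) per antibiotic and that any other character marks a missing result that is skipped.

```r
# Hypothetical helper (not part of the AMR package): returns TRUE when the
# key antibiotics `y` differ enough from `x` to (re)select the isolate.
# Assumption: one character per antibiotic; anything other than S, I or R
# is treated as missing and skipped.
compare_key_antibiotics <- function(x, y,
                                    type = "keyantibiotics",
                                    ignore_I = TRUE,
                                    points_threshold = 2) {
  a <- strsplit(x, "")[[1]]
  b <- strsplit(y, "")[[1]]
  stopifnot(length(a) == length(b))
  # only compare positions where both results are available
  valid <- a %in% c("S", "I", "R") & b %in% c("S", "I", "R")
  a <- a[valid]
  b <- b[valid]
  if (type == "keyantibiotics") {
    if (ignore_I) {
      # drop every position where either result is "I"
      keep <- a != "I" & b != "I"
      a <- a[keep]
      b <- b[keep]
    }
    any(a != b)
  } else if (type == "points") {
    # I <-> S|R counts 0.5 points, S <-> R counts 1 point
    score <- c(S = 0, I = 0.5, R = 1)
    sum(abs(score[a] - score[b])) > points_threshold
  } else {
    stop("type must be \"keyantibiotics\" or \"points\"")
  }
}

# method 1: any S <-> R difference counts; I is ignored unless ignore_I = FALSE
m1_sr       <- compare_key_antibiotics("SSSSS", "SSRSS")
m1_i        <- compare_key_antibiotics("SSSSS", "SSISS")
m1_i_strict <- compare_key_antibiotics("SSSSS", "SSISS", ignore_I = FALSE)

# method 2: the sum of points must exceed points_threshold
m2_three <- compare_key_antibiotics("SSSS", "RRRS", type = "points")  # 3 points
m2_two   <- compare_key_antibiotics("SSSS", "RRSS", type = "points")  # 2 points
```

With the default threshold of 2, three S to R changes trigger reselection in this sketch while two do not, because the text above says the sum has to exceed the threshold.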

Stable lifecycle


The lifecycle of this function is stable. In a stable function, major changes are unlikely. This means that the underlying code will generally evolve by adding new arguments; removing arguments or changing the meaning of existing arguments will be avoided.

If the underlying code needs breaking changes, they will occur gradually. For example, a parameter will be deprecated and first continue to work, but will emit a message informing you of the change. Next, typically after at least one newly released version on CRAN, the message will be transformed into an error.

Read more on our website!

On our website https://msberends.github.io/AMR/ you can find a comprehensive tutorial about how to conduct AMR analysis, the complete documentation of all functions and an example analysis using WHONET data. As we would like to better understand the backgrounds and needs of our users, please participate in our survey!

See also

Examples

# `example_isolates` is a dataset available in the AMR package.
# See ?example_isolates.

# basic filtering on first isolates
example_isolates[first_isolate(example_isolates), ]

# filtering based on isolates ----------------------------------------------
# \donttest{
if (require("dplyr")) {
  # filter on first isolates:
  example_isolates %>%
    mutate(first_isolate = first_isolate(.)) %>%
    filter(first_isolate == TRUE)
 
  # short-hand versions:
  example_isolates %>%
    filter_first_isolate()
    
  example_isolates %>%
    filter_first_weighted_isolate()
  
  # now let's see if first isolates matter:
  A <- example_isolates %>%
    group_by(hospital_id) %>%
    summarise(count = n_rsi(GEN),            # gentamicin availability
              resistance = resistance(GEN))  # gentamicin resistance
 
  B <- example_isolates %>%
    filter_first_weighted_isolate() %>%      # the 1st isolate filter
    group_by(hospital_id) %>%
    summarise(count = n_rsi(GEN),            # gentamicin availability
              resistance = resistance(GEN))  # gentamicin resistance
 
  # Have a look at A and B.
  # B is more reliable because every isolate is counted only once.
  # Gentamicin resistance in hospital D appears to be 3.7% higher than
  # when you (erroneously) would have used all isolates for analysis.
}

# filtering based on any other condition -----------------------------------

if (require("dplyr")) {
  # is_new_episode() can be used in dplyr verbs to determine patient
  # episodes based on any (combination of) grouping variables:
  example_isolates %>%
    mutate(condition = sample(x = c("A", "B", "C"), 
                              size = 2000,
                              replace = TRUE)) %>% 
    group_by(condition) %>%
    mutate(new_episode = is_new_episode())
  
  example_isolates %>%
    group_by(hospital_id) %>% 
    summarise(patients = n_distinct(patient_id),
              n_episodes_365 = sum(is_new_episode(episode_days = 365)),
              n_episodes_60  = sum(is_new_episode(episode_days = 60)),
              n_episodes_30  = sum(is_new_episode(episode_days = 30)))
    
    
  # grouping on microorganisms leads to the same results as first_isolate():
  x <- example_isolates %>%
    filter_first_isolate(include_unknown = TRUE)
    
  y <- example_isolates %>%
    group_by(mo) %>%
    filter(is_new_episode())

  identical(x$patient_id, y$patient_id)
  
  # but now you can group on isolates and many more:
  example_isolates %>%
    group_by(mo, hospital_id, ward_icu) %>%
    mutate(flag_episode = is_new_episode())
}
# }