This family of functions allows using AMR-specific data types such as <mic> and <sir> inside tidymodels pipelines.
Usage
all_mic()
all_mic_predictors()
all_sir()
all_sir_predictors()
step_mic_log2(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("mic_log2"))
step_sir_numeric(recipe, ..., role = NA, trained = FALSE, columns = NULL,
  skip = FALSE, id = recipes::rand_id("sir_numeric"))
Arguments
- recipe
- A recipe object. The step will be added to the sequence of operations for this recipe. 
- ...
- One or more selector functions to choose variables for this step. See selections() for more details.
- role
- Not used by this step since no new variables are created. 
- trained
- A logical to indicate if the quantities for preprocessing have been estimated. 
- skip
- A logical. Should the step be skipped when the recipe is baked by bake()? While all operations are baked when prep() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.
- id
- A character string that is unique to this step to identify it. 
Details
You can read more in our online AMR with tidymodels introduction.
Tidyselect helpers include:
- all_mic() and all_mic_predictors() to select <mic> columns
- all_sir() and all_sir_predictors() to select <sir> columns
Pre-processing pipeline steps include:
- step_mic_log2() to convert MIC columns to numeric (via as.numeric()) and apply a log2 transform, to be used with all_mic_predictors()
- step_sir_numeric() to convert SIR columns to numeric (via as.numeric()), to be used with all_sir_predictors(): "S" = 1, "I"/"SDD" = 2, "R" = 3. All other values are rendered NA. Keep this in mind for further processing, especially if the model does not allow for NA values.
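As a rough illustration of the two encodings described above, the following base R sketch reproduces the arithmetic on plain vectors. It is not the package's own implementation: it assumes the MIC values have already been reduced to plain numbers, and all variable names (and the "NI" value standing in for "any other value") are made up for this example.

```r
# Illustration only: plain vectors stand in for <mic> and <sir> columns.

# MIC values as plain numbers, then a log2 transform
mic_values <- c(0.5, 1, 2, 4, 8)
mic_log2 <- log2(mic_values)
mic_log2

# SIR values mapped to the numeric codes listed above:
# "S" = 1, "I"/"SDD" = 2, "R" = 3, anything else becomes NA
sir_values <- c("S", "I", "SDD", "R", "NI", NA)
sir_codes <- c(S = 1, I = 2, SDD = 2, R = 3)
sir_numeric <- unname(sir_codes[sir_values])
sir_numeric
```

On the log2 scale each doubling of the MIC is one unit, so a twofold dilution series becomes evenly spaced. The NA entries in the second vector show why models that do not accept missing values may need an extra handling step.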
These steps integrate with recipes::recipe() and work like standard preprocessing steps. They are useful for preparing data for modelling, especially with classification models.
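The Examples section only exercises step_mic_log2(), so here is a minimal sketch of how step_sir_numeric() might be used in the same way. It assumes the AMR and recipes packages are installed; the toy data frame and its column names (outcome, AMX, CIP) are hypothetical and only serve to provide two <sir> predictors.

```r
if (requireNamespace("AMR", quietly = TRUE) &&
    requireNamespace("recipes", quietly = TRUE)) {
  library(AMR)
  library(recipes)

  # Hypothetical toy data: a two-level outcome and two <sir> predictors
  toy <- data.frame(
    outcome = factor(c("yes", "no", "yes", "no")),
    AMX = as.sir(c("S", "I", "R", "S")),
    CIP = as.sir(c("R", "R", "S", "I"))
  )

  # Convert all SIR predictors to their numeric codes and prep the recipe
  sir_recipe <- recipe(outcome ~ ., data = toy) %>%
    step_sir_numeric(all_sir_predictors()) %>%
    prep()

  # Retrieve the processed training data
  baked <- bake(sir_recipe, new_data = NULL)
  baked
}
```

Selecting with all_sir_predictors() rather than all_sir() restricts the step to columns that have the predictor role, so a <sir> outcome column would be left alone.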
Examples
if (require("tidymodels")) {
  # The below approach formed the basis for this paper: DOI 10.3389/fmicb.2025.1582703
  # Presence of ESBL genes was predicted based on raw MIC values.
  # example data set in the AMR package
  esbl_isolates
  # Prepare a binary outcome and convert to ordered factor
  data <- esbl_isolates %>%
    mutate(esbl = factor(esbl, levels = c(FALSE, TRUE), ordered = TRUE))
  # Split into training and testing sets
  split <- initial_split(data)
  training_data <- training(split)
  testing_data <- testing(split)
  # Create and prep a recipe with MIC log2 transformation
  mic_recipe <- recipe(esbl ~ ., data = training_data) %>%
    # Optionally remove non-predictive variables
    remove_role(genus, old_role = "predictor") %>%
    # Apply the log2 transformation to all MIC predictors
    step_mic_log2(all_mic_predictors()) %>%
    # And apply the preparation steps
    prep()
  # View prepped recipe
  mic_recipe
  # Apply the recipe to training and testing data
  out_training <- bake(mic_recipe, new_data = NULL)
  out_testing <- bake(mic_recipe, new_data = testing_data)
  # Fit a logistic regression model
  fitted <- logistic_reg(mode = "classification") %>%
    set_engine("glm") %>%
    fit(esbl ~ ., data = out_training)
  # Generate predictions on the test set
  predictions <- predict(fitted, out_testing) %>%
    bind_cols(out_testing)
  # Evaluate predictions using standard classification metrics
  our_metrics <- metric_set(accuracy, kap, ppv, npv)
  metrics <- our_metrics(predictions, truth = esbl, estimate = .pred_class)
  # Show performance
  metrics
}
#> Loading required package: tidymodels
#> ── Attaching packages ────────────────────────────────────── tidymodels 1.3.0 ──
#> ✔ broom        1.0.8     ✔ rsample      1.3.0
#> ✔ dials        1.4.0     ✔ tibble       3.3.0
#> ✔ infer        1.0.9     ✔ tidyr        1.3.1
#> ✔ modeldata    1.4.0     ✔ tune         1.3.0
#> ✔ parsnip      1.3.2     ✔ workflows    1.2.0
#> ✔ purrr        1.1.0     ✔ workflowsets 1.1.1
#> ✔ recipes      1.3.1     ✔ yardstick    1.3.2
#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
#> ✖ purrr::discard() masks scales::discard()
#> ✖ dplyr::filter()  masks stats::filter()
#> ✖ dplyr::lag()     masks stats::lag()
#> ✖ recipes::step()  masks stats::step()
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> # A tibble: 4 × 3
#>   .metric  .estimator .estimate
#>   <chr>    <chr>          <dbl>
#> 1 accuracy binary         0.936
#> 2 kap      binary         0.872
#> 3 ppv      binary         0.925
#> 4 npv      binary         0.948