diff --git a/404.html b/404.html index c39b32f8a..17e95b509 100644 --- a/404.html +++ b/404.html @@ -31,7 +31,7 @@ AMR (for R) - 3.0.0.9002 + 3.0.0.9004 + + + + + +
+
+
+ +
+

This family of functions allows using AMR-specific data types such as <mic> and <sir> inside tidymodels pipelines.

+
+ +
+

Usage

+
all_mic()
+
+all_mic_predictors()
+
+all_sir()
+
+all_sir_predictors()
+
+step_mic_log2(recipe, ..., role = NA, trained = FALSE, columns = NULL,
+  skip = FALSE, id = recipes::rand_id("mic_log2"))
+
+step_sir_numeric(recipe, ..., role = NA, trained = FALSE, columns = NULL,
+  skip = FALSE, id = recipes::rand_id("sir_numeric"))
+
+ +
+

Arguments

+ + +
recipe
+

A recipe object. The step will be added to the sequence of operations for this recipe.

+ + +
...
+

One or more selector functions to choose variables for this step. See selections() for more details.

+ + +
role
+

Not used by this step since no new variables are created.

+ + +
trained
+

A logical to indicate if the quantities for preprocessing have been estimated.

+ + +
skip
+

A logical. Should the step be skipped when the recipe is baked by bake()? While all operations are baked when prep() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.

+ + +
id
+

A character string that is unique to this step to identify it.

+ +
+
+

Details

+

You can read more in our online AMR with tidymodels introduction.

+

Tidyselect helpers include:

  • all_mic() and all_mic_predictors() to select <mic> columns

  • all_sir() and all_sir_predictors() to select <sir> columns

Pre-processing pipeline steps include:

  • step_mic_log2() to convert MIC columns to numeric (via as.numeric()) and apply a log2 transform, to be used with all_mic_predictors()

  • step_sir_numeric() to convert SIR columns to numeric (via as.numeric()), to be used with all_sir_predictors(). The coding is "S" = 1, "I"/"SDD" = 2, "R" = 3; all other values become NA. Keep this in mind for further processing, especially if the model does not allow for NA values.

These steps integrate with recipes::recipe() and work like standard preprocessing steps. They are useful for preparing data for modelling, especially with classification models.
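As a rough illustration of the arithmetic behind the two steps, the sketch below reproduces the log2 transform and the SIR coding in base R. It is not the AMR implementation: the vectors `mic_values`, `sir_codes` and `sir_values` are hypothetical stand-ins, and plain numeric and character vectors are used instead of real <mic> and <sir> columns.

```r
# Hypothetical stand-in data; the real steps operate on <mic> and <sir> columns.

# MIC values as plain numbers (an <mic> column would first pass through as.numeric())
mic_values <- c(0.25, 2, 32)
mic_log2 <- log2(mic_values)
mic_log2
#> [1] -2  1  5

# SIR coding as described above: "S" = 1, "I"/"SDD" = 2, "R" = 3, anything else NA
sir_codes <- c(S = 1, I = 2, SDD = 2, R = 3)
sir_values <- c("S", "I", "SDD", "R", "other")
sir_numeric <- unname(sir_codes[sir_values])
sir_numeric
#> [1]  1  2  2  3 NA
```

The NA in the last position is the case to watch for: a model that does not accept missing values needs an imputation or filtering step after the conversion.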

+
+ + +
+

Examples

+
library(tidymodels)
+#> ── Attaching packages ────────────────────────────────────── tidymodels 1.3.0 ──
+#>  broom        1.0.8      rsample      1.3.0
+#>  dials        1.4.0      tibble       3.3.0
+#>  infer        1.0.8      tidyr        1.3.1
+#>  modeldata    1.4.0      tune         1.3.0
+#>  parsnip      1.3.2      workflows    1.2.0
+#>  purrr        1.0.4      workflowsets 1.1.1
+#>  recipes      1.3.1      yardstick    1.3.2
+#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
+#>  purrr::discard() masks scales::discard()
+#>  dplyr::filter()  masks stats::filter()
+#>  dplyr::lag()     masks stats::lag()
+#>  recipes::step()  masks stats::step()
+
+# The below approach formed the basis for this paper: DOI 10.3389/fmicb.2025.1582703
+# Presence of ESBL genes was predicted based on raw MIC values.
+
+
+# example data set in the AMR package
+esbl_isolates
+#> # A tibble: 500 × 19
+#>    esbl  genus   AMC   AMP   TZP   CXM   FOX   CTX   CAZ   GEN   TOB   TMP   SXT
+#>    <lgl> <chr> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic>
+#>  1 FALSE Esch…    32    32     4    64    64  8.00  8.00     1     1  16.0    20
+#>  2 FALSE Esch…    32    32     4    64    64  4.00  8.00     1     1  16.0   320
+#>  3 FALSE Esch…     4     2    64     8     4  8.00  0.12    16    16   0.5    20
+#>  4 FALSE Kleb…    32    32    16    64    64  8.00  8.00     1     1   0.5    20
+#>  5 FALSE Esch…    32    32     4     4     4  0.25  2.00     1     1  16.0   320
+#>  6 FALSE Citr…    32    32    16    64    64 64.00 32.00     1     1   0.5    20
+#>  7 FALSE Morg…    32    32     4    64    64 16.00  2.00     1     1   0.5    20
+#>  8 FALSE Prot…    16    32     4     1     4  8.00  0.12     1     1  16.0   320
+#>  9 FALSE Ente…    32    32     8    64    64 32.00  4.00     1     1   0.5    20
+#> 10 FALSE Citr…    32    32    32    64    64  8.00 64.00     1     1  16.0   320
+#> # ℹ 490 more rows
+#> # ℹ 6 more variables: NIT <mic>, FOS <mic>, CIP <mic>, IPM <mic>, MEM <mic>,
+#> #   COL <mic>
+
+# Prepare a binary outcome and convert to ordered factor
+data <- esbl_isolates %>%
+  mutate(esbl = factor(esbl, levels = c(FALSE, TRUE), ordered = TRUE))
+
+# Split into training and testing sets
+split <- initial_split(data)
+training_data <- training(split)
+testing_data <- testing(split)
+
+# Create and prep a recipe with MIC log2 transformation
+mic_recipe <- recipe(esbl ~ ., data = training_data) %>%
+  # Optionally remove non-predictive variables
+  remove_role(genus, old_role = "predictor") %>%
+  # Apply the log2 transformation to all MIC predictors
+  step_mic_log2(all_mic_predictors()) %>%
+  prep()
+
+# View prepped recipe
+mic_recipe
+#> 
+#> ── Recipe ──────────────────────────────────────────────────────────────────────
+#> 
+#> ── Inputs 
+#> Number of variables by role
+#> outcome:          1
+#> predictor:       17
+#> undeclared role:  1
+#> 
+#> ── Training information 
+#> Training data contained 375 data points and no incomplete rows.
+#> 
+#> ── Operations 
+#>  Log2 transformation of MIC columns: AMC, AMP, TZP, CXM, FOX, ... | Trained
+
+# Apply the recipe to training and testing data
+out_training <- bake(mic_recipe, new_data = NULL)
+out_testing <- bake(mic_recipe, new_data = testing_data)
+
+# Fit a logistic regression model
+fitted <- logistic_reg(mode = "classification") %>%
+  set_engine("glm") %>%
+  fit(esbl ~ ., data = out_training)
+#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
+
+# Generate predictions on the test set
+predictions <- predict(fitted, out_testing) %>%
+  bind_cols(out_testing)
+
+# Evaluate predictions using standard classification metrics
+our_metrics <- metric_set(accuracy, kap, ppv, npv)
+metrics <- our_metrics(predictions, truth = esbl, estimate = .pred_class)
+
+# Show performance (the split is random, so values differ between runs):
+# - negative predictive value (NPV) of ~91%
+# - positive predictive value (PPV) of ~92%
+metrics
+#> # A tibble: 4 × 3
+#>   .metric  .estimator .estimate
+#>   <chr>    <chr>          <dbl>
+#> 1 accuracy binary         0.912
+#> 2 kap      binary         0.824
+#> 3 ppv      binary         0.917
+#> 4 npv      binary         0.908
+
+
+
+ + +
+ + + + + + + diff --git a/reference/antibiogram.html b/reference/antibiogram.html index fd2221a78..af118809e 100644 --- a/reference/antibiogram.html +++ b/reference/antibiogram.html @@ -9,7 +9,7 @@ Adhering to previously described approaches (see Source) and especially the Baye AMR (for R) - 3.0.0.9002 + 3.0.0.9004 + + + + + +
+
+
+ +
+

A data set containing 500 microbial isolates with MIC values of common antibiotics and a binary esbl column for extended-spectrum beta-lactamase (ESBL) production. This data set contains randomised fictitious data but reflects reality and can be used to practise AMR-related machine learning, e.g., classification modelling with tidymodels.

+
+ +
+

Usage

+
esbl_isolates
+
+ +
+

Format

+

A tibble with 500 observations and 19 variables:

  • esbl
    Logical indicator if the isolate is ESBL-producing

  • genus
    Genus of the microorganism

  • AMC:COL
    MIC values for 17 antimicrobial agents, transformed to class mic (see as.mic())
+
+

Details

+

See our tidymodels integration for an example using this data set.

+
+ +
+

Examples

+
esbl_isolates
+#> # A tibble: 500 × 19
+#>    esbl  genus   AMC   AMP   TZP   CXM   FOX   CTX   CAZ   GEN   TOB   TMP   SXT
+#>    <lgl> <chr> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic> <mic>
+#>  1 FALSE Esch…    32    32     4    64    64  8.00  8.00     1     1  16.0    20
+#>  2 FALSE Esch…    32    32     4    64    64  4.00  8.00     1     1  16.0   320
+#>  3 FALSE Esch…     4     2    64     8     4  8.00  0.12    16    16   0.5    20
+#>  4 FALSE Kleb…    32    32    16    64    64  8.00  8.00     1     1   0.5    20
+#>  5 FALSE Esch…    32    32     4     4     4  0.25  2.00     1     1  16.0   320
+#>  6 FALSE Citr…    32    32    16    64    64 64.00 32.00     1     1   0.5    20
+#>  7 FALSE Morg…    32    32     4    64    64 16.00  2.00     1     1   0.5    20
+#>  8 FALSE Prot…    16    32     4     1     4  8.00  0.12     1     1  16.0   320
+#>  9 FALSE Ente…    32    32     8    64    64 32.00  4.00     1     1   0.5    20
+#> 10 FALSE Citr…    32    32    32    64    64  8.00 64.00     1     1  16.0   320
+#> # ℹ 490 more rows
+#> # ℹ 6 more variables: NIT <mic>, FOS <mic>, CIP <mic>, IPM <mic>, MEM <mic>,
+#> #   COL <mic>
+
+
+
+ + +
+ + + + + + + diff --git a/reference/eucast_rules.html b/reference/eucast_rules.html index 0ce40645d..169c77a5e 100644 --- a/reference/eucast_rules.html +++ b/reference/eucast_rules.html @@ -9,7 +9,7 @@ To improve the interpretation of the antibiogram before EUCAST rules are applied AMR (for R) - 3.0.0.9002 + 3.0.0.9004