diff --git a/404.html b/404.html
index e9c8f51bc..e713ced71 100644
--- a/404.html
+++ b/404.html
@@ -32,7 +32,7 @@
 AMR (for R)
- 2.1.1.9229
+ 2.1.1.9230
-

Notice how in fit(), the antibiotic selector functions
-are internally called again. For training, these functions are called
-since they are stored in the recipe.

+

Notice how in fit(), the antimicrobial selector
+functions are internally called again. For training, these functions are
+called since they are stored in the recipe.

Next, we evaluate the model on the testing data.

 # Make predictions on the testing set
@@ -363,7 +389,22 @@ since they are stored in the recipe.

 #>   .metric  .estimator .estimate
 #>   <chr>    <chr>          <dbl>
 #> 1 accuracy binary         0.995
-#> 2 kap      binary         0.989
+#> 2 kap      binary         0.989
+
+
+# To assess some other model properties, you can make your own `metrics()` function
+our_metrics <- metric_set(accuracy, kap, ppv, npv) # add Positive Predictive Value and Negative Predictive Value
+metrics2 <- predictions %>%
+  our_metrics(truth = mo, estimate = .pred_class) # run again with our own `our_metrics()` function
+
+metrics2
+#> # A tibble: 4 × 3
+#>   .metric  .estimator .estimate
+#>   <chr>    <chr>          <dbl>
+#> 1 accuracy binary         0.995
+#> 2 kap      binary         0.989
+#> 3 ppv      binary         0.987
+#> 4 npv      binary         1

Explanation:

-

It appears we can predict the Gram based on AMR results with a 99.5%
-accuracy based on AMR results of aminoglycosides and beta-lactam
-antibiotics. The ROC curve looks like this:

+

It appears we can predict the Gram stain with 99.5% accuracy based
+on AMR results of only aminoglycosides and beta-lactam antibiotics. The
+ROC curve looks like this:

 predictions %>%
   roc_curve(mo, `.pred_Gram-negative`) %>%
@@ -392,9 +433,241 @@ pipeline with the tidymodels framework and the
 aminoglycosides() and betalactams() with
 tidymodels, we efficiently prepared data, trained a model,
 and evaluated its performance.

-

This workflow is extensible to other antibiotic classes and +

This workflow is extensible to other antimicrobial classes and resistance patterns, empowering users to analyse AMR data systematically and reproducibly.

+
+
+ +
+

Example 2: Predicting AMR Over Time +

+

In this second example, we aim to predict antimicrobial resistance
+(AMR) trends over time using tidymodels. We will aggregate
+resistance to three antibiotics (amoxicillin, AMX;
+amoxicillin-clavulanic acid, AMC; and ciprofloxacin,
+CIP) from historical data grouped by year and Gram stain,
+and model the trend for amoxicillin.

+
+

+Objective +

+

Our goal is to:

+
+  1. Prepare the dataset by aggregating resistance data over time.
+  2. Define a regression model to predict AMR trends.
+  3. Use tidymodels to preprocess, train, and evaluate the model.
+
+
+

+Data Preparation +

+

We start by transforming the example_isolates dataset
+into a structured time-series format.

+
+# Load required libraries
+library(AMR)
+library(tidymodels)
+
+# Transform dataset
+data_time <- example_isolates %>%
+  top_n_microorganisms(n = 10) %>% # Filter on the top #10 species
+  mutate(year = as.integer(format(date, "%Y")),  # Extract year from date
+         gramstain = mo_gramstain(mo)) %>% # Get Gram stain of each microorganism
+  group_by(year, gramstain) %>%
+  summarise(across(c(AMX, AMC, CIP), 
+                   function(x) resistance(x, minimum = 0),
+                   .names = "res_{.col}"), 
+            .groups = "drop") %>% 
+  filter(!is.na(res_AMX) & !is.na(res_AMC) & !is.na(res_CIP)) # Drop missing values
+#> ℹ Using column 'mo' as input for col_mo.
+
+data_time
+#> # A tibble: 32 × 5
+#>     year gramstain     res_AMX res_AMC res_CIP
+#>    <int> <chr>           <dbl>   <dbl>   <dbl>
+#>  1  2002 Gram-negative   1      0.105   0.0606
+#>  2  2002 Gram-positive   0.838  0.182   0.162 
+#>  3  2003 Gram-negative   1      0.0714  0     
+#>  4  2003 Gram-positive   0.714  0.244   0.154 
+#>  5  2004 Gram-negative   0.464  0.0938  0     
+#>  6  2004 Gram-positive   0.849  0.299   0.244 
+#>  7  2005 Gram-negative   0.412  0.132   0.0588
+#>  8  2005 Gram-positive   0.882  0.382   0.154 
+#>  9  2006 Gram-negative   0.379  0       0.1   
+#> 10  2006 Gram-positive   0.778  0.333   0.353 
+#> # ℹ 22 more rows
+

Explanation: - mo_gramstain(mo): Determines the
+Gram stain (Gram-negative or Gram-positive) of each microorganism. - resistance():
+Converts AMR results into numeric values (proportion of resistant
+isolates). - group_by(year, gramstain): Aggregates
+resistance rates by year and Gram stain.
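
To make the aggregation step concrete, here is a minimal base R sketch of the same idea: compute the proportion of resistant results per year and Gram stain. The toy data and the column name `is_R` are hypothetical and not taken from example_isolates, and the sketch only approximates what resistance() does (it treats every non-missing result as tested and counts only "R" as resistant).

```r
# Toy data: one row per isolate (hypothetical values for illustration only)
isolates <- data.frame(
  year      = c(2002, 2002, 2002, 2002, 2003, 2003, 2003, 2003),
  gramstain = c("Gram-negative", "Gram-negative", "Gram-positive", "Gram-positive",
                "Gram-negative", "Gram-negative", "Gram-positive", "Gram-positive"),
  AMX       = c("R", "S", "R", "R", "R", "R", "S", NA),
  stringsAsFactors = FALSE
)

# Flag resistant results; a missing test result stays NA
isolates$is_R <- isolates$AMX == "R"

# The mean of a logical vector is the proportion of TRUE values.
# aggregate() drops rows with NA by default, so untested isolates are ignored.
res_by_group <- aggregate(is_R ~ year + gramstain, data = isolates, FUN = mean)
names(res_by_group)[names(res_by_group) == "is_R"] <- "res_AMX"

res_by_group
```

Each row of `res_by_group` corresponds to one year and Gram stain combination, which is the same shape as data_time above.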

+
+
+

+Defining the Workflow +

+

We now define the modeling workflow, which consists of a
+preprocessing step, a model specification, and the fitting process.

+
+

1. Preprocessing with a Recipe +

+
+# Define the recipe
+resistance_recipe_time <- recipe(res_AMX ~ year + gramstain, data = data_time) %>%
+  step_dummy(gramstain, one_hot = TRUE) %>%  # Convert categorical to numerical
+  step_normalize(year) %>%  # Normalise year for better model performance
+  step_nzv(all_predictors())  # Remove near-zero variance predictors
+
+resistance_recipe_time
+#> 
+#> ── Recipe ──────────────────────────────────────────────────────────────────────
+#> 
+#> ── Inputs
+#> Number of variables by role
+#> outcome:   1
+#> predictor: 2
+#> 
+#> ── Operations
+#>  Dummy variables from: gramstain
+#>  Centering and scaling for: year
+#>  Sparse, unbalanced variable filter on: all_predictors()
+

Explanation: - step_dummy(): Encodes
+the categorical variable gramstain as
+numerical indicators. - step_normalize(): Normalises the
+year variable. - step_nzv(): Removes near-zero
+variance predictors.
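
As a rough illustration of what the first two recipe steps compute, the following base R sketch one-hot encodes a two-level factor and centres and scales a year column. It is not how recipes implements these steps internally; the toy data frame and object names are assumptions made for this example. Note that a recipe estimates the mean and standard deviation from the training data only, whereas this sketch uses all rows.

```r
# Toy predictors (hypothetical values)
toy <- data.frame(
  year      = c(2002, 2003, 2004, 2005),
  gramstain = factor(c("Gram-negative", "Gram-positive",
                       "Gram-negative", "Gram-positive"))
)

# One-hot encoding: one 0/1 column per factor level.
# Removing the intercept (- 1) keeps a column for every level.
one_hot <- model.matrix(~ gramstain - 1, data = toy)

# Normalising: subtract the mean, then divide by the standard deviation
year_scaled <- (toy$year - mean(toy$year)) / sd(toy$year)

# Combined numeric predictor matrix
cbind(year = year_scaled, one_hot)
```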

+
+
+

2. Specifying the Model +

+

We use a linear regression model to predict resistance trends.

+
+# Define the linear regression model
+lm_model <- linear_reg() %>%
+  set_engine("lm") # Use linear regression
+
+lm_model
+#> Linear Regression Model Specification (regression)
+#> 
+#> Computational engine: lm
+

Explanation: - linear_reg(): Defines a
+linear regression model. - set_engine("lm"): Uses R’s
+built-in linear regression engine.
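
Since the "lm" engine relies on R's built-in lm() function, a comparable model can be sketched directly in base R. The toy data below are hypothetical and chosen only so that the fitted slope is easy to check by hand; they are not derived from data_time.

```r
# Toy resistance proportions per year and Gram stain (hypothetical values)
toy <- data.frame(
  year      = rep(2002:2005, times = 2),
  gramstain = rep(c("Gram-negative", "Gram-positive"), each = 4),
  res_AMX   = c(0.90, 0.80, 0.70, 0.60, 0.85, 0.85, 0.85, 0.85)
)

# Same formula as in the recipe: resistance explained by year and Gram stain
fit <- lm(res_AMX ~ year + gramstain, data = toy)

# Intercept, yearly change in resistance, and the Gram stain offset
coef(fit)

# Predict resistance for a new year
predict(fit, newdata = data.frame(year = 2006, gramstain = "Gram-negative"))
```

The coefficient for `year` is the estimated change in resistance proportion per calendar year, shared across both Gram stain groups because the formula has no interaction term.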

+
+
+

3. Building the Workflow +

+

We combine the preprocessing recipe and model into a workflow.

+
+# Create workflow
+resistance_workflow_time <- workflow() %>%
+  add_recipe(resistance_recipe_time) %>%
+  add_model(lm_model)
+
+resistance_workflow_time
+#> ══ Workflow ════════════════════════════════════════════════════════════════════
+#> Preprocessor: Recipe
+#> Model: linear_reg()
+#> 
+#> ── Preprocessor ────────────────────────────────────────────────────────────────
+#> 3 Recipe Steps
+#> 
+#> • step_dummy()
+#> • step_normalize()
+#> • step_nzv()
+#> 
+#> ── Model ───────────────────────────────────────────────────────────────────────
+#> Linear Regression Model Specification (regression)
+#> 
+#> Computational engine: lm
+
+
+
+

+Training and Evaluating the Model +

+

We split the data into training and testing sets, fit the model, and
+evaluate performance.

+
+# Split the data
+set.seed(123)
+data_split_time <- initial_split(data_time, prop = 0.8)
+train_time <- training(data_split_time)
+test_time <- testing(data_split_time)
+
+# Train the model
+fitted_workflow_time <- resistance_workflow_time %>%
+  fit(train_time)
+
+# Make predictions
+predictions_time <- fitted_workflow_time %>%
+  predict(test_time) %>%
+  bind_cols(test_time) 
+
+# Evaluate model
+metrics_time <- predictions_time %>%
+  metrics(truth = res_AMX, estimate = .pred)
+
+metrics_time
+#> # A tibble: 3 × 3
+#>   .metric .estimator .estimate
+#>   <chr>   <chr>          <dbl>
+#> 1 rmse    standard      0.0774
+#> 2 rsq     standard      0.711 
+#> 3 mae     standard      0.0704
+

Explanation: - initial_split(): Splits
+data into training and testing sets. - fit(): Trains the
+workflow. - predict(): Generates resistance predictions. -
+metrics(): Evaluates model performance.
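
For readers who want to see what the three regression metrics measure, here is a small base R sketch that computes them by hand. The `truth` and `pred` vectors are made-up numbers, not the predictions from this article. The R-squared shown here is the squared Pearson correlation between observed and predicted values, which is how yardstick defines rsq.

```r
# Hypothetical observed and predicted resistance proportions
truth <- c(0.40, 0.85, 0.46, 0.88, 0.38)
pred  <- c(0.45, 0.80, 0.50, 0.80, 0.45)

# Root mean squared error: penalises large errors more strongly
rmse <- sqrt(mean((truth - pred)^2))

# Mean absolute error: average size of the error
mae <- mean(abs(truth - pred))

# R-squared as the squared correlation between truth and prediction
rsq <- cor(truth, pred)^2

c(rmse = rmse, rsq = rsq, mae = mae)
```

Both rmse and mae are expressed on the scale of the outcome, so a value of 0.07 means predictions are off by roughly 7 percentage points of resistance.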

+
+
+

+Visualizing Predictions +

+

We plot predicted and actual amoxicillin resistance over time.

+
+library(ggplot2)
+
+# Plot actual vs predicted resistance over time
+ggplot(predictions_time, aes(x = year)) +
+  geom_point(aes(y = res_AMX, color = "Actual")) +
+  geom_line(aes(y = .pred, color = "Predicted")) +
+  labs(title = "Predicted vs Actual AMX Resistance Over Time",
+       x = "Year",
+       y = "Resistance Proportion") +
+  theme_minimal()
+

+

Additionally, we can visualise resistance trends in
+ggplot2 and directly add linear models there:

+
+ggplot(data_time, aes(x = year, y = res_AMX, color = gramstain)) +
+  geom_line() +
+  labs(title = "AMX Resistance Trends",
+       x = "Year",
+       y = "Resistance Proportion") +
+  # add a linear model directly in ggplot2:
+  geom_smooth(method = "lm",
+              formula = y ~ x,
+              alpha = 0.25) +
+  theme_minimal()
+

+
+
+

+Conclusion +

+

In this example, we demonstrated how to analyse AMR trends over time
+using tidymodels. By aggregating resistance rates by year
+and Gram stain, we built a predictive model to track changes in
+resistance to amoxicillin (AMX). The same approach applies to
+amoxicillin-clavulanic acid (AMC) and ciprofloxacin (CIP),
+which were aggregated alongside it.

+

This method can be extended to other antibiotics and resistance
+patterns, providing valuable insights into AMR dynamics in healthcare
+settings.

+
diff --git a/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png b/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png
new file mode 100644
index 000000000..0116fcb2a
Binary files /dev/null and b/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png differ
diff --git a/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png b/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png
new file mode 100644
index 000000000..6f59d476e
Binary files /dev/null and b/articles/AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png differ
diff --git a/articles/EUCAST.html b/articles/EUCAST.html
index de470001a..b2a32c431 100644
--- a/articles/EUCAST.html
+++ b/articles/EUCAST.html
@@ -31,7 +31,7 @@
 AMR (for R)
- 2.1.1.9229
+ 2.1.1.9230