mirror of https://github.com/msberends/AMR.git synced 2025-07-10 12:21:53 +02:00

(v2.1.1.9182) fix AMR selectors for tidymodels, add unit tests

2025-03-03 12:59:27 +01:00
parent b85890449d
commit 9a9468fa84
16 changed files with 84 additions and 33 deletions

@ -42,8 +42,8 @@ We begin by loading the required libraries and preparing the `example_isolates`
```{r}
# Load required libraries
library(tidymodels) # For machine learning workflows, and data manipulation (dplyr, tidyr, ...)
library(AMR) # For AMR data analysis
library(tidymodels) # For machine learning workflows, and data manipulation (dplyr, tidyr, ...)
# Select relevant columns for prediction
data <- example_isolates %>%
@ -81,12 +81,18 @@ resistance_recipe <- recipe(mo ~ ., data = data) %>%
resistance_recipe
```
For a recipe that includes at least one preprocessing operation, like we have with `step_corr()`, the necessary parameters can be estimated from a training set using `prep()`:
```{r}
prep(resistance_recipe)
```
**Explanation:**
- `recipe(mo ~ ., data = data)` will take the `mo` column as outcome and all other columns as predictors.
- `step_corr()` removes predictors (i.e., antibiotic columns) whose absolute correlation with another predictor exceeds 90%.
Notice how the recipe contains just the antibiotic selector functions - no need to define the columns specifically.
Notice how the recipe contains just the antibiotic selector functions - no need to define the columns specifically. In the prepared recipe (obtained with `prep()`), we can see that the columns `r paste0("'", suppressMessages(prep(resistance_recipe))$steps[[1]]$removals, "'", collapse = " and ")` were removed because they correlate too strongly with other variables.
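To make the effect of `prep()` on a correlation filter easier to see in isolation, here is a minimal sketch on a small synthetic data set. The data frame `toy` and its columns `y`, `x1`, `x2` and `x3` are hypothetical and not part of the `AMR` package. The explicit `threshold = 0.9` is an assumption chosen to mirror the 90% cut-off described above.

```r
library(recipes)  # part of tidymodels; provides recipe(), step_corr(), prep(), bake()

# Hypothetical toy data: x2 is almost a copy of x1, x3 is independent noise
set.seed(42)
x1 <- rnorm(100)
toy <- data.frame(
  y  = rnorm(100),
  x1 = x1,
  x2 = x1 + rnorm(100, sd = 0.01),
  x3 = rnorm(100)
)

# Define the recipe; nothing is estimated yet
toy_recipe <- recipe(y ~ ., data = toy) %>%
  step_corr(all_numeric_predictors(), threshold = 0.9)

# prep() estimates the step on the training data, i.e. decides which columns to drop
toy_prepped <- prep(toy_recipe)

# Columns flagged for removal by the first (and only) step
removed <- toy_prepped$steps[[1]]$removals
removed

# Columns that remain after applying the prepared recipe to the training data
remaining <- names(bake(toy_prepped, new_data = NULL))
remaining
```

Because `x1` and `x2` are nearly identical, one of the two should be flagged for removal, while `x3` and the outcome `y` are kept. Which member of the pair is dropped is decided by `step_corr()` itself.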
#### 2. Specifying the Model
@ -113,6 +119,7 @@ We bundle the recipe and model together into a `workflow`, which organizes the e
resistance_workflow <- workflow() %>%
add_recipe(resistance_recipe) %>% # Add the preprocessing recipe
add_model(logistic_model) # Add the logistic regression model
resistance_workflow
```
### **Training and Evaluating the Model**