<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --> < html lang = "en" >
< head >
< meta http-equiv = "Content-Type" content = "text/html; charset=UTF-8" >
< meta charset = "utf-8" >
< meta http-equiv = "X-UA-Compatible" content = "IE=edge" >
< meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< title > AMR with tidymodels • AMR (for R)< / title >
<!-- favicons --> < link rel = "icon" type = "image/png" sizes = "16x16" href = "../favicon-16x16.png" >
< link rel = "icon" type = "image/png" sizes = "32x32" href = "../favicon-32x32.png" >
< link rel = "apple-touch-icon" type = "image/png" sizes = "180x180" href = "../apple-touch-icon.png" >
< link rel = "apple-touch-icon" type = "image/png" sizes = "120x120" href = "../apple-touch-icon-120x120.png" >
< link rel = "apple-touch-icon" type = "image/png" sizes = "76x76" href = "../apple-touch-icon-76x76.png" >
< link rel = "apple-touch-icon" type = "image/png" sizes = "60x60" href = "../apple-touch-icon-60x60.png" >
< script src = "../deps/jquery-3.6.0/jquery-3.6.0.min.js" > < / script > < meta name = "viewport" content = "width=device-width, initial-scale=1, shrink-to-fit=no" >
< link href = "../deps/bootstrap-5.3.1/bootstrap.min.css" rel = "stylesheet" >
< script src = "../deps/bootstrap-5.3.1/bootstrap.bundle.min.js" > < / script > < link href = "../deps/Lato-0.4.9/font.css" rel = "stylesheet" >
< link href = "../deps/Fira_Code-0.4.9/font.css" rel = "stylesheet" >
< link href = "../deps/font-awesome-6.5.2/css/all.min.css" rel = "stylesheet" >
< link href = "../deps/font-awesome-6.5.2/css/v4-shims.min.css" rel = "stylesheet" >
< script src = "../deps/headroom-0.11.0/headroom.min.js" > < / script > < script src = "../deps/headroom-0.11.0/jQuery.headroom.min.js" > < / script > < script src = "../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js" > < / script > < script src = "../deps/clipboard.js-2.0.11/clipboard.min.js" > < / script > < script src = "../deps/search-1.0.0/autocomplete.jquery.min.js" > < / script > < script src = "../deps/search-1.0.0/fuse.min.js" > < / script > < script src = "../deps/search-1.0.0/mark.min.js" > < / script > <!-- pkgdown --> < script src = "../pkgdown.js" > < / script > < link href = "../extra.css" rel = "stylesheet" >
< script src = "../extra.js" > < / script > < meta property = "og:title" content = "AMR with tidymodels" >
< / head >
< body >
< a href = "#main" class = "visually-hidden-focusable" > Skip to contents< / a >
< nav class = "navbar navbar-expand-lg fixed-top bg-primary" data-bs-theme = "dark" aria-label = "Site navigation" > < div class = "container" >
< a class = "navbar-brand me-2" href = "../index.html" > AMR (for R)< / a >
< small class = "nav-text text-muted me-auto" data-bs-toggle = "tooltip" data-bs-placement = "bottom" title = "" > 2.1.1.9123< / small >
< button class = "navbar-toggler" type = "button" data-bs-toggle = "collapse" data-bs-target = "#navbar" aria-controls = "navbar" aria-expanded = "false" aria-label = "Toggle navigation" >
< span class = "navbar-toggler-icon" > < / span >
< / button >
< div id = "navbar" class = "collapse navbar-collapse ms-3" >
< ul class = "navbar-nav me-auto" >
< li class = "active nav-item dropdown" >
< button class = "nav-link dropdown-toggle" type = "button" id = "dropdown-how-to" data-bs-toggle = "dropdown" aria-expanded = "false" aria-haspopup = "true" > < span class = "fa fa-question-circle" > < / span > How to< / button >
< ul class = "dropdown-menu" aria-labelledby = "dropdown-how-to" >
< li > < a class = "dropdown-item" href = "../articles/AMR.html" > < span class = "fa fa-directions" > < / span > Conduct AMR Analysis< / a > < / li >
< li > < a class = "dropdown-item" href = "../reference/antibiogram.html" > < span class = "fa fa-file-prescription" > < / span > Generate Antibiogram (Trad./Syndromic/WISCA)< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/resistance_predict.html" > < span class = "fa fa-dice" > < / span > Predict Antimicrobial Resistance< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/datasets.html" > < span class = "fa fa-database" > < / span > Download Data Sets for Own Use< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/AMR_with_tidymodels.html" > < span class = "fa fa-square-root-variable" > < / span > Use AMR for Predictive Modelling (tidymodels)< / a > < / li >
< li > < a class = "dropdown-item" href = "../reference/AMR-options.html" > < span class = "fa fa-gear" > < / span > Set User- Or Team-specific Package Settings< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/PCA.html" > < span class = "fa fa-compress" > < / span > Conduct Principal Component Analysis for AMR< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/MDR.html" > < span class = "fa fa-skull-crossbones" > < / span > Determine Multi-Drug Resistance (MDR)< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/WHONET.html" > < span class = "fa fa-globe-americas" > < / span > Work with WHONET Data< / a > < / li >
< li > < a class = "dropdown-item" href = "../articles/EUCAST.html" > < span class = "fa fa-exchange-alt" > < / span > Apply Eucast Rules< / a > < / li >
< li > < a class = "dropdown-item" href = "../reference/mo_property.html" > < span class = "fa fa-bug" > < / span > Get Taxonomy of a Microorganism< / a > < / li >
< li > < a class = "dropdown-item" href = "../reference/ab_property.html" > < span class = "fa fa-capsules" > < / span > Get Properties of an Antibiotic Drug< / a > < / li >
< li > < a class = "dropdown-item" href = "../reference/av_property.html" > < span class = "fa fa-capsules" > < / span > Get Properties of an Antiviral Drug< / a > < / li >
< / ul >
< / li >
< li class = "nav-item" > < a class = "nav-link" href = "../articles/AMR_for_Python.html" > < span class = "fa fab fa-python" > < / span > AMR for Python< / a > < / li >
< li class = "nav-item" > < a class = "nav-link" href = "../reference/index.html" > < span class = "fa fa-book-open" > < / span > Manual< / a > < / li >
< li class = "nav-item" > < a class = "nav-link" href = "../authors.html" > < span class = "fa fa-users" > < / span > Authors< / a > < / li >
< / ul >
< ul class = "navbar-nav" >
< li class = "nav-item" > < a class = "nav-link" href = "../news/index.html" > < span class = "fa far fa-newspaper" > < / span > Changelog< / a > < / li >
< li class = "nav-item" > < a class = "external-link nav-link" href = "https://github.com/msberends/AMR" > < span class = "fa fab fa-github" > < / span > Source Code< / a > < / li >
< / ul >
< / div >
< / div >
< / nav > < div class = "container template-article" >
< div class = "row" >
< main id = "main" class = "col-md-9" > < div class = "page-header" >
< img src = "../logo.svg" class = "logo" alt = "" > < h1 > AMR with tidymodels< / h1 >
< small class = "dont-index" > Source: < a href = "https://github.com/msberends/AMR/blob/main/vignettes/AMR_with_tidymodels.Rmd" class = "external-link" > < code > vignettes/AMR_with_tidymodels.Rmd< / code > < / a > < / small >
< div class = "d-none name" > < code > AMR_with_tidymodels.Rmd< / code > < / div >
< / div >
< blockquote >
< p > This page was written entirely by our < a href = "https://chatgpt.com/g/g-M4UNLwFi5-amr-for-r-assistant" class = "external-link" > AMR for R
Assistant< / a > , a manually trained ChatGPT model that can answer
questions about the AMR package.< / p >
< / blockquote >
< p > Antimicrobial resistance (AMR) is a global health crisis, and
understanding resistance patterns is crucial for managing effective
treatments. The < code > AMR< / code > R package provides robust tools for
analysing AMR data, including convenient antibiotic selector functions
like < code > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides()< / a > < / code > and < code > < a href = "../reference/antibiotic_class_selectors.html" > betalactams()< / a > < / code > . In
this post, we will explore how to use the < code > tidymodels< / code >
framework to predict resistance patterns in the
< code > example_isolates< / code > dataset.< / p >
< p > By leveraging the power of < code > tidymodels< / code > and the
< code > AMR< / code > package, we’ll build a reproducible machine learning
workflow to predict the Gram stain of a microorganism from its test results for two important
antibiotic classes: aminoglycosides and beta-lactams.< / p >
< div class = "section level3" >
< h3 id = "objective" >
< strong > Objective< / strong > < a class = "anchor" aria-label = "anchor" href = "#objective" > < / a >
< / h3 >
< p > Our goal is to build a predictive model using the
< code > tidymodels< / code > framework to determine the Gram stain of a
microorganism based on its antibiotic test results. We will:< / p >
< ol style = "list-style-type: decimal" >
< li > Preprocess data using the selector functions
< code > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides()< / a > < / code > and < code > < a href = "../reference/antibiotic_class_selectors.html" > betalactams()< / a > < / code > .< / li >
< li > Define a logistic regression model for prediction.< / li >
< li > Use a structured < code > tidymodels< / code > workflow to preprocess,
train, and evaluate the model.< / li >
< / ol >
< / div >
< div class = "section level3" >
< h3 id = "data-preparation" >
< strong > Data Preparation< / strong > < a class = "anchor" aria-label = "anchor" href = "#data-preparation" > < / a >
< / h3 >
< p > We begin by loading the required libraries and preparing the
< code > example_isolates< / code > dataset from the < code > AMR< / code >
package.< / p >
< div class = "sourceCode" id = "cb1" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Load required libraries< / span > < / span >
< span > < span class = "kw" > < a href = "https://rdrr.io/r/base/library.html" class = "external-link" > library< / a > < / span > < span class = "op" > (< / span > < span class = "va" > < a href = "https://tidymodels.tidymodels.org" class = "external-link" > tidymodels< / a > < / span > < span class = "op" > )< / span > < span class = "co" > # For machine learning workflows, and data manipulation (dplyr, tidyr, ...)< / span > < / span >
< span > < span class = "co" > #> ── < span style = "font-weight: bold;" > Attaching packages< / span > ────────────────────────────────────── tidymodels 1.2.0 ──< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > broom < / span > 1.0.7 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > recipes < / span > 1.1.0< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > dials < / span > 1.3.0 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > rsample < / span > 1.2.1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > dplyr < / span > 1.1.4 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > tibble < / span > 3.2.1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > ggplot2 < / span > 3.5.1 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > tidyr < / span > 1.3.1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > infer < / span > 1.0.7 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > tune < / span > 1.2.1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > modeldata < / span > 1.4.0 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > workflows < / span > 1.1.4< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > parsnip < / span > 1.2.1 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > workflowsets< / span > 1.1.0< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > purrr < / span > 1.0.2 < span style = "color: #00BB00;" > ✔< / span > < span style = "color: #0000BB;" > yardstick < / span > 1.3.1< / span > < / span >
< span > < span class = "co" > #> ── < span style = "font-weight: bold;" > Conflicts< / span > ───────────────────────────────────────── tidymodels_conflicts() ──< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BB0000;" > ✖< / span > < span style = "color: #0000BB;" > purrr< / span > ::< span style = "color: #00BB00;" > discard()< / span > masks < span style = "color: #0000BB;" > scales< / span > ::discard()< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BB0000;" > ✖< / span > < span style = "color: #0000BB;" > dplyr< / span > ::< span style = "color: #00BB00;" > filter()< / span > masks < span style = "color: #0000BB;" > stats< / span > ::filter()< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BB0000;" > ✖< / span > < span style = "color: #0000BB;" > dplyr< / span > ::< span style = "color: #00BB00;" > lag()< / span > masks < span style = "color: #0000BB;" > stats< / span > ::lag()< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BB0000;" > ✖< / span > < span style = "color: #0000BB;" > recipes< / span > ::< span style = "color: #00BB00;" > step()< / span > masks < span style = "color: #0000BB;" > stats< / span > ::step()< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #0000BB;" > •< / span > Dig deeper into tidy modeling with R at < span style = "color: #00BB00;" > https://www.tmwr.org< / span > < / span > < / span >
< span > < span class = "kw" > < a href = "https://rdrr.io/r/base/library.html" class = "external-link" > library< / a > < / span > < span class = "op" > (< / span > < span class = "va" > < a href = "https://msberends.github.io/AMR/" > AMR< / a > < / span > < span class = "op" > )< / span > < span class = "co" > # For AMR data analysis< / span > < / span >
< span > < / span >
< span > < span class = "co" > # Load the example_isolates dataset< / span > < / span >
< span > < span class = "fu" > < a href = "https://rdrr.io/r/utils/data.html" class = "external-link" > data< / a > < / span > < span class = "op" > (< / span > < span class = "st" > "example_isolates"< / span > < span class = "op" > )< / span > < span class = "co" > # Preloaded dataset with AMR results< / span > < / span >
< span > < / span >
< span > < span class = "co" > # Select relevant columns for prediction< / span > < / span >
< span > < span class = "va" > data< / span > < span class = "op" > < -< / span > < span class = "va" > example_isolates< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "co" > # select AB results dynamically< / span > < / span >
< span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/select.html" class = "external-link" > select< / a > < / span > < span class = "op" > (< / span > < span class = "va" > mo< / span > , < span class = "fu" > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides< / a > < / span > < span class = "op" > (< / span > < span class = "op" > )< / span > , < span class = "fu" > < a href = "../reference/antibiotic_class_selectors.html" > betalactams< / a > < / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "co" > # replace NAs with NI (not-interpretable)< / span > < / span >
< span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/mutate.html" class = "external-link" > mutate< / a > < / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/across.html" class = "external-link" > across< / a > < / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "https://tidyselect.r-lib.org/reference/where.html" class = "external-link" > where< / a > < / span > < span class = "op" > (< / span > < span class = "va" > is.sir< / span > < span class = "op" > )< / span > ,< / span >
< span > < span class = "op" > ~< / span > < span class = "fu" > replace_na< / span > < span class = "op" > (< / span > < span class = "va" > .x< / span > , < span class = "st" > "NI"< / span > < span class = "op" > )< / span > < span class = "op" > )< / span > ,< / span >
< span > < span class = "co" > # convert SIR columns to integers< / span > < / span >
< span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/across.html" class = "external-link" > across< / a > < / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "https://tidyselect.r-lib.org/reference/where.html" class = "external-link" > where< / a > < / span > < span class = "op" > (< / span > < span class = "va" > is.sir< / span > < span class = "op" > )< / span > ,< / span >
< span > < span class = "va" > as.integer< / span > < span class = "op" > )< / span > ,< / span >
< span > < span class = "co" > # get Gramstain of microorganisms< / span > < / span >
< span > mo < span class = "op" > =< / span > < span class = "fu" > < a href = "https://rdrr.io/r/base/factor.html" class = "external-link" > as.factor< / a > < / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "../reference/mo_property.html" > mo_gramstain< / a > < / span > < span class = "op" > (< / span > < span class = "va" > mo< / span > < span class = "op" > )< / span > < span class = "op" > )< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "co" > # drop NAs - the ones without a Gramstain (fungi, etc.)< / span > < / span >
< span > < span class = "fu" > drop_na< / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < / span >
< span > < span class = "co" > #> ℹ For aminoglycosides() using columns 'GEN' (gentamicin), 'TOB'< / span > < / span >
< span > < span class = "co" > #> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)< / span > < / span >
< span > < span class = "co" > #> ℹ For betalactams() using columns 'PEN' (benzylpenicillin), 'OXA'< / span > < / span >
< span > < span class = "co" > #> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'< / span > < / span >
< span > < span class = "co" > #> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'< / span > < / span >
< span > < span class = "co" > #> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'< / span > < / span >
< span > < span class = "co" > #> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),< / span > < / span >
< span > < span class = "co" > #> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)< / span > < / span > < / code > < / pre > < / div >
< p > < strong > Explanation:< / strong > < / p >
< ul >
< li >
< code > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides()< / a > < / code > and < code > < a href = "../reference/antibiotic_class_selectors.html" > betalactams()< / a > < / code >
dynamically select columns for antibiotics in these classes.< / li >
< li >
< code > drop_na()< / code > removes rows with missing values (here, microorganisms
without a Gram stain, such as fungi), so the model receives complete cases for
training.< / li >
< / ul >
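< p > The missing-value replacement and integer conversion above can be sketched in base R. This is only an illustration: the vector < code > results< / code > and the level order in < code > sir_levels< / code > are assumptions made for the example, and the < code > sir< / code > class in the < code > AMR< / code > package may order its levels differently.< / p >

```r
# Hypothetical susceptibility results; NA marks a missing test result
results <- c("S", "R", NA, "I", "S")

# Replace missing values with an explicit "NI" category,
# mirroring replace_na(.x, "NI") in the pipeline above
results[is.na(results)] <- "NI"

# Assumed level order, for illustration only
sir_levels <- c("S", "I", "R", "NI")

# Each category becomes its position in the level order
encoded <- as.integer(factor(results, levels = sir_levels))
encoded
```

< p > Because the integer codes depend entirely on the level order, it is worth checking the levels of the real columns before interpreting model coefficients.< / p >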
< / div >
< div class = "section level3" >
< h3 id = "defining-the-workflow" >
< strong > Defining the Workflow< / strong > < a class = "anchor" aria-label = "anchor" href = "#defining-the-workflow" > < / a >
< / h3 >
< p > We now define the < code > tidymodels< / code > workflow, which consists of
three steps: preprocessing, model specification, and fitting.< / p >
< div class = "section level4" >
< h4 id = "preprocessing-with-a-recipe" > 1. Preprocessing with a Recipe< a class = "anchor" aria-label = "anchor" href = "#preprocessing-with-a-recipe" > < / a >
< / h4 >
< p > We create a recipe to preprocess the data for modelling.< / p >
< div class = "sourceCode" id = "cb2" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Define the recipe for data preprocessing< / span > < / span >
< span > < span class = "va" > resistance_recipe< / span > < span class = "op" > < -< / span > < span class = "fu" > recipe< / span > < span class = "op" > (< / span > < span class = "va" > mo< / span > < span class = "op" > ~< / span > < span class = "va" > .< / span > , data < span class = "op" > =< / span > < span class = "va" > data< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > step_corr< / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "https://rdrr.io/r/base/c.html" class = "external-link" > c< / a > < / span > < span class = "op" > (< / span > < span class = "fu" > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides< / a > < / span > < span class = "op" > (< / span > < span class = "op" > )< / span > , < span class = "fu" > < a href = "../reference/antibiotic_class_selectors.html" > betalactams< / a > < / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < span class = "op" > )< / span > , threshold < span class = "op" > =< / span > < span class = "fl" > 0.9< / span > < span class = "op" > )< / span > < / span >
< span > < span class = "va" > resistance_recipe< / span > < / span >
< span > < span class = "co" > #> < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BBBB;" > ──< / span > < span style = "font-weight: bold;" > Recipe< / span > < span style = "color: #00BBBB;" > ──────────────────────────────────────────────────────────────────────< / span > < / span > < / span >
< span > < span class = "co" > #> < / span > < / span >
< span > < span class = "co" > #> ── Inputs< / span > < / span >
< span > < span class = "co" > #> Number of variables by role< / span > < / span >
< span > < span class = "co" > #> outcome: 1< / span > < / span >
< span > < span class = "co" > #> predictor: 20< / span > < / span >
< span > < span class = "co" > #> < / span > < / span >
< span > < span class = "co" > #> ── Operations< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #00BBBB;" > •< / span > Correlation filter on: < span style = "color: #0000BB;" > c(aminoglycosides(), betalactams())< / span > < / span > < / span > < / code > < / pre > < / div >
< p > < strong > Explanation:< / strong > < / p >
< ul >
< li >
< code > recipe(mo ~ ., data = data)< / code > will take the
< code > mo< / code > column as outcome and all other columns as
predictors.< / li >
< li >
< code > step_corr()< / code > removes predictors (i.e., antibiotic
columns) so that no remaining pair has an absolute correlation above the threshold of 0.9.< / li >
< / ul >
< p > Notice how the recipe contains just the antibiotic selector functions;
there is no need to list the columns explicitly.< / p >
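< p > To make the idea of a correlation filter concrete, here is a simplified base R sketch on made-up data. It is not the algorithm < code > recipes< / code > uses internally, which may pick a different member of a correlated pair to drop; the data frame < code > toy< / code > and its columns are hypothetical.< / p >

```r
set.seed(1)
x1 <- rnorm(100)

# Hypothetical predictors: b is almost a copy of a, c is unrelated
toy <- data.frame(
  a = x1,
  b = x1 + rnorm(100, sd = 0.01),
  c = rnorm(100)
)

# Absolute pairwise correlations
cors <- abs(cor(toy))

# Keep only the lower triangle so each pair is inspected once
cors[upper.tri(cors, diag = TRUE)] <- 0

# Flag one column from every pair above the 0.9 threshold
high <- which(cors > 0.9, arr.ind = TRUE)
flagged <- unique(rownames(cors)[high[, "row"]])
flagged

# Drop the flagged columns
filtered <- toy[, setdiff(names(toy), flagged), drop = FALSE]
names(filtered)
```

< p > Dropping near-duplicate predictors in this way helps keep the coefficient estimates of a linear model such as logistic regression stable.< / p >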
< / div >
< div class = "section level4" >
< h4 id = "specifying-the-model" > 2. Specifying the Model< a class = "anchor" aria-label = "anchor" href = "#specifying-the-model" > < / a >
< / h4 >
< p > We define a logistic regression model since predicting the Gram stain
(Gram-negative or Gram-positive) is a binary classification task.< / p >
< div class = "sourceCode" id = "cb3" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Specify a logistic regression model< / span > < / span >
< span > < span class = "va" > logistic_model< / span > < span class = "op" > < -< / span > < span class = "fu" > logistic_reg< / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > set_engine< / span > < span class = "op" > (< / span > < span class = "st" > "glm"< / span > < span class = "op" > )< / span > < span class = "co" > # Use the Generalized Linear Model engine< / span > < / span >
< span > < span class = "va" > logistic_model< / span > < / span >
< span > < span class = "co" > #> Logistic Regression Model Specification (classification)< / span > < / span >
< span > < span class = "co" > #> < / span > < / span >
< span > < span class = "co" > #> Computational engine: glm< / span > < / span > < / code > < / pre > < / div >
< p > < strong > Explanation:< / strong > < / p >
< ul >
< li >
< code > logistic_reg()< / code > sets up a logistic regression
model.< / li >
< li >
< code > set_engine("glm")< / code > specifies the use of R’s built-in GLM
engine.< / li >
< / ul >
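< p > For readers who want to see what the < code > glm< / code > engine does underneath, the following is a minimal sketch that calls < code > stats::glm()< / code > directly on simulated data. The predictor < code > x< / code > , the outcome < code > y< / code > and the effect size are invented for illustration and are unrelated to < code > example_isolates< / code > .< / p >

```r
set.seed(42)
n <- 200
x <- rnorm(n)

# Hypothetical binary outcome whose log-odds rise with x
y <- factor(ifelse(runif(n) < plogis(1.5 * x), "Gram-positive", "Gram-negative"))

# With a factor response, glm() treats the first level as "failure",
# so it models the probability of the second level ("Gram-positive")
fit <- glm(y ~ x, family = binomial())

# Predicted probabilities for three new values of x
p <- predict(fit, newdata = data.frame(x = c(-2, 0, 2)), type = "response")
p
```

< p > The workflow in this article wraps the same kind of fit, with the added benefit that preprocessing and prediction are handled consistently.< / p >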
< / div >
< div class = "section level4" >
< h4 id = "building-the-workflow" > 3. Building the Workflow< a class = "anchor" aria-label = "anchor" href = "#building-the-workflow" > < / a >
< / h4 >
< p > We bundle the recipe and model together into a < code > workflow< / code > ,
which organizes the entire modeling process.< / p >
< div class = "sourceCode" id = "cb4" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Combine the recipe and model into a workflow< / span > < / span >
< span > < span class = "va" > resistance_workflow< / span > < span class = "op" > < -< / span > < span class = "fu" > workflow< / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > add_recipe< / span > < span class = "op" > (< / span > < span class = "va" > resistance_recipe< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < span class = "co" > # Add the preprocessing recipe< / span > < / span >
< span > < span class = "fu" > add_model< / span > < span class = "op" > (< / span > < span class = "va" > logistic_model< / span > < span class = "op" > )< / span > < span class = "co" > # Add the logistic regression model< / span > < / span > < / code > < / pre > < / div >
< / div >
< / div >
< div class = "section level3" >
< h3 id = "training-and-evaluating-the-model" >
< strong > Training and Evaluating the Model< / strong > < a class = "anchor" aria-label = "anchor" href = "#training-and-evaluating-the-model" > < / a >
< / h3 >
< p > To train the model, we split the data into training and testing sets.
Then, we fit the workflow on the training set and evaluate its
performance.< / p >
< div class = "sourceCode" id = "cb5" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Split data into training and testing sets< / span > < / span >
< span > < span class = "fu" > < a href = "https://rdrr.io/r/base/Random.html" class = "external-link" > set.seed< / a > < / span > < span class = "op" > (< / span > < span class = "fl" > 123< / span > < span class = "op" > )< / span > < span class = "co" > # For reproducibility< / span > < / span >
< span > < span class = "va" > data_split< / span > < span class = "op" > < -< / span > < span class = "fu" > initial_split< / span > < span class = "op" > (< / span > < span class = "va" > data< / span > , prop < span class = "op" > =< / span > < span class = "fl" > 0.8< / span > < span class = "op" > )< / span > < span class = "co" > # 80% training, 20% testing< / span > < / span >
< span > < span class = "va" > training_data< / span > < span class = "op" > < -< / span > < span class = "fu" > training< / span > < span class = "op" > (< / span > < span class = "va" > data_split< / span > < span class = "op" > )< / span > < span class = "co" > # Training set< / span > < / span >
< span > < span class = "va" > testing_data< / span > < span class = "op" > < -< / span > < span class = "fu" > testing< / span > < span class = "op" > (< / span > < span class = "va" > data_split< / span > < span class = "op" > )< / span > < span class = "co" > # Testing set< / span > < / span >
< span > < / span >
< span > < span class = "co" > # Fit the workflow to the training data< / span > < / span >
< span > < span class = "va" > fitted_workflow< / span > < span class = "op" > < -< / span > < span class = "va" > resistance_workflow< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > fit< / span > < span class = "op" > (< / span > < span class = "va" > training_data< / span > < span class = "op" > )< / span > < span class = "co" > # Train the model< / span > < / span >
< span > < span class = "co" > #> ℹ For aminoglycosides() using columns 'GEN' (gentamicin), 'TOB'< / span > < / span >
< span > < span class = "co" > #> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)< / span > < / span >
< span > < span class = "co" > #> ℹ For betalactams() using columns 'PEN' (benzylpenicillin), 'OXA'< / span > < / span >
< span > < span class = "co" > #> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'< / span > < / span >
< span > < span class = "co" > #> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'< / span > < / span >
< span > < span class = "co" > #> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'< / span > < / span >
< span > < span class = "co" > #> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),< / span > < / span >
< span > < span class = "co" > #> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)< / span > < / span > < / code > < / pre > < / div >
< p > < strong > Explanation:< / strong > < / p >
< ul >
< li >
< code > initial_split()< / code > splits the data into training and
testing sets.< / li >
< li >
< code > fit()< / code > trains the workflow on the training set.< / li >
< / ul >
< p > Notice how the antibiotic selector functions are called again inside
< code > fit()< / code > . Because they are stored in the recipe, they are
evaluated again when the workflow is fitted to the training data.< / p >
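< p > The 80/20 split can also be pictured with a base R analogue. This is a sketch on a hypothetical ten-row data frame, not a description of how < code > rsample< / code > computes its split internally.< / p >

```r
set.seed(123)

# Hypothetical data set with ten rows
toy <- data.frame(id = 1:10)

# Randomly pick 80% of the row indices for training
train_idx <- sample(nrow(toy), size = floor(0.8 * nrow(toy)))

train <- toy[train_idx, , drop = FALSE]   # rows used for fitting
test  <- toy[-train_idx, , drop = FALSE]  # held-out rows for evaluation

nrow(train)
nrow(test)
```

< p > The key property is that no row appears in both sets, so the test set gives an honest estimate of performance on unseen data.< / p >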
< p > Next, we evaluate the model on the testing data.< / p >
< div class = "sourceCode" id = "cb6" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "co" > # Make predictions on the testing set< / span > < / span >
< span > < span class = "va" > predictions< / span > < span class = "op" > < -< / span > < span class = "va" > fitted_workflow< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > < a href = "https://rdrr.io/r/stats/predict.html" class = "external-link" > predict< / a > < / span > < span class = "op" > (< / span > < span class = "va" > testing_data< / span > < span class = "op" > )< / span > < span class = "co" > # Generate predictions< / span > < / span >
< span > < span class = "va" > probabilities< / span > < span class = "op" > < -< / span > < span class = "va" > fitted_workflow< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > < a href = "https://rdrr.io/r/stats/predict.html" class = "external-link" > predict< / a > < / span > < span class = "op" > (< / span > < span class = "va" > testing_data< / span > , type < span class = "op" > =< / span > < span class = "st" > "prob"< / span > < span class = "op" > )< / span > < span class = "co" > # Generate probabilities< / span > < / span >
< span > < / span >
< span > < span class = "va" > predictions< / span > < span class = "op" > < -< / span > < span class = "va" > predictions< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/bind_cols.html" class = "external-link" > bind_cols< / a > < / span > < span class = "op" > (< / span > < span class = "va" > probabilities< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > < a href = "https://dplyr.tidyverse.org/reference/bind_cols.html" class = "external-link" > bind_cols< / a > < / span > < span class = "op" > (< / span > < span class = "va" > testing_data< / span > < span class = "op" > )< / span > < span class = "co" > # Combine with true labels< / span > < / span >
< span > < / span >
< span > < span class = "va" > predictions< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # A tibble: 394 × 24< / span > < / span > < / span >
< span > < span class = "co" > #> .pred_class `.pred_Gram-negative` `.pred_Gram-positive` mo GEN TOB< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494; font-style: italic;" > < fct> < / span > < span style = "color: #949494; font-style: italic;" > < dbl> < / span > < span style = "color: #949494; font-style: italic;" > < dbl> < / span > < span style = "color: #949494; font-style: italic;" > < fct> < / span > < span style = "color: #949494; font-style: italic;" > < int> < / span > < span style = "color: #949494; font-style: italic;" > < int> < / span > < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 1< / span > Gram-positive 1.07< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > 8.93< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > Gram-p… 5 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 2< / span > Gram-positive 3.17< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 8< / span > 1.00< span style = "color: #949494;" > e< / span > + 0 Gram-p… 5 1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 3< / span > Gram-negative 9.99< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > 1.42< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 3< / span > Gram-n… 5 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 4< / span > Gram-positive 2.22< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > -16< / span > 1 < span style = "color: #949494;" > e< / span > + 0 Gram-p… 5 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 5< / span > Gram-negative 9.46< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > 5.42< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 2< / span > Gram-n… 5 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 6< / span > Gram-positive 1.07< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > 8.93< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > - 1< / span > Gram-p… 5 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 7< / span > Gram-positive 2.22< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > -16< / span > 1 < span style = "color: #949494;" > e< / span > + 0 Gram-p… 1 5< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 8< / span > Gram-positive 2.22< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > -16< / span > 1 < span style = "color: #949494;" > e< / span > + 0 Gram-p… 4 4< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 9< / span > Gram-negative 1 < span style = "color: #949494;" > e< / span > + 0 2.22< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > -16< / span > Gram-n… 1 1< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 10< / span > Gram-positive 6.05< span style = "color: #949494;" > e< / span > < span style = "color: #BB0000;" > -11< / span > 1.00< span style = "color: #949494;" > e< / span > + 0 Gram-p… 4 4< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # ℹ 384 more rows< / span > < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # ℹ 18 more variables: AMK < int> , KAN < int> , PEN < int> , OXA < int> , FLC < int> ,< / span > < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # AMX < int> , AMC < int> , AMP < int> , TZP < int> , CZO < int> , FEP < int> ,< / span > < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # CXM < int> , FOX < int> , CTX < int> , CAZ < int> , CRO < int> , IPM < int> , MEM < int> < / span > < / span > < / span >
< span > < / span >
< span > < span class = "co" > # Evaluate model performance< / span > < / span >
< span > < span class = "va" > metrics< / span > < span class = "op" > < -< / span > < span class = "va" > predictions< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > metrics< / span > < span class = "op" > (< / span > truth < span class = "op" > =< / span > < span class = "va" > mo< / span > , estimate < span class = "op" > =< / span > < span class = "va" > .pred_class< / span > < span class = "op" > )< / span > < span class = "co" > # Calculate performance metrics< / span > < / span >
< span > < / span >
< span > < span class = "va" > metrics< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494;" > # A tibble: 2 × 3< / span > < / span > < / span >
< span > < span class = "co" > #> .metric .estimator .estimate< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #949494; font-style: italic;" > < chr> < / span > < span style = "color: #949494; font-style: italic;" > < chr> < / span > < span style = "color: #949494; font-style: italic;" > < dbl> < / span > < / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 1< / span > accuracy binary 0.995< / span > < / span >
< span > < span class = "co" > #> < span style = "color: #BCBCBC;" > 2< / span > kap binary 0.989< / span > < / span > < / code > < / pre > < / div >
< p > < strong > Explanation:< / strong > < / p >
< ul >
< li >
< code > < a href = "https://rdrr.io/r/stats/predict.html" class = "external-link" > predict()< / a > < / code > generates predictions on the testing
set.< / li >
< li >
< code > metrics()< / code > computes evaluation metrics like accuracy and
kappa.< / li >
< / ul >
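< p > Accuracy and kappa each summarise performance in a single number. As a
complementary sketch, a confusion matrix shows which class is misclassified.
The example below uses < code > conf_mat()< / code > from the
< code > yardstick< / code > package on a small, made-up data set:
< code > toy_predictions< / code > and its values are hypothetical and only
mimic the < code > mo< / code > and < code > .pred_class< / code > columns of
< code > predictions< / code > .< / p >
< div class = "sourceCode" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > library(yardstick)
library(tibble)

# Hypothetical toy data: true Gram stain (mo) and predicted class
gram_levels <- c("Gram-negative", "Gram-positive")
toy_predictions <- tibble(
  mo = factor(
    c("Gram-negative", "Gram-negative", "Gram-positive", "Gram-positive", "Gram-positive"),
    levels = gram_levels
  ),
  .pred_class = factor(
    c("Gram-negative", "Gram-positive", "Gram-positive", "Gram-positive", "Gram-positive"),
    levels = gram_levels
  )
)

# Cross-tabulate the predicted classes against the true classes
toy_cm <- conf_mat(toy_predictions, truth = mo, estimate = .pred_class)
toy_cm< / code > < / pre > < / div >
< p > With the objects from this article, the equivalent call would be
< code > predictions %>% conf_mat(truth = mo, estimate = .pred_class)< / code > .< / p >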
< p > It appears we can predict the Gram stain with an accuracy of 0.995,
based on the AMR results of aminoglycosides and beta-lactam
antibiotics. The ROC curve looks like this:< / p >
< div class = "sourceCode" id = "cb7" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > < span > < span class = "va" > predictions< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > roc_curve< / span > < span class = "op" > (< / span > < span class = "va" > mo< / span > , < span class = "va" > `.pred_Gram-negative`< / span > < span class = "op" > )< / span > < span class = "op" > < a href = "https://magrittr.tidyverse.org/reference/pipe.html" class = "external-link" > %> %< / a > < / span > < / span >
< span > < span class = "fu" > < a href = "https://ggplot2.tidyverse.org/reference/autoplot.html" class = "external-link" > autoplot< / a > < / span > < span class = "op" > (< / span > < span class = "op" > )< / span > < / span > < / code > < / pre > < / div >
< p > < img src = "AMR_with_tidymodels_files/figure-html/unnamed-chunk-7-1.png" width = "720" > < / p >
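< p > The curve can also be condensed into a single number, the area under
the curve (AUC). The sketch below uses < code > roc_auc()< / code > from the
< code > yardstick< / code > package on made-up values:
< code > toy_probabilities< / code > is a hypothetical data set that only mimics
the < code > mo< / code > and < code > .pred_Gram-negative< / code > columns of
< code > predictions< / code > .< / p >
< div class = "sourceCode" > < pre class = "downlit sourceCode r" >
< code class = "sourceCode R" > library(yardstick)
library(tibble)

# Hypothetical toy data: true class and predicted probability of "Gram-negative"
toy_probabilities <- tibble(
  mo = factor(
    c("Gram-negative", "Gram-negative", "Gram-positive", "Gram-positive"),
    levels = c("Gram-negative", "Gram-positive")
  ),
  `.pred_Gram-negative` = c(0.95, 0.80, 0.30, 0.05)
)

# By default, yardstick treats the first factor level ("Gram-negative")
# as the event of interest, so its probability column is passed here
toy_auc <- roc_auc(toy_probabilities, truth = mo, `.pred_Gram-negative`)
toy_auc< / code > < / pre > < / div >
< p > With the objects from this article, the equivalent call would be
< code > predictions %>% roc_auc(mo, `.pred_Gram-negative`)< / code > .< / p >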
< / div >
< div class = "section level3" >
< h3 id = "conclusion" >
< strong > Conclusion< / strong > < a class = "anchor" aria-label = "anchor" href = "#conclusion" > < / a >
< / h3 >
< p > In this post, we demonstrated how to build a machine learning
pipeline with the < code > tidymodels< / code > framework and the
< code > AMR< / code > package. By combining selector functions like
< code > < a href = "../reference/antibiotic_class_selectors.html" > aminoglycosides()< / a > < / code > and < code > < a href = "../reference/antibiotic_class_selectors.html" > betalactams()< / a > < / code > with
< code > tidymodels< / code > , we efficiently prepared data, trained a model,
and evaluated its performance.< / p >
< p > This workflow is extensible to other antibiotic classes and
resistance patterns, empowering users to analyse AMR data systematically
and reproducibly.< / p >
< / div >
< / main > < aside class = "col-md-3" > < nav id = "toc" aria-label = "Table of contents" > < h2 > On this page< / h2 >
< / nav > < / aside >
< / div >
< footer > < div class = "pkgdown-footer-left" >
< p > < code > AMR< / code > (for R). Free and open-source, licenced under the < a target = "_blank" href = "https://github.com/msberends/AMR/blob/main/LICENSE" class = "external-link" > GNU General Public License version 2.0 (GPL-2)< / a > .< br > Developed at the < a target = "_blank" href = "https://www.rug.nl" class = "external-link" > University of Groningen< / a > and < a target = "_blank" href = "https://www.umcg.nl" class = "external-link" > University Medical Center Groningen< / a > in The Netherlands.< / p >
< / div >
< div class = "pkgdown-footer-right" >
< p > < a target = "_blank" href = "https://www.rug.nl" class = "external-link" > < img src = "https://github.com/msberends/AMR/raw/main/pkgdown/assets/logo_rug.svg" style = "max-width: 150px;" > < / a > < a target = "_blank" href = "https://www.umcg.nl" class = "external-link" > < img src = "https://github.com/msberends/AMR/raw/main/pkgdown/assets/logo_umcg.svg" style = "max-width: 150px;" > < / a > < / p >
< / div >
< / footer >
< / div >
< / body >
< / html >