mirror of
https://github.com/msberends/AMR.git
synced 2025-01-24 11:44:35 +01:00
386 lines
37 KiB
HTML
<!DOCTYPE html>
|
||
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
||
<meta charset="utf-8">
|
||
<meta http-equiv="X-UA-Compatible" content="IE=edge">
|
||
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
||
<title>AMR with tidymodels • AMR (for R)</title>
|
||
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png">
|
||
<link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png">
|
||
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png">
|
||
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png">
|
||
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png">
|
||
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png">
|
||
<script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
|
||
<link href="../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet">
|
||
<script src="../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../deps/Lato-0.4.9/font.css" rel="stylesheet">
|
||
<link href="../deps/Fira_Code-0.4.9/font.css" rel="stylesheet">
|
||
<link href="../deps/font-awesome-6.5.2/css/all.min.css" rel="stylesheet">
|
||
<link href="../deps/font-awesome-6.5.2/css/v4-shims.min.css" rel="stylesheet">
|
||
<script src="../deps/headroom-0.11.0/headroom.min.js"></script><script src="../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../deps/search-1.0.0/fuse.min.js"></script><script src="../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown --><script src="../pkgdown.js"></script><link href="../extra.css" rel="stylesheet">
|
||
<script src="../extra.js"></script><meta property="og:title" content="AMR with tidymodels">
|
||
</head>
|
||
<body>
|
||
<a href="#main" class="visually-hidden-focusable">Skip to contents</a>
<nav class="navbar navbar-expand-lg fixed-top bg-primary" data-bs-theme="dark" aria-label="Site navigation"><div class="container">
|
||
|
||
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
|
||
|
||
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">2.1.1.9123</small>
|
||
|
||
|
||
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
|
||
<span class="navbar-toggler-icon"></span>
|
||
</button>
|
||
|
||
<div id="navbar" class="collapse navbar-collapse ms-3">
|
||
<ul class="navbar-nav me-auto">
|
||
<li class="active nav-item dropdown">
|
||
<button class="nav-link dropdown-toggle" type="button" id="dropdown-how-to" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true"><span class="fa fa-question-circle"></span> How to</button>
|
||
<ul class="dropdown-menu" aria-labelledby="dropdown-how-to">
|
||
<li><a class="dropdown-item" href="../articles/AMR.html"><span class="fa fa-directions"></span> Conduct AMR Analysis</a></li>
|
||
<li><a class="dropdown-item" href="../reference/antibiogram.html"><span class="fa fa-file-prescription"></span> Generate Antibiogram (Trad./Syndromic/WISCA)</a></li>
|
||
<li><a class="dropdown-item" href="../articles/resistance_predict.html"><span class="fa fa-dice"></span> Predict Antimicrobial Resistance</a></li>
|
||
<li><a class="dropdown-item" href="../articles/datasets.html"><span class="fa fa-database"></span> Download Data Sets for Own Use</a></li>
|
||
<li><a class="dropdown-item" href="../articles/AMR_with_tidymodels.html"><span class="fa fa-square-root-variable"></span> Use AMR for Predictive Modelling (tidymodels)</a></li>
|
||
<li><a class="dropdown-item" href="../reference/AMR-options.html"><span class="fa fa-gear"></span> Set User- Or Team-specific Package Settings</a></li>
|
||
<li><a class="dropdown-item" href="../articles/PCA.html"><span class="fa fa-compress"></span> Conduct Principal Component Analysis for AMR</a></li>
|
||
<li><a class="dropdown-item" href="../articles/MDR.html"><span class="fa fa-skull-crossbones"></span> Determine Multi-Drug Resistance (MDR)</a></li>
|
||
<li><a class="dropdown-item" href="../articles/WHONET.html"><span class="fa fa-globe-americas"></span> Work with WHONET Data</a></li>
|
||
<li><a class="dropdown-item" href="../articles/EUCAST.html"><span class="fa fa-exchange-alt"></span> Apply Eucast Rules</a></li>
|
||
<li><a class="dropdown-item" href="../reference/mo_property.html"><span class="fa fa-bug"></span> Get Taxonomy of a Microorganism</a></li>
|
||
<li><a class="dropdown-item" href="../reference/ab_property.html"><span class="fa fa-capsules"></span> Get Properties of an Antibiotic Drug</a></li>
|
||
<li><a class="dropdown-item" href="../reference/av_property.html"><span class="fa fa-capsules"></span> Get Properties of an Antiviral Drug</a></li>
|
||
</ul>
|
||
</li>
|
||
<li class="nav-item"><a class="nav-link" href="../articles/AMR_for_Python.html"><span class="fa fab fa-python"></span> AMR for Python</a></li>
|
||
<li class="nav-item"><a class="nav-link" href="../reference/index.html"><span class="fa fa-book-open"></span> Manual</a></li>
|
||
<li class="nav-item"><a class="nav-link" href="../authors.html"><span class="fa fa-users"></span> Authors</a></li>
|
||
</ul>
|
||
<ul class="navbar-nav">
|
||
<li class="nav-item"><a class="nav-link" href="../news/index.html"><span class="fa far fa-newspaper"></span> Changelog</a></li>
|
||
<li class="nav-item"><a class="external-link nav-link" href="https://github.com/msberends/AMR"><span class="fa fab fa-github"></span> Source Code</a></li>
|
||
</ul>
|
||
</div>
|
||
|
||
|
||
</div>
|
||
</nav><div class="container template-article">
<div class="row">
|
||
<main id="main" class="col-md-9"><div class="page-header">
|
||
<img src="../logo.svg" class="logo" alt=""><h1>AMR with tidymodels</h1>
|
||
|
||
|
||
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/main/vignettes/AMR_with_tidymodels.Rmd" class="external-link"><code>vignettes/AMR_with_tidymodels.Rmd</code></a></small>
|
||
<div class="d-none name"><code>AMR_with_tidymodels.Rmd</code></div>
|
||
</div>
<blockquote>
|
||
<p>This page was entirely written by our <a href="https://chatgpt.com/g/g-M4UNLwFi5-amr-for-r-assistant" class="external-link">AMR for R
Assistant</a>, a manually trained ChatGPT model able to answer any
question about the AMR package.</p>
|
||
</blockquote>
|
||
<p>Antimicrobial resistance (AMR) is a global health crisis, and
understanding resistance patterns is crucial for choosing effective
treatments. The <code>AMR</code> R package provides robust tools for
analysing AMR data, including convenient antibiotic selector functions
like <code><a href="../reference/antibiotic_class_selectors.html">aminoglycosides()</a></code> and <code><a href="../reference/antibiotic_class_selectors.html">betalactams()</a></code>. In
this post, we will explore how to use the <code>tidymodels</code>
framework to predict the Gram stain of microorganisms in the
<code>example_isolates</code> dataset from their resistance patterns.</p>
|
||
<p>By leveraging the power of <code>tidymodels</code> and the
<code>AMR</code> package, we’ll build a reproducible machine learning
workflow that predicts the Gram stain of a microorganism from its test
results for two important antibiotic classes: aminoglycosides and
beta-lactams.</p>
|
||
<div class="section level3">
|
||
<h3 id="objective">
|
||
<strong>Objective</strong><a class="anchor" aria-label="anchor" href="#objective"></a>
|
||
</h3>
|
||
<p>Our goal is to build a predictive model using the
<code>tidymodels</code> framework to determine the Gram stain of a
microorganism based on its antimicrobial susceptibility results. We will:</p>
|
||
<ol style="list-style-type: decimal">
|
||
<li>Preprocess data using the selector functions
|
||
<code><a href="../reference/antibiotic_class_selectors.html">aminoglycosides()</a></code> and <code><a href="../reference/antibiotic_class_selectors.html">betalactams()</a></code>.</li>
|
||
<li>Define a logistic regression model for prediction.</li>
|
||
<li>Use a structured <code>tidymodels</code> workflow to preprocess,
|
||
train, and evaluate the model.</li>
|
||
</ol>
|
||
</div>
|
||
<div class="section level3">
|
||
<h3 id="data-preparation">
|
||
<strong>Data Preparation</strong><a class="anchor" aria-label="anchor" href="#data-preparation"></a>
|
||
</h3>
|
||
<p>We begin by loading the required libraries and preparing the
|
||
<code>example_isolates</code> dataset from the <code>AMR</code>
|
||
package.</p>
|
||
<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Load required libraries</span></span>
|
||
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://tidymodels.tidymodels.org" class="external-link">tidymodels</a></span><span class="op">)</span> <span class="co"># For machine learning workflows, and data manipulation (dplyr, tidyr, ...)</span></span>
|
||
<span><span class="co">#> ── <span style="font-weight: bold;">Attaching packages</span> ────────────────────────────────────── tidymodels 1.2.0 ──</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">broom </span> 1.0.7 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">recipes </span> 1.1.0</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">dials </span> 1.3.0 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">rsample </span> 1.2.1</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">dplyr </span> 1.1.4 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">tibble </span> 3.2.1</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">ggplot2 </span> 3.5.1 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">tidyr </span> 1.3.1</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">infer </span> 1.0.7 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">tune </span> 1.2.1</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">modeldata </span> 1.4.0 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">workflows </span> 1.1.4</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">parsnip </span> 1.2.1 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">workflowsets</span> 1.1.0</span></span>
|
||
<span><span class="co">#> <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">purrr </span> 1.0.2 <span style="color: #00BB00;">✔</span> <span style="color: #0000BB;">yardstick </span> 1.3.1</span></span>
|
||
<span><span class="co">#> ── <span style="font-weight: bold;">Conflicts</span> ───────────────────────────────────────── tidymodels_conflicts() ──</span></span>
|
||
<span><span class="co">#> <span style="color: #BB0000;">✖</span> <span style="color: #0000BB;">purrr</span>::<span style="color: #00BB00;">discard()</span> masks <span style="color: #0000BB;">scales</span>::discard()</span></span>
|
||
<span><span class="co">#> <span style="color: #BB0000;">✖</span> <span style="color: #0000BB;">dplyr</span>::<span style="color: #00BB00;">filter()</span> masks <span style="color: #0000BB;">stats</span>::filter()</span></span>
|
||
<span><span class="co">#> <span style="color: #BB0000;">✖</span> <span style="color: #0000BB;">dplyr</span>::<span style="color: #00BB00;">lag()</span> masks <span style="color: #0000BB;">stats</span>::lag()</span></span>
|
||
<span><span class="co">#> <span style="color: #BB0000;">✖</span> <span style="color: #0000BB;">recipes</span>::<span style="color: #00BB00;">step()</span> masks <span style="color: #0000BB;">stats</span>::step()</span></span>
|
||
<span><span class="co">#> <span style="color: #0000BB;">•</span> Learn how to get started at <span style="color: #00BB00;">https://www.tidymodels.org/start/</span></span></span>
|
||
<span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://msberends.github.io/AMR/">AMR</a></span><span class="op">)</span> <span class="co"># For AMR data analysis</span></span>
|
||
<span></span>
|
||
<span><span class="co"># Load the example_isolates dataset</span></span>
|
||
<span><span class="fu"><a href="https://rdrr.io/r/utils/data.html" class="external-link">data</a></span><span class="op">(</span><span class="st">"example_isolates"</span><span class="op">)</span> <span class="co"># Preloaded dataset with AMR results</span></span>
|
||
<span></span>
|
||
<span><span class="co"># Select relevant columns for prediction</span></span>
|
||
<span><span class="va">data</span> <span class="op"><-</span> <span class="va">example_isolates</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="co"># select AB results dynamically</span></span>
|
||
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html" class="external-link">select</a></span><span class="op">(</span><span class="va">mo</span>, <span class="fu"><a href="../reference/antibiotic_class_selectors.html">aminoglycosides</a></span><span class="op">(</span><span class="op">)</span>, <span class="fu"><a href="../reference/antibiotic_class_selectors.html">betalactams</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="co"># replace NAs with NI (not-interpretable)</span></span>
|
||
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html" class="external-link">mutate</a></span><span class="op">(</span><span class="fu"><a href="https://dplyr.tidyverse.org/reference/across.html" class="external-link">across</a></span><span class="op">(</span><span class="fu"><a href="https://tidyselect.r-lib.org/reference/where.html" class="external-link">where</a></span><span class="op">(</span><span class="va">is.sir</span><span class="op">)</span>,</span>
|
||
<span> <span class="op">~</span><span class="fu">replace_na</span><span class="op">(</span><span class="va">.x</span>, <span class="st">"NI"</span><span class="op">)</span><span class="op">)</span>,</span>
|
||
<span> <span class="co"># convert SIR columns to integer codes</span></span>
|
||
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/across.html" class="external-link">across</a></span><span class="op">(</span><span class="fu"><a href="https://tidyselect.r-lib.org/reference/where.html" class="external-link">where</a></span><span class="op">(</span><span class="va">is.sir</span><span class="op">)</span>,</span>
|
||
<span> <span class="va">as.integer</span><span class="op">)</span>,</span>
|
||
<span> <span class="co"># get Gramstain of microorganisms</span></span>
|
||
<span> mo <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/factor.html" class="external-link">as.factor</a></span><span class="op">(</span><span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span><span class="op">(</span><span class="va">mo</span><span class="op">)</span><span class="op">)</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="co"># drop NAs - the ones without a Gramstain (fungi, etc.)</span></span>
|
||
<span> <span class="fu">drop_na</span><span class="op">(</span><span class="op">)</span></span>
|
||
<span><span class="co">#> ℹ For aminoglycosides() using columns 'GEN' (gentamicin), 'TOB'</span></span>
|
||
<span><span class="co">#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)</span></span>
|
||
<span><span class="co">#> ℹ For betalactams() using columns 'PEN' (benzylpenicillin), 'OXA'</span></span>
|
||
<span><span class="co">#> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'</span></span>
|
||
<span><span class="co">#> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'</span></span>
|
||
<span><span class="co">#> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'</span></span>
|
||
<span><span class="co">#> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),</span></span>
|
||
<span><span class="co">#> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)</span></span></code></pre></div>
|
||
<p><strong>Explanation:</strong></p>
|
||
<ul>
|
||
<li>
|
||
<code><a href="../reference/antibiotic_class_selectors.html">aminoglycosides()</a></code> and <code><a href="../reference/antibiotic_class_selectors.html">betalactams()</a></code>
|
||
dynamically select columns for antibiotics in these classes.</li>
|
||
<li>
<code>drop_na()</code> removes the rows that still contain missing
values, here the isolates without a Gram stain, so that the model
receives complete cases for training.</li>
|
||
</ul>
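<p>The two <code>mutate()</code> steps above can be illustrated with a
minimal base R sketch. The toy column and the level order
(<code>S</code>, <code>SDD</code>, <code>I</code>, <code>R</code>,
<code>NI</code>) are assumptions chosen to mirror how recent versions of
the <code>AMR</code> package appear to order SIR values; they are not
taken from <code>example_isolates</code>.</p>

```r
# Hypothetical toy column; the level order is an assumption meant to
# mirror the ordering of SIR values in recent AMR versions.
sir_levels <- c("S", "SDD", "I", "R", "NI")
gen <- factor(c("S", NA, "R", "I", NA), levels = sir_levels)

# Step 1: replace missing results with the explicit "NI" level
gen[is.na(gen)] <- "NI"

# Step 2: convert to integer codes; each code is the position of the level
gen_int <- as.integer(gen)
print(gen_int)
```

<p>Under this assumed ordering, a missing result ends up as the highest
code, above <code>R</code>. The model therefore treats the codes as one
numeric scale, which is a simplification worth keeping in mind when
reading the coefficients.</p>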
|
||
</div>
|
||
<div class="section level3">
|
||
<h3 id="defining-the-workflow">
|
||
<strong>Defining the Workflow</strong><a class="anchor" aria-label="anchor" href="#defining-the-workflow"></a>
|
||
</h3>
|
||
<p>We now define the <code>tidymodels</code> workflow, which consists of
three parts: a preprocessing recipe, a model specification, and a
workflow that bundles the two.</p>
|
||
<div class="section level4">
|
||
<h4 id="preprocessing-with-a-recipe">1. Preprocessing with a Recipe<a class="anchor" aria-label="anchor" href="#preprocessing-with-a-recipe"></a>
|
||
</h4>
|
||
<p>We create a recipe to preprocess the data for modelling.</p>
|
||
<div class="sourceCode" id="cb2"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Define the recipe for data preprocessing</span></span>
|
||
<span><span class="va">resistance_recipe</span> <span class="op"><-</span> <span class="fu">recipe</span><span class="op">(</span><span class="va">mo</span> <span class="op">~</span> <span class="va">.</span>, data <span class="op">=</span> <span class="va">data</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu">step_corr</span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="fu"><a href="../reference/antibiotic_class_selectors.html">aminoglycosides</a></span><span class="op">(</span><span class="op">)</span>, <span class="fu"><a href="../reference/antibiotic_class_selectors.html">betalactams</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span>, threshold <span class="op">=</span> <span class="fl">0.9</span><span class="op">)</span></span>
|
||
<span><span class="va">resistance_recipe</span></span>
|
||
<span><span class="co">#> </span></span>
|
||
<span><span class="co">#> <span style="color: #00BBBB;">──</span> <span style="font-weight: bold;">Recipe</span> <span style="color: #00BBBB;">──────────────────────────────────────────────────────────────────────</span></span></span>
|
||
<span><span class="co">#> </span></span>
|
||
<span><span class="co">#> ── Inputs</span></span>
|
||
<span><span class="co">#> Number of variables by role</span></span>
|
||
<span><span class="co">#> outcome: 1</span></span>
|
||
<span><span class="co">#> predictor: 20</span></span>
|
||
<span><span class="co">#> </span></span>
|
||
<span><span class="co">#> ── Operations</span></span>
|
||
<span><span class="co">#> <span style="color: #00BBBB;">•</span> Correlation filter on: <span style="color: #0000BB;">c(aminoglycosides(), betalactams())</span></span></span></code></pre></div>
|
||
<p><strong>Explanation:</strong></p>
|
||
<ul>
|
||
<li>
|
||
<code>recipe(mo ~ ., data = data)</code> will take the
|
||
<code>mo</code> column as outcome and all other columns as
|
||
predictors.</li>
|
||
<li>
<code>step_corr()</code> removes predictors (i.e., antibiotic
columns) whose absolute correlation with another predictor exceeds 0.9,
so that near-duplicate columns do not enter the model.</li>
|
||
</ul>
|
||
<p>Notice how the recipe contains just the antibiotic selector functions;
there is no need to name the columns explicitly.</p>
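<p>To make the idea of a correlation filter concrete, here is a
simplified base R sketch on hypothetical toy data. It is not the
<code>recipes</code> implementation, which uses its own rule to decide
which column of a correlated pair to drop. This sketch simply drops the
later column of each pair whose absolute correlation exceeds the
threshold.</p>

```r
# Toy predictors with hypothetical integer codes (not from example_isolates)
toy <- data.frame(
  GEN = c(1, 1, 4, 4, 5, 1, 4, 5),
  TOB = c(1, 1, 4, 4, 5, 1, 4, 4),
  AMK = c(5, 1, 5, 1, 4, 4, 1, 1)
)

threshold <- 0.9

# Absolute pairwise correlations between all columns
cors <- abs(cor(toy))

# Blank out the upper triangle and the diagonal so each pair is checked once
cors[upper.tri(cors, diag = TRUE)] <- 0

# Locate the pairs above the threshold and drop the later column of each pair
high <- which(cors > threshold, arr.ind = TRUE)
to_drop <- unique(rownames(cors)[high[, "row"]])

filtered <- toy[, setdiff(names(toy), to_drop), drop = FALSE]
print(to_drop)
print(names(filtered))
```

<p>In this toy data only <code>GEN</code> and <code>TOB</code> are
correlated above the threshold, so one of the two is removed and
<code>AMK</code> is kept.</p>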
|
||
</div>
|
||
<div class="section level4">
|
||
<h4 id="specifying-the-model">2. Specifying the Model<a class="anchor" aria-label="anchor" href="#specifying-the-model"></a>
|
||
</h4>
|
||
<p>We define a logistic regression model since predicting the Gram stain
(Gram-negative versus Gram-positive) is a binary classification task.</p>
|
||
<div class="sourceCode" id="cb3"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Specify a logistic regression model</span></span>
|
||
<span><span class="va">logistic_model</span> <span class="op"><-</span> <span class="fu">logistic_reg</span><span class="op">(</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu">set_engine</span><span class="op">(</span><span class="st">"glm"</span><span class="op">)</span> <span class="co"># Use the Generalized Linear Model engine</span></span>
|
||
<span><span class="va">logistic_model</span></span>
|
||
<span><span class="co">#> Logistic Regression Model Specification (classification)</span></span>
|
||
<span><span class="co">#> </span></span>
|
||
<span><span class="co">#> Computational engine: glm</span></span></code></pre></div>
|
||
<p><strong>Explanation:</strong></p>
|
||
<ul>
|
||
<li>
|
||
<code>logistic_reg()</code> sets up a logistic regression
|
||
model.</li>
|
||
<li>
|
||
<code>set_engine("glm")</code> specifies the use of R’s built-in GLM
|
||
engine.</li>
|
||
</ul>
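<p>For readers who want to see what the <code>"glm"</code> engine does
underneath, the following sketch fits a comparable model directly with
<code>stats::glm()</code> and a binomial family. The toy data frame and
its column names are hypothetical and only serve to show the mechanics;
the actual model in this post is fitted through the workflow.</p>

```r
# Hypothetical toy data: one antibiotic column coded as integers
toy <- data.frame(
  gram = factor(c("Gram-negative", "Gram-negative", "Gram-negative", "Gram-negative",
                  "Gram-positive", "Gram-positive", "Gram-positive", "Gram-positive")),
  PEN = c(4, 4, 5, 1, 1, 1, 4, 5)
)

# With a factor outcome, glm() models the probability of the second level,
# here "Gram-positive"
fit <- glm(gram ~ PEN, data = toy, family = binomial)

# Predicted probability of "Gram-positive" for each row
p_pos <- predict(fit, type = "response")

# Turn probabilities into class labels with a 0.5 cut-off
pred_class <- ifelse(p_pos > 0.5, "Gram-positive", "Gram-negative")

print(coef(fit))
print(round(p_pos, 2))
```

<p>The 0.5 cut-off is the conventional default for a two-class problem;
other cut-offs can be chosen when the two kinds of error are not equally
costly.</p>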
|
||
</div>
|
||
<div class="section level4">
|
||
<h4 id="building-the-workflow">3. Building the Workflow<a class="anchor" aria-label="anchor" href="#building-the-workflow"></a>
|
||
</h4>
|
||
<p>We bundle the recipe and model together into a <code>workflow</code>,
which organises the entire modelling process.</p>
|
||
<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Combine the recipe and model into a workflow</span></span>
|
||
<span><span class="va">resistance_workflow</span> <span class="op"><-</span> <span class="fu">workflow</span><span class="op">(</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu">add_recipe</span><span class="op">(</span><span class="va">resistance_recipe</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span> <span class="co"># Add the preprocessing recipe</span></span>
|
||
<span> <span class="fu">add_model</span><span class="op">(</span><span class="va">logistic_model</span><span class="op">)</span> <span class="co"># Add the logistic regression model</span></span></code></pre></div>
|
||
</div>
|
||
</div>
|
||
<div class="section level3">
|
||
<h3 id="training-and-evaluating-the-model">
|
||
<strong>Training and Evaluating the Model</strong><a class="anchor" aria-label="anchor" href="#training-and-evaluating-the-model"></a>
|
||
</h3>
|
||
<p>To train the model, we split the data into training and testing sets.
|
||
Then, we fit the workflow on the training set and evaluate its
|
||
performance.</p>
|
||
<div class="sourceCode" id="cb5"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Split data into training and testing sets</span></span>
|
||
<span><span class="fu"><a href="https://rdrr.io/r/base/Random.html" class="external-link">set.seed</a></span><span class="op">(</span><span class="fl">123</span><span class="op">)</span> <span class="co"># For reproducibility</span></span>
|
||
<span><span class="va">data_split</span> <span class="op"><-</span> <span class="fu">initial_split</span><span class="op">(</span><span class="va">data</span>, prop <span class="op">=</span> <span class="fl">0.8</span><span class="op">)</span> <span class="co"># 80% training, 20% testing</span></span>
|
||
<span><span class="va">training_data</span> <span class="op"><-</span> <span class="fu">training</span><span class="op">(</span><span class="va">data_split</span><span class="op">)</span> <span class="co"># Training set</span></span>
|
||
<span><span class="va">testing_data</span> <span class="op"><-</span> <span class="fu">testing</span><span class="op">(</span><span class="va">data_split</span><span class="op">)</span> <span class="co"># Testing set</span></span>
|
||
<span></span>
|
||
<span><span class="co"># Fit the workflow to the training data</span></span>
|
||
<span><span class="va">fitted_workflow</span> <span class="op"><-</span> <span class="va">resistance_workflow</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu">fit</span><span class="op">(</span><span class="va">training_data</span><span class="op">)</span> <span class="co"># Train the model</span></span>
|
||
<span><span class="co">#> ℹ For aminoglycosides() using columns 'GEN' (gentamicin), 'TOB'</span></span>
|
||
<span><span class="co">#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)</span></span>
|
||
<span><span class="co">#> ℹ For betalactams() using columns 'PEN' (benzylpenicillin), 'OXA'</span></span>
|
||
<span><span class="co">#> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'</span></span>
|
||
<span><span class="co">#> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'</span></span>
|
||
<span><span class="co">#> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'</span></span>
|
||
<span><span class="co">#> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),</span></span>
|
||
<span><span class="co">#> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)</span></span></code></pre></div>
|
||
<p><strong>Explanation:</strong></p>
|
||
<ul>
|
||
<li>
|
||
<code>initial_split()</code> splits the data into training and
|
||
testing sets.</li>
|
||
<li>
|
||
<code>fit()</code> trains the workflow on the training set.</li>
|
||
</ul>
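<p>The idea behind an 80/20 split can be sketched in base R as follows.
This is only an illustration on a hypothetical toy data frame; it will
not select the same rows as <code>initial_split()</code>, which has its
own sampling code and extra options.</p>

```r
set.seed(123)  # for reproducibility of this sketch

# Hypothetical toy data with ten rows
toy <- data.frame(id = seq_len(10))

# Randomly pick 80% of the row numbers for training
n <- nrow(toy)
train_idx <- sample(n, size = floor(0.8 * n))

# Training rows are the sampled ones; testing rows are all the others
training_toy <- toy[train_idx, , drop = FALSE]
testing_toy <- toy[-train_idx, , drop = FALSE]

print(nrow(training_toy))
print(nrow(testing_toy))
```

<p>Because each row goes to exactly one of the two sets, the testing set
is never seen during training and can serve as an honest check of the
model.</p>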
|
||
<p>Notice how <code>fit()</code> calls the antibiotic selector functions
again. They are stored in the recipe as function calls rather than as
fixed column names, so they are resolved against the training data when
the recipe is applied.</p>
|
||
<p>Next, we evaluate the model on the testing data.</p>
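<p>Before running the evaluation, it may help to see what two common
summary statistics for class predictions, accuracy and Cohen’s kappa,
actually measure. The sketch below computes both by hand in base R on
hypothetical labels. It is an illustration of the definitions, not the
<code>yardstick</code> code used later.</p>

```r
# Hypothetical true and predicted labels
lv <- c("Gram-negative", "Gram-positive")
truth <- factor(lv[c(1, 1, 1, 2, 2, 2, 2, 1)], levels = lv)
estimate <- factor(lv[c(1, 1, 2, 2, 2, 2, 1, 1)], levels = lv)

# Confusion matrix: rows are the truth, columns are the predictions
tab <- table(truth, estimate)

# Accuracy: share of predictions on the diagonal
accuracy <- sum(diag(tab)) / sum(tab)

# Agreement expected by chance, from the row and column totals
expected <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2

# Cohen's kappa: accuracy corrected for chance agreement
kappa_hat <- (accuracy - expected) / (1 - expected)

print(tab)
print(c(accuracy = accuracy, kappa = kappa_hat))
```

<p>Kappa is lower than accuracy whenever part of the agreement could
have arisen by chance, which makes it a useful companion when the two
classes are not equally common.</p>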
|
||
<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
|
||
<code class="sourceCode R"><span><span class="co"># Make predictions on the testing set</span></span>
|
||
<span><span class="va">predictions</span> <span class="op"><-</span> <span class="va">fitted_workflow</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu"><a href="https://rdrr.io/r/stats/predict.html" class="external-link">predict</a></span><span class="op">(</span><span class="va">testing_data</span><span class="op">)</span> <span class="co"># Generate predictions</span></span>
|
||
<span><span class="va">probabilities</span> <span class="op"><-</span> <span class="va">fitted_workflow</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu"><a href="https://rdrr.io/r/stats/predict.html" class="external-link">predict</a></span><span class="op">(</span><span class="va">testing_data</span>, type <span class="op">=</span> <span class="st">"prob"</span><span class="op">)</span> <span class="co"># Generate probabilities</span></span>
|
||
<span></span>
|
||
<span><span class="va">predictions</span> <span class="op"><-</span> <span class="va">predictions</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/bind_cols.html" class="external-link">bind_cols</a></span><span class="op">(</span><span class="va">probabilities</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
|
||
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/bind_cols.html" class="external-link">bind_cols</a></span><span class="op">(</span><span class="va">testing_data</span><span class="op">)</span> <span class="co"># Combine with true labels</span></span>
|
||
<span></span>
|
||
<span><span class="va">predictions</span></span>
|
||
<span><span class="co">#> <span style="color: #949494;"># A tibble: 394 × 24</span></span></span>
|
||
<span><span class="co">#> .pred_class `.pred_Gram-negative` `.pred_Gram-positive` mo GEN TOB</span></span>
|
||
<span><span class="co">#> <span style="color: #949494; font-style: italic;"><fct></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><dbl></span> <span style="color: #949494; font-style: italic;"><fct></span> <span style="color: #949494; font-style: italic;"><int></span> <span style="color: #949494; font-style: italic;"><int></span></span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 1</span> Gram-positive 1.07<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> 8.93<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> Gram-p… 5 5</span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 2</span> Gram-positive 3.17<span style="color: #949494;">e</span><span style="color: #BB0000;">- 8</span> 1.00<span style="color: #949494;">e</span>+ 0 Gram-p… 5 1</span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 3</span> Gram-negative 9.99<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> 1.42<span style="color: #949494;">e</span><span style="color: #BB0000;">- 3</span> Gram-n… 5 5</span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 4</span> Gram-positive 2.22<span style="color: #949494;">e</span><span style="color: #BB0000;">-16</span> 1 <span style="color: #949494;">e</span>+ 0 Gram-p… 5 5</span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 5</span> Gram-negative 9.46<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> 5.42<span style="color: #949494;">e</span><span style="color: #BB0000;">- 2</span> Gram-n… 5 5</span></span>
|
||
<span><span class="co">#> <span style="color: #BCBCBC;"> 6</span> Gram-positive 1.07<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> 8.93<span style="color: #949494;">e</span><span style="color: #BB0000;">- 1</span> Gram-p… 5 5</span></span>
<span><span class="co">#> <span style="color: #BCBCBC;"> 7</span> Gram-positive 2.22<span style="color: #949494;">e</span><span style="color: #BB0000;">-16</span> 1 <span style="color: #949494;">e</span>+ 0 Gram-p… 1 5</span></span>
<span><span class="co">#> <span style="color: #BCBCBC;"> 8</span> Gram-positive 2.22<span style="color: #949494;">e</span><span style="color: #BB0000;">-16</span> 1 <span style="color: #949494;">e</span>+ 0 Gram-p… 4 4</span></span>
<span><span class="co">#> <span style="color: #BCBCBC;"> 9</span> Gram-negative 1 <span style="color: #949494;">e</span>+ 0 2.22<span style="color: #949494;">e</span><span style="color: #BB0000;">-16</span> Gram-n… 1 1</span></span>
<span><span class="co">#> <span style="color: #BCBCBC;">10</span> Gram-positive 6.05<span style="color: #949494;">e</span><span style="color: #BB0000;">-11</span> 1.00<span style="color: #949494;">e</span>+ 0 Gram-p… 4 4</span></span>
<span><span class="co">#> <span style="color: #949494;"># ℹ 384 more rows</span></span></span>
<span><span class="co">#> <span style="color: #949494;"># ℹ 18 more variables: AMK <int>, KAN <int>, PEN <int>, OXA <int>, FLC <int>,</span></span></span>
<span><span class="co">#> <span style="color: #949494;"># AMX <int>, AMC <int>, AMP <int>, TZP <int>, CZO <int>, FEP <int>,</span></span></span>
<span><span class="co">#> <span style="color: #949494;"># CXM <int>, FOX <int>, CTX <int>, CAZ <int>, CRO <int>, IPM <int>, MEM <int></span></span></span>
<span></span>
<span><span class="co"># Evaluate model performance</span></span>
<span><span class="va">metrics</span> <span class="op"><-</span> <span class="va">predictions</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
<span> <span class="fu">metrics</span><span class="op">(</span>truth <span class="op">=</span> <span class="va">mo</span>, estimate <span class="op">=</span> <span class="va">.pred_class</span><span class="op">)</span> <span class="co"># Calculate performance metrics</span></span>
<span></span>
<span><span class="va">metrics</span></span>
<span><span class="co">#> <span style="color: #949494;"># A tibble: 2 × 3</span></span></span>
<span><span class="co">#> .metric .estimator .estimate</span></span>
<span><span class="co">#> <span style="color: #949494; font-style: italic;"><chr></span> <span style="color: #949494; font-style: italic;"><chr></span> <span style="color: #949494; font-style: italic;"><dbl></span></span></span>
<span><span class="co">#> <span style="color: #BCBCBC;">1</span> accuracy binary 0.995</span></span>
<span><span class="co">#> <span style="color: #BCBCBC;">2</span> kap binary 0.989</span></span></code></pre></div>
<p><strong>Explanation:</strong></p>
<ul>
<li>
<code><a href="https://rdrr.io/r/stats/predict.html" class="external-link">predict()</a></code> generates predictions on the testing
set.</li>
<li>
<code>metrics()</code> computes evaluation metrics; for these class predictions it reports accuracy and
Cohen's kappa (<code>kap</code>).</li>
</ul>
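<p>To make the two reported metrics concrete, here is a minimal base R sketch of how accuracy and Cohen's kappa follow from a confusion matrix. The six isolates below are made-up toy data, not taken from the model above, and all variable names are illustrative only; <code>metrics()</code> itself is not called here.</p>

```r
# Toy data (hypothetical, NOT the model output above)
lvls <- c("Gram-negative", "Gram-positive")
truth <- factor(c("Gram-negative", "Gram-negative", "Gram-positive",
                  "Gram-positive", "Gram-positive", "Gram-negative"),
                levels = lvls)
estimate <- factor(c("Gram-negative", "Gram-positive", "Gram-positive",
                     "Gram-positive", "Gram-positive", "Gram-negative"),
                   levels = lvls)

# Confusion matrix: rows are predictions, columns are the truth
cm <- table(estimate, truth)

# Accuracy: share of isolates on the diagonal (prediction equals truth)
accuracy <- sum(diag(cm)) / sum(cm)

# Agreement expected by chance, from the row and column totals
expected <- sum(rowSums(cm) * colSums(cm)) / sum(cm)^2

# Cohen's kappa: accuracy corrected for chance agreement
kap <- (accuracy - expected) / (1 - expected)

accuracy
kap
```

<p>On this toy data the accuracy is 5/6, while kappa is lower (2/3) because an agreement of 0.5 would already be expected by chance.</p>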
<p>It appears we can predict the Gram stain from the AMR results for
aminoglycosides and beta-lactam antibiotics with an accuracy of 0.995
on the testing set. The ROC curve looks like this:</p>
<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="va">predictions</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
<span> <span class="fu">roc_curve</span><span class="op">(</span><span class="va">mo</span>, <span class="va">`.pred_Gram-negative`</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%>%</a></span></span>
<span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/autoplot.html" class="external-link">autoplot</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-7-1.png" width="720"></p>
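<p>An ROC curve plots sensitivity against 1 - specificity over all probability thresholds. The following is a rough base R sketch of that idea, using made-up probabilities rather than the model's predictions. It treats Gram-negative as the event of interest, an assumption that mirrors passing <code>.pred_Gram-negative</code> to <code>roc_curve()</code>; all variable names are illustrative.</p>

```r
# Toy data (hypothetical): 1 = Gram-negative (the event), 0 = Gram-positive
is_event <- c(1, 1, 0, 1, 0, 0)
prob <- c(0.95, 0.80, 0.70, 0.60, 0.30, 0.10)  # predicted P(Gram-negative)

# Sweep thresholds from strictest (nothing predicted as event) to most lenient
thresholds <- c(Inf, sort(unique(prob), decreasing = TRUE))

# Sensitivity: share of true events with a probability at or above the threshold
sensitivity <- sapply(thresholds, function(t) {
  sum(prob >= t & is_event == 1) / sum(is_event == 1)
})

# Specificity: share of non-events with a probability below the threshold
specificity <- sapply(thresholds, function(t) {
  sum(prob < t & is_event == 0) / sum(is_event == 0)
})

# Area under the curve via the trapezoid rule on (1 - specificity, sensitivity)
fpr <- 1 - specificity
auc <- sum(diff(fpr) * (head(sensitivity, -1) + tail(sensitivity, -1)) / 2)

data.frame(threshold = thresholds, sensitivity, specificity)
auc
```

<p>Here the area under the curve is 8/9: of the nine possible pairs of one Gram-negative and one Gram-positive isolate, eight are ranked in the correct order.</p>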
</div>
<div class="section level3">
<h3 id="conclusion">
<strong>Conclusion</strong><a class="anchor" aria-label="anchor" href="#conclusion"></a>
</h3>
<p>In this post, we demonstrated how to build a machine learning
pipeline with the <code>tidymodels</code> framework and the
<code>AMR</code> package. By combining selector functions like
<code><a href="../reference/antibiotic_class_selectors.html">aminoglycosides()</a></code> and <code><a href="../reference/antibiotic_class_selectors.html">betalactams()</a></code> with
<code>tidymodels</code>, we efficiently prepared data, trained a model,
and evaluated its performance.</p>
<p>This workflow is extensible to other antibiotic classes and
resistance patterns, empowering users to analyse AMR data systematically
and reproducibly.</p>
</div>
</main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
</nav></aside>
</div>
<footer><div class="pkgdown-footer-left">
<p><code>AMR</code> (for R). Free and open-source, licenced under the <a target="_blank" href="https://github.com/msberends/AMR/blob/main/LICENSE" class="external-link">GNU General Public License version 2.0 (GPL-2)</a>.<br>Developed at the <a target="_blank" href="https://www.rug.nl" class="external-link">University of Groningen</a> and <a target="_blank" href="https://www.umcg.nl" class="external-link">University Medical Center Groningen</a> in The Netherlands.</p>
</div>
<div class="pkgdown-footer-right">
<p><a target="_blank" href="https://www.rug.nl" class="external-link"><img src="https://github.com/msberends/AMR/raw/main/pkgdown/assets/logo_rug.svg" style="max-width: 150px;"></a><a target="_blank" href="https://www.umcg.nl" class="external-link"><img src="https://github.com/msberends/AMR/raw/main/pkgdown/assets/logo_umcg.svg" style="max-width: 150px;"></a></p>
</div>
</footer>
</div>
</body>
</html>