mirror of https://github.com/msberends/AMR.git synced 2025-07-25 04:15:41 +02:00

(v1.3.0.9001) website update

This commit is contained in:
2020-08-10 12:46:03 +02:00
parent 0d9602a6a9
commit 7d16bec21f
79 changed files with 2221 additions and 1974 deletions

File diff suppressed because it is too large

Binary file not shown. Before: 64 KiB. After: 64 KiB.

Binary file not shown. Before: 51 KiB. After: 51 KiB.

Binary file not shown. Before: 102 KiB. After: 102 KiB.

Binary file not shown. Before: 83 KiB. After: 83 KiB.

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to apply EUCAST rules</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/EUCAST.Rmd"><code>vignettes/EUCAST.Rmd</code></a></small>
<div class="hidden name"><code>EUCAST.Rmd</code></div>
@ -209,33 +209,39 @@
<a href="#examples" class="anchor"></a>Examples</h2>
<p>These rules can be used to discard impossible bug-drug combinations in your data. For example, <em>Klebsiella</em> produces beta-lactamase that prevents ampicillin (or amoxicillin) from working against it. In other words, practically every strain of <em>Klebsiella</em> is resistant to ampicillin.</p>
<p>Sometimes, laboratory data can still contain such strains reported as susceptible to ampicillin. This can happen when an antibiogram is available before the identification is, and the antibiogram is then not re-interpreted based on the identification (namely, <em>Klebsiella</em>). EUCAST expert rules solve this; they can be applied using <code><a href="../reference/eucast_rules.html">eucast_rules()</a></code>:</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="no">oops</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="kw">mo</span> <span class="kw">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="st">"Klebsiella"</span>,
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="kw">oops</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(mo = <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="st">"Klebsiella"</span>,
<span class="st">"Escherichia"</span>),
<span class="kw">ampicillin</span> <span class="kw">=</span> <span class="st">"S"</span>)
<span class="no">oops</span>
ampicillin = <span class="st">"S"</span>)
<span class="kw">oops</span>
<span class="co"># mo ampicillin</span>
<span class="co"># 1 Klebsiella S</span>
<span class="co"># 2 Escherichia S</span>
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="no">oops</span>, <span class="kw">info</span> <span class="kw">=</span> <span class="fl">FALSE</span>)
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="kw">oops</span>, info = <span class="fl">FALSE</span>)
<span class="co"># mo ampicillin</span>
<span class="co"># 1 Klebsiella R</span>
<span class="co"># 2 Escherichia S</span></pre></body></html></div>
<span class="co"># 2 Escherichia S</span>
</pre></div>
<p>EUCAST rules can be used not only for correction but also for filling in known resistance and susceptibility based on the results of other antimicrobial drugs. This process is called <em>interpretive reading</em> and is part of the <code><a href="../reference/eucast_rules.html">eucast_rules()</a></code> function as well:</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="no">data</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="kw">mo</span> <span class="kw">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="st">"Staphylococcus aureus"</span>,
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">data</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(mo = <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="st">"Staphylococcus aureus"</span>,
<span class="st">"Enterococcus faecalis"</span>,
<span class="st">"Escherichia coli"</span>,
<span class="st">"Klebsiella pneumoniae"</span>,
<span class="st">"Pseudomonas aeruginosa"</span>),
<span class="kw">VAN</span> <span class="kw">=</span> <span class="st">"-"</span>, <span class="co"># Vancomycin</span>
<span class="kw">AMX</span> <span class="kw">=</span> <span class="st">"-"</span>, <span class="co"># Amoxicillin</span>
<span class="kw">COL</span> <span class="kw">=</span> <span class="st">"-"</span>, <span class="co"># Colistin</span>
<span class="kw">CAZ</span> <span class="kw">=</span> <span class="st">"-"</span>, <span class="co"># Ceftazidime</span>
<span class="kw">CXM</span> <span class="kw">=</span> <span class="st">"-"</span>, <span class="co"># Cefuroxime</span>
<span class="kw">PEN</span> <span class="kw">=</span> <span class="st">"S"</span>, <span class="co"># Penicillin G</span>
<span class="kw">FOX</span> <span class="kw">=</span> <span class="st">"S"</span>, <span class="co"># Cefoxitin</span>
<span class="kw">stringsAsFactors</span> <span class="kw">=</span> <span class="fl">FALSE</span>)</pre></body></html></div>
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="no">data</span></pre></body></html></div>
VAN = <span class="st">"-"</span>, <span class="co"># Vancomycin</span>
AMX = <span class="st">"-"</span>, <span class="co"># Amoxicillin</span>
COL = <span class="st">"-"</span>, <span class="co"># Colistin</span>
CAZ = <span class="st">"-"</span>, <span class="co"># Ceftazidime</span>
CXM = <span class="st">"-"</span>, <span class="co"># Cefuroxime</span>
PEN = <span class="st">"S"</span>, <span class="co"># Penicillin G</span>
FOX = <span class="st">"S"</span>, <span class="co"># Cefoxitin</span>
stringsAsFactors = <span class="fl">FALSE</span>)
</pre></div>
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="kw">data</span>
</pre></div>
<table class="table">
<thead><tr class="header">
<th align="left">mo</th>
@ -300,7 +306,9 @@
</tr>
</tbody>
</table>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="no">data</span>)</pre></body></html></div>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="kw">data</span>)
</pre></div>
<pre><code># Warning: Not all columns with antimicrobial results are of class &lt;rsi&gt;.
# Transform eligible columns to class &lt;rsi&gt; on beforehand: your_data %&gt;% mutate_if(is.rsi.eligible, as.rsi)</code></pre>
<table class="table">
@ -385,7 +393,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to determine multi-drug resistance (MDR)</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/MDR.Rmd"><code>vignettes/MDR.Rmd</code></a></small>
<div class="hidden name"><code>MDR.Rmd</code></div>
@ -234,16 +234,20 @@
<a href="#examples" class="anchor"></a>Examples</h4>
<p>The <code><a href="../reference/mdro.html">mdro()</a></code> function always returns an ordered <code>factor</code>. For example, the default guideline by Magiorakos <em>et al.</em> returns a <code>factor</code> with the levels Negative, MDR, XDR and PDR, in that order.</p>
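<p>As a side note on what an ordered <code>factor</code> makes possible, here is a minimal base R sketch. It does not call <code>mdro()</code>: the values are typed by hand and the short level labels simply follow the wording above, so treat them as assumptions rather than actual package output.</p>
<div class="sourceCode"><pre class="downlit">
# hypothetical results, typed by hand for illustration (not output of mdro())
x &lt;- factor(c("Negative", "MDR", "Negative", "XDR"),
            levels = c("Negative", "MDR", "XDR", "PDR"),
            ordered = TRUE)

x &gt;= "MDR"  # ordered factors can be compared against a level
max(x)      # the highest level present in the data
table(x)    # counts per level, shown in level order
</pre></div>
<p>Because the levels carry an order, comparisons such as <code>x &gt;= "MDR"</code> are meaningful, which would not be the case for an unordered <code>factor</code>.</p>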
<p>The next example uses the <code>example_isolates</code> data set. This is a data set included with this package and contains 2,000 microbial isolates with their full antibiograms. It reflects reality and can be used to practice AMR analysis. If we test the MDR/XDR/PDR guideline on this data set, we get:</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">dplyr</span>) <span class="co"># to support pipes: %&gt;%</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">cleaner</span>) <span class="co"># to create frequency tables</span></pre></body></html></div>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="no">example_isolates</span> <span class="kw">%&gt;%</span>
<span class="fu"><a href="../reference/mdro.html">mdro</a></span>() <span class="kw">%&gt;%</span>
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>) <span class="co"># to support pipes: %&gt;%</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://github.com/msberends/cleaner">cleaner</a></span>) <span class="co"># to create frequency tables</span>
</pre></div>
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">example_isolates</span> <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/mdro.html">mdro</a></span>() <span class="op">%&gt;%</span>
<span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>() <span class="co"># show frequency table of the result</span>
<span class="co"># NOTE: Using column `mo` as input for `col_mo`.</span>
<span class="co"># NOTE: Auto-guessing columns suitable for analysis...OK.</span>
<span class="co"># NOTE: Reliability would be improved if these antimicrobial results would be available too: ceftaroline (CPT), fusidic acid (FUS), telavancin (TLV), daptomycin (DAP), quinupristin/dalfopristin (QDA), minocycline (MNO), gentamicin-high (GEH), streptomycin-high (STH), doripenem (DOR), levofloxacin (LVX), netilmicin (NET), ticarcillin/clavulanic acid (TCC), ertapenem (ETP), cefotetan (CTT), aztreonam (ATM), ampicillin/sulbactam (SAM), polymyxin B (PLB)</span>
<span class="co"># Warning in mdro(.): NA introduced for isolates where the available percentage of</span>
<span class="co"># antimicrobial classes was below 50% (set with `pct_required_classes`)</span></pre></body></html></div>
<span class="co"># antimicrobial classes was below 50% (set with `pct_required_classes`)</span>
</pre></div>
<p><strong>Frequency table</strong></p>
<p>Class: factor &gt; ordered (numeric)<br>
Length: 2,000<br>
@ -279,55 +283,67 @@ Unique: 2</p>
</tbody>
</table>
<p>For another example, I will create a data set to determine multi-drug resistant TB:</p>
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="co"># a helper function to get a random vector with values S, I and R</span>
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="co"># a helper function to get a random vector with values S, I and R</span>
<span class="co"># with the probabilities 50% - 10% - 40%</span>
<span class="no">sample_rsi</span> <span class="kw">&lt;-</span> <span class="kw">function</span>() {
<span class="kw">sample_rsi</span> <span class="op">&lt;-</span> <span class="fu">function</span>() {
<span class="fu"><a href="https://rdrr.io/r/base/sample.html">sample</a></span>(<span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="st">"S"</span>, <span class="st">"I"</span>, <span class="st">"R"</span>),
<span class="kw">size</span> <span class="kw">=</span> <span class="fl">5000</span>,
<span class="kw">prob</span> <span class="kw">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="fl">0.5</span>, <span class="fl">0.1</span>, <span class="fl">0.4</span>),
<span class="kw">replace</span> <span class="kw">=</span> <span class="fl">TRUE</span>)
size = <span class="fl">5000</span>,
prob = <span class="fu"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="fl">0.5</span>, <span class="fl">0.1</span>, <span class="fl">0.4</span>),
replace = <span class="fl">TRUE</span>)
}
<span class="no">my_TB_data</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="kw">rifampicin</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">isoniazid</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">gatifloxacin</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">ethambutol</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">pyrazinamide</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">moxifloxacin</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">kanamycin</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>())</pre></body></html></div>
<span class="kw">my_TB_data</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(rifampicin = <span class="fu">sample_rsi</span>(),
isoniazid = <span class="fu">sample_rsi</span>(),
gatifloxacin = <span class="fu">sample_rsi</span>(),
ethambutol = <span class="fu">sample_rsi</span>(),
pyrazinamide = <span class="fu">sample_rsi</span>(),
moxifloxacin = <span class="fu">sample_rsi</span>(),
kanamycin = <span class="fu">sample_rsi</span>())
</pre></div>
<p>Because all column names are automatically verified for valid drug names or codes, this would have worked exactly the same:</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="no">my_TB_data</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="kw">RIF</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">INH</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">GAT</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">ETH</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">PZA</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">MFX</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>(),
<span class="kw">KAN</span> <span class="kw">=</span> <span class="fu">sample_rsi</span>())</pre></body></html></div>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="kw">my_TB_data</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(RIF = <span class="fu">sample_rsi</span>(),
INH = <span class="fu">sample_rsi</span>(),
GAT = <span class="fu">sample_rsi</span>(),
ETH = <span class="fu">sample_rsi</span>(),
PZA = <span class="fu">sample_rsi</span>(),
MFX = <span class="fu">sample_rsi</span>(),
KAN = <span class="fu">sample_rsi</span>())
</pre></div>
<p>The data set now looks like this:</p>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span>(<span class="no">my_TB_data</span>)
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span>(<span class="kw">my_TB_data</span>)
<span class="co"># rifampicin isoniazid gatifloxacin ethambutol pyrazinamide moxifloxacin</span>
<span class="co"># 1 S S R R S I</span>
<span class="co"># 2 R S R R R S</span>
<span class="co"># 3 R S S S I S</span>
<span class="co"># 4 S S I S R S</span>
<span class="co"># 5 R I R S R S</span>
<span class="co"># 6 S S S S R R</span>
<span class="co"># 1 R R R R I I</span>
<span class="co"># 2 S S S R S R</span>
<span class="co"># 3 S I S S S S</span>
<span class="co"># 4 S I S R R R</span>
<span class="co"># 5 S S R S S R</span>
<span class="co"># 6 S R S R R R</span>
<span class="co"># kanamycin</span>
<span class="co"># 1 I</span>
<span class="co"># 2 I</span>
<span class="co"># 1 S</span>
<span class="co"># 2 R</span>
<span class="co"># 3 S</span>
<span class="co"># 4 R</span>
<span class="co"># 5 R</span>
<span class="co"># 6 S</span></pre></body></html></div>
<span class="co"># 4 S</span>
<span class="co"># 5 S</span>
<span class="co"># 6 R</span>
</pre></div>
<p>We can now add the interpretation of MDR-TB to our data set. You can use:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="fu"><a href="../reference/mdro.html">mdro</a></span>(<span class="no">my_TB_data</span>, <span class="kw">guideline</span> <span class="kw">=</span> <span class="st">"TB"</span>)</pre></body></html></div>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="fu"><a href="../reference/mdro.html">mdro</a></span>(<span class="kw">my_TB_data</span>, guideline = <span class="st">"TB"</span>)
</pre></div>
<p>or its shortcut <code><a href="../reference/mdro.html">mdr_tb()</a></code>:</p>
<div class="sourceCode" id="cb7"><html><body><pre class="r"><span class="no">my_TB_data</span>$<span class="no">mdr</span> <span class="kw">&lt;-</span> <span class="fu"><a href="../reference/mdro.html">mdr_tb</a></span>(<span class="no">my_TB_data</span>)
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="kw">my_TB_data</span><span class="op">$</span><span class="kw">mdr</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/mdro.html">mdr_tb</a></span>(<span class="kw">my_TB_data</span>)
<span class="co"># NOTE: No column found as input for `col_mo`, assuming all records contain Mycobacterium tuberculosis.</span>
<span class="co"># NOTE: Auto-guessing columns suitable for analysis...OK.</span>
<span class="co"># NOTE: Reliability would be improved if these antimicrobial results would be available too: capreomycin (CAP), rifabutin (RIB), rifapentine (RFP)</span></pre></body></html></div>
<span class="co"># NOTE: Reliability would be improved if these antimicrobial results would be available too: capreomycin (CAP), rifabutin (RIB), rifapentine (RFP)</span>
</pre></div>
<p>Create a frequency table of the results:</p>
<div class="sourceCode" id="cb8"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="no">my_TB_data</span>$<span class="no">mdr</span>)</pre></body></html></div>
<div class="sourceCode" id="cb8"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="kw">my_TB_data</span><span class="op">$</span><span class="kw">mdr</span>)
</pre></div>
<p><strong>Frequency table</strong></p>
<p>Class: factor &gt; ordered (numeric)<br>
Length: 5,000<br>
@ -347,40 +363,40 @@ Unique: 5</p>
<tr class="odd">
<td align="left">1</td>
<td align="left">Mono-resistant</td>
<td align="right">3215</td>
<td align="right">64.30%</td>
<td align="right">3215</td>
<td align="right">64.30%</td>
<td align="right">3229</td>
<td align="right">64.58%</td>
<td align="right">3229</td>
<td align="right">64.58%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">Multi-drug-resistant</td>
<td align="right">643</td>
<td align="right">12.86%</td>
<td align="right">3858</td>
<td align="right">77.16%</td>
<td align="left">Negative</td>
<td align="right">674</td>
<td align="right">13.48%</td>
<td align="right">3903</td>
<td align="right">78.06%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">Negative</td>
<td align="right">637</td>
<td align="right">12.74%</td>
<td align="right">4495</td>
<td align="right">89.90%</td>
<td align="left">Multi-drug-resistant</td>
<td align="right">616</td>
<td align="right">12.32%</td>
<td align="right">4519</td>
<td align="right">90.38%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">Poly-resistant</td>
<td align="right">292</td>
<td align="right">5.84%</td>
<td align="right">4787</td>
<td align="right">95.74%</td>
<td align="right">285</td>
<td align="right">5.70%</td>
<td align="right">4804</td>
<td align="right">96.08%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="left">Extensively drug-resistant</td>
<td align="right">213</td>
<td align="right">4.26%</td>
<td align="right">196</td>
<td align="right">3.92%</td>
<td align="right">5000</td>
<td align="right">100.00%</td>
</tr>
@ -402,7 +418,7 @@ Unique: 5</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to conduct principal component analysis (PCA) for AMR</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/PCA.Rmd"><code>vignettes/PCA.Rmd</code></a></small>
<div class="hidden name"><code>PCA.Rmd</code></div>
@ -204,9 +204,10 @@
<h1 class="hasAnchor">
<a href="#transforming" class="anchor"></a>Transforming</h1>
<p>For PCA, we need to transform our AMR data first. This is what the <code>example_isolates</code> data set in this package looks like:</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">AMR</span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">dplyr</span>)
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/reexports.html">glimpse</a></span>(<span class="no">example_isolates</span>)
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://msberends.github.io/AMR">AMR</a></span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>)
<span class="fu"><a href="https://tibble.tidyverse.org/reference/glimpse.html">glimpse</a></span>(<span class="kw">example_isolates</span>)
<span class="co"># Rows: 2,000</span>
<span class="co"># Columns: 49</span>
<span class="co"># $ date &lt;date&gt; 2002-01-02, 2002-01-03, 2002-01-07, 2002-01-07, 2002…</span>
@ -257,16 +258,18 @@
<span class="co"># $ CHL &lt;ord&gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…</span>
<span class="co"># $ COL &lt;ord&gt; NA, NA, R, R, R, R, R, R, R, R, R, R, NA, NA, NA, R, …</span>
<span class="co"># $ MUP &lt;ord&gt; NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…</span>
<span class="co"># $ RIF &lt;ord&gt; R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R…</span></pre></body></html></div>
<span class="co"># $ RIF &lt;ord&gt; R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R…</span>
</pre></div>
<p>Now to transform this to a data set with only resistance percentages per taxonomic order and genus:</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="no">resistance_data</span> <span class="kw">&lt;-</span> <span class="no">example_isolates</span> <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(<span class="kw">order</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="no">mo</span>), <span class="co"># group on anything, like order</span>
<span class="kw">genus</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="no">mo</span>)) <span class="kw">%&gt;%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span>(<span class="no">is.rsi</span>, <span class="no">resistance</span>) <span class="kw">%&gt;%</span> <span class="co"># then get resistance of all drugs</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="no">order</span>, <span class="no">genus</span>, <span class="no">AMC</span>, <span class="no">CXM</span>, <span class="no">CTX</span>,
<span class="no">CAZ</span>, <span class="no">GEN</span>, <span class="no">TOB</span>, <span class="no">TMP</span>, <span class="no">SXT</span>) <span class="co"># and select only relevant columns</span>
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">resistance_data</span> <span class="op">&lt;-</span> <span class="kw">example_isolates</span> <span class="op">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(order = <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="kw">mo</span>), <span class="co"># group on anything, like order</span>
genus = <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="kw">mo</span>)) <span class="op">%&gt;%</span> <span class="co"># and genus as we do here</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/summarise_all.html">summarise_if</a></span>(<span class="kw">is.rsi</span>, <span class="kw">resistance</span>) <span class="op">%&gt;%</span> <span class="co"># then get resistance of all drugs</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="kw">order</span>, <span class="kw">genus</span>, <span class="kw">AMC</span>, <span class="kw">CXM</span>, <span class="kw">CTX</span>,
<span class="kw">CAZ</span>, <span class="kw">GEN</span>, <span class="kw">TOB</span>, <span class="kw">TMP</span>, <span class="kw">SXT</span>) <span class="co"># and select only relevant columns</span>
<span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span>(<span class="no">resistance_data</span>)
<span class="fu"><a href="https://rdrr.io/r/utils/head.html">head</a></span>(<span class="kw">resistance_data</span>)
<span class="co"># # A tibble: 6 x 10</span>
<span class="co"># # Groups: order [2]</span>
<span class="co"># order genus AMC CXM CTX CAZ GEN TOB TMP SXT</span>
@ -276,35 +279,46 @@
<span class="co"># 3 Actinomycetales Cutibacterium NA NA NA NA NA NA NA NA</span>
<span class="co"># 4 Actinomycetales Dermabacter NA NA NA NA NA NA NA NA</span>
<span class="co"># 5 Actinomycetales Micrococcus NA NA NA NA NA NA NA NA</span>
<span class="co"># 6 Actinomycetales Rothia NA NA NA NA NA NA NA NA</span></pre></body></html></div>
<span class="co"># 6 Actinomycetales Rothia NA NA NA NA NA NA NA NA</span>
</pre></div>
</div>
<div id="perform-principal-component-analysis" class="section level1">
<h1 class="hasAnchor">
<a href="#perform-principal-component-analysis" class="anchor"></a>Perform principal component analysis</h1>
<p>The new <code><a href="../reference/pca.html">pca()</a></code> function will automatically filter on rows that contain numeric values in all selected variables, so we now only need to do:</p>
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="no">pca_result</span> <span class="kw">&lt;-</span> <span class="fu"><a href="../reference/pca.html">pca</a></span>(<span class="no">resistance_data</span>)
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="kw">pca_result</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/pca.html">pca</a></span>(<span class="kw">resistance_data</span>)
<span class="co"># NOTE: Columns selected for PCA: AMC CXM CTX CAZ GEN TOB TMP SXT.</span>
<span class="co"># Total observations available: 7.</span></pre></body></html></div>
<span class="co"># Total observations available: 7.</span>
</pre></div>
<p>The result can be reviewed with the good old <code><a href="https://rdrr.io/r/base/summary.html">summary()</a></code> function:</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="no">pca_result</span>)
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="kw">pca_result</span>)
<span class="co"># Importance of components:</span>
<span class="co"># PC1 PC2 PC3 PC4 PC5 PC6 PC7</span>
<span class="co"># Standard deviation 2.154 1.6809 0.61305 0.33882 0.20755 0.03137 1.602e-16</span>
<span class="co"># Proportion of Variance 0.580 0.3532 0.04698 0.01435 0.00538 0.00012 0.000e+00</span>
<span class="co"># Cumulative Proportion 0.580 0.9332 0.98014 0.99449 0.99988 1.00000 1.000e+00</span></pre></body></html></div>
<span class="co"># Cumulative Proportion 0.580 0.9332 0.98014 0.99449 0.99988 1.00000 1.000e+00</span>
</pre></div>
<p>Good news. The first two components explain a total of 93.3% of the variance (see the PC1 and PC2 values of the <em>Proportion of Variance</em>). We can create a so-called biplot with the base R <code><a href="https://rdrr.io/r/stats/biplot.html">biplot()</a></code> function, to see which drug resistances explain the differences between microorganisms.</p>
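<p>To make the link between the rows of the summary explicit, the sketch below recomputes the proportions of variance from the standard deviations. It uses base R only; the numbers are copied by hand from the output above, and the variable names <code>sdev</code> and <code>prop_var</code> are chosen here for illustration.</p>
<div class="sourceCode"><pre class="downlit">
# standard deviations copied by hand from the summary() output above
sdev &lt;- c(2.154, 1.6809, 0.61305, 0.33882, 0.20755, 0.03137, 1.602e-16)

# each component's variance is its squared standard deviation;
# dividing by the total gives the proportion of variance
prop_var &lt;- sdev^2 / sum(sdev^2)

round(cumsum(prop_var), 3)  # cumulative proportion per component
sum(prop_var[1:2])          # share explained by PC1 and PC2 together
</pre></div>
<p>Up to rounding of the copied values, the last line should reproduce the share of roughly 93% mentioned for the first two components.</p>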
</div>
<div id="plotting-the-results" class="section level1">
<h1 class="hasAnchor">
<a href="#plotting-the-results" class="anchor"></a>Plotting the results</h1>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/stats/biplot.html">biplot</a></span>(<span class="no">pca_result</span>)</pre></body></html></div>
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/stats/biplot.html">biplot</a></span>(<span class="kw">pca_result</span>)
</pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-5-1.png" width="750"></p>
<p>But we can't see the explanation of the points. Perhaps this works better with our new <code><a href="../reference/ggplot_pca.html">ggplot_pca()</a></code> function, which automatically adds the right labels and even groups:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="no">pca_result</span>)</pre></body></html></div>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="kw">pca_result</span>)
</pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-6-1.png" width="750"></p>
<p>You can also print an ellipse per group, and edit the appearance:</p>
<div class="sourceCode" id="cb7"><html><body><pre class="r"><span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="no">pca_result</span>, <span class="kw">ellipse</span> <span class="kw">=</span> <span class="fl">TRUE</span>) +
<span class="kw pkg">ggplot2</span><span class="kw ns">::</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html">labs</a></span>(<span class="kw">title</span> <span class="kw">=</span> <span class="st">"An AMR/PCA biplot!"</span>)</pre></body></html></div>
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span>(<span class="kw">pca_result</span>, ellipse = <span class="fl">TRUE</span>) <span class="op">+</span>
<span class="kw">ggplot2</span>::<span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html">labs</a></span>(title = <span class="st">"An AMR/PCA biplot!"</span>)
</pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-7-1.png" width="750"></p>
</div>
</div>
@ -324,7 +338,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to import data from SPSS / SAS / Stata</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/SPSS.Rmd"><code>vignettes/SPSS.Rmd</code></a></small>
<div class="hidden name"><code>SPSS.Rmd</code></div>
@ -240,7 +240,8 @@
</li>
</ul>
<p>To demonstrate the first point:</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="co"># not all values are valid MIC values:</span>
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="co"># not all values are valid MIC values:</span>
<span class="fu"><a href="../reference/as.mic.html">as.mic</a></span>(<span class="fl">0.125</span>)
<span class="co"># Class &lt;mic&gt;</span>
<span class="co"># [1] 0.125</span>
@ -253,13 +254,13 @@
<span class="co"># [1] "Gram-negative"</span>
<span class="co"># Klebsiella is intrinsically resistant to amoxicillin, according to EUCAST:</span>
<span class="no">klebsiella_test</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="kw">mo</span> <span class="kw">=</span> <span class="st">"klebsiella"</span>,
<span class="kw">amox</span> <span class="kw">=</span> <span class="st">"S"</span>,
<span class="kw">stringsAsFactors</span> <span class="kw">=</span> <span class="fl">FALSE</span>)
<span class="no">klebsiella_test</span> <span class="co"># (our original data)</span>
<span class="kw">klebsiella_test</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(mo = <span class="st">"klebsiella"</span>,
amox = <span class="st">"S"</span>,
stringsAsFactors = <span class="fl">FALSE</span>)
<span class="kw">klebsiella_test</span> <span class="co"># (our original data)</span>
<span class="co"># mo amox</span>
<span class="co"># 1 klebsiella S</span>
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="no">klebsiella_test</span>, <span class="kw">info</span> <span class="kw">=</span> <span class="fl">FALSE</span>) <span class="co"># (the edited data by EUCAST rules)</span>
<span class="fu"><a href="../reference/eucast_rules.html">eucast_rules</a></span>(<span class="kw">klebsiella_test</span>, info = <span class="fl">FALSE</span>) <span class="co"># (the edited data by EUCAST rules)</span>
<span class="co"># mo amox</span>
<span class="co"># 1 klebsiella R</span>
@ -271,7 +272,8 @@
<span class="co"># [4] "fluclox" "flucloxacilina" "flucloxacillin" </span>
<span class="co"># [7] "flucloxacilline" "flucloxacillinum" "fluorochloroxacillin"</span>
<span class="fu"><a href="../reference/ab_property.html">ab_atc</a></span>(<span class="st">"floxapen"</span>)
<span class="co"># [1] "J01CF05"</span></pre></body></html></div>
<span class="co"># [1] "J01CF05"</span>
</pre></div>
</div>
<div id="import-data-from-spsssasstata" class="section level2">
<h2 class="hasAnchor">
@ -287,7 +289,8 @@
<p><img src="https://github.com/msberends/AMR/raw/master/docs/import2.png"></p>
<p>If you want named variables to be imported as factors so it resembles SPSS more, use <code><a href="https://haven.tidyverse.org/reference/as_factor.html">as_factor()</a></code>.</p>
<p>The difference is this:</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="no">SPSS_data</span>
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">SPSS_data</span>
<span class="co"># # A tibble: 4,203 x 4</span>
<span class="co"># v001 sex status statusage</span>
<span class="co"># &lt;dbl&gt; &lt;dbl+lbl&gt; &lt;dbl+lbl&gt; &lt;dbl&gt;</span>
@ -303,7 +306,7 @@
<span class="co"># 10 10018 0 1 66.6</span>
<span class="co"># # … with 4,193 more rows</span>
<span class="fu">as_factor</span>(<span class="no">SPSS_data</span>)
<span class="fu">as_factor</span>(<span class="kw">SPSS_data</span>)
<span class="co"># # A tibble: 4,203 x 4</span>
<span class="co"># v001 sex status statusage</span>
<span class="co"># &lt;dbl&gt; &lt;fct&gt; &lt;fct&gt; &lt;dbl&gt;</span>
@ -317,67 +320,82 @@
<span class="co"># 8 10011 Male alive 73.1</span>
<span class="co"># 9 10017 Male alive 56.7</span>
<span class="co"># 10 10018 Female alive 66.6</span>
<span class="co"># # … with 4,193 more rows</span></pre></body></html></div>
<span class="co"># # … with 4,193 more rows</span>
</pre></div>
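<p>To see the same difference on a small scale, the sketch below builds a labelled vector by hand and converts it. It assumes the <code>haven</code> package is installed. The codes and labels (<code>0 = Female</code>, <code>1 = Male</code>) are made up for illustration and only mirror the <code>sex</code> column shown above.</p>

```r
library(haven)

# a small labelled vector, similar to how SPSS value labels are imported
# (the codes and labels here are illustrative assumptions)
sex <- labelled(c(0, 1, 1), labels = c(Female = 0, Male = 1))

# without the labels, only the numeric codes remain
zap_labels(sex)

# as_factor() replaces the codes with their value labels
sex_factor <- as_factor(sex)
levels(sex_factor)
```

<p>The factor version is usually easier to read in tables and plots, while the labelled version stays closer to how the file was stored.</p>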
</div>
<div id="base-r" class="section level3">
<h3 class="hasAnchor">
<a href="#base-r" class="anchor"></a>Base R</h3>
<p>To import data from SPSS, SAS or Stata, you can use the <a href="https://haven.tidyverse.org/">great <code>haven</code> package</a> yourself:</p>
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="co"># download and install the latest version:</span>
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="co"># download and install the latest version:</span>
<span class="fu"><a href="https://rdrr.io/r/utils/install.packages.html">install.packages</a></span>(<span class="st">"haven"</span>)
<span class="co"># load the package you just installed:</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">haven</span>)</pre></body></html></div>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="http://haven.tidyverse.org">haven</a></span>)
</pre></div>
<p>You can now import files as follows:</p>
<div id="spss" class="section level4">
<h4 class="hasAnchor">
<a href="#spss" class="anchor"></a>SPSS</h4>
<p>To read files from SPSS into R:</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="co"># read any SPSS file based on file extension (best way):</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_spss</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="co"># read any SPSS file based on file extension (best way):</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_spss</a></span>(file = <span class="st">"path/to/file"</span>)
<span class="co"># read .sav or .zsav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_sav</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_sav</a></span>(file = <span class="st">"path/to/file"</span>)
<span class="co"># read .por file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_por</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">read_por</a></span>(file = <span class="st">"path/to/file"</span>)
</pre></div>
<p>Do not forget about <code><a href="https://haven.tidyverse.org/reference/as_factor.html">as_factor()</a></code>, as mentioned above.</p>
<p>To export your R objects to the SPSS file format:</p>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="co"># save as .sav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">write_sav</a></span>(<span class="kw">data</span> <span class="kw">=</span> <span class="no">yourdata</span>, <span class="kw">path</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="co"># save as .sav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">write_sav</a></span>(data = <span class="kw">yourdata</span>, path = <span class="st">"path/to/file"</span>)
<span class="co"># save as compressed .zsav file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">write_sav</a></span>(<span class="kw">data</span> <span class="kw">=</span> <span class="no">yourdata</span>, <span class="kw">path</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>, <span class="kw">compress</span> <span class="kw">=</span> <span class="fl">TRUE</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_spss.html">write_sav</a></span>(data = <span class="kw">yourdata</span>, path = <span class="st">"path/to/file"</span>, compress = <span class="fl">TRUE</span>)
</pre></div>
</div>
<div id="sas" class="section level4">
<h4 class="hasAnchor">
<a href="#sas" class="anchor"></a>SAS</h4>
<p>To read files from SAS into R:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="co"># read .sas7bdat + .sas7bcat files:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html">read_sas</a></span>(<span class="kw">data_file</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>, <span class="kw">catalog_file</span> <span class="kw">=</span> <span class="kw">NULL</span>)
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="co"># read .sas7bdat + .sas7bcat files:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html">read_sas</a></span>(data_file = <span class="st">"path/to/file"</span>, catalog_file = <span class="kw">NULL</span>)
<span class="co"># read SAS transport files (version 5 and version 8):</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html">read_xpt</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html">read_xpt</a></span>(file = <span class="st">"path/to/file"</span>)
</pre></div>
<p>To export your R objects to the SAS file format:</p>
<div class="sourceCode" id="cb7"><html><body><pre class="r"><span class="co"># save as regular SAS file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html">write_sas</a></span>(<span class="kw">data</span> <span class="kw">=</span> <span class="no">yourdata</span>, <span class="kw">path</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>)
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="co"># save as regular SAS file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_sas.html">write_sas</a></span>(data = <span class="kw">yourdata</span>, path = <span class="st">"path/to/file"</span>)
<span class="co"># the SAS transport format is an open format </span>
<span class="co"># (required for submission of the data to the FDA)</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html">write_xpt</a></span>(<span class="kw">data</span> <span class="kw">=</span> <span class="no">yourdata</span>, <span class="kw">path</span> <span class="kw">=</span> <span class="st">"path/to/file"</span>, <span class="kw">version</span> <span class="kw">=</span> <span class="fl">8</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_xpt.html">write_xpt</a></span>(data = <span class="kw">yourdata</span>, path = <span class="st">"path/to/file"</span>, version = <span class="fl">8</span>)
</pre></div>
</div>
<div id="stata" class="section level4">
<h4 class="hasAnchor">
<a href="#stata" class="anchor"></a>Stata</h4>
<p>To read files from Stata into R:</p>
<div class="sourceCode" id="cb8"><html><body><pre class="r"><span class="co"># read .dta file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">read_stata</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"/path/to/file"</span>)
<div class="sourceCode" id="cb8"><pre class="downlit">
<span class="co"># read .dta file:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">read_stata</a></span>(file = <span class="st">"/path/to/file"</span>)
<span class="co"># works exactly the same:</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">read_dta</a></span>(<span class="kw">file</span> <span class="kw">=</span> <span class="st">"/path/to/file"</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">read_dta</a></span>(file = <span class="st">"/path/to/file"</span>)
</pre></div>
<p>To export your R objects to the Stata file format:</p>
<div class="sourceCode" id="cb9"><html><body><pre class="r"><span class="co"># save as .dta file, Stata version 14:</span>
<div class="sourceCode" id="cb9"><pre class="downlit">
<span class="co"># save as .dta file, Stata version 14:</span>
<span class="co"># (supports Stata v8 until v15 at the time of writing)</span>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">write_dta</a></span>(<span class="kw">data</span> <span class="kw">=</span> <span class="no">yourdata</span>, <span class="kw">path</span> <span class="kw">=</span> <span class="st">"/path/to/file"</span>, <span class="kw">version</span> <span class="kw">=</span> <span class="fl">14</span>)</pre></body></html></div>
<span class="fu"><a href="https://haven.tidyverse.org/reference/read_dta.html">write_dta</a></span>(data = <span class="kw">yourdata</span>, path = <span class="st">"/path/to/file"</span>, version = <span class="fl">14</span>)
</pre></div>
</div>
</div>
</div>
@ -398,7 +416,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to work with WHONET data</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/WHONET.Rmd"><code>vignettes/WHONET.Rmd</code></a></small>
<div class="hidden name"><code>WHONET.Rmd</code></div>
@ -200,34 +200,42 @@
<a href="#import-of-data" class="anchor"></a>Import of data</h3>
<p>This tutorial assumes you have already imported the WHONET data with e.g. the <a href="https://readxl.tidyverse.org/"><code>readxl</code> package</a>. In RStudio, this can be done using the menu button 'Import Dataset' in the tab 'Environment'. Choose the option 'From Excel' and select your exported file. Make sure date fields are imported correctly.</p>
<p>An example syntax could look like this:</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">readxl</span>)
<span class="no">data</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://readxl.tidyverse.org/reference/read_excel.html">read_excel</a></span>(<span class="kw">path</span> <span class="kw">=</span> <span class="st">"path/to/your/file.xlsx"</span>)</pre></body></html></div>
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://readxl.tidyverse.org">readxl</a></span>)
<span class="kw">data</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://readxl.tidyverse.org/reference/read_excel.html">read_excel</a></span>(path = <span class="st">"path/to/your/file.xlsx"</span>)
</pre></div>
<p>This package comes with an <a href="https://msberends.github.io/AMR/reference/WHONET.html">example data set <code>WHONET</code></a>. We will use it for this analysis.</p>
</div>
<div id="preparation" class="section level3">
<h3 class="hasAnchor">
<a href="#preparation" class="anchor"></a>Preparation</h3>
<p>First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don't know it yet, I suggest you read about it on their website: <a href="https://www.tidyverse.org/" class="uri">https://www.tidyverse.org/</a>.</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">dplyr</span>) <span class="co"># part of tidyverse</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">ggplot2</span>) <span class="co"># part of tidyverse</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">AMR</span>) <span class="co"># this package</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">cleaner</span>) <span class="co"># to create frequency tables</span></pre></body></html></div>
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>) <span class="co"># part of tidyverse</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="http://ggplot2.tidyverse.org">ggplot2</a></span>) <span class="co"># part of tidyverse</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://msberends.github.io/AMR">AMR</a></span>) <span class="co"># this package</span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://github.com/msberends/cleaner">cleaner</a></span>) <span class="co"># to create frequency tables</span>
</pre></div>
<p>We will have to transform some variables to simplify and automate the analysis:</p>
<ul>
<li>Microorganisms should be transformed to our own microorganism IDs (called an <code>mo</code>) using <a href="https://msberends.github.io/AMR/reference/catalogue_of_life">our Catalogue of Life reference data set</a>, which contains all ~70,000 microorganisms from the taxonomic kingdoms Bacteria, Fungi and Protozoa. We do the transformation with <code><a href="../reference/as.mo.html">as.mo()</a></code>. This function also recognises almost all WHONET abbreviations of microorganisms.</li>
<li>Antimicrobial results or interpretations have to be clean and valid. In other words, they should only contain values <code>"S"</code>, <code>"I"</code> or <code>"R"</code>. That is exactly what the <code><a href="../reference/as.rsi.html">as.rsi()</a></code> function is for.</li>
</ul>
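<p>The idea behind the second point can be sketched in base R. The helper <code>clean_sir_sketch()</code> below is hypothetical and is not how <code>as.rsi()</code> is implemented; the real function handles many more input variants. It only shows what "clean and valid" means here: anything other than <code>"S"</code>, <code>"I"</code> or <code>"R"</code> becomes <code>NA</code>.</p>

```r
# hypothetical helper, not part of the AMR package: keep only valid
# interpretations and turn everything else into NA
clean_sir_sketch <- function(x) {
  # normalise case and surrounding whitespace first
  x <- toupper(trimws(as.character(x)))
  # anything that is not S, I or R is treated as missing
  x[!x %in% c("S", "I", "R")] <- NA_character_
  factor(x, levels = c("S", "I", "R"), ordered = TRUE)
}

raw <- c("S", " r", "i", "unknown", NA)
cleaned <- clean_sir_sketch(raw)

# count the valid interpretations and the values that were set to NA
table(cleaned, useNA = "ifany")
```

<p>Counting the <code>NA</code> values afterwards is a quick way to spot columns that need a closer look before analysis.</p>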
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="co"># transform variables</span>
<span class="no">data</span> <span class="kw">&lt;-</span> <span class="no">WHONET</span> <span class="kw">%&gt;%</span>
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="co"># transform variables</span>
<span class="kw">data</span> <span class="op">&lt;-</span> <span class="kw">WHONET</span> <span class="op">%&gt;%</span>
<span class="co"># get microbial ID based on given organism</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="kw">mo</span> <span class="kw">=</span> <span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="no">Organism</span>)) <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(mo = <span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="kw">Organism</span>)) <span class="op">%&gt;%</span>
<span class="co"># transform everything from "AMP_ND10" to "CIP_EE" to the new `rsi` class</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate_all.html">mutate_at</a></span>(<span class="fu"><a href="https://dplyr.tidyverse.org/reference/vars.html">vars</a></span>(<span class="no">AMP_ND10</span>:<span class="no">CIP_EE</span>), <span class="no">as.rsi</span>)</pre></body></html></div>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/mutate_all.html">mutate_at</a></span>(<span class="fu"><a href="https://dplyr.tidyverse.org/reference/vars.html">vars</a></span>(<span class="kw">AMP_ND10</span><span class="op">:</span><span class="kw">CIP_EE</span>), <span class="kw">as.rsi</span>)
</pre></div>
<p>No errors or warnings, so all values are transformed successfully.</p>
<p>We also created a package dedicated to data cleaning and checking, called the <code>cleaner</code> package. Its <code><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq()</a></code> function can be used to create frequency tables.</p>
<p>So let's check our data, with a couple of frequency tables:</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="co"># our newly created `mo` variable, put in the mo_name() function</span>
<span class="no">data</span> <span class="kw">%&gt;%</span> <span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="no">mo</span>), <span class="kw">nmax</span> <span class="kw">=</span> <span class="fl">10</span>)</pre></body></html></div>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="co"># our newly created `mo` variable, put in the mo_name() function</span>
<span class="kw">data</span> <span class="op">%&gt;%</span> <span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="kw">mo</span>), nmax = <span class="fl">10</span>)
</pre></div>
<p><strong>Frequency table</strong></p>
<p>Class: character<br>
Length: 500<br>
@ -328,9 +336,11 @@ Longest: 40</p>
</tbody>
</table>
<p>(omitted 27 entries, n = 56 [11.20%])</p>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="co"># our transformed antibiotic columns</span>
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="co"># our transformed antibiotic columns</span>
<span class="co"># amoxicillin/clavulanic acid (J01CR02) as an example</span>
<span class="no">data</span> <span class="kw">%&gt;%</span> <span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="no">AMC_ND2</span>)</pre></body></html></div>
<span class="kw">data</span> <span class="op">%&gt;%</span> <span class="fu"><a href="https://rdrr.io/pkg/cleaner/man/freq.html">freq</a></span>(<span class="kw">AMC_ND2</span>)
</pre></div>
<p><strong>Frequency table</strong></p>
<p>Class: factor &gt; ordered &gt; rsi (numeric)<br>
Length: 500<br>
@ -378,10 +388,12 @@ Unique: 3</p>
<h3 class="hasAnchor">
<a href="#a-first-glimpse-at-results" class="anchor"></a>A first glimpse at results</h3>
<p>An easy <code>ggplot</code> will already give a lot of information, using the included <code><a href="../reference/ggplot_rsi.html">ggplot_rsi()</a></code> function:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="no">data</span> <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(<span class="no">Country</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="no">Country</span>, <span class="no">AMP_ND2</span>, <span class="no">AMC_ED20</span>, <span class="no">CAZ_ED10</span>, <span class="no">CIP_ED5</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="../reference/ggplot_rsi.html">ggplot_rsi</a></span>(<span class="kw">translate_ab</span> <span class="kw">=</span> <span class="st">'ab'</span>, <span class="kw">facet</span> <span class="kw">=</span> <span class="st">"Country"</span>, <span class="kw">datalabels</span> <span class="kw">=</span> <span class="fl">FALSE</span>)</pre></body></html></div>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="kw">data</span> <span class="op">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(<span class="kw">Country</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="kw">Country</span>, <span class="kw">AMP_ND2</span>, <span class="kw">AMC_ED20</span>, <span class="kw">CAZ_ED10</span>, <span class="kw">CIP_ED5</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/ggplot_rsi.html">ggplot_rsi</a></span>(translate_ab = <span class="st">'ab'</span>, facet = <span class="st">"Country"</span>, datalabels = <span class="fl">FALSE</span>)
</pre></div>
<p><img src="WHONET_files/figure-html/unnamed-chunk-7-1.png" width="720"></p>
</div>
</div>
@ -399,7 +411,7 @@ Unique: 3</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>Benchmarks</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/benchmarks.Rmd"><code>vignettes/benchmarks.Rmd</code></a></small>
<div class="hidden name"><code>benchmarks.Rmd</code></div>
@ -197,13 +197,16 @@
<p>One of the most important features of this package is the complete microbial taxonomic database, supplied by the <a href="http://catalogueoflife.org">Catalogue of Life</a>. We created a function <code><a href="../reference/as.mo.html">as.mo()</a></code> that transforms any user input value to a valid microbial ID by using intelligent rules combined with the taxonomic tree of Catalogue of Life.</p>
<p>Using the <code>microbenchmark</code> package, we can review the calculation performance of this function. Its function <code>microbenchmark()</code> runs different input expressions independently of each other and measures their time-to-result.</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="no">microbenchmark</span> <span class="kw">&lt;-</span> <span class="kw pkg">microbenchmark</span><span class="kw ns">::</span><span class="no"><a href="https://rdrr.io/pkg/microbenchmark/man/microbenchmark.html">microbenchmark</a></span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">AMR</span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">dplyr</span>)</pre></body></html></div>
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="kw">microbenchmark</span> <span class="op">&lt;-</span> <span class="kw">microbenchmark</span>::<span class="kw"><a href="https://rdrr.io/pkg/microbenchmark/man/microbenchmark.html">microbenchmark</a></span>
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://msberends.github.io/AMR">AMR</a></span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>)
</pre></div>
<p>In the next test, we try to coerce different input values into the microbial code of <em>Staphylococcus aureus</em>. Coercion is a computational process of forcing output based on an input. For microorganism names, coercing user input to taxonomically valid microorganism names is crucial to ensure correct interpretation and to enable grouping based on taxonomic properties.</p>
<p>The actual result is the same every time: it returns its microorganism code <code>B_STPHY_AURS</code> (<em>B</em> stands for <em>Bacteria</em>, the taxonomic kingdom).</p>
<p>But the calculation time differs a lot:</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"><span class="no">S.aureus</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(
<div class="sourceCode" id="cb2"><pre class="downlit">
<span class="kw">S.aureus</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"sau"</span>), <span class="co"># WHONET code</span>
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"stau"</span>),
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"STAU"</span>),
@ -218,47 +221,50 @@
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"VISA"</span>), <span class="co"># Vancomycin Intermediate S. aureus</span>
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"VRSA"</span>), <span class="co"># Vancomycin Resistant S. aureus</span>
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="fl">22242419</span>), <span class="co"># Catalogue of Life ID</span>
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">S.aureus</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">2</span>)
times = <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">S.aureus</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">2</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># as.mo("sau") 11 12 17 13 15 51 10</span>
<span class="co"># as.mo("stau") 150 160 170 170 190 200 10</span>
<span class="co"># as.mo("STAU") 160 160 180 190 190 210 10</span>
<span class="co"># as.mo("staaur") 12 13 23 15 20 68 10</span>
<span class="co"># as.mo("STAAUR") 11 12 20 16 18 44 10</span>
<span class="co"># as.mo("S. aureus") 11 13 29 17 44 84 10</span>
<span class="co"># as.mo("S aureus") 11 15 21 16 18 46 10</span>
<span class="co"># as.mo("Staphylococcus aureus") 11 13 16 13 15 41 10</span>
<span class="co"># as.mo("Staphylococcus aureus (MRSA)") 870 890 920 900 950 1100 10</span>
<span class="co"># as.mo("Sthafilokkockus aaureuz") 400 410 430 440 450 490 10</span>
<span class="co"># as.mo("MRSA") 13 13 17 14 16 40 10</span>
<span class="co"># as.mo("VISA") 14 17 25 19 36 46 10</span>
<span class="co"># as.mo("VRSA") 13 15 21 17 21 50 10</span>
<span class="co"># as.mo(22242419) 130 140 150 150 150 180 10</span></pre></body></html></div>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># as.mo("sau") 11.0 14 21 15 16 51 10</span>
<span class="co"># as.mo("stau") 170.0 170 190 190 210 240 10</span>
<span class="co"># as.mo("STAU") 160.0 170 180 180 200 210 10</span>
<span class="co"># as.mo("staaur") 11.0 13 19 14 18 48 10</span>
<span class="co"># as.mo("STAAUR") 11.0 13 22 17 37 40 10</span>
<span class="co"># as.mo("S. aureus") 15.0 15 24 17 26 56 10</span>
<span class="co"># as.mo("S aureus") 12.0 13 21 16 23 49 10</span>
<span class="co"># as.mo("Staphylococcus aureus") 9.8 13 21 14 15 65 10</span>
<span class="co"># as.mo("Staphylococcus aureus (MRSA)") 960.0 960 1100 980 1100 1400 10</span>
<span class="co"># as.mo("Sthafilokkockus aaureuz") 440.0 450 480 470 480 570 10</span>
<span class="co"># as.mo("MRSA") 12.0 14 22 15 17 86 10</span>
<span class="co"># as.mo("VISA") 15.0 18 25 19 40 42 10</span>
<span class="co"># as.mo("VRSA") 14.0 15 30 22 44 69 10</span>
<span class="co"># as.mo(22242419) 130.0 150 160 170 180 190 10</span>
</pre></div>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-4-1.png" width="562.5"></p>
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 5 milliseconds means it can determine 200 input values per second. In the case of 100 milliseconds, this is only 10 input values per second.</p>
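<p>As a rough sketch of this arithmetic, a run time in milliseconds can be converted to a throughput in input values per second. The helper name <code>per_second</code> is made up for this illustration and is not part of the <code>AMR</code> package:</p>
<div class="sourceCode"><pre class="r">
# hypothetical helper: one second is 1000 milliseconds, so the number of
# values handled per second is 1000 divided by the time per value
per_second &lt;- function(ms) 1000 / ms

per_second(c(5, 100))
# [1] 200  10
</pre></div>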
<p>To achieve this speed, the <code>as.mo</code> function also takes into account the prevalence of human pathogenic microorganisms. The downside is that less prevalent microorganisms are determined more slowly. See this example for the ID of <em>Methanosarcina semesiae</em> (<code>B_MTHNSR_SEMS</code>), a bug probably never found before in humans:</p>
<div class="sourceCode" id="cb3"><html><body><pre class="r"><span class="no">M.semesiae</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"metsem"</span>),
<div class="sourceCode" id="cb3"><pre class="downlit">
<span class="kw">M.semesiae</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"metsem"</span>),
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"METSEM"</span>),
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"M. semesiae"</span>),
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"M. semesiae"</span>),
<span class="fu"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"Methanosarcina semesiae"</span>),
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">M.semesiae</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">4</span>)
times = <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">M.semesiae</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">4</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max</span>
<span class="co"># as.mo("metsem") 176.800 179.200 189.20 185.90 194.00 212.60</span>
<span class="co"># as.mo("METSEM") 164.400 170.800 193.00 188.20 211.10 243.00</span>
<span class="co"># as.mo("M. semesiae") 10.950 11.310 19.92 15.41 18.79 50.84</span>
<span class="co"># as.mo("M. semesiae") 11.560 11.860 17.66 14.15 16.96 50.76</span>
<span class="co"># as.mo("Methanosarcina semesiae") 9.408 9.669 18.03 14.12 15.24 42.57</span>
<span class="co"># expr min lq mean median uq max</span>
<span class="co"># as.mo("metsem") 186.900 192.90 204.70 199.10 207.70 251.20</span>
<span class="co"># as.mo("METSEM") 175.500 199.70 215.20 218.20 232.00 240.40</span>
<span class="co"># as.mo("M. semesiae") 11.500 13.29 16.47 13.85 16.84 36.90</span>
<span class="co"># as.mo("M. semesiae") 11.690 11.94 16.81 14.40 15.75 42.76</span>
<span class="co"># as.mo("Methanosarcina semesiae") 9.688 10.28 14.55 11.99 13.72 39.41</span>
<span class="co"># neval</span>
<span class="co"># 10</span>
<span class="co"># 10</span>
<span class="co"># 10</span>
<span class="co"># 10</span>
<span class="co"># 10</span></pre></body></html></div>
<span class="co"># 10</span>
</pre></div>
<p>Looking up arbitrary codes of less prevalent microorganisms costs the most time. Full names (like <em>Methanosarcina semesiae</em>) are always very fast and only take a few thousandths of a second to coerce - they are the most probable input from most data sets.</p>
<p>In the figure below, we compare <em>Escherichia coli</em> (which is very common) with <em>Prevotella brevis</em> (which is moderately common) and with <em>Methanosarcina semesiae</em> (which is uncommon):</p>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-6-1.png" width="900"></p>
@ -267,102 +273,110 @@
<h3 class="hasAnchor">
<a href="#repetitive-results" class="anchor"></a>Repetitive results</h3>
<p>Repetitive results are unique values that are present more than once. Unique values will only be calculated once by <code><a href="../reference/as.mo.html">as.mo()</a></code>. We will use <code><a href="../reference/mo_property.html">mo_name()</a></code> for this test - a helper function that returns the full microbial name (genus, species and possibly subspecies) which uses <code><a href="../reference/as.mo.html">as.mo()</a></code> internally.</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="co"># take all MO codes from the example_isolates data set</span>
<span class="no">x</span> <span class="kw">&lt;-</span> <span class="no">example_isolates</span>$<span class="no">mo</span> <span class="kw">%&gt;%</span>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="co"># take all MO codes from the example_isolates data set</span>
<span class="kw">x</span> <span class="op">&lt;-</span> <span class="kw">example_isolates</span><span class="op">$</span><span class="kw">mo</span> <span class="op">%&gt;%</span>
<span class="co"># keep only the unique ones</span>
<span class="fu"><a href="https://rdrr.io/r/base/unique.html">unique</a></span>() <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://rdrr.io/r/base/unique.html">unique</a></span>() <span class="op">%&gt;%</span>
<span class="co"># pick 50 of them at random</span>
<span class="fu"><a href="https://rdrr.io/r/base/sample.html">sample</a></span>(<span class="fl">50</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://rdrr.io/r/base/sample.html">sample</a></span>(<span class="fl">50</span>) <span class="op">%&gt;%</span>
<span class="co"># paste that 10,000 times</span>
<span class="fu"><a href="https://rdrr.io/r/base/rep.html">rep</a></span>(<span class="fl">10000</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://rdrr.io/r/base/rep.html">rep</a></span>(<span class="fl">10000</span>) <span class="op">%&gt;%</span>
<span class="co"># scramble it</span>
<span class="fu"><a href="https://rdrr.io/r/base/sample.html">sample</a></span>()
<span class="co"># got indeed 50 times 10,000 = half a million?</span>
<span class="fu"><a href="https://rdrr.io/r/base/length.html">length</a></span>(<span class="no">x</span>)
<span class="fu"><a href="https://rdrr.io/r/base/length.html">length</a></span>(<span class="kw">x</span>)
<span class="co"># [1] 500000</span>
<span class="co"># and how many unique values do we have?</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/n_distinct.html">n_distinct</a></span>(<span class="no">x</span>)
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/n_distinct.html">n_distinct</a></span>(<span class="kw">x</span>)
<span class="co"># [1] 50</span>
<span class="co"># now let's see:</span>
<span class="no">run_it</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="no">x</span>),
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">run_it</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">3</span>)
<span class="kw">run_it</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="kw">x</span>),
times = <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">run_it</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">3</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># mo_name(x) 1720 1760 1820 1800 1830 1990 10</span></pre></body></html></div>
<p>So transforming 500,000 values (!!) of 50 unique values only takes 1.8 seconds. You only lose time on your unique input values.</p>
<span class="co"># mo_name(x) 1840 1870 1950 1940 1980 2140 10</span>
</pre></div>
<p>So transforming 500,000 values (!!) of 50 unique values only takes 1.94 seconds. You only lose time on your unique input values.</p>
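<p>The idea of calculating each unique value only once can be sketched in base R as follows. This is a minimal illustration under assumptions, not the actual implementation of <code>as.mo()</code>: <code>slow_lookup</code> and <code>lookup_once</code> are hypothetical names, and <code>toupper()</code> merely stands in for an expensive lookup.</p>
<div class="sourceCode"><pre class="r">
# hypothetical stand-in for an expensive per-value lookup
slow_lookup &lt;- function(value) {
  toupper(value)
}

# run the lookup once per unique value, then map the results
# back to every position of the original vector
lookup_once &lt;- function(x) {
  ux &lt;- unique(x)
  results &lt;- vapply(ux, slow_lookup, character(1), USE.NAMES = FALSE)
  results[match(x, ux)]
}

x &lt;- rep(c("a", "b", "c"), times = 4)
identical(lookup_once(x), toupper(x))
# [1] TRUE
</pre></div>
<p>With this approach the cost grows with the number of unique values rather than with the length of the input, which matches the timings above.</p>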
</div>
<div id="precalculated-results" class="section level3">
<h3 class="hasAnchor">
<a href="#precalculated-results" class="anchor"></a>Precalculated results</h3>
<p>What about precalculated results? If the input is an already precalculated result of a helper function like <code><a href="../reference/mo_property.html">mo_name()</a></code>, it takes almost no time at all (see C below):</p>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="no">run_it</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="kw">A</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"B_STPHY_AURS"</span>),
<span class="kw">B</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"S. aureus"</span>),
<span class="kw">C</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"Staphylococcus aureus"</span>),
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">run_it</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">3</span>)
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="kw">run_it</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(A = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"B_STPHY_AURS"</span>),
B = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"S. aureus"</span>),
C = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"Staphylococcus aureus"</span>),
times = <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">run_it</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">3</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># A 8.16 8.35 9.06 8.97 9.75 10.20 10</span>
<span class="co"># B 10.50 10.60 15.50 12.20 12.80 49.90 10</span>
<span class="co"># C 1.04 1.15 1.21 1.19 1.27 1.53 10</span></pre></body></html></div>
<p>So going from <code><a href="../reference/mo_property.html">mo_name("Staphylococcus aureus")</a></code> to <code>"Staphylococcus aureus"</code> takes 0.0012 seconds - it doesn't even start calculating <em>if the result would be the same as the expected resulting value</em>. That goes for all helper functions:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="no">run_it</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="kw">A</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_species</a></span>(<span class="st">"aureus"</span>),
<span class="kw">B</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="st">"Staphylococcus"</span>),
<span class="kw">C</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"Staphylococcus aureus"</span>),
<span class="kw">D</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_family</a></span>(<span class="st">"Staphylococcaceae"</span>),
<span class="kw">E</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="st">"Bacillales"</span>),
<span class="kw">F</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_class</a></span>(<span class="st">"Bacilli"</span>),
<span class="kw">G</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_phylum</a></span>(<span class="st">"Firmicutes"</span>),
<span class="kw">H</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_kingdom</a></span>(<span class="st">"Bacteria"</span>),
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">run_it</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">3</span>)
<span class="co"># A 8.17 8.49 9.32 9.32 9.90 10.90 10</span>
<span class="co"># B 10.90 11.80 16.30 13.20 14.70 45.60 10</span>
<span class="co"># C 1.06 1.22 1.32 1.28 1.44 1.57 10</span>
</pre></div>
<p>So going from <code><a href="../reference/mo_property.html">mo_name("Staphylococcus aureus")</a></code> to <code>"Staphylococcus aureus"</code> takes 0.0013 seconds - it doesn't even start calculating <em>if the result would be the same as the expected resulting value</em>. That goes for all helper functions:</p>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="kw">run_it</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(A = <span class="fu"><a href="../reference/mo_property.html">mo_species</a></span>(<span class="st">"aureus"</span>),
B = <span class="fu"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="st">"Staphylococcus"</span>),
C = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"Staphylococcus aureus"</span>),
D = <span class="fu"><a href="../reference/mo_property.html">mo_family</a></span>(<span class="st">"Staphylococcaceae"</span>),
E = <span class="fu"><a href="../reference/mo_property.html">mo_order</a></span>(<span class="st">"Bacillales"</span>),
F = <span class="fu"><a href="../reference/mo_property.html">mo_class</a></span>(<span class="st">"Bacilli"</span>),
G = <span class="fu"><a href="../reference/mo_property.html">mo_phylum</a></span>(<span class="st">"Firmicutes"</span>),
H = <span class="fu"><a href="../reference/mo_property.html">mo_kingdom</a></span>(<span class="st">"Bacteria"</span>),
times = <span class="fl">10</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">run_it</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">3</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># A 0.948 0.971 1.14 1.020 1.39 1.52 10</span>
<span class="co"># B 0.968 1.040 1.22 1.190 1.41 1.56 10</span>
<span class="co"># C 0.979 1.020 1.31 1.260 1.58 1.66 10</span>
<span class="co"># D 0.964 1.010 1.24 1.190 1.45 1.83 10</span>
<span class="co"># E 0.977 0.995 1.15 1.030 1.40 1.45 10</span>
<span class="co"># F 0.878 0.982 1.11 1.010 1.37 1.43 10</span>
<span class="co"># G 0.929 0.961 1.18 1.000 1.43 1.58 10</span>
<span class="co"># H 0.901 0.967 1.09 0.998 1.35 1.40 10</span></pre></body></html></div>
<span class="co"># A 1.020 1.030 1.11 1.060 1.22 1.33 10</span>
<span class="co"># B 0.982 1.010 1.10 1.040 1.21 1.38 10</span>
<span class="co"># C 0.992 1.020 1.13 1.040 1.24 1.58 10</span>
<span class="co"># D 0.987 1.000 1.07 1.030 1.08 1.29 10</span>
<span class="co"># E 0.978 0.982 1.02 0.999 1.03 1.15 10</span>
<span class="co"># F 0.975 0.992 1.05 1.000 1.03 1.26 10</span>
<span class="co"># G 0.976 0.983 1.02 0.994 1.03 1.22 10</span>
<span class="co"># H 0.977 1.010 1.11 1.090 1.21 1.28 10</span>
</pre></div>
<p>Of course, when running <code><a href="../reference/mo_property.html">mo_phylum("Firmicutes")</a></code> the function has zero knowledge about the actual microorganism, namely <em>S. aureus</em>. But since the result would be <code>"Firmicutes"</code> anyway, there is no point in calculating the result. And because this package knows all phyla of all known bacteria (according to the Catalogue of Life), it can just return the initial value immediately.</p>
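<p>A minimal sketch of such a shortcut is shown below. It assumes a small, illustrative vector of valid phyla, whereas the package itself relies on its full taxonomic reference data; <code>known_phyla</code> and <code>phylum_or_lookup</code> are hypothetical names that do not exist in the <code>AMR</code> package.</p>
<div class="sourceCode"><pre class="r">
# illustrative values only, not the full list of phyla
known_phyla &lt;- c("Firmicutes", "Proteobacteria", "Actinobacteria")

# hypothetical helper: return the input unchanged when it already is a
# valid result, and only fall back to the expensive lookup otherwise
phylum_or_lookup &lt;- function(x, lookup) {
  if (all(x %in% known_phyla)) {
    return(x)  # nothing to calculate
  }
  lookup(x)
}

# the lookup function is never called here, so no error is raised
phylum_or_lookup("Firmicutes", lookup = function(x) stop("should not be called"))
# [1] "Firmicutes"
</pre></div>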
</div>
<div id="results-in-other-languages" class="section level3">
<h3 class="hasAnchor">
<a href="#results-in-other-languages" class="anchor"></a>Results in other languages</h3>
<p>When the system language is non-English and supported by this <code>AMR</code> package, some functions will have a translated result. This takes almost no extra time:</p>
<div class="sourceCode" id="cb7"><html><body><pre class="r"><span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"en"</span>) <span class="co"># or just mo_name("CoNS") on an English system</span>
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"en"</span>) <span class="co"># or just mo_name("CoNS") on an English system</span>
<span class="co"># [1] "Coagulase-negative Staphylococcus (CoNS)"</span>
<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"es"</span>) <span class="co"># or just mo_name("CoNS") on a Spanish system</span>
<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"es"</span>) <span class="co"># or just mo_name("CoNS") on a Spanish system</span>
<span class="co"># [1] "Staphylococcus coagulasa negativo (SCN)"</span>
<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"nl"</span>) <span class="co"># or just mo_name("CoNS") on a Dutch system</span>
<span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"nl"</span>) <span class="co"># or just mo_name("CoNS") on a Dutch system</span>
<span class="co"># [1] "Coagulase-negatieve Staphylococcus (CNS)"</span>
<span class="no">run_it</span> <span class="kw">&lt;-</span> <span class="fu">microbenchmark</span>(<span class="kw">en</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"en"</span>),
<span class="kw">de</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"de"</span>),
<span class="kw">nl</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"nl"</span>),
<span class="kw">es</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"es"</span>),
<span class="kw">it</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"it"</span>),
<span class="kw">fr</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"fr"</span>),
<span class="kw">pt</span> <span class="kw">=</span> <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="st">"pt"</span>),
<span class="kw">times</span> <span class="kw">=</span> <span class="fl">100</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="no">run_it</span>, <span class="kw">unit</span> <span class="kw">=</span> <span class="st">"ms"</span>, <span class="kw">signif</span> <span class="kw">=</span> <span class="fl">4</span>)
<span class="kw">run_it</span> <span class="op">&lt;-</span> <span class="fu">microbenchmark</span>(en = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"en"</span>),
de = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"de"</span>),
nl = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"nl"</span>),
es = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"es"</span>),
it = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"it"</span>),
fr = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"fr"</span>),
pt = <span class="fu"><a href="../reference/mo_property.html">mo_name</a></span>(<span class="st">"CoNS"</span>, language = <span class="st">"pt"</span>),
times = <span class="fl">100</span>)
<span class="fu"><a href="https://rdrr.io/r/base/print.html">print</a></span>(<span class="kw">run_it</span>, unit = <span class="st">"ms"</span>, signif = <span class="fl">4</span>)
<span class="co"># Unit: milliseconds</span>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># en 12.09 12.46 15.90 13.86 14.55 57.62 100</span>
<span class="co"># de 12.92 13.26 19.73 14.63 16.01 61.55 100</span>
<span class="co"># nl 16.53 17.00 20.26 17.64 19.93 57.54 100</span>
<span class="co"># es 12.98 13.28 18.27 14.76 15.64 179.30 100</span>
<span class="co"># it 12.92 13.15 19.20 14.08 16.08 64.07 100</span>
<span class="co"># fr 12.99 13.21 17.81 13.59 15.71 67.97 100</span>
<span class="co"># pt 13.00 13.23 17.30 14.35 15.65 69.85 100</span></pre></body></html></div>
<span class="co"># expr min lq mean median uq max neval</span>
<span class="co"># en 12.40 14.34 17.88 14.89 15.48 55.22 100</span>
<span class="co"># de 13.17 14.30 17.90 15.84 16.66 56.60 100</span>
<span class="co"># nl 17.14 19.86 24.99 20.78 21.70 64.66 100</span>
<span class="co"># es 13.43 15.29 17.65 15.93 16.59 54.38 100</span>
<span class="co"># it 13.33 14.83 18.35 15.68 16.36 57.61 100</span>
<span class="co"># fr 13.40 15.43 18.66 16.01 16.59 54.35 100</span>
<span class="co"># pt 13.47 15.33 18.93 16.15 16.84 57.28 100</span>
</pre></div>
<p>Currently supported are German, Dutch, Spanish, Italian, French and Portuguese.</p>
</div>
</div>
@ -380,7 +394,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>


View File

@ -81,7 +81,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9000</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -263,7 +263,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -186,7 +186,7 @@
<h1 data-toc-skip>How to predict antimicrobial resistance</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">30 July 2020</h4>
<h4 class="date">10 August 2020</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/master/vignettes/resistance_predict.Rmd"><code>vignettes/resistance_predict.Rmd</code></a></small>
<div class="hidden name"><code>resistance_predict.Rmd</code></div>
@ -200,35 +200,38 @@
<a href="#needed-r-packages" class="anchor"></a>Needed R packages</h2>
<p>As is common in R, we need some additional packages for AMR analysis. Our package works closely together with the <a href="https://www.tidyverse.org">tidyverse packages</a> <a href="https://dplyr.tidyverse.org/"><code>dplyr</code></a> and <a href="https://ggplot2.tidyverse.org"><code>ggplot2</code></a> by Dr Hadley Wickham. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing code and creating beautiful plots in R.</p>
<p>Our <code>AMR</code> package depends on these packages and even extends their use and functions.</p>
<div class="sourceCode" id="cb1"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">dplyr</span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">ggplot2</span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="no">AMR</span>)
<div class="sourceCode" id="cb1"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org">dplyr</a></span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="http://ggplot2.tidyverse.org">ggplot2</a></span>)
<span class="fu"><a href="https://rdrr.io/r/base/library.html">library</a></span>(<span class="kw"><a href="https://msberends.github.io/AMR">AMR</a></span>)
<span class="co"># (if not yet installed, install with:)</span>
<span class="co"># install.packages(c("tidyverse", "AMR"))</span></pre></body></html></div>
<span class="co"># install.packages(c("tidyverse", "AMR"))</span>
</pre></div>
</div>
<div id="prediction-analysis" class="section level2">
<h2 class="hasAnchor">
<a href="#prediction-analysis" class="anchor"></a>Prediction analysis</h2>
<p>Our package contains a function <code><a href="../reference/resistance_predict.html">resistance_predict()</a></code>, which takes the same input as functions for <a href="./AMR.html">other AMR analysis</a>. Based on a date column, it calculates cases per year and uses a regression model to predict antimicrobial resistance.</p>
<p>It is basically as easy as:</p>
<div class="sourceCode" id="cb2"><html><body><pre class="r"># resistance prediction of piperacillin/tazobactam (TZP):
resistance_predict(tbl = example_isolates, col_date = "date", col_ab = "TZP", model = "binomial")
# or:
example_isolates %&gt;%
resistance_predict(col_ab = "TZP",
model = "binomial")
# to bind it to object 'predict_TZP' for example:
predict_TZP &lt;- example_isolates %&gt;%
resistance_predict(col_ab = "TZP",
model = "binomial")</pre></body></html></div>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1"></a><span class="co"># resistance prediction of piperacillin/tazobactam (TZP):</span></span>
<span id="cb2-2"><a href="#cb2-2"></a><span class="kw">resistance_predict</span>(<span class="dt">tbl =</span> example_isolates, <span class="dt">col_date =</span> <span class="st">"date"</span>, <span class="dt">col_ab =</span> <span class="st">"TZP"</span>, <span class="dt">model =</span> <span class="st">"binomial"</span>)</span>
<span id="cb2-3"><a href="#cb2-3"></a></span>
<span id="cb2-4"><a href="#cb2-4"></a><span class="co"># or:</span></span>
<span id="cb2-5"><a href="#cb2-5"></a>example_isolates <span class="op">%&gt;%</span><span class="st"> </span></span>
<span id="cb2-6"><a href="#cb2-6"></a><span class="st"> </span><span class="kw">resistance_predict</span>(<span class="dt">col_ab =</span> <span class="st">"TZP"</span>,</span>
<span id="cb2-7"><a href="#cb2-7"></a>                     <span class="dt">model =</span> <span class="st">"binomial"</span>)</span>
<span id="cb2-8"><a href="#cb2-8"></a></span>
<span id="cb2-9"><a href="#cb2-9"></a><span class="co"># to bind it to object 'predict_TZP' for example:</span></span>
<span id="cb2-10"><a href="#cb2-10"></a>predict_TZP &lt;-<span class="st"> </span>example_isolates <span class="op">%&gt;%</span><span class="st"> </span></span>
<span id="cb2-11"><a href="#cb2-11"></a><span class="st"> </span><span class="kw">resistance_predict</span>(<span class="dt">col_ab =</span> <span class="st">"TZP"</span>,</span>
<span id="cb2-12"><a href="#cb2-12"></a> <span class="dt">model =</span> <span class="st">"binomial"</span>)</span></code></pre></div>
<p>The function will look for a date column itself if <code>col_date</code> is not set.</p>
<p>When running any of these commands, a summary of the regression model will be printed unless using <code><a href="../reference/resistance_predict.html">resistance_predict(..., info = FALSE)</a></code>.</p>
<pre><code># NOTE: Using column `date` as input for `col_date`.</code></pre>
<p>This text is only a printed summary. The actual result (output) of the function is a <code>data.frame</code> that contains, for each year, the number of observations, the observed resistance, the estimated resistance, and the bounds one standard error below and above the estimate:</p>
<div class="sourceCode" id="cb4"><html><body><pre class="r"><span class="no">predict_TZP</span>
<div class="sourceCode" id="cb4"><pre class="downlit">
<span class="kw">predict_TZP</span>
<span class="co"># year value se_min se_max observations observed estimated</span>
<span class="co"># 1 2002 0.20000000 NA NA 15 0.20000000 0.05616378</span>
<span class="co"># 2 2003 0.06250000 NA NA 32 0.06250000 0.06163839</span>
@ -258,27 +261,36 @@ predict_TZP &lt;- example_isolates %&gt;%
<span class="co"># 26 2027 0.41315710 0.3244399 0.5018743 NA NA 0.41315710</span>
<span class="co"># 27 2028 0.43730688 0.3418075 0.5328063 NA NA 0.43730688</span>
<span class="co"># 28 2029 0.46175755 0.3597639 0.5637512 NA NA 0.46175755</span>
<span class="co"># 29 2030 0.48639359 0.3782932 0.5944939 NA NA 0.48639359</span></pre></body></html></div>
<span class="co"># 29 2030 0.48639359 0.3782932 0.5944939 NA NA 0.48639359</span>
</pre></div>
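<p>Because the result is an ordinary <code>data.frame</code>, it can be subset like any other. The sketch below is only an illustration: <code>pred</code> is a hand-built stand-in that copies four rows from the printed output above (it is not the real <code>predict_TZP</code> object), and it assumes that forecast years are the rows where <code>observations</code> is <code>NA</code>, as the printed output suggests.</p>

```r
# Stand-in for `predict_TZP`: four rows copied from the printed output above.
# (Hypothetical object for illustration; the real result has one row per year.)
pred <- data.frame(
  year         = c(2002, 2003, 2029, 2030),
  value        = c(0.20000000, 0.06250000, 0.46175755, 0.48639359),
  se_min       = c(NA, NA, 0.3597639, 0.3782932),
  se_max       = c(NA, NA, 0.5637512, 0.5944939),
  observations = c(15, 32, NA, NA),
  observed     = c(0.20000000, 0.06250000, NA, NA),
  estimated    = c(0.05616378, 0.06163839, 0.46175755, 0.48639359)
)

# Keep only the forecast years (no observations available)
forecast <- pred[is.na(pred$observations), c("year", "estimated", "se_min", "se_max")]

# Width of the standard-error band around each estimate
forecast$se_width <- forecast$se_max - forecast$se_min

print(forecast)
```

<p>The same subsetting should work on the real result, provided it has the columns shown in the printed output.</p>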
<p>The function <code>plot</code> is available in base R and can be extended by other packages so that its output depends on the type of input. We extended it to handle resistance predictions:</p>
<div class="sourceCode" id="cb5"><html><body><pre class="r"><span class="fu"><a href="https://rdrr.io/r/base/plot.html">plot</a></span>(<span class="no">predict_TZP</span>)</pre></body></html></div>
<div class="sourceCode" id="cb5"><pre class="downlit">
<span class="fu"><a href="https://rdrr.io/r/graphics/plot.default.html">plot</a></span>(<span class="kw">predict_TZP</span>)
</pre></div>
<p><img src="resistance_predict_files/figure-html/unnamed-chunk-4-1.png" width="720"></p>
<p>This is the fastest way to plot the result. It automatically adds the right axes, error bars, titles, number of available observations and type of model.</p>
<p>We also support the <code>ggplot2</code> package with our custom function <code><a href="../reference/resistance_predict.html">ggplot_rsi_predict()</a></code> to create more appealing plots:</p>
<div class="sourceCode" id="cb6"><html><body><pre class="r"><span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>(<span class="no">predict_TZP</span>)</pre></body></html></div>
<div class="sourceCode" id="cb6"><pre class="downlit">
<span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>(<span class="kw">predict_TZP</span>)
</pre></div>
<p><img src="resistance_predict_files/figure-html/unnamed-chunk-5-1.png" width="720"></p>
<div class="sourceCode" id="cb7"><html><body><pre class="r">
<div class="sourceCode" id="cb7"><pre class="downlit">
<span class="co"># use error bars instead of a ribbon</span>
<span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>(<span class="no">predict_TZP</span>, <span class="kw">ribbon</span> <span class="kw">=</span> <span class="fl">FALSE</span>)</pre></body></html></div>
<span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>(<span class="kw">predict_TZP</span>, ribbon = <span class="fl">FALSE</span>)
</pre></div>
<p><img src="resistance_predict_files/figure-html/unnamed-chunk-5-2.png" width="720"></p>
<div id="choosing-the-right-model" class="section level3">
<h3 class="hasAnchor">
<a href="#choosing-the-right-model" class="anchor"></a>Choosing the right model</h3>
<p>Resistance is not easily predicted; if we look at vancomycin resistance in Gram-positive bacteria, the spread (i.e. standard error) is enormous:</p>
<div class="sourceCode" id="cb8"><html><body><pre class="r"><span class="no">example_isolates</span> <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span>(<span class="no">mo</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="kw">NULL</span>) <span class="kw">==</span> <span class="st">"Gram-positive"</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">resistance_predict</a></span>(<span class="kw">col_ab</span> <span class="kw">=</span> <span class="st">"VAN"</span>, <span class="kw">year_min</span> <span class="kw">=</span> <span class="fl">2010</span>, <span class="kw">info</span> <span class="kw">=</span> <span class="fl">FALSE</span>, <span class="kw">model</span> <span class="kw">=</span> <span class="st">"binomial"</span>) <span class="kw">%&gt;%</span>
<div class="sourceCode" id="cb8"><pre class="downlit">
<span class="kw">example_isolates</span> <span class="op">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span>(<span class="kw">mo</span>, language = <span class="kw">NULL</span>) <span class="op">==</span> <span class="st">"Gram-positive"</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">resistance_predict</a></span>(col_ab = <span class="st">"VAN"</span>, year_min = <span class="fl">2010</span>, info = <span class="fl">FALSE</span>, model = <span class="st">"binomial"</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>()
<span class="co"># NOTE: Using column `date` as input for `col_date`.</span></pre></body></html></div>
<span class="co"># NOTE: Using column `date` as input for `col_date`.</span>
</pre></div>
<p><img src="resistance_predict_files/figure-html/unnamed-chunk-6-1.png" width="720"></p>
<p>Vancomycin resistance could be 100% in ten years, but might also stay around 0%.</p>
<p>You can define the model with the <code>model</code> parameter. The model chosen above is a generalised linear regression model using a binomial distribution, assuming that a period of zero resistance was followed by a period of increasing resistance leading slowly to more and more resistance.</p>
@ -319,25 +331,29 @@ predict_TZP &lt;- example_isolates %&gt;%
</tbody>
</table>
<p>For the vancomycin resistance in Gram-positive bacteria, a linear model might be more appropriate since no binomial distribution is to be expected based on the observed years:</p>
<div class="sourceCode" id="cb9"><html><body><pre class="r"><span class="no">example_isolates</span> <span class="kw">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span>(<span class="no">mo</span>, <span class="kw">language</span> <span class="kw">=</span> <span class="kw">NULL</span>) <span class="kw">==</span> <span class="st">"Gram-positive"</span>) <span class="kw">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">resistance_predict</a></span>(<span class="kw">col_ab</span> <span class="kw">=</span> <span class="st">"VAN"</span>, <span class="kw">year_min</span> <span class="kw">=</span> <span class="fl">2010</span>, <span class="kw">info</span> <span class="kw">=</span> <span class="fl">FALSE</span>, <span class="kw">model</span> <span class="kw">=</span> <span class="st">"linear"</span>) <span class="kw">%&gt;%</span>
<div class="sourceCode" id="cb9"><pre class="downlit">
<span class="kw">example_isolates</span> <span class="op">%&gt;%</span>
<span class="fu"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="fu"><a href="../reference/mo_property.html">mo_gramstain</a></span>(<span class="kw">mo</span>, language = <span class="kw">NULL</span>) <span class="op">==</span> <span class="st">"Gram-positive"</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">resistance_predict</a></span>(col_ab = <span class="st">"VAN"</span>, year_min = <span class="fl">2010</span>, info = <span class="fl">FALSE</span>, model = <span class="st">"linear"</span>) <span class="op">%&gt;%</span>
<span class="fu"><a href="../reference/resistance_predict.html">ggplot_rsi_predict</a></span>()
<span class="co"># NOTE: Using column `date` as input for `col_date`.</span></pre></body></html></div>
<span class="co"># NOTE: Using column `date` as input for `col_date`.</span>
</pre></div>
<p><img src="resistance_predict_files/figure-html/unnamed-chunk-7-1.png" width="720"></p>
<p>This seems more likely, doesn't it?</p>
<p>The model itself is also available from the object, as an <code>attribute</code>:</p>
<div class="sourceCode" id="cb10"><html><body><pre class="r"><span class="no">model</span> <span class="kw">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/attributes.html">attributes</a></span>(<span class="no">predict_TZP</span>)$<span class="no">model</span>
<div class="sourceCode" id="cb10"><pre class="downlit">
<span class="kw">model</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/attributes.html">attributes</a></span>(<span class="kw">predict_TZP</span>)<span class="op">$</span><span class="kw">model</span>
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="no">model</span>)$<span class="no">family</span>
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="kw">model</span>)<span class="op">$</span><span class="kw">family</span>
<span class="co"># </span>
<span class="co"># Family: binomial </span>
<span class="co"># Link function: logit</span>
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="no">model</span>)$<span class="no">coefficients</span>
<span class="fu"><a href="https://rdrr.io/r/base/summary.html">summary</a></span>(<span class="kw">model</span>)<span class="op">$</span><span class="kw">coefficients</span>
<span class="co"># Estimate Std. Error z value Pr(&gt;|z|)</span>
<span class="co"># (Intercept) -200.67944891 46.17315349 -4.346237 1.384932e-05</span>
<span class="co"># year 0.09883005 0.02295317 4.305725 1.664395e-05</span></pre></body></html></div>
<span class="co"># year 0.09883005 0.02295317 4.305725 1.664395e-05</span>
</pre></div>
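<p>To make the binomial model more concrete, here is a minimal base R sketch of a logistic regression on yearly counts. It is an illustration under stated assumptions, not necessarily how <code>resistance_predict()</code> is implemented: the counts are made up, and the names <code>counts</code>, <code>fit</code>, <code>future</code> and <code>result</code> are hypothetical.</p>

```r
# Hypothetical yearly counts of resistant (R) and susceptible (S) isolates
counts <- data.frame(
  year = 2010:2017,
  R    = c(2, 3, 3, 5, 6, 8, 9, 12),
  S    = c(48, 47, 47, 45, 44, 42, 41, 38)
)

# Generalised linear model with a binomial distribution and logit link;
# cbind(R, S) gives the number of "successes" and "failures" per year
fit <- glm(cbind(R, S) ~ year, data = counts, family = binomial(link = "logit"))

# Predict the resistant proportion for future years, with standard errors
future <- data.frame(year = 2018:2020)
pred   <- predict(fit, newdata = future, type = "response", se.fit = TRUE)

# Assemble a table similar in spirit to the output shown earlier,
# keeping the bounds inside the valid range [0, 1]
result <- data.frame(
  year      = future$year,
  estimated = pred$fit,
  se_min    = pmax(pred$fit - pred$se.fit, 0),
  se_max    = pmin(pred$fit + pred$se.fit, 1)
)

print(result)
print(summary(fit)$family)
```

<p>Swapping <code>family = binomial(link = "logit")</code> for a Gaussian family would give an ordinary linear fit, which is the kind of alternative the <code>model</code> parameter lets you choose.</p>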
</div>
</div>
</div>
@ -357,7 +373,7 @@ predict_TZP &lt;- example_isolates %&gt;%
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>

View File

@ -39,7 +39,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9000</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9001</span>
</span>
</div>
@ -195,6 +195,7 @@
<p><strong>READ ALL VIGNETTES <a href="https://msberends.github.io/AMR/articles/">ON OUR WEBSITE</a></strong></p>
<div id="welcome-to-the-amr-package" class="section level1">
<h1 class="hasAnchor">
<a href="#welcome-to-the-amr-package" class="anchor"></a>Welcome to the AMR package</h1>
@ -243,7 +244,7 @@
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.</p>
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.5.1.9000.</p>
</div>
</footer>