mirror of https://github.com/msberends/AMR.git synced 2025-07-08 10:31:53 +02:00

algorithm update

This commit is contained in:
2019-02-21 18:55:52 +01:00
parent c6e57ca456
commit 68a9a35ed6
112 changed files with 586 additions and 886 deletions

View File

@ -192,7 +192,7 @@
<h1>How to conduct AMR analysis</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>AMR.Rmd</code></div>
@ -201,7 +201,7 @@
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 20 February 2019.</p>
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 21 February 2019.</p>
<div id="introduction" class="section level1">
<h1 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h1>
@ -217,21 +217,21 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2019-02-20</td>
<td align="center">2019-02-21</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
</tr>
<tr class="even">
<td align="center">2019-02-20</td>
<td align="center">2019-02-21</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
</tr>
<tr class="odd">
<td align="center">2019-02-20</td>
<td align="center">2019-02-21</td>
<td align="center">efgh</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
@ -327,67 +327,67 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2010-03-15</td>
<td align="center">U2</td>
<td align="center">2012-08-08</td>
<td align="center">P2</td>
<td align="center">Hospital B</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2010-08-09</td>
<td align="center">Q7</td>
<td align="center">Hospital B</td>
<td align="center">2011-03-05</td>
<td align="center">D8</td>
<td align="center">Hospital A</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2012-04-03</td>
<td align="center">D4</td>
<td align="center">Hospital C</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2012-03-04</td>
<td align="center">P2</td>
<tr class="even">
<td align="center">2012-10-25</td>
<td align="center">I1</td>
<td align="center">Hospital B</td>
<td align="center">Klebsiella pneumoniae</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2013-10-19</td>
<td align="center">K5</td>
<td align="center">Hospital C</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2017-10-05</td>
<td align="center">I7</td>
<td align="center">Hospital D</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">2017-04-18</td>
<td align="center">X3</td>
<td align="center">Hospital B</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2017-09-27</td>
<td align="center">H7</td>
<td align="center">Hospital C</td>
<td align="center">Staphylococcus aureus</td>
<td align="center">S</td>
<td align="center">2013-03-18</td>
<td align="center">C4</td>
<td align="center">Hospital A</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
@ -411,8 +411,8 @@
#&gt;
#&gt; Item Count Percent Cum. Count Cum. Percent
#&gt; --- ----- ------- -------- ----------- -------------
#&gt; 1 M 10,384 51.9% 10,384 51.9%
#&gt; 2 F 9,616 48.1% 20,000 100.0%</code></pre>
#&gt; 1 M 10,458 52.3% 10,458 52.3%
#&gt; 2 F 9,542 47.7% 20,000 100.0%</code></pre>
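<p>For readers who want to see what such a frequency table contains, here is a minimal base R sketch. It is not the implementation of <code>freq()</code>; the vector <code>gender</code> and the column names of <code>freq_tbl</code> are hypothetical and chosen only for illustration.</p>

```r
# Hypothetical toy data, not the 20,000 rows used on this page
gender <- c("M", "F", "M", "M", "F")

# Count each value and sort from most to least frequent
counts <- sort(table(gender), decreasing = TRUE)

# Build the table: item, count, percent, cumulative count, cumulative percent
freq_tbl <- data.frame(
  item    = names(counts),
  count   = as.integer(counts),
  percent = round(100 * as.integer(counts) / sum(counts), 1),
  stringsAsFactors = FALSE
)
freq_tbl$cum_count   <- cumsum(freq_tbl$count)
freq_tbl$cum_percent <- round(100 * freq_tbl$cum_count / sum(freq_tbl$count), 1)

freq_tbl
```

<p>The cumulative columns are running totals of the count and percent columns, which is why the last row always ends at 100%.</p>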
<p>So, we can draw at least two conclusions immediately. From a data scientist's perspective, the data looks clean: only the values <code>M</code> and <code>F</code>. From a researcher's perspective: there are slightly more men. Nothing we didn't already know.</p>
<p>The data is already quite clean, but we still need to transform some variables. The <code>bacteria</code> column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> function of the <code>dplyr</code> package makes this really easy:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1">data &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span></a>
@ -443,10 +443,10 @@
<a class="sourceLine" id="cb14-19" title="19"><span class="co">#&gt; Kingella kingae (no changes)</span></a>
<a class="sourceLine" id="cb14-20" title="20"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-21" title="21"><span class="co">#&gt; EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1261 changes)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1342 changes)</span></a>
<a class="sourceLine" id="cb14-23" title="23"><span class="co">#&gt; Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-24" title="24"><span class="co">#&gt; Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2655 changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2761 changes)</span></a>
<a class="sourceLine" id="cb14-26" title="26"><span class="co">#&gt; Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb14-27" title="27"><span class="co">#&gt; Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb14-28" title="28"><span class="co">#&gt; Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)</span></a>
@ -462,9 +462,9 @@
<a class="sourceLine" id="cb14-38" title="38"><span class="co">#&gt; Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-39" title="39"><span class="co">#&gt; Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb14-40" title="40"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,230 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,471 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-42" title="42"><span class="co">#&gt; -&gt; added 0 test results</span></a>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 3,916 test results (0 to S; 0 to I; 3,916 to R)</span></a></code></pre></div>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 4,103 test results (0 to S; 0 to I; 4,103 to R)</span></a></code></pre></div>
</div>
<div id="adding-new-variables" class="section level1">
<h1 class="hasAnchor">
@ -489,7 +489,7 @@
<a class="sourceLine" id="cb16-3" title="3"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb16-4" title="4"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb16-5" title="5"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,689 first isolates (28.4% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,674 first isolates (28.4% of total)</span></a></code></pre></div>
<p>So only 28.4% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb17-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb17-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(first <span class="op">==</span><span class="st"> </span><span class="ot">TRUE</span>)</a></code></pre></div>
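<p>The idea behind a "first isolate" can be sketched in base R. This is a simplified illustration, not the code behind <code>first_isolate()</code>: it assumes that an isolate counts as first when it is the earliest one for a patient and organism, or when at least 365 days have passed since the previous first isolate. The helper <code>flag_first()</code>, the 365-day threshold and the toy data are all assumptions made for this example.</p>

```r
# Hypothetical toy data, not taken from the tables on this page
isolates <- data.frame(
  patient_id = c("A4", "A4", "A4", "B1"),
  bacteria   = c("B_ESCHR_COL", "B_ESCHR_COL", "B_ESCHR_COL", "B_ESCHR_COL"),
  date       = as.Date(c("2010-01-23", "2010-03-13", "2011-03-21", "2010-05-01")),
  stringsAsFactors = FALSE
)

# Flag first isolates within one patient/organism group.
# `days` must be sorted ascending (dates expressed as numeric day counts).
flag_first <- function(days, episode_days = 365) {
  first <- logical(length(days))
  last_first <- NA_real_
  for (i in seq_along(days)) {
    # A new episode starts at the first isolate, or once enough days have passed
    if (is.na(last_first) || days[i] - last_first >= episode_days) {
      first[i] <- TRUE
      last_first <- days[i]
    }
  }
  first
}

# Sort so that every group is in chronological order
isolates <- isolates[order(isolates$patient_id, isolates$bacteria, isolates$date), ]

# Apply the helper per patient and organism
key <- paste(isolates$patient_id, isolates$bacteria)
isolates$first <- as.logical(ave(as.numeric(isolates$date), key, FUN = flag_first))

# Keep only the flagged rows, analogous to filter(first == TRUE)
isolates_1st <- isolates[isolates$first, ]
```

<p>In this toy data the second isolate of patient A4 falls within the same episode and is dropped, while the third one, more than a year later, starts a new episode.</p>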
@ -516,8 +516,8 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-09</td>
<td align="center">V4</td>
<td align="center">2010-01-23</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -527,8 +527,8 @@
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-04-24</td>
<td align="center">V4</td>
<td align="center">2010-03-13</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -538,32 +538,32 @@
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-05-30</td>
<td align="center">V4</td>
<td align="center">2010-04-19</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-06-10</td>
<td align="center">V4</td>
<td align="center">2010-06-11</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-06-17</td>
<td align="center">V4</td>
<td align="center">2010-07-04</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@ -571,8 +571,8 @@
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-07-30</td>
<td align="center">V4</td>
<td align="center">2010-07-05</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -582,47 +582,47 @@
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-09-20</td>
<td align="center">V4</td>
<td align="center">2011-03-21</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-10-21</td>
<td align="center">V4</td>
<td align="center">2011-04-02</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-11-06</td>
<td align="center">V4</td>
<td align="center">2011-04-05</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-05-21</td>
<td align="center">V4</td>
<td align="center">2011-04-13</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
</tbody>
</table>
@ -637,7 +637,7 @@
<a class="sourceLine" id="cb19-7" title="7"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb19-8" title="8"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb19-9" title="9"><span class="co">#&gt; [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,871 first weighted isolates (79.4% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,801 first weighted isolates (79.0% of total)</span></a></code></pre></div>
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@ -654,8 +654,8 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-09</td>
<td align="center">V4</td>
<td align="center">2010-01-23</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -666,8 +666,8 @@
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-04-24</td>
<td align="center">V4</td>
<td align="center">2010-03-13</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -678,34 +678,34 @@
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-05-30</td>
<td align="center">V4</td>
<td align="center">2010-04-19</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-06-10</td>
<td align="center">V4</td>
<td align="center">2010-06-11</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-06-17</td>
<td align="center">V4</td>
<td align="center">2010-07-04</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@ -714,71 +714,71 @@
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-07-30</td>
<td align="center">V4</td>
<td align="center">2010-07-05</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-09-20</td>
<td align="center">V4</td>
<td align="center">2011-03-21</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-10-21</td>
<td align="center">V4</td>
<td align="center">2011-04-02</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-11-06</td>
<td align="center">V4</td>
<td align="center">2011-04-05</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-05-21</td>
<td align="center">V4</td>
<td align="center">2011-04-13</td>
<td align="center">A4</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">TRUE</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
</tbody>
</table>
<p>Instead of 2, now 6 isolates are flagged. In total, 79.4% of all isolates are marked first weighted - 50.9% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>Instead of 2, now 8 isolates are flagged. In total, 79.0% of all isolates are marked as first weighted isolates, 50.6 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>As with <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code>, there's a shortcut for this new algorithm too:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb20-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/first_isolate.html">filter_first_weighted_isolate</a></span>()</a></code></pre></div>
<p>So we end up with 15,871 isolates for analysis.</p>
<p>So we end up with 15,801 isolates for analysis.</p>
<p>We can remove unneeded columns:</p>
<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb21-1" title="1">data_1st &lt;-<span class="st"> </span>data_1st <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb21-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="op">-</span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(first, keyab))</a></code></pre></div>
@ -786,7 +786,6 @@
<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb22-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/utils/topics/head">head</a></span>(data_1st)</a></code></pre></div>
<table class="table">
<thead><tr class="header">
<th></th>
<th align="center">date</th>
<th align="center">patient_id</th>
<th align="center">hospital</th>
@ -803,14 +802,13 @@
</tr></thead>
<tbody>
<tr class="odd">
<td>1</td>
<td align="center">2010-03-15</td>
<td align="center">U2</td>
<td align="center">2012-08-08</td>
<td align="center">P2</td>
<td align="center">Hospital B</td>
<td align="center">B_STRPT_PNE</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">F</td>
<td align="center">Gram positive</td>
@ -819,74 +817,9 @@
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>2</td>
<td align="center">2010-08-09</td>
<td align="center">Q7</td>
<td align="center">Hospital B</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>4</td>
<td align="center">2013-10-19</td>
<td align="center">K5</td>
<td align="center">Hospital C</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>5</td>
<td align="center">2017-10-05</td>
<td align="center">I7</td>
<td align="center">Hospital D</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>6</td>
<td align="center">2017-09-27</td>
<td align="center">H7</td>
<td align="center">Hospital C</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>7</td>
<td align="center">2016-09-06</td>
<td align="center">L9</td>
<td align="center">Hospital B</td>
<td align="center">2011-03-05</td>
<td align="center">D8</td>
<td align="center">Hospital A</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -898,6 +831,66 @@
<td align="center">coli</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">2012-04-03</td>
<td align="center">D4</td>
<td align="center">Hospital C</td>
<td align="center">B_STPHY_AUR</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram positive</td>
<td align="center">Staphylococcus</td>
<td align="center">aureus</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2012-10-25</td>
<td align="center">I1</td>
<td align="center">Hospital B</td>
<td align="center">B_KLBSL_PNE</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Klebsiella</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">2017-04-18</td>
<td align="center">X3</td>
<td align="center">Hospital B</td>
<td align="center">B_STRPT_PNE</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">F</td>
<td align="center">Gram positive</td>
<td align="center">Streptococcus</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2013-03-18</td>
<td align="center">C4</td>
<td align="center">Hospital A</td>
<td align="center">B_STRPT_PNE</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">M</td>
<td align="center">Gram positive</td>
<td align="center">Streptococcus</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
</tbody>
</table>
<p>Time for the analysis!</p>
@ -915,9 +908,9 @@
<div class="sourceCode" id="cb23"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb23-1" title="1"><span class="kw"><a href="../reference/freq.html">freq</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/paste">paste</a></span>(data_1st<span class="op">$</span>genus, data_1st<span class="op">$</span>species))</a></code></pre></div>
<p>Or it can be used the <code>dplyr</code> way, which is easier to read:</p>
<div class="sourceCode" id="cb24"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb24-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(genus, species)</a></code></pre></div>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from a <code>data.frame</code> (15,871 x 13)</strong></p>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from a <code>data.frame</code> (15,801 x 13)</strong></p>
<p>Columns: 2<br>
Length: 15,871 (of which NA: 0 = 0.00%)<br>
Length: 15,801 (of which NA: 0 = 0.00%)<br>
Unique: 4</p>
<p>Shortest: 16<br>
Longest: 24</p>
@ -934,33 +927,33 @@ Longest: 24</p>
<tr class="odd">
<td align="left">1</td>
<td align="left">Escherichia coli</td>
<td align="right">7,903</td>
<td align="right">49.8%</td>
<td align="right">7,903</td>
<td align="right">49.8%</td>
<td align="right">7,850</td>
<td align="right">49.7%</td>
<td align="right">7,850</td>
<td align="right">49.7%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">Staphylococcus aureus</td>
<td align="right">3,987</td>
<td align="right">25.1%</td>
<td align="right">11,890</td>
<td align="right">74.9%</td>
<td align="right">3,918</td>
<td align="right">24.8%</td>
<td align="right">11,768</td>
<td align="right">74.5%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">Streptococcus pneumoniae</td>
<td align="right">2,426</td>
<td align="right">15.3%</td>
<td align="right">14,316</td>
<td align="right">90.2%</td>
<td align="right">2,446</td>
<td align="right">15.5%</td>
<td align="right">14,214</td>
<td align="right">90.0%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">Klebsiella pneumoniae</td>
<td align="right">1,555</td>
<td align="right">9.8%</td>
<td align="right">15,871</td>
<td align="right">1,587</td>
<td align="right">10.0%</td>
<td align="right">15,801</td>
<td align="right">100.0%</td>
</tr>
</tbody>
@ -971,7 +964,7 @@ Longest: 24</p>
<a href="#resistance-percentages" class="anchor"></a>Resistance percentages</h2>
<p>The functions <code>portion_R</code>, <code>portion_IR</code>, <code>portion_I</code>, <code>portion_SI</code> and <code>portion_S</code> can be used to determine the portion of a specific antimicrobial outcome. They can be used on their own:</p>
<div class="sourceCode" id="cb25"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb25-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/portion.html">portion_IR</a></span>(amox)</a>
<a class="sourceLine" id="cb25-2" title="2"><span class="co">#&gt; [1] 0.4737572</span></a></code></pre></div>
<a class="sourceLine" id="cb25-2" title="2"><span class="co">#&gt; [1] 0.4747801</span></a></code></pre></div>
<p>Or they can be used in conjunction with <code><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by()</a></code> and <code><a href="https://dplyr.tidyverse.org/reference/summarise.html">summarise()</a></code>, both from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb26-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb26-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(hospital) <span class="op">%&gt;%</span><span class="st"> </span></a>
@ -984,19 +977,19 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Hospital A</td>
<td align="center">0.4684758</td>
<td align="center">0.4696939</td>
</tr>
<tr class="even">
<td align="center">Hospital B</td>
<td align="center">0.4675514</td>
<td align="center">0.4782930</td>
</tr>
<tr class="odd">
<td align="center">Hospital C</td>
<td align="center">0.4904459</td>
<td align="center">0.4683438</td>
</tr>
<tr class="even">
<td align="center">Hospital D</td>
<td align="center">0.4804110</td>
<td align="center">0.4815051</td>
</tr>
</tbody>
</table>
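<p>To make the meaning of these numbers concrete, the same kind of calculation can be sketched in base R. This is an illustration under stated assumptions, not the implementation of <code>portion_IR()</code>: the helper <code>prop_IR()</code> and the toy data are hypothetical, and the sketch assumes that missing results are left out of the denominator.</p>

```r
# Hypothetical toy data, not the data set used on this page
results <- data.frame(
  hospital = c("Hospital A", "Hospital A", "Hospital B", "Hospital B", "Hospital B"),
  amox     = c("R", "S", "I", "S", NA),
  stringsAsFactors = FALSE
)

# Share of available results that are I or R (assumption: NA is excluded)
prop_IR <- function(x) {
  x <- x[!is.na(x)]
  mean(x %in% c("I", "R"))
}

# One proportion per hospital
by_hospital <- tapply(results$amox, results$hospital, prop_IR)

# Number of available results per hospital, useful for judging reliability
n_available <- tapply(!is.na(results$amox), results$hospital, sum)

by_hospital
n_available
```

<p>Reporting the number of available results next to each proportion helps to judge how much weight a percentage deserves, since a proportion based on a handful of isolates is far less stable than one based on thousands.</p>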
@ -1014,23 +1007,23 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Hospital A</td>
<td align="center">0.4684758</td>
<td align="center">4901</td>
<td align="center">0.4696939</td>
<td align="center">4867</td>
</tr>
<tr class="even">
<td align="center">Hospital B</td>
<td align="center">0.4675514</td>
<td align="center">5501</td>
<td align="center">0.4782930</td>
<td align="center">5413</td>
</tr>
<tr class="odd">
<td align="center">Hospital C</td>
<td align="center">0.4904459</td>
<td align="center">2355</td>
<td align="center">0.4683438</td>
<td align="center">2385</td>
</tr>
<tr class="even">
<td align="center">Hospital D</td>
<td align="center">0.4804110</td>
<td align="center">3114</td>
<td align="center">0.4815051</td>
<td align="center">3136</td>
</tr>
</tbody>
</table>
@ -1050,27 +1043,27 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Escherichia</td>
<td align="center">0.7295964</td>
<td align="center">0.8977603</td>
<td align="center">0.9743136</td>
<td align="center">0.7278981</td>
<td align="center">0.8996178</td>
<td align="center">0.9742675</td>
</tr>
<tr class="even">
<td align="center">Klebsiella</td>
<td align="center">0.7299035</td>
<td align="center">0.8958199</td>
<td align="center">0.9774920</td>
<td align="center">0.7303088</td>
<td align="center">0.9004411</td>
<td align="center">0.9716446</td>
</tr>
<tr class="odd">
<td align="center">Staphylococcus</td>
<td align="center">0.7281164</td>
<td align="center">0.9260095</td>
<td align="center">0.9806872</td>
<td align="center">0.7304747</td>
<td align="center">0.9157734</td>
<td align="center">0.9757529</td>
</tr>
<tr class="even">
<td align="center">Streptococcus</td>
<td align="center">0.7353669</td>
<td align="center">0.7485691</td>
<td align="center">0.0000000</td>
<td align="center">0.7353669</td>
<td align="center">0.7485691</td>
</tr>
</tbody>
</table>

@ -192,7 +192,7 @@
<h1>How to apply EUCAST rules</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>EUCAST.Rmd</code></div>

@ -192,7 +192,7 @@
<h1>How to use the <em>G</em>-test</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>G_test.Rmd</code></div>

@ -204,7 +204,7 @@
<div id="spss-sas-stata" class="section level2">
<h2 class="hasAnchor">
<a href="#spss-sas-stata" class="anchor"></a>SPSS / SAS / Stata</h2>
<p>SPSS (Statistical Package for the Social Sciences) is probably the most well-known software package for statistical analysis. SPSS is easier to learn than R, because in SPSS you only have to click a menu to run parts of your analysis. Because of its user-friendlyness, it is taught at universities and particularly useful for students who are new to statistics. From my experience, I would guess that pretty much all (bio)medical students know it at the time they graduate. SAS and Stata are statistical packages popular in big industries.</p>
<p>SPSS (Statistical Package for the Social Sciences) is probably the most well-known software package for statistical analysis. SPSS is easier to learn than R, because in SPSS you only have to click a menu to run parts of your analysis. Because of its user-friendliness, it is taught at universities and particularly useful for students who are new to statistics. From my experience, I would guess that pretty much all (bio)medical students know it at the time they graduate. SAS and Stata are comparable statistical packages popular in big industries.</p>
</div>
<div id="compared-to-r" class="section level2">
<h2 class="hasAnchor">

@ -192,7 +192,7 @@
<h1>How to work with WHONET data</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>WHONET.Rmd</code></div>

@ -192,7 +192,7 @@
<h1>How to get properties of an antibiotic</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>atc_property.Rmd</code></div>

@ -192,7 +192,7 @@
<h1>Benchmarks</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>benchmarks.Rmd</code></div>
@ -201,7 +201,7 @@
<p>One of the most important features of this package is the complete microbial taxonomic database, supplied by the Catalogue of Life (<a href="http://catalogueoflife.org" class="uri">http://catalogueoflife.org</a>). We created a function <code><a href="../reference/as.mo.html">as.mo()</a></code> that transforms any user input value to a valid microbial ID by using AI (Artificial Intelligence) combined with the taxonomic tree of Catalogue of Life.</p>
<p>One of the most important features of this package is the complete microbial taxonomic database, supplied by the <a href="http://catalogueoflife.org">Catalogue of Life</a>. We created a function <code><a href="../reference/as.mo.html">as.mo()</a></code> that transforms any user input value to a valid microbial ID by using AI (Artificial Intelligence) combined with the taxonomic tree of Catalogue of Life.</p>
<p>Using the <code>microbenchmark</code> package, we can review the calculation performance of this function. Its function <code><a href="https://www.rdocumentation.org/packages/microbenchmark/topics/microbenchmark">microbenchmark()</a></code> runs different input expressions independently of each other and measures their time-to-result.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(microbenchmark)</a>
<a class="sourceLine" id="cb1-2" title="2"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(AMR)</a></code></pre></div>
@ -216,27 +216,18 @@
<a class="sourceLine" id="cb2-7" title="7"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"Staphylococcus aureus"</span>),</a>
<a class="sourceLine" id="cb2-8" title="8"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"B_STPHY_AUR"</span>),</a>
<a class="sourceLine" id="cb2-9" title="9"> <span class="dt">times =</span> <span class="dv">10</span>)</a>
<a class="sourceLine" id="cb2-10" title="10"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(S.aureus, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb2-10" title="10"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(S.aureus, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">2</span>)</a>
<a class="sourceLine" id="cb2-11" title="11"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb2-12" title="12"><span class="co">#&gt; expr min lq mean median uq max</span></a>
<a class="sourceLine" id="cb2-13" title="13"><span class="co">#&gt; as.mo("sau") 42.300 42.500 47.00 43.100 43.200 82.000</span></a>
<a class="sourceLine" id="cb2-14" title="14"><span class="co">#&gt; as.mo("stau") 75.900 76.100 82.70 76.700 77.900 125.000</span></a>
<a class="sourceLine" id="cb2-15" title="15"><span class="co">#&gt; as.mo("staaur") 42.400 43.300 53.60 44.600 49.000 98.200</span></a>
<a class="sourceLine" id="cb2-16" title="16"><span class="co">#&gt; as.mo("S. aureus") 18.400 18.600 20.60 18.700 19.200 34.100</span></a>
<a class="sourceLine" id="cb2-17" title="17"><span class="co">#&gt; as.mo("S. aureus") 18.400 18.500 18.80 18.600 19.200 19.600</span></a>
<a class="sourceLine" id="cb2-18" title="18"><span class="co">#&gt; as.mo("STAAUR") 42.300 42.700 43.30 43.000 43.800 45.700</span></a>
<a class="sourceLine" id="cb2-19" title="19"><span class="co">#&gt; as.mo("Staphylococcus aureus") 11.400 11.500 11.80 11.600 11.800 13.400</span></a>
<a class="sourceLine" id="cb2-20" title="20"><span class="co">#&gt; as.mo("B_STPHY_AUR") 0.261 0.418 0.44 0.434 0.493 0.542</span></a>
<a class="sourceLine" id="cb2-21" title="21"><span class="co">#&gt; neval</span></a>
<a class="sourceLine" id="cb2-22" title="22"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-23" title="23"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-24" title="24"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-25" title="25"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-26" title="26"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-27" title="27"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-28" title="28"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb2-29" title="29"><span class="co">#&gt; 10</span></a></code></pre></div>
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 10 milliseconds means it can determine 100 input values per second. In the case of 50 milliseconds, this is only 20 input values per second. The more an input value resembles a full name, the faster the result will be found. In the case of <code><a href="../reference/as.mo.html">as.mo("B_STPHY_AUR")</a></code>, the input is already a valid MO code, so it takes almost no time at all (261 millionths of a second).</p>
<a class="sourceLine" id="cb2-12" title="12"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb2-13" title="13"><span class="co">#&gt; as.mo("sau") 42.00 43.00 47.00 43.00 44.0 81.00 10</span></a>
<a class="sourceLine" id="cb2-14" title="14"><span class="co">#&gt; as.mo("stau") 86.00 87.00 93.00 88.00 89.0 130.00 10</span></a>
<a class="sourceLine" id="cb2-15" title="15"><span class="co">#&gt; as.mo("staaur") 43.00 43.00 45.00 43.00 43.0 64.00 10</span></a>
<a class="sourceLine" id="cb2-16" title="16"><span class="co">#&gt; as.mo("S. aureus") 23.00 23.00 27.00 23.00 24.0 60.00 10</span></a>
<a class="sourceLine" id="cb2-17" title="17"><span class="co">#&gt; as.mo("S. aureus") 23.00 23.00 29.00 24.00 24.0 73.00 10</span></a>
<a class="sourceLine" id="cb2-18" title="18"><span class="co">#&gt; as.mo("STAAUR") 43.00 43.00 43.00 43.00 44.0 46.00 10</span></a>
<a class="sourceLine" id="cb2-19" title="19"><span class="co">#&gt; as.mo("Staphylococcus aureus") 14.00 15.00 19.00 15.00 16.0 53.00 10</span></a>
<a class="sourceLine" id="cb2-20" title="20"><span class="co">#&gt; as.mo("B_STPHY_AUR") 0.34 0.42 0.47 0.49 0.5 0.58 10</span></a></code></pre></div>
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 10 milliseconds means it can determine 100 input values per second. In the case of 50 milliseconds, this is only 20 input values per second. The more an input value resembles a full name, the faster the result will be found. In the case of <code><a href="../reference/as.mo.html">as.mo("B_STPHY_AUR")</a></code>, the input is already a valid MO code, so it takes almost no time at all (494 millionths of a second).</p>
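<p>The conversion from a per-call duration to a throughput figure is plain arithmetic. As a minimal sketch, assuming a hypothetical helper named <code>values_per_second()</code> that is not part of the package:</p>

```r
# Hypothetical helper (not part of the AMR package): convert a per-call
# duration in milliseconds to the approximate number of input values
# that can be handled per second (1 second = 1000 milliseconds).
values_per_second <- function(ms) {
  1000 / ms
}

values_per_second(10)  # 10 ms per call
values_per_second(50)  # 50 ms per call
```

<p>Applied to the two durations named above, this gives 100 and 20 input values per second respectively.</p>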
<p>To achieve this speed, the <code>as.mo</code> function also takes into account the prevalence of human pathogenic microorganisms. The downside is of course that less prevalent microorganisms will be determined more slowly. See this example for the ID of <em>Mycoplasma leonicaptivi</em> (<code>B_MYCPL_LEO</code>), a bug probably never found before in humans:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1">M.leonicaptivi &lt;-<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/microbenchmark/topics/microbenchmark">microbenchmark</a></span>(<span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"myle"</span>),</a>
<a class="sourceLine" id="cb3-2" title="2"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"mycleo"</span>),</a>
@ -246,25 +237,25 @@
<a class="sourceLine" id="cb3-6" title="6"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"Mycoplasma leonicaptivi"</span>),</a>
<a class="sourceLine" id="cb3-7" title="7"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"B_MYCPL_LEO"</span>),</a>
<a class="sourceLine" id="cb3-8" title="8"> <span class="dt">times =</span> <span class="dv">10</span>)</a>
<a class="sourceLine" id="cb3-9" title="9"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(M.leonicaptivi, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">4</span>)</a>
<a class="sourceLine" id="cb3-9" title="9"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(M.leonicaptivi, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">2</span>)</a>
<a class="sourceLine" id="cb3-10" title="10"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb3-11" title="11"><span class="co">#&gt; expr min lq mean median</span></a>
<a class="sourceLine" id="cb3-12" title="12"><span class="co">#&gt; as.mo("myle") 111.9000 112.1000 121.9000 112.4000</span></a>
<a class="sourceLine" id="cb3-13" title="13"><span class="co">#&gt; as.mo("mycleo") 381.6000 381.9000 397.9000 384.7000</span></a>
<a class="sourceLine" id="cb3-14" title="14"><span class="co">#&gt; as.mo("M. leonicaptivi") 202.9000 203.8000 205.5000 204.1000</span></a>
<a class="sourceLine" id="cb3-15" title="15"><span class="co">#&gt; as.mo("M. leonicaptivi") 203.1000 203.3000 208.7000 203.8000</span></a>
<a class="sourceLine" id="cb3-16" title="16"><span class="co">#&gt; as.mo("MYCLEO") 381.5000 381.7000 388.1000 381.9000</span></a>
<a class="sourceLine" id="cb3-17" title="17"><span class="co">#&gt; as.mo("Mycoplasma leonicaptivi") 103.0000 103.1000 103.6000 103.3000</span></a>
<a class="sourceLine" id="cb3-18" title="18"><span class="co">#&gt; as.mo("B_MYCPL_LEO") 0.3021 0.5631 0.5459 0.5664</span></a>
<a class="sourceLine" id="cb3-19" title="19"><span class="co">#&gt; uq max neval</span></a>
<a class="sourceLine" id="cb3-20" title="20"><span class="co">#&gt; 113.5000 169.7000 10</span></a>
<a class="sourceLine" id="cb3-21" title="21"><span class="co">#&gt; 420.5000 420.7000 10</span></a>
<a class="sourceLine" id="cb3-22" title="22"><span class="co">#&gt; 206.1000 215.4000 10</span></a>
<a class="sourceLine" id="cb3-23" title="23"><span class="co">#&gt; 204.6000 249.4000 10</span></a>
<a class="sourceLine" id="cb3-24" title="24"><span class="co">#&gt; 386.0000 433.7000 10</span></a>
<a class="sourceLine" id="cb3-25" title="25"><span class="co">#&gt; 103.8000 105.4000 10</span></a>
<a class="sourceLine" id="cb3-26" title="26"><span class="co">#&gt; 0.5712 0.6199 10</span></a></code></pre></div>
<p>That takes 5.9 times as much time on average! A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance:</p>
<a class="sourceLine" id="cb3-11" title="11"><span class="co">#&gt; expr min lq mean median uq max</span></a>
<a class="sourceLine" id="cb3-12" title="12"><span class="co">#&gt; as.mo("myle") 140.00 140.00 150.0 140.00 140.00 180.00</span></a>
<a class="sourceLine" id="cb3-13" title="13"><span class="co">#&gt; as.mo("mycleo") 470.00 480.00 500.0 510.00 520.00 560.00</span></a>
<a class="sourceLine" id="cb3-14" title="14"><span class="co">#&gt; as.mo("M. leonicaptivi") 240.00 240.00 250.0 240.00 280.00 290.00</span></a>
<a class="sourceLine" id="cb3-15" title="15"><span class="co">#&gt; as.mo("M. leonicaptivi") 240.00 240.00 250.0 240.00 280.00 280.00</span></a>
<a class="sourceLine" id="cb3-16" title="16"><span class="co">#&gt; as.mo("MYCLEO") 470.00 510.00 510.0 520.00 520.00 540.00</span></a>
<a class="sourceLine" id="cb3-17" title="17"><span class="co">#&gt; as.mo("Mycoplasma leonicaptivi") 150.00 150.00 170.0 180.00 190.00 200.00</span></a>
<a class="sourceLine" id="cb3-18" title="18"><span class="co">#&gt; as.mo("B_MYCPL_LEO") 0.32 0.58 0.6 0.59 0.61 0.97</span></a>
<a class="sourceLine" id="cb3-19" title="19"><span class="co">#&gt; neval</span></a>
<a class="sourceLine" id="cb3-20" title="20"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-21" title="21"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-22" title="22"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-23" title="23"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-24" title="24"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-25" title="25"><span class="co">#&gt; 10</span></a>
<a class="sourceLine" id="cb3-26" title="26"><span class="co">#&gt; 10</span></a></code></pre></div>
<p>That takes 6.9 times as much time on average! A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/graphics/topics/par">par</a></span>(<span class="dt">mar =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="dv">5</span>, <span class="dv">16</span>, <span class="dv">4</span>, <span class="dv">2</span>)) <span class="co"># set more space for left margin text (16)</span></a>
<a class="sourceLine" id="cb4-2" title="2"></a>
<a class="sourceLine" id="cb4-3" title="3"><span class="co"># highest value on y axis</span></a>
@ -272,10 +263,10 @@
<a class="sourceLine" id="cb4-5" title="5"></a>
<a class="sourceLine" id="cb4-6" title="6"><span class="kw"><a href="https://www.rdocumentation.org/packages/graphics/topics/boxplot">boxplot</a></span>(S.aureus, <span class="dt">horizontal =</span> <span class="ot">TRUE</span>, <span class="dt">las =</span> <span class="dv">1</span>, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">log =</span> <span class="ot">FALSE</span>, <span class="dt">xlab =</span> <span class="st">""</span>, <span class="dt">ylim =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="dv">0</span>, max_y_axis),</a>
<a class="sourceLine" id="cb4-7" title="7"> <span class="dt">main =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/expression">expression</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/paste">paste</a></span>(<span class="st">"Benchmark of "</span>, <span class="kw"><a href="https://www.rdocumentation.org/packages/grDevices/topics/plotmath">italic</a></span>(<span class="st">"Staphylococcus aureus"</span>))))</a></code></pre></div>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-4-1.png" width="720"></p>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-5-1.png" width="720"></p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb5-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/graphics/topics/boxplot">boxplot</a></span>(M.leonicaptivi, <span class="dt">horizontal =</span> <span class="ot">TRUE</span>, <span class="dt">las =</span> <span class="dv">1</span>, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">log =</span> <span class="ot">FALSE</span>, <span class="dt">xlab =</span> <span class="st">""</span>, <span class="dt">ylim =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="dv">0</span>, max_y_axis),</a>
<a class="sourceLine" id="cb5-2" title="2"> <span class="dt">main =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/expression">expression</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/paste">paste</a></span>(<span class="st">"Benchmark of "</span>, <span class="kw"><a href="https://www.rdocumentation.org/packages/grDevices/topics/plotmath">italic</a></span>(<span class="st">"Mycoplasma leonicaptivi"</span>))))</a></code></pre></div>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-4-2.png" width="720"></p>
<p><img src="benchmarks_files/figure-html/unnamed-chunk-5-2.png" width="720"></p>
<p>To mitigate this pitfall and further improve performance, two important calculations take almost no time at all: <strong>repetitive results</strong> and <strong>already precalculated results</strong>.</p>
<div id="repetitive-results" class="section level3">
<h3 class="hasAnchor">
@ -301,8 +292,8 @@
<a class="sourceLine" id="cb6-18" title="18"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb6-19" title="19"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb6-20" title="20"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb6-21" title="21"><span class="co">#&gt; mo_fullname(x) 438 448 467 470 476 500 10</span></a></code></pre></div>
<p>So transforming 500,000 values (!) of 95 unique values only takes 0.47 seconds (469 ms). You only lose time on your unique input values.</p>
<a class="sourceLine" id="cb6-21" title="21"><span class="co">#&gt; mo_fullname(x) 445 466 497 491 536 543 10</span></a></code></pre></div>
<p>So transforming 500,000 values (!) of 95 unique values only takes 0.49 seconds (490 ms). You only lose time on your unique input values.</p>
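<p>One general way to get this behaviour is to run the expensive step on the unique values only and then map the results back to their original positions. The sketch below illustrates that idea in base R; <code>example_lookup()</code> and <code>lookup_unique()</code> are hypothetical names, and this is not necessarily how the package implements it internally.</p>

```r
# Hypothetical stand-in for an expensive per-value conversion.
example_lookup <- function(x) {
  toupper(x)
}

# Apply `fun` to the unique values of `x` only, then use match() to copy
# each result back to every position where that value occurred.
lookup_unique <- function(x, fun) {
  ux <- unique(x)
  fun(ux)[match(x, ux)]
}

# 500,000 input values drawn from only three unique values.
x <- sample(c("sau", "stau", "staaur"), size = 500000, replace = TRUE)
result <- lookup_unique(x, example_lookup)
```

<p>With this pattern the cost of <code>fun</code> grows with the number of unique values rather than with the length of the input.</p>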
</div>
<div id="precalculated-results" class="section level3">
<h3 class="hasAnchor">
@ -314,10 +305,10 @@
<a class="sourceLine" id="cb7-4" title="4"> <span class="dt">times =</span> <span class="dv">10</span>)</a>
<a class="sourceLine" id="cb7-5" title="5"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb7-6" title="6"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb7-7" title="7"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb7-8" title="8"><span class="co">#&gt; A 38.500 38.600 38.700 38.700 38.900 39.100 10</span></a>
<a class="sourceLine" id="cb7-9" title="9"><span class="co">#&gt; B 19.400 19.500 20.900 19.800 20.100 31.200 10</span></a>
<a class="sourceLine" id="cb7-10" title="10"><span class="co">#&gt; C 0.256 0.293 0.389 0.395 0.473 0.507 10</span></a></code></pre></div>
<a class="sourceLine" id="cb7-7" title="7"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb7-8" title="8"><span class="co">#&gt; A 38.70 39.100 40.200 40.000 40.100 45.300 10</span></a>
<a class="sourceLine" id="cb7-9" title="9"><span class="co">#&gt; B 24.50 24.600 24.800 24.700 24.700 25.500 10</span></a>
<a class="sourceLine" id="cb7-10" title="10"><span class="co">#&gt; C 0.26 0.392 0.434 0.447 0.516 0.561 10</span></a></code></pre></div>
<p>So going from <code><a href="../reference/mo_property.html">mo_fullname("Staphylococcus aureus")</a></code> to <code>"Staphylococcus aureus"</code> takes 0.0004 seconds: it doesn't even start calculating <em>if the result would be the same as the expected resulting value</em>. That goes for all helper functions:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1">run_it &lt;-<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/microbenchmark/topics/microbenchmark">microbenchmark</a></span>(<span class="dt">A =</span> <span class="kw"><a href="../reference/mo_property.html">mo_species</a></span>(<span class="st">"aureus"</span>),</a>
<a class="sourceLine" id="cb8-2" title="2"> <span class="dt">B =</span> <span class="kw"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="st">"Staphylococcus"</span>),</a>
@ -331,14 +322,14 @@
<a class="sourceLine" id="cb8-10" title="10"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb8-11" title="11"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb8-12" title="12"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb8-13" title="13"><span class="co">#&gt; A 0.277 0.328 0.410 0.450 0.467 0.483 10</span></a>
<a class="sourceLine" id="cb8-14" title="14"><span class="co">#&gt; B 0.291 0.307 0.363 0.374 0.390 0.467 10</span></a>
<a class="sourceLine" id="cb8-15" title="15"><span class="co">#&gt; C 0.299 0.336 0.400 0.400 0.485 0.498 10</span></a>
<a class="sourceLine" id="cb8-16" title="16"><span class="co">#&gt; D 0.271 0.288 0.319 0.328 0.346 0.371 10</span></a>
<a class="sourceLine" id="cb8-17" title="17"><span class="co">#&gt; E 0.202 0.263 0.288 0.270 0.304 0.405 10</span></a>
<a class="sourceLine" id="cb8-18" title="18"><span class="co">#&gt; F 0.241 0.255 0.296 0.283 0.350 0.362 10</span></a>
<a class="sourceLine" id="cb8-19" title="19"><span class="co">#&gt; G 0.260 0.264 0.303 0.281 0.312 0.425 10</span></a>
<a class="sourceLine" id="cb8-20" title="20"><span class="co">#&gt; H 0.240 0.256 0.310 0.327 0.346 0.378 10</span></a></code></pre></div>
<a class="sourceLine" id="cb8-13" title="13"><span class="co">#&gt; A 0.297 0.329 0.400 0.416 0.453 0.459 10</span></a>
<a class="sourceLine" id="cb8-14" title="14"><span class="co">#&gt; B 0.277 0.304 0.349 0.363 0.382 0.407 10</span></a>
<a class="sourceLine" id="cb8-15" title="15"><span class="co">#&gt; C 0.281 0.430 0.436 0.440 0.471 0.493 10</span></a>
<a class="sourceLine" id="cb8-16" title="16"><span class="co">#&gt; D 0.249 0.277 0.310 0.316 0.337 0.347 10</span></a>
<a class="sourceLine" id="cb8-17" title="17"><span class="co">#&gt; E 0.214 0.252 0.300 0.306 0.338 0.403 10</span></a>
<a class="sourceLine" id="cb8-18" title="18"><span class="co">#&gt; F 0.237 0.270 0.300 0.311 0.326 0.335 10</span></a>
<a class="sourceLine" id="cb8-19" title="19"><span class="co">#&gt; G 0.245 0.282 0.297 0.298 0.314 0.348 10</span></a>
<a class="sourceLine" id="cb8-20" title="20"><span class="co">#&gt; H 0.241 0.282 0.308 0.312 0.328 0.373 10</span></a></code></pre></div>
<p>Of course, when running <code><a href="../reference/mo_property.html">mo_phylum("Firmicutes")</a></code> the function has zero knowledge about the actual microorganism, namely <em>S. aureus</em>. But since the result would be <code>"Firmicutes"</code> too, there is no point in calculating the result. And because this package knows all phyla of all known bacteria (according to the Catalogue of Life), it can just return the initial value immediately.</p>
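<p>This kind of short-circuit can be pictured with a minimal sketch. The names <code>known_phyla</code> and <code>phylum_or_lookup()</code> are hypothetical, and the phylum list is a small illustrative subset; the package's actual implementation may differ.</p>

```r
# Illustrative subset of phylum names (an assumption, not the package's data).
known_phyla <- c("Firmicutes", "Proteobacteria", "Actinobacteria")

# Hypothetical helper: return the input untouched when every value is already
# a valid phylum name; otherwise fall back to the (expensive) full lookup.
phylum_or_lookup <- function(x, full_lookup) {
  if (all(x %in% known_phyla)) {
    return(x)
  }
  full_lookup(x)
}

# The fallback is never called here because the input is already a result.
phylum_or_lookup("Firmicutes", full_lookup = function(x) stop("not reached"))
```

<p>The membership check is a cheap vector comparison, so inputs that are already valid results skip the full lookup entirely.</p>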
</div>
<div id="results-in-other-languages" class="section level3">
@ -365,13 +356,13 @@
<a class="sourceLine" id="cb9-18" title="18"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">4</span>)</a>
<a class="sourceLine" id="cb9-19" title="19"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb9-20" title="20"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb9-21" title="21"><span class="co">#&gt; en 11.01 11.04 11.05 11.06 11.07 11.08 10</span></a>
<a class="sourceLine" id="cb9-22" title="22"><span class="co">#&gt; de 19.31 19.51 19.79 19.61 19.91 21.00 10</span></a>
<a class="sourceLine" id="cb9-23" title="23"><span class="co">#&gt; nl 19.13 19.37 26.23 19.59 21.11 52.30 10</span></a>
<a class="sourceLine" id="cb9-24" title="24"><span class="co">#&gt; es 19.13 19.42 19.51 19.53 19.58 20.00 10</span></a>
<a class="sourceLine" id="cb9-25" title="25"><span class="co">#&gt; it 19.16 19.34 29.12 19.55 51.61 52.06 10</span></a>
<a class="sourceLine" id="cb9-26" title="26"><span class="co">#&gt; fr 19.01 19.54 19.84 19.69 20.41 20.46 10</span></a>
<a class="sourceLine" id="cb9-27" title="27"><span class="co">#&gt; pt 19.00 19.33 19.44 19.49 19.59 19.67 10</span></a></code></pre></div>
<a class="sourceLine" id="cb9-21" title="21"><span class="co">#&gt; en 10.85 10.89 11.10 11.03 11.23 11.83 10</span></a>
<a class="sourceLine" id="cb9-22" title="22"><span class="co">#&gt; de 19.43 19.50 19.86 19.58 20.35 20.99 10</span></a>
<a class="sourceLine" id="cb9-23" title="23"><span class="co">#&gt; nl 19.08 19.17 19.40 19.48 19.56 19.63 10</span></a>
<a class="sourceLine" id="cb9-24" title="24"><span class="co">#&gt; es 19.35 19.44 26.07 19.48 20.06 52.36 10</span></a>
<a class="sourceLine" id="cb9-25" title="25"><span class="co">#&gt; it 19.23 19.40 22.91 19.49 19.91 52.92 10</span></a>
<a class="sourceLine" id="cb9-26" title="26"><span class="co">#&gt; fr 19.10 19.22 19.40 19.45 19.54 19.68 10</span></a>
<a class="sourceLine" id="cb9-27" title="27"><span class="co">#&gt; pt 19.01 19.46 29.32 19.55 52.32 52.50 10</span></a></code></pre></div>
<p>Currently supported are German, Dutch, Spanish, Italian, French and Portuguese.</p>
</div>
</div>

@ -192,7 +192,7 @@
<h1>How to create frequency tables</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>freq.Rmd</code></div>

@ -192,7 +192,7 @@
<h1>How to get properties of a microorganism</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>mo_property.Rmd</code></div>

@ -192,7 +192,7 @@
<h1>How to predict antimicrobial resistance</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">20 February 2019</h4>
<h4 class="date">21 February 2019</h4>
<div class="hidden name"><code>resistance_predict.Rmd</code></div>