mirror of https://github.com/msberends/AMR.git synced 2025-07-11 21:01:54 +02:00

(v1.3.0.9022) mo_matching_score(), poorman update, as.rsi() fix

2020-09-18 16:05:53 +02:00
parent 89401ede9f
commit 4e40e42011
138 changed files with 2923 additions and 1472 deletions


@ -50,7 +50,7 @@
<meta property="og:title" content="Transform input to a microorganism ID — as.mo" />
<meta property="og:description" content="Use this function to determine a valid microorganism ID (mo). Determination is done using intelligent rules and the complete taxonomic kingdoms Bacteria, Chromista, Protozoa, Archaea and most microbial species from the kingdom Fungi (see Source). The input can be almost anything: a full name (like &quot;Staphylococcus aureus&quot;), an abbreviated name (like &quot;S. aureus&quot;), an abbreviation known in the field (like &quot;MRSA&quot;), or just a genus. Please see Examples." />
<meta property="og:image" content="https://msberends.github.io/AMR/logo.svg" />
<meta property="og:image" content="https://msberends.github.io/AMR/logo.png" />
@ -82,7 +82,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">1.3.0.9022</span>
</span>
</div>
@ -266,7 +266,7 @@
<colgroup><col class="name" /><col class="desc" /></colgroup>
<tr>
<th>x</th>
<td><p>a character vector or a <code><a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a></code> with one or two columns</p></td>
<td><p>a character vector or a <a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a> with one or two columns</p></td>
</tr>
<tr>
<th>Becker</th>
@ -284,7 +284,7 @@
</tr>
<tr>
<th>reference_df</th>
<td><p>a <code><a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a></code> to be used for extra reference when translating <code>x</code> to a valid <code>mo</code>. See <code><a href='mo_source.html'>set_mo_source()</a></code> and <code><a href='mo_source.html'>get_mo_source()</a></code> to automate the usage of your own codes (e.g. used in your analysis or organisation).</p></td>
<td><p>a <a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a> to be used for extra reference when translating <code>x</code> to a valid <code>mo</code>. See <code><a href='mo_source.html'>set_mo_source()</a></code> and <code><a href='mo_source.html'>get_mo_source()</a></code> to automate the usage of your own codes (e.g. used in your analysis or organisation).</p></td>
</tr>
<tr>
<th>ignore_pattern</th>
@ -302,7 +302,7 @@
<h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>
<p>A <code><a href='https://rdrr.io/r/base/character.html'>character</a></code> <code><a href='https://rdrr.io/r/base/vector.html'>vector</a></code> with additional class <code>mo</code></p>
<p>A <a href='https://rdrr.io/r/base/character.html'>character</a> <a href='https://rdrr.io/r/base/vector.html'>vector</a> with additional class <code>mo</code></p>
<h2 class="hasAnchor" id="details"><a class="anchor" href="#details"></a>Details</h2>
@ -352,9 +352,9 @@
</ul>
<p>There are three helper functions that can be run after using the <code>as.mo()</code> function:</p><ul>
<li><p>Use <code>mo_uncertainties()</code> to get a <code><a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a></code> that prints in a pretty format with all taxonomic names that were guessed. The output contains a score that is based on the human pathogenic prevalence and the <a href='https://en.wikipedia.org/wiki/Levenshtein_distance'>Levenshtein distance</a> between the user input and the full taxonomic name.</p></li>
<li><p>Use <code>mo_failures()</code> to get a <code><a href='https://rdrr.io/r/base/character.html'>character</a></code> <code><a href='https://rdrr.io/r/base/vector.html'>vector</a></code> with all values that could not be coerced to a valid value.</p></li>
<li><p>Use <code>mo_renamed()</code> to get a <code><a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a></code> with all values that could be coerced based on old, previously accepted taxonomic names.</p></li>
<li><p>Use <code>mo_uncertainties()</code> to get a <a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a> that prints in a pretty format with all taxonomic names that were guessed. The output contains the matching score for all matches (see <em>Background on matching scores</em>).</p></li>
<li><p>Use <code>mo_failures()</code> to get a <a href='https://rdrr.io/r/base/character.html'>character</a> <a href='https://rdrr.io/r/base/vector.html'>vector</a> with all values that could not be coerced to a valid value.</p></li>
<li><p>Use <code>mo_renamed()</code> to get a <a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a> with all values that could be coerced based on old, previously accepted taxonomic names.</p></li>
</ul>
@ -366,6 +366,21 @@
<p>Group 2 consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is <em>Aspergillus</em>, <em>Bacteroides</em>, <em>Candida</em>, <em>Capnocytophaga</em>, <em>Chryseobacterium</em>, <em>Cryptococcus</em>, <em>Elisabethkingia</em>, <em>Flavobacterium</em>, <em>Fusobacterium</em>, <em>Giardia</em>, <em>Leptotrichia</em>, <em>Mycoplasma</em>, <em>Prevotella</em>, <em>Rhodotorula</em>, <em>Treponema</em>, <em>Trichophyton</em> or <em>Ureaplasma</em>. This group consequently contains all less common and rare human pathogens.</p>
<p>Group 3 (least prevalent microorganisms) consists of all other microorganisms. This group contains microorganisms most probably not found in humans.</p>
<h3>Background on matching scores</h3>
<p>With ambiguous user input, the returned results are chosen based on their matching score using <code><a href='mo_matching_score.html'>mo_matching_score()</a></code>. This matching score is based on four parameters:</p><ol>
<li><p>The prevalence \(P\) is categorised into group 1, 2 and 3 as stated above;</p></li>
<li><p>A kingdom index \(K\) is set as follows: Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, and all others = 5;</p></li>
<li><p>The level of uncertainty \(U\) needed to get to the result, as stated above (1 to 3);</p></li>
<li><p>The <a href='https://en.wikipedia.org/wiki/Levenshtein_distance'>Levenshtein distance</a> \(L\) is the distance between the user input and all taxonomic full names, with the text length of the user input being the maximum distance. A modified version of the Levenshtein distance \(L'\) based on the text length of the full name \(F\) is calculated as:</p></li>
</ol>
<p>$$L' = F - \frac{0.5 \times L}{F}$$</p>
<p>The final matching score \(M\) is calculated as:
$$M = L' \times \frac{1}{P \times K} \times \frac{1}{U}$$</p>
<p>All matches are sorted in descending order of matching score, and for each user input value the top match is returned.</p>
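<p>As a rough illustration only, the two formulas above can be evaluated literally as in the following Python sketch. It is not the package's implementation of <code>mo_matching_score()</code>: the function names <code>levenshtein</code> and <code>matching_score</code>, their parameters, and the case-sensitive string comparison are all assumptions made for this example. The values for \(P\), \(K\) and \(U\) are passed in by the caller rather than looked up.</p>

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming Levenshtein distance (hypothetical helper)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def matching_score(user_input: str, full_name: str,
                   prevalence: int, kingdom_index: int,
                   uncertainty: int) -> float:
    """Evaluate M = L' * 1/(P * K) * 1/U exactly as the formulas are written.

    Assumption: strings are compared as given (no case folding).
    """
    F = len(full_name)
    # L is capped at the text length of the user input, as stated in the text.
    L = min(levenshtein(user_input, full_name), len(user_input))
    L_mod = F - (0.5 * L) / F  # L' as written in the formula above
    return L_mod * (1 / (prevalence * kingdom_index)) * (1 / uncertainty)


# Rank hypothetical candidates: highest matching score first.
candidates = [
    ("Escherichia coli", 1, 1, 1),        # (full name, P, K, U), illustrative values
    ("Escherichia coli", 3, 5, 3),
]
ranked = sorted(
    candidates,
    key=lambda c: matching_score("Escherichia coli", *c),
    reverse=True,
)
```

<p>In this sketch, a larger prevalence group, kingdom index or uncertainty level can only lower the score, which is why <code>ranked[0]</code> is the candidate with \(P = K = U = 1\).</p>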
<h2 class="hasAnchor" id="source"><a class="anchor" href="#source"></a>Source</h2>
@ -403,7 +418,7 @@ This package contains the complete taxonomic tree of almost all microorganisms (
<p>On our website <a href='https://msberends.github.io/AMR'>https://msberends.github.io/AMR</a> you can find <a href='https://msberends.github.io/AMR/articles/AMR.html'>a comprehensive tutorial</a> about how to conduct AMR analysis, the <a href='https://msberends.github.io/AMR/reference'>complete documentation of all functions</a> (which reads a lot easier than here in R) and <a href='https://msberends.github.io/AMR/articles/WHONET.html'>an example analysis using WHONET data</a>. As we would like to better understand the backgrounds and needs of our users, please <a href='https://msberends.github.io/AMR/survey.html'>participate in our survey</a>!</p>
<h2 class="hasAnchor" id="see-also"><a class="anchor" href="#see-also"></a>See also</h2>
<div class='dont-index'><p><a href='microorganisms.html'>microorganisms</a> for the <code><a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a></code> that is being used to determine ID's.</p>
<div class='dont-index'><p><a href='microorganisms.html'>microorganisms</a> for the <a href='https://rdrr.io/r/base/data.frame.html'>data.frame</a> that is being used to determine ID's.</p>
<p>The <code><a href='mo_property.html'>mo_property()</a></code> functions (like <code><a href='mo_property.html'>mo_genus()</a></code>, <code><a href='mo_property.html'>mo_gramstain()</a></code>) to get properties based on the returned code.</p></div>
<h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>