<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><title>Calculate the Matching Score for Microorganisms — mo_matching_score • AMR (for R)</title><!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png"><link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png"><link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png"><link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png"><link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png"><link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png"><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.3.1/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.3.1/bootstrap.bundle.min.js"></script><link href="../deps/Lato-0.4.9/font.css" rel="stylesheet"><link href="../deps/Fira_Code-0.4.9/font.css" rel="stylesheet"><link href="../deps/font-awesome-6.4.2/css/all.min.css" rel="stylesheet"><link href="../deps/font-awesome-6.4.2/css/v4-shims.min.css" rel="stylesheet"><script src="../deps/headroom-0.11.0/headroom.min.js"></script><script src="../deps/headroom-0.11.0/jQuery.headroom.min.js"></script><script src="../deps/bootstrap-toc-1.0.1/bootstrap-toc.min.js"></script><script src="../deps/clipboard.js-2.0.11/clipboard.min.js"></script><script src="../deps/search-1.0.0/autocomplete.jquery.min.js"></script><script src="../deps/search-1.0.0/fuse.min.js"></script><script src="../deps/search-1.0.0/mark.min.js"></script><!-- pkgdown 
--><script src="../pkgdown.js"></script><link href="../extra.css" rel="stylesheet"><script src="../extra.js"></script><meta property="og:title" content="Calculate the Matching Score for Microorganisms — mo_matching_score"><meta name="description" content="This algorithm is used by as.mo() and all the mo_* functions to determine the most probable match of taxonomic records based on user input."><meta property="og:description" content="This algorithm is used by as.mo() and all the mo_* functions to determine the most probable match of taxonomic records based on user input."><meta property="og:image" content="https://msberends.github.io/AMR/logo.svg"></head><body>
<button class="nav-link dropdown-toggle" type="button" id="dropdown-how-to" data-bs-toggle="dropdown" aria-expanded="false" aria-haspopup="true"><span class="fa fa-question-circle"></span> How to</button>
<ul class="dropdown-menu" aria-labelledby="dropdown-how-to"><li><a class="dropdown-item" href="../articles/AMR.html"><span class="fa fa-directions"></span> Conduct AMR Analysis</a></li>
<p>This algorithm is used by <code><a href="as.mo.html">as.mo()</a></code> and all the <code><a href="mo_property.html">mo_*</a></code> functions to determine the most probable match of taxonomic records based on user input.</p>
<p>This algorithm was originally developed in 2018 and subsequently described in: Berends MS <em>et al.</em> (2022). <strong>AMR: An R Package for Working with Antimicrobial Resistance Data</strong>. <em>Journal of Statistical Software</em>, 104(3), 1-31; <a href="https://doi.org/10.18637/jss.v104.i03" class="external-link">doi:10.18637/jss.v104.i03</a>.</p>
<p>Later, the work of Bartlett A <em>et al.</em> about bacterial pathogens infecting humans (2022, <a href="https://doi.org/10.1099/mic.0.001269" class="external-link">doi:10.1099/mic.0.001269</a>) was incorporated into the algorithm.</p>
<h2 id="matching-score-for-microorganisms">Matching Score for Microorganisms<a class="anchor" aria-label="anchor" href="#matching-score-for-microorganisms"></a></h2>
<p>With ambiguous user input in <code><a href="as.mo.html">as.mo()</a></code> and all the <code><a href="mo_property.html">mo_*</a></code> functions, the returned results are chosen based on their matching score using <code>mo_matching_score()</code>. This matching score, \(m\), is calculated from the user input \(x\) and the following quantities:</p>
<li><p>\(n\) is a taxonomic name (genus, species, and subspecies);</p></li>
<li><p>\(l_n\) is the length of \(n\);</p></li>
<li><p>\(lev\) is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance" class="external-link">Levenshtein distance function</a> (counting any insertion as 1, and any deletion or substitution as 2), which gives the cost of changing \(x\) into \(n\).</p></li>
</ul><p>The grouping into human pathogenic prevalence \(p\) is based on recent work by Bartlett <em>et al.</em> (2022, <a href="https://doi.org/10.1099/mic.0.001269" class="external-link">doi:10.1099/mic.0.001269</a>), who extensively studied medical-scientific literature to categorise all bacterial species into these groups:</p><ul><li><p><strong>Established</strong>, if a taxonomic species has infected at least three persons in three or more references. These records have <code>prevalence = 1.15</code> in the <a href="microorganisms.html">microorganisms</a> data set;</p></li>
<li><p><strong>Putative</strong>, if a taxonomic species has fewer than three known cases. These records have <code>prevalence = 1.25</code> in the <a href="microorganisms.html">microorganisms</a> data set.</p></li>
</ul><p>Furthermore,</p><ul><li><p>Genera from the World Health Organization's (WHO) Priority Pathogen List have <code>prevalence = 1.0</code> in the <a href="microorganisms.html">microorganisms</a> data set;</p></li>
<li><p>Any genus present in the <strong>established</strong> list also has <code>prevalence = 1.15</code> in the <a href="microorganisms.html">microorganisms</a> data set;</p></li>
<li><p>Any other genus present in the <strong>putative</strong> list has <code>prevalence = 1.25</code> in the <a href="microorganisms.html">microorganisms</a> data set;</p></li>
<li><p>Any other species or subspecies whose genus is present in the two aforementioned groups has <code>prevalence = 1.5</code> in the <a href="microorganisms.html">microorganisms</a> data set.</p></li>
</ul><p>When calculating the matching score, all characters in \(x\) and \(n\) other than A-Z, a-z, 0-9, spaces, and parentheses are ignored.</p>
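<p>The character filtering and the weighted edit distance described above can be sketched in a few lines. This is a minimal Python illustration and not the package's own implementation: the helper names <code>clean()</code> and <code>weighted_lev()</code> are hypothetical, and the case-sensitive comparison is an assumption.</p>

```python
import re


def clean(text: str) -> str:
    """Drop every character other than A-Z, a-z, 0-9, spaces and parentheses."""
    return re.sub(r"[^A-Za-z0-9 ()]", "", text)


def weighted_lev(x: str, n: str) -> int:
    """Cost of changing x into n: insertion 1, deletion 2, substitution 2.

    Hypothetical helper; characters are compared case-sensitively (assumption).
    """
    # prev[j] is the cost of turning the processed prefix of x into n[:j];
    # with an empty prefix of x, reaching n[:j] takes j insertions.
    prev = list(range(len(n) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [2 * i]  # turning x[:i] into the empty string takes i deletions
        for j, cn in enumerate(n, start=1):
            curr.append(min(
                prev[j] + 2,                           # delete cx
                curr[j - 1] + 1,                       # insert cn
                prev[j - 1] + (0 if cx == cn else 2),  # keep or substitute
            ))
        prev = curr
    return prev[-1]


x = clean("E. coli")  # the period is ignored, the space is kept
print(x, weighted_lev(x, "Escherichia coli"), weighted_lev(x, "Entamoeba coli"))
# prints: E coli 10 8
```

<p>Because insertions are cheaper than deletions and substitutions, an abbreviation that only needs letters added to reach a full name is penalised less than input containing wrong or superfluous characters.</p>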
<p>All matches are sorted in descending order of matching score and, for each user input value, the top match is returned. As a result, <code>"E. coli"</code> returns the microbial ID of <em>Escherichia coli</em> (\(m = 0.688\), a highly prevalent microorganism found in humans) and not <em>Entamoeba coli</em> (\(m = 0.381\), a less prevalent microorganism in humans), although the latter comes first alphabetically.</p>
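<p>As a worked check of these two scores, assume the score takes the form below, in which half of the edit distance (capped at \(l_n\)) is subtracted from \(l_n\) and the result is divided by \(l_n\), the prevalence \(p_n\), and a kingdom weight \(k_n\). This form, and the factor \(k_n\) in particular, is an assumption that is not spelled out in the text above:</p>
<p>\[ m_{(x, n)} = \frac{l_n - 0.5 \cdot \min\bigl(l_n, \mathrm{lev}(x, n)\bigr)}{l_n \cdot p_n \cdot k_n} \]</p>
<p>After ignoring the period, \(x\) is <code>"E coli"</code>. Changing it into <em>Escherichia coli</em> (\(l_n = 16\)) takes 10 insertions, so with \(p_n = 1.0\) and \(k_n = 1\):</p>
<p>\[ m = \frac{16 - 0.5 \cdot 10}{16 \cdot 1.0 \cdot 1} = \frac{11}{16} = 0.6875 \approx 0.688 \]</p>
<p>Changing it into <em>Entamoeba coli</em> (\(l_n = 14\)) takes 8 insertions:</p>
<p>\[ m = \frac{14 - 0.5 \cdot 8}{14 \cdot p_n \cdot k_n} = \frac{0.714}{p_n \cdot k_n} \]</p>
<p>Under this assumed form, the quoted score of \(0.381\) corresponds to \(p_n \cdot k_n \approx 1.875\), for example \(1.25 \times 1.5\). How that product splits between the two factors is not stated here.</p>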
<h2 id="reference-data-publicly-available">Reference Data Publicly Available<a class="anchor" aria-label="anchor" href="#reference-data-publicly-available"></a></h2>
<p>All data sets in this <code>AMR</code> package (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit <a href="https://msberends.github.io/AMR/articles/datasets.html">our website for the download links</a>. The files themselves are also available in <a href="https://github.com/msberends/AMR/tree/main/data-raw" class="external-link">our GitHub repository</a>.</p>
<p><code>AMR</code> (for R). Free and open-source, licenced under the <a target="_blank" href="https://github.com/msberends/AMR/blob/main/LICENSE" class="external-link">GNU General Public License version 2.0 (GPL-2)</a>.<br>Developed at the <a target="_blank" href="https://www.rug.nl" class="external-link">University of Groningen</a> and <a target="_blank" href="https://www.umcg.nl" class="external-link">University Medical Center Groningen</a> in The Netherlands.</p>