mirror of https://github.com/msberends/AMR.git synced 2025-07-09 02:03:04 +02:00

rlang dependency, new fungi

2019-02-28 13:56:28 +01:00
parent cf3bdb54c7
commit 2565b60024
86 changed files with 762 additions and 705 deletions

@ -80,7 +80,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9020</span>
</span>
</div>
@ -296,11 +296,12 @@
| | | ----&gt; subspecies, a 3-4 letter acronym
| | ----&gt; species, a 3-4 letter acronym
| ----&gt; genus, a 5-7 letter acronym, mostly without vowels
----&gt; taxonomic kingdom: A (Archaea), B (Bacteria), C (Chromista),
F (Fungi), P (Protozoa) or V (Viruses)
----&gt; taxonomic kingdom: A (Archaea), AN (Animalia), B (Bacteria), C (Chromista),
F (Fungi), P (Protozoa), PL (Plantae) or V (Viruses)
</pre>
<p>Use the <code><a href='mo_property.html'>mo_property</a></code> functions to get properties based on the returned code, see Examples.</p>
<p>This function uses Artificial Intelligence (AI) to help getting fast and logical results. It tries to find matches in this order:</p><ul>
<p><strong>Artificial Intelligence</strong> <br />
This function uses Artificial Intelligence (AI) to help getting fast and logical results. It tries to find matches in this order:</p><ul>
<li><p>Taxonomic kingdom: it first searches in Bacteria, then Fungi, then Protozoa</p></li>
<li><p>Human pathogenic prevalence: it first searches in more prevalent microorganisms, then less prevalent ones (see section <em>Microbial prevalence of pathogens in humans</em>)</p></li>
<li><p>Valid MO codes and full names: it first searches in already valid MO code and known genus/species combinations</p></li>
@ -310,8 +311,8 @@
<li><p><code>"E. coli"</code> will return the ID of <em>Escherichia coli</em> and not <em>Entamoeba coli</em>, although the latter would alphabetically come first</p></li>
<li><p><code>"H. influenzae"</code> will return the ID of <em>Haemophilus influenzae</em> and not <em>Haematobacter influenzae</em> for the same reason</p></li>
<li><p>Something like <code>"stau"</code> or <code>"S aur"</code> will return the ID of <em>Staphylococcus aureus</em> and not <em>Staphylococcus auricularis</em></p></li>
</ul><p>This means that looking up human pathogenic microorganisms takes less time than looking up human <strong>non</strong>-pathogenic microorganisms.</p>
<p><strong>UNCERTAIN RESULTS</strong> <br />
</ul><p>This means that looking up human pathogenic microorganisms takes less time than looking up human non-pathogenic microorganisms.</p>
<p><strong>Uncertain results</strong> <br />
When using <code>allow_uncertain = TRUE</code> (which is the default setting), it will use additional rules if all previous AI rules failed to get valid results. These are:</p><ul>
<li><p>It tries to look for previously accepted (but now invalid) taxonomic names</p></li>
<li><p>It strips off values between brackets and the brackets itself, and re-evaluates the input with all previous rules</p></li>
@ -325,19 +326,16 @@ When using <code>allow_uncertain = TRUE</code> (which is the default setting), i
<li><p><code>"Fluoroquinolone-resistant Neisseria gonorrhoeae"</code>. The first word will be stripped, after which the function will try to find a match. A warning will be thrown that the result <em>Neisseria gonorrhoeae</em> (<code>B_NESSR_GON</code>) needs review.</p></li>
</ul>
<p>Use <code>mo_failures()</code> to get a vector with all values that could not be coerced to a valid value.</p>
<p>Use <code>mo_uncertainties()</code> to get info about all values that were coerced to a valid value, but with uncertainty.</p>
<p>Use <code>mo_uncertainties()</code> to get a data.frame with all values that were coerced to a valid value, but with uncertainty.</p>
<p>Use <code>mo_renamed()</code> to get a vector with all values that could be coerced based on an old, previously accepted taxonomic name.</p>
<h2 class="hasAnchor" id="microbial-prevalence-of-pathogens-in-humans"><a class="anchor" href="#microbial-prevalence-of-pathogens-in-humans"></a>Microbial prevalence of pathogens in humans</h2>
<p>The artificial intelligence takes into account microbial prevalence of pathogens in humans. It uses three groups and every (sub)species is in the group it matches first. These groups are:</p><ul>
<p><strong>Microbial prevalence of pathogens in humans</strong> <br />
The artificial intelligence takes into account microbial prevalence of pathogens in humans. It uses three groups and every (sub)species is in the group it matches first. These groups are:</p><ul>
<li><p>1 (most prevalent): class is Gammaproteobacteria <strong>or</strong> genus is one of: <em>Enterococcus</em>, <em>Staphylococcus</em>, <em>Streptococcus</em>.</p></li>
<li><p>2: phylum is one of: Proteobacteria, Firmicutes, Actinobacteria, Sarcomastigophora <strong>or</strong> genus is one of: <em>Aspergillus</em>, <em>Bacteroides</em>, <em>Candida</em>, <em>Capnocytophaga</em>, <em>Chryseobacterium</em>, <em>Cryptococcus</em>, <em>Elisabethkingia</em>, <em>Flavobacterium</em>, <em>Fusobacterium</em>, <em>Giardia</em>, <em>Leptotrichia</em>, <em>Mycoplasma</em>, <em>Prevotella</em>, <em>Rhodotorula</em>, <em>Treponema</em>, <em>Trichophyton</em>, <em>Ureaplasma</em>.</p></li>
<li><p>3 (least prevalent): all others.</p></li>
</ul>
<p>Group 1 contains all common Gram negatives, like all Enterobacteriaceae and e.g. <em>Pseudomonas</em> and <em>Legionella</em>.</p>
<p>Group 2 probably contains all microbial pathogens ever found in humans.</p>
<p>Group 2 probably contains all other microbial pathogens ever found in humans.</p>
<h2 class="hasAnchor" id="source"><a class="anchor" href="#source"></a>Source</h2>
@ -349,17 +347,9 @@ When using <code>allow_uncertain = TRUE</code> (which is the default setting), i
<h2 class="hasAnchor" id="catalogue-of-life"><a class="anchor" href="#catalogue-of-life"></a>Catalogue of Life</h2>
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>
</ul>
<p>The Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.6 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.</p>
<p>The syntax used to transform the original data to a cleansed R format, can be found here: <a href='https://gitlab.com/msberends/AMR/blob/master/reproduction_of_microorganisms.R'>https://gitlab.com/msberends/AMR/blob/master/reproduction_of_microorganisms.R</a>.</p>
<p><img src='figures/logo_col.png' height=40px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms (~60,000 species) from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). The Catalogue of Life is the most comprehensive and authoritative global index of species currently available.</p>
<p><a href='catalogue_of_life.html'>Click here</a> for more information about the included taxa. The Catalogue of Life releases updates annually; check which version was included in this package with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<h2 class="hasAnchor" id="read-more-on-our-website-"><a class="anchor" href="#read-more-on-our-website-"></a>Read more on our website!</h2>
@ -431,8 +421,6 @@ The <code><a href='mo_property.html'>mo_property</a></code> functions (like <cod
<li><a href="#details">Details</a></li>
<li><a href="#microbial-prevalence-of-pathogens-in-humans">Microbial prevalence of pathogens in humans</a></li>
<li><a href="#source">Source</a></li>
<li><a href="#catalogue-of-life">Catalogue of Life</a></li>