<!-- Generated by pkgdown: do not edit by hand --><htmllang="en"><head><metahttp-equiv="Content-Type"content="text/html; charset=UTF-8"><metacharset="utf-8"><metahttp-equiv="X-UA-Compatible"content="IE=edge"><metaname="viewport"content="width=device-width, initial-scale=1, shrink-to-fit=no"><metaname="description"content="A data set containing the full microbial taxonomy (last updated: 11 December, 2022) of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF). This data set is the backbone of this AMR package. MO codes can be looked up using as.mo()."><title>Data Set with 52,144 Microorganisms — microorganisms • AMR (for R)</title><!-- favicons --><linkrel="icon"type="image/png"sizes="16x16"href="../favicon-16x16.png"><linkrel="icon"type="image/png"sizes="32x32"href="../favicon-32x32.png"><linkrel="apple-touch-icon"type="image/png"sizes="180x180"href="../apple-touch-icon.png"><linkrel="apple-touch-icon"type="image/png"sizes="120x120"href="../apple-touch-icon-120x120.png"><linkrel="apple-touch-icon"type="image/png"sizes="76x76"href="../apple-touch-icon-76x76.png"><linkrel="apple-touch-icon"type="image/png"sizes="60x60"href="../apple-touch-icon-60x60.png"><scriptsrc="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><metaname="viewport"content="width=device-width, initial-scale=1, shrink-to-fit=no"><linkhref="../deps/bootstrap-5.1.3/bootstrap.min.css"rel="stylesheet"><scriptsrc="../deps/bootstrap-5.1.3/bootstrap.bundle.min.js"></script><linkhref="../deps/Fira_Code-0.4.4/font.css"rel="stylesheet"><!-- Font Awesome icons 
--><linkrel="stylesheet"href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css"integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk="crossorigin="anonymous"><linkrel="stylesheet"href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css"integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw="crossorigin="anonymous"><!-- bootstrap-toc --><scriptsrc="https://cdn.jsdelivr.net/gh/afeld/bootstrap-toc@v1.0.1/dist/bootstrap-toc.min.js"integrity="sha256-4veVQbu7//Lk5TSmc7YV48MxtMy98e26cf5MrgZYnwo="crossorigin="anonymous"></script><!-- headroom.js --><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js"integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0="crossorigin="anonymous"></script><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js"integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4="crossorigin="anonymous"></script><!-- clipboard.js --><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js"integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI="crossorigin="anonymous"></script><!-- search --><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js"integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A=="crossorigin="anonymous"></script><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js"integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg=="crossorigin="anonymous"></script><scriptsrc="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js"integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww=="crossorigin="anonymous"></script><!-- pkgdown 
--><scriptsrc="../pkgdown.js"></script><linkhref="../extra.css"rel="stylesheet"><scriptsrc="../extra.js"></script><metaproperty="og:title"content="Data Set with 52,144 Microorganisms — microorganisms"><metaproperty="og:description"content="Adatasetcontainingthefullmicrobialtaxonomy(lastupdated:11December,2022)offivekingdomsfromtheListofProkaryoticnameswithStandinginNomenclature(LPSN)andtheGlobalBiodiversityInformationFacility(GBIF).Thisdatasetisthe
<p>A data set containing the full microbial taxonomy (<strong>last updated: 11 December, 2022</strong>) of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF). This data set is the backbone of this <code>AMR</code> package. MO codes can be looked up using <code><ahref="as.mo.html">as.mo()</a></code>.</p>
<p>A <ahref="https://tibble.tidyverse.org/reference/tibble.html"class="external-link">tibble</a> with 52,144 observations and 22 variables:</p><ul><li><p><code>mo</code><br> ID of microorganism as used by this package</p></li>
<li><p><code>fullname</code><br> Full name, like <code>"Escherichia coli"</code>. For the taxonomic ranks genus, species and subspecies, this is the 'pasted' text of genus, species, and subspecies. For all taxonomic ranks higher than genus, this is the name of the taxon.</p></li>
<li><p><code>status</code><br> Status of the taxon, either "accepted" or "synonym"</p></li>
<li><p><code>rank</code><br> Text of the taxonomic rank of the microorganism, such as <code>"species"</code> or <code>"genus"</code></p></li>
<li><p><code>ref</code><br> Author(s) and year of related scientific publication. This contains only the <em>first surname</em> and year of the <em>latest</em> authors, e.g. "Wallis <em>et al.</em> 2006 <em>emend.</em> Smith and Jones 2018" becomes "Smith <em>et al.</em>, 2018". This field is retrieved directly from the source specified in the column <code>source</code>. Accents were also removed to comply with CRAN, which only allows ASCII characters, e.g. "Váňová" becomes "Vanova".</p></li>
<li><p><code>lpsn</code><br> Identifier ('Record number') of the List of Prokaryotic names with Standing in Nomenclature (LPSN). Where a taxon has more than one LPSN identifier, only the first/highest is kept so that each row holds a single identifier. For example, <em>Acetobacter ascendens</em> has LPSN Record numbers 7864 and 11011. Only the first is available in the <code>microorganisms</code> data set.</p></li>
<li><p><code>lpsn_parent</code><br> LPSN identifier of the parent taxon</p></li>
<li><p><code>lpsn_renamed_to</code><br> LPSN identifier of the currently valid taxon</p></li>
<li><p><code>gbif</code><br> Identifier ('taxonID') of the Global Biodiversity Information Facility (GBIF)</p></li>
<li><p><code>gbif_parent</code><br> GBIF identifier of the parent taxon</p></li>
<li><p><code>gbif_renamed_to</code><br> GBIF identifier of the currently valid taxon</p></li>
<li><p><code>source</code><br> Either "GBIF", "LPSN" or "manually added" (see <em>Source</em>)</p></li>
<li><p><code>snomed</code><br> Systematized Nomenclature of Medicine (SNOMED) code of the microorganism, version of 1 July, 2021 (see <em>Source</em>). Use <code><ahref="mo_property.html">mo_snomed()</a></code> to retrieve it quickly, see <code><ahref="mo_property.html">mo_property()</a></code>.</p></li>
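<p>The rule described for <code>fullname</code> can be illustrated with a minimal base R sketch. It is not the package's own code: the toy data frame, the helper name <code>build_fullname()</code> and the use of empty strings for ranks that do not apply are all assumptions made for illustration.</p>

```r
# Toy data (hypothetical): "" marks a rank that does not apply to the row.
taxa <- data.frame(
  genus      = c("Escherichia", "Staphylococcus", "Salmonella"),
  species    = c("coli", "", "enterica"),
  subspecies = c("", "", "arizonae"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: paste genus, species and subspecies together, then
# collapse the surplus spaces that empty parts leave behind.
build_fullname <- function(genus, species, subspecies) {
  pasted <- paste(genus, species, subspecies)
  trimws(gsub(" +", " ", pasted))
}

taxa$fullname <- build_fullname(taxa$genus, taxa$species, taxa$subspecies)
print(taxa$fullname)
```

<p>For ranks above genus no pasting is involved, since the full name is then simply the name of the taxon itself.</p>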
<ul><li><p>Parte, AC <em>et al.</em> (2020). <strong>List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.</strong> International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; <ahref="https://doi.org/10.1099/ijsem.0.004332"class="external-link">doi:10.1099/ijsem.0.004332</a>
<li><p>Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microorganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: <ahref="https://phinvads.cdc.gov"class="external-link">https://phinvads.cdc.gov</a></p></li>
<li><p>Grimont <em>et al.</em> Antigenic Formulae of the Salmonella Serovars, 2007, 9th Edition. WHO Collaborating Centre for Reference and Research on <em>Salmonella</em> (WHOCC-SALM).</p></li>
<p>Please note that entries are only based on the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF) (see below). Since these sources incorporate entries based on (recent) publications in the International Journal of Systematic and Evolutionary Microbiology (IJSEM), the year of publication can be later than one might expect.</p>
<p>For example, <em>Staphylococcus pettenkoferi</em> was described for the first time in Diagnostic Microbiology and Infectious Disease in 2002 (<ahref="https://doi.org/10.1016/s0732-8893%2802%2900399-1"class="external-link">doi:10.1016/s0732-8893(02)00399-1</a>
), but it was not until 2007 that a publication in IJSEM followed (<ahref="https://doi.org/10.1099/ijs.0.64381-0"class="external-link">doi:10.1099/ijs.0.64381-0</a>
<p>Included taxonomic data are:</p><ul><li><p>All ~36,000 (sub)species from the kingdoms of Archaea and Bacteria</p></li>
<li><p>~7,900 (sub)species from the kingdom of Fungi. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package. Only relevant fungi are covered (such as all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>~5,100 (sub)species from the kingdom of Protozoa</p></li>
<li><p>~1,400 (sub)species from ~40 other relevant genera from the kingdom of Animalia (such as <em>Strongyloides</em> and <em>Taenia</em>)</p></li>
<li><p>All ~9,800 previously accepted names of all included (sub)species (these were taxonomically renamed)</p></li>
<li><p>2 entries of <em>Staphylococcus</em> (coagulase-negative (CoNS) and coagulase-positive (CoPS))</p></li>
<li><p>1 entry of <em>Blastocystis</em> (<em>B. hominis</em>), although it officially does not exist (Noel <em>et al.</em> 2005, PMID 15634993)</p></li>
<li><p>1 entry of <em>Moraxella</em> (<em>M. catarrhalis</em>), which was formally named <em>Branhamella catarrhalis</em> (Catlin, 1970), though this change was never accepted within the field of clinical microbiology</p></li>
</ul><p>The code used to transform the original data into a cleansed <spanstyle="R">R</span> format can be found here: <ahref="https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R"class="external-link">https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R</a>.</p>
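<p>As an illustration of this kind of cleansing, the rule documented for the <code>ref</code> column (first surname and year of the latest authors, accents removed) could be sketched in base R as follows. This is a simplified sketch and not the script linked above: the helper names <code>strip_accents()</code> and <code>simplify_ref()</code> are hypothetical, and the accent table is deliberately tiny.</p>

```r
# Hypothetical helper: replace a few accented letters by their ASCII
# counterparts. Unicode escapes keep this source file itself ASCII-only.
strip_accents <- function(x) {
  from <- c("\u00e1", "\u00e9", "\u00ed", "\u00f3", "\u00fa", "\u0148")
  to   <- c("a", "e", "i", "o", "u", "n")
  for (i in seq_along(from)) {
    x <- gsub(from[i], to[i], x, fixed = TRUE)
  }
  x
}

# Hypothetical helper: reduce one reference string to
# "<first surname>[ et al.], <year>" of the latest authors.
simplify_ref <- function(ref) {
  # Keep only the part after the last "emend.", i.e. the latest authors.
  latest <- trimws(sub("^.*emend\\.", "", ref))
  # The first four-digit number is taken as the year.
  year <- regmatches(latest, regexpr("[0-9]{4}", latest))
  if (length(year) == 0) {
    return(NA_character_)
  }
  # Everything before the year is the author list.
  authors <- trimws(sub(",? *[0-9]{4}.*$", "", latest))
  # Cut the author list at the first separator to get the first surname.
  first_surname <- sub("( and |,| et al| &).*$", "", authors)
  multiple <- authors != first_surname
  out <- paste0(first_surname, if (multiple) " et al." else "", ", ", year)
  strip_accents(out)
}

print(simplify_ref("Wallis et al. 2006 emend. Smith and Jones 2018"))
print(simplify_ref("Catlin, 1970"))
```

<p>Applied to a whole column, such a helper would be called per element, for example with <code>vapply(refs, simplify_ref, character(1))</code>, where <code>refs</code> is an assumed character vector of raw references.</p>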
<p>Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. Please visit <ahref="https://msberends.github.io/AMR/articles/datasets.html">our website for the download links</a>. The actual files are of course available on <ahref="https://github.com/msberends/AMR/tree/main/data-raw"class="external-link">our GitHub repository</a>.</p>
<h2id="about-the-records-from-lpsn-see-source-">About the Records from LPSN (see <em>Source</em>)<aclass="anchor"aria-label="anchor"href="#about-the-records-from-lpsn-see-source-"></a></h2>
<p>The List of Prokaryotic names with Standing in Nomenclature (LPSN) provides comprehensive information on the nomenclature of prokaryotes. LPSN is a free-to-use service founded by Jean P. Euzeby in 1997 and later maintained by Aidan C. Parte.</p>
<spanclass="r-out co"><spanclass="r-pr">#></span><spanstyle="color: #949494;"># … with 52,134 more rows, and 13 more variables: species &lt;chr&gt;,</span></span>
<p></p><p><code>AMR</code> (for R). Free and open-source, licenced under the <atarget="_blank"href="https://github.com/msberends/AMR/blob/main/LICENSE"class="external-link">GNU General Public License version 2.0 (GPL-2)</a>.<br>Developed at the <atarget="_blank"href="https://www.rug.nl"class="external-link">University of Groningen</a> and <atarget="_blank"href="https://www.umcg.nl"class="external-link">University Medical Center Groningen</a> in The Netherlands.</p>