This package contains the complete taxonomic tree (last updated: 5 October 2021) of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (CoL), supplemented with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN).
Catalogue of Life
This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the Catalogue of Life (CoL, http://www.catalogueoflife.org), the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, https://lpsn.dsmz.de). This supplementation remains necessary until the CoL+ project is finished.
See the Included Taxa section below for more information about the included taxa. Use catalogue_of_life_version() to check which versions of the CoL and LPSN were included in this package.
Included Taxa
Included are:
All ~58,000 (sub)species from the kingdoms of Archaea, Bacteria, Chromista and Protozoa
All ~5,000 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Microascales, Mucorales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales, as well as ~4,600 other fungal (sub)species. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, most of which are not microbial but macroscopic, like mushrooms. Not all fungi therefore fit the scope of this package, and including all of them would also slow down our algorithms considerably. Including only the aforementioned taxonomic orders covers the most relevant fungi (such as all species of Aspergillus, Candida, Cryptococcus, Histoplasma, Pneumocystis, Saccharomyces and Trichophyton).
All ~2,200 (sub)species from ~50 other relevant genera from the kingdom of Animalia (such as Strongyloides and Taenia)
All ~14,000 previously accepted names of all included (sub)species (these were taxonomically renamed)
The complete taxonomic tree of all included (sub)species: from kingdom to subspecies
The responsible author(s) and year of scientific publication
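The composition listed above can be checked by tallying taxa per kingdom. The following is a minimal sketch in R, assuming that the microorganisms data set has a kingdom column. The helper count_by_kingdom() and the toy rows are hypothetical and serve only as illustration.

```r
# Sketch: tally taxa per kingdom.
# Assumption: the input data frame has a `kingdom` column,
# as the `microorganisms` data set is assumed to have.
count_by_kingdom <- function(taxa) {
  # table() counts rows per distinct kingdom; sort() puts the largest first
  sort(table(taxa$kingdom), decreasing = TRUE)
}

# Toy data (hypothetical rows, for illustration only)
toy <- data.frame(
  fullname = c("Escherichia coli", "Staphylococcus aureus", "Candida albicans"),
  kingdom  = c("Bacteria", "Bacteria", "Fungi"),
  stringsAsFactors = FALSE
)
toy_counts <- count_by_kingdom(toy)
print(toy_counts)

# If the AMR package is installed, apply the same helper to the real data
if (requireNamespace("AMR", quietly = TRUE)) {
  print(count_by_kingdom(AMR::microorganisms))
}
```

The counts obtained from the real data depend on the installed package version and may differ from the approximate numbers given above.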
The Catalogue of Life (http://www.catalogueoflife.org) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.9 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.
The syntax used to transform the original data into a cleansed R format can be found here: https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R.
See also
Data set microorganisms for the actual data.
Function as.mo() to use the data for intelligent determination of microorganisms.
Examples
# Get version info of included data set
catalogue_of_life_version()
#> Included in this AMR package (v1.8.1.9020) are:
#>
#> Catalogue of Life: 2019 Annual Checklist
#> Available at: http://www.catalogueoflife.org
#> Number of included microbial species: 49,029
#> List of Prokaryotic names with Standing in Nomenclature (5 October 2021)
#> Available at: https://lpsn.dsmz.de
#> Number of included bacterial species: 21,701
#>
#> => Total number of species included: 70,764
#> => Total number of synonyms included: 14,338
#>
#> See for more info `?microorganisms` and `?catalogue_of_life`.
# Get a note when a species was renamed
mo_shortname("Chlamydophila psittaci")
#> ℹ Chlamydophila psittaci (Everett et al., 1999) was renamed back to
#> Chlamydia psittaci (Page, 1968) [B_CHLMY_PSTT]
#> [1] "C. psittaci"
# Get any property from the entire taxonomic tree for all included species
mo_class("E. coli")
#> ℹ Function `as.mo()` is uncertain about "E. coli" (assuming Escherichia
#> coli). Run `mo_uncertainties()` to review this.
#> [1] "Gammaproteobacteria"
mo_family("E. coli")
#> ℹ Function `as.mo()` is uncertain about "E. coli" (assuming Escherichia
#> coli). Run `mo_uncertainties()` to review this.
#> [1] "Enterobacteriaceae"
mo_gramstain("E. coli") # based on kingdom and phylum, see ?mo_gramstain
#> ℹ Function `as.mo()` is uncertain about "E. coli" (assuming Escherichia
#> coli). Run `mo_uncertainties()` to review this.
#> [1] "Gram-negative"
mo_ref("E. coli")
#> ℹ Function `as.mo()` is uncertain about "E. coli" (assuming Escherichia
#> coli). Run `mo_uncertainties()` to review this.
#> [1] "Castellani et al., 1919"
# Make no mistake: this package is about microorganisms
mo_kingdom("C. elegans")
#> ℹ Function `as.mo()` is uncertain about "C. elegans" (assuming Cladosporium
#> elegans). Run `mo_uncertainties()` to review this.
#> [1] "Fungi"
mo_name("C. elegans")
#> ℹ Function `as.mo()` is uncertain about "C. elegans" (assuming Cladosporium
#> elegans). Run `mo_uncertainties()` to review this.
#> [1] "Cladosporium elegans"
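The notes above arise because abbreviated input such as "E. coli" has to be guessed by as.mo(). A possible way to avoid them, sketched here on the assumption that a full binomial name is matched without uncertainty, is to pass the full name and review any remaining guesses with mo_uncertainties(). The variable family_full is only illustrative.

```r
# Sketch, assuming the AMR package is installed; falls back to NA otherwise
if (requireNamespace("AMR", quietly = TRUE)) {
  # A full binomial name leaves less room for ambiguity than "E. coli"
  family_full <- AMR::mo_family("Escherichia coli")
  # Review any guesses that as.mo() made earlier in this session
  print(AMR::mo_uncertainties())
} else {
  family_full <- NA_character_
}
print(family_full)
```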