This package contains the complete taxonomic tree (last updated: 5 October 2021) of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (CoL), supplemented with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN).

Catalogue of Life


This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the Catalogue of Life (CoL, http://www.catalogueoflife.org), the most comprehensive and authoritative global index of species currently available. Nonetheless, the CoL data are supplemented with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, lpsn.dsmz.de). This supplementation remains necessary until the CoL+ project is finished.

See the section 'Included Taxa' for more information about the included taxa. To check which versions of the CoL and LPSN were included in this package, run catalogue_of_life_version().

Included Taxa

Included are:

  • All ~58,000 (sub)species from the kingdoms of Archaea, Bacteria, Chromista and Protozoa

  • All ~5,000 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Microascales, Mucorales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales, as well as ~4,600 other fungal (sub)species. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, most of which are not microbial but macroscopic, like mushrooms. Not all fungi therefore fit the scope of this package, and including all of them would also slow down the algorithms considerably. Including the aforementioned taxonomic orders covers the most relevant fungi (such as all species of Aspergillus, Candida, Cryptococcus, Histoplasma, Pneumocystis, Saccharomyces and Trichophyton).

  • All ~2,200 (sub)species from ~50 other relevant genera from the kingdom of Animalia (such as Strongyloides and Taenia)

  • All ~14,000 previously accepted names of all included (sub)species (these were taxonomically renamed)

  • The complete taxonomic tree of all included (sub)species: from kingdom to subspecies

  • The responsible author(s) and year of scientific publication
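The counts listed above can be checked roughly by tallying taxa per kingdom. The following is a minimal sketch, not part of the package: count_by_kingdom() is a hypothetical helper, the toy table exists only so that the code runs without the package installed, and the final step assumes that the microorganisms data set has a kingdom column, as in the version documented here.

```r
# Hypothetical helper (not part of the AMR package): tally the rows of a
# taxonomy table per kingdom, largest group first.
count_by_kingdom <- function(taxa) {
  counts <- table(taxa$kingdom)
  sort(counts, decreasing = TRUE)
}

# Toy data so the sketch runs without the package installed
toy <- data.frame(
  fullname = c("Escherichia coli", "Staphylococcus aureus", "Candida albicans"),
  kingdom  = c("Bacteria", "Bacteria", "Fungi"),
  stringsAsFactors = FALSE
)
toy_counts <- count_by_kingdom(toy)
print(toy_counts)

# With the AMR package installed, apply the same tally to the real data set.
# Assumption: `microorganisms` contains a `kingdom` column.
if (requireNamespace("AMR", quietly = TRUE) &&
    "kingdom" %in% colnames(AMR::microorganisms)) {
  print(count_by_kingdom(AMR::microorganisms))
}
```

The real data set also contains previously accepted names and subspecies, so its per-kingdom totals need not match the rounded figures above exactly.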

The Catalogue of Life (http://www.catalogueoflife.org) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.9 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.

The script used to transform the original data into a cleansed R format can be found at https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R.

Read more on Our Website!

On our website https://msberends.github.io/AMR/ you can find a comprehensive tutorial about how to conduct AMR data analysis, the complete documentation of all functions and an example analysis using WHONET data.

See also

Data set microorganisms for the actual data.
Function as.mo() to use the data for intelligent determination of microorganisms.

Examples

# Get version info of included data set
catalogue_of_life_version()


# Get a note when a species was renamed
mo_shortname("Chlamydophila psittaci")
# Note: 'Chlamydophila psittaci' (Everett et al., 1999) was renamed back to
#       'Chlamydia psittaci' (Page, 1968)
#> [1] "C. psittaci"

# Get any property from the entire taxonomic tree for all included species
mo_class("E. coli")
#> [1] "Gammaproteobacteria"

mo_family("E. coli")
#> [1] "Enterobacteriaceae"

mo_gramstain("E. coli") # based on kingdom and phylum, see ?mo_gramstain
#> [1] "Gram-negative"

mo_ref("E. coli")
#> [1] "Castellani et al., 1919"

# Make no mistake: this package is about microorganisms
mo_kingdom("C. elegans")
#> [1] "Fungi"                 # Fungi?!
mo_name("C. elegans")
#> [1] "Cladosporium elegans"  # Because a microorganism was found