New mo algorithm, prepare for 2.0

2026-07-13 22:30:54 +02:00 · 2022-10-05 09:12:22 +02:00
parent 63fe160322
commit cd2acc4a29
182 changed files with 4054 additions and 90905 deletions
--- a/man/AMR.Rd
+++ b/man/AMR.Rd
@@ -1,21 +1,44 @@
 % Generated by roxygen2: do not edit by hand
 % Please edit documentation in R/amr.R
+\docType{package}
 \name{AMR}
 \alias{AMR}
+\alias{AMR-package}
 \title{The \code{AMR} Package}
+\source{
+To cite AMR in publications use:
+
+Berends MS, Luz CF, Friedrich AW, Sinha BNM, Albers CJ, Glasner C (2022). "AMR: An R Package for Working with Antimicrobial Resistance Data." \emph{Journal of Statistical Software}, \emph{104}(3), 1-31. \doi{10.18637/jss.v104.i03}.
+
+A BibTeX entry for LaTeX users is:
+
+\preformatted{
+@Article{,
+  title = {{AMR}: An {R} Package for Working with Antimicrobial Resistance Data},
+  author = {Matthijs S. Berends and Christian F. Luz and Alexander W. Friedrich and Bhanu N. M. Sinha and Casper J. Albers and Corinna Glasner},
+  journal = {Journal of Statistical Software},
+  year = {2022},
+  volume = {104},
+  number = {3},
+  pages = {1--31},
+  doi = {10.18637/jss.v104.i03},
+}
+}
+}
 \description{
 Welcome to the \code{AMR} package.
-}
-\details{
+
 \code{AMR} is a free, open-source and independent \R package to simplify the analysis and prediction of Antimicrobial Resistance (AMR) and to work with microbial and antimicrobial data and properties, by using evidence-based methods. Our aim is to provide a standard for clean and reproducible antimicrobial resistance data analysis, that can therefore empower epidemiological analyses to continuously enable surveillance and treatment evaluation in any setting.

-After installing this package, \R knows ~71,000 distinct microbial species and all ~570 antibiotic, antimycotic and antiviral drugs by name and code (including ATC, EARS-NET, LOINC and SNOMED CT), and knows all about valid R/SI and MIC values. It supports any data format, including WHONET/EARS-Net data.
+This work was published in the Journal of Statistical Software (Volume 104(3); \doi{10.18637/jss.v104.i03}) and formed the basis of two PhD theses (\doi{10.33612/diss.177417131} and \doi{10.33612/diss.192486375}).
+
+After installing this package, \R knows ~49,000 distinct microbial species and all ~570 antibiotic, antimycotic and antiviral drugs by name and code (including ATC, EARS-NET, LOINC and SNOMED CT), and knows all about valid R/SI and MIC values. It supports any data format, including WHONET/EARS-Net data.

 This package is fully independent of any other \R package and works on Windows, macOS and Linux with all versions of \R since R-3.0.0 (April 2013). It was designed to work in any setting, including those with very limited resources. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the University of Groningen, in collaboration with non-profit organisations Certe Medical Diagnostics and Advice and University Medical Center Groningen. This \R package is actively maintained and free software; you can freely use and distribute it for both personal and commercial (but not patent) purposes under the terms of the GNU General Public License version 2.0 (GPL-2), as published by the Free Software Foundation.

 This package can be used for:
 \itemize{
-\item Reference for the taxonomy of microorganisms, since the package contains all microbial (sub)species from the Catalogue of Life and List of Prokaryotic names with Standing in Nomenclature
+\item Reference for the taxonomy of microorganisms, since the package contains all microbial (sub)species from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF)
 \item Interpreting raw MIC and disk diffusion values, based on the latest CLSI or EUCAST guidelines
 \item Retrieving antimicrobial drug names, doses and forms of administration from clinical health care records
 \item Determining first isolates to be used for AMR data analysis
@@ -38,21 +61,43 @@ This package can be used for:
 All data sets in this \code{AMR} package (about microorganisms, antibiotics, R/SI interpretation, EUCAST rules, etc.) are publicly and freely available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. We also provide tab-separated plain text files that are machine-readable and suitable for input in any software program, such as laboratory information systems. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}.
 }

-\section{Contact Us}{
-
-For suggestions, comments or questions, please contact us via:
-
-Dr. Matthijs S. Berends \cr
-m.s.berends [at] umcg [dot] nl \cr
-University of Groningen
-Department of Medical Microbiology and Infection Prevention \cr
-University Medical Center Groningen \cr
-Post Office Box 30001 \cr
-9700 RB Groningen \cr
-The Netherlands
-\url{https://msberends.github.io/AMR/}
-
-If you have found a bug, please file a new issue at: \cr
-\url{https://github.com/msberends/AMR/issues}
+\seealso{
+Useful links:
+\itemize{
+  \item \url{https://msberends.github.io/AMR/}
+  \item \url{https://github.com/msberends/AMR}
+  \item Report bugs at \url{https://github.com/msberends/AMR/issues}
 }

+}
+\author{
+\strong{Maintainer}: Matthijs S. Berends \email{m.berends@certe.nl} (\href{https://orcid.org/0000-0001-7620-1800}{ORCID})
+
+Authors:
+\itemize{
+  \item Christian F. Luz (\href{https://orcid.org/0000-0001-5809-5995}{ORCID}) [contributor]
+  \item Dennis Souverein (\href{https://orcid.org/0000-0003-0455-0336}{ORCID}) [contributor]
+  \item Erwin E. A. Hassing [contributor]
+}
+
+Other contributors:
+\itemize{
+  \item Casper J. Albers (\href{https://orcid.org/0000-0002-9213-6743}{ORCID}) [thesis advisor]
+  \item Peter Dutey-Magni (\href{https://orcid.org/0000-0002-8942-9836}{ORCID}) [contributor]
+  \item Judith M. Fonville [contributor]
+  \item Alex W. Friedrich (\href{https://orcid.org/0000-0003-4881-038X}{ORCID}) [thesis advisor]
+  \item Corinna Glasner (\href{https://orcid.org/0000-0003-1241-1328}{ORCID}) [thesis advisor]
+  \item Eric H. L. C. M. Hazenberg [contributor]
+  \item Gwen Knight (\href{https://orcid.org/0000-0002-7263-9896}{ORCID}) [contributor]
+  \item Annick Lenglet (\href{https://orcid.org/0000-0003-2013-8405}{ORCID}) [contributor]
+  \item Bart C. Meijer [contributor]
+  \item Dmytro Mykhailenko [contributor]
+  \item Anton Mymrikov [contributor]
+  \item Sofia Ny (\href{https://orcid.org/0000-0002-2017-1363}{ORCID}) [contributor]
+  \item Rogier P. Schade [contributor]
+  \item Bhanu N. M. Sinha (\href{https://orcid.org/0000-0003-1634-0010}{ORCID}) [thesis advisor]
+  \item Anthony Underwood (\href{https://orcid.org/0000-0002-8547-4277}{ORCID}) [contributor]
+}
+
+}
+\keyword{internal}
--- a/man/as.mo.Rd
+++ b/man/as.mo.Rd
@@ -4,18 +4,22 @@
 \alias{as.mo}
 \alias{mo}
 \alias{is.mo}
-\alias{mo_failures}
 \alias{mo_uncertainties}
 \alias{mo_renamed}
+\alias{mo_failures}
+\alias{mo_reset_session}
+\alias{mo_cleaning_regex}
 \title{Transform Input to a Microorganism Code}
 \usage{
 as.mo(
  x,
  Becker = FALSE,
  Lancefield = FALSE,
-  allow_uncertain = TRUE,
+  minimum_matching_score = NULL,
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
  reference_df = get_mo_source(),
-  ignore_pattern = getOption("AMR_ignore_pattern"),
+  ignore_pattern = getOption("AMR_ignore_pattern", NULL),
+  remove_from_input = mo_cleaning_regex(),
  language = get_AMR_locale(),
  info = interactive(),
  ...
@@ -23,28 +27,36 @@ as.mo(

 is.mo(x)

-mo_failures()
-
 mo_uncertainties()

 mo_renamed()
+
+mo_failures()
+
+mo_reset_session()
+
+mo_cleaning_regex()
 }
 \arguments{
 \item{x}{a \link{character} vector or a \link{data.frame} with one or two columns}

-\item{Becker}{a \link{logical} to indicate whether staphylococci should be categorised into coagulase-negative staphylococci ("CoNS") and coagulase-positive staphylococci ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} (1,2,3).
+\item{Becker}{a \link{logical} to indicate whether staphylococci should be categorised into coagulase-negative staphylococci ("CoNS") and coagulase-positive staphylococci ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} (see Source).

 This excludes \emph{Staphylococcus aureus} at default, use \code{Becker = "all"} to also categorise \emph{S. aureus} as "CoPS".}

-\item{Lancefield}{a \link{logical} to indicate whether a beta-haemolytic \emph{Streptococcus} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield (4). These streptococci will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.
+\item{Lancefield}{a \link{logical} to indicate whether a beta-haemolytic \emph{Streptococcus} should be categorised into Lancefield groups instead of their own species, according to Rebecca C. Lancefield (see Source). These streptococci will be categorised in their first group, e.g. \emph{Streptococcus dysgalactiae} will be group C, although officially it was also categorised into groups G and L.

 This excludes enterococci at default (who are in group D), use \code{Lancefield = "all"} to also categorise all enterococci as group D.}

-\item{allow_uncertain}{a number between \code{0} (or \code{"none"}) and \code{3} (or \code{"all"}), or \code{TRUE} (= \code{2}) or \code{FALSE} (= \code{0}) to indicate whether the input should be checked for less probable results, see \emph{Details}}
+\item{minimum_matching_score}{a numeric value to set as the lower limit for the \link[=mo_matching_score]{MO matching score}. When left blank, this will be determined automatically based on the character length of \code{x}, its \link[=microorganisms]{taxonomic kingdom} and \link[=mo_matching_score]{human pathogenicity}.}
+
+\item{keep_synonyms}{a \link{logical} to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is \code{FALSE}, which will return a note if old taxonomic names were processed. The default can be set with \code{options(AMR_keep_synonyms = TRUE)} or \code{options(AMR_keep_synonyms = FALSE)}.}

 \item{reference_df}{a \link{data.frame} to be used for extra reference when translating \code{x} to a valid \code{\link{mo}}. See \code{\link[=set_mo_source]{set_mo_source()}} and \code{\link[=get_mo_source]{get_mo_source()}} to automate the usage of your own codes (e.g. used in your analysis or organisation).}

-\item{ignore_pattern}{a regular expression (case-insensitive) of which all matches in \code{x} must return \code{NA}. This can be convenient to exclude known non-relevant input and can also be set with the option \code{AMR_ignore_pattern}, e.g. \code{options(AMR_ignore_pattern = "(not reported|contaminated flora)")}.}
+\item{ignore_pattern}{a \link[base:regex]{regular expression} (case-insensitive) of which all matches in \code{x} must return \code{NA}. This can be convenient to exclude known non-relevant input and can also be set with the option \code{AMR_ignore_pattern}, e.g. \code{options(AMR_ignore_pattern = "(not reported|contaminated flora)")}.}
+
+\item{remove_from_input}{a \link[base:regex]{regular expression} (case-insensitive) to clean the input of \code{x}. Everything matched in \code{x} will be removed. At default, this is the outcome of \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}, which removes texts between brackets and texts such as "species" and "serovar".}

 \item{language}{language to translate text like "no growth", which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})}

@@ -56,11 +68,9 @@ This excludes enterococci at default (who are in group D), use \code{Lancefield
 A \link{character} \link{vector} with additional class \code{\link{mo}}
 }
 \description{
-Use this function to determine a valid microorganism code (\code{\link{mo}}). Determination is done using intelligent rules and the complete taxonomic kingdoms Bacteria, Chromista, Protozoa, Archaea and most microbial species from the kingdom Fungi (see \emph{Source}). The input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (such as \code{"S. aureus"}), an abbreviation known in the field (such as \code{"MRSA"}), or just a genus. See \emph{Examples}.
+Use this function to determine a valid microorganism code (\code{\link{mo}}). Determination is done using intelligent rules and the complete taxonomic kingdoms Animalia, Archaea, Bacteria and Protozoa, and most microbial species from the kingdom Fungi (see \emph{Source}). The input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (such as \code{"S. aureus"}), an abbreviation known in the field (such as \code{"MRSA"}), or just a genus. See \emph{Examples}.
 }
 \details{
-\subsection{General Info}{
-
 A microorganism (MO) code from this package (class: \code{\link{mo}}) is human readable and typically looks like these examples:

 \if{html}{\out{<div class="sourceCode">}}\preformatted{  Code               Full name
@@ -70,47 +80,23 @@ A microorganism (MO) code from this package (class: \code{\link{mo}}) is human r
  B_KLBSL_PNMN_RHNS  Klebsiella pneumoniae rhinoscleromatis
  |   |    |    |
  |   |    |    |
-  |   |    |    \\---> subspecies, a 4-5 letter acronym
-  |   |    \\----> species, a 4-5 letter acronym
-  |   \\----> genus, a 5-7 letter acronym
+  |   |    |    \\---> subspecies, a 3-5 letter acronym
+  |   |    \\----> species, a 3-6 letter acronym
+  |   \\----> genus, a 4-8 letter acronym
  \\----> taxonomic kingdom: A (Archaea), AN (Animalia), B (Bacteria),
-                            C (Chromista), F (Fungi), P (Protozoa)
+                            F (Fungi), PL (Plantae), P (Protozoa)
 }\if{html}{\out{</div>}}

-Values that cannot be coerced will be considered 'unknown' and will get the MO code \code{UNKNOWN}.
+Values that cannot be coerced will be considered 'unknown' and will be returned as the MO code \code{UNKNOWN} with a warning.

 Use the \code{\link[=mo_property]{mo_*}} functions to get properties based on the returned code, see \emph{Examples}.

-The algorithm uses data from the Catalogue of Life (see below) and from one other source (see \link{microorganisms}).
-
-The \code{\link[=as.mo]{as.mo()}} function uses several coercion rules for fast and logical results. It assesses the input matching criteria in the following order:
-\enumerate{
-\item Human pathogenic prevalence: the function  starts with more prevalent microorganisms, followed by less prevalent ones;
-\item Taxonomic kingdom: the function starts with determining Bacteria, then Fungi, then Protozoa, then others;
-\item Breakdown of input values to identify possible matches.
-}
-
-This will lead to the effect that e.g. \code{"E. coli"} (a microorganism highly prevalent in humans) will return the microbial ID of \emph{Escherichia coli} and not \emph{Entamoeba coli} (a microorganism less prevalent in humans), although the latter would alphabetically come first.
-}
-
+The \code{\link[=as.mo]{as.mo()}} function uses a novel \link[=mo_matching_score]{matching score algorithm} (see \emph{Matching Score for Microorganisms} below) to match input against the \link[=microorganisms]{available microbial taxonomy} in this package. This will lead to the effect that e.g. \code{"E. coli"} (a microorganism highly prevalent in humans) will return the microbial ID of \emph{Escherichia coli} and not \emph{Entamoeba coli} (a microorganism less prevalent in humans), although the latter would alphabetically come first. The algorithm uses data from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF) (see \link{microorganisms}).
 \subsection{Coping with Uncertain Results}{

-In addition, the \code{\link[=as.mo]{as.mo()}} function can differentiate four levels of uncertainty to guess valid results:
-\itemize{
-\item Uncertainty level 0: no additional rules are applied;
-\item Uncertainty level 1: allow previously accepted (but now invalid) taxonomic names and minor spelling errors;
-\item Uncertainty level 2: allow all of level 1, strip values between brackets, inverse the words of the input, strip off text elements from the end keeping at least two elements;
-\item Uncertainty level 3: allow all of level 1 and 2, strip off text elements from the end, allow any part of a taxonomic name.
-}
+Results of non-exact taxonomic input are based on their \link[=mo_matching_score]{matching score}. The lowest allowed score can be set with the \code{minimum_matching_score} argument. At default this will be determined based on the character length of the input, and the \link[=microorganisms]{taxonomic kingdom} and \link[=mo_matching_score]{human pathogenicity} of the taxonomic outcome. If values are matched with uncertainty, a message will be shown to suggest the user to evaluate the results with \code{\link[=mo_uncertainties]{mo_uncertainties()}}, which returns a \link{data.frame} with all specifications.

-The level of uncertainty can be set using the argument \code{allow_uncertain}. The default is \code{allow_uncertain = TRUE}, which is equal to uncertainty level 2. Using \code{allow_uncertain = FALSE} is equal to uncertainty level 0 and will skip all rules. You can also use e.g. \code{as.mo(..., allow_uncertain = 1)} to only allow up to level 1 uncertainty.
-
-With the default setting (\code{allow_uncertain = TRUE}, level 2), below examples will lead to valid results:
-\itemize{
-\item \code{"Streptococcus group B (known as S. agalactiae)"}. The text between brackets will be removed and a warning will be thrown that the result \emph{Streptococcus group B} (\code{B_STRPT_GRPB}) needs review.
-\item \code{"S. aureus - please mind: MRSA"}. The last word will be stripped, after which the function will try to find a match. If it does not, the second last word will be stripped, etc. Again, a warning will be thrown that the result \emph{Staphylococcus aureus} (\code{B_STPHY_AURS}) needs review.
-\item \code{"Fluoroquinolone-resistant Neisseria gonorrhoeae"}. The first word will be stripped, after which the function will try to find a match. A warning will be thrown that the result \emph{Neisseria gonorrhoeae} (\code{B_NESSR_GNRR}) needs review.
-}
+To increase the quality of matching, the \code{remove_from_input} argument can be used to clean the input (i.e., \code{x}). This must be a \link[base:regex]{regular expression} that matches parts of the input that should be removed before the input is matched against the \link[=microorganisms]{available microbial taxonomy}. It will be matched Perl-compatible and case-insensitive. The default value of \code{remove_from_input} is the outcome of the helper function \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}.

 There are three helper functions that can be run after using the \code{\link[=as.mo]{as.mo()}} function:
 \itemize{
@@ -122,19 +108,21 @@ There are three helper functions that can be run after using the \code{\link[=as

 \subsection{Microbial Prevalence of Pathogens in Humans}{

-The intelligent rules consider the prevalence of microorganisms in humans grouped into three groups, which is available as the \code{prevalence} columns in the \link{microorganisms} and \link{microorganisms.old} data sets. The grouping into human pathogenic prevalence is explained in the section \emph{Matching Score for Microorganisms} below.
+The coercion rules consider the prevalence of microorganisms in humans grouped into three groups, which is available as the \code{prevalence} columns in the \link{microorganisms} data set. The grouping into human pathogenic prevalence is explained in the section \emph{Matching Score for Microorganisms} below.
 }
 }
 \section{Source}{

 \enumerate{
-\item Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870-926; \doi{10.1128/CMR.00109-13}
-\item Becker K \emph{et al.} \strong{Implications of identifying the recently defined members of the \emph{S. aureus} complex, \emph{S. argenteus} and \emph{S. schweitzeri}: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).} 2019. Clin Microbiol Infect; \doi{10.1016/j.cmi.2019.02.028}
-\item Becker K \emph{et al.} \strong{Emergence of coagulase-negative staphylococci} 2020. Expert Rev Anti Infect Ther. 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
-\item Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571-95; \doi{10.1084/jem.57.4.571}
-\item Catalogue of Life: 2019 Annual Checklist, \url{http://www.catalogueoflife.org}
-\item List of Prokaryotic names with Standing in Nomenclature (5 October 2021), \doi{10.1099/ijsem.0.004332}
-\item US Edition of SNOMED CT from 1 September 2020, retrieved from the Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS), OID 2.16.840.1.114222.4.11.1009, version 12; url: \url{https://phinvads.cdc.gov/vads/ViewValueSet.action?oid=2.16.840.1.114222.4.11.1009}
+\item Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03}
+\item Becker K \emph{et al.} (2014). \strong{Coagulase-Negative Staphylococci.} \emph{Clin Microbiol Rev.} 27(4): 870-926; \doi{10.1128/CMR.00109-13}
+\item Becker K \emph{et al.} (2019). \strong{Implications of identifying the recently defined members of the \emph{S. aureus} complex, \emph{S. argenteus} and \emph{S. schweitzeri}: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).} \emph{Clin Microbiol Infect}; \doi{10.1016/j.cmi.2019.02.028}
+\item Becker K \emph{et al.} (2020). \strong{Emergence of coagulase-negative staphylococci} \emph{Expert Rev Anti Infect Ther.} 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
+\item Lancefield RC (1933). \strong{A serological differentiation of human and other groups of hemolytic streptococci}. \emph{J Exp Med.} 57(4): 571-95; \doi{10.1084/jem.57.4.571}
+\item Berends MS \emph{et al.} (2022). \strong{Trends in Occurrence and Phenotypic Resistance of Coagulase-Negative Staphylococci (CoNS) Found in Human Blood in the Northern Netherlands between 2013 and 2019} \emph{Microorganisms} 10(9), 1801; \doi{10.3390/microorganisms10091801}
+\item Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on 12 September, 2022.
+\item GBIF Secretariat (November 26, 2021). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on 12 September, 2022.
+\item Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microoganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: \url{https://phinvads.cdc.gov}
 }
 }

@@ -149,26 +137,22 @@ where:
 \item \ifelse{html}{\out{<i>x</i> is the user input;}}{\eqn{x} is the user input;}
 \item \ifelse{html}{\out{<i>n</i> is a taxonomic name (genus, species, and subspecies);}}{\eqn{n} is a taxonomic name (genus, species, and subspecies);}
 \item \ifelse{html}{\out{<i>l<sub>n</sub></i> is the length of <i>n</i>;}}{l_n is the length of \eqn{n};}
-\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a>, which counts any insertion, deletion and substitution as 1 that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function, which counts any insertion, deletion and substitution as 1 that is needed to change \eqn{x} into \eqn{n};}
+\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a> (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n};}
 \item \ifelse{html}{\out{<i>p<sub>n</sub></i> is the human pathogenic prevalence group of <i>n</i>, as described below;}}{p_n is the human pathogenic prevalence group of \eqn{n}, as described below;}
 \item \ifelse{html}{\out{<i>k<sub>n</sub></i> is the taxonomic kingdom of <i>n</i>, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}}{l_n is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}
 }

-The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence. \strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales. \strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Alternaria}, \emph{Anaerosalibacter}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobacterium}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Blastocystis}, \emph{Branhamella}, \emph{Calymmatobacterium}, \emph{Candida}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Chaetomium}, \emph{Chryseobacterium}, \emph{Chryseomonas}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Hendersonula}, \emph{Hypomyces}, \emph{Koserella}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Nectria}, \emph{Ochroconis}, \emph{Oidiodendron}, \emph{Phoma}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Stomatococcus}, \emph{Treponema}, \emph{Trichoderma}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Tritirachium} or \emph{Ureaplasma}. \strong{Group 3} consists of all other microorganisms.
+The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence:
+
+\strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales.
+
+\strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acanthamoeba}, \emph{Acholeplasma}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Aedes}, \emph{Alistipes}, \emph{Alloprevotella}, \emph{Alternaria}, \emph{Amoeba}, \emph{Anaerosalibacter}, \emph{Ancylostoma}, \emph{Angiostrongylus}, \emph{Anisakis}, \emph{Anopheles}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Bergeyella}, \emph{Blastocystis}, \emph{Blastomyces}, \emph{Borrelia}, \emph{Brachyspira}, \emph{Branhamella}, \emph{Butyricimonas}, \emph{Candida}, \emph{Capillaria}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Cetobacterium}, \emph{Chaetomium}, \emph{Chlamydia}, \emph{Chlamydophila}, \emph{Chryseobacterium}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Contracaecum}, \emph{Cordylobia}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Deinococcus}, \emph{Demodex}, \emph{Dermatobia}, \emph{Dientamoeba}, \emph{Diphyllobothrium}, \emph{Dirofilaria}, \emph{Dysgonomonas}, \emph{Echinostoma}, \emph{Elizabethkingia}, \emph{Empedobacter}, \emph{Entamoeba}, \emph{Enterobius}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Fasciola}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Giardia}, \emph{Haloarcula}, \emph{Halobacterium}, \emph{Halococcus}, \emph{Hendersonula}, \emph{Heterophyes}, \emph{Histomonas}, \emph{Histoplasma}, \emph{Hymenolepis}, \emph{Hypomyces}, \emph{Hysterothylacium}, \emph{Leishmania}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Lucilia}, \emph{Lumbricus}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Metagonimus}, \emph{Meyerozyma}, \emph{Microsporidium}, \emph{Microsporum}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Myroides}, \emph{Necator}, \emph{Nectria}, \emph{Ochroconis}, \emph{Odoribacter}, \emph{Oesophagostomum}, \emph{Oidiodendron}, \emph{Opisthorchis}, \emph{Ornithobacterium}, \emph{Parabacteroides}, \emph{Pediculus}, \emph{Pedobacter}, \emph{Phlebotomus}, \emph{Phocaeicola}, \emph{Phocanema}, \emph{Phoma}, \emph{Pichia}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Pneumocystis}, \emph{Porphyromonas}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Pseudoterranova}, \emph{Pulex}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Riemerella}, \emph{Saccharomyces}, \emph{Sarcoptes}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sphingobacterium}, \emph{Spirometra}, \emph{Spiroplasma}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Streptobacillus}, \emph{Strongyloides}, \emph{Syngamus}, \emph{Taenia}, \emph{Tannerella}, \emph{Tenacibaculum}, \emph{Terrimonas}, \emph{Toxocara}, \emph{Treponema}, \emph{Trichinella}, \emph{Trichobilharzia}, \emph{Trichoderma}, \emph{Trichomonas}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Trichostrongylus}, \emph{Trichuris}, \emph{Tritirachium}, \emph{Trombicula}, \emph{Trypanosoma}, \emph{Tunga}, \emph{Ureaplasma}, \emph{Victivallis}, \emph{Wautersiella}, \emph{Weeksella} or \emph{Wuchereria}.
+
+\strong{Group 3} consists of all other microorganisms.

 All characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses.

-All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.079}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
-
-Since \code{AMR} version 1.8.1, common microorganism abbreviations are ignored in determining the matching score. These abbreviations are currently: AIEC, ATEC, BORSA, CRSM, DAEC, EAEC, EHEC, EIEC, EPEC, ETEC, GISA, MRPA, MRSA, MRSE, MSSA, MSSE, NMEC, PISP, PRSP, STEC, UPEC, VISA, VISP, VRE, VRSA and VRSP.
-}
-
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
+All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.119}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
 }

 \section{Reference Data Publicly Available}{
@@ -179,29 +163,31 @@ All data sets in this \code{AMR} package (about microorganisms, antibiotics, R/S
 \examples{
 \donttest{
 # These examples all return "B_STPHY_AURS", the ID of S. aureus:
-as.mo("sau") # WHONET code
-as.mo("stau")
-as.mo("STAU")
-as.mo("staaur")
-as.mo("S. aureus")
-as.mo("S aureus")
-as.mo("Staphylococcus aureus")
-as.mo("Staphylococcus aureus (MRSA)")
-as.mo("Zthafilokkoockus oureuz") # handles incorrect spelling
-as.mo("MRSA") # Methicillin Resistant S. aureus
-as.mo("VISA") # Vancomycin Intermediate S. aureus
-as.mo("VRSA") # Vancomycin Resistant S. aureus
-as.mo(115329001) # SNOMED CT code
+as.mo(c(
+  "sau", # WHONET code
+  "stau",
+  "STAU",
+  "staaur",
+  "S. aureus",
+  "S aureus",
+  "Staphylococcus aureus",
+  "Staphylococcus aureus (MRSA,",
+  "Zthafilokkoockus oureuz", # handles incorrect spelling
+  "MRSA", # Methicillin Resistant S. aureus
+  "VISA", # Vancomycin Intermediate S. aureus
+  "VRSA", # Vancomycin Resistant S. aureus
+  115329001 # SNOMED CT code
+))

 # Dyslexia is no problem - these all work:
-as.mo("Ureaplasma urealyticum")
-as.mo("Ureaplasma urealyticus")
-as.mo("Ureaplasmium urealytica")
-as.mo("Ureaplazma urealitycium")
+as.mo(c(
+  "Ureaplasma urealyticum",
+  "Ureaplasma urealyticus",
+  "Ureaplasmium urealytica",
+  "Ureaplazma urealitycium"
+))

 as.mo("Streptococcus group A")
-as.mo("GAS") # Group A Streptococci
-as.mo("GBS") # Group B Streptococci

 as.mo("S. epidermidis") # will remain species: B_STPHY_EPDR
 as.mo("S. epidermidis", Becker = TRUE) # will not remain species: B_STPHY_CONS
@@ -210,9 +196,9 @@ as.mo("S. pyogenes") # will remain species: B_STRPT_PYGN
 as.mo("S. pyogenes", Lancefield = TRUE) # will not remain species: B_STRPT_GRPA

 # All mo_* functions use as.mo() internally too (see ?mo_property):
-mo_genus("E. coli") # returns "Escherichia"
-mo_gramstain("E. coli") # returns "Gram negative"
-mo_is_intrinsic_resistant("E. coli", "vanco") # returns TRUE
+mo_genus("E. coli")
+mo_gramstain("ESCO")
+mo_is_intrinsic_resistant("ESCCOL", ab = "vanco")
 }
 }
 \seealso{
@@ -220,9 +206,3 @@ mo_is_intrinsic_resistant("E. coli", "vanco") # returns TRUE

 The \code{\link[=mo_property]{mo_*}} functions (such as \code{\link[=mo_genus]{mo_genus()}}, \code{\link[=mo_gramstain]{mo_gramstain()}}) to get properties based on the returned code.
 }
-\keyword{Becker}
-\keyword{Lancefield}
-\keyword{becker}
-\keyword{guess}
-\keyword{lancefield}
-\keyword{mo}
--- a/man/as.rsi.Rd
+++ b/man/as.rsi.Rd
@@ -229,7 +229,7 @@ if (require("dplyr")) {
    as.rsi() # automatically determines urine isolates

  df \%>\%
-    mutate_at(vars(AMP:NIT), as.rsi, mo = "E. coli", uti = TRUE)
+    mutate_at(vars(AMP:TOB), as.rsi, mo = "E. coli", uti = TRUE)
 }

 # For CLEANING existing R/SI values ------------------------------------
--- a/man/catalogue_of_life.Rd
+++ b/man/catalogue_of_life.Rd
@@ -1,58 +0,0 @@
-% Generated by roxygen2: do not edit by hand
-% Please edit documentation in R/catalogue_of_life.R
-\name{catalogue_of_life}
-\alias{catalogue_of_life}
-\title{The Catalogue of Life}
-\description{
-This package contains the complete taxonomic tree (last updated: 5 October 2021) of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (CoL), supplemented with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN).
-}
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
-}
-
-\section{Included Taxa}{
-
-Included are:
-\itemize{
-\item All ~58,000 (sub)species from the kingdoms of Archaea, Bacteria, Chromista and Protozoa
-\item All ~5,000 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Microascales, Mucorales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales, as well as ~4,600 other fungal (sub)species. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (such as all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).
-\item All ~2,200 (sub)species from ~50 other relevant genera from the kingdom of Animalia (such as \emph{Strongyloides} and \emph{Taenia})
-\item All ~14,000 previously accepted names of all included (sub)species (these were taxonomically renamed)
-\item The complete taxonomic tree of all included (sub)species: from kingdom to subspecies
-\item The responsible author(s) and year of scientific publication
-}
-
-The Catalogue of Life (\url{http://www.catalogueoflife.org}) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.9 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.
-
-The syntax used to transform the original data to a cleansed \R format, can be found here: \url{https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R}.
-}
-
-\examples{
-# Get version info of included data set
-catalogue_of_life_version()
-
-
-# Get a note when a species was renamed
-mo_shortname("Chlamydophila psittaci")
-
-# Get any property from the entire taxonomic tree for all included species
-mo_class("Escherichia coli")
-
-mo_family("Escherichia coli")
-
-mo_gramstain("Escherichia coli") # based on kingdom and phylum, see ?mo_gramstain
-
-mo_ref("Escherichia coli")
-
-# Do not get mistaken - this package is about microorganisms
-mo_kingdom("C. elegans")
-mo_name("C. elegans")
-}
-\seealso{
-Data set \link{microorganisms} for the actual data. \cr
-Function \code{\link[=as.mo]{as.mo()}} to use the data for intelligent determination of microorganisms.
-}
--- a/man/catalogue_of_life_version.Rd
+++ b/man/catalogue_of_life_version.Rd
@@ -1,28 +0,0 @@
-% Generated by roxygen2: do not edit by hand
-% Please edit documentation in R/catalogue_of_life.R
-\name{catalogue_of_life_version}
-\alias{catalogue_of_life_version}
-\title{Version info of included Catalogue of Life}
-\usage{
-catalogue_of_life_version()
-}
-\value{
-a \link{list}, which prints in pretty format
-}
-\description{
-This function returns information about the included data from the Catalogue of Life.
-}
-\details{
-For LPSN, see \link{microorganisms}.
-}
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
-}
-
-\seealso{
-\link{microorganisms}
-}
--- a/man/custom_eucast_rules.Rd
+++ b/man/custom_eucast_rules.Rd
@@ -24,43 +24,87 @@ Some organisations have their own adoption of EUCAST rules. This function can be

 If you are familiar with the \code{\link[dplyr:case_when]{case_when()}} function of the \code{dplyr} package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written \emph{before} the tilde (\code{~}) and the consequence of the rule is written \emph{after} the tilde:

-\if{html}{\out{<div class="sourceCode {r}">}}\preformatted{x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S",
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S",
                         TZP == "R" ~ aminopenicillins == "R")
 }\if{html}{\out{</div>}}

 These are two custom EUCAST rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work:

-\if{html}{\out{<div class="sourceCode {r}">}}\preformatted{x
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{x
+#> A set of custom EUCAST rules:
+#>
+#>   1. If TZP is "S" then set to  S :
+#>      amoxicillin (AMX), ampicillin (AMP)
+#>
+#>   2. If TZP is "R" then set to  R :
+#>      amoxicillin (AMX), ampicillin (AMP)
 }\if{html}{\out{</div>}}

 The rules (the part \emph{before} the tilde, in above example \code{TZP == "S"} and \code{TZP == "R"}) must be evaluable in your data set: it should be able to run as a filter in your data set without errors. This means for the above example that the column \code{TZP} must exist. We will create a sample data set and test the rules set:

-\if{html}{\out{<div class="sourceCode {r}">}}\preformatted{df <- data.frame(mo = c("Escherichia coli", "Klebsiella pneumoniae"),
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{df <- data.frame(mo = c("Escherichia coli", "Klebsiella pneumoniae"),
                 TZP = as.rsi("R"),
                 ampi = as.rsi("S"),
                 cipro = as.rsi("S"))
 df
+#>                      mo TZP ampi cipro
+#> 1      Escherichia coli   R    S     S
+#> 2 Klebsiella pneumoniae   R    S     S

 eucast_rules(df, rules = "custom", custom_rules = x, info = FALSE)
+#>                      mo TZP ampi cipro
+#> 1      Escherichia coli   R    R     S
+#> 2 Klebsiella pneumoniae   R    R     S
 }\if{html}{\out{</div>}}
 }

 \subsection{Using taxonomic properties in rules}{

-There is one exception in variables used for the rules: all column names of the \link{microorganisms} data set can also be used, but do not have to exist in the data set. These column names are: \verb{r vector_and(colnames(microorganisms), sort = FALSE)}. Thus, this next example will work as well, despite the fact that the \code{df} data set does not contain a column \code{genus}:
+There is one exception in variables used for the rules: all column names of the \link{microorganisms} data set can also be used, but do not have to exist in the data set. These column names are: "mo", "fullname", "status", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "source", "lpsn", "lpsn_parent", "lpsn_renamed_to", "gbif", "gbif_parent", "gbif_renamed_to", "prevalence" and "snomed". Thus, this next example will work as well, despite the fact that the \code{df} data set does not contain a column \code{genus}:

-\if{html}{\out{<div class="sourceCode {r}">}}\preformatted{y <- custom_eucast_rules(TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S",
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{y <- custom_eucast_rules(TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S",
                         TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R")

 eucast_rules(df, rules = "custom", custom_rules = y, info = FALSE)
+#>                      mo TZP ampi cipro
+#> 1      Escherichia coli   R    S     S
+#> 2 Klebsiella pneumoniae   R    R     S
 }\if{html}{\out{</div>}}
 }

 \subsection{Usage of antibiotic group names}{

 It is possible to define antibiotic groups instead of single antibiotics for the rule consequence, the part \emph{after} the tilde. In above examples, the antibiotic group \code{aminopenicillins} is used to include ampicillin and amoxicillin. The following groups are allowed (case-insensitive). Within parentheses are the agents that will be matched when running the rule.
-
-\verb{r paste0("  * ", sapply(DEFINED_AB_GROUPS, function(x) paste0("\\"", tolower(gsub("^AB_", "", x)), "\\"\\\\cr(", vector_and(ab_name(eval(parse(text = x), envir = asNamespace("AMR")), language = NULL, tolower = TRUE), quotes = FALSE), ")"), USE.NAMES = FALSE), "\\n", collapse = "")}
+\itemize{
+\item "aminoglycosides"\cr(amikacin, amikacin/fosfomycin, amphotericin B-high, apramycin, arbekacin, astromicin, bekanamycin, dibekacin, framycetin, gentamicin, gentamicin-high, habekacin, hygromycin, isepamicin, kanamycin, kanamycin-high, kanamycin/cephalexin, micronomicin, neomycin, netilmicin, pentisomicin, plazomicin, propikacin, ribostamycin, sisomicin, streptoduocin, streptomycin, streptomycin-high, tobramycin and tobramycin-high)
+\item "aminopenicillins"\cr(amoxicillin and ampicillin)
+\item "antifungals"\cr(5-fluorocytosine, amphotericin B, anidulafungin, butoconazole, caspofungin, ciclopirox, clotrimazole, econazole, fluconazole, fosfluconazole, griseofulvin, hachimycin, ibrexafungerp, isavuconazole, isoconazole, itraconazole, ketoconazole, manogepix, micafungin, miconazole, nystatin, pimaricin, posaconazole, rezafungin, ribociclib, sulconazole, terbinafine, terconazole and voriconazole)
+\item "antimycobacterials"\cr(4-aminosalicylic acid, calcium aminosalicylate, capreomycin, clofazimine, delamanid, enviomycin, ethambutol, ethambutol/isoniazid, ethionamide, isoniazid, morinamide, p-aminosalicylic acid, pretomanid, prothionamide, pyrazinamide, rifabutin, rifampicin, rifampicin/isoniazid, rifampicin/pyrazinamide/ethambutol/isoniazid, rifampicin/pyrazinamide/isoniazid, rifamycin, rifapentine, simvastatin/fenofibrate, sodium aminosalicylate, streptomycin/isoniazid, terizidone, thioacetazone/isoniazid, tiocarlide and viomycin)
+\item "betalactams"\cr(amoxicillin, amoxicillin/clavulanic acid, amoxicillin/sulbactam, ampicillin, ampicillin/sulbactam, apalcillin, aspoxicillin, avibactam, azidocillin, azlocillin, aztreonam, aztreonam/avibactam, aztreonam/nacubactam, bacampicillin, benzathine benzylpenicillin, benzathine phenoxymethylpenicillin, benzylpenicillin, biapenem, carbenicillin, carindacillin, cefacetrile, cefaclor, cefadroxil, cefaloridine, cefamandole, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/nacubactam, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/enzyme inhibitor, ceftolozane/tazobactam, ceftriaxone, cefuroxime, cefuroxime axetil, cephalexin, cephalothin, cephapirin, cephradine, ciclacillin, clometocillin, cloxacillin, dicloxacillin, doripenem, epicillin, ertapenem, flucloxacillin, hetacillin, imipenem, imipenem/EDTA, imipenem/relebactam, latamoxef, lenampicillin, loracarbef, mecillinam, meropenem, meropenem/nacubactam, meropenem/vaborbactam, metampicillin, methicillin, mezlocillin, mezlocillin/sulbactam, nacubactam, nafcillin, oxacillin, panipenem, penamecillin, penicillin/novobiocin, penicillin/sulbactam, phenethicillin, phenoxymethylpenicillin, piperacillin, piperacillin/sulbactam, piperacillin/tazobactam, piridicillin, pivampicillin, pivmecillinam, procaine benzylpenicillin, propicillin, razupenem, ritipenem, ritipenem acoxil, sarmoxicillin, sulbactam, sulbenicillin, sultamicillin, talampicillin, tazobactam, tebipenem, temocillin, ticarcillin and ticarcillin/clavulanic acid)
+\item "carbapenems"\cr(biapenem, doripenem, ertapenem, imipenem, imipenem/EDTA, imipenem/relebactam, meropenem, meropenem/nacubactam, meropenem/vaborbactam, panipenem, razupenem, ritipenem, ritipenem acoxil and tebipenem)
+\item "cephalosporins"\cr(cefacetrile, cefaclor, cefadroxil, cefaloridine, cefamandole, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/enzyme inhibitor, ceftolozane/tazobactam, ceftriaxone, cefuroxime, cefuroxime axetil, cephalexin, cephalothin, cephapirin, cephradine, latamoxef and loracarbef)
+\item "cephalosporins_1st"\cr(cefacetrile, cefadroxil, cefaloridine, cefatrizine, cefazedone, cefazolin, cefroxadine, ceftezole, cephalexin, cephalothin, cephapirin and cephradine)
+\item "cephalosporins_2nd"\cr(cefaclor, cefamandole, cefmetazole, cefonicid, ceforanide, cefotetan, cefotiam, cefoxitin, cefoxitin screening, cefprozil, cefuroxime, cefuroxime axetil and loracarbef)
+\item "cephalosporins_3rd"\cr(cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefetamet, cefetamet pivoxil, cefixime, cefmenoxime, cefodizime, cefoperazone, cefoperazone/sulbactam, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotiam hexetil, cefovecin, cefpimizole, cefpiramide, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefsulodin, ceftazidime, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftriaxone and latamoxef)
+\item "cephalosporins_4th"\cr(cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetecol, cefoselis, cefozopran, cefpirome and cefquinome)
+\item "cephalosporins_5th"\cr(ceftaroline, ceftaroline/avibactam, ceftobiprole, ceftobiprole medocaril, ceftolozane/enzyme inhibitor and ceftolozane/tazobactam)
+\item "cephalosporins_except_caz"\cr(cefacetrile, cefaclor, cefadroxil, cefaloridine, cefamandole, cefatrizine, cefazedone, cefazolin, cefcapene, cefcapene pivoxil, cefdinir, cefditoren, cefditoren pivoxil, cefepime, cefepime/clavulanic acid, cefepime/tazobactam, cefetamet, cefetamet pivoxil, cefetecol, cefetrizole, cefixime, cefmenoxime, cefmetazole, cefodizime, cefonicid, cefoperazone, cefoperazone/sulbactam, ceforanide, cefoselis, cefotaxime, cefotaxime/clavulanic acid, cefotaxime/sulbactam, cefotetan, cefotiam, cefotiam hexetil, cefovecin, cefoxitin, cefoxitin screening, cefozopran, cefpimizole, cefpiramide, cefpirome, cefpodoxime, cefpodoxime proxetil, cefpodoxime/clavulanic acid, cefprozil, cefquinome, cefroxadine, cefsulodin, cefsumide, ceftaroline, ceftaroline/avibactam, ceftazidime/avibactam, ceftazidime/clavulanic acid, cefteram, cefteram pivoxil, ceftezole, ceftibuten, ceftiofur, ceftizoxime, ceftizoxime alapivoxil, ceftobiprole, ceftobiprole medocaril, ceftolozane/enzyme inhibitor, ceftolozane/tazobactam, ceftriaxone, cefuroxime, cefuroxime axetil, cephalexin, cephalothin, cephapirin, cephradine, latamoxef and loracarbef)
+\item "fluoroquinolones"\cr(besifloxacin, ciprofloxacin, clinafloxacin, danofloxacin, delafloxacin, difloxacin, enoxacin, enrofloxacin, finafloxacin, fleroxacin, garenoxacin, gatifloxacin, gemifloxacin, grepafloxacin, levofloxacin, levonadifloxacin, lomefloxacin, marbofloxacin, metioxate, miloxacin, moxifloxacin, nadifloxacin, nifuroquine, norfloxacin, ofloxacin, orbifloxacin, pazufloxacin, pefloxacin, pradofloxacin, premafloxacin, prulifloxacin, rufloxacin, sarafloxacin, sitafloxacin, sparfloxacin, temafloxacin, tilbroquinol, tioxacin, tosufloxacin and trovafloxacin)
+\item "glycopeptides"\cr(avoparcin, dalbavancin, norvancomycin, oritavancin, ramoplanin, teicoplanin, teicoplanin-macromethod, telavancin, vancomycin and vancomycin-macromethod)
+\item "glycopeptides_except_lipo"\cr(avoparcin, norvancomycin, ramoplanin, teicoplanin, teicoplanin-macromethod, vancomycin and vancomycin-macromethod)
+\item "lincosamides"\cr(acetylmidecamycin, acetylspiramycin, clindamycin, gamithromycin, kitasamycin, lincomycin, meleumycin, nafithromycin, pirlimycin, primycin, solithromycin, tildipirosin, tilmicosin, tulathromycin, tylosin and tylvalosin)
+\item "lipoglycopeptides"\cr(dalbavancin, oritavancin and telavancin)
+\item "macrolides"\cr(acetylmidecamycin, acetylspiramycin, azithromycin, clarithromycin, dirithromycin, erythromycin, flurithromycin, gamithromycin, josamycin, kitasamycin, meleumycin, midecamycin, miocamycin, nafithromycin, oleandomycin, pirlimycin, primycin, rokitamycin, roxithromycin, solithromycin, spiramycin, telithromycin, tildipirosin, tilmicosin, troleandomycin, tulathromycin, tylosin and tylvalosin)
+\item "oxazolidinones"\cr(cadazolid, cycloserine, linezolid, tedizolid and thiacetazone)
+\item "penicillins"\cr(amoxicillin, amoxicillin/clavulanic acid, amoxicillin/sulbactam, ampicillin, ampicillin/sulbactam, apalcillin, aspoxicillin, avibactam, azidocillin, azlocillin, aztreonam, aztreonam/avibactam, aztreonam/nacubactam, bacampicillin, benzathine benzylpenicillin, benzathine phenoxymethylpenicillin, benzylpenicillin, carbenicillin, carindacillin, cefepime/nacubactam, ciclacillin, clometocillin, cloxacillin, dicloxacillin, epicillin, flucloxacillin, hetacillin, lenampicillin, mecillinam, metampicillin, methicillin, mezlocillin, mezlocillin/sulbactam, nacubactam, nafcillin, oxacillin, penamecillin, penicillin/novobiocin, penicillin/sulbactam, phenethicillin, phenoxymethylpenicillin, piperacillin, piperacillin/sulbactam, piperacillin/tazobactam, piridicillin, pivampicillin, pivmecillinam, procaine benzylpenicillin, propicillin, sarmoxicillin, sulbactam, sulbenicillin, sultamicillin, talampicillin, tazobactam, temocillin, ticarcillin and ticarcillin/clavulanic acid)
+\item "polymyxins"\cr(colistin, polymyxin B and polymyxin B/polysorbate 80)
+\item "quinolones"\cr(besifloxacin, cinoxacin, ciprofloxacin, clinafloxacin, danofloxacin, delafloxacin, difloxacin, enoxacin, enrofloxacin, finafloxacin, fleroxacin, flumequine, garenoxacin, gatifloxacin, gemifloxacin, grepafloxacin, levofloxacin, levonadifloxacin, lomefloxacin, marbofloxacin, metioxate, miloxacin, moxifloxacin, nadifloxacin, nalidixic acid, nifuroquine, nitroxoline, norfloxacin, ofloxacin, orbifloxacin, oxolinic acid, pazufloxacin, pefloxacin, pipemidic acid, piromidic acid, pradofloxacin, premafloxacin, prulifloxacin, rosoxacin, rufloxacin, sarafloxacin, sitafloxacin, sparfloxacin, temafloxacin, tilbroquinol, tioxacin, tosufloxacin and trovafloxacin)
+\item "streptogramins"\cr(pristinamycin and quinupristin/dalfopristin)
+\item "tetracyclines"\cr(cetocycline, chlortetracycline, clomocycline, demeclocycline, doxycycline, eravacycline, lymecycline, metacycline, minocycline, omadacycline, oxytetracycline, penimepicycline, rolitetracycline, tetracycline and tigecycline)
+\item "tetracyclines_except_tgc"\cr(cetocycline, chlortetracycline, clomocycline, demeclocycline, doxycycline, eravacycline, lymecycline, metacycline, minocycline, omadacycline, oxytetracycline, penimepicycline, rolitetracycline and tetracycline)
+\item "trimethoprims"\cr(brodimoprim, sulfadiazine, sulfadiazine/tetroxoprim, sulfadiazine/trimethoprim, sulfadimethoxine, sulfadimidine, sulfadimidine/trimethoprim, sulfafurazole, sulfaisodimidine, sulfalene, sulfamazone, sulfamerazine, sulfamerazine/trimethoprim, sulfamethizole, sulfamethoxazole, sulfamethoxypyridazine, sulfametomidine, sulfametoxydiazine, sulfametrole/trimethoprim, sulfamoxole, sulfamoxole/trimethoprim, sulfanilamide, sulfaperin, sulfaphenazole, sulfapyridine, sulfathiazole, sulfathiourea, trimethoprim and trimethoprim/sulfamethoxazole)
+\item "ureidopenicillins"\cr(azlocillin, mezlocillin, piperacillin and piperacillin/tazobactam)
+}
 }
 }

--- a/man/eucast_rules.Rd
+++ b/man/eucast_rules.Rd
@@ -66,7 +66,7 @@ eucast_dosage(ab, administration = "iv", version_breakpoints = 11)
 The input of \code{x}, possibly with edited values of antibiotics. Or, if \code{verbose = TRUE}, a \link{data.frame} with all original and new values of the affected bug-drug combinations.
 }
 \description{
-Apply rules for clinical breakpoints and intrinsic resistance as defined by the European Committee on Antimicrobial Susceptibility Testing (EUCAST, \url{https://eucast.org}), see \emph{Source}. Use \code{\link[=eucast_dosage]{eucast_dosage()}} to get a \link{data.frame} with advised dosages of a certain bug-drug combination, which is based on the \link{dosage} data set.
+Apply rules for clinical breakpoints and intrinsic resistance as defined by the European Committee on Antimicrobial Susceptibility Testing (EUCAST, \url{https://www.eucast.org}), see \emph{Source}. Use \code{\link[=eucast_dosage]{eucast_dosage()}} to get a \link{data.frame} with advised dosages of a certain bug-drug combination, which is based on the \link{dosage} data set.

 To improve the interpretation of the antibiogram before EUCAST rules are applied, some non-EUCAST rules can applied at default, see \emph{Details}.
 }
@@ -74,15 +74,15 @@ To improve the interpretation of the antibiogram before EUCAST rules are applied
 \strong{Note:} This function does not translate MIC values to RSI values. Use \code{\link[=as.rsi]{as.rsi()}} for that. \cr
 \strong{Note:} When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr

-The file containing all EUCAST rules is located here: \url{https://github.com/msberends/AMR/blob/main/data-raw/eucast_rules.tsv}.  \strong{Note:} Old taxonomic names are replaced with the current taxonomy where applicable. For example, \emph{Ochrobactrum anthropi} was renamed to \emph{Brucella anthropi} in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The file used as input for this \code{AMR} package contains the taxonomy updated until \code{\link[=catalogue_of_life]{r CATALOGUE_OF_LIFE$yearmonth_LPSN}}.
+The file containing all EUCAST rules is located here: \url{https://github.com/msberends/AMR/blob/main/data-raw/eucast_rules.tsv}.  \strong{Note:} Old taxonomic names are replaced with the current taxonomy where applicable. For example, \emph{Ochrobactrum anthropi} was renamed to \emph{Brucella anthropi} in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The \code{AMR} package contains the full microbial taxonomy updated until 12 September, 2022, see \link{microorganisms}.
 \subsection{Custom Rules}{

 Custom rules can be created using \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}, e.g.:

-\if{html}{\out{<div class="sourceCode {r}">}}\preformatted{x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
                         AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I")

-eucast_rules(example_isolates, rules = "custom", custom_rules = x, info = FALSE)
+eucast_rules(example_isolates, rules = "custom", custom_rules = x)
 }\if{html}{\out{</div>}}
 }

--- a/man/figures/logo_col.png
+++ b/man/figures/logo_col.png
--- a/man/intrinsic_resistant.Rd
+++ b/man/intrinsic_resistant.Rd
@@ -5,7 +5,7 @@
 \alias{intrinsic_resistant}
 \title{Data Set with Bacterial Intrinsic Resistance}
 \format{
-A \link[tibble:tibble]{tibble} with 134,956 observations and 2 variables:
+A \link[tibble:tibble]{tibble} with 134,659 observations and 2 variables:
 \itemize{
 \item \code{mo}\cr Microorganism ID
 \item \code{ab}\cr Antibiotic ID
--- a/man/microorganisms.Rd
+++ b/man/microorganisms.Rd
@@ -3,64 +3,70 @@
 \docType{data}
 \name{microorganisms}
 \alias{microorganisms}
-\title{Data Set with 70,764 Microorganisms}
+\title{Data Set with 48,787 Microorganisms}
 \format{
-A \link[tibble:tibble]{tibble} with 70,764 observations and 16 variables:
+A \link[tibble:tibble]{tibble} with 48,787 observations and 22 variables:
 \itemize{
 \item \code{mo}\cr ID of microorganism as used by this package
-\item \code{fullname}\cr Full name, like \code{"Escherichia coli"}
+\item \code{fullname}\cr Full name, like \code{"Escherichia coli"}. For the taxonomic ranks genus, species and subspecies, this is the 'pasted' text of genus, species, and subspecies. For all taxonomic ranks higher than genus, this is the name of the taxon.
+\item \code{status} \cr Status of the taxon, either "accepted" or "synonym"
 \item \code{kingdom}, \code{phylum}, \code{class}, \code{order}, \code{family}, \code{genus}, \code{species}, \code{subspecies}\cr Taxonomic rank of the microorganism
-\item \code{rank}\cr Text of the taxonomic rank of the microorganism, like \code{"species"} or \code{"genus"}
-\item \code{ref}\cr Author(s) and year of concerning scientific publication
-\item \code{species_id}\cr ID of the species as used by the Catalogue of Life
-\item \code{source}\cr Either "CoL", "LPSN" or "manually added" (see \emph{Source})
+\item \code{rank}\cr Text of the taxonomic rank of the microorganism, such as \code{"species"} or \code{"genus"}
+\item \code{ref}\cr Author(s) and year of related scientific publication. This contains only the \emph{first surname} and year of the \emph{latest} authors, e.g. "Wallis \emph{et al.} 2006 \emph{emend.} Smith and Jones 2018" becomes "Smith \emph{et al.}, 2018". This field is directly retrieved from the source specified in the column \code{source}. Moreover, accents were removed to comply with CRAN that only allows ASCII characters, e.g. "Váňová" becomes "Vanova".
+\item \code{lpsn}\cr Identifier ('Record number') of the List of Prokaryotic names with Standing in Nomenclature (LPSN). This will be the first/highest LPSN identifier to keep one identifier per row. For example, \emph{Acetobacter ascendens} has LPSN Record number 7864 and 11011. Only the first is available in the \code{microorganisms} data set.
+\item \code{lpsn_parent}\cr LPSN identifier of the parent taxon
+\item \code{lpsn_renamed_to}\cr LPSN identifier of the currently valid taxon
+\item \code{gbif}\cr Identifier ('taxonID') of the Global Biodiversity Information Facility (GBIF)
+\item \code{gbif_parent}\cr GBIF identifier of the parent taxon
+\item \code{gbif_renamed_to}\cr GBIF identifier of the currently valid taxon
+\item \code{source}\cr Either "GBIF", "LPSN" or "manually added" (see \emph{Source})
 \item \code{prevalence}\cr Prevalence of the microorganism, see \code{\link[=as.mo]{as.mo()}}
-\item \code{snomed}\cr Systematized Nomenclature of Medicine (SNOMED) code of the microorganism, according to the US Edition of SNOMED CT from 1 September 2020 (see \emph{Source}). Use \code{\link[=mo_snomed]{mo_snomed()}} to retrieve it quickly, see \code{\link[=mo_property]{mo_property()}}.
+\item \code{snomed}\cr Systematized Nomenclature of Medicine (SNOMED) code of the microorganism, version of 1 July, 2021 (see \emph{Source}). Use \code{\link[=mo_snomed]{mo_snomed()}} to retrieve it quickly, see \code{\link[=mo_property]{mo_property()}}.
 }
 }
 \source{
-Catalogue of Life: 2019 Annual Checklist as currently implemented in this \code{AMR} package:
 \itemize{
-\item Annual Checklist (public online taxonomic database), \url{http://www.catalogueoflife.org}
-}
-
-List of Prokaryotic names with Standing in Nomenclature (5 October 2021) as currently implemented in this \code{AMR} package:
-\itemize{
-\item Parte, A.C., Sarda Carbasse, J., Meier-Kolthoff, J.P., Reimer, L.C. and Goker, M. (2020). List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}
-\item Parte, A.C. (2018). LPSN - List of Prokaryotic names with Standing in Nomenclature (bacterio.net), 20 years on. International Journal of Systematic and Evolutionary Microbiology, 68, 1825-1829; \doi{10.1099/ijsem.0.002786}
-\item Parte, A.C. (2014). LPSN - List of Prokaryotic names with Standing in Nomenclature. Nucleic Acids Research, 42, Issue D1, D613-D616; \doi{10.1093/nar/gkt1111}
-\item Euzeby, J.P. (1997). List of Bacterial Names with Standing in Nomenclature: a Folder Available on the Internet. International Journal of Systematic Bacteriology, 47, 590-592; \doi{10.1099/00207713-47-2-590}
-}
-
-US Edition of SNOMED CT from 1 September 2020 as currently implemented in this \code{AMR} package:
-\itemize{
-\item Retrieved from the Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS), OID 2.16.840.1.114222.4.11.1009, version 12; url: \url{https://phinvads.cdc.gov/vads/ViewValueSet.action?oid=2.16.840.1.114222.4.11.1009}
+\item Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on 12 September, 2022.
+\item GBIF Secretariat (November 26, 2021). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on 12 September, 2022.
+\item Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microoganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: \url{https://phinvads.cdc.gov}
 }
 }
 \usage{
 microorganisms
 }
 \description{
-A data set containing the full microbial taxonomy (\strong{last updated: 5 October 2021}) of six kingdoms from the Catalogue of Life (CoL) and the List of Prokaryotic names with Standing in Nomenclature (LPSN). MO codes can be looked up using \code{\link[=as.mo]{as.mo()}}.
+A data set containing the full microbial taxonomy (\strong{last updated: 12 September, 2022}) of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF). This data set is the backbone of this \code{AMR} package. MO codes can be looked up using \code{\link[=as.mo]{as.mo()}}.
 }
 \details{
-Please note that entries are only based on the Catalogue of Life and the LPSN (see below). Since these sources incorporate entries based on (recent) publications in the International Journal of Systematic and Evolutionary Microbiology (IJSEM), it can happen that the year of publication is sometimes later than one might expect.
+Please note that entries are only based on the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF) (see below). Since these sources incorporate entries based on (recent) publications in the International Journal of Systematic and Evolutionary Microbiology (IJSEM), it can happen that the year of publication is sometimes later than one might expect.

 For example, \emph{Staphylococcus pettenkoferi} was described for the first time in Diagnostic Microbiology and Infectious Disease in 2002 (\doi{10.1016/s0732-8893(02)00399-1}), but it was not before 2007 that a publication in IJSEM followed (\doi{10.1099/ijs.0.64381-0}). Consequently, the \code{AMR} package returns 2007 for \code{mo_year("S. pettenkoferi")}.
+}
+\section{Included Taxa}{
+
+Included taxonomic data are:
+\itemize{
+\item All ~34,000 (sub)species from the kingdoms of Archaea and Bacteria
+\item ~7,400 (sub)species from the kingdom of Fungi. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package. Only relevant fungi are covered (such as all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).
+\item ~4,900 (sub)species from the kingdom of Protozoa
+\item ~1,500 (sub)species from ~50 other relevant genera from the kingdom of Animalia (such as \emph{Strongyloides} and \emph{Taenia})
+\item All ~9,400 previously accepted names of all included (sub)species (these were taxonomically renamed)
+\item The complete taxonomic tree of all included (sub)species: from kingdom to subspecies
+\item The identifier of the parent taxons
+\item The year and first author of the related scientific publication
+}
 \subsection{Manual additions}{

 For convenience, some entries were added manually:
 \itemize{
 \item 11 entries of \emph{Streptococcus} (beta-haemolytic: groups A, B, C, D, F, G, H, K and unspecified; other: viridans, milleri)
 \item 2 entries of \emph{Staphylococcus} (coagulase-negative (CoNS) and coagulase-positive (CoPS))
-\item 3 entries of \emph{Trichomonas} (\emph{T. vaginalis}, and its family and genus)
-\item 4 entries of \emph{Toxoplasma} (\emph{T. gondii}, and its order, family and genus)
-\item 1 entry of \emph{Candida} (\emph{C.  krusei}), that is not (yet) in the Catalogue of Life
 \item 1 entry of \emph{Blastocystis} (\emph{B.  hominis}), although it officially does not exist (Noel \emph{et al.} 2005, PMID 15634993)
 \item 1 entry of \emph{Moraxella} (\emph{M. catarrhalis}), which was formally named \emph{Branhamella catarrhalis} (Catlin, 1970) though this change was never accepted within the field of clinical microbiology
 \item 5 other 'undefined' entries (unknown, unknown Gram negatives, unknown Gram positives, unknown yeast and unknown fungus)
-\item 6 families under the Enterobacterales order, according to Adeolu \emph{et al.} (2016, PMID 27620848), that are not (yet) in the Catalogue of Life
 }
+
+The syntax used to transform the original data to a cleansed \R format, can be found here: \url{https://github.com/msberends/AMR/blob/main/data-raw/reproduction_of_microorganisms.R}.
 }

 \subsection{Direct download}{
@@ -68,19 +74,12 @@ For convenience, some entries were added manually:
 Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}.
 }
 }
+
 \section{About the Records from LPSN (see \emph{Source})}{

+LPSN is the main source for bacteriological taxonomy of this \code{AMR} package.
+
 The List of Prokaryotic names with Standing in Nomenclature (LPSN) provides comprehensive information on the nomenclature of prokaryotes. LPSN is a free to use service founded by Jean P. Euzeby in 1997 and later on maintained by Aidan C. Parte.
-
-As of February 2020, the regularly augmented LPSN database at DSMZ is the basis of the new LPSN service. The new database was implemented for the Type-Strain Genome Server and augmented in 2018 to store all kinds of nomenclatural information. Data from the previous version of LPSN and from the Prokaryotic Nomenclature Up-to-date (PNU) service were imported into the new system. PNU had been established in 1993 as a service of the Leibniz Institute DSMZ, and was curated by Norbert Weiss, Manfred Kracht and Dorothea Gleim.
-}
-
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
 }

 \examples{
--- a/man/microorganisms.codes.Rd
+++ b/man/microorganisms.codes.Rd
@@ -3,9 +3,9 @@
 \docType{data}
 \name{microorganisms.codes}
 \alias{microorganisms.codes}
-\title{Data Set with 5,604 Common Microorganism Codes}
+\title{Data Set with 5,411 Common Microorganism Codes}
 \format{
-A \link[tibble:tibble]{tibble} with 5,604 observations and 2 variables:
+A \link[tibble:tibble]{tibble} with 5,411 observations and 2 variables:
 \itemize{
 \item \code{code}\cr Commonly used code of a microorganism
 \item \code{mo}\cr ID of the microorganism in the \link{microorganisms} data set
@@ -20,14 +20,6 @@ A data set containing commonly used codes for microorganisms, from laboratory sy
 \details{
 Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}.
 }
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
-}
-
 \examples{
 microorganisms.codes
 }
--- a/man/microorganisms.old.Rd
+++ b/man/microorganisms.old.Rd
@@ -1,44 +0,0 @@
-% Generated by roxygen2: do not edit by hand
-% Please edit documentation in R/data.R
-\docType{data}
-\name{microorganisms.old}
-\alias{microorganisms.old}
-\title{Data Set with Previously Accepted Taxonomic Names}
-\format{
-A \link[tibble:tibble]{tibble} with 14,338 observations and 4 variables:
-\itemize{
-\item \code{fullname}\cr Old full taxonomic name of the microorganism
-\item \code{fullname_new}\cr New full taxonomic name of the microorganism
-\item \code{ref}\cr Author(s) and year of concerning scientific publication
-\item \code{prevalence}\cr Prevalence of the microorganism, see \code{\link[=as.mo]{as.mo()}}
-}
-}
-\source{
-Catalogue of Life: Annual Checklist (public online taxonomic database), \url{http://www.catalogueoflife.org} (check included annual version with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}).
-
-Parte, A.C. (2018). LPSN - List of Prokaryotic names with Standing in Nomenclature (bacterio.net), 20 years on. International Journal of Systematic and Evolutionary Microbiology, 68, 1825-1829; \doi{10.1099/ijsem.0.002786}
-}
-\usage{
-microorganisms.old
-}
-\description{
-A data set containing old (previously valid or accepted) taxonomic names according to the Catalogue of Life. This data set is used internally by \code{\link[=as.mo]{as.mo()}}.
-}
-\details{
-Like all data sets in this package, this data set is publicly available for download in the following formats: R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. Please visit \href{https://msberends.github.io/AMR/articles/datasets.html}{our website for the download links}. The actual files are of course available on \href{https://github.com/msberends/AMR/tree/main/data-raw}{our GitHub repository}.
-}
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
-}
-
-\examples{
-microorganisms.old
-}
-\seealso{
-\code{\link[=as.mo]{as.mo()}} \code{\link[=mo_property]{mo_property()}} \link{microorganisms}
-}
-\keyword{datasets}
--- a/man/mo_matching_score.Rd
+++ b/man/mo_matching_score.Rd
@@ -14,6 +14,9 @@ mo_matching_score(x, n)
 \description{
 This algorithm is used by \code{\link[=as.mo]{as.mo()}} and all the \code{\link[=mo_property]{mo_*}} functions to determine the most probable match of taxonomic records based on user input.
 }
+\note{
+This algorithm was described in: Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03}.
+}
 \section{Matching Score for Microorganisms}{

 With ambiguous user input in \code{\link[=as.mo]{as.mo()}} and all the \code{\link[=mo_property]{mo_*}} functions, the returned results are chosen based on their matching score using \code{\link[=mo_matching_score]{mo_matching_score()}}. This matching score \eqn{m}, is calculated as:
@@ -25,18 +28,22 @@ where:
 \item \ifelse{html}{\out{<i>x</i> is the user input;}}{\eqn{x} is the user input;}
 \item \ifelse{html}{\out{<i>n</i> is a taxonomic name (genus, species, and subspecies);}}{\eqn{n} is a taxonomic name (genus, species, and subspecies);}
 \item \ifelse{html}{\out{<i>l<sub>n</sub></i> is the length of <i>n</i>;}}{l_n is the length of \eqn{n};}
-\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a>, which counts any insertion, deletion and substitution as 1 that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function, which counts any insertion, deletion and substitution as 1 that is needed to change \eqn{x} into \eqn{n};}
+\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a> (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n};}
 \item \ifelse{html}{\out{<i>p<sub>n</sub></i> is the human pathogenic prevalence group of <i>n</i>, as described below;}}{p_n is the human pathogenic prevalence group of \eqn{n}, as described below;}
 \item \ifelse{html}{\out{<i>k<sub>n</sub></i> is the taxonomic kingdom of <i>n</i>, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}}{l_n is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}
 }

-The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence. \strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales. \strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Alternaria}, \emph{Anaerosalibacter}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobacterium}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Blastocystis}, \emph{Branhamella}, \emph{Calymmatobacterium}, \emph{Candida}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Chaetomium}, \emph{Chryseobacterium}, \emph{Chryseomonas}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Hendersonula}, \emph{Hypomyces}, \emph{Koserella}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Nectria}, \emph{Ochroconis}, \emph{Oidiodendron}, \emph{Phoma}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Stomatococcus}, \emph{Treponema}, \emph{Trichoderma}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Tritirachium} or \emph{Ureaplasma}. \strong{Group 3} consists of all other microorganisms.
+The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence:
+
+\strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales.
+
+\strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acanthamoeba}, \emph{Acholeplasma}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Aedes}, \emph{Alistipes}, \emph{Alloprevotella}, \emph{Alternaria}, \emph{Amoeba}, \emph{Anaerosalibacter}, \emph{Ancylostoma}, \emph{Angiostrongylus}, \emph{Anisakis}, \emph{Anopheles}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Bergeyella}, \emph{Blastocystis}, \emph{Blastomyces}, \emph{Borrelia}, \emph{Brachyspira}, \emph{Branhamella}, \emph{Butyricimonas}, \emph{Candida}, \emph{Capillaria}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Cetobacterium}, \emph{Chaetomium}, \emph{Chlamydia}, \emph{Chlamydophila}, \emph{Chryseobacterium}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Contracaecum}, \emph{Cordylobia}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Deinococcus}, \emph{Demodex}, \emph{Dermatobia}, \emph{Dientamoeba}, \emph{Diphyllobothrium}, \emph{Dirofilaria}, \emph{Dysgonomonas}, \emph{Echinostoma}, \emph{Elizabethkingia}, \emph{Empedobacter}, \emph{Entamoeba}, \emph{Enterobius}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Fasciola}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Giardia}, \emph{Haloarcula}, \emph{Halobacterium}, \emph{Halococcus}, \emph{Hendersonula}, \emph{Heterophyes}, \emph{Histomonas}, \emph{Histoplasma}, \emph{Hymenolepis}, \emph{Hypomyces}, \emph{Hysterothylacium}, \emph{Leishmania}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Lucilia}, \emph{Lumbricus}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Metagonimus}, \emph{Meyerozyma}, \emph{Microsporidium}, \emph{Microsporum}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Myroides}, \emph{Necator}, \emph{Nectria}, \emph{Ochroconis}, \emph{Odoribacter}, \emph{Oesophagostomum}, \emph{Oidiodendron}, \emph{Opisthorchis}, \emph{Ornithobacterium}, \emph{Parabacteroides}, \emph{Pediculus}, \emph{Pedobacter}, \emph{Phlebotomus}, \emph{Phocaeicola}, \emph{Phocanema}, \emph{Phoma}, \emph{Pichia}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Pneumocystis}, \emph{Porphyromonas}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Pseudoterranova}, \emph{Pulex}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Riemerella}, \emph{Saccharomyces}, \emph{Sarcoptes}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sphingobacterium}, \emph{Spirometra}, \emph{Spiroplasma}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Streptobacillus}, \emph{Strongyloides}, \emph{Syngamus}, \emph{Taenia}, \emph{Tannerella}, \emph{Tenacibaculum}, \emph{Terrimonas}, \emph{Toxocara}, \emph{Treponema}, \emph{Trichinella}, \emph{Trichobilharzia}, \emph{Trichoderma}, \emph{Trichomonas}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Trichostrongylus}, \emph{Trichuris}, \emph{Tritirachium}, \emph{Trombicula}, \emph{Trypanosoma}, \emph{Tunga}, \emph{Ureaplasma}, \emph{Victivallis}, \emph{Wautersiella}, \emph{Weeksella} or \emph{Wuchereria}.
+
+\strong{Group 3} consists of all other microorganisms.

 All characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses.

-All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.079}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
-
-Since \code{AMR} version 1.8.1, common microorganism abbreviations are ignored in determining the matching score. These abbreviations are currently: AIEC, ATEC, BORSA, CRSM, DAEC, EAEC, EHEC, EIEC, EPEC, ETEC, GISA, MRPA, MRSA, MRSE, MSSA, MSSE, NMEC, PISP, PRSP, STEC, UPEC, VISA, VISP, VRE, VRSA and VRSP.
+All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.119}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
 }

 \section{Reference Data Publicly Available}{
--- a/man/mo_property.Rd
+++ b/man/mo_property.Rd
@@ -15,6 +15,7 @@
 \alias{mo_kingdom}
 \alias{mo_domain}
 \alias{mo_type}
+\alias{mo_status}
 \alias{mo_gramstain}
 \alias{mo_is_gram_negative}
 \alias{mo_is_gram_positive}
@@ -25,6 +26,7 @@
 \alias{mo_authors}
 \alias{mo_year}
 \alias{mo_lpsn}
+\alias{mo_gbif}
 \alias{mo_rank}
 \alias{mo_taxonomy}
 \alias{mo_synonyms}
@@ -32,76 +34,240 @@
 \alias{mo_url}
 \title{Get Properties of a Microorganism}
 \usage{
-mo_name(x, language = get_AMR_locale(), ...)
+mo_name(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_fullname(x, language = get_AMR_locale(), ...)
+mo_fullname(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_shortname(x, language = get_AMR_locale(), ...)
+mo_shortname(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_subspecies(x, language = get_AMR_locale(), ...)
+mo_subspecies(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_species(x, language = get_AMR_locale(), ...)
+mo_species(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_genus(x, language = get_AMR_locale(), ...)
+mo_genus(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_family(x, language = get_AMR_locale(), ...)
+mo_family(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_order(x, language = get_AMR_locale(), ...)
+mo_order(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_class(x, language = get_AMR_locale(), ...)
+mo_class(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_phylum(x, language = get_AMR_locale(), ...)
+mo_phylum(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_kingdom(x, language = get_AMR_locale(), ...)
+mo_kingdom(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_domain(x, language = get_AMR_locale(), ...)
+mo_domain(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_type(x, language = get_AMR_locale(), ...)
+mo_type(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_gramstain(x, language = get_AMR_locale(), ...)
+mo_status(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_is_gram_negative(x, language = get_AMR_locale(), ...)
+mo_gramstain(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_is_gram_positive(x, language = get_AMR_locale(), ...)
+mo_is_gram_negative(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_is_yeast(x, language = get_AMR_locale(), ...)
+mo_is_gram_positive(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_is_intrinsic_resistant(x, ab, language = get_AMR_locale(), ...)
+mo_is_yeast(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_snomed(x, language = get_AMR_locale(), ...)
+mo_is_intrinsic_resistant(
+  x,
+  ab,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_ref(x, language = get_AMR_locale(), ...)
+mo_snomed(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_authors(x, language = get_AMR_locale(), ...)
+mo_ref(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_year(x, language = get_AMR_locale(), ...)
+mo_authors(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_lpsn(x, language = get_AMR_locale(), ...)
+mo_year(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_rank(x, language = get_AMR_locale(), ...)
+mo_lpsn(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_taxonomy(x, language = get_AMR_locale(), ...)
+mo_gbif(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_synonyms(x, language = get_AMR_locale(), ...)
+mo_rank(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_info(x, language = get_AMR_locale(), ...)
+mo_taxonomy(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_url(x, open = FALSE, language = get_AMR_locale(), ...)
+mo_synonyms(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)

-mo_property(x, property = "fullname", language = get_AMR_locale(), ...)
+mo_info(
+  x,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)
+
+mo_url(
+  x,
+  open = FALSE,
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)
+
+mo_property(
+  x,
+  property = "fullname",
+  language = get_AMR_locale(),
+  keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
+  ...
+)
 }
 \arguments{
 \item{x}{any \link{character} (vector) that can be coerced to a valid microorganism code with \code{\link[=as.mo]{as.mo()}}. Can be left blank for auto-guessing the column containing microorganism codes if used in a data set, see \emph{Examples}.}

-\item{language}{language of the returned text, defaults to system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}}) and can be overwritten by setting the option \code{AMR_locale}, e.g. \code{options(AMR_locale = "de")}, see \link{translate}. Also used to translate text like "no growth". Use \code{language = NULL} or \code{language = ""} to prevent translation.}
+\item{language}{language to translate text like "no growth", which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})}

-\item{...}{other arguments passed on to \code{\link[=as.mo]{as.mo()}}, such as 'allow_uncertain' and 'ignore_pattern'}
+\item{keep_synonyms}{a \link{logical} to indicate if old, previously valid taxonomic names must be preserved and not be corrected to currently accepted names. The default is \code{FALSE}, which will return a note if old taxonomic names were processed. The default can be set with \code{options(AMR_keep_synonyms = TRUE)} or \code{options(AMR_keep_synonyms = FALSE)}.}
+
+\item{...}{other arguments passed on to \code{\link[=as.mo]{as.mo()}}, such as 'minimum_matching_score', 'ignore_pattern', and 'remove_from_input'}

 \item{ab}{any (vector of) text that can be coerced to a valid antibiotic code with \code{\link[=as.ab]{as.ab()}}}

 \item{open}{browse the URL using \code{\link[utils:browseURL]{browseURL()}}}

-\item{property}{one of the column names of the \link{microorganisms} data set: "mo", "fullname", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "species_id", "source", "prevalence" or "snomed", or must be \code{"shortname"}}
+\item{property}{one of the column names of the \link{microorganisms} data set: "mo", "fullname", "status", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies", "rank", "ref", "source", "lpsn", "lpsn_parent", "lpsn_renamed_to", "gbif", "gbif_parent", "gbif_renamed_to", "prevalence" or "snomed", or must be \code{"shortname"}}
 }
 \value{
 \itemize{
@@ -116,11 +282,11 @@ mo_property(x, property = "fullname", language = get_AMR_locale(), ...)
 Use these functions to return a specific property of a microorganism based on the latest accepted taxonomy. All input values will be evaluated internally with \code{\link[=as.mo]{as.mo()}}, which makes it possible to use microbial abbreviations, codes and names as input. See \emph{Examples}.
 }
 \details{
-All functions will return the most recently known taxonomic property according to the Catalogue of Life, except for \code{\link[=mo_ref]{mo_ref()}}, \code{\link[=mo_authors]{mo_authors()}} and \code{\link[=mo_year]{mo_year()}}. Please refer to this example, knowing that \emph{Escherichia blattae} was renamed to \emph{Shimwellia blattae} in 2010:
+All functions will, at default, keep old taxonomic properties. Please refer to this example, knowing that \emph{Escherichia blattae} was renamed to \emph{Shimwellia blattae} in 2010:
 \itemize{
 \item \code{mo_name("Escherichia blattae")} will return \code{"Shimwellia blattae"} (with a message about the renaming)
-\item \code{mo_ref("Escherichia blattae")} will return \code{"Burgess et al., 1973"} (with a message about the renaming)
-\item \code{mo_ref("Shimwellia blattae")} will return \code{"Priest et al., 2010"} (without a message)
+\item \code{mo_ref("Escherichia blattae", keep_synonyms = TRUE)} will return \code{"Burgess et al., 1973"} (with a warning about the renaming)
+\item \code{mo_ref("Shimwellia blattae", keep_synonyms = FALSE)} will return \code{"Priest et al., 2010"} (without a message)
 }

 The short name - \code{\link[=mo_shortname]{mo_shortname()}} - almost always returns the first character of the genus and the full species, like \code{"E. coli"}. Exceptions are abbreviations of staphylococci (such as \emph{"CoNS"}, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as \emph{"GBS"}, Group B Streptococci). Please bear in mind that e.g. \emph{E. coli} could mean \emph{Escherichia coli} (kingdom of Bacteria) as well as \emph{Entamoeba coli} (kingdom of Protozoa). Returning to the full name will be done using \code{\link[=as.mo]{as.mo()}} internally, giving priority to bacteria and human pathogens, i.e. \code{"E. coli"} will be considered \emph{Escherichia coli}. In other words, \code{mo_fullname(mo_shortname("Entamoeba coli"))} returns \code{"Escherichia coli"}.
@@ -137,7 +303,7 @@ All output \link[=translate]{will be translated} where possible.

 The function \code{\link[=mo_url]{mo_url()}} will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species.

-SNOMED codes - \code{\link[=mo_snomed]{mo_snomed()}} - are from the US Edition of SNOMED CT from 1 September 2020. See \emph{Source} and the \link{microorganisms} data set for more info.
+SNOMED codes - \code{\link[=mo_snomed]{mo_snomed()}} - are from the version of 1 July, 2021. See \emph{Source} and the \link{microorganisms} data set for more info.
 }
 \section{Matching Score for Microorganisms}{

@@ -150,38 +316,36 @@ where:
 \item \ifelse{html}{\out{<i>x</i> is the user input;}}{\eqn{x} is the user input;}
 \item \ifelse{html}{\out{<i>n</i> is a taxonomic name (genus, species, and subspecies);}}{\eqn{n} is a taxonomic name (genus, species, and subspecies);}
 \item \ifelse{html}{\out{<i>l<sub>n</sub></i> is the length of <i>n</i>;}}{l_n is the length of \eqn{n};}
-\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a>, which counts any insertion, deletion and substitution as 1 that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function, which counts any insertion, deletion and substitution as 1 that is needed to change \eqn{x} into \eqn{n};}
+\item \ifelse{html}{\out{<i>lev</i> is the <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance function</a> (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change <i>x</i> into <i>n</i>;}}{lev is the Levenshtein distance function (counting any insertion as 1, and any deletion or substitution as 2) that is needed to change \eqn{x} into \eqn{n};}
 \item \ifelse{html}{\out{<i>p<sub>n</sub></i> is the human pathogenic prevalence group of <i>n</i>, as described below;}}{p_n is the human pathogenic prevalence group of \eqn{n}, as described below;}
 \item \ifelse{html}{\out{<i>k<sub>n</sub></i> is the taxonomic kingdom of <i>n</i>, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}}{l_n is the taxonomic kingdom of \eqn{n}, set as Bacteria = 1, Fungi = 2, Protozoa = 3, Archaea = 4, others = 5.}
 }

-The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence. \strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales. \strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Alternaria}, \emph{Anaerosalibacter}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobacterium}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Blastocystis}, \emph{Branhamella}, \emph{Calymmatobacterium}, \emph{Candida}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Chaetomium}, \emph{Chryseobacterium}, \emph{Chryseomonas}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Hendersonula}, \emph{Hypomyces}, \emph{Koserella}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Nectria}, \emph{Ochroconis}, \emph{Oidiodendron}, \emph{Phoma}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Stomatococcus}, \emph{Treponema}, \emph{Trichoderma}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Tritirachium} or \emph{Ureaplasma}. \strong{Group 3} consists of all other microorganisms.
+The grouping into human pathogenic prevalence (\eqn{p}) is based on experience from several microbiological laboratories in the Netherlands in conjunction with international reports on pathogen prevalence:
+
+\strong{Group 1} (most prevalent microorganisms) consists of all microorganisms where the taxonomic class is Gammaproteobacteria or where the taxonomic genus is \emph{Enterococcus}, \emph{Staphylococcus} or \emph{Streptococcus}. This group consequently contains all common Gram-negative bacteria, such as \emph{Pseudomonas} and \emph{Legionella} and all species within the order Enterobacterales.
+
+\strong{Group 2} consists of all microorganisms where the taxonomic phylum is Proteobacteria, Firmicutes, Actinobacteria or Sarcomastigophora, or where the taxonomic genus is \emph{Absidia}, \emph{Acanthamoeba}, \emph{Acholeplasma}, \emph{Acremonium}, \emph{Actinotignum}, \emph{Aedes}, \emph{Alistipes}, \emph{Alloprevotella}, \emph{Alternaria}, \emph{Amoeba}, \emph{Anaerosalibacter}, \emph{Ancylostoma}, \emph{Angiostrongylus}, \emph{Anisakis}, \emph{Anopheles}, \emph{Apophysomyces}, \emph{Arachnia}, \emph{Aspergillus}, \emph{Aureobasidium}, \emph{Bacteroides}, \emph{Basidiobolus}, \emph{Beauveria}, \emph{Bergeyella}, \emph{Blastocystis}, \emph{Blastomyces}, \emph{Borrelia}, \emph{Brachyspira}, \emph{Branhamella}, \emph{Butyricimonas}, \emph{Candida}, \emph{Capillaria}, \emph{Capnocytophaga}, \emph{Catabacter}, \emph{Cetobacterium}, \emph{Chaetomium}, \emph{Chlamydia}, \emph{Chlamydophila}, \emph{Chryseobacterium}, \emph{Chrysonilia}, \emph{Cladophialophora}, \emph{Cladosporium}, \emph{Conidiobolus}, \emph{Contracaecum}, \emph{Cordylobia}, \emph{Cryptococcus}, \emph{Curvularia}, \emph{Deinococcus}, \emph{Demodex}, \emph{Dermatobia}, \emph{Dientamoeba}, \emph{Diphyllobothrium}, \emph{Dirofilaria}, \emph{Dysgonomonas}, \emph{Echinostoma}, \emph{Elizabethkingia}, \emph{Empedobacter}, \emph{Entamoeba}, \emph{Enterobius}, \emph{Exophiala}, \emph{Exserohilum}, \emph{Fasciola}, \emph{Flavobacterium}, \emph{Fonsecaea}, \emph{Fusarium}, \emph{Fusobacterium}, \emph{Giardia}, \emph{Haloarcula}, \emph{Halobacterium}, \emph{Halococcus}, \emph{Hendersonula}, \emph{Heterophyes}, \emph{Histomonas}, \emph{Histoplasma}, \emph{Hymenolepis}, \emph{Hypomyces}, \emph{Hysterothylacium}, \emph{Leishmania}, \emph{Lelliottia}, \emph{Leptosphaeria}, \emph{Leptotrichia}, \emph{Lucilia}, \emph{Lumbricus}, \emph{Malassezia}, \emph{Malbranchea}, \emph{Metagonimus}, \emph{Meyerozyma}, \emph{Microsporidium}, \emph{Microsporum}, \emph{Mortierella}, \emph{Mucor}, \emph{Mycocentrospora}, \emph{Mycoplasma}, \emph{Myroides}, \emph{Necator}, \emph{Nectria}, \emph{Ochroconis}, \emph{Odoribacter}, \emph{Oesophagostomum}, \emph{Oidiodendron}, \emph{Opisthorchis}, \emph{Ornithobacterium}, \emph{Parabacteroides}, \emph{Pediculus}, \emph{Pedobacter}, \emph{Phlebotomus}, \emph{Phocaeicola}, \emph{Phocanema}, \emph{Phoma}, \emph{Pichia}, \emph{Piedraia}, \emph{Pithomyces}, \emph{Pityrosporum}, \emph{Pneumocystis}, \emph{Porphyromonas}, \emph{Prevotella}, \emph{Pseudallescheria}, \emph{Pseudoterranova}, \emph{Pulex}, \emph{Rhizomucor}, \emph{Rhizopus}, \emph{Rhodotorula}, \emph{Riemerella}, \emph{Saccharomyces}, \emph{Sarcoptes}, \emph{Scolecobasidium}, \emph{Scopulariopsis}, \emph{Scytalidium}, \emph{Sphingobacterium}, \emph{Spirometra}, \emph{Spiroplasma}, \emph{Sporobolomyces}, \emph{Stachybotrys}, \emph{Streptobacillus}, \emph{Strongyloides}, \emph{Syngamus}, \emph{Taenia}, \emph{Tannerella}, \emph{Tenacibaculum}, \emph{Terrimonas}, \emph{Toxocara}, \emph{Treponema}, \emph{Trichinella}, \emph{Trichobilharzia}, \emph{Trichoderma}, \emph{Trichomonas}, \emph{Trichophyton}, \emph{Trichosporon}, \emph{Trichostrongylus}, \emph{Trichuris}, \emph{Tritirachium}, \emph{Trombicula}, \emph{Trypanosoma}, \emph{Tunga}, \emph{Ureaplasma}, \emph{Victivallis}, \emph{Wautersiella}, \emph{Weeksella} or \emph{Wuchereria}.
+
+\strong{Group 3} consists of all other microorganisms.

 All characters in \eqn{x} and \eqn{n} are ignored that are other than A-Z, a-z, 0-9, spaces and parentheses.

-All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.079}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
-
-Since \code{AMR} version 1.8.1, common microorganism abbreviations are ignored in determining the matching score. These abbreviations are currently: AIEC, ATEC, BORSA, CRSM, DAEC, EAEC, EHEC, EIEC, EPEC, ETEC, GISA, MRPA, MRSA, MRSE, MSSA, MSSE, NMEC, PISP, PRSP, STEC, UPEC, VISA, VISP, VRE, VRSA and VRSP.
-}
-
-\section{Catalogue of Life}{
-
-\if{html}{\figure{logo_col.png}{options: height="40" style=margin-bottom:"5"} \cr}
-This package contains the complete taxonomic tree of almost all microorganisms (~71,000 species) from the authoritative and comprehensive Catalogue of Life (CoL, \url{http://www.catalogueoflife.org}). The CoL is the most comprehensive and authoritative global index of species currently available. Nonetheless, we supplemented the CoL data with data from the List of Prokaryotic names with Standing in Nomenclature (LPSN, \href{https://lpsn.dsmz.de}{lpsn.dsmz.de}). This supplementation is needed until the \href{https://github.com/CatalogueOfLife/general}{CoL+ project} is finished, which we await.
-
-\link[=catalogue_of_life]{Click here} for more information about the included taxa. Check which versions of the CoL and LPSN were included in this package with \code{\link[=catalogue_of_life_version]{catalogue_of_life_version()}}.
+All matches are sorted descending on their matching score and for all user input values, the top match will be returned. This will lead to the effect that e.g., \code{"E. coli"} will return the microbial ID of \emph{Escherichia coli} (\eqn{m = 0.688}, a highly prevalent microorganism found in humans) and not \emph{Entamoeba coli} (\eqn{m = 0.119}, a less prevalent microorganism in humans), although the latter would alphabetically come first.
 }

 \section{Source}{

 \enumerate{
-\item Becker K \emph{et al.} \strong{Coagulase-Negative Staphylococci}. 2014. Clin Microbiol Rev. 27(4): 870-926; \doi{10.1128/CMR.00109-13}
-\item Becker K \emph{et al.} \strong{Implications of identifying the recently defined members of the \emph{S. aureus} complex, \emph{S. argenteus} and \emph{S. schweitzeri}: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).} 2019. Clin Microbiol Infect; \doi{10.1016/j.cmi.2019.02.028}
-\item Becker K \emph{et al.} \strong{Emergence of coagulase-negative staphylococci} 2020. Expert Rev Anti Infect Ther. 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
-\item Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571-95; \doi{10.1084/jem.57.4.571}
-\item Catalogue of Life: 2019 Annual Checklist, \url{http://www.catalogueoflife.org}
-\item List of Prokaryotic names with Standing in Nomenclature (5 October 2021), \doi{10.1099/ijsem.0.004332}
-\item US Edition of SNOMED CT from 1 September 2020, retrieved from the Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS), OID 2.16.840.1.114222.4.11.1009, version 12; url: \url{https://phinvads.cdc.gov/vads/ViewValueSet.action?oid=2.16.840.1.114222.4.11.1009}
+\item Berends MS \emph{et al.} (2022). \strong{AMR: An R Package for Working with Antimicrobial Resistance Data}. \emph{Journal of Statistical Software}, 104(3), 1-31; \doi{10.18637/jss.v104.i03}
+\item Becker K \emph{et al.} (2014). \strong{Coagulase-Negative Staphylococci.} \emph{Clin Microbiol Rev.} 27(4): 870-926; \doi{10.1128/CMR.00109-13}
+\item Becker K \emph{et al.} (2019). \strong{Implications of identifying the recently defined members of the \emph{S. aureus} complex, \emph{S. argenteus} and \emph{S. schweitzeri}: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).} \emph{Clin Microbiol Infect}; \doi{10.1016/j.cmi.2019.02.028}
+\item Becker K \emph{et al.} (2020). \strong{Emergence of coagulase-negative staphylococci} \emph{Expert Rev Anti Infect Ther.} 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
+\item Lancefield RC (1933). \strong{A serological differentiation of human and other groups of hemolytic streptococci}. \emph{J Exp Med.} 57(4): 571-95; \doi{10.1084/jem.57.4.571}
+\item Berends MS \emph{et al.} (2022). \strong{Trends in Occurrence and Phenotypic Resistance of Coagulase-Negative Staphylococci (CoNS) Found in Human Blood in the Northern Netherlands between 2013 and 2019} \emph{Microorganisms} 10(9), 1801; \doi{10.3390/microorganisms10091801}
+\item Parte, AC \emph{et al.} (2020). \strong{List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ.} International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612; \doi{10.1099/ijsem.0.004332}. Accessed from \url{https://lpsn.dsmz.de} on 12 September, 2022.
+\item GBIF Secretariat (November 26, 2021). GBIF Backbone Taxonomy. Checklist dataset \doi{10.15468/39omei}. Accessed from \url{https://www.gbif.org} on 12 September, 2022.
+\item Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name 'Microoganism', OID 2.16.840.1.114222.4.11.1009 (v12). URL: \url{https://phinvads.cdc.gov}
 }
 }

--- a/man/translate.Rd
+++ b/man/translate.Rd
@@ -17,7 +17,7 @@ reset_AMR_locale()
 translate_AMR(x, language = get_AMR_locale())
 }
 \arguments{
-\item{language}{language to choose. Use one of these supported language names or ISO-639-1 codes: "English" ("en"), "Chinese" ("zh"), "Danish" ("da"), "Dutch" ("nl"), "French" ("fr"), "German" ("de"), "Greek" ("el"), "Italian" ("it"), "Japanese" ("ja"), "Polish" ("pl"), "Portuguese" ("pt"), "Russian" ("ru"), "Spanish" ("es"), "Swedish" ("sv"), "Turkish" ("tr"), "Ukrainian" ("uk").}
+\item{language}{language to choose. Use one of these supported language names or ISO-639-1 codes: English (en), Chinese (zh), Danish (da), Dutch (nl), French (fr), German (de), Greek (el), Italian (it), Japanese (ja), Polish (pl), Portuguese (pt), Russian (ru), Spanish (es), Swedish (sv), Turkish (tr) or Ukrainian (uk).}

 \item{x}{text to translate}
 }
@@ -25,16 +25,29 @@ translate_AMR(x, language = get_AMR_locale())
 For language-dependent output of AMR functions, like \code{\link[=mo_name]{mo_name()}}, \code{\link[=mo_gramstain]{mo_gramstain()}}, \code{\link[=mo_type]{mo_type()}} and \code{\link[=ab_name]{ab_name()}}.
 }
 \details{
-The currently 16 supported languages are English, Chinese, Danish, Dutch, French, German, Greek, Italian, Japanese, Polish, Portuguese, Russian, Spanish, Swedish, Turkish and Ukrainian. All these languages have translations available for all antimicrobial agents and colloquial microorganism names.
+The currently 16 supported languages are English (en), Chinese (zh), Danish (da), Dutch (nl), French (fr), German (de), Greek (el), Italian (it), Japanese (ja), Polish (pl), Portuguese (pt), Russian (ru), Spanish (es), Swedish (sv), Turkish (tr) or Ukrainian (uk). All these languages have translations available for all antimicrobial agents and colloquial microorganism names.
+
+\strong{To silence language notes when this package loads} on a non-English operating system, you can set the option \code{AMR_locale} in your \code{.Rprofile} file like this:
+
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{# Open .Rprofile file
+utils::file.edit("~/.Rprofile")
+
+# Add e.g. Italian support to that file using:
+options(AMR_locale = "Italian")
+# or using:
+AMR::set_AMR_locale("Italian")
+
+# And save the file!
+}\if{html}{\out{</div>}}

 Please read about adding or updating a language in \href{https://github.com/msberends/AMR/wiki/}{our Wiki}.
 \subsection{Changing the Default Language}{

 The system language will be used at default (as returned by \code{Sys.getenv("LANG")} or, if \code{LANG} is not set, \code{\link[=Sys.getlocale]{Sys.getlocale("LC_COLLATE")}}), if that language is supported. But the language to be used can be overwritten in two ways and will be checked in this order:
 \enumerate{
-\item Setting the R option \code{AMR_locale}, either by using \code{set_AMR_locale()} or by running e.g. \code{options(AMR_locale = "de")}.
+\item Setting the R option \code{AMR_locale}, either by using e.g. \code{set_AMR_locale("German")} or by running e.g. \code{options(AMR_locale = "German")}.

-Note that setting an \R option only works in the same session. Save the command \code{options(AMR_locale = "(your language)")} to your \code{.Rprofile} file to apply it for every session.
+Note that setting an \R option only works in the same session. Save the command \code{options(AMR_locale = "(your language)")} to your \code{.Rprofile} file to apply it for every session. Run \code{utils::file.edit("~/.Rprofile")} to edit your \code{.Rprofile} file.
 \item Setting the system variable \code{LANGUAGE} or \code{LANG}, e.g. by adding \code{LANGUAGE="de_DE.utf8"} to your \code{.Renviron} file in your home directory.
 }

@@ -44,16 +57,22 @@ Thus, if the R option \code{AMR_locale} is set, the system variables \code{LANGU
 \examples{
 # Current settings (based on system language)
 ab_name("Ciprofloxacin")
-mo_name("Coagulase-negative Staphylococcus")
+mo_name("Coagulase-negative Staphylococcus (CoNS)")

 # setting another language
-set_AMR_locale("Greek")
-ab_name("Ciprofloxacin")
-mo_name("Coagulase-negative Staphylococcus")
-
 set_AMR_locale("Spanish")
 ab_name("Ciprofloxacin")
-mo_name("Coagulase-negative Staphylococcus")
+mo_name("Coagulase-negative Staphylococcus (CoNS)")
+
+# setting yet another language
+set_AMR_locale("Greek")
+ab_name("Ciprofloxacin")
+mo_name("Coagulase-negative Staphylococcus (CoNS)")
+
+# setting yet another language
+set_AMR_locale("Ukrainian")
+ab_name("Ciprofloxacin")
+mo_name("Coagulase-negative Staphylococcus (CoNS)")

 # set_AMR_locale() understands endonyms, English exonyms, and ISO-639-1:
 set_AMR_locale("Deutsch")