(v2.1.1.9071) update veterinary SIR interpretation, add only_fungi

2025-07-23 19:43:13 +02:00 · 2024-09-19 11:44:56 +02:00
parent 573c0346ed
commit ddb23b6e73
52 changed files with 290 additions and 162 deletions
--- a/.github/prehooks/commit-msg
+++ b/.github/prehooks/commit-msg
@ -1,38 +0,0 @@
-#!/bin/bash
-
-# ==================================================================== #
-# TITLE:                                                               #
-# AMR: An R Package for Working with Antimicrobial Resistance Data     #
-#                                                                      #
-# SOURCE CODE:                                                         #
-# https://github.com/msberends/AMR                                     #
-#                                                                      #
-# PLEASE CITE THIS SOFTWARE AS:                                        #
-# Berends MS, Luz CF, Friedrich AW, et al. (2022).                     #
-# AMR: An R Package for Working with Antimicrobial Resistance Data.    #
-# Journal of Statistical Software, 104(3), 1-31.                       #
-# https://doi.org/10.18637/jss.v104.i03                                #
-#                                                                      #
-# Developed at the University of Groningen and the University Medical  #
-# Center Groningen in The Netherlands, in collaboration with many      #
-# colleagues from around the world, see our website.                   #
-#                                                                      #
-# This R package is free software; you can freely use and distribute   #
-# it for both personal and commercial purposes under the terms of the  #
-# GNU General Public License version 2.0 (GNU GPL-2), as published by  #
-# the Free Software Foundation.                                        #
-# We created this package for both routine data analysis and academic  #
-# research and it was publicly released in the hope that it will be    #
-# useful, but it comes WITHOUT ANY WARRANTY OR LIABILITY.              #
-#                                                                      #
-# Visit our website for the full manual and a complete tutorial about  #
-# how to conduct AMR data analysis: https://msberends.github.io/AMR/   #
-# ==================================================================== #
-
-# always add these:
-git add data-raw/*
-git add man/*
-git add R/sysdata.rda
-git add NAMESPACE
-git add DESCRIPTION
-git add NEWS.md
--- a/DESCRIPTION
+++ b/DESCRIPTION
@ -56,5 +56,5 @@ BugReports: https://github.com/msberends/AMR/issues
 License: GPL-2 | file LICENSE
 Encoding: UTF-8
 LazyData: true
-RoxygenNote: 7.3.1
+RoxygenNote: 7.3.2
 Roxygen: list(markdown = TRUE)
--- a/NEWS.md
+++ b/NEWS.md
@ -3,7 +3,7 @@
 *(this beta version will eventually become v3.0. We're happy to reach a new major milestone soon, which will be all about the new One Health support! Install this beta using [the instructions here](https://msberends.github.io/AMR/#latest-development-version).)*

 #### A New Milestone: AMR v3.0 with One Health Support (= Human + Veterinary + Environmental)
-This package now supports not only tools for AMR data analysis in clinical settings, but also for veterinary and environmental microbiology. This was made possible through a collaboration with the [University of Prince Edward Island](https://www.upei.ca/avc), Canada. To celebrate this great improvement of the package, we also updated the package logo to reflect this change.
+This package now supports not only tools for AMR data analysis in clinical settings, but also for veterinary and environmental microbiology. This was made possible through a collaboration with the [University of Prince Edward Island's Atlantic Veterinary College](https://www.upei.ca/avc), Canada. To celebrate this great improvement of the package, we also updated the package logo to reflect this change.

 ## Breaking
 * Removed all functions and references that used the deprecated `rsi` class, which were all replaced with their `sir` equivalents over a year ago
@ -26,6 +26,9 @@ This package now supports not only tools for AMR data analysis in clinical setti
    * The `microorganisms` data set now contains additional columns `mycobank`, `mycobank_parent`, and `mycobank_renamed_to`
    * New function `mo_mycobank()` to get the MycoBank record number, analogous to existing functions `mo_lpsn()` and `mo_gbif()`
  * We've welcomed over 2,000 records from 2023, over 900 from 2024, and many thousands of new fungi
+* Improved support for mycologists:
+  * The `as.mo()` function now includes a new argument, `only_fungi` (TRUE/FALSE), which limits the results to fungi only. The algorithm normally prioritises bacteria, but setting `only_fungi = TRUE` ensures that only fungi are returned.
+  * You can also set this globally using the new R option `AMR_only_fungi`, e.g., `options(AMR_only_fungi = TRUE)`.
 * Other
  * New function `mo_group_members()` to retrieve the member microorganisms of a microorganism group. For example, `mo_group_members("Strep group C")` returns a vector of all microorganisms that are in that group.

@ -59,6 +62,7 @@ This package now supports not only tools for AMR data analysis in clinical setti
 * Fixed a bug for when `antibiogram()` returns an empty data set

 ## Other
+* Greatly updated and expanded documentation
 * Added Jordan Stull, Matthew Saab, and Javier Sanchez as contributors, to thank them for their valuable input


--- a/R/guess_ab_col.R
+++ b/R/guess_ab_col.R
@ -274,16 +274,18 @@ get_column_abx <- function(x,
      }
      if (names(out[i]) %in% names(duplicates)) {
        already_set_as <- out[unname(out) == unname(out[i])][1L]
-        warning_(
-          paste0(
-            "Column '", font_bold(out[i]), "' will not be used for ",
-            names(out)[i], " (", ab_name(names(out)[i], tolower = TRUE, language = NULL), ")",
-            ", as it is already set for ",
-            names(already_set_as), " (", ab_name(names(already_set_as), tolower = TRUE, language = NULL), ")"
-          ),
-          add_fn = font_red,
-          immediate = verbose
-        )
+        if (names(out)[i] != names(already_set_as)) {
+          warning_(
+            paste0(
+              "Column '", font_bold(out[i]), "' will not be used for ",
+              names(out)[i], " (", ab_name(names(out)[i], tolower = TRUE, language = NULL), ")",
+              ", as it is already set for ",
+              names(already_set_as), " (", ab_name(names(already_set_as), tolower = TRUE, language = NULL), ")"
+            ),
+            add_fn = font_red,
+            immediate = verbose
+          )
+        }
      }
    }
  }
--- a/R/mo.R
+++ b/R/mo.R
@ -42,13 +42,15 @@
 #' @param reference_df a [data.frame] to be used for extra reference when translating `x` to a valid [`mo`]. See [set_mo_source()] and [get_mo_source()] to automate the usage of your own codes (e.g. used in your analysis or organisation).
 #' @param ignore_pattern a Perl-compatible [regular expression][base::regex] (case-insensitive) of which all matches in `x` must return `NA`. This can be convenient to exclude known non-relevant input and can also be set with the [package option][AMR-options] [`AMR_ignore_pattern`][AMR-options], e.g. `options(AMR_ignore_pattern = "(not reported|contaminated flora)")`.
 #' @param cleaning_regex a Perl-compatible [regular expression][base::regex] (case-insensitive) to clean the input of `x`. Every matched part in `x` will be removed. At default, this is the outcome of [mo_cleaning_regex()], which removes texts between brackets and texts such as "species" and "serovar". The default can be set with the [package option][AMR-options] [`AMR_cleaning_regex`][AMR-options].
+#' @param only_fungi a [logical] to indicate if only fungi must be found, making sure that e.g. misspellings always return records from the kingdom of Fungi. This can be set globally for [all microorganism functions][mo_property()] with the [package option][AMR-options] [`AMR_only_fungi`][AMR-options], i.e. `options(AMR_only_fungi = TRUE)`.
 #' @param language language to translate text like "no growth", which defaults to the system language (see [get_AMR_locale()])
-#' @param info a [logical] to indicate if a progress bar should be printed if more than 25 items are to be coerced - the default is `TRUE` only in interactive mode
+#' @param info a [logical] to indicate that info must be printed, e.g. a progress bar when more than 25 items are to be coerced, or a list with old taxonomic names. The default is `TRUE` only in interactive mode.
 #' @param ... other arguments passed on to functions
 #' @rdname as.mo
 #' @aliases mo
 #' @details
-#' A microorganism (MO) code from this package (class: [`mo`]) is human readable and typically looks like these examples:
+#' A microorganism (MO) code from this package (class: [`mo`]) is human-readable and typically looks like these examples:
+#' 
 #' ```
 #'   Code               Full name
 #'   ---------------    --------------------------------------
@ -60,50 +62,74 @@
 #'   |   |    |    \---> subspecies, a 3-5 letter acronym
 #'   |   |    \----> species, a 3-6 letter acronym
 #'   |   \----> genus, a 4-8 letter acronym
-#'   \----> taxonomic kingdom: A (Archaea), AN (Animalia), B (Bacteria),
-#'                             F (Fungi), PL (Plantae), P (Protozoa)
+#'   \----> kingdom: A (Archaea), AN (Animalia), B (Bacteria),
+#'                   C (Chromista), F (Fungi), PL (Plantae),
+#'                   P (Protozoa)
 #' ```
 #'
-#' Values that cannot be coerced will be considered 'unknown' and will be returned as the MO code `UNKNOWN` with a warning.
+#' Values that cannot be coerced will be considered 'unknown' and will return the MO code `UNKNOWN` with a warning.
 #'
 #' Use the [`mo_*`][mo_property()] functions to get properties based on the returned code, see *Examples*.
 #'
-#' The [as.mo()] function uses a novel [matching score algorithm][mo_matching_score()] (see *Matching Score for Microorganisms* below) to match input against the [available microbial taxonomy][microorganisms] in this package. This will lead to the effect that e.g. `"E. coli"` (a microorganism highly prevalent in humans) will return the microbial ID of *Escherichia coli* and not *Entamoeba coli* (a microorganism less prevalent in humans), although the latter would alphabetically come first.
-#'
-#' With `Becker = TRUE`, the following `r length(MO_CONS[MO_CONS != "B_STPHY_CONS"])` staphylococci will be converted to the **coagulase-negative group**: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_CONS[MO_CONS != "B_STPHY_CONS"], keep_synonyms = TRUE)), quotes = "*")`.\cr The following `r length(MO_COPS[MO_COPS != "B_STPHY_COPS"])` staphylococci will be converted to the **coagulase-positive group**: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_COPS[MO_COPS != "B_STPHY_COPS"], keep_synonyms = TRUE)), quotes = "*")`.
-#'
-#' With `Lancefield = TRUE`, the following streptococci will be converted to their corresponding Lancefield group: `r vector_and(gsub("Streptococcus", "S.", paste0("*", mo_name(MO_LANCEFIELD, keep_synonyms = TRUE), "* (", mo_species(MO_LANCEFIELD, keep_synonyms = TRUE, Lancefield = TRUE), ")")), quotes = FALSE)`.
+#' The [as.mo()] function uses a novel and scientifically validated (\doi{10.18637/jss.v104.i03}) matching score algorithm (see *Matching Score for Microorganisms* below) to match input against the [available microbial taxonomy][microorganisms] in this package. This means that e.g. `"E. coli"` (a microorganism highly prevalent in humans) will return the microbial ID of *Escherichia coli* and not *Entamoeba coli* (a microorganism less prevalent in humans), although the latter would alphabetically come first.
 #'
 #' ### Coping with Uncertain Results
 #'
-#' Results of non-exact taxonomic input are based on their [matching score][mo_matching_score()]. The lowest allowed score can be set with the `minimum_matching_score` argument. At default this will be determined based on the character length of the input, and the [taxonomic kingdom][microorganisms] and [human pathogenicity][mo_matching_score()] of the taxonomic outcome. If values are matched with uncertainty, a message will be shown to suggest the user to evaluate the results with [mo_uncertainties()], which returns a [data.frame] with all specifications.
+#' Results of non-exact taxonomic input are based on their [matching score][mo_matching_score()]. The lowest allowed score can be set with the `minimum_matching_score` argument. By default, this is determined based on the character length of the input, the [taxonomic kingdom][microorganisms], and the [human pathogenicity][mo_matching_score()] of the taxonomic outcome. If values are matched with uncertainty, a message will suggest that the user inspect the results with [mo_uncertainties()], which returns a [data.frame] with all specifications.
 #'
-#' To increase the quality of matching, the `cleaning_regex` argument can be used to clean the input (i.e., `x`). This must be a [regular expression][base::regex] that matches parts of the input that should be removed before the input is matched against the [available microbial taxonomy][microorganisms]. It will be matched Perl-compatible and case-insensitive. The default value of `cleaning_regex` is the outcome of the helper function [mo_cleaning_regex()].
+#' To increase the quality of matching, the `cleaning_regex` argument is used to clean the input. This must be a [regular expression][base::regex] that matches parts of the input that should be removed before the input is matched against the [available microbial taxonomy][microorganisms]. Matching is Perl-compatible and case-insensitive. The default value of `cleaning_regex` is the outcome of the helper function [mo_cleaning_regex()].
 #'
 #' There are three helper functions that can be run after using the [as.mo()] function:
 #' - Use [mo_uncertainties()] to get a [data.frame] that prints in a pretty format with all taxonomic names that were guessed. The output contains the matching score for all matches (see *Matching Score for Microorganisms* below).
 #' - Use [mo_failures()] to get a [character] [vector] with all values that could not be coerced to a valid value.
 #' - Use [mo_renamed()] to get a [data.frame] with all values that could be coerced based on old, previously accepted taxonomic names.
 #'
-#' ### Microbial Prevalence of Pathogens in Humans
+#' ### For Mycologists
+#' 
+#' The [matching score algorithm][mo_matching_score()] gives precedence to bacteria over fungi. If you are only analysing fungi, be sure to use `only_fungi = TRUE`, or better yet, add this to your code and run it once every session:
+#' 
+#' ```r
+#' options(AMR_only_fungi = TRUE)
+#' ```
+#' 
+#' This will make sure that no bacteria or other 'non-fungi' will be returned by [as.mo()], or any of the [`mo_*`][mo_property()] functions.
 #'
-#' The coercion rules consider the prevalence of microorganisms in humans, which is available as the `prevalence` column in the [microorganisms] data set. The grouping into human pathogenic prevalence is explained in the section *Matching Score for Microorganisms* below.
+#' ### Coagulase-negative and Coagulase-positive Staphylococci
+#' 
+#' With `Becker = TRUE`, the following staphylococci will be converted to their corresponding coagulase group:
+#' 
+#' * Coagulase-negative: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_CONS[MO_CONS != "B_STPHY_CONS"], keep_synonyms = TRUE)), quotes = "*")`
+#' * Coagulase-positive: `r vector_and(gsub("Staphylococcus", "S.", mo_name(MO_COPS[MO_COPS != "B_STPHY_COPS"], keep_synonyms = TRUE)), quotes = "*")`
+#' 
+#' This is based on:
+#' 
+#' * Becker K *et al.* (2014). **Coagulase-Negative Staphylococci.** *Clin Microbiol Rev.* 27(4): 870-926; \doi{10.1128/CMR.00109-13}
+#' * Becker K *et al.* (2019). **Implications of identifying the recently defined members of the *S. aureus* complex, *S. argenteus* and *S. schweitzeri*: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).** *Clin Microbiol Infect*; \doi{10.1016/j.cmi.2019.02.028}
+#' * Becker K *et al.* (2020). **Emergence of coagulase-negative staphylococci.** *Expert Rev Anti Infect Ther.* 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
+#' 
+#' For newly named staphylococcal species, such as *S. brunensis* (2024) and *S. shinii* (2023), we looked up the scientific reference to make sure the species are considered for the correct coagulase group.
+#' 
+#' ### Lancefield Groups in Streptococci
+#' 
+#' With `Lancefield = TRUE`, the following streptococci will be converted to their corresponding Lancefield group:
+#' 
+#' * `r paste(apply(aggregate(mo_name ~ mo_group_name, data = microorganisms.groups[microorganisms.groups$mo_group_name %like_case% "Streptococcus Group [A-Z]$", ], FUN = function(x) vector_and(gsub("Streptococcus", "S.", x, fixed = TRUE), quotes = "*", sort = TRUE)), 1, function(row) paste(row["mo_group_name"], ": ", row["mo_name"], sep = "")), collapse = "\n* ")`
+#' 
+#' This is based on:
+#' 
+#' * Lancefield RC (1933). **A serological differentiation of human and other groups of hemolytic streptococci.** *J Exp Med.* 57(4): 571-95; \doi{10.1084/jem.57.4.571}
+#' 
 #' @inheritSection mo_matching_score Matching Score for Microorganisms
 #'
 #  (source as a section here, so it can be inherited by other man pages)
 #' @section Source:
-#' 1. Berends MS *et al.* (2022). **AMR: An R Package for Working with Antimicrobial Resistance Data**. *Journal of Statistical Software*, 104(3), 1-31; \doi{10.18637/jss.v104.i03}
-#' 2. Becker K *et al.* (2014). **Coagulase-Negative Staphylococci.** *Clin Microbiol Rev.* 27(4): 870-926; \doi{10.1128/CMR.00109-13}
-#' 3. Becker K *et al.* (2019). **Implications of identifying the recently defined members of the *S. aureus* complex, *S. argenteus* and *S. schweitzeri*: A position paper of members of the ESCMID Study Group for staphylococci and Staphylococcal Diseases (ESGS).** *Clin Microbiol Infect*; \doi{10.1016/j.cmi.2019.02.028}
-#' 4. Becker K *et al.* (2020). **Emergence of coagulase-negative staphylococci.** *Expert Rev Anti Infect Ther.* 18(4):349-366; \doi{10.1080/14787210.2020.1730813}
-#' 5. Lancefield RC (1933). **A serological differentiation of human and other groups of hemolytic streptococci.** *J Exp Med.* 57(4): 571-95; \doi{10.1084/jem.57.4.571}
-#' 6. Berends MS *et al.* (2022). **Trends in Occurrence and Phenotypic Resistance of Coagulase-Negative Staphylococci (CoNS) Found in Human Blood in the Northern Netherlands between 2013 and 2019/** *Microorganisms* 10(9), 1801; \doi{10.3390/microorganisms10091801}
-#' 7. `r TAXONOMY_VERSION$LPSN$citation` Accessed from <`r TAXONOMY_VERSION$LPSN$url`> on `r documentation_date(TAXONOMY_VERSION$LPSN$accessed_date)`.
-#' 8. `r TAXONOMY_VERSION$MycoBank$citation` Accessed from <`r TAXONOMY_VERSION$MycoBank$url`> on `r documentation_date(TAXONOMY_VERSION$MycoBank$accessed_date)`.
-#' 9. `r TAXONOMY_VERSION$GBIF$citation` Accessed from <`r TAXONOMY_VERSION$GBIF$url`> on `r documentation_date(TAXONOMY_VERSION$GBIF$accessed_date)`.
-#' 10. `r TAXONOMY_VERSION$BacDive$citation` Accessed from <`r TAXONOMY_VERSION$BacDive$url`> on `r documentation_date(TAXONOMY_VERSION$BacDive$accessed_date)`.
-#' 11. `r TAXONOMY_VERSION$SNOMED$citation` URL: <`r TAXONOMY_VERSION$SNOMED$url`>
-#' 12. Bartlett A *et al.* (2022). **A comprehensive list of bacterial pathogens infecting humans** *Microbiology* 168:001269; \doi{10.1099/mic.0.001269}
+#' * Berends MS *et al.* (2022). **AMR: An R Package for Working with Antimicrobial Resistance Data**. *Journal of Statistical Software*, 104(3), 1-31; \doi{10.18637/jss.v104.i03}
+#' * `r TAXONOMY_VERSION$LPSN$citation` Accessed from <`r TAXONOMY_VERSION$LPSN$url`> on `r documentation_date(TAXONOMY_VERSION$LPSN$accessed_date)`.
+#' * `r TAXONOMY_VERSION$MycoBank$citation` Accessed from <`r TAXONOMY_VERSION$MycoBank$url`> on `r documentation_date(TAXONOMY_VERSION$MycoBank$accessed_date)`.
+#' * `r TAXONOMY_VERSION$GBIF$citation` Accessed from <`r TAXONOMY_VERSION$GBIF$url`> on `r documentation_date(TAXONOMY_VERSION$GBIF$accessed_date)`.
+#' * `r TAXONOMY_VERSION$BacDive$citation` Accessed from <`r TAXONOMY_VERSION$BacDive$url`> on `r documentation_date(TAXONOMY_VERSION$BacDive$accessed_date)`.
+#' * `r TAXONOMY_VERSION$SNOMED$citation` URL: <`r TAXONOMY_VERSION$SNOMED$url`>
+#' * Bartlett A *et al.* (2022). **A comprehensive list of bacterial pathogens infecting humans** *Microbiology* 168:001269; \doi{10.1099/mic.0.001269}
 #' @export
 #' @return A [character] [vector] with additional class [`mo`]
 #' @seealso [microorganisms] for the [data.frame] that is being used to determine IDs.
@ -161,6 +187,7 @@ as.mo <- function(x,
                  reference_df = get_mo_source(),
                  ignore_pattern = getOption("AMR_ignore_pattern", NULL),
                  cleaning_regex = getOption("AMR_cleaning_regex", mo_cleaning_regex()),
+                  only_fungi = getOption("AMR_only_fungi", FALSE),
                  language = get_AMR_locale(),
                  info = interactive(),
                  ...) {
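
The new `only_fungi` argument takes its default from an R option, so it can be set once per session or overridden per call. A minimal base R sketch of that pattern follows; `lookup_demo()` is a hypothetical stand-in and not a function from the package.

```r
# Hypothetical stand-in that reads its default from an R option, mirroring
# `only_fungi = getOption("AMR_only_fungi", FALSE)` in the signature above.
lookup_demo <- function(x, only_fungi = getOption("AMR_only_fungi", FALSE)) {
  stopifnot(is.logical(only_fungi), length(only_fungi) == 1)
  if (isTRUE(only_fungi)) "fungi only" else "all kingdoms"
}

old <- options(AMR_only_fungi = NULL)                 # start from an unset option
res_default <- lookup_demo("x")                       # option unset: falls back to FALSE
options(AMR_only_fungi = TRUE)                        # set globally for the session
res_global <- lookup_demo("x")                        # default is now TRUE
res_override <- lookup_demo("x", only_fungi = FALSE)  # explicit argument wins
options(old)                                          # restore the previous option value
```

Because default arguments are evaluated when the function is called, changing the option mid-session affects later calls without redefining anything.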
@ -172,6 +199,7 @@ as.mo <- function(x,
  meet_criteria(reference_df, allow_class = "data.frame", allow_NULL = TRUE)
  meet_criteria(ignore_pattern, allow_class = "character", has_length = 1, allow_NULL = TRUE)
  meet_criteria(cleaning_regex, allow_class = "character", has_length = 1, allow_NULL = TRUE)
+  meet_criteria(only_fungi, allow_class = "logical", has_length = 1)
  language <- validate_language(language)
  meet_criteria(info, allow_class = "logical", has_length = 1)
  
@ -225,7 +253,7 @@ as.mo <- function(x,
  out[is.na(out)] <- convert_colloquial_input(x[is.na(out)])
  # From previous hits in this session ----
  old <- out
-  out[is.na(out) & paste(x, minimum_matching_score) %in% AMR_env$mo_previously_coerced$x] <- AMR_env$mo_previously_coerced$mo[match(paste(x, minimum_matching_score)[is.na(out) & paste(x, minimum_matching_score) %in% AMR_env$mo_previously_coerced$x], AMR_env$mo_previously_coerced$x)]
+  out[is.na(out) & paste(x, minimum_matching_score, only_fungi) %in% AMR_env$mo_previously_coerced$x] <- AMR_env$mo_previously_coerced$mo[match(paste(x, minimum_matching_score, only_fungi)[is.na(out) & paste(x, minimum_matching_score, only_fungi) %in% AMR_env$mo_previously_coerced$x], AMR_env$mo_previously_coerced$x)]
  new <- out
  if (isTRUE(info) && message_not_thrown_before("as.mo", old, new, entire_session = TRUE) && any(is.na(old) & !is.na(new), na.rm = TRUE)) {
    message_(
@ -256,6 +284,11 @@ as.mo <- function(x,
    
    msg <- character(0)
    
+    MO_lookup_current <- AMR_env$MO_lookup
+    if (isTRUE(only_fungi)) {
+      MO_lookup_current <- MO_lookup_current[MO_lookup_current$kingdom == "Fungi", , drop = FALSE]
+    }
+    
    # run it
    x_coerced <- vapply(FUN.VALUE = character(1), x_unique, function(x_search) {
      progress$tick()
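
The hunk above narrows the lookup table once, before the per-value search starts. A small sketch of the same subsetting on a toy table, assuming only a `fullname` and a `kingdom` column (the real lookup table has more columns); `restrict_lookup()` is a hypothetical helper name.

```r
# Toy lookup table; the real one is AMR_env$MO_lookup inside the package.
lookup <- data.frame(
  fullname = c("Escherichia coli", "Candida albicans", "Aspergillus fumigatus"),
  kingdom = c("Bacteria", "Fungi", "Fungi"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only fungal rows when only_fungi is TRUE.
restrict_lookup <- function(lookup, only_fungi = FALSE) {
  if (isTRUE(only_fungi)) {
    # drop = FALSE keeps the result a data.frame, even for a single column
    lookup <- lookup[lookup$kingdom == "Fungi", , drop = FALSE]
  }
  lookup
}

fungi_only <- restrict_lookup(lookup, only_fungi = TRUE)
everything <- restrict_lookup(lookup, only_fungi = FALSE)
```

Filtering up front means every later `%in%`, `match()`, and `which()` call in the search sees fungal rows only, which is why the rest of the diff swaps `AMR_env$MO_lookup` for `MO_lookup_current`.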
@ -271,8 +304,8 @@ as.mo <- function(x,
      x_search_cleaned[x_search_cleaned == toupper(x_search_cleaned)] <- x_out[x_search_cleaned == toupper(x_search_cleaned)]
      
      # first check if cleaning led to an exact result, case-insensitive
-      if (x_out %in% AMR_env$MO_lookup$fullname_lower) {
-        return(as.character(AMR_env$MO_lookup$mo[match(x_out, AMR_env$MO_lookup$fullname_lower)]))
+      if (x_out %in% MO_lookup_current$fullname_lower) {
+        return(as.character(MO_lookup_current$mo[match(x_out, MO_lookup_current$fullname_lower)]))
      }
      
      # input must not be too short
@ -286,46 +319,46 @@ as.mo <- function(x,
      # do a pre-match on first character (and if it contains a space, first chars of first two terms)
      if (length(x_parts) %in% c(2, 3)) {
        # for genus + species + subspecies
-        if (paste(x_parts[1:2], collapse = " ") %in% AMR_env$MO_lookup$fullname_lower) {
-          filtr <- which(AMR_env$MO_lookup$fullname_lower %like% paste(x_parts[1:2], collapse = " "))
+        if (paste(x_parts[1:2], collapse = " ") %in% MO_lookup_current$fullname_lower) {
+          filtr <- which(MO_lookup_current$fullname_lower %like% paste(x_parts[1:2], collapse = " "))
        } else if (nchar(gsub("[^a-z]", "", x_parts[1], perl = TRUE)) <= 3) {
-          filtr <- which(AMR_env$MO_lookup$full_first == substr(x_parts[1], 1, 1) &
-                           (AMR_env$MO_lookup$species_first == substr(x_parts[2], 1, 1) |
-                              AMR_env$MO_lookup$subspecies_first == substr(x_parts[2], 1, 1) |
-                              AMR_env$MO_lookup$subspecies_first == substr(x_parts[3], 1, 1)))
+          filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) &
+                           (MO_lookup_current$species_first == substr(x_parts[2], 1, 1) |
+                              MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) |
+                              MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1)))
        } else {
-          filtr <- which(AMR_env$MO_lookup$full_first == substr(x_parts[1], 1, 1) |
-                           AMR_env$MO_lookup$species_first == substr(x_parts[2], 1, 1) |
-                           AMR_env$MO_lookup$subspecies_first == substr(x_parts[2], 1, 1) |
-                           AMR_env$MO_lookup$subspecies_first == substr(x_parts[3], 1, 1))
+          filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) |
+                           MO_lookup_current$species_first == substr(x_parts[2], 1, 1) |
+                           MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) |
+                           MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1))
        }
      } else if (length(x_parts) > 3) {
        first_chars <- paste0("(^| )[", paste(substr(x_parts, 1, 1), collapse = ""), "]")
-        filtr <- which(AMR_env$MO_lookup$full_first %like_case% first_chars)
+        filtr <- which(MO_lookup_current$full_first %like_case% first_chars)
      } else if (nchar(x_out) == 3) {
        # no space and 3 characters - probably a code such as SAU or ECO
        msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", totitle(substr(x_out, 1, 1)), AMR_env$dots, " ", substr(x_out, 2, 3), AMR_env$dots, "\""))
-        filtr <- which(AMR_env$MO_lookup$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 1), ".* ", substr(x_out, 2, 3)))
+        filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 1), ".* ", substr(x_out, 2, 3)))
      } else if (nchar(x_out) == 4) {
        # no space and 4 characters - probably a code such as STAU or ESCO
        msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", totitle(substr(x_out, 1, 2)), AMR_env$dots, " ", substr(x_out, 3, 4), AMR_env$dots, "\""))
-        filtr <- which(AMR_env$MO_lookup$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 2), ".* ", substr(x_out, 3, 4)))
+        filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", substr(x_out, 1, 2), ".* ", substr(x_out, 3, 4)))
      } else if (nchar(x_out) <= 6) {
        # no space and 5-6 characters - probably a code such as STAAUR or ESCCOL
        first_part <- paste0(substr(x_out, 1, 2), "[a-z]*", substr(x_out, 3, 3))
        second_part <- substr(x_out, 4, nchar(x_out))
        msg <<- c(msg, paste0("Input \"", x_search, "\" was assumed to be a microorganism code - tried to match on \"", gsub("[a-z]*", AMR_env$dots, totitle(first_part), fixed = TRUE), " ", second_part, AMR_env$dots, "\""))
-        filtr <- which(AMR_env$MO_lookup$fullname_lower %like_case% paste0("(^| )", first_part, ".* ", second_part))
+        filtr <- which(MO_lookup_current$fullname_lower %like_case% paste0("(^| )", first_part, ".* ", second_part))
      } else {
        # for genus or species or subspecies
-        filtr <- which(AMR_env$MO_lookup$full_first == substr(x_parts, 1, 1) |
-                         AMR_env$MO_lookup$species_first == substr(x_parts, 1, 1) |
-                         AMR_env$MO_lookup$subspecies_first == substr(x_parts, 1, 1))
+        filtr <- which(MO_lookup_current$full_first == substr(x_parts, 1, 1) |
+                         MO_lookup_current$species_first == substr(x_parts, 1, 1) |
+                         MO_lookup_current$subspecies_first == substr(x_parts, 1, 1))
      }
      if (length(filtr) == 0) {
-        mo_to_search <- AMR_env$MO_lookup$fullname
+        mo_to_search <- MO_lookup_current$fullname
      } else {
-        mo_to_search <- AMR_env$MO_lookup$fullname[filtr]
+        mo_to_search <- MO_lookup_current$fullname[filtr]
      }
      
      AMR_env$mo_to_search <- mo_to_search
@ -334,9 +367,9 @@ as.mo <- function(x,
      if (is.null(minimum_matching_score)) {
        minimum_matching_score_current <- min(0.6, min(10, nchar(x_search_cleaned)) * 0.08)
        # correct back for prevalence
-        minimum_matching_score_current <- minimum_matching_score_current / AMR_env$MO_lookup$prevalence[match(mo_to_search, AMR_env$MO_lookup$fullname)]
+        minimum_matching_score_current <- minimum_matching_score_current / MO_lookup_current$prevalence[match(mo_to_search, MO_lookup_current$fullname)]
        # correct back for kingdom
-        minimum_matching_score_current <- minimum_matching_score_current / AMR_env$MO_lookup$kingdom_index[match(mo_to_search, AMR_env$MO_lookup$fullname)]
+        minimum_matching_score_current <- minimum_matching_score_current / MO_lookup_current$kingdom_index[match(mo_to_search, MO_lookup_current$fullname)]
        minimum_matching_score_current <- pmax(minimum_matching_score_current, m)
        if (length(x_parts) > 1 && all(m <= 0.55, na.rm = TRUE)) {
          # if the highest score is 0.5, we have nothing serious - 0.5 is the lowest for pathogenic group 1
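
As a worked example of the first step in the hunk above, the sketch below evaluates only `min(0.6, min(10, nchar(x)) * 0.08)` for a few inputs. It deliberately ignores the later division by prevalence and kingdom index; `base_minimum_score()` is a hypothetical helper name.

```r
# First step of the default threshold: grows with input length, capped at 0.6.
base_minimum_score <- function(x) {
  min(0.6, min(10, nchar(x)) * 0.08)
}

inputs <- c("eco", "ecoli", "e. coli", "escherichia")  # 3, 5, 7 and 11 characters
scores <- vapply(inputs, base_minimum_score, numeric(1))
# roughly 0.24, 0.40, 0.56 and 0.60 (the last one hits the cap)
```

Short inputs therefore get a lenient threshold, while anything of eight characters or more starts from the 0.6 cap.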
@ -355,7 +388,7 @@ as.mo <- function(x,
        warning_("No hits found for \"", x_search, "\" with minimum_matching_score = ", ifelse(is.null(minimum_matching_score), paste0("NULL (=", round(min(minimum_matching_score_current, na.rm = TRUE), 3), ")"), minimum_matching_score), ". Try setting this value lower or even to 0.", call = FALSE)
        result_mo <- NA_character_
      } else {
-        result_mo <- AMR_env$MO_lookup$mo[match(top_hits[1], AMR_env$MO_lookup$fullname)]
+        result_mo <- MO_lookup_current$mo[match(top_hits[1], MO_lookup_current$fullname)]
        AMR_env$mo_uncertainties <- rbind_AMR(
          AMR_env$mo_uncertainties,
          data.frame(
@ -373,7 +406,7 @@ as.mo <- function(x,
        AMR_env$mo_previously_coerced <- unique(rbind_AMR(
          AMR_env$mo_previously_coerced,
          data.frame(
-            x = paste(x_search, minimum_matching_score),
+            x = paste(x_search, minimum_matching_score, only_fungi),
            mo = result_mo,
            stringsAsFactors = FALSE
          )
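
The session cache key now includes `only_fungi`, so a result found under one setting is not reused under the other. A minimal sketch of that idea with a plain data.frame cache; `store()` and `fetch()` are hypothetical helpers, and the stored value is a placeholder string rather than a real MO code.

```r
# Empty cache with the same two columns as in the hunk above.
cache <- data.frame(x = character(0), mo = character(0), stringsAsFactors = FALSE)

# Hypothetical helper: the key combines input, score setting and the only_fungi flag.
store <- function(cache, x, score, only_fungi, mo) {
  unique(rbind(
    cache,
    data.frame(x = paste(x, score, only_fungi), mo = mo, stringsAsFactors = FALSE)
  ))
}

# Hypothetical helper: returns NA when the key is not in the cache.
fetch <- function(cache, x, score, only_fungi) {
  cache$mo[match(paste(x, score, only_fungi), cache$x)]
}

cache <- store(cache, "asp", 0.5, TRUE, "placeholder fungal hit")
hit_fungi <- fetch(cache, "asp", 0.5, TRUE)   # found: same flag as when stored
hit_all <- fetch(cache, "asp", 0.5, FALSE)    # NA: different flag, different key
```

Without the flag in the key, a fungi-only result could be served to a later call that searches all kingdoms, or the other way around.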
@ -432,7 +465,7 @@ as.mo <- function(x,
  }
  
  # Apply Becker ----
-  if (isTRUE(Becker) || Becker == "all") {
+  if (!isTRUE(only_fungi) && (isTRUE(Becker) || Becker == "all")) {
    # warn when species found that are not in:
    # - Becker et al. 2014, PMID 25278577
    # - Becker et al. 2019, PMID 30872103
@ -462,7 +495,7 @@ as.mo <- function(x,
  }
  
  # Apply Lancefield ----
-  if (isTRUE(Lancefield) || Lancefield == "all") {
+  if (!isTRUE(only_fungi) && (isTRUE(Lancefield) || Lancefield == "all")) {
    # (using `%like_case%` to also match subspecies)
    
    # group A - S. pyogenes
@ -560,7 +593,8 @@ mo_reset_session <- function() {
 mo_cleaning_regex <- function() {
  parts_to_remove <- c("e?spp([^a-z]+|$)", "e?ssp([^a-z]+|$)", "e?ss([^a-z]+|$)", "e?sp([^a-z]+|$)", "e?subsp", "sube?species", "e?species",
                       "biovar[a-z]*", "biotype", "serovar[a-z]*", "var([^a-z]+|$)", "serogr.?up[a-z]*",
-                       "titer", "dummy", "Ig[ADEGM]")
+                       "titer", "dummy", "Ig[ADEGM]", " ?[a-z-]+[-](resistant|susceptible) ?")
+  
  paste0(
    "(",
    "[^A-Za-z- \\(\\)\\[\\]{}]+",
@ -923,7 +957,7 @@ print.mo_uncertainties <- function(x, n = 10, ...) {
                 ifelse(x[i, ]$keep_synonyms == FALSE & x[i, ]$mo %in% AMR_env$MO_lookup$mo[which(AMR_env$MO_lookup$status == "synonym")],
                        paste0(
                          strrep(" ", nchar(x[i, ]$original_input) + 6),
-                          font_red(paste0("This old taxonomic name was converted to ", font_italic(AMR_env$MO_lookup$fullname[match(synonym_mo_to_accepted_mo(x[i, ]$mo), AMR_env$MO_lookup$mo)], collapse = NULL), " (", synonym_mo_to_accepted_mo(x[i, ]$mo), ")."), collapse = NULL)
+                          font_red(paste0("This outdated taxonomic name was converted to ", font_italic(AMR_env$MO_lookup$fullname[match(synonym_mo_to_accepted_mo(x[i, ]$mo), AMR_env$MO_lookup$mo)], collapse = NULL), " (", synonym_mo_to_accepted_mo(x[i, ]$mo), ")."), collapse = NULL)
                        ),
                        ""
                 ),
@ -1233,14 +1267,17 @@ repair_reference_df <- function(reference_df) {
 }

 get_mo_uncertainties <- function() {
-  remember <- list(uncertainties = AMR_env$mo_uncertainties)
+  remember <- list(uncertainties = AMR_env$mo_uncertainties,
+                   failures = AMR_env$mo_failures)
  # empty them, otherwise e.g. mo_shortname("Chlamydophila psittaci") will give 3 notes
  AMR_env$mo_uncertainties <- NULL
+  AMR_env$mo_failures <- NULL
  remember
 }

 load_mo_uncertainties <- function(metadata) {
  AMR_env$mo_uncertainties <- metadata$uncertainties
+  AMR_env$mo_failures <- metadata$failures
 }

 synonym_mo_to_accepted_mo <- function(x, fill_in_accepted = FALSE, dataset = AMR_env$MO_lookup) {
--- a/R/mo_property.R
+++ b/R/mo_property.R
@ -883,6 +883,7 @@ mo_info <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("A
        ref = mo_ref(y, keep_synonyms = keep_synonyms),
        snomed = unlist(mo_snomed(y, keep_synonyms = keep_synonyms)),
        lpsn = mo_lpsn(y, language = language, keep_synonyms = keep_synonyms),
+        mycobank = mo_mycobank(y, language = language, keep_synonyms = keep_synonyms),
        gbif = mo_gbif(y, language = language, keep_synonyms = keep_synonyms),
        group_members = mo_group_members(y, language = language, keep_synonyms = keep_synonyms)
      )
--- a/R/sir.R
+++ b/R/sir.R
@ -34,7 +34,7 @@
 #' These breakpoints are currently implemented:
 #' - For **clinical microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "human")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "human")$guideline)))`;
 #' - For **veterinary microbiology**: EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "animal")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`;
-#' - ECOFFs (Epidemiological cut-off values): EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`.
+#' - For **ECOFFs** (Epidemiological Cut-off Values): EUCAST `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST" & type == "ECOFF")$guideline)))` and CLSI `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "ECOFF")$guideline)))`.
 #' 
 #' All breakpoints used for interpretation are available in our [clinical_breakpoints] data set.
 #' @rdname as.sir
@ -72,7 +72,7 @@
 #'      your_data %>% mutate_if(is.mic, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...))
 #'      
 #'      # for veterinary breakpoints, also set `host`:
-#'      your_data %>% mutate_if(is.mic, as.sir, host = "column_with_animal_hosts", guideline = "CLSI")
+#'      your_data %>% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI")
 #'      ```
 #'    * Operators like "<=" will be stripped before interpretation. When using `conserve_capped_values = TRUE`, an MIC value of e.g. ">2" will always return "R", even if the breakpoint according to the chosen guideline is ">=4". This is to prevent that capped values from raw laboratory data would not be treated conservatively. The default behaviour (`conserve_capped_values = FALSE`) considers ">2" to be lower than ">=4" and might in this case return "S" or "I".
 #' 3. For **interpreting disk diffusion diameters** according to EUCAST or CLSI. You must clean your disk zones first using [as.disk()], that also gives your columns the new data class [`disk`]. Also, be sure to have a column with microorganism names or codes. It will be found automatically, but can be set manually using the `mo` argument.
@ -84,7 +84,7 @@
 #'      your_data %>% mutate_if(is.disk, as.sir, ab = c("cipro", "ampicillin", ...), mo = c("E. coli", "K. pneumoniae", ...))
 #'      
 #'      # for veterinary breakpoints, also set `host`:
-#'      your_data %>% mutate_if(is.disk, as.sir, host = "column_with_animal_hosts", guideline = "CLSI")
+#'      your_data %>% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI")
 #'      ```
 #' 4. For **interpreting a complete data set**, with automatic determination of MIC values, disk diffusion diameters, microorganism names or codes, and antimicrobial test results. This is done very simply by running `as.sir(your_data)`.
 #'
@ -112,10 +112,14 @@
 #'   options(AMR_guideline = "CLSI")
 #'   options(AMR_breakpoint_type = "animal")
 #' ```
+#' 
+#' When applying veterinary breakpoints (by setting `host` or by setting `breakpoint_type = "animal"`), the [CLSI VET09 guideline](https://clsi.org/standards/products/veterinary-medicine/documents/vet09/) will be applied to cope with missing animal species-specific breakpoints.
 #'
 #' ### After Interpretation
 #'
 #' After using [as.sir()], you can use the [eucast_rules()] defined by EUCAST to (1) apply inferred susceptibility and resistance based on results of other antimicrobials and (2) apply intrinsic resistance based on taxonomic properties of a microorganism.
+#' 
+#' To determine which isolates are multidrug-resistant, be sure to run [mdro()] (which applies the MDR/PDR/XDR guideline from 2012 by default) on a data set that contains S/I/R values. Read more about [interpreting multidrug-resistant organisms here][mdro()].
 #'
 #' ### Machine-Readable Clinical Breakpoints
 #'
@ -150,7 +154,8 @@
 #'
 #' - **CLSI M39: Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). <https://clsi.org/standards/products/microbiology/documents/m39/>.
 #' - **CLSI M100: Performance Standard for Antimicrobial Susceptibility Testing**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type != "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type != "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). <https://clsi.org/standards/products/microbiology/documents/m100/>.
-#' - **CLSI VET01: Performance Standards for Antimicrobial Disk and Dilution Susceptibility Tests for Bacteria Isolated From Animals**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). <https://clsi.org/standards/products/veterinary-medicine/documents/vet01//>.
+#' - **CLSI VET01: Performance Standards for Antimicrobial Disk and Dilution Susceptibility Tests for Bacteria Isolated From Animals**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). <https://clsi.org/standards/products/veterinary-medicine/documents/vet01/>.
+#' - **CLSI VET09: Understanding Susceptibility Test Data as a Component of Antimicrobial Stewardship in Veterinary Settings**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "CLSI" & type == "animal")$guideline)))`, *Clinical and Laboratory Standards Institute* (CLSI). <https://clsi.org/standards/products/veterinary-medicine/documents/vet09/>.
 #' - **EUCAST Breakpoint tables for interpretation of MICs and zone diameters**, `r min(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`-`r max(as.integer(gsub("[^0-9]", "", subset(AMR::clinical_breakpoints, guideline %like% "EUCAST")$guideline)))`, *European Committee on Antimicrobial Susceptibility Testing* (EUCAST). <https://www.eucast.org/clinical_breakpoints>.
 #' - **WHONET** as a source for machine-reading the clinical breakpoints ([read more here](https://msberends.github.io/AMR/reference/clinical_breakpoints.html#imported-from-whonet)), 1989-`r max(as.integer(gsub("[^0-9]", "", AMR::clinical_breakpoints$guideline)))`, *WHO Collaborating Centre for Surveillance of Antimicrobial Resistance*. <https://whonet.org/>.
 #' 
@ -621,7 +626,7 @@ as.sir.data.frame <- function(x,
  if (missing(breakpoint_type) && any(host %in% AMR_env$host_preferred_order, na.rm = TRUE)) {
    message_("Assuming `breakpoint_type = \"animal\"` since `host` contains animal species.")
    breakpoint_type <- "animal"
-  } else if (any(!convert_host(host) %in% c("human", "ECOFF"), na.rm = TRUE)) {
+  } else if (any(!suppressMessages(convert_host(host)) %in% c("human", "ECOFF"), na.rm = TRUE)) {
    message_("Assuming `breakpoint_type = \"animal\"`.")
    breakpoint_type <- "animal"
  }
@ -915,6 +920,7 @@ as_sir_method <- function(method_short,
  guideline_coerced <- get_guideline(guideline, reference_data)
  
  if (message_not_thrown_before("as.sir", "sir_interpretation_history")) {
+    message()
    message_("Run `sir_interpretation_history()` afterwards to retrieve a logbook with all the details of the breakpoint interpretations.\n\n", add_fn = font_green)
  }
  
@ -938,6 +944,7 @@ as_sir_method <- function(method_short,
      host <- breakpoint_type
    }
  }
+  
  if (!is.null(host) && !all(toupper(as.character(host)) %in% c("HUMAN", "ECOFF"))) {
    if (!is.null(current_df) && length(host) == 1 && host %in% colnames(current_df) && any(current_df[[host]] %like% "[A-Z]", na.rm = TRUE)) {
      host <- current_df[[host]]
@ -954,8 +961,14 @@ as_sir_method <- function(method_short,
  }
  host.bak <- host
  host <- convert_host(host)
-  if (breakpoint_type == "animal" && message_not_thrown_before("as.sir", "host_preferred_order")) {
-    message_("Please note that in the absence of specific veterinary breakpoints for certain animal hosts, breakpoints for dogs, cattle, swine, cats, horse, aquatic, and poultry, in that order, are used as substitutes.\n\n")
+  if (any(is.na(host) & !is.na(host.bak)) && message_not_thrown_before("as.sir", "missing_hosts")) {
+    warning_("The following animal host(s) could not be coerced: ", vector_and(host.bak[is.na(host) & !is.na(host.bak)]), immediate = TRUE)
+    message() # new line
+  }
+  if (breakpoint_type == "animal" && message_not_thrown_before("as.sir", "host_missing_breakpoints")) {
+    if (guideline_coerced %like% "CLSI") {
+      message_("Please note that in the absence of specific veterinary breakpoints for certain animal hosts, the CLSI guideline VET09 will be applied where possible.\n\n")
+    }
  }
  
  # get ab
@ -1081,6 +1094,7 @@ as_sir_method <- function(method_short,
    }
  }
  
+  # format agents ----
  agent_formatted <- paste0("'", font_bold(ab.bak, collapse = NULL), "'")
  agent_name <- ab_name(ab, tolower = TRUE, language = NULL)
  same_ab <- generalise_antibiotic_name(ab) == generalise_antibiotic_name(agent_name)
@ -1101,6 +1115,7 @@ as_sir_method <- function(method_short,
                             ""),
                      "... ")

+  # prepare used arguments ----
  method <- method_short

  metadata_mo <- get_mo_uncertainties()
@ -1133,7 +1148,6 @@ as_sir_method <- function(method_short,
    host = host,
    stringsAsFactors = FALSE
  )
-  
  if (method == "mic") {
    # when as.sir.mic is called directly
    df$values <- as.mic(df$values)
@ -1141,11 +1155,16 @@ as_sir_method <- function(method_short,
    # when as.sir.disk is called directly
    df$values <- as.disk(df$values)
  }
+  
  df_unique <- unique(df[ , c("mo", "ab", "uti", "host"), drop = FALSE])
  
-  # get all breakpoints
+  # get all breakpoints, use humans as backup for animals
+  breakpoint_type_lookup <- breakpoint_type
+  if (breakpoint_type == "animal") {
+    breakpoint_type_lookup <- c(breakpoint_type, "human")
+  }
  breakpoints <- breakpoints %pm>%
-    subset(type == breakpoint_type)
+    subset(type %in% breakpoint_type_lookup)
  
  if (isFALSE(include_screening)) {
    # remove screening rules from the breakpoints table
@ -1193,16 +1212,12 @@ as_sir_method <- function(method_short,
    }
  }
  
-  # run the rules (df_unique is a row combination per mo/ab/uti/host)
+  # run the rules (df_unique is a row combination per mo/ab/uti/host) ----
  for (i in seq_len(nrow(df_unique))) {
    p$tick()
    mo_current <- df_unique[i, "mo", drop = TRUE]
    ab_current <- df_unique[i, "ab", drop = TRUE]
    host_current <- df_unique[i, "host", drop = TRUE]
-    if (is.na(host_current)) {
-      # fall back to human
-      host_current <- "human"
-    }
    uti_current <- df_unique[i, "uti", drop = TRUE]
    notes_current <- character(0)
    if (isFALSE(uti_current)) {
@ -1212,9 +1227,7 @@ as_sir_method <- function(method_short,
      rows <- which(df$mo == mo_current & df$ab == ab_current & df$host == host_current & df$uti == uti_current)
    }
    if (length(rows) == 0) {
-      notes_current <- c(notes_current, font_red("Returned an empty result, which is unexpected. Are all of `mo`, `ab`, and `host` set and available?"))
-      notes <- c(notes, notes_current)
-      rise_warning <- TRUE
+      # this can happen if a host is unavailable; just continue with the next one, since a note about hosts being NA has already been given at this point
      next
    }
    values <- df[rows, "values", drop = TRUE]
@ -1254,6 +1267,72 @@ as_sir_method <- function(method_short,
        mo_current_other
      ))
    
+    ## fall-back methods for veterinary guidelines ----
+    if (breakpoint_type == "animal" && !host_current %in% breakpoints_current$host) {
+      if (guideline_coerced %like% "CLSI") {
+        # VET09 says that staph/strep/enterococcus BP can be extrapolated to all Gr+ cocci except for intrinsic resistance, so take all Gr+ cocci:
+        all_gram_pos_genera <- c("B_STPHY", "B_STRPT", "B_ENTRC", "B_PPTST", "B_AERCC", "B_MCRCC", "B_TRPRL")
+        
+        # HUMAN SUBSTITUTES
+        if (ab_current == "AZM" && mo_current_genus %in% all_gram_pos_genera && host_current %in% c("dogs", "cats", "horse")) {
+          # azithromycin can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09."))
+        } else if (ab_current == "CTX" && mo_current_order == "B_[ORD]_ENTRBCTR" && host_current %in% c("dogs", "cats", "horse")) {
+          # cefotaxime can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales based on CLSI VET09."))
+        } else if (ab_current == "CAZ" && (mo_current_order == "B_[ORD]_ENTRBCTR" | mo_current == "B_PSDMN_AERG") && host_current %in% c("dogs", "cats", "horse")) {
+          # ceftazidime can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales and ", font_italic("P. aeruginosa"), " based on CLSI VET09."))
+        } else if (ab_current == "ERY" && mo_current_genus %in% all_gram_pos_genera && host_current %in% c("dogs", "cats", "horse")) {
+          # erythromycin can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09."))
+        } else if (ab_current == "IPM" && (mo_current_order == "B_[ORD]_ENTRBCTR" | mo_current == "B_PSDMN_AERG") && host_current %in% c("dogs", "cats", "horse")) {
+          # imipenem can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Enterobacterales and ", font_italic("P. aeruginosa"), " based on CLSI VET09."))
+        } else if (ab_current == "LNZ" && mo_current_genus %in% all_gram_pos_genera && host_current %in% c("dogs", "cats")) {
+          # linezolid can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in staphylococci/enterococci based on CLSI VET09."))
+        } else if (ab_current == "NIT" && host_current %in% c("dogs", "cats")) {
+          # nitrofurantoin can take human breakpoints in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09."))
+        } else if (ab_current == "PEN" && mo_current_genus %in% all_gram_pos_genera && host_current %in% c("dogs", "cats")) {
+          # penicillin can take human breakpoints for these organisms in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in Gram-positive cocci based on CLSI VET09."))
+        } else if (ab_current == "RIF" && mo_current_genus %in% all_gram_pos_genera && host_current %in% c("dogs", "cats")) {
+          # rifampicin can take human breakpoints for staphylococci
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " in staphylococci based on CLSI VET09."))
+        } else if (ab_current == "SXT" && host_current %in% c("dogs", "cats", "horse")) {
+          # trimethoprim-sulfamethoxazole (TMS) can take human breakpoints in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09."))
+        } else if (ab_current == "VAN" && host_current %in% c("dogs", "cats", "horse")) {
+          # vancomycin can take human breakpoints in these hosts
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", ab_formatted, " based on CLSI VET09."))
+          
+        } else if (host_current %in% c("dogs", "cats") && (mo_current_genus %in% c("B_AMYCS", "B_NOCRD", "B_CMPYL", "B_CRYNB", "B_ENTRC", "B_MYCBC", "B_PSDMN", "B_AERMN") | mo_current_class == "B_[CLS]_BTPRTBCT" | mo_current == "B_LISTR_MNCY")) {
+          # human breakpoints if no canine/feline
+          breakpoints_current <- breakpoints_current %pm>% subset(host == "human")
+          notes_current <- c(notes_current, paste0("Using ", font_bold("human"), " breakpoints for ", mo_formatted, " based on CLSI VET09."))
+          
+        } else {
+          # no specific CLSI solution for this, so only filter on current host (if no breakpoints available -> too bad)
+          breakpoints_current <- breakpoints_current %pm>%
+            subset(host == host_current)
+        }
+      }
+      
+    }
+    
    if (NROW(breakpoints_current) == 0) {
      AMR_env$sir_interpretation_history <- rbind_AMR(
        AMR_env$sir_interpretation_history,
@ -1282,9 +1361,6 @@ as_sir_method <- function(method_short,
      next
    }
    
-    # set the host index according to most available breakpoints (see R/zzz.R where this is set in the pkg environment)
-    breakpoints_current$host_index <- match(breakpoints_current$host, c("human", "ECOFF", AMR_env$host_preferred_order))
-    
    # sort on host and taxonomic rank
    # (this will e.g. prefer 'species' breakpoints over 'order' breakpoints)
    if (all(uti_current == FALSE, na.rm = TRUE)) {
@ -1295,25 +1371,12 @@ as_sir_method <- function(method_short,
                                     ifelse(is.na(uti), 2,
                                            3))) %pm>%
        # be as specific as possible (i.e. prefer species over genus):
-        pm_arrange(host_index, rank_index, uti_index)
+        pm_arrange(rank_index, uti_index)
    } else if (all(uti_current == TRUE, na.rm = TRUE)) {
      breakpoints_current <- breakpoints_current %pm>%
        subset(uti == TRUE) %pm>%
        # be as specific as possible (i.e. prefer species over genus):
-        pm_arrange(host_index, rank_index)
-    }
-    
-    # veterinary host check
-    host_current <- unique(df_unique[i, "host", drop = TRUE])[1]
-    breakpoints_current$host_match <- breakpoints_current$host == host_current
-    if (breakpoint_type == "animal") {
-      if (any(breakpoints_current$host_match == TRUE, na.rm = TRUE)) {
-        breakpoints_current <- breakpoints_current %pm>%
-          subset(host_match == TRUE)
-      } else {
-        # no breakpoint found for this host, so sort on mostly available guidelines
-        notes_current <- c(notes_current, paste0("Using ", font_bold(breakpoints_current$host[1]), " breakpoints since ", font_bold(host_current), " for ", ab_formatted, " in ", mo_formatted, " are not available."))
-      }
+        pm_arrange(rank_index)
    }
 
    # throw messages for different body sites
--- a/R/sysdata.rda
+++ b/R/sysdata.rda
--- a/R/zzz.R
+++ b/R/zzz.R
@ -84,6 +84,9 @@ AMR_env$chin <- import_fn("%chin%", "data.table", error_on_fail = FALSE)
 # take cli symbols and error function if available
 AMR_env$info_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$info %or% "i"
 AMR_env$bullet_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$bullet %or% "*"
+
+AMR_env$cross_icon <- if (isTRUE(base::l10n_info()$`UTF-8`)) "\u00d7" else "x"
+
 AMR_env$dots <- import_fn("symbol", "cli", error_on_fail = FALSE)$ellipsis %or% "..."
 AMR_env$sup_1_icon <- import_fn("symbol", "cli", error_on_fail = FALSE)$sup_1 %or% "*"
 AMR_env$cli_abort <- import_fn("cli_abort", "cli", error_on_fail = FALSE)
--- a/data-raw/ab.md5
+++ b/data-raw/ab.md5
@ -1 +1 @@
-92d3e2f8deac335c92841d2ded974dee
+5ad894790ad048110f8eb9207b89501f
--- a/data-raw/antibiotics.dta
+++ b/data-raw/antibiotics.dta
--- a/data-raw/antibiotics.feather
+++ b/data-raw/antibiotics.feather
--- a/data-raw/antibiotics.parquet
+++ b/data-raw/antibiotics.parquet
--- a/data-raw/antibiotics.rds
+++ b/data-raw/antibiotics.rds
--- a/data-raw/antibiotics.sav
+++ b/data-raw/antibiotics.sav
--- a/data-raw/antibiotics.txt
+++ b/data-raw/antibiotics.txt
@ -468,7 +468,7 @@
 "TOH"		"Tobramycin-high"	"Aminoglycosides"	"NA"			"tobra high,tobramycin high,tohl"	""					""
 "TFX"	5517	"Tosufloxacin"	"Quinolones"	"J01MA22,S01AE09"			""	"tosufloxacin"	0.45	"g"			"100061-1,76146-0"
 "TMP"	5578	"Trimethoprim"	"Trimethoprims"	"J01EA01"	"Sulfonamides and trimethoprim"	"Trimethoprim and derivatives"	"t,tmp,tr,tri,trim,w"	"abaprim,alprim,anitrim,antrima,antrimox,bacdan,bacidal,bacide,bacterial,bacticel,bactifor,bactin,bactoprim,bactramin,bactrim,bencole,bethaprim,biosulten,briscotrim,chemotrin,colizole,colizole ds,conprim,cotrimel,cotrimoxizole,deprim,dosulfin,duocide,esbesul,espectrin,euctrim,exbesul,fermagex,fortrim,idotrim,ikaprim,infectotrimet,instalac,kombinax,lagatrim,lagatrim forte,lastrim,lescot,methoprim,metoprim,monoprim,monotrim,monotrimin,novotrimel,omstat,oraprim,pancidim,polytrim,priloprim,primosept,primsol,proloprim,protrin,purbal,resprim,resprim forte,roubac,roubal,salvatrim,septrin ds,septrin forte,septrin s,setprin,sinotrim,stopan,streptoplus,sugaprim,sulfamar,sulfamethoprim,sulfoxaprim,sulthrim,sultrex,syraprim,tiempe,tmp smx,toprim,trimanyl,trimethioprim,trimethopim,trimethoprim,trimethoprime,trimethoprimum,trimethopriom,trimetoprim,trimetoprima,trimexazole,trimexol,trimezol,trimogal,trimono,trimopan,trimpex,triprim,trisul,trisulcom,trisulfam,trisural,uretrim,urobactrim,utetrin,velaten,wellcoprim,wellcoprin,xeroprim,zamboprim"	0.4	"g"	0.4	"g"	"101495-0,11005-6,17747-7,18997-7,18998-5,20387-7,23614-1,23631-5,25273-4,32342-8,4079-0,4080-8,4081-6,511-6,512-4,513-2,514-0,515-7,516-5,517-3,518-1,55584-7,7056-5,7057-3,80552-3,80973-1"
-"SXT"	358641	"Trimethoprim/sulfamethoxazole"	"Trimethoprims"	"J01EE01"	"Sulfonamides and trimethoprim"	"Combinations of sulfonamides and trimethoprim, incl. derivatives"	"cot,cotrim,sxt,t/s,trsu,trsx,ts"	"abacin,abactrim,agoprim,alfatrim,aposulfatrim,bacteral,bacterial forte,bactilen,bactiver,bacton,bactoreduct,bactrim,bactrim ds,bactrim forte,bactrim pediatric,bactrimel,bactrizol,bactromin,bactropin,baktar,belcomycine,berlocid,bibacrim,biseptol,chemitrim,chemotrim,ciplin,colimycin,colimycin sulphate,colisticin,colistimethate,colistimethate sodium,colistin sulfate,colistin sulphate,colomycin,coly-mycin,cotribene,cotrim d.s.,cotrim eu rho,cotrim holsen,cotrim.l.u.t.,cotrimaxazol,cotrimazole,cotrimhexal,cotrimoxazol,cotrimoxazol al,cotrimoxazole,cotrimstada,cotriver,dibaprim,drylin,duratrimet,eltrianyl,escoprim,esteprim,eusaprim,fectrim,gantaprim,gantaprin,gantrim,groprim,helveprim,imexim,jenamoxazol,kemoprim,kepinol,kepinol forte,laratrim,linaris,maxtrim,microtrim,microtrim forte,mikrosid,momentol,oecotrim,oriprim,oxaprim,pantoprim,polymyxin e,polymyxin e. sulfate,primazole,promixin,septra,septra ds,septra grape,septrim,septrin,servitrim,sigaprim,sigaprin,sulfatrim pediatric,sulfotrim,sulfotrimin,sulmeprim pediatric,sulprim,sumetrolim,supracombin,suprim,tacumil,teleprim,teleprin,thiocuran,totazina,tribakin,trifen,trigonyl,trimesulf,trimetho comp,trimethoprimsulfa,trimetoger,trimexazol,trimforte,trimosulfa,uroplus,uroplus ds,uroplus ss"					"101495-0,18998-5,20387-7,23631-5,25273-4,32342-8,4081-6,515-7,516-5,517-3,518-1,7057-3"
+"SXT"	358641	"Trimethoprim/sulfamethoxazole"	"Trimethoprims"	"J01EE01"	"Sulfonamides and trimethoprim"	"Combinations of sulfonamides and trimethoprim, incl. derivatives"	"cot,cotrim,sxt,t/s,tms,trsu,trsx,ts"	"abacin,abactrim,agoprim,alfatrim,aposulfatrim,bacteral,bacterial forte,bactilen,bactiver,bacton,bactoreduct,bactrim,bactrim ds,bactrim forte,bactrim pediatric,bactrimel,bactrizol,bactromin,bactropin,baktar,belcomycine,berlocid,bibacrim,biseptol,chemitrim,chemotrim,ciplin,colimycin,colimycin sulphate,colisticin,colistimethate,colistimethate sodium,colistin sulfate,colistin sulphate,colomycin,coly-mycin,cotribene,cotrim d.s.,cotrim eu rho,cotrim holsen,cotrim.l.u.t.,cotrimaxazol,cotrimazole,cotrimhexal,cotrimoxazol,cotrimoxazol al,cotrimoxazole,cotrimstada,cotriver,dibaprim,drylin,duratrimet,eltrianyl,escoprim,esteprim,eusaprim,fectrim,gantaprim,gantaprin,gantrim,groprim,helveprim,imexim,jenamoxazol,kemoprim,kepinol,kepinol forte,laratrim,linaris,maxtrim,microtrim,microtrim forte,mikrosid,momentol,oecotrim,oriprim,oxaprim,pantoprim,polymyxin e,polymyxin e. sulfate,primazole,promixin,septra,septra ds,septra grape,septrim,septrin,servitrim,sigaprim,sigaprin,sulfatrim pediatric,sulfotrim,sulfotrimin,sulmeprim pediatric,sulprim,sumetrolim,supracombin,suprim,tacumil,teleprim,teleprin,thiocuran,totazina,tribakin,trifen,trigonyl,trimesulf,trimetho comp,trimethoprimsulfa,trimetoger,trimexazol,trimforte,trimosulfa,uroplus,uroplus ds,uroplus ss"					"101495-0,18998-5,20387-7,23631-5,25273-4,32342-8,4081-6,515-7,516-5,517-3,518-1,7057-3"
 "TRL"	202225	"Troleandomycin"	"Macrolides/lincosamides"	"J01FA08"	"Macrolides, lincosamides and streptogramins"	"Macrolides"	""	"acetyloleandomycin,aovine,cyclamycin,evramicina,matromicina,matromycin t,micotil,oleandocetine,oleandomycin,t.a.o.,treolmicina,tribiocillina,triocetin,triolan,troleandomicina,troleandomycin,troleandomycine,troleandomycinum,viamicina,wytrion"	1	"g"			"18999-3,519-9,520-7,521-5,522-3"
 "TRO"	55886	"Trospectomycin"	"Other antibacterials"	"NA"			""	"rubidiumnitrate,trospectinomycin,trospectomicina,trospectomycin,trospectomycine,trospectomycinum"					""
 "TVA"	62959	"Trovafloxacin"	"Quinolones"	"J01MA13"	"Quinolone antibacterials"	"Fluoroquinolones"	"trov"	"trovafloxacin,trovan"	0.2	"g"	0.2	"g"	"23642-2,23643-0,35855-6,7058-1"
--- a/data-raw/antibiotics.xlsx
+++ b/data-raw/antibiotics.xlsx
--- a/data-raw/antibiotics.xpt
+++ b/data-raw/antibiotics.xpt
--- a/data-raw/reproduction_of_microorganisms.R
+++ b/data-raw/reproduction_of_microorganisms.R
@ -736,6 +736,8 @@ taxonomy_mycobank <- taxonomy_mycobank %>%

 # Combine the datasets ----------------------------------------------------

+# TODO !! check why e.g. Clavispora lusitaniae is taken from GBIF rather than MycoBank !!
+
 taxonomy <- taxonomy_lpsn %>%
  # add fungi
  bind_rows(taxonomy_mycobank) %>% 
@ -1237,9 +1239,9 @@ saveRDS(taxonomy, "data-raw/taxonomy2.rds")
 # taxonomy <- readRDS("data-raw/taxonomy2.rds")


-# Remove unwanted taxonomic entries from Protoza/Fungi --------------------
+# Remove unwanted taxonomic entries ---------------------------------------

-taxonomy <- taxonomy %>%
+part1 <- taxonomy %>%
  filter(
    # keep all we added ourselves:
    source == "manually added" |
@ -1259,10 +1261,7 @@ taxonomy <- taxonomy %>%
      # kingdom of Protozoa:
      (phylum %in% c("Choanozoa", "Mycetozoa") & prevalence < 2) |
      # Fungi:
-      (kingdom == "Fungi" & (!rank %in% c("genus", "species", "subspecies") | prevalence < 2)) |
-      # !(phylum %in% c("Ascomycota", "Zygomycota", "Basidiomycota") & prevalence == 2 & rank %in% c("genus", "species", "subspecies")),
-      # !(genus %in% c("Leptosphaeria", "Physarum") & rank %in% c("species", "subspecies")), # keep only genus of this rare fungus, with resp. 850 and 500 species
-      # # (leave Alternaria in there, part of human mycobiome and opportunistic pathogen)
+      (kingdom == "Fungi" & (!rank %in% c("genus", "species", "subspecies") | prevalence < 2 | class == "Pichiomycetes")) |
      # Animalia:
      genus %in% c("Lucilia", "Lumbricus") |
      (class == "Insecta" & !rank %in% c("species", "subspecies")) | # keep only genus of insects, not all of their (sub)species
@ -1272,6 +1271,29 @@ taxonomy <- taxonomy %>%
  filter(kingdom != "Plantae",
         !(genus %in% c("Aedes", "Anopheles") & rank %in% c("species", "subspecies")))

+# now get the parents and old names
+part2 <- taxonomy %>% 
+  filter(gbif %in% c(part1$gbif_parent[!is.na(part1$gbif_parent)], part1$gbif_renamed_to[!is.na(part1$gbif_renamed_to)]) |
+           mycobank %in% c(part1$mycobank_parent[!is.na(part1$mycobank_parent)], part1$mycobank_renamed_to[!is.na(part1$mycobank_renamed_to)]) |
+           lpsn %in% c(part1$lpsn_parent[!is.na(part1$lpsn_parent)], part1$lpsn_renamed_to[!is.na(part1$lpsn_renamed_to)]))
+parts <- bind_rows(part1, part2)
+
+part3 <- taxonomy %>% 
+  filter(gbif %in% c(parts$gbif_parent[!is.na(parts$gbif_parent)], parts$gbif_renamed_to[!is.na(parts$gbif_renamed_to)]) |
+           mycobank %in% c(parts$mycobank_parent[!is.na(parts$mycobank_parent)], parts$mycobank_renamed_to[!is.na(parts$mycobank_renamed_to)]) |
+           lpsn %in% c(parts$lpsn_parent[!is.na(parts$lpsn_parent)], parts$lpsn_renamed_to[!is.na(parts$lpsn_renamed_to)]))
+parts <- bind_rows(part1, part2, part3)
+
+part4 <- taxonomy %>% 
+  filter(gbif %in% c(parts$gbif_parent[!is.na(parts$gbif_parent)], parts$gbif_renamed_to[!is.na(parts$gbif_renamed_to)]) |
+           mycobank %in% c(parts$mycobank_parent[!is.na(parts$mycobank_parent)], parts$mycobank_renamed_to[!is.na(parts$mycobank_renamed_to)]) |
+           lpsn %in% c(parts$lpsn_parent[!is.na(parts$lpsn_parent)], parts$lpsn_renamed_to[!is.na(parts$lpsn_renamed_to)]))
+parts <- bind_rows(part1, part2, part3, part4)
+
+taxonomy <- bind_rows(part1, part2, part3, part4) %>% 
+  arrange(fullname) %>% 
+  distinct(fullname, .keep_all = TRUE)
+
 # no ghost families, orders classes, phyla
 taxonomy <- taxonomy %>%
  group_by(kingdom, family) %>%
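The `part2` to `part4` steps in the hunk above repeat the same filter three times, so they follow parent and renamed-to links at most three levels deep. A minimal sketch of the same idea as a loop that runs until no new rows are found, assuming a toy table where a single hypothetical `id` column stands in for the `gbif`/`mycobank`/`lpsn` identifiers and `parent`/`renamed_to` stand in for the matching `*_parent`/`*_renamed_to` columns (`close_over_links()` and `toy_taxonomy` are illustrative names, not part of the script):

```r
library(dplyr)

# Toy taxonomy; 'id', 'parent' and 'renamed_to' are simplified stand-ins
# for the per-source identifier columns used in the real script.
toy_taxonomy <- tibble(
  fullname   = c("Genus a", "Genus", "Family", "Order", "Old name", "Unrelated"),
  id         = c("1", "2", "3", "4", "5", "6"),
  parent     = c("2", "3", "4", NA, NA, NA),
  renamed_to = c(NA, NA, NA, NA, "1", NA)
)

# Hypothetical helper: keep adding rows that are referenced as a parent or
# as a renamed-to target of the current selection, until nothing new is found.
close_over_links <- function(selected, all_rows) {
  repeat {
    wanted <- c(selected$parent, selected$renamed_to)
    wanted <- wanted[!is.na(wanted)]
    extra <- all_rows %>%
      filter(id %in% wanted, !id %in% selected$id)
    if (nrow(extra) == 0) break
    selected <- bind_rows(selected, extra)
  }
  selected %>%
    arrange(fullname) %>%
    distinct(fullname, .keep_all = TRUE)
}

part1 <- toy_taxonomy %>% filter(fullname == "Old name")
result <- close_over_links(part1, toy_taxonomy)
```

In this toy data the chain from "Old name" to "Order" is four links long, so a fixed three rounds would stop one level short, whereas the loop continues until the selection is stable.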
@ -2005,7 +2027,7 @@ taxonomy <- taxonomy %>%
  filter(!mo %in% groups$mo) %>% 
  bind_rows(groups)

-# we added MO code, so make sure everything is still unique
+# we added an MO code, so make sure everything is still unique
 any(duplicated(taxonomy$mo))
 any(duplicated(taxonomy$fullname))

@ -2018,13 +2040,17 @@ AMR::clinical_breakpoints %>% filter(!mo %in% taxonomy$mo)
 AMR::example_isolates %>% filter(!mo %in% taxonomy$mo)
 AMR::intrinsic_resistant %>% filter(!mo %in% taxonomy$mo)

+# all our previously manually added names should be in it
+all(microorganisms$fullname[microorganisms$source == "manually added"] %in% taxonomy$fullname)
+setdiff(microorganisms$fullname[microorganisms$source == "manually added"], taxonomy$fullname)
+
 # put this one back
 taxonomy <- taxonomy %>% 
  bind_rows(microorganisms %>%
              filter(fullname == "Blastocystis hominis") %>% 
              mutate(mo = as.character(mo)))

-# we added MO code, so make sure everything is still unique
+# we added an MO code, so make sure everything is still unique
 any(duplicated(taxonomy$mo))
 any(duplicated(taxonomy$fullname))
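The hunk above checks that every manually added name survived the rebuild. A small sketch of that kind of check on toy data, assuming two made-up data frames that only mimic the `fullname` and `source` columns. It lists missing names with `setdiff()`, which avoids indexing the full name vector with a shorter logical vector (R would silently recycle it):

```r
# Toy stand-ins for the previous 'microorganisms' data set and the rebuilt 'taxonomy'
microorganisms <- data.frame(
  fullname = c("Blastocystis hominis", "Escherichia coli", "(unknown name)"),
  source   = c("manually added", "LPSN", "manually added"),
  stringsAsFactors = FALSE
)
taxonomy <- data.frame(
  fullname = c("Escherichia coli", "(unknown name)"),
  stringsAsFactors = FALSE
)

# Names we added by hand in the previous data set
manual_names <- microorganisms$fullname[microorganisms$source == "manually added"]

# Which of them are absent from the rebuilt taxonomy?
missing_names <- setdiff(manual_names, taxonomy$fullname)
all_present <- length(missing_names) == 0
```

Here `missing_names` contains only "Blastocystis hominis", which is the kind of record the script then puts back by hand.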

--- a/data/antibiotics.rda
+++ b/data/antibiotics.rda
--- a/man/as.mo.Rd
+++ b/man/as.mo.Rd
@ -20,6 +20,7 @@ as.mo(
  reference_df = get_mo_source(),
  ignore_pattern = getOption("AMR_ignore_pattern", NULL),
  cleaning_regex = getOption("AMR_cleaning_regex", mo_cleaning_regex()),
+  only_fungi = getOption("AMR_only_fungi", FALSE),
  language = get_AMR_locale(),
  info = interactive(),
  ...
@ -58,9 +59,11 @@ This excludes enterococci at default (who are in group D), use \code{Lancefield

 \item{cleaning_regex}{a Perl-compatible \link[base:regex]{regular expression} (case-insensitive) to clean the input of \code{x}. Every matched part in \code{x} will be removed. At default, this is the outcome of \code{\link[=mo_cleaning_regex]{mo_cleaning_regex()}}, which removes texts between brackets and texts such as "species" and "serovar". The default can be set with the \link[=AMR-options]{package option} \code{\link[=AMR-options]{AMR_cleaning_regex}}.}

+\item{only_fungi}{a \link{logical} to indicate if only fungi must be found, making sure that e.g. misspellings always return records from the kingdom of Fungi. This can be set globally for \link[=mo_property]{all microorganism functions} with the \link[=AMR-options]{package option} \code{\link[=AMR-options]{AMR_only_fungi}}, i.e. \code{options(AMR_only_fungi = TRUE)}.}
+
 \item{language}{language to translate text like "no growth", which defaults to the system language (see \code{\link[=get_AMR_locale]{get_AMR_locale()}})}

-\item{info}{a \link{logical} to indicate if a progress bar should be printed if more than 25 items are to be coerced - the default is \code{TRUE} only in interactive mode}
+\item{info}{a \link{logical} to indicate that info must be printed, e.g. a progress bar when more than 25 items are to be coerced, or a list with old taxonomic names. The default is \code{TRUE} only in interactive mode.}

 \item{...}{other arguments passed on to functions}
 }
@ -87,15 +90,11 @@ A microorganism (MO) code from this package (class: \code{\link{mo}}) is human r
                            F (Fungi), PL (Plantae), P (Protozoa)
 }\if{html}{\out{</div>}}

-Values that cannot be coerced will be considered 'unknown' and will be returned as the MO code \code{UNKNOWN} with a warning.
+Values that cannot be coerced will be considered 'unknown' and will return the MO code \code{UNKNOWN} with a warning.

 Use the \code{\link[=mo_property]{mo_*}} functions to get properties based on the returned code, see \emph{Examples}.

-The \code{\link[=as.mo]{as.mo()}} function uses a novel \link[=mo_matching_score]{matching score algorithm} (see \emph{Matching Score for Microorganisms} below) to match input against the \link[=microorganisms]{available microbial taxonomy} in this package. This will lead to the effect that e.g. \code{"E. coli"} (a microorganism highly prevalent in humans) will return the microbial ID of \emph{Escherichia coli} and not \emph{Entamoeba coli} (a microorganism less prevalent in humans), although the latter would alphabetically come first.
-
-With \code{Becker = TRUE}, the following 89 staphylococci will be converted to the \strong{coagulase-negative group}: \emph{S. americanisciuri}, \emph{S. argensis}, \emph{S. arlettae}, \emph{S. auricularis}, \emph{S. borealis}, \emph{S. brunensis}, \emph{S. caeli}, \emph{S. caledonicus}, \emph{S. canis}, \emph{S. capitis}, \emph{S. capitis capitis}, \emph{S. capitis urealyticus}, \emph{S. capitis ureolyticus}, \emph{S. caprae}, \emph{S. carnosus}, \emph{S. carnosus carnosus}, \emph{S. carnosus utilis}, \emph{S. casei}, \emph{S. caseolyticus}, \emph{S. chromogenes}, \emph{S. cohnii}, \emph{S. cohnii cohnii}, \emph{S. cohnii urealyticum}, \emph{S. cohnii urealyticus}, \emph{S. condimenti}, \emph{S. croceilyticus}, \emph{S. debuckii}, \emph{S. devriesei}, \emph{S. durrellii}, \emph{S. edaphicus}, \emph{S. epidermidis}, \emph{S. equorum}, \emph{S. equorum equorum}, \emph{S. equorum linens}, \emph{S. felis}, \emph{S. fleurettii}, \emph{S. gallinarum}, \emph{S. haemolyticus}, \emph{S. hominis}, \emph{S. hominis hominis}, \emph{S. hominis novobiosepticus}, \emph{S. jettensis}, \emph{S. kloosii}, \emph{S. lentus}, \emph{S. lloydii}, \emph{S. lugdunensis}, \emph{S. marylandisciuri}, \emph{S. massiliensis}, \emph{S. microti}, \emph{S. muscae}, \emph{S. nepalensis}, \emph{S. pasteuri}, \emph{S. petrasii}, \emph{S. petrasii croceilyticus}, \emph{S. petrasii jettensis}, \emph{S. petrasii petrasii}, \emph{S. petrasii pragensis}, \emph{S. pettenkoferi}, \emph{S. piscifermentans}, \emph{S. pragensis}, \emph{S. pseudoxylosus}, \emph{S. pulvereri}, \emph{S. ratti}, \emph{S. rostri}, \emph{S. saccharolyticus}, \emph{S. saprophyticus}, \emph{S. saprophyticus bovis}, \emph{S. saprophyticus saprophyticus}, \emph{S. schleiferi}, \emph{S. schleiferi schleiferi}, \emph{S. sciuri}, \emph{S. sciuri carnaticus}, \emph{S. sciuri lentus}, \emph{S. sciuri rodentium}, \emph{S. sciuri sciuri}, \emph{S. shinii}, \emph{S. simulans}, \emph{S. stepanovicii}, \emph{S. succinus}, \emph{S. succinus casei}, \emph{S. succinus succinus}, \emph{S. taiwanensis}, \emph{S. urealyticus}, \emph{S. ureilyticus}, \emph{S. veratri}, \emph{S. vitulinus}, \emph{S. vitulus}, \emph{S. warneri}, and \emph{S. xylosus}.\cr The following 16 staphylococci will be converted to the \strong{coagulase-positive group}: \emph{S. agnetis}, \emph{S. argenteus}, \emph{S. coagulans}, \emph{S. cornubiensis}, \emph{S. delphini}, \emph{S. hyicus}, \emph{S. hyicus chromogenes}, \emph{S. hyicus hyicus}, \emph{S. intermedius}, \emph{S. lutrae}, \emph{S. pseudintermedius}, \emph{S. roterodami}, \emph{S. schleiferi coagulans}, \emph{S. schweitzeri}, \emph{S. simiae}, and \emph{S. singaporensis}.
-
-With \code{Lancefield = TRUE}, the following streptococci will be converted to their corresponding Lancefield group: \emph{S. agalactiae} (Group B), \emph{S. anginosus anginosus} (Group F), \emph{S. anginosus whileyi} (Group F), \emph{S. anginosus} (Group F), \emph{S. canis} (Group G), \emph{S. dysgalactiae dysgalactiae} (Group C), \emph{S. dysgalactiae equisimilis} (Group C), \emph{S. dysgalactiae} (Group C), \emph{S. equi equi} (Group C), \emph{S. equi ruminatorum} (Group C), \emph{S. equi zooepidemicus} (Group C), \emph{S. equi} (Group C), \emph{S. pyogenes} (Group A), \emph{S. salivarius salivarius} (Group K), \emph{S. salivarius thermophilus} (Group K), \emph{S. salivarius} (Group K), and \emph{S. sanguinis} (Group H).
+The \code{\link[=as.mo]{as.mo()}} function uses a novel and scientifically validated (\doi{10.18637/jss.v104.i03}) matching score algorithm (see \emph{Matching Score for Microorganisms} below) to match input against the \link[=microorganisms]{available microbial taxonomy} in this package. This means that e.g. \code{"E. coli"} (a microorganism highly prevalent in humans) will return the microbial ID of \emph{Escherichia coli} and not \emph{Entamoeba coli} (a microorganism less prevalent in humans), although the latter would alphabetically come first.
 \subsection{Coping with Uncertain Results}{

 Results of non-exact taxonomic input are based on their \link[=mo_matching_score]{matching score}. The lowest allowed score can be set with the \code{minimum_matching_score} argument. At default this will be determined based on the character length of the input, and the \link[=microorganisms]{taxonomic kingdom} and \link[=mo_matching_score]{human pathogenicity} of the taxonomic outcome. If values are matched with uncertainty, a message will be shown to suggest the user to evaluate the results with \code{\link[=mo_uncertainties]{mo_uncertainties()}}, which returns a \link{data.frame} with all specifications.
@ -110,9 +109,40 @@ There are three helper functions that can be run after using the \code{\link[=as
 }
 }

-\subsection{Microbial Prevalence of Pathogens in Humans}{
+\subsection{For Mycologists}{

-The coercion rules consider the prevalence of microorganisms in humans, which is available as the \code{prevalence} column in the \link{microorganisms} data set. The grouping into human pathogenic prevalence is explained in the section \emph{Matching Score for Microorganisms} below.
+The \link[=mo_matching_score]{matching score algorithm} gives precedence to bacteria over fungi. If you are only analysing fungi, be sure to use \code{only_fungi = TRUE}, or better yet, add this to your code and run it once every session:
+
+\if{html}{\out{<div class="sourceCode r">}}\preformatted{options(AMR_only_fungi = TRUE)
+}\if{html}{\out{</div>}}
+
+This will make sure that no bacteria or other 'non-fungi' will be returned by \code{\link[=as.mo]{as.mo()}}, or any of the \code{\link[=mo_property]{mo_*}} functions.
+}
+
+\subsection{Coagulase-negative and Coagulase-positive Staphylococci}{
+
+With \code{Becker = TRUE}, the following staphylococci will be converted to their corresponding coagulase group:
+\itemize{
+\item Coagulase-negative: \emph{S. americanisciuri}, \emph{S. argensis}, \emph{S. arlettae}, \emph{S. auricularis}, \emph{S. borealis}, \emph{S. brunensis}, \emph{S. caeli}, \emph{S. caledonicus}, \emph{S. canis}, \emph{S. capitis}, \emph{S. capitis capitis}, \emph{S. capitis urealyticus}, \emph{S. capitis ureolyticus}, \emph{S. caprae}, \emph{S. carnosus}, \emph{S. carnosus carnosus}, \emph{S. carnosus utilis}, \emph{S. casei}, \emph{S. caseolyticus}, \emph{S. chromogenes}, \emph{S. cohnii}, \emph{S. cohnii cohnii}, \emph{S. cohnii urealyticum}, \emph{S. cohnii urealyticus}, \emph{S. condimenti}, \emph{S. croceilyticus}, \emph{S. debuckii}, \emph{S. devriesei}, \emph{S. durrellii}, \emph{S. edaphicus}, \emph{S. epidermidis}, \emph{S. equorum}, \emph{S. equorum equorum}, \emph{S. equorum linens}, \emph{S. felis}, \emph{S. fleurettii}, \emph{S. gallinarum}, \emph{S. haemolyticus}, \emph{S. hominis}, \emph{S. hominis hominis}, \emph{S. hominis novobiosepticus}, \emph{S. jettensis}, \emph{S. kloosii}, \emph{S. lentus}, \emph{S. lloydii}, \emph{S. lugdunensis}, \emph{S. marylandisciuri}, \emph{S. massiliensis}, \emph{S. microti}, \emph{S. muscae}, \emph{S. nepalensis}, \emph{S. pasteuri}, \emph{S. petrasii}, \emph{S. petrasii croceilyticus}, \emph{S. petrasii jettensis}, \emph{S. petrasii petrasii}, \emph{S. petrasii pragensis}, \emph{S. pettenkoferi}, \emph{S. piscifermentans}, \emph{S. pragensis}, \emph{S. pseudoxylosus}, \emph{S. pulvereri}, \emph{S. ratti}, \emph{S. rostri}, \emph{S. saccharolyticus}, \emph{S. saprophyticus}, \emph{S. saprophyticus bovis}, \emph{S. saprophyticus saprophyticus}, \emph{S. schleiferi}, \emph{S. schleiferi schleiferi}, \emph{S. sciuri}, \emph{S. sciuri carnaticus}, \emph{S. sciuri lentus}, \emph{S. sciuri rodentium}, \emph{S. sciuri sciuri}, \emph{S. shinii}, \emph{S. simulans}, \emph{S. stepanovicii}, \emph{S. succinus}, \emph{S. succinus casei}, \emph{S. succinus succinus}, \emph{S. taiwanensis}, \emph{S. urealyticus}, \emph{S. ureilyticus}, \emph{S. veratri}, \emph{S. vitulinus}, \emph{S. vitulus}, \emph{S. warneri}, and \emph{S. xylosus}
+\item Coagulase-positive: \emph{S. agnetis}, \emph{S. argenteus}, \emph{S. coagulans}, \emph{S. cornubiensis}, \emph{S. delphini}, \emph{S. hyicus}, \emph{S. hyicus chromogenes}, \emph{S. hyicus hyicus}, \emph{S. intermedius}, \emph{S. lutrae}, \emph{S. pseudintermedius}, \emph{S. roterodami}, \emph{S. schleiferi coagulans}, \emph{S. schweitzeri}, \emph{S. simiae}, and \emph{S. singaporensis}
+}
+
+For newly named staphylococcal species, such as \emph{S. brunensis} (2024) and \emph{S. shinii} (2023), we look up the scientific reference to make sure the species are assigned to the correct coagulase group.
+}
+
+\subsection{Lancefield Groups in Streptococci}{
+
+With \code{Lancefield = TRUE}, the following streptococci will be converted to their corresponding Lancefield group:
+\itemize{
+\item Streptococcus Group A: \emph{S. pyogenes}
+\item Streptococcus Group B: \emph{S. agalactiae}
+\item Streptococcus Group C: \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, \emph{S. dysgalactiae equisimilis}, \emph{S. equi}, \emph{S. equi equi}, \emph{S. equi ruminatorum}, and \emph{S. equi zooepidemicus}
+\item Streptococcus Group F: \emph{S. anginosus}, \emph{S. anginosus anginosus}, \emph{S. anginosus whileyi}, \emph{S. constellatus}, \emph{S. constellatus constellatus}, \emph{S. constellatus pharyngis}, \emph{S. constellatus viborgensis}, and \emph{S. intermedius}
+\item Streptococcus Group G: \emph{S. canis}, \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, and \emph{S. dysgalactiae equisimilis}
+\item Streptococcus Group H: \emph{S. sanguinis}
+\item Streptococcus Group K: \emph{S. salivarius}, \emph{S. salivarius salivarius}, and \emph{S. salivarius thermophilus}
+\item Streptococcus Group L: \emph{S. dysgalactiae}, \emph{S. dysgalactiae dysgalactiae}, and \emph{S. dysgalactiae equisimilis}
+}
 }
 }
 \section{Source}{
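The new `only_fungi` argument documented above takes its default from `getOption("AMR_only_fungi", FALSE)`, so a session-wide option and an explicit argument can both control it. A minimal base-R sketch of that default-resolution pattern, using a hypothetical stand-in function `find_kingdoms()` rather than the package's own code (the kingdom names it returns are illustrative only):

```r
# Hypothetical stand-in for a function with the same default as as.mo():
# the argument falls back to the session option, and to FALSE if that is unset.
find_kingdoms <- function(only_fungi = getOption("AMR_only_fungi", FALSE)) {
  if (isTRUE(only_fungi)) "Fungi" else c("Bacteria", "Fungi", "Protozoa")
}

old <- options(AMR_only_fungi = NULL)   # start from an unset option
default_result <- find_kingdoms()       # option unset: falls back to FALSE

options(AMR_only_fungi = TRUE)          # set once per session
session_result <- find_kingdoms()       # picks up the session option

explicit_result <- find_kingdoms(only_fungi = FALSE)  # explicit argument wins

options(old)                            # restore the previous setting
```

Because the default is evaluated at call time, changing the option mid-session affects all later calls that do not pass `only_fungi` explicitly.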
--- a/pkgdown/assets/AMR_intro.png
+++ b/pkgdown/assets/AMR_intro.png
--- a/pkgdown/assets/AMR_intro.svg
+++ b/pkgdown/assets/AMR_intro.svg
--- a/pkgdown/assets/countries.png
+++ b/pkgdown/assets/countries.png
--- a/pkgdown/assets/countries_large.png
+++ b/pkgdown/assets/countries_large.png
--- a/pkgdown/assets/endorsement_clsi_eucast.jpg
+++ b/pkgdown/assets/endorsement_clsi_eucast.jpg
--- a/pkgdown/assets/lang_cs.svg
+++ b/pkgdown/assets/lang_cs.svg
--- a/pkgdown/assets/lang_da.svg
+++ b/pkgdown/assets/lang_da.svg
--- a/pkgdown/assets/lang_de.svg
+++ b/pkgdown/assets/lang_de.svg
--- a/pkgdown/assets/lang_el.svg
+++ b/pkgdown/assets/lang_el.svg
--- a/pkgdown/assets/lang_en.svg
+++ b/pkgdown/assets/lang_en.svg
--- a/pkgdown/assets/lang_es.svg
+++ b/pkgdown/assets/lang_es.svg
--- a/pkgdown/assets/lang_fi.svg
+++ b/pkgdown/assets/lang_fi.svg
--- a/pkgdown/assets/lang_fr.svg
+++ b/pkgdown/assets/lang_fr.svg
--- a/pkgdown/assets/lang_it.svg
+++ b/pkgdown/assets/lang_it.svg
--- a/pkgdown/assets/lang_ja.svg
+++ b/pkgdown/assets/lang_ja.svg
--- a/pkgdown/assets/lang_nl.svg
+++ b/pkgdown/assets/lang_nl.svg
--- a/pkgdown/assets/lang_no.svg
+++ b/pkgdown/assets/lang_no.svg
--- a/pkgdown/assets/lang_pl.svg
+++ b/pkgdown/assets/lang_pl.svg
--- a/pkgdown/assets/lang_pt.svg
+++ b/pkgdown/assets/lang_pt.svg
--- a/pkgdown/assets/lang_ro.svg
+++ b/pkgdown/assets/lang_ro.svg
--- a/pkgdown/assets/lang_ru.svg
+++ b/pkgdown/assets/lang_ru.svg
--- a/pkgdown/assets/lang_sv.svg
+++ b/pkgdown/assets/lang_sv.svg
--- a/pkgdown/assets/lang_tr.svg
+++ b/pkgdown/assets/lang_tr.svg
--- a/pkgdown/assets/lang_uk.svg
+++ b/pkgdown/assets/lang_uk.svg
--- a/pkgdown/assets/lang_zh.svg
+++ b/pkgdown/assets/lang_zh.svg
--- a/pkgdown/assets/logo.svg
+++ b/pkgdown/assets/logo.svg
--- a/pkgdown/assets/logo_certe.svg
+++ b/pkgdown/assets/logo_certe.svg
--- a/pkgdown/assets/logo_eh1h.png
+++ b/pkgdown/assets/logo_eh1h.png
--- a/pkgdown/assets/logo_interreg.png
+++ b/pkgdown/assets/logo_interreg.png
--- a/pkgdown/assets/logo_rug.svg
+++ b/pkgdown/assets/logo_rug.svg
--- a/pkgdown/assets/logo_umcg.svg
+++ b/pkgdown/assets/logo_umcg.svg