age_groups fix

dr. M.S. (Matthijs) Berends 2019-02-27 11:36:12 +01:00
parent 4ba2ff68e0
commit 54162522bd
41 changed files with 450 additions and 386 deletions

View File

@ -1,6 +1,6 @@
Package: AMR
Version: 0.5.0.9019
Date: 2019-02-26
Date: 2019-02-27
Title: Antimicrobial Resistance Analysis
Authors@R: c(
person(

12
R/age.R

@ -73,8 +73,8 @@ age <- function(x, reference = Sys.Date()) {
#' \itemize{
#' \item{\code{"children"}, equivalent of: \code{c(0, 1, 2, 4, 6, 13, 18)}. This will split on 0, 1, 2-3, 4-5, 6-12, 13-17 and 18+.}
#' \item{\code{"elderly"} or \code{"seniors"}, equivalent of: \code{c(65, 75, 85, 95)}. This will split on 0-64, 65-74, 75-84, 85-94 and 95+.}
#' \item{\code{"fives"}, equivalent of: \code{1:20 * 5}. This will split on 0-4, 5-9, 10-14, 15-19 and so forth.}
#' \item{\code{"tens"}, equivalent of: \code{1:10 * 10}. This will split on 0-9, 10-19, 20-29 and so forth.}
#' \item{\code{"fives"}, equivalent of: \code{1:24 * 5}. This will split on 0-4, 5-9, 10-14, 15-19 and so forth, until 120.}
#' \item{\code{"tens"}, equivalent of: \code{1:12 * 10}. This will split on 0-9, 10-19, 20-29 and so forth, until 120.}
#' }
#' }
#' @keywords age_group age
@ -92,11 +92,11 @@ age <- function(x, reference = Sys.Date()) {
#' age_groups(ages, c(20, 50))
#'
#' # split into groups of ten years
#' age_groups(ages, 1:10 * 10)
#' age_groups(ages, 1:12 * 10)
#' age_groups(ages, split_at = "tens")
#'
#' # split into groups of five years
#' age_groups(ages, 1:20 * 5)
#' age_groups(ages, 1:24 * 5)
#' age_groups(ages, split_at = "fives")
#'
#' # split specifically for children
@ -122,9 +122,9 @@ age_groups <- function(x, split_at = c(12, 25, 55, 75)) {
} else if (split_at %like% "^(elder|senior)") {
split_at <- c(65, 75, 85, 95)
} else if (split_at %like% "^five") {
split_at <- 1:20 * 5
split_at <- 1:24 * 5
} else if (split_at %like% "^ten") {
split_at <- 1:10 * 10
split_at <- 1:12 * 10
}
}
split_at <- as.integer(split_at)
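
The documented splits for `"tens"` and `"children"` can be reproduced with a minimal base R sketch. The helper name `label_age_groups` is hypothetical and this is not the package's own implementation of `age_groups()`; it only assumes non-negative ages and an open-ended last group.

```r
# Hypothetical helper, not the AMR implementation: label each age by the
# break points in `split_at`, with an open-ended last group such as "120+".
label_age_groups <- function(x, split_at = 1:12 * 10) {
  stopifnot(all(x >= 0, na.rm = TRUE))
  split_at <- sort(unique(as.integer(split_at)))
  # always start the first group at zero
  if (split_at[1] != 0L) split_at <- c(0L, split_at)
  lower <- split_at
  upper <- c(split_at[-1] - 1L, NA)
  labels <- ifelse(is.na(upper),
                   paste0(lower, "+"),
                   ifelse(lower == upper,
                          as.character(lower),
                          paste0(lower, "-", upper)))
  # findInterval() returns the index of the group each age falls into
  factor(labels[findInterval(x, split_at)], levels = labels, ordered = TRUE)
}

ages <- c(3, 8, 16, 54, 101, 119, 120)
as.character(label_age_groups(ages, 1:12 * 10))
levels(label_age_groups(ages, c(0, 1, 2, 4, 6, 13, 18)))
```

With break points up to 120, ages above 100 still fall into ten-year groups instead of one shared open-ended group.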

View File

@ -24,12 +24,12 @@
#' This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life.
#' @section Catalogue of Life:
#' \if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
#' This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
#' This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
#'
#' Included are:
#' \itemize{
#' \item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
#' \item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
#' \item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
#' \item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
#' \item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
#' \item{The responsible author(s) and year of scientific publication}
@ -76,6 +76,7 @@ NULL
#'
#' This function returns a list with info about the included data from the Catalogue of Life. It also shows if the included version is their latest annual release. The Catalogue of Life releases their annual release in March each year.
#' @seealso \code{\link{microorganisms}}
#' @details The list item \code{is_latest_annual_release} is based on the system date.
#' @inheritSection catalogue_of_life Catalogue of Life
#' @inheritSection AMR Read more on our website!
#' @export

94
R/mo.R

@ -21,7 +21,7 @@
#' Transform to microorganism ID
#'
#' Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms \emph{Bacteria}, \emph{Fungi} and \emph{Protozoa} (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
#' Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms Archaea, Bacteria, Protozoa, Viruses and most microbial species from the kingdom Fungi (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
#' @param x a character vector or a \code{data.frame} with one or two columns
#' @param Becker a logical to indicate whether \emph{Staphylococci} should be categorised into Coagulase Negative \emph{Staphylococci} ("CoNS") and Coagulase Positive \emph{Staphylococci} ("CoPS") instead of their own species, according to Karsten Becker \emph{et al.} [1].
#'
@ -65,7 +65,6 @@
#' \itemize{
#' \item{\code{"E. coli"} will return the ID of \emph{Escherichia coli} and not \emph{Entamoeba coli}, although the latter would alphabetically come first}
#' \item{\code{"H. influenzae"} will return the ID of \emph{Haemophilus influenzae} and not \emph{Haematobacter influenzae} for the same reason}
#' \item{Something like \code{"p aer"} will return the ID of \emph{Pseudomonas aeruginosa} and not \emph{Pasteurella aerogenes}}
#' \item{Something like \code{"stau"} or \code{"S aur"} will return the ID of \emph{Staphylococcus aureus} and not \emph{Staphylococcus auricularis}}
#' }
#' This means that looking up human pathogenic microorganisms takes less time than looking up human \strong{non}-pathogenic microorganisms.
@ -77,7 +76,7 @@
#' \item{It strips off values between brackets and the brackets themselves, and re-evaluates the input with all previous rules}
#' \item{It strips off words from the end one by one and re-evaluates the input with all previous rules}
#' \item{It strips off words from the start one by one and re-evaluates the input with all previous rules}
#' \item{It tries to look for some manual changes which are not yet published to the Catalogue of Life (like \emph{Propionibacterium} not yet being \emph{Cutibacterium})}
#' \item{It tries to look for some manual changes which are not (yet) published to the Catalogue of Life (like \emph{Propionibacterium} being \emph{Cutibacterium})}
#' }
#'
#' Examples:
@ -89,7 +88,7 @@
#'
#' Use \code{mo_failures()} to get a vector with all values that could not be coerced to a valid value.
#'
#' Use \code{mo_uncertainties()} to get a vector with all values that were coerced to a valid value, but with uncertainty.
#' Use \code{mo_uncertainties()} to get info about all values that were coerced to a valid value, but with uncertainty.
#'
#' Use \code{mo_renamed()} to get a vector with all values that could be coerced based on an old, previously accepted taxonomic name.
#'
@ -111,7 +110,7 @@
#'
#' [2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571-95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
#'
#' [3] Catalogue of Life: Annual Checklist (public online database), \url{www.catalogueoflife.org}.
#' [3] Catalogue of Life: Annual Checklist (public online taxonomic database), \url{www.catalogueoflife.org} (check included annual version with \code{\link{catalogue_of_life_version}()}).
#' @export
#' @return Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}.
#' @seealso \code{\link{microorganisms}} for the \code{data.frame} that is being used to determine ID's. \cr
@ -238,7 +237,9 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
}
notes <- character(0)
uncertainties <- character(0)
uncertainties <- data.frame(input = character(0),
fullname = character(0),
mo = character(0))
failures <- character(0)
x_input <- x
# already strip leading and trailing spaces
@ -695,8 +696,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
found <- microorganismsDT[tolower(fullname) %like% paste(b.x_trimmed, "species"), ..property][[1]]
if (length(found) > 0) {
x[i] <- found[1L]
uncertainties <<- c(uncertainties,
paste0("'", a.x_backup, "' >> ", microorganismsDT[mo == found[1L], fullname][[1]], " (", found[1L], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = microorganismsDT[mo == found[1L], fullname][[1]],
mo = found[1L]))
return(x)
}
}
@ -719,8 +722,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
ref_old = found[1, ref],
ref_new = microorganismsDT[col_id == found[1, col_id_new], ref],
mo = microorganismsDT[col_id == found[1, col_id_new], mo])
uncertainties <<- c(uncertainties,
paste0("'", a.x_backup, "' >> ", found[1, fullname], " (Catalogue of Life ID ", found[1, col_id], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = found[1, fullname],
mo = paste("CoL", found[1, col_id])))
return(x)
}
@ -731,8 +736,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
if (!is.na(found) & nchar(b.x_trimmed) >= 6) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- c(uncertainties,
paste0("'", a.x_backup, "' >> ", microorganismsDT[mo == found_result[1L], fullname][[1]], " (", found_result[1L], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
}
@ -745,8 +752,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
if (!is.na(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- c(uncertainties,
paste0("'", a.x_backup, "' >> ", microorganismsDT[mo == found_result[1L], fullname][[1]], " (", found_result[1L], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
}
}
@ -761,8 +770,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
if (!is.na(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
uncertainties <<- c(uncertainties,
paste0("'", a.x_backup, "' >> ", microorganismsDT[mo == found_result[1L], fullname][[1]], " (", found_result[1L], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
}
}
@ -773,11 +784,10 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
if (!is.na(found)) {
found_result <- found
found <- microorganismsDT[mo == found, ..property][[1]]
warning(silver(paste0('Guessed with uncertainty: "',
a.x_backup, '" >> ', italic(microorganismsDT[mo == found_result[1L], fullname][[1]]), " (", found_result[1L], ")")),
call. = FALSE, immediate. = FALSE)
uncertainties <<- c(uncertainties,
paste0('"', a.x_backup, '" >> ', microorganismsDT[mo == found_result[1L], fullname][[1]], " (", found_result[1L], ")"))
uncertainties <<- rbind(uncertainties,
data.frame(input = a.x_backup,
fullname = microorganismsDT[mo == found_result[1L], fullname][[1]],
mo = found_result[1L]))
return(found[1L])
}
@ -799,7 +809,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
# failures
failures <- failures[!failures %in% c(NA, NULL, NaN)]
if (length(failures) > 0) {
if (length(failures) > 0 & clear_options == TRUE) {
options(mo_failures = sort(unique(failures)))
plural <- c("value", "it")
if (n_distinct(failures) > 1) {
@ -807,7 +817,7 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
}
total_failures <- length(x_input[x_input %in% failures & !x_input %in% c(NA, NULL, NaN)])
total_n <- length(x_input[!x_input %in% c(NA, NULL, NaN)])
msg <- paste0("\n", n_distinct(failures), " unique ", plural[1],
msg <- paste0("\n", nr2char(n_distinct(failures)), " unique input ", plural[1],
" (^= ", percent(total_failures / total_n, round = 1, force_zero = TRUE),
") could not be coerced to a valid MO code")
if (n_distinct(failures) <= 10) {
@ -819,14 +829,15 @@ exec_as.mo <- function(x, Becker = FALSE, Lancefield = FALSE,
immediate. = TRUE) # thus will always be shown, even if >= warnings
}
# uncertainties
if (length(uncertainties) > 0) {
options(mo_uncertainties = sort(unique(uncertainties)))
if (NROW(uncertainties) > 0 & clear_options == TRUE) {
options(mo_uncertainties = as.list(distinct(uncertainties, input, .keep_all = TRUE)))
plural <- c("value", "it")
if (n_distinct(failures) > 1) {
if (NROW(uncertainties) > 1) {
plural <- c("values", "them")
}
msg <- paste0("\nResults of ", n_distinct(uncertainties), " input ", plural[1],
" guessed with uncertainty. Use mo_uncertainties() to review ", plural[2], ".")
msg <- paste0("\nResults of ", nr2char(NROW(uncertainties)), " input ", plural[1],
" was guessed with uncertainty. Use mo_uncertainties() to review ", plural[2], ".")
warning(red(msg),
call. = FALSE,
immediate. = TRUE) # thus will always be shown, even if >= warnings
@ -961,6 +972,7 @@ print.mo <- function(x, ...) {
}
#' @exportMethod summary.mo
#' @importFrom dplyr n_distinct
#' @export
#' @noRd
summary.mo <- function(object, ...) {
@ -969,7 +981,7 @@ summary.mo <- function(object, ...) {
top_3 <- unname(top_freq(freq(x), 3))
c("Class" = "mo",
"<NA>" = length(x[is.na(x)]),
"Unique" = dplyr::n_distinct(x[!is.na(x)]),
"Unique" = n_distinct(x[!is.na(x)]),
"#1" = top_3[1],
"#2" = top_3[2],
"#3" = top_3[3])
@ -978,7 +990,7 @@ summary.mo <- function(object, ...) {
#' @exportMethod as.data.frame.mo
#' @export
#' @noRd
as.data.frame.mo <- function (x, ...) {
as.data.frame.mo <- function(x, ...) {
# same as as.data.frame.character but with removed stringsAsFactors, since it will be class "mo"
nm <- paste(deparse(substitute(x), width.cutoff = 500L),
collapse = " ")
@ -1004,13 +1016,31 @@ mo_failures <- function() {
}
#' @rdname as.mo
#' @importFrom crayon italic
#' @export
mo_uncertainties <- function() {
getOption("mo_uncertainties")
df <- as.data.frame(getOption("mo_uncertainties"))
msg <- ""
for (i in seq_len(nrow(df))) {
msg <- paste(msg,
paste0('"', df[i, "input"], '" -> ', italic(df[i, "fullname"]), " (", df[i, "mo"], ")"),
sep = "\n")
}
cat(paste0(bold("Results guessed with uncertainty:"), msg))
}
#' @rdname as.mo
#' @export
mo_renamed <- function() {
strip_style(gsub("was renamed", ">>", getOption("mo_renamed"), fixed = TRUE))
strip_style(gsub("was renamed", "->", getOption("mo_renamed"), fixed = TRUE))
}
nr2char <- function(x) {
if (x %in% c(1:10)) {
v <- c("one" = 1, "two" = 2, "three" = 3, "four" = 4, "five" = 5,
"six" = 6, "seven" = 7, "eight" = 8, "nine" = 9, "ten" = 10)
names(v[x])
} else {
x
}
}
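
The change from a character vector to a data frame for `uncertainties` can be illustrated with a small base R sketch. The helper `add_uncertainty` and the example inputs are hypothetical, and `duplicated()` stands in here for the `distinct(uncertainties, input, .keep_all = TRUE)` call used in the diff.

```r
# Sketch only: base R stand-in for the rbind()/distinct() pattern in the diff.
uncertainties <- data.frame(input = character(0),
                            fullname = character(0),
                            mo = character(0),
                            stringsAsFactors = FALSE)

# Hypothetical helper; this name does not exist in the package.
add_uncertainty <- function(log, input, fullname, mo) {
  rbind(log,
        data.frame(input = input, fullname = fullname, mo = mo,
                   stringsAsFactors = FALSE))
}

# Made-up inputs; the same input is logged twice on purpose.
uncertainties <- add_uncertainty(uncertainties, "e coli xyz", "Escherichia coli", "B_ESCHR_COL")
uncertainties <- add_uncertainty(uncertainties, "esch col", "Escherichia coli", "B_ESCHR_COL")
uncertainties <- add_uncertainty(uncertainties, "e coli xyz", "Escherichia coli", "B_ESCHR_COL")

# Keep the first row per distinct input.
uncertainties <- uncertainties[!duplicated(uncertainties$input), , drop = FALSE]

# One line per input, in the '"input" -> fullname (mo)' layout.
lines <- paste0('"', uncertainties$input, '" -> ',
                uncertainties$fullname, " (", uncertainties$mo, ")")
cat(lines, sep = "\n")
```

Keeping the three fields in separate columns lets the printing code decide on the layout, rather than storing preformatted strings.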

View File

@ -26,13 +26,13 @@
#' @rdname mo_source
#' @name mo_source
#' @aliases set_mo_source get_mo_source
#' @details The reference file can be a text file separated with commas (CSV) or pipes, an Excel file (old 'xls' format or new 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the \code{readxl} package installed.
#' @details The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the \code{readxl} package installed.
#'
#' \code{set_mo_source} will check the file for validity: it must be a \code{data.frame}, must have a column named \code{"mo"} which contains values from \code{microorganisms$mo} and must have a reference column with your own defined values. If all tests pass, \code{set_mo_source} will read the file into R and export it to \code{"~/.mo_source.rds"}. This compressed data file will then be used at default for MO determination (function \code{\link{as.mo}} and consequently all \code{mo_*} functions like \code{\link{mo_genus}} and \code{\link{mo_gramstain}}). The location of the original file will be saved as option with \code{\link{options}(mo_source = path)}. Its timestamp will be saved with \code{\link{options}(mo_source_datetime = ...)}.
#'
#' \code{get_mo_source} will return the data set by reading \code{"~/.mo_source.rds"} with \code{\link{readRDS}}. If the original file has changed (the file defined with \code{path}), it will call \code{set_mo_source} to update the data file automatically.
#'
#' Reading an Excel file (\code{.xlsx}) with only one row has a size of 8-9 kB. The compressed file will have a size of 0.1 kB and can be read by \code{get_mo_source} in only a couple of microseconds (a millionth of a second).
#' Reading an Excel file (\code{.xlsx}) with only one row has a size of 8-9 kB. The compressed file used by this package will have a size of 0.1 kB and can be read by \code{get_mo_source} in only a couple of microseconds (a millionth of a second).
#' @importFrom dplyr select everything
#' @export
#' @inheritSection AMR Read more on our website!
@ -48,7 +48,7 @@
#' # 1. We save it as 'home/me/ourcodes.xlsx'
#'
#' # 2. We use it for input:
#' set_mo_source("C:\path\ourcodes.xlsx")
#' set_mo_source("home/me/ourcodes.xlsx")
#' #> Created mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.
#'
#' # 3. And use it in our functions:
@ -109,11 +109,20 @@ set_mo_source <- function(path) {
}
df <- readxl::read_excel(path)
} else if (path %like% '[.]tsv$') {
df <- utils::read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
} else {
# try comma first
try(
df <- utils::read.table(path, header = TRUE, sep = ",", stringsAsFactors = FALSE),
silent = TRUE)
if (!is_valid(df)) {
# try tab
try(
df <- utils::read.table(path, header = TRUE, sep = "\t", stringsAsFactors = FALSE),
silent = TRUE)
}
if (!is_valid(df)) {
# try pipe
try(

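The comma, tab and pipe fallback shown in the hunk above can be sketched in base R as follows. The helper `read_delimited_guess` is hypothetical, and the check for a column named `mo` is only an assumed stand-in for the package's internal `is_valid()`, whose definition is not part of this diff.

```r
# Hypothetical helper: try several separators and keep the first result
# that has at least two columns, one of them named "mo".
read_delimited_guess <- function(path, seps = c(",", "\t", "|")) {
  for (sep in seps) {
    df <- tryCatch(
      utils::read.table(path, header = TRUE, sep = sep,
                        stringsAsFactors = FALSE),
      error = function(e) NULL)
    if (is.data.frame(df) && ncol(df) >= 2 && "mo" %in% colnames(df)) {
      return(df)
    }
  }
  stop("could not read a table with a column named 'mo' from: ", path)
}

# A pipe-separated example file with made-up reference codes.
tmp <- tempfile(fileext = ".txt")
writeLines(c("code|mo", "ecoli|B_ESCHR_COL"), tmp)
df <- read_delimited_guess(tmp)
colnames(df)
```

Note that the file path is passed as the first argument to `read.table()`; without it the call cannot read anything.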
View File

@ -192,7 +192,7 @@
<h1>How to conduct AMR analysis</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">26 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>AMR.Rmd</code></div>
@ -201,7 +201,7 @@
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 26 February 2019.</p>
<p><strong>Note:</strong> values on this page will change with every website update since they are based on randomly created values and the page was written in <a href="https://rmarkdown.rstudio.com/">RMarkdown</a>. However, the methodology remains unchanged. This page was generated on 27 February 2019.</p>
<div id="introduction" class="section level1">
<h1 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h1>
@ -217,21 +217,21 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2019-02-26</td>
<td align="center">2019-02-27</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
</tr>
<tr class="even">
<td align="center">2019-02-26</td>
<td align="center">2019-02-27</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
</tr>
<tr class="odd">
<td align="center">2019-02-26</td>
<td align="center">2019-02-27</td>
<td align="center">efgh</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
@ -327,19 +327,41 @@
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2015-04-06</td>
<td align="center">E7</td>
<td align="center">Hospital C</td>
<td align="center">2015-01-18</td>
<td align="center">F9</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="even">
<td align="center">2017-12-07</td>
<td align="center">H7</td>
<td align="center">Hospital A</td>
<td align="center">Klebsiella pneumoniae</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2016-02-14</td>
<td align="center">J4</td>
<td align="center">Hospital A</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="even">
<td align="center">2016-10-23</td>
<td align="center">S6</td>
<td align="center">2010-12-25</td>
<td align="center">P2</td>
<td align="center">Hospital B</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
@ -349,44 +371,22 @@
<td align="center">F</td>
</tr>
<tr class="odd">
<td align="center">2010-02-02</td>
<td align="center">O1</td>
<td align="center">Hospital D</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2014-03-12</td>
<td align="center">H4</td>
<td align="center">2016-12-26</td>
<td align="center">S8</td>
<td align="center">Hospital A</td>
<td align="center">Escherichia coli</td>
<td align="center">Streptococcus pneumoniae</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
</tr>
<tr class="odd">
<td align="center">2011-11-01</td>
<td align="center">X1</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">F</td>
</tr>
<tr class="even">
<td align="center">2016-12-10</td>
<td align="center">W4</td>
<td align="center">Hospital B</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
<td align="center">2010-03-27</td>
<td align="center">R7</td>
<td align="center">Hospital D</td>
<td align="center">Klebsiella pneumoniae</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@ -411,8 +411,8 @@
#&gt;
#&gt; Item Count Percent Cum. Count Cum. Percent
#&gt; --- ----- ------- -------- ----------- -------------
#&gt; 1 M 10,479 52.4% 10,479 52.4%
#&gt; 2 F 9,521 47.6% 20,000 100.0%</code></pre>
#&gt; 1 M 10,386 51.9% 10,386 51.9%
#&gt; 2 F 9,614 48.1% 20,000 100.0%</code></pre>
<p>So, we can draw at least two conclusions immediately. From a data scientist perspective, the data looks clean: only values <code>M</code> and <code>F</code>. From a researcher perspective: there are slightly more men. Nothing we didn't already know.</p>
<p>The data is already quite clean, but we still need to transform some variables. The <code>bacteria</code> column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The <code><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate()</a></code> function of the <code>dplyr</code> package makes this really easy:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1">data &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span></a>
@ -443,10 +443,10 @@
<a class="sourceLine" id="cb14-19" title="19"><span class="co">#&gt; Kingella kingae (no changes)</span></a>
<a class="sourceLine" id="cb14-20" title="20"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-21" title="21"><span class="co">#&gt; EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1324 changes)</span></a>
<a class="sourceLine" id="cb14-22" title="22"><span class="co">#&gt; Table 1: Intrinsic resistance in Enterobacteriaceae (1291 changes)</span></a>
<a class="sourceLine" id="cb14-23" title="23"><span class="co">#&gt; Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-24" title="24"><span class="co">#&gt; Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2776 changes)</span></a>
<a class="sourceLine" id="cb14-25" title="25"><span class="co">#&gt; Table 4: Intrinsic resistance in Gram-positive bacteria (2787 changes)</span></a>
<a class="sourceLine" id="cb14-26" title="26"><span class="co">#&gt; Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)</span></a>
<a class="sourceLine" id="cb14-27" title="27"><span class="co">#&gt; Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)</span></a>
<a class="sourceLine" id="cb14-28" title="28"><span class="co">#&gt; Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)</span></a>
@ -462,9 +462,9 @@
<a class="sourceLine" id="cb14-38" title="38"><span class="co">#&gt; Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)</span></a>
<a class="sourceLine" id="cb14-39" title="39"><span class="co">#&gt; Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)</span></a>
<a class="sourceLine" id="cb14-40" title="40"><span class="co">#&gt; </span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,376 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-41" title="41"><span class="co">#&gt; =&gt; EUCAST rules affected 7,442 out of 20,000 rows</span></a>
<a class="sourceLine" id="cb14-42" title="42"><span class="co">#&gt; -&gt; added 0 test results</span></a>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 4,100 test results (0 to S; 0 to I; 4,100 to R)</span></a></code></pre></div>
<a class="sourceLine" id="cb14-43" title="43"><span class="co">#&gt; -&gt; changed 4,078 test results (0 to S; 0 to I; 4,078 to R)</span></a></code></pre></div>
</div>
<div id="adding-new-variables" class="section level1">
<h1 class="hasAnchor">
@ -489,8 +489,8 @@
<a class="sourceLine" id="cb16-3" title="3"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `bacteria` as input for `col_mo`.</span></a>
<a class="sourceLine" id="cb16-4" title="4"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `date` as input for `col_date`.</span></a>
<a class="sourceLine" id="cb16-5" title="5"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,678 first isolates (28.4% of total)</span></a></code></pre></div>
<p>So only 28.4% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<a class="sourceLine" id="cb16-6" title="6"><span class="co">#&gt; =&gt; Found 5,707 first isolates (28.5% of total)</span></a></code></pre></div>
<p>So only 28.5% is suitable for resistance analysis! We can now filter on it with the <code><a href="https://dplyr.tidyverse.org/reference/filter.html">filter()</a></code> function, also from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb17-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb17-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(first <span class="op">==</span><span class="st"> </span><span class="ot">TRUE</span>)</a></code></pre></div>
<p>For future use, the above two syntaxes can be shortened with the <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code> function:</p>
@ -516,65 +516,65 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-25</td>
<td align="center">B9</td>
<td align="center">2010-02-12</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-03-01</td>
<td align="center">B9</td>
<td align="center">2010-02-12</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-06-15</td>
<td align="center">B9</td>
<td align="center">2010-02-22</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-07-08</td>
<td align="center">B9</td>
<td align="center">2010-03-05</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-07-20</td>
<td align="center">B9</td>
<td align="center">2010-03-08</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-09-18</td>
<td align="center">B9</td>
<td align="center">2010-03-17</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@ -582,8 +582,8 @@
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-09-21</td>
<td align="center">B9</td>
<td align="center">2010-05-03</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
@ -593,33 +593,33 @@
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-11-24</td>
<td align="center">B9</td>
<td align="center">2010-07-03</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-12-08</td>
<td align="center">B9</td>
<td align="center">2010-09-11</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-01-19</td>
<td align="center">B9</td>
<td align="center">2010-09-24</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
@ -637,7 +637,7 @@
<a class="sourceLine" id="cb19-7" title="7"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `patient_id` as input for `col_patient_id`.</span></a>
<a class="sourceLine" id="cb19-8" title="8"><span class="co">#&gt; </span><span class="al">NOTE</span><span class="co">: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.</span></a>
<a class="sourceLine" id="cb19-9" title="9"><span class="co">#&gt; [Criterion] Inclusion based on key antibiotics, ignoring I.</span></a>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,822 first weighted isolates (79.1% of total)</span></a></code></pre></div>
<a class="sourceLine" id="cb19-10" title="10"><span class="co">#&gt; =&gt; Found 15,861 first weighted isolates (79.3% of total)</span></a></code></pre></div>
<table class="table">
<thead><tr class="header">
<th align="center">isolate</th>
@ -654,70 +654,70 @@
<tbody>
<tr class="odd">
<td align="center">1</td>
<td align="center">2010-01-25</td>
<td align="center">B9</td>
<td align="center">2010-02-12</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">TRUE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">2</td>
<td align="center">2010-03-01</td>
<td align="center">B9</td>
<td align="center">2010-02-12</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">3</td>
<td align="center">2010-06-15</td>
<td align="center">B9</td>
<td align="center">2010-02-22</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">4</td>
<td align="center">2010-07-08</td>
<td align="center">B9</td>
<td align="center">2010-03-05</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td align="center">5</td>
<td align="center">2010-07-20</td>
<td align="center">B9</td>
<td align="center">2010-03-08</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td align="center">6</td>
<td align="center">2010-09-18</td>
<td align="center">B9</td>
<td align="center">2010-03-17</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
@ -726,35 +726,35 @@
</tr>
<tr class="odd">
<td align="center">7</td>
<td align="center">2010-09-21</td>
<td align="center">B9</td>
<td align="center">2010-05-03</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="even">
<td align="center">8</td>
<td align="center">2010-11-24</td>
<td align="center">B9</td>
<td align="center">2010-07-03</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
<td align="center">TRUE</td>
<td align="center">FALSE</td>
</tr>
<tr class="odd">
<td align="center">9</td>
<td align="center">2010-12-08</td>
<td align="center">B9</td>
<td align="center">2010-09-11</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
@ -762,11 +762,11 @@
</tr>
<tr class="even">
<td align="center">10</td>
<td align="center">2011-01-19</td>
<td align="center">B9</td>
<td align="center">2010-09-24</td>
<td align="center">I9</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">FALSE</td>
@ -774,11 +774,11 @@
</tr>
</tbody>
</table>
<p>Instead of 1, now 9 isolates are flagged. In total, 79.1% of all isolates are marked first weighted - 50.7% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>Instead of 1, now 8 isolates are flagged. In total, 79.3% of all isolates are marked first weighted - 50.8% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.</p>
<p>As with <code><a href="../reference/first_isolate.html">filter_first_isolate()</a></code>, there's a shortcut for this new algorithm too:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" title="1">data_1st &lt;-<span class="st"> </span>data <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb20-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/first_isolate.html">filter_first_weighted_isolate</a></span>()</a></code></pre></div>
<p>So we end up with 15,822 isolates for analysis.</p>
<p>So we end up with 15,861 isolates for analysis.</p>
<p>We can remove unneeded columns:</p>
<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb21-1" title="1">data_1st &lt;-<span class="st"> </span>data_1st <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb21-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="op">-</span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(first, keyab))</a></code></pre></div>
@ -804,14 +804,14 @@
<tbody>
<tr class="odd">
<td>1</td>
<td align="center">2015-04-06</td>
<td align="center">E7</td>
<td align="center">Hospital C</td>
<td align="center">2015-01-18</td>
<td align="center">F9</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
@ -819,9 +819,25 @@
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>2</td>
<td align="center">2016-10-23</td>
<td align="center">S6</td>
<td>3</td>
<td align="center">2016-02-14</td>
<td align="center">J4</td>
<td align="center">Hospital A</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>4</td>
<td align="center">2010-12-25</td>
<td align="center">P2</td>
<td align="center">Hospital B</td>
<td align="center">B_STRPT_PNE</td>
<td align="center">S</td>
@ -834,68 +850,52 @@
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>3</td>
<td align="center">2010-02-02</td>
<td align="center">O1</td>
<td align="center">Hospital D</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>5</td>
<td align="center">2011-11-01</td>
<td align="center">X1</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">R</td>
<td align="center">2016-12-26</td>
<td align="center">S8</td>
<td align="center">Hospital A</td>
<td align="center">B_STRPT_PNE</td>
<td align="center">S</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">R</td>
<td align="center">F</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">Gram positive</td>
<td align="center">Streptococcus</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="odd">
<td>6</td>
<td align="center">2016-12-10</td>
<td align="center">W4</td>
<td align="center">Hospital B</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">2010-03-27</td>
<td align="center">R7</td>
<td align="center">Hospital D</td>
<td align="center">B_KLBSL_PNE</td>
<td align="center">R</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">Klebsiella</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
<tr class="even">
<td>7</td>
<td align="center">2015-07-07</td>
<td align="center">P8</td>
<td align="center">Hospital D</td>
<td align="center">B_ESCHR_COL</td>
<td align="center">S</td>
<td>8</td>
<td align="center">2016-08-08</td>
<td align="center">K8</td>
<td align="center">Hospital B</td>
<td align="center">B_KLBSL_PNE</td>
<td align="center">R</td>
<td align="center">I</td>
<td align="center">S</td>
<td align="center">S</td>
<td align="center">F</td>
<td align="center">M</td>
<td align="center">Gram negative</td>
<td align="center">Escherichia</td>
<td align="center">coli</td>
<td align="center">Klebsiella</td>
<td align="center">pneumoniae</td>
<td align="center">TRUE</td>
</tr>
</tbody>
@ -915,9 +915,9 @@
<div class="sourceCode" id="cb23"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb23-1" title="1"><span class="kw"><a href="../reference/freq.html">freq</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/paste">paste</a></span>(data_1st<span class="op">$</span>genus, data_1st<span class="op">$</span>species))</a></code></pre></div>
<p>Or it can be used the <code>dplyr</code> way, which is easier to read:</p>
<div class="sourceCode" id="cb24"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb24-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(genus, species)</a></code></pre></div>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from a <code>data.frame</code> (15,822 x 13)</strong></p>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from a <code>data.frame</code> (15,861 x 13)</strong></p>
<p>Columns: 2<br>
Length: 15,822 (of which NA: 0 = 0.00%)<br>
Length: 15,861 (of which NA: 0 = 0.00%)<br>
Unique: 4</p>
<p>Shortest: 16<br>
Longest: 24</p>
@ -934,33 +934,33 @@ Longest: 24</p>
<tr class="odd">
<td align="left">1</td>
<td align="left">Escherichia coli</td>
<td align="right">7,838</td>
<td align="right">49.5%</td>
<td align="right">7,838</td>
<td align="right">49.5%</td>
<td align="right">7,879</td>
<td align="right">49.7%</td>
<td align="right">7,879</td>
<td align="right">49.7%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">Staphylococcus aureus</td>
<td align="right">3,965</td>
<td align="right">25.1%</td>
<td align="right">11,803</td>
<td align="right">74.6%</td>
<td align="right">3,915</td>
<td align="right">24.7%</td>
<td align="right">11,794</td>
<td align="right">74.4%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">Streptococcus pneumoniae</td>
<td align="right">2,457</td>
<td align="right">15.5%</td>
<td align="right">14,260</td>
<td align="right">90.1%</td>
<td align="right">2,482</td>
<td align="right">15.6%</td>
<td align="right">14,276</td>
<td align="right">90.0%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">Klebsiella pneumoniae</td>
<td align="right">1,562</td>
<td align="right">9.9%</td>
<td align="right">15,822</td>
<td align="right">1,585</td>
<td align="right">10.0%</td>
<td align="right">15,861</td>
<td align="right">100.0%</td>
</tr>
</tbody>
@ -971,7 +971,7 @@ Longest: 24</p>
<a href="#resistance-percentages" class="anchor"></a>Resistance percentages</h2>
<p>The functions <code>portion_R</code>, <code>portion_IR</code>, <code>portion_I</code>, <code>portion_SI</code> and <code>portion_S</code> can be used to determine the portion of a specific antimicrobial outcome. They can be used on their own:</p>
<div class="sourceCode" id="cb25"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb25-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/portion.html">portion_IR</a></span>(amox)</a>
<a class="sourceLine" id="cb25-2" title="2"><span class="co">#&gt; [1] 0.4722538</span></a></code></pre></div>
<a class="sourceLine" id="cb25-2" title="2"><span class="co">#&gt; [1] 0.4744341</span></a></code></pre></div>
<p>Or they can be used in conjunction with <code><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by()</a></code> and <code><a href="https://dplyr.tidyverse.org/reference/summarise.html">summarise()</a></code>, both from the <code>dplyr</code> package:</p>
<div class="sourceCode" id="cb26"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb26-1" title="1">data_1st <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb26-2" title="2"><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(hospital) <span class="op">%&gt;%</span><span class="st"> </span></a>
@ -984,19 +984,19 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Hospital A</td>
<td align="center">0.4692014</td>
<td align="center">0.4759916</td>
</tr>
<tr class="even">
<td align="center">Hospital B</td>
<td align="center">0.4694061</td>
<td align="center">0.4808997</td>
</tr>
<tr class="odd">
<td align="center">Hospital C</td>
<td align="center">0.4845361</td>
<td align="center">0.4682779</td>
</tr>
<tr class="even">
<td align="center">Hospital D</td>
<td align="center">0.4727669</td>
<td align="center">0.4651015</td>
</tr>
</tbody>
</table>
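<p>As a rough illustration of what such a portion represents, the share of I or R results can be sketched in base R. This is only a sketch on hypothetical toy data, assuming that missing results are left out of the denominator; it is not the package's implementation, which may apply further checks.</p>

```r
# Hypothetical toy data; not taken from the package's example data set
results <- data.frame(
  hospital = c("A", "A", "A", "B", "B", "B", "B"),
  amox     = c("R", "S", "I", "S", "S", "R", NA),
  stringsAsFactors = FALSE
)

# Assumption: untested (NA) results do not count towards the denominator
tested <- results[!is.na(results$amox), ]

# Overall share of results that are I or R
share_IR_overall <- mean(tested$amox %in% c("I", "R"))

# The same share, computed separately per hospital
share_IR_by_hospital <- tapply(tested$amox %in% c("I", "R"),
                               tested$hospital,
                               mean)

share_IR_overall
share_IR_by_hospital
```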
@ -1014,23 +1014,23 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Hospital A</td>
<td align="center">0.4692014</td>
<td align="center">4708</td>
<td align="center">0.4759916</td>
<td align="center">4790</td>
</tr>
<tr class="even">
<td align="center">Hospital B</td>
<td align="center">0.4694061</td>
<td align="center">5573</td>
<td align="center">0.4808997</td>
<td align="center">5602</td>
</tr>
<tr class="odd">
<td align="center">Hospital C</td>
<td align="center">0.4845361</td>
<td align="center">2328</td>
<td align="center">0.4682779</td>
<td align="center">2317</td>
</tr>
<tr class="even">
<td align="center">Hospital D</td>
<td align="center">0.4727669</td>
<td align="center">3213</td>
<td align="center">0.4651015</td>
<td align="center">3152</td>
</tr>
</tbody>
</table>
@ -1050,27 +1050,27 @@ Longest: 24</p>
<tbody>
<tr class="odd">
<td align="center">Escherichia</td>
<td align="center">0.7269712</td>
<td align="center">0.9050778</td>
<td align="center">0.9744833</td>
<td align="center">0.7292804</td>
<td align="center">0.8975758</td>
<td align="center">0.9772814</td>
</tr>
<tr class="even">
<td align="center">Klebsiella</td>
<td align="center">0.7349552</td>
<td align="center">0.8988476</td>
<td align="center">0.9763124</td>
<td align="center">0.7438486</td>
<td align="center">0.9015773</td>
<td align="center">0.9741325</td>
</tr>
<tr class="odd">
<td align="center">Staphylococcus</td>
<td align="center">0.7263556</td>
<td align="center">0.9235813</td>
<td align="center">0.9793190</td>
<td align="center">0.7315453</td>
<td align="center">0.9154534</td>
<td align="center">0.9793103</td>
</tr>
<tr class="even">
<td align="center">Streptococcus</td>
<td align="center">0.7391127</td>
<td align="center">0.7352941</td>
<td align="center">0.0000000</td>
<td align="center">0.7391127</td>
<td align="center">0.7352941</td>
</tr>
</tbody>
</table>

View File

@ -192,7 +192,7 @@
<h1>How to apply EUCAST rules</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">26 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>EUCAST.Rmd</code></div>

View File

@ -192,7 +192,7 @@
<h1>How to use the <em>G</em>-test</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">26 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>G_test.Rmd</code></div>

View File

@ -192,7 +192,7 @@
<h1>How to work with WHONET data</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">26 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>WHONET.Rmd</code></div>

View File

@ -192,7 +192,7 @@
<h1>How to get properties of an antibiotic</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">26 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>atc_property.Rmd</code></div>

View File

@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
</span>
</div>
@ -192,7 +192,7 @@
<h1>Benchmarks</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">25 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>benchmarks.Rmd</code></div>
@ -217,15 +217,15 @@
<a class="sourceLine" id="cb2-8" title="8"> <span class="dt">times =</span> <span class="dv">10</span>)</a>
<a class="sourceLine" id="cb2-9" title="9"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(S.aureus, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb2-10" title="10"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb2-11" title="11"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb2-12" title="12"><span class="co">#&gt; as.mo("sau") 13.40 13.60 17.8 13.60 13.80 51.6 10</span></a>
<a class="sourceLine" id="cb2-13" title="13"><span class="co">#&gt; as.mo("stau") 83.00 83.30 96.5 85.30 88.40 163.0 10</span></a>
<a class="sourceLine" id="cb2-14" title="14"><span class="co">#&gt; as.mo("staaur") 13.50 13.50 19.1 13.70 14.90 51.5 10</span></a>
<a class="sourceLine" id="cb2-15" title="15"><span class="co">#&gt; as.mo("STAAUR") 13.50 13.50 14.1 13.60 13.70 18.2 10</span></a>
<a class="sourceLine" id="cb2-16" title="16"><span class="co">#&gt; as.mo("S. aureus") 21.40 21.40 22.1 21.50 21.70 25.4 10</span></a>
<a class="sourceLine" id="cb2-17" title="17"><span class="co">#&gt; as.mo("S. aureus") 21.40 21.40 25.7 21.60 23.30 60.1 10</span></a>
<a class="sourceLine" id="cb2-18" title="18"><span class="co">#&gt; as.mo("Staphylococcus aureus") 5.63 5.87 15.2 5.94 8.32 57.8 10</span></a></code></pre></div>
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 10 milliseconds means it can determine 100 input values per second. In the case of 50 milliseconds, this is only 20 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first is a WHONET code) or common laboratory codes, or common full organism names like the last one.</p>
<a class="sourceLine" id="cb2-11" title="11"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb2-12" title="12"><span class="co">#&gt; as.mo("sau") 15.40 15.50 22.70 15.60 15.90 53.3 10</span></a>
<a class="sourceLine" id="cb2-13" title="13"><span class="co">#&gt; as.mo("stau") 84.20 84.30 86.60 84.60 86.60 102.0 10</span></a>
<a class="sourceLine" id="cb2-14" title="14"><span class="co">#&gt; as.mo("staaur") 15.40 15.40 19.70 15.50 15.60 57.1 10</span></a>
<a class="sourceLine" id="cb2-15" title="15"><span class="co">#&gt; as.mo("STAAUR") 15.40 15.40 15.50 15.50 15.60 15.9 10</span></a>
<a class="sourceLine" id="cb2-16" title="16"><span class="co">#&gt; as.mo("S. aureus") 23.50 23.50 31.10 23.50 23.60 61.7 10</span></a>
<a class="sourceLine" id="cb2-17" title="17"><span class="co">#&gt; as.mo("S. aureus") 23.50 23.50 36.50 23.50 61.60 74.3 10</span></a>
<a class="sourceLine" id="cb2-18" title="18"><span class="co">#&gt; as.mo("Staphylococcus aureus") 7.19 7.27 9.01 7.44 7.67 23.2 10</span></a></code></pre></div>
<p>In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 5 milliseconds means it can determine 200 input values per second. In the case of 100 milliseconds, this is only 10 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first one is a WHONET code) or common laboratory codes, or common full organism names like the last one. Full organism names are always preferred.</p>
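<p>The conversion from time per input value to throughput is a simple division. A minimal sketch, using the two example timings from the paragraph above (the variable names are illustrative only):</p>

```r
# Time needed per input value, in milliseconds
ms_per_value <- c(5, 100)

# One second is 1000 milliseconds, so throughput is 1000 divided by the timing
values_per_second <- 1000 / ms_per_value

values_per_second
```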
<p>To achieve this speed, the <code>as.mo</code> function also takes into account the prevalence of human pathogenic microorganisms. The downside is of course that less prevalent microorganisms will be determined more slowly. See this example for the ID of <em>Thermus islandicus</em> (<code>B_THERMS_ISL</code>), a bug probably never found before in humans:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1">T.islandicus &lt;-<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/microbenchmark/topics/microbenchmark">microbenchmark</a></span>(<span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"theisl"</span>),</a>
<a class="sourceLine" id="cb3-2" title="2"> <span class="kw"><a href="../reference/as.mo.html">as.mo</a></span>(<span class="st">"THEISL"</span>),</a>
@ -236,12 +236,12 @@
<a class="sourceLine" id="cb3-7" title="7"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(T.islandicus, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb3-8" title="8"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb3-9" title="9"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb3-10" title="10"><span class="co">#&gt; as.mo("theisl") 448.0 486.0 483.0 489.0 490.0 510.0 10</span></a>
<a class="sourceLine" id="cb3-11" title="11"><span class="co">#&gt; as.mo("THEISL") 447.0 489.0 487.0 491.0 493.0 499.0 10</span></a>
<a class="sourceLine" id="cb3-12" title="12"><span class="co">#&gt; as.mo("T. islandicus") 78.0 78.2 78.9 78.7 78.9 82.3 10</span></a>
<a class="sourceLine" id="cb3-13" title="13"><span class="co">#&gt; as.mo("T. islandicus") 78.1 78.3 84.4 78.8 81.3 129.0 10</span></a>
<a class="sourceLine" id="cb3-14" title="14"><span class="co">#&gt; as.mo("Thermus islandicus") 61.8 62.1 75.4 62.8 104.0 109.0 10</span></a></code></pre></div>
<p>That takes 8 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like <em>Thermus islandicus</em>) are fairly fast - these are the most probable input from most data sets.</p>
<a class="sourceLine" id="cb3-10" title="10"><span class="co">#&gt; as.mo("theisl") 444.0 449.0 479.0 488.0 493.0 506.0 10</span></a>
<a class="sourceLine" id="cb3-11" title="11"><span class="co">#&gt; as.mo("THEISL") 444.0 484.0 488.0 491.0 507.0 514.0 10</span></a>
<a class="sourceLine" id="cb3-12" title="12"><span class="co">#&gt; as.mo("T. islandicus") 80.5 80.8 87.8 81.3 89.9 118.0 10</span></a>
<a class="sourceLine" id="cb3-13" title="13"><span class="co">#&gt; as.mo("T. islandicus") 79.8 80.4 82.0 80.7 81.2 93.5 10</span></a>
<a class="sourceLine" id="cb3-14" title="14"><span class="co">#&gt; as.mo("Thermus islandicus") 63.4 63.5 72.3 64.0 64.5 107.0 10</span></a></code></pre></div>
<p>That takes 7.7 times as much time on average. A value of 100 milliseconds means it can only determine ~10 different input values per second. We can conclude that looking up arbitrary codes of less prevalent microorganisms is the worst way to go, in terms of calculation performance. Full names (like <em>Thermus islandicus</em>) are fairly fast - these are the most probable input from most data sets.</p>
<p>In the figure below, we compare <em>Escherichia coli</em> (which is very common) with <em>Prevotella brevis</em> (which is moderately common) and with <em>Thermus islandicus</em> (which is very uncommon):</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/graphics/topics/par">par</a></span>(<span class="dt">mar =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="dv">5</span>, <span class="dv">16</span>, <span class="dv">4</span>, <span class="dv">2</span>)) <span class="co"># set more space for left margin text (16)</span></a>
<a class="sourceLine" id="cb4-2" title="2"></a>
@ -287,8 +287,8 @@
<a class="sourceLine" id="cb5-24" title="24"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb5-25" title="25"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb5-26" title="26"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb5-27" title="27"><span class="co">#&gt; mo_fullname(x) 741 746 806 778 827 968 10</span></a></code></pre></div>
<p>So transforming 500,000 values (!!) of 50 unique values only takes 0.78 seconds (778 ms). You only lose time on your unique input values.</p>
<a class="sourceLine" id="cb5-27" title="27"><span class="co">#&gt; mo_fullname(x) 743 771 805 798 844 886 10</span></a></code></pre></div>
<p>So transforming 500,000 values (!!) of 50 unique values only takes 0.8 seconds (798 ms). You only lose time on your unique input values.</p>
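<p>The general idea of paying only for unique input values can be sketched in base R: run the expensive step once per unique value, then map the results back onto the full vector. This is a sketch under assumptions, not the package's actual implementation; <code>slow_lookup</code> is a hypothetical stand-in for any costly per-value function.</p>

```r
# Hypothetical stand-in for an expensive per-value lookup
slow_lookup <- function(code) {
  toupper(code)
}

# A long input vector that contains only a few distinct values
x <- sample(c("esccol", "staaur", "klepne"), size = 1000, replace = TRUE)

# Run the expensive step once per unique value ...
u <- unique(x)
res_u <- vapply(u, slow_lookup, character(1), USE.NAMES = FALSE)

# ... then map the results back to every position of the original vector
res <- res_u[match(x, u)]

head(res)
```

<p>With this pattern the cost scales with the number of unique values rather than with the total length of the input.</p>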
</div>
<div id="precalculated-results" class="section level3">
<h3 class="hasAnchor">
@ -300,11 +300,11 @@
<a class="sourceLine" id="cb6-4" title="4"> <span class="dt">times =</span> <span class="dv">10</span>)</a>
<a class="sourceLine" id="cb6-5" title="5"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb6-6" title="6"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb6-7" title="7"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb6-8" title="8"><span class="co">#&gt; A 10.200 10.300 10.600 10.400 11.00 11.300 10</span></a>
<a class="sourceLine" id="cb6-9" title="9"><span class="co">#&gt; B 20.500 20.700 21.300 21.400 22.00 22.100 10</span></a>
<a class="sourceLine" id="cb6-10" title="10"><span class="co">#&gt; C 0.308 0.504 0.589 0.591 0.73 0.863 10</span></a></code></pre></div>
<p>So going from <code><a href="../reference/mo_property.html">mo_fullname("Staphylococcus aureus")</a></code> to <code>"Staphylococcus aureus"</code> takes 0.0006 seconds - it doesn't even start calculating <em>if the result would be the same as the expected resulting value</em>. That goes for all helper functions:</p>
<a class="sourceLine" id="cb6-7" title="7"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb6-8" title="8"><span class="co">#&gt; A 10.900 11.100 11.200 11.200 11.300 11.400 10</span></a>
<a class="sourceLine" id="cb6-9" title="9"><span class="co">#&gt; B 21.300 21.400 21.600 21.600 21.700 22.000 10</span></a>
<a class="sourceLine" id="cb6-10" title="10"><span class="co">#&gt; C 0.302 0.313 0.492 0.532 0.569 0.725 10</span></a></code></pre></div>
<p>So going from <code><a href="../reference/mo_property.html">mo_fullname("Staphylococcus aureus")</a></code> to <code>"Staphylococcus aureus"</code> takes 0.0005 seconds - it doesn't even start calculating <em>if the result would be the same as the expected resulting value</em>. That goes for all helper functions:</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" title="1">run_it &lt;-<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/microbenchmark/topics/microbenchmark">microbenchmark</a></span>(<span class="dt">A =</span> <span class="kw"><a href="../reference/mo_property.html">mo_species</a></span>(<span class="st">"aureus"</span>),</a>
<a class="sourceLine" id="cb7-2" title="2"> <span class="dt">B =</span> <span class="kw"><a href="../reference/mo_property.html">mo_genus</a></span>(<span class="st">"Staphylococcus"</span>),</a>
<a class="sourceLine" id="cb7-3" title="3"> <span class="dt">C =</span> <span class="kw"><a href="../reference/mo_property.html">mo_fullname</a></span>(<span class="st">"Staphylococcus aureus"</span>),</a>
@ -317,14 +317,14 @@
<a class="sourceLine" id="cb7-10" title="10"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">3</span>)</a>
<a class="sourceLine" id="cb7-11" title="11"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb7-12" title="12"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb7-13" title="13"><span class="co">#&gt; A 0.318 0.340 0.388 0.382 0.434 0.474 10</span></a>
<a class="sourceLine" id="cb7-14" title="14"><span class="co">#&gt; B 0.339 0.362 0.424 0.428 0.449 0.555 10</span></a>
<a class="sourceLine" id="cb7-15" title="15"><span class="co">#&gt; C 0.331 0.369 0.522 0.526 0.637 0.673 10</span></a>
<a class="sourceLine" id="cb7-16" title="16"><span class="co">#&gt; D 0.269 0.278 0.313 0.300 0.353 0.384 10</span></a>
<a class="sourceLine" id="cb7-17" title="17"><span class="co">#&gt; E 0.252 0.266 0.322 0.302 0.349 0.448 10</span></a>
<a class="sourceLine" id="cb7-18" title="18"><span class="co">#&gt; F 0.241 0.264 0.310 0.313 0.347 0.379 10</span></a>
<a class="sourceLine" id="cb7-19" title="19"><span class="co">#&gt; G 0.241 0.258 0.310 0.317 0.355 0.386 10</span></a>
<a class="sourceLine" id="cb7-20" title="20"><span class="co">#&gt; H 0.278 0.289 0.316 0.313 0.334 0.375 10</span></a></code></pre></div>
<a class="sourceLine" id="cb7-13" title="13"><span class="co">#&gt; A 0.330 0.399 0.444 0.425 0.480 0.599 10</span></a>
<a class="sourceLine" id="cb7-14" title="14"><span class="co">#&gt; B 0.343 0.362 0.386 0.376 0.425 0.439 10</span></a>
<a class="sourceLine" id="cb7-15" title="15"><span class="co">#&gt; C 0.327 0.454 0.550 0.571 0.640 0.816 10</span></a>
<a class="sourceLine" id="cb7-16" title="16"><span class="co">#&gt; D 0.273 0.306 0.329 0.319 0.366 0.392 10</span></a>
<a class="sourceLine" id="cb7-17" title="17"><span class="co">#&gt; E 0.246 0.266 0.295 0.286 0.323 0.364 10</span></a>
<a class="sourceLine" id="cb7-18" title="18"><span class="co">#&gt; F 0.260 0.265 0.320 0.312 0.364 0.407 10</span></a>
<a class="sourceLine" id="cb7-19" title="19"><span class="co">#&gt; G 0.238 0.252 0.281 0.270 0.319 0.339 10</span></a>
<a class="sourceLine" id="cb7-20" title="20"><span class="co">#&gt; H 0.251 0.278 0.316 0.320 0.358 0.381 10</span></a></code></pre></div>
<p>Of course, when running <code><a href="../reference/mo_property.html">mo_phylum("Firmicutes")</a></code> the function has zero knowledge about the actual microorganism, namely <em>S. aureus</em>. But since the result would be <code>"Firmicutes"</code> too, there is no point in calculating the result. And because this package knows all phyla of all known bacteria (according to the Catalogue of Life), it can just return the initial value immediately.</p>
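<p>The early return described above can be sketched in base R. This is only an illustration, not the package's implementation: <code>known_phyla</code>, <code>slow_lookup()</code> and <code>sketch_mo_phylum()</code> are hypothetical names, and the list of phyla is a tiny illustrative subset.</p>

```r
# Illustrative subset only; the package itself knows all phyla.
known_phyla <- c("Firmicutes", "Proteobacteria", "Actinobacteria")

# Hypothetical stand-in for the real, much slower taxonomic lookup.
slow_lookup <- function(x) {
  Sys.sleep(0.01)
  rep(NA_character_, length(x))
}

# Hypothetical helper: skip the lookup when the input already is a valid result.
sketch_mo_phylum <- function(x) {
  if (all(x %in% known_phyla)) {
    return(x)
  }
  slow_lookup(x)
}

sketch_mo_phylum("Firmicutes")
```

<p>Because the membership check is a plain vector comparison, the cost of the call no longer depends on the slow lookup when the input is already a valid phylum.</p>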
</div>
<div id="results-in-other-languages" class="section level3">
@ -351,13 +351,13 @@
<a class="sourceLine" id="cb8-18" title="18"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/print">print</a></span>(run_it, <span class="dt">unit =</span> <span class="st">"ms"</span>, <span class="dt">signif =</span> <span class="dv">4</span>)</a>
<a class="sourceLine" id="cb8-19" title="19"><span class="co">#&gt; Unit: milliseconds</span></a>
<a class="sourceLine" id="cb8-20" title="20"><span class="co">#&gt; expr min lq mean median uq max neval</span></a>
<a class="sourceLine" id="cb8-21" title="21"><span class="co">#&gt; en 13.23 13.57 16.92 13.69 13.73 46.78 10</span></a>
<a class="sourceLine" id="cb8-22" title="22"><span class="co">#&gt; de 22.09 22.20 25.72 22.32 23.16 55.31 10</span></a>
<a class="sourceLine" id="cb8-23" title="23"><span class="co">#&gt; nl 21.66 22.03 22.12 22.15 22.20 22.52 10</span></a>
<a class="sourceLine" id="cb8-24" title="24"><span class="co">#&gt; es 21.67 22.07 22.32 22.16 22.45 23.26 10</span></a>
<a class="sourceLine" id="cb8-25" title="25"><span class="co">#&gt; it 21.64 21.86 22.35 22.21 22.48 23.90 10</span></a>
<a class="sourceLine" id="cb8-26" title="26"><span class="co">#&gt; fr 21.70 22.10 28.72 22.21 22.33 55.28 10</span></a>
<a class="sourceLine" id="cb8-27" title="27"><span class="co">#&gt; pt 21.78 22.12 28.83 22.19 22.21 55.99 10</span></a></code></pre></div>
<a class="sourceLine" id="cb8-21" title="21"><span class="co">#&gt; en 14.37 14.43 17.91 14.64 14.82 47.42 10</span></a>
<a class="sourceLine" id="cb8-22" title="22"><span class="co">#&gt; de 22.59 22.88 27.57 23.00 23.55 67.95 10</span></a>
<a class="sourceLine" id="cb8-23" title="23"><span class="co">#&gt; nl 22.50 22.91 26.39 22.94 23.01 57.05 10</span></a>
<a class="sourceLine" id="cb8-24" title="24"><span class="co">#&gt; es 22.56 22.76 26.83 23.05 24.02 57.31 10</span></a>
<a class="sourceLine" id="cb8-25" title="25"><span class="co">#&gt; it 22.53 22.86 29.52 22.97 23.29 56.11 10</span></a>
<a class="sourceLine" id="cb8-26" title="26"><span class="co">#&gt; fr 22.49 22.92 23.06 23.01 23.18 23.99 10</span></a>
<a class="sourceLine" id="cb8-27" title="27"><span class="co">#&gt; pt 22.49 22.86 23.21 23.06 23.62 24.09 10</span></a></code></pre></div>
<p>Currently supported are German, Dutch, Spanish, Italian, French and Portuguese.</p>
</div>
</div>

Binary file not shown.

View File

@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
</span>
</div>
@ -192,7 +192,7 @@
<h1>How to create frequency tables</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">25 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>freq.Rmd</code></div>

View File

@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
</span>
</div>
@ -192,7 +192,7 @@
<h1>How to get properties of a microorganism</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">25 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>mo_property.Rmd</code></div>

View File

@ -40,7 +40,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
</span>
</div>
@ -192,7 +192,7 @@
<h1>How to predict antimicrobial resistance</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">25 February 2019</h4>
<h4 class="date">27 February 2019</h4>
<div class="hidden name"><code>resistance_predict.Rmd</code></div>

View File

@ -78,7 +78,7 @@
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9018</span>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.5.0.9019</span>
</span>
</div>
@ -252,13 +252,11 @@
<li>Catalogue of Life as a new taxonomic source for data about microorganisms, which also contains all ITIS data we used previously. The <code>microorganisms</code> data set now contains:
<ul>
<li>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</li>
<li>
<p>All ~3,000 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales and Schizosaccharomycetales.</p>
The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant (sub)species are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</li>
<li>All ~3,000 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales and Schizosaccharomycetales (covering at least all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>)</li>
<li>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</li>
<li>
<p>The responsible author(s) and year of scientific publication</p>
This data is updated annually - check the included version with <code><a href="../reference/catalogue_of_life_version.html">catalogue_of_life_version()</a></code>.</li>
This data is updated annually - check the included version with the new function <code><a href="../reference/catalogue_of_life_version.html">catalogue_of_life_version()</a></code>.</li>
<li>Due to this change, some <code>mo</code> codes changed (e.g. <em>Streptococcus</em> changed from <code>B_STRPTC</code> to <code>B_STRPT</code>). A translation table is used internally to support older microorganism IDs, so users will not notice this difference.</li>
</ul>
</li>
@ -338,7 +336,7 @@ These functions use <code><a href="../reference/as.atc.html">as.atc()</a></code>
</li>
<li>Understanding of highly virulent <em>E. coli</em> strains like EIEC, EPEC and STEC</li>
<li>Uncertain results are now looked for by default - these results will be returned with an informative warning</li>
<li>Manual now contains more info about the algorithms</li>
<li>Manual (help page) now contains more info about the algorithms</li>
<li>Progress bar will be shown when it takes more than 3 seconds to get results</li>
<li>Support for formatted console text</li>
<li>Console will return the percentage of uncoercible input</li>

View File

@ -268,8 +268,8 @@
<li><p>A character:</p><ul>
<li><p><code>"children"</code>, equivalent of: <code><a href='https://www.rdocumentation.org/packages/base/topics/c'>c(0, 1, 2, 4, 6, 13, 18)</a></code>. This will split on 0, 1, 2-3, 4-5, 6-12, 13-17 and 18+.</p></li>
<li><p><code>"elderly"</code> or <code>"seniors"</code>, equivalent of: <code><a href='https://www.rdocumentation.org/packages/base/topics/c'>c(65, 75, 85, 95)</a></code>. This will split on 0-64, 65-74, 75-84, 85-94 and 95+.</p></li>
<li><p><code>"fives"</code>, equivalent of: <code>1:20 * 5</code>. This will split on 0-4, 5-9, 10-14, 15-19 and so forth.</p></li>
<li><p><code>"tens"</code>, equivalent of: <code>1:10 * 10</code>. This will split on 0-9, 10-19, 20-29 and so forth.</p></li>
<li><p><code>"fives"</code>, equivalent of: <code>1:24 * 5</code>. This will split on 0-4, 5-9, 10-14, 15-19 and so forth, until 120.</p></li>
<li><p><code>"tens"</code>, equivalent of: <code>1:12 * 10</code>. This will split on 0-9, 10-19, 20-29 and so forth, until 120.</p></li>
</ul></li>
</ul>
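<p>How split points turn into labelled groups can be sketched in base R. This is a minimal sketch under stated assumptions, not the package's own code: <code>sketch_age_groups()</code> is a hypothetical helper, and it assumes non-negative ages.</p>

```r
# Hypothetical helper (not the AMR implementation): label ages by split points.
# Assumes all ages are >= 0.
sketch_age_groups <- function(x, split_at = c(12, 25, 55, 75)) {
  split_at <- sort(unique(as.integer(split_at)))
  # make sure the lowest group starts at 0
  if (split_at[1] != 0) {
    split_at <- c(0L, split_at)
  }
  # the upper bound of each group is one below the next split point;
  # the last group is open-ended
  uppers <- c(split_at[-1] - 1L, NA)
  labels <- ifelse(is.na(uppers),
                   paste0(split_at, "+"),
                   ifelse(uppers == split_at,
                          as.character(split_at),
                          paste0(split_at, "-", uppers)))
  # findInterval() gives the index of the group each age falls into
  factor(labels[findInterval(x, split_at)], levels = labels, ordered = TRUE)
}

ages <- c(3, 8, 16, 54, 31, 76, 101, 43, 21)
sketch_age_groups(ages, c(20, 50))
sketch_age_groups(ages, 1:12 * 10)
```

<p>With <code>1:12 * 10</code> the last split point is 120, so an age of 101 gets its own ten-year group instead of falling into an open-ended top group.</p>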
@ -294,11 +294,11 @@
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='fl'>20</span>, <span class='fl'>50</span>))
<span class='co'># split into groups of ten years</span>
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='fl'>1</span>:<span class='fl'>10</span> * <span class='fl'>10</span>)
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='fl'>1</span>:<span class='fl'>12</span> * <span class='fl'>10</span>)
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='kw'>split_at</span> <span class='kw'>=</span> <span class='st'>"tens"</span>)
<span class='co'># split into groups of five years</span>
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='fl'>1</span>:<span class='fl'>20</span> * <span class='fl'>5</span>)
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='fl'>1</span>:<span class='fl'>24</span> * <span class='fl'>5</span>)
<span class='fu'>age_groups</span>(<span class='no'>ages</span>, <span class='kw'>split_at</span> <span class='kw'>=</span> <span class='st'>"fives"</span>)
<span class='co'># split specifically for children</span>

View File

@ -47,7 +47,7 @@
<script src="../extra.js"></script>
<meta property="og:title" content="Transform to microorganism ID — as.mo" />
<meta property="og:description" content="Use this function to determine a valid microorganism ID (mo). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms Bacteria, Fungi and Protozoa (see Source), so the input can be almost anything: a full name (like &quot;Staphylococcus aureus&quot;), an abbreviated name (like &quot;S. aureus&quot;), an abbreviation known in the field (like &quot;MRSA&quot;), or just a genus. You could also select a genus and species column, see Examples." />
<meta property="og:description" content="Use this function to determine a valid microorganism ID (mo). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms Archaea, Bacteria, Protozoa, Viruses and most microbial species from the kingdom Fungi (see Source), so the input can be almost anything: a full name (like &quot;Staphylococcus aureus&quot;), an abbreviated name (like &quot;S. aureus&quot;), an abbreviation known in the field (like &quot;MRSA&quot;), or just a genus. You could also select a genus and species column, see Examples." />
<meta property="og:image" content="https://msberends.gitlab.io/AMR/logo.png" />
<meta name="twitter:card" content="summary" />
@ -237,7 +237,7 @@
<div class="ref-description">
<p>Use this function to determine a valid microorganism ID (<code>mo</code>). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms <em>Bacteria</em>, <em>Fungi</em> and <em>Protozoa</em> (see Source), so the input can be almost anything: a full name (like <code>"Staphylococcus aureus"</code>), an abbreviated name (like <code>"S. aureus"</code>), an abbreviation known in the field (like <code>"MRSA"</code>), or just a genus. You could also <code>select</code> a genus and species column, see Examples.</p>
<p>Use this function to determine a valid microorganism ID (<code>mo</code>). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms Archaea, Bacteria, Protozoa, Viruses and most microbial species from the kingdom Fungi (see Source), so the input can be almost anything: a full name (like <code>"Staphylococcus aureus"</code>), an abbreviated name (like <code>"S. aureus"</code>), an abbreviation known in the field (like <code>"MRSA"</code>), or just a genus. You could also <code>select</code> a genus and species column, see Examples.</p>
</div>
@ -309,7 +309,6 @@
<p>A couple of effects because of these rules:</p><ul>
<li><p><code>"E. coli"</code> will return the ID of <em>Escherichia coli</em> and not <em>Entamoeba coli</em>, although the latter would alphabetically come first</p></li>
<li><p><code>"H. influenzae"</code> will return the ID of <em>Haemophilus influenzae</em> and not <em>Haematobacter influenzae</em> for the same reason</p></li>
<li><p>Something like <code>"p aer"</code> will return the ID of <em>Pseudomonas aeruginosa</em> and not <em>Pasteurella aerogenes</em></p></li>
<li><p>Something like <code>"stau"</code> or <code>"S aur"</code> will return the ID of <em>Staphylococcus aureus</em> and not <em>Staphylococcus auricularis</em></p></li>
</ul><p>This means that looking up human pathogenic microorganisms takes less time than looking up human <strong>non</strong>-pathogenic microorganisms.</p>
<p><strong>UNCERTAIN RESULTS</strong> <br />
@ -318,7 +317,7 @@ When using <code>allow_uncertain = TRUE</code> (which is the default setting), i
<li><p>It strips off values between brackets and the brackets themselves, and re-evaluates the input with all previous rules</p></li>
<li><p>It strips off words from the end one by one and re-evaluates the input with all previous rules</p></li>
<li><p>It strips off words from the start one by one and re-evaluates the input with all previous rules</p></li>
<li><p>It tries to look for some manual changes which are not yet published to the Catalogue of Life (like <em>Propionibacterium</em> not yet being <em>Cutibacterium</em>)</p></li>
<li><p>It tries to look for some manual changes which are not (yet) published to the Catalogue of Life (like <em>Propionibacterium</em> being <em>Cutibacterium</em>)</p></li>
</ul>
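<p>The stripping rules above can be sketched in base R as a function that lists the candidate strings to re-evaluate. This is a rough sketch, not the package's algorithm: <code>sketch_candidates()</code> is a hypothetical helper, and it only generates candidates without matching them against any taxonomic data.</p>

```r
# Hypothetical helper: generate the strings that would be re-evaluated
# after stripping brackets and then words from the end and from the start.
sketch_candidates <- function(x) {
  # drop text between brackets together with the brackets themselves
  no_brackets <- trimws(gsub("\\s*\\([^)]*\\)", "", x, perl = TRUE))
  words <- strsplit(no_brackets, "\\s+", perl = TRUE)[[1]]
  n <- length(words)
  from_end <- character(0)
  from_start <- character(0)
  if (n > 1) {
    # strip words from the end, one by one
    from_end <- vapply(seq(n - 1, 1),
                       function(i) paste(words[1:i], collapse = " "),
                       character(1))
    # strip words from the start, one by one
    from_start <- vapply(seq(2, n),
                         function(i) paste(words[i:n], collapse = " "),
                         character(1))
  }
  unique(c(no_brackets, from_end, from_start))
}

sketch_candidates("Streptococcus group B (known as S. agalactiae)")
sketch_candidates("Fluoroquinolone-resistant Neisseria gonorrhoeae")
```

<p>Each candidate would then be tried in order against the regular matching rules, and a match found this way is the kind of result that gets flagged as uncertain.</p>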
<p>Examples:</p><ul>
<li><p><code>"Streptococcus group B (known as S. agalactiae)"</code>. The text between brackets will be removed and a warning will be thrown that the result <em>Streptococcus group B</em> (<code>B_STRPT_GRB</code>) needs review.</p></li>
@ -326,7 +325,7 @@ When using <code>allow_uncertain = TRUE</code> (which is the default setting), i
<li><p><code>"Fluoroquinolone-resistant Neisseria gonorrhoeae"</code>. The first word will be stripped, after which the function will try to find a match. A warning will be thrown that the result <em>Neisseria gonorrhoeae</em> (<code>B_NESSR_GON</code>) needs review.</p></li>
</ul>
<p>Use <code>mo_failures()</code> to get a vector with all values that could not be coerced to a valid value.</p>
<p>Use <code>mo_uncertainties()</code> to get a vector with all values that were coerced to a valid value, but with uncertainty.</p>
<p>Use <code>mo_uncertainties()</code> to get info about all values that were coerced to a valid value, but with uncertainty.</p>
<p>Use <code>mo_renamed()</code> to get a vector with all values that could be coerced based on an old, previously accepted taxonomic name.</p>
<h2 class="hasAnchor" id="microbial-prevalence-of-pathogens-in-humans"><a class="anchor" href="#microbial-prevalence-of-pathogens-in-humans"></a>Microbial prevalence of pathogens in humans</h2>
@ -345,16 +344,16 @@ When using <code>allow_uncertain = TRUE</code> (which is the default setting), i
<p>[1] Becker K <em>et al.</em> <strong>Coagulase-Negative Staphylococci</strong>. 2014. Clin Microbiol Rev. 27(4): 870–926. <a href='https://dx.doi.org/10.1128/CMR.00109-13'>https://dx.doi.org/10.1128/CMR.00109-13</a></p>
<p>[2] Lancefield RC <strong>A serological differentiation of human and other groups of hemolytic streptococci</strong>. 1933. J Exp Med. 57(4): 571–95. <a href='https://dx.doi.org/10.1084/jem.57.4.571'>https://dx.doi.org/10.1084/jem.57.4.571</a></p>
<p>[3] Catalogue of Life: Annual Checklist (public online database), <a href='www.catalogueoflife.org'>www.catalogueoflife.org</a>.</p>
<p>[3] Catalogue of Life: Annual Checklist (public online taxonomic database), <a href='www.catalogueoflife.org'>www.catalogueoflife.org</a> (check included annual version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>).</p>
<h2 class="hasAnchor" id="catalogue-of-life"><a class="anchor" href="#catalogue-of-life"></a>Catalogue of Life</h2>
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>

View File

@ -246,10 +246,10 @@
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>

View File

@ -243,14 +243,18 @@
<pre class="usage"><span class='fu'>catalogue_of_life_version</span>()</pre>
<h2 class="hasAnchor" id="details"><a class="anchor" href="#details"></a>Details</h2>
<p>The list item <code>is_latest_annual_release</code> is based on the system date.</p>
<h2 class="hasAnchor" id="catalogue-of-life"><a class="anchor" href="#catalogue-of-life"></a>Catalogue of Life</h2>
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code>catalogue_of_life_version</code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code>catalogue_of_life_version()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>
@ -279,6 +283,8 @@ This package contains the complete taxonomic tree of almost all microorganisms f
<h2>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#details">Details</a></li>
<li><a href="#catalogue-of-life">Catalogue of Life</a></li>
<li><a href="#read-more-on-our-website-">Read more on our website!</a></li>

View File

@ -254,10 +254,10 @@
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>

View File

@ -278,10 +278,10 @@
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>

View File

@ -260,10 +260,10 @@
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>

View File

@ -334,10 +334,10 @@
<p><img src='figures/logo_col.png' height=60px style=margin-bottom:5px /> <br />
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a></code>.</p>
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (<a href='http://www.catalogueoflife.org'>http://www.catalogueoflife.org</a>). This data is updated annually - check the included version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>.</p>
<p>Included are:</p><ul>
<li><p>All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of <em>Aspergillus</em>, <em>Candida</em>, <em>Cryptococcus</em>, <em>Histoplasma</em>, <em>Pneumocystis</em>, <em>Saccharomyces</em> and <em>Trichophyton</em>).</p></li>
<li><p>All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed</p></li>
<li><p>The complete taxonomic tree of all included (sub)species: from kingdom to subspecies</p></li>
<li><p>The responsible author(s) and year of scientific publication</p></li>
@ -350,7 +350,7 @@ This package contains the complete taxonomic tree of almost all microorganisms f
<p>[1] Becker K <em>et al.</em> <strong>Coagulase-Negative Staphylococci</strong>. 2014. Clin Microbiol Rev. 27(4): 870–926. <a href='https://dx.doi.org/10.1128/CMR.00109-13'>https://dx.doi.org/10.1128/CMR.00109-13</a></p>
<p>[2] Lancefield RC <strong>A serological differentiation of human and other groups of hemolytic streptococci</strong>. 1933. J Exp Med. 57(4): 571–95. <a href='https://dx.doi.org/10.1084/jem.57.4.571'>https://dx.doi.org/10.1084/jem.57.4.571</a></p>
<p>[3] Catalogue of Life: Annual Checklist (public online database), <a href='www.catalogueoflife.org'>www.catalogueoflife.org</a>.</p>
<p>[3] Catalogue of Life: Annual Checklist (public online taxonomic database), <a href='www.catalogueoflife.org'>www.catalogueoflife.org</a> (check included annual version with <code><a href='catalogue_of_life_version.html'>catalogue_of_life_version</a>()</code>).</p>
<h2 class="hasAnchor" id="read-more-on-our-website-"><a class="anchor" href="#read-more-on-our-website-"></a>Read more on our website!</h2>

View File

@ -256,10 +256,10 @@
<h2 class="hasAnchor" id="details"><a class="anchor" href="#details"></a>Details</h2>
<p>The reference file can be a text file separated with commas (CSV) or pipes, an Excel file (old 'xls' format or new 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the <code>readxl</code> package installed.</p>
<p>The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the <code>readxl</code> package installed.</p>
<p><code>set_mo_source</code> will check the file for validity: it must be a <code>data.frame</code>, must have a column named <code>"mo"</code> which contains values from <code>microorganisms$mo</code> and must have a reference column with your own defined values. If all tests pass, <code>set_mo_source</code> will read the file into R and export it to <code>"~/.mo_source.rds"</code>. This compressed data file will then be used at default for MO determination (function <code><a href='as.mo.html'>as.mo</a></code> and consequently all <code>mo_*</code> functions like <code><a href='mo_property.html'>mo_genus</a></code> and <code><a href='mo_property.html'>mo_gramstain</a></code>). The location of the original file will be saved as option with <code><a href='https://www.rdocumentation.org/packages/base/topics/options'>options</a>(mo_source = path)</code>. Its timestamp will be saved with <code><a href='https://www.rdocumentation.org/packages/base/topics/options'>options</a>(mo_source_datetime = ...)</code>.</p>
<p><code>get_mo_source</code> will return the data set by reading <code>"~/.mo_source.rds"</code> with <code><a href='https://www.rdocumentation.org/packages/base/topics/readRDS'>readRDS</a></code>. If the original file has changed (the file defined with <code>path</code>), it will call <code>set_mo_source</code> to update the data file automatically.</p>
<p>An Excel file (<code>.xlsx</code>) with only one row has a size of 8-9 kB. The compressed file will have a size of 0.1 kB and can be read by <code>get_mo_source</code> in only a couple of microseconds (millionths of a second).</p>
<p>An Excel file (<code>.xlsx</code>) with only one row has a size of 8-9 kB. The compressed file used by this package will have a size of 0.1 kB and can be read by <code>get_mo_source</code> in only a couple of microseconds (millionths of a second).</p>
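The caching behaviour described above (read the source file once, store a compressed copy, refresh it when the original changes) can be sketched in base R. This is only an illustration under stated assumptions: `refresh_cache()`, its arguments and the CSV-only reader are hypothetical and are not the package's `set_mo_source()` or `get_mo_source()` code; the only check borrowed from the documentation is that the data must be a `data.frame` with a column named `"mo"`.

```r
# Hypothetical sketch of a timestamp-based cache; not the AMR package internals.
refresh_cache <- function(source_path, cache_path) {
  # The cache is stale if it does not exist yet or the source file is newer
  stale <- !file.exists(cache_path) ||
    file.mtime(source_path) > file.mtime(cache_path)
  if (stale) {
    # Assumption: the source is a comma-separated text file
    df <- utils::read.csv(source_path, stringsAsFactors = FALSE)
    # Validity check taken from the documentation: a data.frame with an "mo" column
    stopifnot(is.data.frame(df), "mo" %in% colnames(df))
    saveRDS(df, cache_path)
  }
  readRDS(cache_path)
}

# Usage with temporary files
src <- tempfile(fileext = ".csv")
cache <- tempfile(fileext = ".rds")
write.csv(data.frame(code = "lab_mo_ecoli", mo = "B_ESCHR_COL"),
          src, row.names = FALSE)
codes <- refresh_cache(src, cache)
```

Comparing modification times keeps the common case cheap: when nothing has changed, only the small `.rds` file is read.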
<h2 class="hasAnchor" id="read-more-on-our-website-"><a class="anchor" href="#read-more-on-our-website-"></a>Read more on our website!</h2>
@ -268,34 +268,33 @@
<h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>
<pre class="examples"># NOT RUN {
# imagine this Excel file (mo codes looked up in `microorganisms` data set):
# A B
# 1 our code mo
# 2 lab_mo_ecoli B_ESCHR_COL
# 3 lab_mo_kpneumoniae B_KLBSL_PNE
<pre class="examples"><span class='co'># NOT RUN {</span>
<span class='co'># imagine this Excel file (mo codes looked up in `microorganisms` data set):</span>
<span class='co'># A B</span>
<span class='co'># 1 our code mo</span>
<span class='co'># 2 lab_mo_ecoli B_ESCHR_COL</span>
<span class='co'># 3 lab_mo_kpneumoniae B_KLBSL_PNE</span>
# 1. We save it as 'home/me/ourcodes.xlsx'
<span class='co'># 1. We save it as 'home/me/ourcodes.xlsx'</span>
# 2. We use it for input:
set_mo_source("C:\path\ourcodes.xlsx")
#> Created mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.
<span class='co'># 2. We use it for input:</span>
<span class='fu'>set_mo_source</span>(<span class='st'>"home/me/ourcodes.xlsx"</span>)
<span class='co'>#&gt; Created mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.</span>
# 3. And use it in our functions:
as.mo("lab_mo_ecoli")
#> B_ESCHR_COL
<span class='co'># 3. And use it in our functions:</span>
<span class='fu'><a href='as.mo.html'>as.mo</a></span>(<span class='st'>"lab_mo_ecoli"</span>)
<span class='co'>#&gt; B_ESCHR_COL</span>
mo_genus("lab_mo_kpneumoniae")
#> "Klebsiella"
<span class='fu'><a href='mo_property.html'>mo_genus</a></span>(<span class='st'>"lab_mo_kpneumoniae"</span>)
<span class='co'>#&gt; "Klebsiella"</span>
# 4. It will look for changes itself:
# (add new row to the Excel file and save it)
<span class='co'># 4. It will look for changes itself:</span>
<span class='co'># (add new row to the Excel file and save it)</span>
mo_genus("lab_mo_kpneumoniae")
#> Updated mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.
#> "Klebsiella"
# }
</pre>
<span class='fu'><a href='mo_property.html'>mo_genus</a></span>(<span class='st'>"lab_mo_kpneumoniae"</span>)
<span class='co'>#&gt; Updated mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.</span>
<span class='co'>#&gt; "Klebsiella"</span>
<span class='co'># }</span></pre>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<h2>Contents</h2>

View File

@ -26,8 +26,8 @@ To split ages, the input can be:
\itemize{
\item{\code{"children"}, equivalent of: \code{c(0, 1, 2, 4, 6, 13, 18)}. This will split on 0, 1, 2-3, 4-5, 6-12, 13-17 and 18+.}
\item{\code{"elderly"} or \code{"seniors"}, equivalent of: \code{c(65, 75, 85, 95)}. This will split on 0-64, 65-74, 75-84, 85-94 and 95+.}
\item{\code{"fives"}, equivalent of: \code{1:20 * 5}. This will split on 0-4, 5-9, 10-14, 15-19 and so forth.}
\item{\code{"tens"}, equivalent of: \code{1:10 * 10}. This will split on 0-9, 10-19, 20-29 and so forth.}
\item{\code{"fives"}, equivalent of: \code{1:24 * 5}. This will split on 0-4, 5-9, 10-14, 15-19 and so forth, until 120.}
\item{\code{"tens"}, equivalent of: \code{1:12 * 10}. This will split on 0-9, 10-19, 20-29 and so forth, until 120.}
}
}
}
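The effect of the upper split point can be illustrated with base R. The sketch below is a minimal approximation using `cut()`; `split_ages()` is a hypothetical helper and not the package's `age_groups()` implementation, and the label format is assumed from the list above.

```r
# Hypothetical helper that mimics the documented grouping; not the package code.
split_ages <- function(x, split_at) {
  split_at <- sort(unique(as.integer(split_at)))
  # Always start the first group at age 0
  if (split_at[1] != 0) split_at <- c(0L, split_at)
  # Upper bound of each group; the last group is open-ended
  upper <- c(split_at[-1] - 1L, NA)
  labels <- ifelse(is.na(upper),
                   paste0(split_at, "+"),
                   ifelse(upper == split_at,
                          as.character(split_at),
                          paste0(split_at, "-", upper)))
  # Left-closed intervals: [0, 10), [10, 20), ..., [120, Inf)
  cut(x, breaks = c(split_at, Inf), labels = labels,
      right = FALSE, ordered_result = TRUE)
}

ages <- c(3, 8, 16, 54, 31, 76, 101, 43, 21)
tens_old <- split_ages(ages, 1:10 * 10)  # last group is "100+"
tens_new <- split_ages(ages, 1:12 * 10)  # groups continue until "120+"
```

With split points `1:10 * 10`, everyone aged 100 or older falls into a single `"100+"` group; with `1:12 * 10`, an age of 101 falls into `"100-109"` instead.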
@ -46,11 +46,11 @@ age_groups(ages, 50)
age_groups(ages, c(20, 50))
# split into groups of ten years
age_groups(ages, 1:10 * 10)
age_groups(ages, 1:12 * 10)
age_groups(ages, split_at = "tens")
# split into groups of five years
age_groups(ages, 1:20 * 5)
age_groups(ages, 1:24 * 5)
age_groups(ages, split_at = "fives")
# split specifically for children

View File

@ -39,7 +39,7 @@ mo_renamed()
Character (vector) with class \code{"mo"}. Unknown values will return \code{NA}.
}
\description{
Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms \emph{Bacteria}, \emph{Fungi} and \emph{Protozoa} (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
Use this function to determine a valid microorganism ID (\code{mo}). Determination is done using Artificial Intelligence (AI) and the complete taxonomic kingdoms Archaea, Bacteria, Protozoa, Viruses and most microbial species from the kingdom Fungi (see Source), so the input can be almost anything: a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), an abbreviation known in the field (like \code{"MRSA"}), or just a genus. You could also \code{\link{select}} a genus and species column, see Examples.
}
\details{
A microbial ID from this package (class: \code{mo}) typically looks like these examples:\cr
@ -72,7 +72,6 @@ A couple of effects because of these rules:
\itemize{
\item{\code{"E. coli"} will return the ID of \emph{Escherichia coli} and not \emph{Entamoeba coli}, although the latter would alphabetically come first}
\item{\code{"H. influenzae"} will return the ID of \emph{Haemophilus influenzae} and not \emph{Haematobacter influenzae} for the same reason}
\item{Something like \code{"p aer"} will return the ID of \emph{Pseudomonas aeruginosa} and not \emph{Pasteurella aerogenes}}
\item{Something like \code{"stau"} or \code{"S aur"} will return the ID of \emph{Staphylococcus aureus} and not \emph{Staphylococcus auricularis}}
}
This means that looking up human pathogenic microorganisms takes less time than looking up human \strong{non}-pathogenic microorganisms.
@ -84,7 +83,7 @@ When using \code{allow_uncertain = TRUE} (which is the default setting), it will
\item{It strips off values between brackets and the brackets themselves, and re-evaluates the input with all previous rules}
\item{It strips off words from the end one by one and re-evaluates the input with all previous rules}
\item{It strips off words from the start one by one and re-evaluates the input with all previous rules}
\item{It tries to look for some manual changes which are not yet published to the Catalogue of Life (like \emph{Propionibacterium} not yet being \emph{Cutibacterium})}
\item{It tries to look for some manual changes which are not (yet) published to the Catalogue of Life (like \emph{Propionibacterium} being \emph{Cutibacterium})}
}
Examples:
@ -96,7 +95,7 @@ Examples:
Use \code{mo_failures()} to get a vector with all values that could not be coerced to a valid value.
Use \code{mo_uncertainties()} to get a vector with all values that were coerced to a valid value, but with uncertainty.
Use \code{mo_uncertainties()} to get info about all values that were coerced to a valid value, but with uncertainty.
Use \code{mo_renamed()} to get a vector with all values that could be coerced based on an old, previously accepted taxonomic name.
}
@ -120,18 +119,18 @@ Group 2 probably contains all microbial pathogens ever found in humans.
[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571–95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
[3] Catalogue of Life: Annual Checklist (public online database), \url{www.catalogueoflife.org}.
[3] Catalogue of Life: Annual Checklist (public online taxonomic database), \url{www.catalogueoflife.org} (check included annual version with \code{\link{catalogue_of_life_version}()}).
}
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -9,12 +9,12 @@ This package contains the complete taxonomic tree of almost all microorganisms f
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -9,15 +9,18 @@ catalogue_of_life_version()
\description{
This function returns a list with info about the included data from the Catalogue of Life. It also shows if the included version is the latest annual release. The Catalogue of Life publishes its annual release in March each year.
}
\details{
The list item \code{is_latest_annual_release} is based on the system date.
}
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -41,12 +41,12 @@ Manually added were:
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -18,12 +18,12 @@ A data set containing commonly used codes for microorganisms, from laboratory sy
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -23,12 +23,12 @@ A data set containing old (previously valid or accepted) taxonomic names accordi
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}

View File

@ -102,12 +102,12 @@ Supported languages are \code{"en"} (English), \code{"de"} (German), \code{"nl"}
\section{Catalogue of Life}{
\if{html}{\figure{logo_col.png}{options: height=60px style=margin-bottom:5px} \cr}
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}}.
This package contains the complete taxonomic tree of almost all microorganisms from the authoritative and comprehensive Catalogue of Life (\url{http://www.catalogueoflife.org}). This data is updated annually - check the included version with \code{\link{catalogue_of_life_version}()}.
Included are:
\itemize{
\item{All ~55,000 (sub)species from the kingdoms of Archaea, Bacteria, Protozoa and Viruses}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. The kingdom of Fungi is a very large taxon with almost 300,000 different (sub)species, of which most are not microbial (but rather macroscopic, like mushrooms). Because of this, not all fungi fit the scope of this package and including everything would tremendously slow down our algorithms too. By only including the aforementioned taxonomic orders, the most relevant fungi are covered (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~3,500 (sub)species from these orders of the kingdom of Fungi: Eurotiales, Onygenales, Pneumocystales, Saccharomycetales, Schizosaccharomycetales and Tremellales. This covers the most relevant microbial fungi (like all species of \emph{Aspergillus}, \emph{Candida}, \emph{Cryptococcus}, \emph{Histoplasma}, \emph{Pneumocystis}, \emph{Saccharomyces} and \emph{Trichophyton}).}
\item{All ~15,000 previously accepted names of included (sub)species that have been taxonomically renamed}
\item{The complete taxonomic tree of all included (sub)species: from kingdom to subspecies}
\item{The responsible author(s) and year of scientific publication}
@ -124,7 +124,7 @@ The syntax used to transform the original data to a cleansed R format, can be fo
[2] Lancefield RC \strong{A serological differentiation of human and other groups of hemolytic streptococci}. 1933. J Exp Med. 57(4): 571-95. \url{https://dx.doi.org/10.1084/jem.57.4.571}
[3] Catalogue of Life: Annual Checklist (public online database), \url{www.catalogueoflife.org}.
[3] Catalogue of Life: Annual Checklist (public online taxonomic database), \url{www.catalogueoflife.org} (check included annual version with \code{\link{catalogue_of_life_version}()}).
}
\section{Read more on our website!}{


@ -17,13 +17,13 @@ get_mo_source()
These functions can be used to predefine your own reference to be used in \code{\link{as.mo}} and consequently all \code{mo_*} functions like \code{\link{mo_genus}} and \code{\link{mo_gramstain}}.
}
\details{
The reference file can be a text file separated with commas (CSV) or pipes, an Excel file (old 'xls' format or new 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the \code{readxl} package installed.
The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an R object file (extension '.rds'). To use an Excel file, you need to have the \code{readxl} package installed.
\code{set_mo_source} will check the file for validity: it must be a \code{data.frame}, must have a column named \code{"mo"} which contains values from \code{microorganisms$mo} and must have a reference column with your own defined values. If all tests pass, \code{set_mo_source} will read the file into R and export it to \code{"~/.mo_source.rds"}. This compressed data file will then be used by default for MO determination (function \code{\link{as.mo}} and consequently all \code{mo_*} functions like \code{\link{mo_genus}} and \code{\link{mo_gramstain}}). The location of the original file will be saved as an option with \code{\link{options}(mo_source = path)}. Its timestamp will be saved with \code{\link{options}(mo_source_datetime = ...)}.
\code{get_mo_source} will return the data set by reading \code{"~/.mo_source.rds"} with \code{\link{readRDS}}. If the original file has changed (the file defined with \code{path}), it will call \code{set_mo_source} to update the data file automatically.
An Excel file (\code{.xlsx}) with only one row has a size of 8-9 kB. The compressed file will have a size of 0.1 kB and can be read by \code{get_mo_source} in only a couple of microseconds (millionths of a second).
An Excel file (\code{.xlsx}) with only one row has a size of 8-9 kB. The compressed file used by this package will have a size of 0.1 kB and can be read by \code{get_mo_source} in only a couple of microseconds (millionths of a second).
}
\section{Read more on our website!}{
@ -42,7 +42,7 @@ On our website \url{https://msberends.gitlab.io/AMR} you can find \href{https://
# 1. We save it as 'home/me/ourcodes.xlsx'
# 2. We use it for input:
set_mo_source("C:\\path\\ourcodes.xlsx")
set_mo_source("home/me/ourcodes.xlsx")
#> Created mo_source file '~/.mo_source.rds' from 'home/me/ourcodes.xlsx'.
# 3. And use it in our functions:


@ -58,9 +58,16 @@ test_that("data sets are valid", {
test_that("creation of data sets is valid", {
df <- make()
expect_lt(nrow(df[which(df$prevalence == 1), ]), nrow(df[which(df$prevalence == 2), ]))
expect_lt(nrow(df[which(df$prevalence == 2), ]), nrow(df[which(df$prevalence == 3), ]))
DT <- make_DT()
expect_lt(nrow(DT[prevalence == 1]), nrow(DT[prevalence == 2]))
expect_lt(nrow(DT[prevalence == 2]), nrow(DT[prevalence == 3]))
old <- make_trans_tbl()
expect_gt(length(old), 0)
})
test_that("CoL version info works", {
expect_equal(class(catalogue_of_life_version()), "list")
})


@ -30,4 +30,17 @@ test_that("deprecated functions work", {
expect_identical(suppressWarnings(ratio(c(772, 1611, 737), ratio = "1:2:1")), c(780, 1560, 780))
expect_identical(suppressWarnings(ratio(c(1752, 1895), ratio = c(1, 1))), c(1823.5, 1823.5))
expect_warning(guess_mo("esco"))
expect_warning(guess_atc("amox"))
expect_warning(ab_property("amox"))
expect_warning(ab_atc("amox"))
expect_warning(ab_official("amox"))
expect_warning(ab_name("amox"))
expect_warning(ab_trivial_nl("amox"))
expect_warning(ab_certe("amox"))
expect_warning(ab_umcg("amox"))
expect_warning(ab_tradenames("amox"))
expect_warning(atc_ddd("amox"))
expect_warning(atc_groups("amox"))
})


@ -52,7 +52,7 @@ S.aureus <- microbenchmark(as.mo("sau"),
print(S.aureus, unit = "ms", signif = 3)
```
In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 10 milliseconds means it can determine 100 input values per second. In case of 50 milliseconds, this is only 20 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first is a WHONET code) or common laboratory codes, or common full organism names like the last one.
In the table above, all measurements are shown in milliseconds (thousandths of a second). A value of 5 milliseconds means it can determine 200 input values per second. In case of 100 milliseconds, this is only 10 input values per second. The second input is the only one that has to be looked up thoroughly. All the others are known codes (the first one is a WHONET code) or common laboratory codes, or common full organism names like the last one. Full organism names are always preferred.
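
The conversion from a timing in milliseconds to input values per second can be sketched as below. This is only an illustration: the helper `values_per_second()` is a hypothetical name and not part of the AMR package, and it assumes the timing holds for every single input value.

```r
# Hypothetical helper (not part of the AMR package):
# convert a timing per input value in milliseconds
# into the number of input values handled per second
values_per_second <- function(ms) {
  # one second is 1000 milliseconds
  1000 / ms
}

# timings of 5 ms and 100 ms per input value
values_per_second(c(5, 100))
```

For timings of 5 and 100 milliseconds this gives 200 and 10 input values per second, the figures mentioned above.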
To achieve this speed, the `as.mo` function also takes into account the prevalence of human pathogenic microorganisms. The downside is of course that less prevalent microorganisms will be determined less fast. See this example for the ID of *Thermus islandicus* (`B_THERMS_ISL`), a bug probably never found before in humans: