diff --git a/DESCRIPTION b/DESCRIPTION index f73897f1..eca956d2 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 0.6.1.9052 -Date: 2019-06-01 +Version: 0.6.1.9053 +Date: 2019-06-02 Title: Antimicrobial Resistance Analysis Authors@R: c( person( diff --git a/NEWS.md b/NEWS.md index 2d98faf2..42c539dd 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,5 +1,4 @@ -# AMR 0.6.1.9052 -**Note: latest development version** +# AMR 0.6.1.9053 #### New * Support for translation of disk diffusion and MIC values to RSI values (i.e. antimicrobial interpretations). Supported guidelines are EUCAST (2011 to 2019) and CLSI (2011 to 2019). Use `as.rsi()` on an MIC value (created with `as.mic()`), a disk diffusion value (created with the new `as.disk()`) or on a complete date set containing columns with MIC or disk diffusion values. @@ -30,7 +29,8 @@ * Removed deprecated functions `guess_mo()`, `guess_atc()`, `EUCAST_rules()`, `interpretive_reading()`, `rsi()` * Frequency tables (`freq()`): * speed improvement for microbial IDs - * fixed level names in markdown + * fixed factor level names for R Markdown + * when all values are unique it now shows a message instead of a warning * support for boxplots: ```r septic_patients %>% @@ -51,6 +51,7 @@ * Fix for `first_isolate()` for when dates are missing * Improved speed of `guess_ab_col()` * Function `as.mo()` now gently interprets any number of whitespace characters (like tabs) as one space +* Function `as.mo()` now returns `UNKNOWN` for `"con"` (WHONET ID of 'contamination') and returns `NA` for `"xxx"`(WHONET ID of 'no growth') * Small algorithm fix for `as.mo()` * Removed viruses from data set `microorganisms.codes` and cleaned it up * Fix for `mo_shortname()` where species would not be determined correctly diff --git a/R/amr.R b/R/amr.R index ac76faa4..b04d3b40 100644 --- a/R/amr.R +++ b/R/amr.R @@ -45,7 +45,7 @@ #' @section Authors: #' Matthijs S. Berends[1,2] Christian F. Luz[1], Erwin E.A. 
Hassing[2], Corinna Glasner[1], Alex W. Friedrich[1], Bhanu N.M. Sinha[1] \cr #' -#' [1] Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands - \url{rug.nl} \url{umcg.nl} \cr +#' [1] Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands - \url{https://www.rug.nl} \url{https://www.umcg.nl} \cr #' [2] Certe Medical Diagnostics & Advice, Groningen, the Netherlands - \url{certe.nl} #' @section Read more on our website!: diff --git a/R/catalogue_of_life.R b/R/catalogue_of_life.R index e37da8f8..abd513d6 100755 --- a/R/catalogue_of_life.R +++ b/R/catalogue_of_life.R @@ -40,7 +40,7 @@ #' #' The Catalogue of Life (\url{http://www.catalogueoflife.org}) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.6 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation. #' -#' The syntax used to transform the original data to a cleansed R format, can be found here: \url{https://gitlab.com/msberends/AMR/blob/master/reproduction_of_microorganisms.R}. +#' The syntax used to transform the original data to a cleansed R format, can be found here: \url{https://gitlab.com/msberends/AMR/blob/master/data-raw/reproduction_of_microorganisms.R}. #' @inheritSection AMR Read more on our website! 
#' @name catalogue_of_life #' @rdname catalogue_of_life diff --git a/R/data.R b/R/data.R index 2264ad45..c14da189 100755 --- a/R/data.R +++ b/R/data.R @@ -59,16 +59,9 @@ #' \describe{ #' \item{\code{mo}}{ID of microorganism as used by this package} #' \item{\code{col_id}}{Catalogue of Life ID} -#' \item{\code{fullname}}{Full name, like \code{"Echerichia coli"}} -#' \item{\code{kingdom}}{Taxonomic kingdom of the microorganism} -#' \item{\code{phylum}}{Taxonomic phylum of the microorganism} -#' \item{\code{class}}{Taxonomic class of the microorganism} -#' \item{\code{order}}{Taxonomic order of the microorganism} -#' \item{\code{family}}{Taxonomic family of the microorganism} -#' \item{\code{genus}}{Taxonomic genus of the microorganism} -#' \item{\code{species}}{Taxonomic species of the microorganism} -#' \item{\code{subspecies}}{Taxonomic subspecies of the microorganism} -#' \item{\code{rank}}{Taxonomic rank of the microorganism, like \code{"species"} or \code{"genus"}} +#' \item{\code{fullname}}{Full name, like \code{"Escherichia coli"}} +#' \item{\code{kingdom}, \code{phylum}, \code{class}, \code{order}, \code{family}, \code{genus}, \code{species}, \code{subspecies}}{Taxonomic rank of the microorganism} +#' \item{\code{rank}}{Text of the taxonomic rank of the microorganism, like \code{"species"} or \code{"genus"}} #' \item{\code{ref}}{Author(s) and year of concerning scientific publication} #' \item{\code{species_id}}{ID of the species as used by the Catalogue of Life} #' \item{\code{source}}{Either \code{"CoL"}, \code{"DSMZ"} (see source) or "manually added"} @@ -119,7 +112,7 @@ catalogue_of_life <- list( #' Translation table for microorganism codes #' #' A data set containing commonly used codes for microorganisms, from laboratory systems and WHONET. Define your own with \code{\link{set_mo_source}}. 
-#' @format A \code{\link{data.frame}} with 5,171 observations and 2 variables: +#' @format A \code{\link{data.frame}} with 4,969 observations and 2 variables: #' \describe{ #' \item{\code{certe}}{Commonly used code of a microorganism} #' \item{\code{mo}}{ID of the microorganism in the \code{\link{microorganisms}} data set} diff --git a/R/freq.R b/R/freq.R index a3e4aec5..07faa5bd 100755 --- a/R/freq.R +++ b/R/freq.R @@ -342,7 +342,7 @@ freq <- function(x, # mult.columns <- 2 } else { x.name <- deparse(substitute(x)) - if (x.name %like% "[$]") { + if (all(x.name %like% "[$]") & length(x.name) == 1) { cols <- unlist(strsplit(x.name, "$", fixed = TRUE))[2] x.name <- unlist(strsplit(x.name, "$", fixed = TRUE))[1] # try to find the object to determine dimensions @@ -710,7 +710,8 @@ format_header <- function(x, markdown = FALSE, decimal.mark = ".", big.mark = ", }) # numeric values - if (has_length == TRUE & any(x_class %in% c("double", "integer", "numeric", "raw", "single"))) { + if (has_length == TRUE & !is.null(header$sd)) { + # any(x_class %in% c("double", "integer", "numeric", "raw", "single"))) { header$sd <- paste0(header$sd, " (CV: ", header$cv, ", MAD: ", header$mad, ")") header$fivenum <- paste0(paste(trimws(header$fivenum), collapse = " | "), " (IQR: ", header$IQR, ", CQV: ", header$cqv, ")") header$outliers_total <- paste0(header$outliers_total, " (unique count: ", header$outliers_unique, ")") @@ -1018,9 +1019,11 @@ print.freq <- function(x, } else { opt$column_names <- opt$column_names[!opt$column_names == "Item"] } + + all_unique <- FALSE if ("count" %in% colnames(x)) { if (all(x$count == 1)) { - warning("All observations are unique.", call. 
= FALSE) + all_unique <- TRUE } x$count <- format(x$count, decimal.mark = opt$decimal.mark, big.mark = opt$big.mark) } else { @@ -1072,6 +1075,10 @@ print.freq <- function(x, cat("\n") } + if (all_unique == TRUE) { + message("NOTE: All observations are unique.") + } + # reset old kable setting options(knitr.kable.NA = opt.old) return(invisible()) diff --git a/R/mo.R b/R/mo.R index b5ff422b..ff249403 100755 --- a/R/mo.R +++ b/R/mo.R @@ -195,10 +195,11 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE, allow_uncertain = TRUE, # check onLoad() in R/zzz.R: data tables are created there. } - x[x == ""] <- NA_character_ + # WHONET: xxx = no growth + x[tolower(as.character(paste0(x, ""))) %in% c("", "xxx", "na", "nan")] <- NA_character_ uncertainty_level <- translate_allow_uncertain(allow_uncertain) - mo_hist <- get_mo_history(x, uncertainty_level, force = isTRUE(list(...)$force_mo_history)) + # mo_hist <- get_mo_history(x, uncertainty_level, force = isTRUE(list(...)$force_mo_history)) if (mo_source_isvalid(reference_df) & isFALSE(Becker) @@ -231,11 +232,11 @@ as.mo <- function(x, Becker = FALSE, Lancefield = FALSE, allow_uncertain = TRUE, & isFALSE(Lancefield)) { y <- x - } else if (!any(is.na(mo_hist)) - & isFALSE(Becker) - & isFALSE(Lancefield)) { - # check previously found results - y <- mo_hist + # } else if (!any(is.na(mo_hist)) + # & isFALSE(Becker) + # & isFALSE(Lancefield)) { + # # check previously found results + # y <- mo_hist } else if (all(tolower(x) %in% microorganismsDT$fullname_lower) & isFALSE(Becker) @@ -299,7 +300,8 @@ exec_as.mo <- function(x, # check onLoad() in R/zzz.R: data tables are created there. 
} - x[x == ""] <- NA_character_ + # WHONET: xxx = no growth + x[tolower(as.character(paste0(x, ""))) %in% c("", "xxx", "na", "nan")] <- NA_character_ if (initial_search == TRUE) { options(mo_failures = NULL) @@ -340,12 +342,11 @@ exec_as.mo <- function(x, # only check the uniques, which is way faster x <- unique(x) # remove empty values (to later fill them in again with NAs) - # ("xxx" is WHONET code for 'no growth' and "con" is WHONET code for 'contamination') + # ("xxx" is WHONET code for 'no growth') x <- x[!is.na(x) & !is.null(x) & !identical(x, "") - & !identical(x, "xxx") - & !identical(x, "con")] + & !identical(x, "xxx")] # conversion of old MO codes from v0.5.0 (ITIS) to later versions (Catalogue of Life) if (any(x %like% "^[BFP]_[A-Z]{3,7}") & !all(x %in% microorganisms$mo)) { @@ -560,7 +561,8 @@ exec_as.mo <- function(x, next } - if (any(tolower(x_backup_without_spp[i]) %in% c(NA, "", "xxx", "con", "na", "nan"))) { + # WHONET: xxx = no growth + if (tolower(as.character(paste0(x_backup_without_spp[i], ""))) %in% c("", "xxx", "na", "nan")) { x[i] <- NA_character_ next } @@ -1273,8 +1275,7 @@ exec_as.mo <- function(x, x_input_unique_nonempty <- unique(x_input[!is.na(x_input) & !is.null(x_input) & !identical(x_input, "") - & !identical(x_input, "xxx") - & !identical(x_input, "con")]) + & !identical(x_input, "xxx")]) # left join the found results to the original input values (x_input) df_found <- data.frame(input = as.character(x_input_unique_nonempty), diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index 7727a996..c58cc60e 100644 --- a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -78,7 +78,7 @@
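The `as.mo()` and `exec_as.mo()` hunks above replace the plain `x[x == ""]` check with a lower-cased lookup. A minimal base R sketch of that lookup follows. The helper name `normalise_empty_input` and the sample vector are hypothetical and not part of the package. The sketch leaves `"con"` untouched, because this diff does not show where `"con"` is mapped to `UNKNOWN`.

```r
# Hypothetical helper (not part of the AMR package): blank out inputs that
# carry no organism information, mirroring the one-liner in the diff above.
normalise_empty_input <- function(x) {
  # paste0(x, "") turns a missing value into the string "NA", so NA, "NA"
  # and "na" are all caught by the same lower-case lookup below
  key <- tolower(as.character(paste0(x, "")))
  # "xxx" is the WHONET code for 'no growth'
  x[key %in% c("", "xxx", "na", "nan")] <- NA_character_
  x
}

# Made-up input values for illustration
input <- c("E. coli", "", "xxx", "XXX", NA, "NaN", "con")
cleaned <- normalise_empty_input(input)
print(cleaned)
```

Routing the values through `paste0()` and `tolower()` makes the check case-insensitive and safe for missing values, which a bare `x == ""` comparison is not.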
diff --git a/docs/articles/AMR.html b/docs/articles/AMR.html index eeab616b..4256a340 100644 --- a/docs/articles/AMR.html +++ b/docs/articles/AMR.html @@ -40,7 +40,7 @@ @@ -199,7 +199,7 @@AMR.Rmd
-Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 01 June 2019.
+Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 02 June 2019.
As with many uses in R, we need some additional packages for AMR analysis. Our package works closely together with the tidyverse packages dplyr
and ggplot2
by Dr Hadley Wickham. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R.
Our AMR
package depends on these packages and even extends their use and functions.
Now, let’s start the cleaning and the analysis!
@@ -418,8 +418,8 @@ # # Item Count Percent Cum. Count Cum. Percent # --- ----- ------- -------- ----------- ------------- -# 1 M 10,472 52.4% 10,472 52.4% -# 2 F 9,528 47.6% 20,000 100.0% +# 1 M 10,233 51.2% 10,233 51.2% +# 2 F 9,767 48.8% 20,000 100.0%So, we can draw at least two conclusions immediately. From a data scientist perspective, the data looks clean: only values M
and F
. From a researcher perspective: there are slightly more men. Nothing we didn’t already know.
The data is already quite clean, but we still need to transform some variables. The bacteria
column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The mutate()
function of the dplyr
package makes this really easy:
data <- data %>%
@@ -449,14 +449,14 @@
# Pasteurella multocida (no new changes)
# Staphylococcus (no new changes)
# Streptococcus groups A, B, C, G (no new changes)
-# Streptococcus pneumoniae (1519 new changes)
+# Streptococcus pneumoniae (1509 new changes)
# Viridans group streptococci (no new changes)
#
# EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)
-# Table 01: Intrinsic resistance in Enterobacteriaceae (1280 new changes)
+# Table 01: Intrinsic resistance in Enterobacteriaceae (1276 new changes)
# Table 02: Intrinsic resistance in non-fermentative Gram-negative bacteria (no new changes)
# Table 03: Intrinsic resistance in other Gram-negative bacteria (no new changes)
-# Table 04: Intrinsic resistance in Gram-positive bacteria (2810 new changes)
+# Table 04: Intrinsic resistance in Gram-positive bacteria (2758 new changes)
# Table 08: Interpretive rules for B-lactam agents and Gram-positive cocci (no new changes)
# Table 09: Interpretive rules for B-lactam agents and Gram-negative rods (no new changes)
# Table 11: Interpretive rules for macrolides, lincosamides, and streptogramins (no new changes)
@@ -464,24 +464,24 @@
# Table 13: Interpretive rules for quinolones (no new changes)
#
# Other rules
-# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2266 new changes)
-# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (95 new changes)
+# Non-EUCAST: amoxicillin/clav acid = S where ampicillin = S (2254 new changes)
+# Non-EUCAST: ampicillin = R where amoxicillin/clav acid = R (138 new changes)
# Non-EUCAST: piperacillin = R where piperacillin/tazobactam = R (no new changes)
# Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no new changes)
# Non-EUCAST: trimethoprim = R where trimethoprim/sulfa = R (no new changes)
# Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no new changes)
#
# --------------------------------------------------------------------------
-# EUCAST rules affected 6,560 out of 20,000 rows, making a total of 7,970 edits
+# EUCAST rules affected 6,564 out of 20,000 rows, making a total of 7,935 edits
# => added 0 test results
#
-# => changed 7,970 test results
-# - 118 test results changed from S to I
-# - 4,807 test results changed from S to R
-# - 1,119 test results changed from I to S
-# - 318 test results changed from I to R
-# - 1,590 test results changed from R to S
-# - 18 test results changed from R to I
+# => changed 7,935 test results
+# - 109 test results changed from S to I
+# - 4,768 test results changed from S to R
+# - 1,081 test results changed from I to S
+# - 345 test results changed from I to R
+# - 1,615 test results changed from R to S
+# - 17 test results changed from R to I
# --------------------------------------------------------------------------
#
# Use verbose = TRUE to get a data.frame with all specified edits instead.
-So only 28.5% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:
+So only 28.3% is suitable for resistance analysis! We can now filter on it with the filter() function, also from the dplyr package:
For future use, the above two syntaxes can be shortened with the filter_first_isolate()
function:
-Only 1 isolates are marked as ‘first’ according to CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The key_antibiotics() function adds a vector with 18 key antibiotics: 6 broad spectrum ones, 6 small spectrum for Gram negatives and 6 small spectrum for Gram positives. These can be defined by the user.
+Only 2 isolates are marked as ‘first’ according to CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The key_antibiotics() function adds a vector with 18 key antibiotics: 6 broad spectrum ones, 6 small spectrum for Gram negatives and 6 small spectrum for Gram positives. These can be defined by the user.
If a column exists with a name like ‘key(…)ab’ the first_isolate()
function will automatically use it and determine the first weighted isolates. Mind the NOTEs in below output:
data <- data %>%
mutate(keyab = key_antibiotics(.)) %>%
@@ -657,7 +657,7 @@
# NOTE: Using column `patient_id` as input for `col_patient_id`.
# NOTE: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.
# [Criterion] Inclusion based on key antibiotics, ignoring I.
-# => Found 15,183 first weighted isolates (75.9% of total)
isolate | @@ -674,8 +674,8 @@|||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | -2010-01-23 | -H6 | +2010-01-07 | +K9 | B_ESCHR_COL | S | S | @@ -686,8 +686,8 @@||||||
2 | -2010-03-03 | -H6 | +2010-02-14 | +K9 | B_ESCHR_COL | R | S | @@ -698,10 +698,10 @@||||||
3 | -2010-03-26 | -H6 | +2010-02-18 | +K9 | B_ESCHR_COL | -S | +R | S | R | S | @@ -710,20 +710,20 @@|||
4 | -2010-04-21 | -H6 | +2010-06-01 | +K9 | B_ESCHR_COL | +S | +S | R | S | -S | -S | FALSE | TRUE |
5 | -2010-07-15 | -H6 | +2010-08-14 | +K9 | B_ESCHR_COL | S | S | @@ -734,20 +734,8 @@||||||
6 | -2010-08-13 | -H6 | -B_ESCHR_COL | -S | -S | -S | -S | -FALSE | -FALSE | -||||
7 | -2010-09-29 | -H6 | +2010-11-08 | +K9 | B_ESCHR_COL | S | S | @@ -756,37 +744,49 @@FALSE | TRUE | ||||
7 | +2010-11-16 | +K9 | +B_ESCHR_COL | +S | +S | +S | +S | +FALSE | +TRUE | +||||
8 | -2010-10-10 | -H6 | +2010-12-11 | +K9 | B_ESCHR_COL | -R | +S | S | S | S | FALSE | -TRUE | +FALSE |
9 | -2010-12-09 | -H6 | +2011-02-27 | +K9 | B_ESCHR_COL | R | -R | S | S | -FALSE | +S | +TRUE | TRUE |
10 | -2010-12-11 | -H6 | +2011-03-06 | +K9 | B_ESCHR_COL | R | -S | +R | S | S | FALSE | @@ -794,11 +794,11 @@
-Instead of 1, now 9 isolates are flagged. In total, 75.9% of all isolates are marked ‘first weighted’ - 47.4% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
+Instead of 2, now 9 isolates are flagged. In total, 75.1% of all isolates are marked ‘first weighted’ - 46.8% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
As with filter_first_isolate()
, there’s a shortcut for this new algorithm too:
-So we end up with 15,183 isolates for analysis.
+So we end up with 15,014 isolates for analysis.
We can remove unneeded columns:
@@ -824,28 +824,44 @@Or can be used like the dplyr
way, which is easier readable:
-Frequency table of genus and species from a data.frame (15,183 x 13)
+Frequency table of genus and species from a data.frame (15,014 x 13)
Columns: 2
-Length: 15,183 (of which NA: 0 = 0.00%)
+Length: 15,014 (of which NA: 0 = 0.00%)
Unique: 4
Shortest: 16
Longest: 24
The functions portion_S(), portion_SI(), portion_I(), portion_IR() and portion_R() can be used to determine the portion of a specific antimicrobial outcome. As per the EUCAST guideline of 2019, we calculate resistance as the portion of R (portion_R()) and susceptibility as the portion of S and I (portion_SI()). These functions can be used on their own:
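As a rough illustration of the arithmetic described here, the sketch below computes both portions in base R. The helper `portion_of` and the sample vector `amx` are hypothetical. Dropping missing results from the denominator is an assumption, and the real `portion_R()` and `portion_SI()` accept more arguments and may apply additional checks.

```r
# Hypothetical stand-in for portion_R() and portion_SI(); not the AMR
# implementation, only the underlying proportion.
portion_of <- function(x, outcomes) {
  # assumption: untested (NA) isolates are left out of the denominator
  x <- x[!is.na(x)]
  if (length(x) == 0) {
    return(NA_real_)
  }
  sum(x %in% outcomes) / length(x)
}

# Made-up interpretations for a single antibiotic
amx <- c("S", "S", "R", "I", "R", "S", NA, "R")

resistance     <- portion_of(amx, "R")          # portion of R
susceptibility <- portion_of(amx, c("S", "I"))  # portion of S and I
print(c(resistance = resistance, susceptibility = susceptibility))
```

Under this definition the two values sum to 1 for any input with at least one tested isolate, because every non-missing result is either R or one of S and I.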
Or can be used in conjunction with group_by() and summarise(), both from the dplyr package:
data_1st %>%
group_by(hospital) %>%
@@ -1004,19 +1004,19 @@ Longest: 24
Hospital A
-0.4789879
+0.4625681
Hospital B
-0.4752328
+0.4799232
Hospital C
-0.4525453
+0.4727660
Hospital D
-0.4688136
+0.4660767
@@ -1034,23 +1034,23 @@ Longest: 24
Hospital A
-0.4789879
-4545
+0.4625681
+4408
Hospital B
-0.4752328
-5370
+0.4799232
+5205
Hospital C
-0.4525453
-2318
+0.4727660
+2350
Hospital D
-0.4688136
-2950
+0.4660767
+3051
@@ -1070,27 +1070,27 @@ Longest: 24
Escherichia
-0.9224649
-0.8938781
-0.9920442
+0.9252935
+0.8992796
+0.9945304
Klebsiella
-0.8119122
-0.9072100
-0.9874608
+0.8150773
+0.9059278
+0.9864691
Staphylococcus
-0.9185401
-0.9098122
-0.9920656
+0.9216489
+0.9211029
+0.9923560
Streptococcus
-0.6056043
+0.6026921
0.0000000
-0.6056043
+0.6026921
diff --git a/docs/articles/AMR_files/figure-html/plot 1-1.png b/docs/articles/AMR_files/figure-html/plot 1-1.png
index c5e55744..a8e8cac5 100644
Binary files a/docs/articles/AMR_files/figure-html/plot 1-1.png and b/docs/articles/AMR_files/figure-html/plot 1-1.png differ
diff --git a/docs/articles/AMR_files/figure-html/plot 3-1.png b/docs/articles/AMR_files/figure-html/plot 3-1.png
index 7884bd96..8bf7da44 100644
Binary files a/docs/articles/AMR_files/figure-html/plot 3-1.png and b/docs/articles/AMR_files/figure-html/plot 3-1.png differ
diff --git a/docs/articles/AMR_files/figure-html/plot 4-1.png b/docs/articles/AMR_files/figure-html/plot 4-1.png
index c520d721..22270b73 100644
Binary files a/docs/articles/AMR_files/figure-html/plot 4-1.png and b/docs/articles/AMR_files/figure-html/plot 4-1.png differ
diff --git a/docs/articles/AMR_files/figure-html/plot 5-1.png b/docs/articles/AMR_files/figure-html/plot 5-1.png
index 49178b79..396a17b1 100644
Binary files a/docs/articles/AMR_files/figure-html/plot 5-1.png and b/docs/articles/AMR_files/figure-html/plot 5-1.png differ
diff --git a/docs/articles/index.html b/docs/articles/index.html
index e9fdf32e..0930bf8b 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -78,7 +78,7 @@
resistance_predict.Rmd
As with many uses in R, we need some additional packages for AMR analysis. Our package works closely together with the tidyverse packages dplyr
and ggplot2
by Dr Hadley Wickham. The tidyverse tremendously improves the way we conduct data science - it allows for a very natural way of writing syntaxes and creating beautiful plots in R.
Our AMR
package depends on these packages and even extends their use and functions.
library(dplyr)
library(ggplot2)
diff --git a/docs/authors.html b/docs/authors.html
index 84f55c48..673a0e9e 100644
--- a/docs/authors.html
+++ b/docs/authors.html
@@ -78,7 +78,7 @@
Note: latest development version
support for boxplots:
septic_patients %>%
@@ -319,6 +319,7 @@ Please guess_ab_col()
as.mo()
now gently interprets any number of whitespace characters (like tabs) as one spaceas.mo()
now returns UNKNOWN
for "con"
(WHONET ID of ‘contamination’) and returns NA
for "xxx"
(WHONET ID of ‘no growth’)as.mo()
microorganisms.codes
and cleaned it upas.mo(..., allow_uncertain = 3)
Contents
Matthijs S. Berends[1,2] Christian F. Luz[1], Erwin E.A. Hassing[2], Corinna Glasner[1], Alex W. Friedrich[1], Bhanu N.M. Sinha[1]
-[1] Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands - rug.nl umcg.nl
+[1] Department of Medical Microbiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands - https://www.rug.nl https://www.umcg.nl
[2] Certe Medical Diagnostics & Advice, Groningen, the Netherlands - certe.nl
The responsible author(s) and year of scientific publication
The Catalogue of Life (http://www.catalogueoflife.org) is the most comprehensive and authoritative global index of species currently available. It holds essential information on the names, relationships and distributions of over 1.6 million species. The Catalogue of Life is used to support the major biodiversity and conservation information services such as the Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EoL) and the International Union for Conservation of Nature Red List. It is recognised by the Convention on Biological Diversity as a significant component of the Global Taxonomy Initiative and a contribution to Target 1 of the Global Strategy for Plant Conservation.
-The syntax used to transform the original data to a cleansed R format, can be found here: https://gitlab.com/msberends/AMR/blob/master/reproduction_of_microorganisms.R.
+The syntax used to transform the original data to a cleansed R format, can be found here: https://gitlab.com/msberends/AMR/blob/master/data-raw/reproduction_of_microorganisms.R.
-A data.frame with 5,171 observations and 2 variables:
+A data.frame with 4,969 observations and 2 variables:
certe
Commonly used code of a microorganism
mo
ID of the microorganism in the microorganisms
data set
A data.frame
with 67,903 observations and 16 variables:
mo
ID of microorganism as used by this package
col_id
Catalogue of Life ID
fullname
Full name, like "Echerichia coli"
kingdom
Taxonomic kingdom of the microorganism
phylum
Taxonomic phylum of the microorganism
class
Taxonomic class of the microorganism
order
Taxonomic order of the microorganism
family
Taxonomic family of the microorganism
genus
Taxonomic genus of the microorganism
species
Taxonomic species of the microorganism
subspecies
Taxonomic subspecies of the microorganism
rank
Taxonomic rank of the microorganism, like "species"
or "genus"
ref
Author(s) and year of concerning scientific publication
species_id
ID of the species as used by the Catalogue of Life
source
Either "CoL"
, "DSMZ"
(see source) or "manually added"
prevalence
Prevalence of the microorganism, see ?as.mo
An object of class data.frame
with 67903 rows and 16 columns.