diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 7cf67639..9593a551 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -20,7 +20,7 @@
 # ==================================================================== #
 # to check with R-Hub:
-# rhub::check_for_cran(devtools::build(args = c('--no-build-vignettes')))
+# chck <- rhub::check_for_cran(devtools::build(args = c('--no-build-vignettes')))
 stages:
   - build
@@ -124,18 +124,10 @@ coverage:
 pages:
   stage: deploy
   when: always
-  #cache:
-  #  key: "$CI_COMMIT_REF_SLUG"
-  #  paths:
-  #    - installed_deps/
   only:
     - master
   script:
     - mv docs public
-    # install missing and outdated packages
-    #- Rscript -e 'source(".gitlab-ci.R"); gl_update_pkg_all(repos = "https://cran.rstudio.com", quiet = TRUE, install_pkgdown = TRUE)'
-    #- Rscript -e "devtools::install(build = TRUE, upgrade = FALSE)"
-    #- R -e "pkgdown::build_site(examples = FALSE, lazy = TRUE, override = list(destination = 'public'))"
   artifacts:
     paths:
     - public
diff --git a/DESCRIPTION b/DESCRIPTION
index 46650a03..1c5d092e 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: AMR
-Version: 0.7.0.9009
-Date: 2019-06-13
+Version: 0.7.0.9010
+Date: 2019-06-16
 Title: Antimicrobial Resistance Analysis
 Authors@R: c(
     person(
diff --git a/NAMESPACE b/NAMESPACE
index 043ce300..42610ec1 100755
--- a/NAMESPACE
+++ b/NAMESPACE
@@ -154,6 +154,7 @@ export(mo_renamed)
 export(mo_shortname)
 export(mo_species)
 export(mo_subspecies)
+export(mo_synonyms)
 export(mo_taxonomy)
 export(mo_type)
 export(mo_uncertainties)
diff --git a/NEWS.md b/NEWS.md
index 700f8233..57ef0325 100755
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,7 +1,7 @@
-# AMR 0.7.0.9009
+# AMR 0.7.0.9010
 #### New
-* Function `rsi_df()` to transform a `data.frame` to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combinations of existing functions `count_df()` and `portion_df()` to immediately show resistance percentages and number of available isolates:
+* Function `rsi_df()` to transform a `data.frame` to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions `count_df()` and `portion_df()` to immediately show resistance percentages and number of available isolates:
   ```r
   septic_patients %>%
     select(AMX, CIP) %>%
@@ -12,14 +12,31 @@
   # 3 Ciprofloxacin SI 0.8381831 1181
   # 4 Ciprofloxacin R 0.1618169 228
   ```
-* Support for all scientifically published pathotypes of *E. coli* to date. Supported are: AIEC (Adherent-Invasive *E. coli*), ATEC (Atypical Entero-pathogenic *E. coli*), DAEC (Diffusely Adhering *E. coli*), EAEC (Entero-Aggresive *E. coli*), EHEC (Entero-Haemorrhagic *E. coli*), EIEC (Entero-Invasive *E. coli*), EPEC (Entero-Pathogenic *E. coli*), ETEC (Entero-Toxigenic *E. coli*), NMEC (Neonatal Meningitis‐causing *E. coli*), STEC (Shiga-toxin producing *E. coli*) and UPEC (Uropathogenic *E. coli*). All these lead to the microbial ID of *E. coli*:
+* Support for all scientifically published pathotypes of *E. coli* to date. Supported are:
+
+  * AIEC (Adherent-Invasive *E. coli*)
+  * ATEC (Atypical Entero-pathogenic *E. coli*)
+  * DAEC (Diffusely Adhering *E. coli*)
+  * EAEC (Entero-Aggresive *E. coli*)
+  * EHEC (Entero-Haemorrhagic *E. coli*)
+  * EIEC (Entero-Invasive *E. coli*)
+  * EPEC (Entero-Pathogenic *E. coli*)
+  * ETEC (Entero-Toxigenic *E. coli*)
+  * NMEC (Neonatal Meningitis‐causing *E. coli*)
+  * STEC (Shiga-toxin producing *E. coli*)
+  * UPEC (Uropathogenic *E. coli*)
+
+  All these lead to the microbial ID of *E. coli*:
   ```r
   as.mo("UPEC")
   # B_ESCHR_COL
-  mo_fullname("UPEC")
+  mo_name("UPEC")
   # "Escherichia coli"
+  mo_gramstain("EHEC")
+  # "Gram-negative"
   ```
 * Function `mo_info()` as an analogy to `ab_info()`. The `mo_info()` prints a list with the full taxonomy, authors, and the URL to the online database of a microorganism
+* Function `mo_synonyms()` to get all previously accepted taxonomic names of a microorganism
 #### Changed
 * Column names of output `count_df()` and `portion_df()` are now lowercase
@@ -36,6 +53,7 @@
 * Removed antibiotic code `PVM1` from the `antibiotics` data set as this was a duplicate of `PME`
 * Fixed bug where not all old taxonomic named would not be printed when using a vector as input for `as.mo()`
 * Manually added *Trichomonas vaginalis* from the kingdom of Protozoa, which is missing from the Catalogue of Life
+* Small improvements to `plot()` and `barplot()` for MIC and RSI classes
 #### Other
 * Fixed a note thrown by CRAN tests
diff --git a/R/count.R b/R/count.R
index 7c275b11..9fd8a69b 100755
--- a/R/count.R
+++ b/R/count.R
@@ -33,7 +33,7 @@
 #'
 #' The function \code{count_df} takes any variable from \code{data} that has an \code{"rsi"} class (created with \code{\link{as.rsi}}) and counts the amounts of S, I and R. The resulting \emph{tidy data} (see Source) \code{data.frame} will have three rows (S/I/R) and a column for each variable with class \code{"rsi"}.
 #'
-#' The function \code{rsi_df} works exactly like \code{count_df}, but add the percentage of S, I and R.
+#' The function \code{rsi_df} works exactly like \code{count_df}, but adds the percentage of S, I and R.
 #' @source Wickham H. \strong{Tidy Data.} The Journal of Statistical Software, vol. 59, 2014. \url{http://vita.had.co.nz/papers/tidy-data.html}
 #' @seealso \code{\link{portion}_*} to calculate microbial resistance and susceptibility.
 #' @keywords resistance susceptibility rsi antibiotics isolate isolates
diff --git a/R/mic.R b/R/mic.R
index ea71c3da..0ef5206f 100755
--- a/R/mic.R
+++ b/R/mic.R
@@ -242,34 +242,38 @@ summary.mic <- function(object, ...) {
 #' @exportMethod plot.mic
 #' @export
-#' @importFrom dplyr %>% group_by summarise
-#' @importFrom graphics plot text
+#' @importFrom graphics barplot axis
 #' @noRd
-plot.mic <- function(x, ...) {
-  x_name <- deparse(substitute(x))
-  create_barplot_mic(x, x_name, ...)
+plot.mic <- function(x,
+                     main = paste('MIC values of', deparse(substitute(x))),
+                     ylab = 'Frequency',
+                     xlab = 'MIC value',
+                     axes = FALSE,
+                     ...) {
+  barplot(table(droplevels.factor(x)),
+          ylab = ylab,
+          xlab = xlab,
+          axes = axes,
+          main = main,
+          ...)
+  axis(2, seq(0, max(table(droplevels.factor(x)))))
 }
 #' @exportMethod barplot.mic
 #' @export
 #' @importFrom graphics barplot axis
 #' @noRd
-barplot.mic <- function(height, ...) {
-  x_name <- deparse(substitute(height))
-  create_barplot_mic(height, x_name, ...)
-}
-
-#' @importFrom graphics barplot axis
-#' @importFrom dplyr %>% group_by summarise
-create_barplot_mic <- function(x, x_name, ...) {
-  data <- data.frame(mic = droplevels(x), cnt = 1) %>%
-    group_by(mic) %>%
-    summarise(cnt = sum(cnt))
-  barplot(table(droplevels.factor(x)),
-          ylab = 'Frequency',
-          xlab = 'MIC value',
-          main = paste('MIC values of', x_name),
-          axes = FALSE,
+barplot.mic <- function(height,
+                        main = paste('MIC values of', deparse(substitute(height))),
+                        ylab = 'Frequency',
+                        xlab = 'MIC value',
+                        axes = FALSE,
+                        ...) {
+  barplot(table(droplevels.factor(height)),
+          ylab = ylab,
+          xlab = xlab,
+          axes = axes,
+          main = main,
           ...)
-  axis(2, seq(0, max(data$cnt)))
+  axis(2, seq(0, max(table(droplevels.factor(height)))))
 }
diff --git a/R/misc.R b/R/misc.R
index f1ace71b..3e5f150c 100755
--- a/R/misc.R
+++ b/R/misc.R
@@ -307,5 +307,14 @@ translate_AMR <- function(from, language = get_locale(), only_unknown = FALSE) {
 }
 "%or%" <- function(x, y) {
-  ifelse(!is.na(x), x, ifelse(!is.na(y), y, NA))
+  if (is.null(x) | is.null(y)) {
+    if (is.null(x)) {
+      return(y)
+    } else {
+      return(x)
+    }
+  }
+  ifelse(!is.na(x),
+         x,
+         ifelse(!is.na(y), y, NA))
 }
diff --git a/R/mo_property.R b/R/mo_property.R
index 69b95e83..e02bdae8 100755
--- a/R/mo_property.R
+++ b/R/mo_property.R
@@ -73,6 +73,7 @@
 #' mo_type("E. coli") # "Bacteria" (equal to kingdom, but may be translated)
 #' mo_rank("E. coli") # "species"
 #' mo_url("E. coli") # get the direct url to the online database entry
+#' mo_synonyms("E. coli") # get previously accepted taxonomic names
 #'
 #' ## scientific reference
 #' mo_ref("E. coli") # "Castellani et al., 1919"
@@ -312,12 +313,24 @@ mo_taxonomy <- function(x, language = get_locale(), ...) {
        subspecies = mo_subspecies(x, language = language))
 }
+#' @rdname mo_property
+#' @export
+mo_synonyms <- function(x, ...) {
+  x <- AMR::as.mo(x, ...)
+  col_id <- AMR::microorganisms[which(AMR::microorganisms$mo == x), "col_id"]
+  if (is.na(col_id) | !col_id %in% AMR::microorganisms.old$col_id_new) {
+    return(NULL)
+  }
+  sort(AMR::microorganisms.old[which(AMR::microorganisms.old$col_id_new == col_id), "fullname"])
+}
+
 #' @rdname mo_property
 #' @export
 mo_info <- function(x, language = get_locale(), ...) {
   x <- AMR::as.mo(x, ...)
   c(mo_taxonomy(x, language = language),
-    list(url = unname(mo_url(x, open = FALSE)),
+    list(synonyms = mo_synonyms(x),
+         url = unname(mo_url(x, open = FALSE)),
          ref = mo_ref(x)))
 }
diff --git a/R/portion.R b/R/portion.R
index 9bf11ab4..3474b52a 100755
--- a/R/portion.R
+++ b/R/portion.R
@@ -40,7 +40,7 @@
 #'
 #' The function \code{portion_df} takes any variable from \code{data} that has an \code{"rsi"} class (created with \code{\link{as.rsi}}) and calculates the portions R, I and S. The resulting \emph{tidy data} (see Source) \code{data.frame} will have three rows (S/I/R) and a column for each group and each variable with class \code{"rsi"}.
 #'
-#' The function \code{rsi_df} works exactly like \code{portion_df}, but add the number of isolates.
+#' The function \code{rsi_df} works exactly like \code{portion_df}, but adds the number of isolates.
 #' \if{html}{
 # (created with https://www.latex4technics.com/)
 #' \cr\cr
diff --git a/R/rsi.R b/R/rsi.R
index c0f83de4..2b749987 100755
--- a/R/rsi.R
+++ b/R/rsi.R
@@ -387,9 +387,14 @@ summary.rsi <- function(object, ...) {
 #' @importFrom dplyr %>% group_by summarise filter mutate if_else n_distinct
 #' @importFrom graphics plot text
 #' @noRd
-plot.rsi <- function(x, ...) {
-  x_name <- deparse(substitute(x))
-
+plot.rsi <- function(x,
+                     lwd = 2,
+                     ylim = NULL,
+                     ylab = 'Percentage',
+                     xlab = 'Antimicrobial Interpretation',
+                     main = paste('Susceptibility Analysis of', deparse(substitute(x))),
+                     axes = FALSE,
+                     ...) {
   suppressWarnings(
     data <- data.frame(x = x,
                        y = 1,
@@ -415,13 +420,12 @@ plot.rsi <- function(x, ...) {
   plot(x = data$x,
        y = data$s,
-       lwd = 2,
-       col = c('green', 'orange', 'red'),
+       lwd = lwd,
        ylim = c(0, ymax),
-       ylab = 'Percentage',
-       xlab = 'Antimicrobial Interpretation',
-       main = paste('Susceptibility Analysis of', x_name),
-       axes = FALSE,
+       ylab = ylab,
+       xlab = xlab,
+       main = main,
+       axes = axes,
        ...)
   # x axis
   axis(side = 1, at = 1:n_distinct(data$x), labels = levels(data$x), lwd = 0)
@@ -439,24 +443,32 @@ plot.rsi <- function(x, ...) {
 #' @importFrom dplyr %>% group_by summarise
 #' @importFrom graphics barplot axis
 #' @noRd
-barplot.rsi <- function(height, ...) {
-  x <- height
-  x_name <- deparse(substitute(height))
+barplot.rsi <- function(height,
+                        col = c('green3', 'orange2', 'red3'),
+                        xlab = ifelse(beside, 'Antimicrobial Interpretation', ''),
+                        main = paste('Susceptibility Analysis of', deparse(substitute(height))),
+                        ylab = 'Frequency',
+                        beside = TRUE,
+                        axes = beside,
+                        ...) {
-  suppressWarnings(
-    data <- data.frame(rsi = x, cnt = 1) %>%
-      group_by(rsi) %>%
-      summarise(cnt = sum(cnt)) %>%
-      droplevels()
-  )
+  if (axes == TRUE) {
+    par(mar = c(5, 4, 4, 2) + 0.1)
+  } else {
+    par(mar = c(2, 4, 4, 2) + 0.1)
+  }
-  barplot(table(x),
-          col = c('green3', 'orange2', 'red3'),
-          xlab = 'Antimicrobial Interpretation',
-          main = paste('Susceptibility Analysis of', x_name),
-          ylab = 'Frequency',
+  barplot(as.matrix(table(height)),
+          col = col,
+          xlab = xlab,
+          main = main,
+          ylab = ylab,
+          beside = beside,
           axes = FALSE,
           ...)
   # y axis, 0-100%
-  axis(side = 2, at = seq(0, max(data$cnt) + max(data$cnt) * 1.1, by = 25))
+  axis(side = 2, at = seq(0, max(table(height)) + max(table(height)) * 1.1, by = 25))
+  if (axes == TRUE && beside == TRUE) {
+    axis(side = 1, labels = levels(height), at = c(1, 2, 3) + 0.5, lwd = 0)
+  }
 }
diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
index 134750b1..49765907 100644
--- a/docs/LICENSE-text.html
+++ b/docs/LICENSE-text.html
@@ -78,7 +78,7 @@
diff --git a/docs/articles/AMR.html b/docs/articles/AMR.html
index 32705737..139af9fb 100644
--- a/docs/articles/AMR.html
+++ b/docs/articles/AMR.html
@@ -40,7 +40,7 @@
@@ -125,13 +125,6 @@
 Create frequency tables
-AMR.Rmd
Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 07 June 2019.
+Note: values on this page will change with every website update since they are based on randomly created values and the page was written in R Markdown. However, the methodology remains unchanged. This page was generated on 15 June 2019.
Let’s pretend that our data consists of blood cultures isolates from 1 January 2010 until 1 January 2018.
+Let’s pretend that our data consists of blood culture isolates collected between 1 January 2010 and 1 January 2018.
This `dates` object now contains all days in our date range.
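A minimal base-R sketch of how such a `dates` object can be built, assuming a daily sequence over the date range mentioned above (the exact call used in the vignette is not part of this diff):

```r
# Build a vector holding every day in the range, both end points included
dates <- seq(as.Date("2010-01-01"), as.Date("2018-01-01"), by = "day")

# Inspect the result
length(dates)  # number of days in the range
range(dates)   # first and last day
```

Sampling from this vector with `sample(dates, size = n, replace = TRUE)` would then give a random date per isolate.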
Using the `left_join()` function from the `dplyr` package, we can ‘map’ the gender to the patient ID using the `patients_table` object we created earlier:
The resulting data set contains 5,000 blood culture isolates. With the `head()` function we can preview the first 6 values of this data set:
The resulting data set contains 20,000 blood culture isolates. With the `head()` function we can preview the first 6 values of this data set:
2010-01-26 | -S3 | -Hospital B | -Escherichia coli | -R | -I | -R | -S | -F | -||||
2010-08-26 | -G7 | -Hospital D | -Escherichia coli | -R | -S | -S | -S | -M | -||||
2016-05-04 | -B8 | +2010-06-23 | +R10 | Hospital A | -Escherichia coli | -S | -S | -R | -S | -M | -||
2016-11-19 | -E10 | -Hospital D | Streptococcus pneumoniae | S | S | S | S | -M | -||||
2013-07-13 | -G3 | -Hospital A | -Escherichia coli | -R | -S | -S | -S | -M | +F | |||
2017-05-23 | -V3 | -Hospital C | -Staphylococcus aureus | +2016-04-09 | +R1 | +Hospital D | +Klebsiella pneumoniae | S | S | S | S | F |
2017-01-27 | +U10 | +Hospital A | +Streptococcus pneumoniae | +R | +S | +R | +S | +F | +||||
2017-05-23 | +C4 | +Hospital B | +Klebsiella pneumoniae | +R | +S | +R | +S | +M | +||||
2015-03-27 | +W3 | +Hospital B | +Escherichia coli | +S | +S | +S | +S | +F | +||||
2014-06-14 | +L5 | +Hospital A | +Escherichia coli | +S | +S | +S | +S | +M | +
Now, let’s start the cleaning and the analysis!
@@ -418,9 +411,9 @@
 #
 #      Item    Count   Percent   Cum. Count   Cum. Percent
 # ---  -----  -------  --------  -----------  -------------
-# 1    M      10,368     51.8%       10,368          51.8%
-# 2    F       9,632     48.2%       20,000         100.0%
-So, we can draw at least two conclusions immediately. From a data scientist perspective, the data looks clean: only values `M` and `F`. From a researcher perspective: there are slightly more men. Nothing we didn’t already know.
So, we can draw at least two conclusions immediately. From a data scientist's perspective, the data looks clean: only values `M` and `F`. From a researcher's perspective: there are slightly more men. Nothing we didn’t already know.
The data is already quite clean, but we still need to transform some variables. The `bacteria` column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The `mutate()` function of the `dplyr` package makes this really easy:
So only 28.5% is suitable for resistance analysis! We can now filter on it with the `filter()` function, also from the `dplyr` package:
So only 28.3% is suitable for resistance analysis! We can now filter on it with the `filter()` function, also from the `dplyr` package:
For future use, the above two syntaxes can be shortened with the `filter_first_isolate()` function:
We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Imagine this data, sorted on date:
+We made a slight twist to the CLSI algorithm, to take into account the antimicrobial susceptibility profile. Have a look at all isolates of patient V3, sorted on date:
isolate | @@ -536,21 +529,21 @@||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | -2010-01-16 | -V9 | +2010-01-06 | +V3 | B_ESCHR_COL | S | S | S | -S | +R | TRUE | |
2 | -2010-02-23 | -V9 | +2010-07-24 | +V3 | B_ESCHR_COL | -S | +R | S | S | S | @@ -558,10 +551,10 @@||
3 | -2010-04-05 | -V9 | +2010-07-26 | +V3 | B_ESCHR_COL | -S | +R | S | S | S | @@ -569,19 +562,19 @@||
4 | -2010-04-28 | -V9 | +2011-05-06 | +V3 | B_ESCHR_COL | +R | S | S | S | -S | -FALSE | +TRUE |
5 | -2010-05-20 | -V9 | +2011-06-04 | +V3 | B_ESCHR_COL | S | S | @@ -591,41 +584,41 @@|||||
6 | -2010-07-07 | -V9 | +2011-07-22 | +V3 | B_ESCHR_COL | S | S | +S | +S | +FALSE | +||
7 | +2011-08-15 | +V3 | +B_ESCHR_COL | +I | +I | +S | +R | +FALSE | +||||
8 | +2011-09-20 | +V3 | +B_ESCHR_COL | +R | +S | R | S | FALSE | ||||
7 | -2010-09-11 | -V9 | -B_ESCHR_COL | -S | -S | -S | -S | -FALSE | -||||
8 | -2010-09-19 | -V9 | -B_ESCHR_COL | -S | -S | -S | -S | -FALSE | -||||
9 | -2010-10-07 | -V9 | +2012-03-26 | +V3 | B_ESCHR_COL | S | S | @@ -635,18 +628,18 @@|||||
10 | -2011-01-14 | -V9 | +2012-06-01 | +V3 | B_ESCHR_COL | R | +I | S | S | -S | -FALSE | +TRUE |
Only 1 isolates are marked as ‘first’ according to CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The `key_antibiotics()` function adds a vector with 18 key antibiotics: 6 broad spectrum ones, 6 small spectrum for Gram negatives and 6 small spectrum for Gram positives. These can be defined by the user.
Only 3 isolates are marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The `key_antibiotics()` function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram-negatives and 6 narrow-spectrum ones for Gram-positives. These can be defined by the user.
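The weighting idea can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `differs_ignoring_I()` is a hypothetical helper, each antibiogram is written as a string with one S/I/R letter per key antibiotic, and two isolates count as different strains when at least one position changes between S and R.

```r
# Hypothetical helper (not part of the AMR package): compare two antibiograms
# position by position and ignore every position where either result is "I".
differs_ignoring_I <- function(a, b) {
  a <- strsplit(a, "")[[1]]
  b <- strsplit(b, "")[[1]]
  stopifnot(length(a) == length(b))
  comparable <- a != "I" & b != "I"
  any(a[comparable] != b[comparable])
}

differs_ignoring_I("SSSR", "RSSS")  # TRUE: two positions switch between S and R
differs_ignoring_I("SSSS", "SISS")  # FALSE: the only change involves I
```

The first call uses the results of isolates 1 and 2 of patient V3 from the table above, which is why the second isolate can still be selected even though it falls shortly after the first.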
If a column exists with a name like ‘key(…)ab’ the `first_isolate()` function will automatically use it and determine the first weighted isolates. Mind the NOTEs in the output below:
data <- data %>%
mutate(keyab = key_antibiotics(.)) %>%
@@ -657,7 +650,7 @@
# NOTE: Using column `patient_id` as input for `col_patient_id`.
# NOTE: Using column `keyab` as input for `col_keyantibiotics`. Use col_keyantibiotics = FALSE to prevent this.
# [Criterion] Inclusion based on key antibiotics, ignoring I.
-# => Found 15,072 first weighted isolates (75.4% of total)
isolate | @@ -674,34 +667,34 @@||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | -2010-01-16 | -V9 | +2010-01-06 | +V3 | B_ESCHR_COL | S | S | S | -S | +R | TRUE | TRUE | ||
2 | -2010-02-23 | -V9 | +2010-07-24 | +V3 | B_ESCHR_COL | -S | +R | S | S | S | FALSE | -FALSE | +TRUE | |
3 | -2010-04-05 | -V9 | +2010-07-26 | +V3 | B_ESCHR_COL | -S | +R | S | S | S | @@ -710,95 +703,95 @@||||
4 | -2010-04-28 | -V9 | +2011-05-06 | +V3 | B_ESCHR_COL | +R | S | S | S | -S | -FALSE | -FALSE | +TRUE | +TRUE |
5 | -2010-05-20 | -V9 | +2011-06-04 | +V3 | B_ESCHR_COL | S | S | S | S | FALSE | -FALSE | -|||
6 | -2010-07-07 | -V9 | -B_ESCHR_COL | -S | -S | -R | -S | -FALSE | TRUE | |||||
7 | -2010-09-11 | -V9 | +||||||||||||
6 | +2011-07-22 | +V3 | B_ESCHR_COL | S | S | S | S | FALSE | +FALSE | +|||||
7 | +2011-08-15 | +V3 | +B_ESCHR_COL | +I | +I | +S | +R | +FALSE | TRUE | |||||
8 | -2010-09-19 | -V9 | +2011-09-20 | +V3 | B_ESCHR_COL | +R | S | -S | -S | +R | S | FALSE | -FALSE | +TRUE |
9 | -2010-10-07 | -V9 | +2012-03-26 | +V3 | B_ESCHR_COL | S | S | S | S | FALSE | -FALSE | -|||
10 | -2011-01-14 | -V9 | -B_ESCHR_COL | -R | -S | -S | -S | -FALSE | +TRUE | +|||||
10 | +2012-06-01 | +V3 | +B_ESCHR_COL | +R | +I | +S | +S | +TRUE | TRUE |
Instead of 1, now 4 isolates are flagged. In total, 75.4% of all isolates are marked ‘first weighted’ - 46.9% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
+Instead of 3, now 8 isolates are flagged. In total, 76% of all isolates are marked ‘first weighted’, which is 47.7 percentage points more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
As with `filter_first_isolate()`, there’s a shortcut for this new algorithm too:
So we end up with 15,072 isolates for analysis.
+So we end up with 15,202 isolates for analysis.
We can remove unneeded columns:
@@ -806,7 +799,6 @@date | patient_id | hospital | @@ -823,89 +815,68 @@|||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | -2010-01-26 | -S3 | -Hospital B | -B_ESCHR_COL | -R | -I | -R | +2010-06-23 | +R10 | +Hospital A | +B_STRPT_PNE | S | +S | +S | +R | F | -Gram negative | -Escherichia | -coli | +Gram-positive | +Streptococcus | +pneumoniae | TRUE |
2 | -2010-08-26 | -G7 | +2016-04-09 | +R1 | Hospital D | -B_ESCHR_COL | +B_KLBSL_PNE | R | S | S | S | -M | -Gram negative | -Escherichia | -coli | -TRUE | -|||||||
3 | -2016-05-04 | -B8 | -Hospital A | -B_ESCHR_COL | -S | -S | -R | -S | -M | -Gram negative | -Escherichia | -coli | -TRUE | -||||||||||
5 | -2013-07-13 | -G3 | -Hospital A | -B_ESCHR_COL | -R | -S | -S | -S | -M | -Gram negative | -Escherichia | -coli | -TRUE | -||||||||||
8 | -2016-11-20 | -Y6 | -Hospital A | -B_ESCHR_COL | -S | -S | -R | -S | F | -Gram negative | -Escherichia | -coli | +Gram-negative | +Klebsiella | +pneumoniae | +TRUE | +|||||||
2017-01-27 | +U10 | +Hospital A | +B_STRPT_PNE | +R | +R | +R | +R | +F | +Gram-positive | +Streptococcus | +pneumoniae | TRUE | |||||||||||
10 | -2015-11-03 | -W8 | +2017-05-23 | +C4 | +Hospital B | +B_KLBSL_PNE | +R | +S | +R | +S | +M | +Gram-negative | +Klebsiella | +pneumoniae | +TRUE | +||||||||
2015-03-27 | +W3 | Hospital B | B_ESCHR_COL | S | @@ -913,7 +884,22 @@S | S | F | -Gram negative | +Gram-negative | +Escherichia | +coli | +TRUE | +|||||||||||
2014-06-14 | +L5 | +Hospital A | +B_ESCHR_COL | +S | +S | +S | +S | +M | +Gram-negative | Escherichia | coli | TRUE | @@ -935,9 +921,9 @@|||||||||||
1 | Escherichia coli | -7,451 | -49.4% | -7,451 | -49.4% | +7,485 | +49.2% | +7,485 | +49.2% | ||||||||||||||
2 | Staphylococcus aureus | -3,713 | -24.6% | -11,164 | -74.1% | +3,758 | +24.7% | +11,243 | +74.0% | ||||||||||||||
3 | Streptococcus pneumoniae | -2,367 | -15.7% | -13,531 | -89.8% | +2,371 | +15.6% | +13,614 | +89.6% | ||||||||||||||
4 | Klebsiella pneumoniae | -1,541 | -10.2% | -15,072 | +1,588 | +10.4% | +15,202 | 100.0% | |||||||||||||||
Hospital A | -0.4472198 | +0.4688237 | |||||||||||||||||||||
Hospital B | -0.4684564 | +0.4693374 | |||||||||||||||||||||
Hospital C | -0.4662494 | +0.4641460 | |||||||||||||||||||||
Hospital D | -0.4704907 | +0.4664644 |
Omit the `translate_ab = FALSE` to have the antibiotic codes (AMX, AMC, CIP, GEN) translated to official WHO names (amoxicillin, amoxicillin and betalactamase inhibitor, ciprofloxacin, gentamicin).
Omit the `translate_ab = FALSE` to have the antibiotic codes (AMX, AMC, CIP, GEN) translated to official WHO names (amoxicillin, amoxicillin/clavulanic acid, ciprofloxacin, gentamicin).
If we group on e.g. the `genus` column and add some additional functions from our package, we can create this:
# group the data on `genus`
ggplot(data_1st %>% group_by(genus)) +
@@ -1136,7 +1122,7 @@ Longest: 24
# of which we have 4 (earlier created with `as.rsi`)
geom_rsi(x = "genus") +
# split plots on antibiotic
- facet_rsi(facet = "Antibiotic") +
+ facet_rsi(facet = "antibiotic") +
# make R red, I yellow and S green
scale_rsi_colours() +
# show percentages on y axis
@@ -1154,7 +1140,7 @@ Longest: 24
data_1st %>%
group_by(genus) %>%
ggplot_rsi(x = "genus",
- facet = "Antibiotic",
+ facet = "antibiotic",
breaks = 0:4 * 25,
datalabels = FALSE) +
coord_flip()
@@ -1164,48 +1150,35 @@ Longest: 24
Independence test
The next example uses the included `septic_patients`, which is an anonymised data set containing 2,000 microbial blood culture isolates with their full antibiograms found in septic patients in 4 different hospitals in the Netherlands, between 2001 and 2017. It is true, genuine data. This `data.frame` can be used to practice AMR analysis.
-We will compare the resistance to fosfomycin (column FOS
) in hospital A and D. The input for the final fisher.test()
will be this:
-
-
-
-A
-D
-
-
-
-IR
-25
-77
-
-
-S
-24
-33
-
-
-
-We can transform the data and apply the test in only a couple of lines:
-septic_patients %>%
+We will compare the resistance to fosfomycin (column `FOS`) in hospital A and D. The input for the `fisher.test()` can be retrieved with a transformation like this:
+check_FOS <- septic_patients %>%
filter(hospital_id %in% c("A", "D")) %>% # filter on only hospitals A and D
select(hospital_id, FOS) %>% # select the hospitals and fosfomycin
group_by(hospital_id) %>% # group on the hospitals
- count_df(combine_IR = TRUE) %>% # count all isolates per group (hospital_id)
- tidyr::spread(hospital_id, Value) %>% # transform output so A and D are columns
+ count_df(combine_SI = TRUE) %>% # count all isolates per group (hospital_id)
+ tidyr::spread(hospital_id, value) %>% # transform output so A and D are columns
select(A, D) %>% # and select these only
- as.matrix() %>% # transform to good old matrix for fisher.test()
- fisher.test() # do Fisher's Exact Test
-#
-# Fisher's Exact Test for Count Data
-#
-# data: .
-# p-value = 0.03104
-# alternative hypothesis: true odds ratio is not equal to 1
-# 95 percent confidence interval:
-# 0.2111489 0.9485124
-# sample estimates:
-# odds ratio
-# 0.4488318
-As can be seen, the p value is 0.03, which means that the fosfomycin resistances found in hospital A and D are really different.
+ as.matrix() # transform to good old matrix for fisher.test()
+
+check_FOS
+# A D
+# [1,] 25 77
+# [2,] 24 33
+We can apply the test now with:
+# do Fisher's Exact Test
+fisher.test(check_FOS)
+#
+# Fisher's Exact Test for Count Data
+#
+# data: check_FOS
+# p-value = 0.03104
+# alternative hypothesis: true odds ratio is not equal to 1
+# 95 percent confidence interval:
+# 0.2111489 0.9485124
+# sample estimates:
+# odds ratio
+# 0.4488318
+As can be seen, the p value is 0.031, which is below the conventional 0.05 threshold: the fosfomycin resistance found in hospitals A and D differs significantly.
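The test can also be reproduced without the package by entering the counts by hand. A minimal base-R sketch, assuming the four counts printed for `check_FOS` in this diff (the column names are added here for readability):

```r
# Counts copied from the printed check_FOS matrix: column A = 25, 24; column D = 77, 33
check_FOS <- matrix(c(25, 24, 77, 33),
                    nrow = 2,
                    dimnames = list(NULL, c("A", "D")))

result <- fisher.test(check_FOS)
result$p.value          # two-sided p value of Fisher's Exact Test
result$p.value < 0.05   # compare against the 5% significance level
```

Keeping the matrix in its own object, as the new version of the vignette does, makes it possible to inspect the input before running the test.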
Function `rsi_df()` to transform a `data.frame` to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combinations of existing functions `count_df()` and `portion_df()` to immediately show resistance percentages and number of available isolates:
Function `rsi_df()` to transform a `data.frame` to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions `count_df()` and `portion_df()` to immediately show resistance percentages and number of available isolates:
Support for all scientifically published pathotypes of E. coli to date. Supported are: AIEC (Adherent-Invasive E. coli), ATEC (Atypical Entero-pathogenic E. coli), DAEC (Diffusely Adhering E. coli), EAEC (Entero-Aggresive E. coli), EHEC (Entero-Haemorrhagic E. coli), EIEC (Entero-Invasive E. coli), EPEC (Entero-Pathogenic E. coli), ETEC (Entero-Toxigenic E. coli), NMEC (Neonatal Meningitis‐causing E. coli), STEC (Shiga-toxin producing E. coli) and UPEC (Uropathogenic E. coli). All these lead to the microbial ID of E. coli:
+Support for all scientifically published pathotypes of E. coli to date. Supported are:
+All these lead to the microbial ID of E. coli:
+mo_name("UPEC")
+# "Escherichia coli"
+mo_gramstain("EHEC")
+# "Gram-negative"
Function `mo_info()` as an analogy to `ab_info()`. The `mo_info()` prints a list with the full taxonomy, authors, and the URL to the online database of a microorganism
mo_info()
as an analogy to ab_info()
. The mo_info()
prints a list with the full taxonomy, authors, and the URL to the online database of a microorganismFunction mo_synonyms()
to get all previously accepted taxonomic names of a microorganism
as.mo()
Small improvements to `plot()` and `barplot()` for MIC and RSI classes
`as.mo(..., allow_uncertain = 3)`
Contents
mo_name()
mo_fullname()
mo_shortname()
mo_subspecies()
mo_species()
mo_genus()
mo_family()
mo_order()
mo_class()
mo_phylum()
mo_kingdom()
mo_type()
mo_gramstain()
mo_ref()
mo_authors()
mo_year()
mo_rank()
mo_taxonomy()
mo_info()
mo_url()
mo_property()
mo_name()
mo_fullname()
mo_shortname()
mo_subspecies()
mo_species()
mo_genus()
mo_family()
mo_order()
mo_class()
mo_phylum()
mo_kingdom()
mo_type()
mo_gramstain()
mo_ref()
mo_authors()
mo_year()
mo_rank()
mo_taxonomy()
mo_synonyms()
mo_info()
mo_url()
mo_property()
Property of a microorganism