diff --git a/DESCRIPTION b/DESCRIPTION index d60a7cdd7..783dda021 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,5 +1,5 @@ Package: AMR -Version: 1.8.1.9002 +Version: 1.8.1.9003 Date: 2022-05-09 Title: Antimicrobial Resistance Data Analysis Description: Functions to simplify and standardise antimicrobial resistance (AMR) diff --git a/NEWS.md b/NEWS.md index 0dc4ce2b2..9a683ff0e 100755 --- a/NEWS.md +++ b/NEWS.md @@ -1,10 +1,11 @@ -# `AMR` 1.8.1.9002 +# `AMR` 1.8.1.9003 ## Last updated: 9 May 2022 ### Changed -* Removed `as.integer()` for MIC values, since MIC are not integer values and running `table()` on MIC values will consequently fail for not being able to retrieve the level position (as that's how normally `as.integer()` on `factor`s work) +* Removed `as.integer()` for MIC values, since MIC are not integer values and running `table()` on MIC values consequently failed for not being able to retrieve the level position (as that's how normally `as.integer()` on `factor`s work) * `droplevels()` on MIC will now return a common `factor` at default and will lose the `<mic>` class. Use `droplevels(..., as.mic = TRUE)` to keep the `<mic>` class. * Small fix for using `ab_from_text()` +* Fixes for reading in text files using `set_mo_source()`, which now also allows the source file to contain valid taxonomic names instead of only valid microorganism ID of this package # `AMR` 1.8.1 diff --git a/R/mo_source.R b/R/mo_source.R index 7d37fff1d..c1fa48350 100644 --- a/R/mo_source.R +++ b/R/mo_source.R @@ -29,16 +29,16 @@ #' #' This is **the fastest way** to have your organisation (or analysis) specific codes picked up and translated by this package, since you don't have to bother about it again after setting it up once. #' @inheritSection lifecycle Stable Lifecycle -#' @param path location of your reference file, see *Details*. Can be `""`, `NULL` or `FALSE` to delete the reference file. 
+#' @param path location of your reference file, this can be any text file (comma-, tab- or pipe-separated) or an Excel file (see *Details*). Can also be `""`, `NULL` or `FALSE` to delete the reference file. #' @param destination destination of the compressed data file, default to the user's home directory. #' @rdname mo_source #' @name mo_source #' @aliases set_mo_source get_mo_source #' @details The reference file can be a text file separated with commas (CSV) or tabs or pipes, an Excel file (either 'xls' or 'xlsx' format) or an \R object file (extension '.rds'). To use an Excel file, you will need to have the `readxl` package installed. #' -#' [set_mo_source()] will check the file for validity: it must be a [data.frame], must have a column named `"mo"` which contains values from [`microorganisms$mo`][microorganisms] and must have a reference column with your own defined values. If all tests pass, [set_mo_source()] will read the file into \R and will ask to export it to `"~/mo_source.rds"`. The CRAN policy disallows packages to write to the file system, although '*exceptions may be allowed in interactive sessions if the package obtains confirmation from the user*'. For this reason, this function only works in interactive sessions so that the user can **specifically confirm and allow** that this file will be created. The destination of this file can be set with the `destination` argument and defaults to the user's home directory. It can also be set as an \R option, using `options(AMR_mo_source = "my/location/file.rds")`. +#' [set_mo_source()] will check the file for validity: it must be a [data.frame], must have a column named `"mo"` which contains values from [`microorganisms$mo`][microorganisms] or [`microorganisms$fullname`][microorganisms] and must have a reference column with your own defined values. If all tests pass, [set_mo_source()] will read the file into \R and will ask to export it to `"~/mo_source.rds"`. 
The CRAN policy disallows packages to write to the file system, although '*exceptions may be allowed in interactive sessions if the package obtains confirmation from the user*'. For this reason, this function only works in interactive sessions so that the user can **specifically confirm and allow** that this file will be created. The destination of this file can be set with the `destination` argument and defaults to the user's home directory. It can also be set as an \R option, using `options(AMR_mo_source = "my/location/file.rds")`. #' -#' The created compressed data file `"mo_source.rds"` will be used at default for MO determination (function [as.mo()] and consequently all `mo_*` functions like [mo_genus()] and [mo_gramstain()]). The location and timestamp of the original file will be saved as an attribute to the compressed data file. +#' The created compressed data file `"mo_source.rds"` will be used at default for MO determination (function [as.mo()] and consequently all `mo_*` functions like [mo_genus()] and [mo_gramstain()]). The location and timestamp of the original file will be saved as an [attribute][base::attributes()] to the compressed data file. #' #' The function [get_mo_source()] will return the data set by reading `"mo_source.rds"` with [readRDS()]. If the original file has changed (by checking the location and timestamp of the original file), it will call [set_mo_source()] to update the data file automatically if used in an interactive session. #' @@ -46,15 +46,15 @@ #' #' @section How to Setup: #' -#' Imagine this data on a sheet of an Excel file (mo codes were looked up in the [microorganisms] data set). The first column contains the organisation specific codes, the second column contains an MO code from this package: +#' Imagine this data on a sheet of an Excel file. 
The first column contains the organisation specific codes, the second column contains valid taxonomic names: #' #' ``` -#' | A | B | -#' --|--------------------|--------------| -#' 1 | Organisation XYZ | mo | -#' 2 | lab_mo_ecoli | B_ESCHR_COLI | -#' 3 | lab_mo_kpneumoniae | B_KLBSL_PNMN | -#' 4 | | | +#' | A | B | +#' --|--------------------|-----------------------| +#' 1 | Organisation XYZ | mo | +#' 2 | lab_mo_ecoli | Escherichia coli | +#' 3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +#' 4 | | | #' ``` #' #' We save it as `"home/me/ourcodes.xlsx"`. Now we have to set it as a source: @@ -89,13 +89,13 @@ #' If we edit the Excel file by, let's say, adding row 4 like this: #' #' ``` -#' | A | B | -#' --|--------------------|--------------| -#' 1 | Organisation XYZ | mo | -#' 2 | lab_mo_ecoli | B_ESCHR_COLI | -#' 3 | lab_mo_kpneumoniae | B_KLBSL_PNMN | -#' 4 | lab_Staph_aureus | B_STPHY_AURS | -#' 5 | | | +#' | A | B | +#' --|--------------------|-----------------------| +#' 1 | Organisation XYZ | mo | +#' 2 | lab_mo_ecoli | Escherichia coli | +#' 3 | lab_mo_kpneumoniae | Klebsiella pneumoniae | +#' 4 | lab_Staph_aureus | Staphylococcus aureus | +#' 5 | | | #' ``` #' #' ...any new usage of an MO function in this package will update your data file: @@ -144,6 +144,7 @@ set_mo_source <- function(path, destination = getOption("AMR_mo_source", "~/mo_s stop_ifnot(file.exists(path), "file not found: ", path) + df <- NULL if (path %like% "[.]rds$") { df <- readRDS(path) @@ -153,28 +154,34 @@ set_mo_source <- function(path, destination = getOption("AMR_mo_source", "~/mo_s df <- readxl::read_excel(path) } else if (path %like% "[.]tsv$") { - df <- utils::read.table(header = TRUE, sep = "\t", stringsAsFactors = FALSE) + df <- utils::read.table(file = path, header = TRUE, sep = "\t", stringsAsFactors = FALSE) + + } else if (path %like% "[.]csv$") { + df <- utils::read.table(file = path, header = TRUE, sep = ",", stringsAsFactors = FALSE) } else { # try comma first try( - df 
<- utils::read.table(header = TRUE, sep = ",", stringsAsFactors = FALSE), + df <- utils::read.table(file = path, header = TRUE, sep = ",", stringsAsFactors = FALSE), silent = TRUE) if (!check_validity_mo_source(df, stop_on_error = FALSE)) { # try tab try( - df <- utils::read.table(header = TRUE, sep = "\t", stringsAsFactors = FALSE), + df <- utils::read.table(file = path, header = TRUE, sep = "\t", stringsAsFactors = FALSE), silent = TRUE) } if (!check_validity_mo_source(df, stop_on_error = FALSE)) { # try pipe try( - df <- utils::read.table(header = TRUE, sep = "|", stringsAsFactors = FALSE), + df <- utils::read.table(file = path, header = TRUE, sep = "|", stringsAsFactors = FALSE), silent = TRUE) } } # check integrity + if (is.null(df)) { + stop_("the path '", path, "' could not be imported as a dataset.") + } check_validity_mo_source(df) df <- subset(df, !is.na(mo)) @@ -187,7 +194,7 @@ set_mo_source <- function(path, destination = getOption("AMR_mo_source", "~/mo_s } df <- as.data.frame(df, stringAsFactors = FALSE) - df[, "mo"] <- set_clean_class(df[, "mo", drop = TRUE], c("mo", "character")) + df[, "mo"] <- as.mo(df[, "mo", drop = TRUE]) # success if (file.exists(mo_source_destination)) { @@ -275,9 +282,9 @@ check_validity_mo_source <- function(x, refer_to_name = "`reference_df`", stop_o return(FALSE) } } - if (!all(x$mo %in% c("", microorganisms$mo), na.rm = TRUE)) { + if (!all(x$mo %in% c("", microorganisms$mo, microorganisms$fullname), na.rm = TRUE)) { if (stop_on_error == TRUE) { - invalid <- x[which(!x$mo %in% c("", microorganisms$mo)), , drop = FALSE] + invalid <- x[which(!x$mo %in% c("", microorganisms$mo, microorganisms$fullname)), , drop = FALSE] if (nrow(invalid) > 1) { plural <- "s" } else { diff --git a/data-raw/AMR_latest.tar.gz b/data-raw/AMR_latest.tar.gz index ad0a2a2a3..f98e02ae7 100644 Binary files a/data-raw/AMR_latest.tar.gz and b/data-raw/AMR_latest.tar.gz differ diff --git a/docs/404.html b/docs/404.html index 441095c52..d0e3cfb3d 100644 
--- a/docs/404.html +++ b/docs/404.html @@ -43,7 +43,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html index 5c3893e15..12c4f12c9 100644 --- a/docs/LICENSE-text.html +++ b/docs/LICENSE-text.html @@ -17,7 +17,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/articles/datasets.html b/docs/articles/datasets.html index b2fd8a3c4..e9bbe351f 100644 --- a/docs/articles/datasets.html +++ b/docs/articles/datasets.html @@ -44,7 +44,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/authors.html b/docs/authors.html index 18f114a75..43af78f66 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -17,7 +17,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/index.html b/docs/index.html index 76a7b1d14..9ca967887 100644 --- a/docs/index.html +++ b/docs/index.html @@ -47,7 +47,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/news/index.html b/docs/news/index.html index 1621a2c78..314cf7775 100644 --- a/docs/news/index.html +++ b/docs/news/index.html @@ -17,7 +17,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 @@ -157,17 +157,18 @@
-Last updated: 9 May 2022
+Last updated: 9 May 2022
-Changed
-• Removed as.integer() for MIC values, since MIC are not integer values and running table() on MIC values will consequently fail for not being able to retrieve the level position (as that’s how normally as.integer() on factors work)
+Changed
+• Removed as.integer() for MIC values, since MIC are not integer values and running table() on MIC values consequently failed for not being able to retrieve the level position (as that’s how normally as.integer() on factors work)
 • droplevels() on MIC will now return a common factor at default and will lose the <mic> class. Use droplevels(..., as.mic = TRUE) to keep the <mic> class.
 • Small fix for using ab_from_text()
+• Fixes for reading in text files using set_mo_source(), which now also allows the source file to contain valid taxonomic names instead of only valid microorganism ID of this package
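Note on the `set_mo_source()` fix in `R/mo_source.R`: the old `utils::read.table()` calls omitted `file = path`, so text files could never be read. The separator fallback this change repairs can be sketched roughly as below. This is a minimal, standalone illustration and not the package's code: `read_with_fallback()` is a hypothetical helper, and its column check is only a simplified stand-in for `check_validity_mo_source()`.

```r
# Hypothetical helper (not part of the AMR package): try comma, tab and pipe
# in turn until the file parses into a table that has an "mo" column plus at
# least one reference column.
read_with_fallback <- function(path, seps = c(",", "\t", "|")) {
  for (sep in seps) {
    df <- tryCatch(
      utils::read.table(file = path, header = TRUE, sep = sep,
                        stringsAsFactors = FALSE),
      error = function(e) NULL
    )
    # simplified stand-in for check_validity_mo_source()
    if (!is.null(df) && "mo" %in% colnames(df) && ncol(df) >= 2) {
      return(df)
    }
  }
  stop("the path '", path, "' could not be imported as a dataset.")
}

# Usage: a pipe-separated file with taxonomic names in the "mo" column
tmp <- tempfile(fileext = ".txt")
writeLines(c("lab_code|mo",
             "lab_mo_ecoli|Escherichia coli",
             "lab_mo_kpneumoniae|Klebsiella pneumoniae"), tmp)
codes <- read_with_fallback(tmp)
```

The explicit `NULL` result on a failed read mirrors the `df <- NULL` initialisation and the `is.null(df)` guard added in this diff, which turn an unreadable file into a clear error instead of a failure further down.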
diff --git a/docs/reference/index.html b/docs/reference/index.html index d740cdf06..5e8e510ef 100644 --- a/docs/reference/index.html +++ b/docs/reference/index.html @@ -17,7 +17,7 @@ AMR (for R) - 1.8.1.9002 + 1.8.1.9003 diff --git a/docs/reference/mo_source.html b/docs/reference/mo_source.html index f3e10490c..af843bedf 100644 --- a/docs/reference/mo_source.html +++ b/docs/reference/mo_source.html @@ -18,7 +18,7 @@ This is the fastest way to have your organisation (or analysis) specific codes p AMR (for R) - 1.8.1 + 1.8.1.9003 @@ -31,7 +31,7 @@ This is the fastest way to have your organisation (or analysis) specific codes p