NEWS.md
Function custom_eucast_rules() that brings support for custom AMR rules in eucast_rules()
key_antimicrobials() and all_antimicrobials() have replaced the now deprecated function key_antibiotics()
key_antimicrobials() still only selects six preferred antibiotics for Gram-negatives, six for Gram-positives, and six universal antibiotics. It has a new antifungal argument to set antifungal agents (antimycotics).
The first_isolate() function gained the argument method that has to be “phenotype-based”, “episode-based”, “patient-based”, or “isolate-based”. The old behaviour is equal to “episode-based”, while the new default is “phenotype-based”.
type == "points" in the first_isolate() function for phenotype-based selection now considers all antimicrobial drugs in the data set, using the new all_antimicrobials()
The first_isolate() function can now take a vector of values for col_keyantibiotics and can have an episode length of Inf
The filter_first_isolate() function has not changed, as it uses the episode-based method. The filter_first_weighted_isolate() function may now include more isolates, as it uses the phenotype-based method.
The first_isolate() and key_antimicrobials() functions have been completely rewritten.
mdro(), custom_mdro_guideline()):
c()
age_groups() for persons aged zero
example_isolates data set now contains some (fictitious) zero-year old patients
data.frame or tibble now gives a warning if the data contains old microbial codes (from a previous AMR package version)
like() functions:
pattern is a valid regular expression
Added %unlike% and %unlike_case% (as negations of the existing %like% and %like_case%). This greatly improves readability:

    if (!grepl("EUCAST", guideline)) ...
    # same:
    if (guideline %unlike% "EUCAST") ...

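
To illustrate why the negated operators read more naturally, the sketch below defines two stand-in operators in base R. The names `%like_sketch%` and `%unlike_sketch%` are hypothetical and exist only for this example. The case-insensitive, Perl-compatible matching is a simplifying assumption; the package's own `%like%` and `%unlike%` may handle more cases.

```r
# Hypothetical stand-ins (not the AMR implementation): a case-insensitive
# regular expression match and its negation, written with base R only.
`%like_sketch%` <- function(x, pattern) {
  grepl(pattern, x, ignore.case = TRUE, perl = TRUE)
}
`%unlike_sketch%` <- function(x, pattern) {
  !(x %like_sketch% pattern)
}

guideline <- c("EUCAST 2021", "CLSI 2020")

# TRUE for every element that does NOT match the pattern
guideline %unlike_sketch% "EUCAST"
```

Written this way, the condition reads left to right, where `!grepl(...)` puts the pattern before the value being tested.
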
%like% -> %unlike% -> %like_case% -> %unlike_case% if you keep pressing your keyboard shortcut
info argument to as.mo() to turn on/off the progress bar
col_mo for some functions (esp. eucast_rules() and mdro()) could not be column names of the microorganisms data set as it would throw an error
eucast_rules() function and in as.rsi() to interpret MIC and disk diffusion values. This is now the default guideline in this package.
eucast_dosage() to get a data.frame with advised dosages of a certain bug-drug combination, which is based on the new dosage data set
dosage to fuel the new eucast_dosage() function and to make this data available in a structured way
example_isolates now reflects the latest EUCAST rules
only_rsi_columns for some functions, which defaults to FALSE, to indicate if the functions must only be applied to columns that are of class <rsi> (i.e., transformed with as.rsi()). This increases speed since automatic determination of antibiotic columns is not needed anymore. Affected functions are:
ab_class() and its wrappers, such as aminoglycosides(), carbapenems(), penicillins()
filter_ab_class() and its wrappers, such as filter_aminoglycosides(), filter_carbapenems(), filter_penicillins()
eucast_rules()
mdro() (including wrappers such as brmo(), mrgn() and eucast_exceptional_phenotypes())
guess_ab_col()
Functions oxazolidinones() (an antibiotic selector function) and filter_oxazolidinones() (an antibiotic filter function) to select/filter on e.g. linezolid and tedizolid

    library(dplyr)
    x <- example_isolates %>% select(date, hospital_id, oxazolidinones())
    #> Selecting oxazolidinones: column 'LNZ' (linezolid)

    x <- example_isolates %>% filter_oxazolidinones()
    #> Filtering on oxazolidinones: value in column `LNZ` (linezolid) is either "R", "S" or "I"

custom_mdro_guideline() function, please see mdro() for additional info
ggplot() generics for classes <mic> and <disk>
Function mo_is_yeast(), which determines whether a microorganism is a member of the taxonomic class Saccharomycetes or the taxonomic order Saccharomycetales:

    mo_kingdom(c("Aspergillus", "Candida"))
    #> [1] "Fungi" "Fungi"

    mo_is_yeast(c("Aspergillus", "Candida"))
    #> [1] FALSE TRUE

    # usage for filtering data:
    example_isolates[which(mo_is_yeast()), ]   # base R
    example_isolates %>% filter(mo_is_yeast()) # dplyr

The mo_type() function has also been updated to reflect this change:
antibiotics data set
MIC values (see as.mic()) can now be used in any mathematical processing, such as usage inside functions min(), max(), range(), and with binary operators (+, -, etc.). This allows for easy distribution analysis and fast filtering on MIC values:

    x <- random_mic(10)
    x
    #> Class <mic>
    #> [1] 128 0.5 2 0.125 64 0.25 >=256 8 16 4

    x[x > 4]
    #> Class <mic>
    #> [1] 128 64 >=256 8 16

    range(x)
    #> [1] 0.125 256.000

    range(log2(x))
    #> [1] -3 8

mo_url()) will now lead to https://lpsn.dsmz.de
<rsi>, <mic>, and <disk>:
translate)
plot() and with ggplot2 using ggplot() on any vector of MIC and disk diffusion values
microorganisms data set
is.rsi() and is.rsi.eligible() now return a vector of TRUE/FALSE when the input is a data set, by iterating over all columns
mo_is_gram_negative(), mo_is_gram_positive(), mo_is_intrinsic_resistant(), first_isolate(), mdro()) now work with dplyr's group_by() again
first_isolate() can be used with group_by() (also when using a dot . as input for the data) and now returns the names of the groups
microorganisms.codes (which contains popular LIS and WHONET codes for microorganisms) for some species of Mycobacterium that previously incorrectly returned M. africanum
"PNV" will now correctly be interpreted as PHN, the antibiotic code for phenoxymethylpenicillin (‘peni V’)
mdro(..., verbose = TRUE) for German guideline (3MGRN and 4MGRN) and Dutch guideline (BRMO, only P. aeruginosa)
is.rsi.eligible() now detects if the column name resembles an antibiotic name or code and now returns TRUE immediately if the input contains any of the values “R”, “S” or “I”. This drastically improves speed, also for a lot of other functions that rely on automatic determination of antibiotic columns.
get_episode() and is_new_episode() now support less than a day as value for argument episode_days (e.g., to include one patient/test per hour)
ampc_cephalosporin_resistance in eucast_rules() now also applies to value “I” (not only “S”)
print() and summary() on a Principal Components Analysis object (pca()) now print additional group info if the original data was grouped using dplyr::group_by()
guess_ab_col(). As this also internally improves the reliability of first_isolate() and mdro(), this might have a slight impact on the results of those functions.
mo_name() when used in other languages than English
The like() function (and its fast alias %like%) now always uses Perl compatibility, improving speed for many functions in this package (e.g., as.mo() is now up to 4 times faster)
random_disk() and random_mic() now have an expanded range in their randomisation
mo_genus("GISA") will return "Staphylococcus"
as.ab() when the input is an official name or ATC code
include_untested_rsi to the first_isolate() functions (defaults to TRUE to keep existing behaviour), to be able to exclude rows where all R/SI values (class <rsi>, see as.rsi()) are empty
library(AMR)) now is ~50 times faster than before, at the cost of package size (which increased by ~3 MB)
Functions get_episode() and is_new_episode() to determine (patient) episodes which are not necessarily based on microorganisms. The get_episode() function returns the index number of the episode per group, while the is_new_episode() function returns values TRUE/FALSE to indicate whether an item in a vector is the start of a new episode. They also support dplyr's grouping (i.e. using group_by()):

    library(dplyr)
    example_isolates %>%
      group_by(patient_id, hospital_id) %>%
      filter(is_new_episode(date, episode_days = 60))

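
The relation between an episode index and a "new episode" flag can be sketched in base R. This is not the package's implementation: `episode_index_sketch()` is a hypothetical helper, and it assumes that a new episode starts once a date falls at least `episode_days` after the start of the current episode.

```r
# Hypothetical helper (an assumption, not AMR code): assign an episode
# index to each date, starting a new episode when a date lies at least
# `episode_days` after the first date of the current episode.
episode_index_sketch <- function(dates, episode_days) {
  ord <- order(dates)
  sorted <- as.numeric(dates[ord]) # days since 1970-01-01
  idx <- integer(length(sorted))
  start <- sorted[1]
  current <- 1L
  for (i in seq_along(sorted)) {
    if (sorted[i] - start >= episode_days) {
      current <- current + 1L
      start <- sorted[i]
    }
    idx[i] <- current
  }
  out <- integer(length(dates))
  out[ord] <- idx # restore the original order
  out
}

dates <- as.Date(c("2020-01-01", "2020-01-20", "2020-03-15", "2020-03-16"))
episodes <- episode_index_sketch(dates, episode_days = 60)

# An item starts a new episode when its index occurs for the first time
# (this shortcut assumes the dates are already in chronological order)
new_episode <- !duplicated(episodes)
```

In this sketch, `episodes` plays the role of an episode index and `new_episode` the role of a TRUE/FALSE flag that can be passed to `filter()`.
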
mo_is_gram_negative() and mo_is_gram_positive() as wrappers around mo_gramstain(). They always return TRUE or FALSE (except when the input is NA or the MO code is UNKNOWN), thus always return FALSE for species outside the taxonomic kingdom of Bacteria.
mo_is_intrinsic_resistant() to test for intrinsic resistance, based on EUCAST Intrinsic Resistance and Unusual Phenotypes v3.2 from 2020.
Functions random_mic(), random_disk() and random_rsi() for random value generation. The functions random_mic() and random_disk() take microorganism names and antibiotic names as input to make generation more realistic.
ampc_cephalosporin_resistance in eucast_rules() to correct for AmpC de-repressed cephalosporin-resistant mutants
as.rsi():
as.rsi() can now be set by the user, using the reference_data argument. This allows for using own interpretation guidelines. The user-set data must have the same structure as rsi_translation.
as.rsi() on a data.frame
as.rsi() on a data.frame in older R versions
as.rsi() on a data.frame will not print a message anymore if the values are already clean R/SI values
as.rsi() on MICs or disk diffusion while there is intrinsic antimicrobial resistance, a warning will be thrown to remind about this
as.rsi() on a data.frame that only contains one column for antibiotic interpretations
dplyr verbs, such as filter(), mutate() and summarise(). This means that then the data argument does not need to be set anymore. This is the case for the new functions:
… and for the existing functions:
first_isolate(), key_antibiotics(), mdro(), brmo(), mrgn(), mdr_tb(), mdr_cmi2012(), eucast_exceptional_phenotypes()

    # to select first isolates that are Gram-negative
    # and view results of cephalosporins and aminoglycosides:
    library(dplyr)
    example_isolates %>%
      filter(first_isolate(), mo_is_gram_negative()) %>%
      select(mo, cephalosporins(), aminoglycosides()) %>%
      as_tibble()

For antibiotic selection functions (such as cephalosporins(), aminoglycosides()) to select columns based on a certain antibiotic group, the dependency on the tidyselect package was removed, meaning that they can now also be used without the need to have this package installed and now also work in base R function calls (they rely on R 3.2 or later):

    # above example in base R:
    example_isolates[which(first_isolate() & mo_is_gram_negative()),
                     c("mo", cephalosporins(), aminoglycosides())]

typed package). If the user input for a certain function does not meet the requirements for a specific argument (such as the class or length), an informative error will be thrown. This makes the package more robust and the use of it more reproducible and reliable. In total, more than 420 arguments were defined.
set_mo_source(), that previously would not remember the file location of the original file
p_symbol() that does not really fit the scope of this package. It will be removed in a future version. See here for the source code to preserve it.
reference_df in as.mo() and mo_*() functions that contain old microbial codes (from previous package versions)
mo_uncertainties() would not return the results based on the MO matching score
as.mo() would not return results for known laboratory codes for microorganisms
as.ab() would sometimes fail
plot()
plot() generic to class <disk>
mo_genus("LA-MRSA") will return "Staphylococcus" and mo_is_gram_positive("LA-MRSA") will return TRUE.
NA
mo_shortname() when the input contains NA
If as.mo() takes more than 30 seconds, some suggestions will be given to improve speed
options() were all removed in favour of a new internal environment pkg_env
sapply() calls with vapply())
eucast_rules() function can now correct for more than 180 different antibiotics and the mdro() function can determine multidrug resistance based on more than 150 different antibiotics. All previously implemented versions of the EUCAST rules are now maintained and kept available in this package. The eucast_rules() function consequently gained the arguments version_breakpoints (at the moment defaults to v10.0, 2020) and version_expertrules (at the moment defaults to v3.2, 2020). The example_isolates data set now also reflects the change from v3.1 to v3.2. The mdro() function now accepts guideline == "EUCAST3.1" and guideline == "EUCAST3.2".
Data set intrinsic_resistant. This data set contains all bug-drug combinations where the ‘bug’ is intrinsically resistant to the ‘drug’ according to the latest EUCAST insights. It contains just two columns: microorganism and antibiotic.
Curious about which enterococci are actually intrinsically resistant to vancomycin?
Support for skimming classes <rsi>, <mic>, <disk> and <mo> with the skimr package
as.rsi():
Support for using dplyr’s across() to interpret MIC values or disk zone diameters, which also automatically determines the column with microorganism names or codes.
as.rsi(df, col1:col9)
FALSE), that considers intrinsic resistance according to EUCAST
Added intelligent data cleaning to as.disk(), so numbers can also be extracted from text and decimal numbers will always be rounded up:
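
The two cleaning steps named above, extracting a number from text and always rounding decimals up, can be sketched in base R. `clean_disk_sketch()` is a hypothetical helper written for this illustration; it assumes that the first number found in the text is the zone diameter and does not reproduce the package's actual parsing rules.

```r
# Hypothetical helper (not the AMR implementation): pull the first number
# out of each input and round it up to a whole millimetre.
clean_disk_sketch <- function(x) {
  x <- as.character(x)
  hit <- regexpr("[0-9]+([.][0-9]+)?", x)
  found <- !is.na(hit) & hit > 0
  out <- rep(NA_real_, length(x))
  out[found] <- as.numeric(regmatches(x, hit))
  as.integer(ceiling(out)) # decimals are always rounded up
}

cleaned <- clean_disk_sketch(c("disk zone: 23.4 mm", "23.4", "no value"))
```

Inputs without any number end up as `NA` in this sketch, so unusable values stay visible instead of being silently dropped.
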
as.mo():
mo_matching_score(). Any user input value that could mean more than one taxonomic entry is now considered ‘uncertain’. Instead of a warning, a message will be thrown and the accompanying mo_uncertainties() has been changed completely; it now prints all possible candidates with their matching score.
mo_* functions like mo_name() on microorganism IDs.
ignore_pattern to as.mo() which can also be given to mo_* functions like mo_name(), to exclude known non-relevant input from analysing. This can also be set with the option AMR_ignore_pattern.
get_locale() now uses at default Sys.getenv("LANG") or, if LANG is not set, Sys.getlocale(). This can be overwritten by setting the option AMR_locale.
eucast_rules()
mo_shortname() now returns the genus for input where the species is unknown
mo_genus("BORSA") will return “Staphylococcus”
tibble printing support for classes <rsi>, <mic>, <disk>, <ab> and <mo>. When using tibbles containing antimicrobial columns (class <rsi>), “S” will print in green, “I” will print in yellow and “R” will print in red. Microbial IDs (class <mo>) will emphasise on the genus and species, not on the kingdom.
antivirals now have a starting capital letter, like it is the case in the antibiotics data set
WHONET data set to clarify that all patient names are fictitious
as.ab() algorithm improvements
c(as.mic(2), 2) previously failed but now returns a valid MIC class
ggplot_rsi() and geom_rsi() gained arguments minimum and language, to influence the internal use of rsi_df()
antibiotics data set:
PEN)
PNV) was removed, since its actual entry ‘Phenoxymethylpenicillin’ (code PHN) already existed
antibiotics$group) of ‘Linezolid’ (LNZ), ‘Cycloserine’ (CYC), ‘Tedizolid’ (TZD) and ‘Thiacetazone’ (THA) is now “Oxazolidinones” instead of “Other antibacterials”
unique() on classes <rsi>, <mic>, <disk>, <ab> and <mo>
Added argument excess to the kurtosis() function (defaults to FALSE), to return the excess kurtosis, defined as the kurtosis minus three.
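
A minimal base R sketch of what `excess` means here, assuming the common moment-based definition of kurtosis (fourth central moment divided by the squared second central moment). The function name `kurtosis_sketch()` is hypothetical, and the package's own kurtosis() may use a different estimator.

```r
# Hypothetical sketch (not the AMR implementation): moment-based kurtosis,
# with excess = TRUE returning the kurtosis minus three.
kurtosis_sketch <- function(x, excess = FALSE) {
  x <- x[!is.na(x)]
  n <- length(x)
  d <- x - mean(x)
  k <- n * sum(d^4) / (sum(d^2)^2)
  if (excess) k - 3 else k
}

k_plain  <- kurtosis_sketch(1:5)
k_excess <- kurtosis_sketch(1:5, excess = TRUE)
```

Subtracting three makes a normal distribution the zero reference point, since its kurtosis under this definition equals three.
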
portion_R(), portion_S() and portion_I() that were deprecated since version 0.9.0 (November 2019) and were replaced with proportion_R(), proportion_S() and proportion_I()
base package
Suggests field of the DESCRIPTION file
ab_from_text() to retrieve antimicrobial drug names, doses and forms of administration from clinical texts in e.g. health care records, which also corrects for misspelling since it uses as.ab() internally
Tidyverse selection helpers for antibiotic classes, that help to select the columns of antibiotics that are of a specific antibiotic class, without the need to define the columns or antibiotic abbreviations. They can be used in any function that allows selection helpers, like dplyr::select() and tidyr::pivot_longer():

    library(dplyr)

    # Columns 'IPM' and 'MEM' are in the example_isolates data set
    example_isolates %>%
      select(carbapenems())
    #> Selecting carbapenems: `IPM` (imipenem), `MEM` (meropenem)

mo_domain() as an alias to mo_kingdom()
filter_penicillins() to filter isolates on a specific result in any column with a name in the antimicrobial ‘penicillins’ class (more specific: ATC subgroup Beta-lactam antibacterials, penicillins)
filter_ab_class() functions, such as filter_aminoglycosides()
antibiotics data set
Added argument conserve_capped_values to as.rsi() for interpreting MIC values - it makes sure that values starting with “<” (but not “<=”) will always return “S” and values starting with “>” (but not “>=”) will always return “R”. The default behaviour of as.rsi() has not changed, so you need to specifically do as.rsi(..., conserve_capped_values = TRUE).
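
The rule behind conserve_capped_values can be sketched in base R as follows. `apply_capped_sketch()` is a hypothetical helper that only illustrates the rule as worded above; the real interpretation is done inside as.rsi() and depends on breakpoints that are not modelled here.

```r
# Hypothetical helper (not the AMR implementation): override an existing
# interpretation for capped MIC values. "<x" (but not "<=x") becomes "S",
# ">x" (but not ">=x") becomes "R"; everything else is left untouched.
apply_capped_sketch <- function(mic, interpreted) {
  mic <- as.character(mic)
  below <- grepl("^<[^=]", mic)
  above <- grepl("^>[^=]", mic)
  interpreted[below] <- "S"
  interpreted[above] <- "R"
  interpreted
}

mic_values <- c("<0.5", "<=0.5", ">16", ">=16", "2")
# Placeholder interpretations, made up for the example
capped <- apply_capped_sketch(mic_values, rep("I", 5))
```

Only the first and third values change in this example, because the other capped values include the equals sign.
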
Big speed improvement for using any function on microorganism codes from earlier package versions (prior to AMR v1.2.0), such as as.mo(), mo_name(), first_isolate(), eucast_rules(), mdro(), etc.
AMR v0.5.0 and lower) are not supported anymore. Use as.mo() on your microorganism names or codes to transform them to current abbreviations used in this package.
susceptibility() and resistance() and all count_*(), proportion_*() functions:
dplyr::all_of()) now works again
as.ab():
as.ab(), making many more input errors translatable, such as digitalised health care records, using too few or too many vowels or consonants and many more
as.ab() would return an error on invalid input values
as.ab() function will now throw a note if more than 1 antimicrobial drug could be retrieved from a single input value.
eucast_rules() would not work on a tibble when the tibble or dplyr package was loaded
as.rsi()), that also included results for animals. It now only contains interpretation guidelines for humans.
*_join_microorganisms() functions and bug_drug_combinations() now return the original data class (e.g. tibbles and data.tables)
rsi_df(), proportion_df() and count_df():
count_df()) when all antibiotics in the data set have only NAs
<mo> and <Date>
bug_drug_combinations() for when only one antibiotic was in the input data
<rsi>, to highlight the %SI vs. %R
mdro() and filter_ab_class()
arrows_textangled for ggplot_pca() to indicate whether the text at the end of the arrows should be angled (defaults to TRUE, as it was in previous versions)
Fixed a bug where as.mic() could not handle dots without a leading zero (like "<=.25")
Removed code dependency on all other R packages, making this package fully independent of the development process of others. This is a major code change, but will probably not be noticeable by most users.
Making this package independent of especially the tidyverse (e.g. packages dplyr and tidyr) tremendously increases sustainability in the long term, since tidyverse functions change quite often. Good for users, but hard for package maintainers. Most of our functions are replaced with versions that only rely on base R, which keeps this package fully functional for many years to come, without requiring a lot of maintenance to keep up with other packages anymore. Another upside is that this package can now be used with all versions of R since R-3.0.0 (April 2013). Our package is being used in settings where the resources are very limited. Fewer dependencies on newer software is helpful for such settings.
freq() that was borrowed from the cleaner package was removed. Use cleaner::freq(), or run library("cleaner") before you use freq().
mo or rsi in a tibble will no longer be in colour and printing rsi in a tibble will show the class <ord>, not <rsi> anymore. This is purely a visual effect.
mo_* family (like mo_name() and mo_gramstain()) are noticeably slower when running on hundreds of thousands of rows.
mo and ab now both also inherit class character, to support any data transformation. This change invalidates code that checks for class length == 1.
first_isolate()), since some bacterial names might be renamed to other genera or other (sub)species. This is expected behaviour.
eucast_rules() function no longer applies “other” rules at default that are made available by this package (like setting ampicillin = R when ampicillin + enzyme inhibitor = R). The default input value for rules is now c("breakpoints", "expert") instead of "all", but this can be changed by the user. To return to the old behaviour, set options(AMR.eucast_rules = "all").
antibiotics data set these two rules:
eucast_rules()
ab_url() to return the direct URL of an antimicrobial agent from the official WHO website
as.ab(), so that e.g. as.ab("ampi sul") and ab_name("ampi sul") work
ab_atc() and ab_group() now return NA if no antimicrobial agent could be found
set_mo_source() to make sure that column mo will always be the second column
p.symbol() - it was replaced with p_symbol()
read.4d(), that was only useful for reading data from an old test database.
pca() function
ggplot_pca() function
as.mo() (and consequently all mo_* functions, that use as.mo() internally):
SPE for species, like "ESCSPE" for Escherichia coli
antibiotics data set
as.rsi() for years 2010-2019 (thanks to Anthony Underwood)
Interpretation from MIC values (and disk zones) to R/SI can now be used with mutate_at() of the dplyr package:
uti (as abbreviation of urinary tract infections) as argument to as.rsi(), so interpretation of MIC values and disk zones can be made dependent on isolates specifically from UTIs
Info printing in functions eucast_rules(), first_isolate(), mdro() and resistance_predict() will now at default only print when R is in an interactive mode (i.e. not in RMarkdown)
This software is now out of beta and considered stable. Nonetheless, this package will be developed continually.
as.rsi() and inferred resistance and susceptibility using eucast_rules().
Support for LOINC codes in the antibiotics data set. Use ab_loinc() to retrieve LOINC codes, or use a LOINC code for input in any ab_* function:
Support for SNOMED CT codes in the microorganisms data set. Use mo_snomed() to retrieve SNOMED codes, or use a SNOMED code for input in any mo_* function:

    mo_snomed("S. aureus")
    #> [1] 115329001 3092008 113961008
    mo_name(115329001)
    #> [1] "Staphylococcus aureus"
    mo_gramstain(115329001)
    #> [1] "Gram-positive"

as.mo() function previously wrote to the package folder to improve calculation speed for previously calculated results. This is no longer the case, to comply with CRAN policies. Consequently, the function clear_mo_history() was removed.
as.rsi()
as.mo() (and consequently all mo_* functions, that use as.mo() internally):
as.mo("Methicillin-resistant S.aureus")
as.disk() limited to a maximum of 50 millimeters
tidyverse
as.ab(): support for drugs starting with “co-” like co-amoxiclav, co-trimoxazole, co-trimazine and co-trimazole (thanks to Peter Dutey)
antibiotics data set (thanks to Peter Dutey):
RIF) to rifampicin/isoniazid (RFI). Please note that the combination rifampicin/isoniazid has no DDDs defined, so e.g. ab_ddd("Rimactazid") will now return NA.
SMX) to trimethoprim/sulfamethoxazole (SXT)
microorganisms data set, which means that the new order Enterobacterales now consists of a part of the existing family Enterobacteriaceae, but that this family has been split into other families as well (like Morganellaceae and Yersiniaceae). Although published in 2016, this information is not yet in the Catalogue of Life version of 2019. All MDRO determinations with mdro() will now use the Enterobacterales order for all guidelines before 2016 that were dependent on the Enterobacteriaceae family.
Functions susceptibility() and resistance() as aliases of proportion_SI() and proportion_R(), respectively. These functions were added to make it more clear that “I” should be considered susceptible and not resistant.

    library(dplyr)
    example_isolates %>%
      group_by(bug = mo_name(mo)) %>%
      summarise(amoxicillin = resistance(AMX),
                amox_clav = resistance(AMC)) %>%
      filter(!is.na(amoxicillin) | !is.na(amox_clav))

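
The point that “I” counts as susceptible can be made concrete with a small base R sketch. The helpers `resistance_sketch()` and `susceptibility_sketch()` are hypothetical and only compute plain proportions over non-missing values; any further options of the package functions are not modelled.

```r
# Hypothetical helpers (not the AMR implementation): plain proportions
# over the non-missing interpretations.
resistance_sketch <- function(x) {
  x <- x[!is.na(x)]
  mean(x == "R")
}
susceptibility_sketch <- function(x) {
  x <- x[!is.na(x)]
  mean(x %in% c("S", "I")) # "I" is counted as susceptible
}

results <- c("S", "I", "R", "R", NA, "S")
prop_r  <- resistance_sketch(results)
prop_si <- susceptibility_sketch(results)
```

Because every non-missing value is either R or one of S/I, the two proportions add up to one in this sketch.
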
mdro() function
mdro(..., verbose = TRUE)) returns an informative data set where the reason for MDRO determination is given for every isolate, and a list of the resistant antimicrobial agents
Data set antivirals, containing all entries from the ATC J05 group with their DDDs for oral and parenteral treatment
as.mo():
Added a score (a certainty percentage) to mo_uncertainties(), that is calculated using the Levenshtein distance:

    as.mo(c("Stafylococcus aureus", "staphylokok aureuz"))
    #> Warning:
    #> Results of two values were guessed with uncertainty. Use mo_uncertainties() to review them.
    #> Class 'mo'
    #> [1] B_STPHY_AURS B_STPHY_AURS

    mo_uncertainties()
    #> "Stafylococcus aureus" -> Staphylococcus aureus (B_STPHY_AURS, score: 95.2%)
    #> "staphylokok aureuz" -> Staphylococcus aureus (B_STPHY_AURS, score: 85.7%)

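
As a sketch of how a Levenshtein-based certainty score can work: the formula below, (n - 0.5 * L) / n with L the case-insensitive Levenshtein distance and n the length of the full taxonomic name, reproduces the two percentages shown above. Whether this is exactly what the package computes is an assumption, and `score_sketch()` is a hypothetical helper.

```r
# Hypothetical helper (the formula is an assumption that happens to match
# the two scores above): certainty score from the Levenshtein distance.
score_sketch <- function(input, fullname) {
  # utils::adist() returns the generalised Levenshtein (edit) distance
  L <- as.vector(utils::adist(input, fullname, ignore.case = TRUE))
  n <- nchar(fullname)
  (n - 0.5 * L) / n
}

scores <- score_sketch(c("Stafylococcus aureus", "staphylokok aureuz"),
                       "Staphylococcus aureus")
percentages <- round(100 * scores, 1)
```

The first input is two edits away from the full name and the second six edits away, which is why it receives the lower score.
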
as.atc() - this function was replaced by ab_atc()
portion_* functions to proportion_*. All portion_* functions are still available as deprecated functions, and will return a warning when used.
as.rsi() over a data set, it will now print the guideline that will be used if it is not specified by the user
eucast_rules():
eucast_rules() are now applied first and not as last anymore. This is to improve the dependency on certain antibiotics for the official EUCAST rules. Please see ?eucast_rules.
as.rsi() where the input is NA
mdro() and eucast_rules()
antibiotics data set
example_isolates data set to better reflect reality
mo_info()
clean to cleaner, as this package was renamed accordingly upon CRAN request
Determination of first isolates now excludes all ‘unknown’ microorganisms at default, i.e. microbial code "UNKNOWN". They can be included with the new argument include_unknown:

    first_isolate(..., include_unknown = TRUE)

"con" (contamination) will be excluded at default, since as.mo("con") = "UNKNOWN". The function always shows a note with the number of ‘unknown’ microorganisms that were included or excluded.
For code consistency, classes ab and mo will now be preserved in any subsetting or assignment. For the sake of data integrity, this means that invalid assignments will now result in NA:

    # how it works in base R:
    x <- factor("A")
    x[1] <- "B"
    #> Warning message:
    #> invalid factor level, NA generated

    # how it now works similarly for classes 'mo' and 'ab':
    x <- as.mo("E. coli")
    x[1] <- "testvalue"
    #> Warning message:
    #> invalid microorganism code, NA generated

"testvalue" could never be understood by e.g. mo_name(), although the class would suggest a valid microbial code.
freq() has moved to a new package, clean (CRAN link), since creating frequency tables actually does not fit the scope of this package. The freq() function still works, since it is re-exported from the clean package (which will be installed automatically upon updating this AMR package).
Renamed data set septic_patients to example_isolates
Function bug_drug_combinations() to quickly get a data.frame with the results of all bug-drug combinations in a data set. The column containing microorganism codes is guessed automatically and its input is transformed with mo_shortname() at default:

    x <- bug_drug_combinations(example_isolates)
    #> NOTE: Using column `mo` as input for `col_mo`.
    x[1:4, ]
    #>             mo  ab S I R total
    #> 1 A. baumannii AMC 0 0 3     3
    #> 2 A. baumannii AMK 0 0 0     0
    #> 3 A. baumannii AMP 0 0 3     3
    #> 4 A. baumannii AMX 0 0 3     3
    #> NOTE: Use 'format()' on this result to get a publicable/printable format.

    # change the transformation with the FUN argument to anything you like:
    x <- bug_drug_combinations(example_isolates, FUN = mo_gramstain)
    #> NOTE: Using column `mo` as input for `col_mo`.
    x[1:4, ]
    #>              mo  ab   S  I   R total
    #> 1 Gram-negative AMC 469 89 174   732
    #> 2 Gram-negative AMK 251  0   2   253
    #> 3 Gram-negative AMP 227  0 405   632
    #> 4 Gram-negative AMX 227  0 405   632
    #> NOTE: Use 'format()' on this result to get a publicable/printable format.

You can format this to a printable format, ready for reporting or exporting to e.g. Excel with the base R format() function:
format(x, combine_IR = FALSE)
Additional way to calculate co-resistance, i.e. when using multiple antimicrobials as input for portion_* functions or count_* functions. This can be used to determine the empiric susceptibility of a combination therapy. A new argument only_all_tested (which defaults to FALSE) replaces the old also_single_tested and can be used to select one of the two methods to count isolates and calculate portions. The difference can be seen in this example table (which is also on the portion and count help pages), where the %SI is being determined:

    # --------------------------------------------------------------------
    #                     only_all_tested = FALSE  only_all_tested = TRUE
    #                     -----------------------  -----------------------
    #  Drug A    Drug B   include as  include as   include as  include as
    #                     numerator   denominator  numerator   denominator
    # --------  --------  ----------  -----------  ----------  -----------
    #  S or I    S or I       X            X            X            X
    #    R       S or I       X            X            X            X
    #   <NA>     S or I       X            X            -            -
    #  S or I      R          X            X            X            X
    #    R         R          -            X            -            X
    #   <NA>       R          -            -            -            -
    #  S or I     <NA>        X            X            -            -
    #    R        <NA>        -            -            -            -
    #   <NA>      <NA>        -            -            -            -
    # --------------------------------------------------------------------

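
The counting rules in the table above can be written out as a short base R sketch. `combo_si_sketch()` is a hypothetical helper for two drugs only, derived directly from the table rows; it is not the package's implementation.

```r
# Hypothetical helper (derived from the table, not AMR code): %SI of a
# two-drug combination under both counting methods.
combo_si_sketch <- function(a, b, only_all_tested = FALSE) {
  any_si <- a %in% c("S", "I") | b %in% c("S", "I")
  both_tested <- !is.na(a) & !is.na(b)
  if (only_all_tested) {
    denominator <- both_tested
    numerator <- both_tested & any_si
  } else {
    numerator <- any_si
    # isolates tested for both drugs always count in the denominator
    denominator <- any_si | both_tested
  }
  sum(numerator) / sum(denominator)
}

# One isolate per table row, in the same order as the table
drug_a <- c("S", "R", NA, "S", "R", NA, "S", "R", NA)
drug_b <- c("S", "S", "S", "R", "R", "R", NA, NA, NA)

si_default    <- combo_si_sketch(drug_a, drug_b)
si_all_tested <- combo_si_sketch(drug_a, drug_b, only_all_tested = TRUE)
```

With one isolate per row, the default method gives 5/6 and the stricter method gives 3/4, matching a manual count of the X marks in each column.
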
also_single_tested will throw an informative error that it has been replaced by only_all_tested.
tibble printing support for classes rsi, mic, disk, ab and mo. When using tibbles containing antimicrobial columns, values S will print in green, values I will print in yellow and values R will print in red. Microbial IDs (class mo) will emphasise on the genus and species, not on the kingdom.
as.mo() (of which some led to additions to the microorganisms data set). Many thanks to all contributors that helped improving the algorithms.
B_ENTRC_FAE could have been both E. faecalis and E. faecium. Its new code is B_ENTRC_FCLS and E. faecium has become B_ENTRC_FACM. Also, the Latin character æ (ae) is now preserved at the start of each genus and species abbreviation. For example, the old code for Aerococcus urinae was B_ARCCC_NAE. This is now B_AERCC_URIN. IMPORTANT: Old microorganism IDs are still supported, but support will be dropped in a future version. Use as.mo() on your old codes to transform them to the new format. Using functions from the mo_* family (like mo_name() and mo_gramstain()) on old codes, will throw a warning.
as.ab(), including bidirectional language support
mdro() function, to determine multi-drug resistant organisms
eucast_rules():
eucast_rules(..., verbose = TRUE)) returns more informative and readable output
AMR:::get_column_abx())
atc - using as.atc() is now deprecated in favour of ab_atc() and this will return a character, not the atc class anymore
abname(), ab_official(), atc_name(), atc_official(), atc_property(), atc_tradenames(), atc_trivial_nl()
mo_shortname()
mo_* functions where the coercion uncertainties and failures would not be available through mo_uncertainties() and mo_failures() anymore
country argument of mdro() in favour of the already existing guideline argument to support multiple guidelines within one country
name of RIF is now Rifampicin instead of Rifampin
antibiotics data set is now sorted by name and all cephalosporins now have their generation between brackets
guess_ab_col() which is now 30 times faster for antibiotic abbreviations
filter_ab_class() to be more reliable and to support 5th generation cephalosporins
availability() now uses portion_R() instead of portion_IR(), to comply with EUCAST insights
age() and age_groups() now have a na.rm argument to remove empty values
p.symbol() to p_symbol() (the former is now deprecated and will be removed in a future version)
x in age_groups() will now introduce NAs and not return an error anymore
key_antibiotics() on foreign systems
mdr_tb()
as.mic())
Function rsi_df() to transform a data.frame to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions count_df() and portion_df() to immediately show resistance percentages and number of available isolates:
Support for all scientifically published pathotypes of E. coli to date (that we could find). Supported are:
All these lead to the microbial ID of E. coli:

    as.mo("UPEC")
    # B_ESCHR_COL
    mo_name("UPEC")
    # "Escherichia coli"
    mo_gramstain("EHEC")
    # "Gram-negative"

mo_info() as an analogy to ab_info(). The mo_info() prints a list with the full taxonomy, authors, and the URL to the online database of a microorganism
Function mo_synonyms() to get all previously accepted taxonomic names of a microorganism
count_df() and portion_df() are now lowercase
as.ab() and as.mo() to understand even more severely misspelled input
as.ab() now allows spaces for coercing antibiotics names
ggplot2 methods for automatically determining the scale type of classes mo and ab
"bacteria" from getting coerced by as.ab() because Bacterial is a brand name of trimethoprim (TMP)
eucast_rules() and mdro()
latest_annual_release from the catalogue_of_life_version() function
PVM1 from the antibiotics data set as this was a duplicate of PME
as.mo()
plot() and barplot() for MIC and RSI classes
as.mo()
as.rsi() on an MIC value (created with as.mic()), a disk diffusion value (created with the new as.disk()) or on a complete data set containing columns with MIC or disk diffusion values.
mo_name() as alias of mo_fullname()
mdr_tb()) and added a new vignette about MDR. Read this tutorial here on our website.
first_isolate() where missing species would lead to incorrect FALSEs. This bug was not present in AMR v0.5.0, but was in v0.6.0 and v0.6.1.
eucast_rules() where antibiotics from WHONET software would not be recognised
antibiotics data set:
ab contains a human readable EARS-Net code, used by ECDC and WHO/WHONET - this is the primary identifier used in this package
atc contains the ATC code, used by WHO/WHOCC
cid contains the CID code (Compound ID), used by PubChem
AMX for amoxicillin
atc_certe, ab_umcg and atc_trivial_nl have been removed
atc_* functions are superseded by ab_* functions
ggplot_rsi():
colours to set the bar colours
title, subtitle, caption, x.title and y.title to set titles and axis descriptions
guess_ab_col()
microorganisms.old data set, which leads to better results finding when using the as.mo() function
portion_df() and count_df() this means that their new argument combine_SI is TRUE at default. Our plotting function ggplot_rsi() also reflects this change since it uses count_df() internally.
age() function gained a new argument exact to determine ages with decimals
guess_mo(), guess_atc(), EUCAST_rules(), interpretive_reading(), rsi()
freq()):
support for boxplots:
age_groups(), to let groups of fives and tens end with 100+ instead of 120+
freq() for when all values are NA
first_isolate() for when dates are missing
guess_ab_col()
as.mo() now gently interprets any number of whitespace characters (like tabs) as one space
as.mo() now returns UNKNOWN for "con" (WHONET ID of ‘contamination’) and returns NA for "xxx" (WHONET ID of ‘no growth’)
as.mo()
microorganisms.codes and cleaned it up
mo_shortname() where species would not be determined correctly
eucast_rules() with verbose = TRUE
New website!
We’ve got a new website: https://msberends.gitlab.io/AMR (built with the great pkgdown)
as.mo() to identify an MO code.
microorganisms data set now contains:
The responsible author(s) and year of scientific publication
This data is updated annually - check the included version with the new function catalogue_of_life_version().
mo codes changed (e.g. Streptococcus changed from B_STRPTC to B_STRPT). A translation table is used internally to support older microorganism IDs, so users will not notice this difference.
mo_rank() for the taxonomic rank (genus, species, infraspecies, etc.)
mo_url() to get the direct URL of a species from the Catalogue of Life
first_isolate() and eucast_rules(), all arguments will be filled in automatically.
antibiotics data set now contains a column ears_net.
as.mo() now knows all WHONET species abbreviations too, because almost 2,000 microbial abbreviations were added to the microorganisms.codes data set.
New filters for antimicrobial classes. Use these functions to filter isolates on results in one or more antibiotics from a specific class:
filter_aminoglycosides()
filter_carbapenems()
filter_cephalosporins()
filter_1st_cephalosporins()
filter_2nd_cephalosporins()
filter_3rd_cephalosporins()
filter_4th_cephalosporins()
filter_fluoroquinolones()
filter_glycopeptides()
filter_macrolides()
filter_tetracyclines()
The antibiotics data set will be searched, after which the input data will be checked for column names with a value in any abbreviations, codes or official names found in the antibiotics data set. For example:
septic_patients %>% filter_glycopeptides(result = "R")
# Filtering on glycopeptide antibacterials: any of `vanc` or `teic` is R
septic_patients %>% filter_glycopeptides(result = "R", scope = "all")
# Filtering on glycopeptide antibacterials: all of `vanc` and `teic` is R
All ab_* functions are deprecated and replaced by atc_* functions:
ab_property -> atc_property()
ab_name -> atc_name()
ab_official -> atc_official()
ab_trivial_nl -> atc_trivial_nl()
ab_certe -> atc_certe()
ab_umcg -> atc_umcg()
ab_tradenames -> atc_tradenames()
as.atc() internally. The old atc_property has been renamed atc_online_property(). This is done for two reasons: firstly, not all ATC codes are of antibiotics (ab) but can also be of antivirals or antifungals. Secondly, the input must have class atc or must be coercible to this class. Properties of these classes should start with the same class name, analogous to as.mo() and e.g. mo_genus.
set_mo_source() and get_mo_source() to use your own predefined MO codes as input for as.mo() and consequently all mo_* functions
dplyr version 0.8.0
guess_ab_col() to find an antibiotic column in a table
mo_failures() to review values that could not be coerced to a valid MO code, using as.mo(). This latter function will now only show a maximum of 10 uncoerced values and will refer to mo_failures().
mo_uncertainties() to review values that could be coerced to a valid MO code using as.mo(), but with uncertainty.
mo_renamed() to get a list of all returned values from as.mo() that have had taxonomic renaming
age() to calculate the (patient's) age in years
age_groups() to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic AMR data analysis per age group.
New function ggplot_rsi_predict() as well as the base R plot() function can now be used for resistance prediction calculated with resistance_predict():
x <- resistance_predict(septic_patients, col_ab = "amox")
plot(x)
ggplot_rsi_predict(x)
Functions filter_first_isolate() and filter_first_weighted_isolate() to shorten and speed up filtering on data sets with antimicrobial results, e.g.:
septic_patients %>% filter_first_isolate(...)
# or
filter_first_isolate(septic_patients, ...)
is equal to:
septic_patients %>%
  mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
  filter(only_firsts == TRUE) %>%
  select(-only_firsts)
availability() to check the number of available (non-empty) results in a data.frame
New vignettes about how to conduct AMR analysis, predict antimicrobial resistance, use the G-test and more. These are also available (and even easier to read) on our website: https://msberends.gitlab.io/AMR.
eucast_rules():
septic_patients now reflects these changes
eucast_rules(..., verbose = TRUE) to get a data set with all changes per bug and drug combination.
microorganisms.oldDT, microorganisms.prevDT, microorganisms.unprevDT and microorganismsDT since they were no longer needed and only contained info already available in the microorganisms data set
antibiotics data set, from the Pharmaceuticals Community Register of the European Commission
atc_group1_nl and atc_group2_nl from the antibiotics data set
atc_ddd() and atc_groups() have been renamed atc_online_ddd() and atc_online_groups(). The old functions are deprecated and will be removed in a future version.
guess_mo() is now deprecated in favour of as.mo() and will be removed in future versions
guess_atc() is now deprecated in favour of as.atc() and will be removed in future versions
as.mo():
Now handles incorrect spelling, like i instead of y and f instead of ph:
# mo_fullname() uses as.mo() internally
mo_fullname("Sthafilokockus aaureuz")
#> [1] "Staphylococcus aureus"
mo_fullname("S. klossi")
#> [1] "Staphylococcus kloosii"
Uncertainty of the algorithm is now divided into four levels, 0 to 3, where the default allow_uncertain = TRUE is equal to uncertainty level 2. Run ?as.mo for more info about these levels.
# equal:
as.mo(..., allow_uncertain = TRUE)
as.mo(..., allow_uncertain = 2)
# also equal:
as.mo(..., allow_uncertain = FALSE)
as.mo(..., allow_uncertain = 0)
as.mo(..., allow_uncertain = 3) could lead to very unreliable results.
~/.Rhistory_mo. Use the new function clean_mo_history() to delete this file, which resets the algorithms.
Incoercible results will now be considered ‘unknown’, MO code UNKNOWN. On foreign systems, properties of these will be translated to all languages already previously supported: German, Dutch, French, Italian, Spanish and Portuguese:
mo_genus("qwerty", language = "es")
# Warning:
# one unique value (^= 100.0%) could not be coerced and is considered 'unknown': "qwerty". Use mo_failures() to review it.
#> [1] "(género desconocido)"
first_isolate():
septic_patients data set this yielded a difference of 0.15% more isolates
col_patientid), when this argument was left blank
col_keyantibiotics()), when this argument was left blank
output_logical, the function will now always return a logical value
filter_specimen to specimen_group, although using filter_specimen will still work
portion functions, that low counts can influence the outcome and that the portion functions may camouflage this, since they only return the portion (albeit being dependent on the minimum argument)
microorganisms.certe and microorganisms.umcg into microorganisms.codes
mo_taxonomy() now contains the kingdom too
is.rsi.eligible() using the new threshold argument
scale_rsi_colours()
mo will now return the top 3 and the unique count, e.g. using summary(mo)
rsi and mic
as.rsi():
"HIGH S" will return S
freq() function):
Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
header function
header is now set to TRUE at default, even for markdown
mo to show unique count of families, genera and species
decimal.mark setting, which just like format defaults to getOption("OutDec")
big.mark argument will at default be "," when decimal.mark = "." and "." otherwise
NA
droplevels to exclude empty factor levels when input is a factor
select() on frequency tables
scale_y_percent() now contains the limits argument
mdro(), key_antibiotics() and eucast_rules()
resistance_predict() function)
as.mic() to support more values ending in (several) zeroes
%like%, it will now return the call
count_all to get all available isolates (that like all portion_* and count_* functions also supports summarise and group_by), the old n_rsi is now an alias of count_all
get_locale to determine language for language-dependent output for some mo_* functions. This is now the default value for their language argument, by which the system language will be used at default.
microorganismsDT, microorganisms.prevDT, microorganisms.unprevDT and microorganisms.oldDT to improve the speed of as.mo. They are for reference only, since they are primarily for internal use of as.mo.
read.4D to read from the 4D database of the MMB department of the UMCG
mo_authors and mo_year to get specific values about the scientific reference of a taxonomic entry
MDRO, BRMO, MRGN and EUCAST_exceptional_phenotypes were renamed to mdro, brmo, mrgn and eucast_exceptional_phenotypes
EUCAST_rules was renamed to eucast_rules, the old function still exists as a deprecated function
eucast_rules function:
rules to specify which rules should be applied (expert rules, breakpoints, others or all)
verbose which can be set to TRUE to get very specific messages about which columns and rows were affected
septic_patients now reflects these changes
pipe for piperacillin (J01CA12), also to the mdro function
kingdom to the microorganisms data set, and function mo_kingdom to look up values
as.mo (and subsequently all mo_* functions), as empty values will be ignored a priori
as.mo will return NA
Function as.mo (and all mo_* wrappers) now supports genus abbreviations with “species” attached
as.mo("E. species")
# B_ESCHR
mo_fullname("E. spp.")
# "Escherichia species"
as.mo("S. spp")
# B_STPHY
mo_fullname("S. species")
# "Staphylococcus species"
combine_IR (TRUE/FALSE) to functions portion_df and count_df, to indicate that all values of I and R must be merged into one, so the output only consists of S vs. IR (susceptible vs. non-susceptible)
portion_*(..., as_percent = TRUE) when minimal number of isolates would not be met
also_single_tested for portion_* and count_* functions to also include cases where not all antibiotics were tested but at least one of the tested antibiotics includes the target antimicrobial interpretation, see ?portion
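The effect of merging I and R into one group can be illustrated with a few lines of base R. This is only a sketch of the idea: count_s_vs_ir() is a hypothetical helper invented for this example and is not part of the AMR package, and the real portion_df and count_df work on whole data sets rather than on a single vector.

```r
# Hypothetical helper (not an AMR function): count interpretations in a
# character vector, optionally merging I and R into a single IR group.
count_s_vs_ir <- function(x, combine_IR = FALSE) {
  x <- factor(x, levels = c("S", "I", "R"))
  if (combine_IR) {
    # Collapse the levels I and R into one level called IR
    levels(x) <- list(S = "S", IR = c("I", "R"))
  }
  table(x, useNA = "no")
}

results <- c("S", "S", "I", "R", "R", "S")
count_s_vs_ir(results)                     # separate counts for S, I and R
count_s_vs_ir(results, combine_IR = TRUE)  # counts for S versus IR
```

With combine_IR = TRUE the output has only two groups, which matches the susceptible vs. non-susceptible reading described above.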
portion_* functions now throws a warning when the total number of available isolates is below the minimum argument
as.mo, as.rsi, as.mic, as.atc and freq will not set package name as attribute anymore
freq():
Support for grouping variables, test with:
Support for (un)selecting columns:
hms::is.hms
difftime
na, to choose which character to print for empty values
header to turn the header info off (default when markdown = TRUE)
title to manually set the title of the frequency table
first_isolate now tries to find columns to use as input when arguments are left blank
mdro)
septic_patients is now a data.frame, not a tibble anymore
microorganisms$ref and microorganisms.old$ref) to comply with CRAN policy to only allow ASCII characters
mo_property not working properly
eucast_rules where some Streptococci would become ceftazidime R in EUCAST rule 4.5
mo, useful for top_freq()
ggplot_rsi and scale_y_percent have breaks argument
as.mo:
"CRS" -> Stenotrophomonas maltophilia
"CRSM" -> Stenotrophomonas maltophilia
"MSSA" -> Staphylococcus aureus
"MSSE" -> Staphylococcus epidermidis
join functions
is.rsi.eligible, now 15-20 times faster
g.test, when sum(x) is below 1000 or any of the expected values is below 5, Fisher’s Exact Test will be suggested
ab_name will try to fall back on as.atc when no results are found
Percentages will now be rounded more logically (e.g. in freq function)
microorganisms now contains all microbial taxonomic data from ITIS (kingdoms Bacteria, Fungi and Protozoa), the Integrated Taxonomy Information System, available via https://itis.gov. The data set now contains more than 18,000 microorganisms with all known bacteria, fungi and protozoa according to ITIS with genus, species, subspecies, family, order, class, phylum and subkingdom. The new data set microorganisms.old contains all previously known taxonomic names from those kingdoms.
mo_property:
mo_phylum, mo_class, mo_order, mo_family, mo_genus, mo_species, mo_subspecies
mo_fullname, mo_shortname
mo_type, mo_gramstain
mo_ref
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:
mo_gramstain("E. coli")
# [1] "Gram negative"
mo_gramstain("E. coli", language = "de") # German
# [1] "Gramnegativ"
mo_gramstain("E. coli", language = "es") # Spanish
# [1] "Gram negativo"
mo_fullname("S. group A", language = "pt") # Portuguese
# [1] "Streptococcus grupo A"
Furthermore, former taxonomic names will give a note about the current taxonomic name:
mo_gramstain("Esc blattae")
# Note: 'Escherichia blattae' (Burgess et al., 1973) was renamed 'Shimwellia blattae' (Priest and Barker, 2010)
# [1] "Gram negative"
count_R, count_IR, count_I, count_SI and count_S to selectively count resistant or susceptible isolates
count_df (which works like portion_df) to get all counts of S, I and R of a data set with antibiotic columns, with support for grouped variables
is.rsi.eligible to check for columns that have valid antimicrobial results, but do not have the rsi class yet. Transform the columns of your raw data with: data %>% mutate_if(is.rsi.eligible, as.rsi)
Functions as.mo and is.mo as replacements for as.bactid and is.bactid (since the microorganisms data set not only contains bacteria). These last two functions are deprecated and will be removed in a future release. The as.mo function determines microbial IDs using intelligent rules:
as.mo("E. coli")
# [1] B_ESCHR_COL
as.mo("MRSA")
# [1] B_STPHY_AUR
as.mo("S group A")
# [1] B_STRPTC_GRA
And with great speed too - on a quite regular Linux server from 2007 it takes us less than 0.02 seconds to transform 25,000 items:
thousands_of_E_colis <- rep("E. coli", 25000)
microbenchmark::microbenchmark(as.mo(thousands_of_E_colis), unit = "s")
# Unit: seconds
#         min     median        max neval
#  0.01817717 0.01843957 0.03878077   100
reference_df for as.mo, so users can supply their own microbial IDs, name or codes as a reference table
bactid to mo, like:
EUCAST_rules, first_isolate and key_antibiotics
microorganisms and septic_patients
labels_rsi_count to print datalabels on a RSI ggplot2 model
Functions as.atc and is.atc to transform/look up antibiotic ATC codes as defined by the WHO. The existing function guess_atc is now an alias of as.atc.
ab_property and its aliases: ab_name, ab_tradenames, ab_certe, ab_umcg and ab_trivial_nl
Renamed septic_patients$sex to septic_patients$gender
antibiotics data set: Terbinafine (D01BA02), Rifaximin (A07AA11) and Isoconazole (D01AC05)
Added 163 trade names to the antibiotics data set, it now contains 298 different trade names in total, e.g.:
first_isolate, rows will be ignored when there’s no species available
ratio is now deprecated and will be removed in a future release, as it is not really the scope of this package
as.mic for values ending in zeroes after a real number
microorganisms.umcg data set
prevalence column to the microorganisms data set
minimum and as_percent to portion_df
Support for quasiquotation in the function series count_* and portion_*, and n_rsi. This allows checking more than 2 vectors or columns.
ggplot_rsi and geom_rsi so they can cope with count_df. The new fun argument has value portion_df at default, but can be set to count_df.
ggplot_rsi when the ggplot2 package was not loaded
labels_rsi_count to ggplot_rsi
geom_rsi (and ggplot_rsi) so you can set your own preferences
quote to the freq function
diff for frequency tables
freq) header of class character
Support for types (classes) list and matrix for freq
For lists, subsetting is possible:
rsi_df was removed in favour of new functions portion_R, portion_IR, portion_I, portion_SI and portion_S to selectively calculate resistance or susceptibility. These functions are 20 to 30 times faster than the old rsi function. The old function still works, but is deprecated.
portion_df to get all portions of S, I and R of a data set with antibiotic columns, with support for grouped variables
ggplot2
geom_rsi, facet_rsi, scale_y_percent, scale_rsi_colours and theme_rsi
ggplot_rsi to apply all above functions on a data set:
septic_patients %>% select(tobr, gent) %>% ggplot_rsi will show portions of S, I and R immediately in a pretty plot
?ggplot_rsi
as.bactid and is.bactid to transform/look up microbial IDs.
guess_bactid is now an alias of as.bactid
kurtosis and skewness that are lacking in base R - they are generic functions and have support for vectors, data.frames and matrices
g.test to perform the Χ2 distributed G-test, which is used in the same way as chisq.test
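For readers unfamiliar with the G-test, the statistic behind it can be sketched in base R. This is a minimal goodness-of-fit illustration, assuming the textbook formula G = 2 * sum(O * log(O / E)) compared against a chi-squared distribution; g_test_sketch() is a hypothetical name used only here, and the package's own g.test may differ in arguments, corrections and output.

```r
# Hypothetical sketch (not the package's g.test): G-test of goodness-of-fit.
# observed: vector of counts; p: expected proportions (default: all equal).
g_test_sketch <- function(observed, p = rep(1 / length(observed), length(observed))) {
  stopifnot(length(p) == length(observed), abs(sum(p) - 1) < 1e-8)
  expected <- sum(observed) * p
  # Cells with zero observations contribute 0 to the statistic
  terms <- ifelse(observed > 0, observed * log(observed / expected), 0)
  G <- 2 * sum(terms)
  df <- length(observed) - 1
  list(statistic = G, df = df,
       p.value = pchisq(G, df = df, lower.tail = FALSE))
}

# 60 versus 40 observations where a 50/50 split is expected
g_test_sketch(c(60, 40))
```

The call mirrors chisq.test(c(60, 40)) in spirit: counts go in, a statistic, degrees of freedom and p value come out.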
ratio to transform a vector of values to a preset ratio
ratio(c(10, 500, 10), ratio = "1:2:1") would return 130, 260, 130
%in% or %like% (and give them keyboard shortcuts), or to view the datasets that come with this package
p.symbol to transform p values to their related symbols: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
clipboard_import and clipboard_export as helper functions to quickly copy and paste from/to software like Excel and SPSS. These functions use the clipr package, but are a little altered to also support headless Linux servers (so you can use it in RStudio Server)
freq):
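The two transformations described above can be reproduced in base R. The sketch below assumes that ratio redistributes the total of the input over the parts of the ratio string, which is consistent with the 130, 260, 130 example, and that the p value symbols follow the listed cut points with the upper bound of each interval included. Both ratio_sketch() and p_symbol_sketch() are hypothetical names for illustration, not the package functions.

```r
# Hypothetical sketch: redistribute sum(x) according to a ratio like "1:2:1"
ratio_sketch <- function(x, ratio) {
  parts <- as.numeric(strsplit(ratio, ":", fixed = TRUE)[[1]])
  stopifnot(length(parts) == length(x))
  sum(x) * parts / sum(parts)
}

ratio_sketch(c(10, 500, 10), ratio = "1:2:1")
# 130 260 130

# Hypothetical sketch: map p values to significance symbols using the
# cut points 0, 0.001, 0.01, 0.05, 0.1 and 1
p_symbol_sketch <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", " "),
                   include.lowest = TRUE))
}

p_symbol_sketch(c(0.0005, 0.03, 0.07, 0.5))
```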
rsi (antimicrobial resistance) to use as inputtable to use as input: freq(table(x, y))
hist and plot to use a frequency table as input: hist(freq(df$age))
as.vector, as.data.frame, as_tibble and format
freq(mydata, mycolumn) is the same as mydata %>% freq(mycolumn)
top_freq function to return the top/below n items as vector
options(max.print.freq = n) where n is your preset value
resistance_predict and added more examples
septic_patients data set to better reflect the reality
mic and rsi classes now returns all values - use freq to check distributions
key_antibiotics function are now generic: 6 for broadspectrum ABs, 6 for Gram-positive specific and 6 for Gram-negative specific ABs
abname function
%like% now supports multiple patterns
data.frames with altered console printing to make it look like a frequency table. Because of this, the argument toConsole is no longer needed.
freq where the class of an item would be lost
septic_patients dataset and the column bactid now has the new class "bactid"
microorganisms dataset (especially for Salmonella) and the column bactid now has the new class "bactid"
rsi and mic functions:
as.rsi("<=0.002; S") will return S
as.mic("<=0.002; S") will return <=0.002
as.mic("<= 0.002") now works
rsi and mic do not add the attribute package.version anymore
"groups" option for atc_property(..., property). It will return a vector of the ATC hierarchy as defined by the WHO. The new function atc_groups is a convenient wrapper around this.
atc_property as it requires the host set by url to be responsive
first_isolate algorithm to exclude isolates where bacteria ID or genus is unavailable
924b62) from the dplyr package v0.7.5 and above
guess_bactid (now called as.bactid)
yourdata %>% select(genus, species) %>% as.bactid() now also works
n_rsi to count cases where antibiotic test results were available, to be used in conjunction with dplyr::summarise, see ?rsi
guess_bactid to determine the ID of a microorganism based on genus/species or known abbreviations like MRSA
guess_atc to determine the ATC of an antibiotic based on name, trade name, or known abbreviations
freq to create frequency tables, with additional info in a header
MDRO to determine Multi Drug Resistant Organisms (MDRO) with support for country-specific guidelines.
BRMO and MRGN are wrappers for Dutch and German guidelines, respectively
"points" or "keyantibiotics", see ?first_isolate
tibbles and data.tables
rsi class for vectors that contain only invalid antimicrobial interpretations
ablist to antibiotics
bactlist to microorganisms
antibiotics dataset
microorganisms dataset
septic_patients
join functions
%like% to make it case insensitive
first_isolate and EUCAST_rules column names are now case-insensitive
as.rsi and as.mic now add the package name and version as attributes
README.md with more examples
testthat package
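A case-insensitive pattern operator in the spirit of %like% can be sketched in base R with grepl(). The operator below is named %like_sketch% to make clear that it is a hypothetical stand-in. Treating multiple patterns as "any pattern matches" is an assumption made for this example, and the package's own handling of multiple patterns may differ.

```r
# Hypothetical sketch (not the package's %like%): case-insensitive regex match.
# With several patterns, an element matches when any one pattern matches.
`%like_sketch%` <- function(x, pattern) {
  Reduce(`|`, lapply(pattern, function(p) grepl(p, x, ignore.case = TRUE)))
}

species <- c("Escherichia coli", "Staphylococcus aureus")
species %like_sketch% "^esch"              # lower-case pattern, still matches
species %like_sketch% c("coli", "AUREUS")  # two patterns at once
```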