vignettes/datasets.Rmd
A data set with 456 rows and 14 columns, containing the following column names:
ab, cid, name, group, atc, atc_group1, atc_group2, abbreviations, synonyms, oral_ddd, oral_units, iv_ddd, iv_units and loinc.
This data set is available in R as antibiotics, after you load the AMR package.
-It was last updated on 16 August 2021 13:16:15 UTC. Find more info about the structure of this data set here.
+It was last updated on 19 August 2021 21:21:57 UTC. Find more info about the structure of this data set here.
Direct download links:
NEWS.md
- AMR 1.7.1.9025
+ AMR 1.7.1.9026
p_symbol() and all filter_*() functions (except for filter_first_isolate()), which were all deprecated in a previous package version
key_antibiotics() and key_antibiotics_equal() functions, which were deprecated and superseded by key_antimicrobials() and antimicrobials_equal()
ggplot2::ggplot() generics for classes <mic>, <disk>, <rsi> and <resistance_predict> as they did not follow the ggplot2 logic. They were replaced with ggplot2::autoplot() generics.
antibiotics$atc is now a list instead of a character, and this atc column was moved to the 5th position of the antibiotics data set
ab_atc() does not always return a character vector with length 1, and returns a list if the input is larger than length 1
They now also work in R-3.0 and R-3.1, supporting every version of R since 2013
Added the selector ab_selector(), which accepts a filter to be used internally on the antibiotics data set, yielding great flexibility on drug properties, such as selecting antibiotic columns with an oral DDD of more than 1 gram:
+
+example_isolates[, ab_selector(oral_ddd > 1 & oral_units == "g")] # base R
+example_isolates %>% select(ab_selector(oral_ddd > 1 & oral_units == "g")) # dplyr
Fix for using selectors multiple times in one call (e.g., using them in dplyr::filter() and immediately after in dplyr::select())
Added argument only_treatable, which defaults to TRUE and will exclude drugs that are only for laboratory tests and not for treating patients (such as imipenem/EDTA and gentamicin-high)
All antibiotic class selectors (such as carbapenems(), aminoglycosides()) can now be used for filtering as well, making all their accompanying filter_*() functions redundant (such as filter_carbapenems(), filter_aminoglycosides()). These functions are now deprecated and will be removed in a future release. Examples of how the selectors can be used for filtering:
+# select columns with results for carbapenems
example_isolates[, carbapenems()] # base R
@@ -370,7 +378,7 @@
Now checks if pattern is a valid regular expression
Added %unlike% and %unlike_case% (as negations of the existing %like% and %like_case%). This greatly improves readability:
+if (!grepl("EUCAST", guideline)) ...
# same:
@@ -426,7 +434,7 @@
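As a rough illustration of why a negated operator reads more easily than wrapping grepl() in `!`, the sketch below defines base R stand-ins. The names %like_sketch% and %unlike_sketch% are hypothetical and are not part of the AMR package; the sketch only assumes that the real operators perform a case-insensitive regular expression match.

```r
# Hypothetical base R stand-ins (NOT the AMR implementations).
# Assumption: the real operators do a case-insensitive regex match.
`%like_sketch%` <- function(x, pattern) grepl(pattern, x, ignore.case = TRUE)
`%unlike_sketch%` <- function(x, pattern) !grepl(pattern, x, ignore.case = TRUE)

guideline <- "CLSI 2021"

# The negation is part of the operator, so no leading `!` is needed
if (guideline %unlike_sketch% "EUCAST") {
  message("guideline is not a EUCAST guideline")
}
```

With the negation folded into the operator, the condition reads left to right in the same order as the sentence it expresses.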
Functions oxazolidinones() (an antibiotic selector function) and filter_oxazolidinones() (an antibiotic filter function) to select/filter on e.g. linezolid and tedizolid
+library(dplyr)
x <- example_isolates %>% select(date, hospital_id, oxazolidinones())
@@ -439,7 +447,7 @@
ggplot() generics for classes <mic> and <disk>
Function mo_is_yeast(), which determines whether a microorganism is a member of the taxonomic class Saccharomycetes or the taxonomic order Saccharomycetales:
+mo_kingdom(c("Aspergillus", "Candida"))
#> [1] "Fungi" "Fungi"
@@ -451,7 +459,7 @@
example_isolates[which(mo_is_yeast()), ] # base R
example_isolates %>% filter(mo_is_yeast()) # dplyr
The mo_type() function has also been updated to reflect this change:
+mo_type(c("Aspergillus", "Candida"))
# [1] "Fungi" "Yeasts"
@@ -461,7 +469,7 @@
Added Pretomanid (PMD, J04AK08) to the antibiotics data set
MIC values (see as.mic()) can now be used in any mathematical processing, such as usage inside functions min(), max(), range(), and with binary operators (+, -, etc.). This allows for easy distribution analysis and fast filtering on MIC values:
+x <- random_mic(10)
x
@@ -534,7 +542,7 @@
New
Functions get_episode() and is_new_episode() to determine (patient) episodes which are not necessarily based on microorganisms. The get_episode() function returns the index number of the episode per group, while the is_new_episode() function returns values TRUE/FALSE to indicate whether an item in a vector is the start of a new episode. They also support dplyr’s grouping (i.e. using group_by()):
+library(dplyr)
example_isolates %>%
@@ -580,7 +588,7 @@
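To make the difference between an episode index and a new-episode flag concrete, here is a minimal base R sketch. The function name episode_index_sketch(), its episode_days argument and the boundary rule (a new episode starts once a date lies at least episode_days after the start of the current episode) are assumptions made for illustration, not the package's implementation.

```r
# Hypothetical sketch, NOT the AMR implementation.
# Assumption: a new episode starts when a date is at least `episode_days`
# after the start of the current episode. Missing dates are not handled.
episode_index_sketch <- function(dates, episode_days) {
  dates <- as.Date(dates)
  ord <- order(dates)                 # walk through the dates chronologically
  days <- as.numeric(dates[ord])
  idx <- integer(length(days))
  current <- 0L
  start <- -Inf                       # forces the first date to open episode 1
  for (i in seq_along(days)) {
    if (days[i] - start >= episode_days) {
      current <- current + 1L
      start <- days[i]
    }
    idx[i] <- current
  }
  out <- integer(length(days))
  out[ord] <- idx                     # return the indices in the input order
  out
}

dates <- as.Date(c("2021-01-01", "2021-01-05", "2021-03-01"))
episodes <- episode_index_sketch(dates, episode_days = 30)

# For chronologically sorted input, the first item of each episode is "new"
is_new <- !duplicated(episodes)
```

Here the first two dates share one episode and the third opens a second one, so the index vector carries the grouping while the logical vector only marks where each episode begins.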
mdr_cmi2012(),
eucast_exceptional_phenotypes()
+# to select first isolates that are Gram-negative
# and view results of cephalosporins and aminoglycosides:
@@ -592,7 +600,7 @@
For antibiotic selection functions (such as cephalosporins(), aminoglycosides()) to select columns based on a certain antibiotic group, the dependency on the tidyselect package was removed, meaning that they can now also be used without the need to have this package installed and now also work in base R function calls (they rely on R 3.2 or later):
+# above example in base R:
example_isolates[which(first_isolate() & mo_is_gram_negative()),
@@ -600,7 +608,7 @@
For all function arguments in the code, it is now defined what the exact type of user input should be (inspired by the typed package). If the user input for a certain function does not meet the requirements for a specific argument (such as the class or length), an informative error will be thrown. This makes the package more robust and the use of it more reproducible and reliable. In total, more than 420 arguments were defined.
Fix for set_mo_source(), that previously would not remember the file location of the original file
Deprecated function p_symbol() that does not really fit the scope of this package. It will be removed in a future version. See here for the source code to preserve it.
Updated coagulase-negative staphylococci determination with Becker et al. 2020 (PMID 32056452), meaning that the species S. argensis, S. caeli, S. debuckii, S. edaphicus and S. pseudoxylosus are now all considered CoNS
Fix for using argument reference_df in as.mo() and mo_*() functions that contain old microbial codes (from previous package versions)
@@ -638,7 +646,7 @@
Fixed a bug where mo_uncertainties() would not return the results based on the MO matching score
Data set intrinsic_resistant. This data set contains all bug-drug combinations where the ‘bug’ is intrinsically resistant to the ‘drug’ according to the latest EUCAST insights. It contains just two columns: microorganism and antibiotic.
Curious about which enterococci are actually intrinsically resistant to vancomycin?
+library(AMR)
library(dplyr)
@@ -658,7 +666,7 @@
Improvements for as.rsi():
Support for using dplyr’s across() to interpret MIC values or disk zone diameters, which also automatically determines the column with microorganism names or codes.
+# until dplyr 1.0.0
your_data %>% mutate_if(is.mic, as.rsi)
@@ -675,7 +683,7 @@
Added intelligent data cleaning to as.disk(), so numbers can also be extracted from text and decimal numbers will always be rounded up:
+as.disk(c("disk zone: 23.4 mm", 23.4))
#> Class <disk>
@@ -727,7 +735,7 @@
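The cleaning step described above can be approximated in base R as follows. This is only a sketch under stated assumptions: clean_disk_sketch() is a hypothetical helper, it takes the first number it finds in each element, and it returns a plain integer vector rather than an object of class <disk>.

```r
# Hypothetical helper, NOT the as.disk() implementation.
# Assumption: the first number found in each element is the zone diameter.
clean_disk_sketch <- function(x) {
  x <- as.character(x)
  hit <- regexpr("[0-9]+([.][0-9]+)?", x)          # first integer or decimal
  out <- rep(NA_real_, length(x))
  out[which(hit > 0)] <- as.numeric(regmatches(x, hit))
  as.integer(ceiling(out))                         # decimals are rounded up
}

clean_disk_sketch(c("disk zone: 23.4 mm", 23.4))
clean_disk_sketch(c("no zone reported", NA))
```

Elements without any number, and missing values, come back as NA instead of raising an error.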
Function ab_from_text() to retrieve antimicrobial drug names, doses and forms of administration from clinical texts in e.g. health care records, which also corrects for misspelling since it uses as.ab() internally
Tidyverse selection helpers for antibiotic classes, that help to select the columns of antibiotics that are of a specific antibiotic class, without the need to define the columns or antibiotic abbreviations. They can be used in any function that allows selection helpers, like dplyr::select() and tidyr::pivot_longer():
+library(dplyr)
@@ -833,7 +841,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
@@ -880,7 +888,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Other
- Removed previously deprecated function p.symbol() - it was replaced with p_symbol()
- Removed function read.4d(), that was only useful for reading data from an old test database.
Fixed important floating point error for some MIC comparisons in EUCAST 2020 guideline
Interpretation from MIC values (and disk zones) to R/SI can now be used with mutate_at() of the dplyr package:
+yourdata %>%
  mutate_at(vars(antibiotic1:antibiotic25), as.rsi, mo = "E. coli")
@@ -905,7 +913,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Support for LOINC and SNOMED codes
Support for LOINC codes in the antibiotics data set. Use ab_loinc() to retrieve LOINC codes, or use a LOINC code for input in any ab_* function:
+ab_loinc("ampicillin")
#> [1] "21066-6" "3355-5" "33562-0" "33919-2" "43883-8" "43884-6" "87604-5"
@@ -916,7 +924,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Support for SNOMED CT codes in the microorganisms data set. Use mo_snomed() to retrieve SNOMED codes, or use a SNOMED code for input in any mo_* function:
+mo_snomed("S. aureus")
#> [1] 115329001 3092008 113961008
@@ -968,11 +976,11 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Adopted Adeolu et al. (2016), PMID 27620848 for the microorganisms data set, which means that the new order Enterobacterales now consists of a part of the existing family Enterobacteriaceae, but that this family has been split into other families as well (like Morganellaceae and Yersiniaceae). Although published in 2016, this information is not yet in the Catalogue of Life version of 2019. All MDRO determinations with mdro() will now use the Enterobacterales order for all guidelines before 2016 that were dependent on the Enterobacteriaceae family.
If you were dependent on the old Enterobacteriaceae family e.g. by using in your code:
+if (mo_family(somebugs) == "Enterobacteriaceae") ...
then please adjust this to:
+if (mo_order(somebugs) == "Enterobacterales") ...
@@ -983,7 +991,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New
Functions susceptibility() and resistance() as aliases of proportion_SI() and proportion_R(), respectively. These functions were added to make it clearer that “I” should be considered susceptible and not resistant.
+library(dplyr)
example_isolates %>%
@@ -1007,7 +1015,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
More intelligent way of coping with some consonants like “l” and “r”
Added a score (a certainty percentage) to mo_uncertainties(), that is calculated using the Levenshtein distance:
+as.mo(c("Stafylococcus aureus", "staphylokok aureuz"))
@@ -1058,14 +1066,14 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Breaking
Determination of first isolates now excludes all ‘unknown’ microorganisms by default, i.e. microbial code "UNKNOWN". They can be included with the new argument include_unknown:
+first_isolate(..., include_unknown = TRUE)
For WHONET users, this means that all records/isolates with organism code "con" (contamination) will be excluded by default, since as.mo("con") = "UNKNOWN". The function always shows a note with the number of ‘unknown’ microorganisms that were included or excluded.
For code consistency, classes ab and mo will now be preserved in any subsetting or assignment. For the sake of data integrity, this means that invalid assignments will now result in NA:
+# how it works in base R:
x <- factor("A")
@@ -1088,7 +1096,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New
Function bug_drug_combinations() to quickly get a data.frame with the results of all bug-drug combinations in a data set. The column containing microorganism codes is guessed automatically and its input is transformed with mo_shortname() by default:
+x <- bug_drug_combinations(example_isolates)
#> NOTE: Using column `mo` as input for `col_mo`.
@@ -1111,13 +1119,13 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
#> 4 Gram-negative AMX 227 0 405 632
#> NOTE: Use 'format()' on this result to get a publicable/printable format.
You can format this to a printable format, ready for reporting or exporting to e.g. Excel with the base R format() function:
+format(x, combine_IR = FALSE)
Additional way to calculate co-resistance, i.e. when using multiple antimicrobials as input for portion_* functions or count_* functions. This can be used to determine the empiric susceptibility of a combination therapy. A new argument only_all_tested (which defaults to FALSE) replaces the old also_single_tested and can be used to select one of the two methods to count isolates and calculate portions. The difference can be seen in this example table (which is also on the portion and count help pages), where the %SI is being determined:
+# --------------------------------------------------------------------
# only_all_tested = FALSE only_all_tested = TRUE
@@ -1139,7 +1147,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
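Because the example table is cut off in this view, the sketch below shows one possible reading of the two counting methods for a two-drug combination. Everything in it is an assumption made for illustration: si_combination_sketch() is a hypothetical function, and the rules it encodes (with only_all_tested = FALSE an isolate counts when any drug is S or I, or when both drugs were tested and both are R; with only_all_tested = TRUE only isolates tested for both drugs count) should be checked against the portion and count help pages.

```r
# Hypothetical sketch of %SI for a two-drug combination, NOT the AMR code.
# Assumed rules (verify against the package help pages):
#  - only_all_tested = FALSE: numerator = any drug S/I;
#    denominator = any drug S/I, or both drugs tested and both R
#  - only_all_tested = TRUE: only isolates tested for both drugs are used
si_combination_sketch <- function(a, b, only_all_tested = FALSE) {
  any_si <- (a %in% c("S", "I")) | (b %in% c("S", "I"))
  both_tested <- !is.na(a) & !is.na(b)
  all_r <- both_tested & a == "R" & b == "R"
  if (only_all_tested) {
    numerator <- sum(any_si & both_tested)
    denominator <- sum(both_tested)
  } else {
    numerator <- sum(any_si)
    denominator <- sum(any_si | all_r)
  }
  numerator / denominator
}

drug_a <- c("S", "R", NA, "R")
drug_b <- c("S", "R", "S", NA)

si_combination_sketch(drug_a, drug_b, only_all_tested = FALSE)
si_combination_sketch(drug_a, drug_b, only_all_tested = TRUE)
```

The two calls differ only in how partially tested isolates are treated, which is why the resulting proportions can diverge on the same data.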
tibble printing support for classes rsi, mic, disk, ab and mo. When using tibbles containing antimicrobial columns, values S will print in green, values I will print in yellow and values R will print in red. Microbial IDs (class mo) will emphasise the genus and species, not the kingdom.
+# (run this on your own console, as this page does not support colour printing)
library(dplyr)
@@ -1187,7 +1195,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Improved filter_ab_class() to be more reliable and to support 5th generation cephalosporins
- Function availability() now uses portion_R() instead of portion_IR(), to comply with EUCAST insights
- Functions age() and age_groups() now have a na.rm argument to remove empty values
- Renamed function p.symbol() to p_symbol() (the former is now deprecated and will be removed in a future version)
- Using negative values for x in age_groups() will now introduce NAs and not return an error anymore
- Fix for determining the system’s language
- Fix for key_antibiotics() on foreign systems
@@ -1211,7 +1219,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New
Function rsi_df() to transform a data.frame to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions count_df() and portion_df() to immediately show resistance percentages and number of available isolates:
+septic_patients %>%
  select(AMX, CIP) %>%
@@ -1236,7 +1244,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- STEC (Shiga-toxin producing E. coli)
- UPEC (Uropathogenic E. coli)
All these lead to the microbial ID of E. coli:
+as.mo("UPEC")
# B_ESCHR_COL
@@ -1325,7 +1333,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
when all values are unique it now shows a message instead of a warning
support for boxplots:
+septic_patients %>%
  freq(age) %>%
@@ -1405,7 +1413,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New filters for antimicrobial classes. Use these functions to filter isolates on results in one or more antibiotics from a specific class:
+filter_aminoglycosides()
filter_carbapenems()
@@ -1419,7 +1427,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
filter_macrolides()
filter_tetracyclines()
The antibiotics data set will be searched, after which the input data will be checked for column names with a value in any abbreviations, codes or official names found in the antibiotics data set. For example:
+septic_patients %>% filter_glycopeptides(result = "R")
# Filtering on glycopeptide antibacterials: any of `vanc` or `teic` is R
@@ -1428,7 +1436,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
All ab_* functions are deprecated and replaced by atc_* functions:
+ab_property -> atc_property()
ab_name -> atc_name()
@@ -1449,7 +1457,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New function age_groups() to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic AMR data analysis per age group.
New function ggplot_rsi_predict() as well as the base R plot() function can now be used for resistance prediction calculated with resistance_predict():
+x <- resistance_predict(septic_patients, col_ab = "amox")
plot(x)
@@ -1457,13 +1465,13 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Functions filter_first_isolate() and filter_first_weighted_isolate() to shorten and speed up filtering on data sets with antimicrobial results, e.g.:
+septic_patients %>% filter_first_isolate(...)
# or
filter_first_isolate(septic_patients, ...)
is equal to:
+septic_patients %>%
  mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
@@ -1491,7 +1499,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Improvements for as.mo():
Now handles incorrect spelling, like i instead of y and f instead of ph:
+# mo_fullname() uses as.mo() internally
@@ -1503,7 +1511,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Uncertainty of the algorithm is now divided into four levels, 0 to 3, where the default allow_uncertain = TRUE is equal to uncertainty level 2. Run ?as.mo for more info about these levels.
+# equal:
as.mo(..., allow_uncertain = TRUE)
@@ -1518,7 +1526,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
All microbial IDs that were found are now saved to a local file ~/.Rhistory_mo. Use the new function clean_mo_history() to delete this file, which resets the algorithms.
Incoercible results will now be considered ‘unknown’, MO code UNKNOWN. On foreign systems, properties of these will be translated to all languages already previously supported: German, Dutch, French, Italian, Spanish and Portuguese:
+mo_genus("qwerty", language = "es")
# Warning:
@@ -1562,7 +1570,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Frequency tables (freq() function):
Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
+# Determine genus of microorganisms (mo) in `septic_patients` data set:
# OLD WAY
@@ -1636,7 +1644,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Fewer than 3 characters as input for as.mo will return NA
Function as.mo (and all mo_* wrappers) now supports genus abbreviations with “species” attached
+as.mo("E. species") # B_ESCHR
mo_fullname("E. spp.") # "Escherichia species"
@@ -1652,7 +1660,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Frequency tables - freq():
Support for grouping variables, test with:
+septic_patients %>%
  group_by(hospital_id) %>%
@@ -1660,7 +1668,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Support for (un)selecting columns:
+septic_patients %>%
  freq(hospital_id) %>%
@@ -1730,7 +1738,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- Author and year: mo_ref
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:
+mo_gramstain("E. coli")
# [1] "Gram negative"
@@ -1741,7 +1749,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
mo_fullname("S. group A", language = "pt") # Portuguese
# [1] "Streptococcus grupo A"
Furthermore, former taxonomic names will give a note about the current taxonomic name:
+mo_gramstain("Esc blattae")
# Note: 'Escherichia blattae' (Burgess et al., 1973) was renamed 'Shimwellia blattae' (Priest and Barker, 2010)
@@ -1754,7 +1762,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Function