Note: this is the development version, which will eventually be released as AMR 0.6.0.
* as.mo() to identify an MO code.
* first_isolate() and eucast_rules(), all parameters will be filled in automatically.
* antibiotics data set now contains a column ears_net.
* ab_* functions are deprecated and replaced by atc_* functions:
  * ab_property -> atc_property()
  * ab_name -> atc_name()
  * ab_official -> atc_official()
  * ab_trivial_nl -> atc_trivial_nl()
  * ab_certe -> atc_certe()
  * ab_umcg -> atc_umcg()
  * ab_tradenames -> atc_tradenames()

  These functions use as.atc() internally. The old atc_property has been renamed atc_online_property(). This is done for two reasons: firstly, not all ATC codes are of antibiotics (ab) but can also be of antivirals or antifungals. Secondly, the input must have class atc or must be coercible to this class. Properties of these classes should start with the same class name, analogous to as.mo() and e.g. mo_genus.
* pkgdown)
* set_mo_source() and get_mo_source() to use your own predefined MO codes as input for as.mo() and consequently all mo_* functions
* dplyr version 0.8.0
* guess_ab_col() to find an antibiotic column in a table
* mo_failures() to review values that could not be coerced to a valid MO code, using as.mo(). This latter function will now only show a maximum of 10 uncoerced values and will refer to mo_failures().
* mo_renamed() to get a list of all returned values from as.mo() that have had taxonomic renaming
* age() to calculate the (patient's) age in years
* age_groups() to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic antimicrobial resistance analysis per age group.
* ggplot_rsi_predict() as well as the base R plot() function can now be used for resistance prediction calculated with resistance_predict():

      x <- resistance_predict(septic_patients, col_ab = "amox")
      plot(x)
      ggplot_rsi_predict(x)
* filter_first_isolate() and filter_first_weighted_isolate() to shorten and speed up filtering on data sets with antimicrobial results, e.g.:

      septic_patients %>% filter_first_isolate(...)
      # or
      filter_first_isolate(septic_patients, ...)

  is equal to:

      septic_patients %>%
        mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
        filter(only_firsts == TRUE) %>%
        select(-only_firsts)
* antibiotics data set, from the Pharmaceuticals Community Register of the European Commission
* atc_group1_nl and atc_group2_nl from the antibiotics data set
* atc_ddd() and atc_groups() have been renamed atc_online_ddd() and atc_online_groups(). The old functions are deprecated and will be removed in a future version.
* guess_mo() is now deprecated in favour of as.mo() and will be removed in future versions
* guess_atc() is now deprecated in favour of as.atc() and will be removed in future versions
* eucast_rules():
* as.mo():
* first_isolate():
  * septic_patients data set this yielded a difference of 0.15% more isolates
  * col_patientid), when this parameter was left blank
  * col_keyantibiotics()), when this parameter was left blank
  * output_logical, the function will now always return a logical value
  * filter_specimen to specimen_group, although using filter_specimen will still work
* portion functions, that low counts can influence the outcome and that the portion functions may camouflage this, since they only return the portion (albeit being dependent on the minimum parameter)
* microorganisms.certe and microorganisms.umcg into microorganisms.codes
* mo_taxonomy() now contains the kingdom too
* is.rsi.eligible()
* scale_rsi_colours()
* mo will now return the top 3 and the unique count, e.g. using summary(mo)
* rsi and mic
* freq() function):
  * Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
  * header function
  * mo to show unique count of families, genera and species
  * decimal.mark setting, which just like format defaults to getOption("OutDec")
  * big.mark parameter will by default be "," when decimal.mark = "." and "." otherwise
  * NA
  * droplevels to exclude empty factor levels when input is a factor
  * select() on frequency tables
* scale_y_percent() now contains the limits parameter
* mdro(), key_antibiotics() and eucast_rules()
* resistance_predict() function)
* Fix for as.mic() to support more values ending in (several) zeroes
* count_all to get all available isolates (that like all portion_* and count_* functions also supports summarise and group_by), the old n_rsi is now an alias of count_all
* get_locale to determine language for language-dependent output for some mo_* functions. This is now the default value for their language parameter, by which the system language will be used by default.
* microorganismsDT, microorganisms.prevDT, microorganisms.unprevDT and microorganisms.oldDT to improve the speed of as.mo. They are for reference only, since they are primarily for internal use of as.mo.
* read.4D to read from the 4D database of the MMB department of the UMCG
* mo_authors and mo_year to get specific values about the scientific reference of a taxonomic entry
* MDRO, BRMO, MRGN and EUCAST_exceptional_phenotypes were renamed to mdro, brmo, mrgn and eucast_exceptional_phenotypes
* EUCAST_rules was renamed to eucast_rules, the old function still exists as a deprecated function
* eucast_rules function:
  * rules to specify which rules should be applied (expert rules, breakpoints, others or all)
  * verbose which can be set to TRUE to get very specific messages about which columns and rows were affected
  * septic_patients now reflects these changes
  * pipe for piperacillin (J01CA12), also to the mdro function
* kingdom to the microorganisms data set, and function mo_kingdom to look up values
* as.mo (and subsequently all mo_* functions), as empty values will be ignored a priori
* as.mo will return NA
* as.mo (and all mo_* wrappers) now supports genus abbreviations with “species” attached

      as.mo("E. species")        # B_ESCHR
      mo_fullname("E. spp.")     # "Escherichia species"
      as.mo("S. spp")            # B_STPHY
      mo_fullname("S. species")  # "Staphylococcus species"
* combine_IR (TRUE/FALSE) to functions portion_df and count_df, to indicate that all values of I and R must be merged into one, so the output only consists of S vs. IR (susceptible vs. non-susceptible)
* portion_*(..., as_percent = TRUE) when minimal number of isolates would not be met
* also_single_tested for portion_* and count_* functions to also include cases where not all antibiotics were tested but at least one of the tested antibiotics includes the target antimicrobial interpretation, see ?portion
* portion_* functions now throw a warning when the total number of available isolates is below the parameter minimum
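The role of the minimum parameter can be illustrated with a small base R sketch. This is not the package's implementation: the helper name `portion_S_sketch` is hypothetical, and returning `NA` below the threshold is an assumption made for illustration (the changelog only states that a warning is thrown).

```r
# Hypothetical helper, not the AMR implementation: share of susceptible (S)
# results among all valid S/I/R results, guarded by a minimum count.
portion_S_sketch <- function(x, minimum = 30) {
  valid <- x[x %in% c("S", "I", "R")]  # drops NA and any invalid values
  if (length(valid) < minimum) {
    warning("only ", length(valid), " isolates available, below minimum = ", minimum)
    return(NA_real_)  # assumption: too few isolates yields NA
  }
  sum(valid == "S") / length(valid)
}

results <- c(rep("S", 40), rep("I", 5), rep("R", 5), NA)

# 40 of the 50 valid results are S
portion_S_sketch(results)

# With a stricter threshold the sketch warns and returns NA
suppressWarnings(portion_S_sketch(results, minimum = 100))
```

Returning a missing value rather than a proportion based on very few isolates keeps a low count from silently looking like a reliable percentage.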
* as.mo, as.rsi, as.mic, as.atc and freq will not set package name as attribute anymore
* freq():
  * Support for grouping variables, test with:

        septic_patients %>%
          group_by(hospital_id) %>%
          freq(gender)

  * Support for (un)selecting columns:

        septic_patients %>%
          freq(hospital_id) %>%
          select(-count, -cum_count) # only get item, percent, cum_percent

  * hms::is.hms
  * difftime
  * na, to choose which character to print for empty values
  * header to turn the header info off (default when markdown = TRUE)
  * title to manually set the title of the frequency table
* first_isolate now tries to find columns to use as input when parameters are left blank
* mdro)
* septic_patients is now a data.frame, not a tibble anymore
* microorganisms$ref and microorganisms.old$ref) to comply with CRAN policy to only allow ASCII characters
* mo_property not working properly
* eucast_rules where some Streptococci would become ceftazidime R in EUCAST rule 4.5
* mo, useful for top_freq()
* ggplot_rsi and scale_y_percent have breaks parameter
* as.mo:
  * "CRS" -> Stenotrophomonas maltophilia
  * "CRSM" -> Stenotrophomonas maltophilia
  * "MSSA" -> Staphylococcus aureus
  * "MSSE" -> Staphylococcus epidermidis
* join functions
* is.rsi.eligible, now 15-20 times faster
* g.test, when sum(x) is below 1000 or any of the expected values is below 5, Fisher’s Exact Test will be suggested
* ab_name will try to fall back on as.atc when no results are found
* Percentages will now be rounded more logically (e.g. in freq function)
* microorganisms now contains all microbial taxonomic data from ITIS (kingdoms Bacteria, Fungi and Protozoa), the Integrated Taxonomic Information System, available via https://itis.gov. The data set now contains more than 18,000 microorganisms with all known bacteria, fungi and protozoa according to ITIS with genus, species, subspecies, family, order, class, phylum and subkingdom. The new data set microorganisms.old contains all previously known taxonomic names from those kingdoms.
* mo_property:
  * mo_phylum, mo_class, mo_order, mo_family, mo_genus, mo_species, mo_subspecies
  * mo_fullname, mo_shortname
  * mo_type, mo_gramstain
  * mo_ref

  They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:

      mo_gramstain("E. coli")
      # [1] "Gram negative"
      mo_gramstain("E. coli", language = "de") # German
      # [1] "Gramnegativ"
      mo_gramstain("E. coli", language = "es") # Spanish
      # [1] "Gram negativo"
      mo_fullname("S. group A", language = "pt") # Portuguese
      # [1] "Streptococcus grupo A"

  Furthermore, former taxonomic names will give a note about the current taxonomic name:

      mo_gramstain("Esc blattae")
      # Note: 'Escherichia blattae' (Burgess et al., 1973) was renamed 'Shimwellia blattae' (Priest and Barker, 2010)
      # [1] "Gram negative"
* count_R, count_IR, count_I, count_SI and count_S to selectively count resistant or susceptible isolates
* count_df (which works like portion_df) to get all counts of S, I and R of a data set with antibiotic columns, with support for grouped variables
* is.rsi.eligible to check for columns that have valid antimicrobial results, but do not have the rsi class yet. Transform the columns of your raw data with: data %>% mutate_if(is.rsi.eligible, as.rsi)
* as.mo and is.mo as replacements for as.bactid and is.bactid (since the microorganisms data set not only contains bacteria). These last two functions are deprecated and will be removed in a future release. The as.mo function determines microbial IDs using Artificial Intelligence (AI):

      as.mo("E. coli")
      # [1] B_ESCHR_COL
      as.mo("MRSA")
      # [1] B_STPHY_AUR
      as.mo("S group A")
      # [1] B_STRPTC_GRA

  And with great speed too - on a quite regular Linux server from 2007 it takes us less than 0.02 seconds to transform 25,000 items:

      thousands_of_E_colis <- rep("E. coli", 25000)
      microbenchmark::microbenchmark(as.mo(thousands_of_E_colis), unit = "s")
      # Unit: seconds
      #         min       median         max  neval
      #  0.01817717   0.01843957  0.03878077    100
* reference_df for as.mo, so users can supply their own microbial IDs, names or codes as a reference table
* bactid to mo, like:
  * EUCAST_rules, first_isolate and key_antibiotics
  * microorganisms and septic_patients
* labels_rsi_count to print datalabels on a RSI ggplot2 model
* Functions as.atc and is.atc to transform/look up antibiotic ATC codes as defined by the WHO. The existing function guess_atc is now an alias of as.atc.
* ab_property and its aliases: ab_name, ab_tradenames, ab_certe, ab_umcg and ab_trivial_nl
* Renamed septic_patients$sex to septic_patients$gender
* antibiotics data set: Terbinafine (D01BA02), Rifaximin (A07AA11) and Isoconazole (D01AC05)
* antibiotics data set, it now contains 298 different trade names in total, e.g.:

      ab_official("Bactroban")
      # [1] "Mupirocin"
      ab_name(c("Bactroban", "Amoxil", "Zithromax", "Floxapen"))
      # [1] "Mupirocin" "Amoxicillin" "Azithromycin" "Flucloxacillin"
      ab_atc(c("Bactroban", "Amoxil", "Zithromax", "Floxapen"))
      # [1] "R01AX06" "J01CA04" "J01FA10" "J01CF05"
* first_isolate, rows will be ignored when there’s no species available
* ratio is now deprecated and will be removed in a future release, as it is not really the scope of this package
* as.mic for values ending in zeroes after a real number
* microorganisms.umcg data set
* prevalence column to the microorganisms data set
* minimum and as_percent to portion_df
* count_* and portion_*, and n_rsi. This allows to check for more than 2 vectors or columns.

  ```r
  septic_patients %>% select(amox, cipr) %>% count_IR()
  # which is the same as:
  septic_patients %>% count_IR(amox, cipr)

  septic_patients %>% portion_S(amcl)
  septic_patients %>% portion_S(amcl, gent)
  septic_patients %>% portion_S(amcl, gent, pita)
  ```
* Edited `ggplot_rsi` and `geom_rsi` so they can cope with `count_df`. The new `fun` parameter has value `portion_df` at default, but can be set to `count_df`.
* Fix for `ggplot_rsi` when the `ggplot2` package was not loaded
* Added datalabels function `labels_rsi_count` to `ggplot_rsi`
* Added possibility to set any parameter to `geom_rsi` (and `ggplot_rsi`) so you can set your own preferences
* Fix for joins, where predefined suffixes would not be honoured
* Added parameter `quote` to the `freq` function
* Added generic function `diff` for frequency tables
* Added longest and shortest character length in the frequency table (`freq`) header of class `character`
* Support for types (classes) list and matrix for `freq`

  ```r
  my_matrix = with(septic_patients, matrix(c(age, gender), ncol = 2))
  freq(my_matrix)
  ```

  For lists, subsetting is possible:

  ```r
  my_list = list(age = septic_patients$age, gender = septic_patients$gender)
  my_list %>% freq(age)
  my_list %>% freq(gender)
  ```
* rsi_df was removed in favour of new functions portion_R, portion_IR, portion_I, portion_SI and portion_S to selectively calculate resistance or susceptibility. These functions are 20 to 30 times faster than the old rsi function. The old function still works, but is deprecated.
* portion_df to get all portions of S, I and R of a data set with antibiotic columns, with support for grouped variables
* ggplot2
  * geom_rsi, facet_rsi, scale_y_percent, scale_rsi_colours and theme_rsi
  * ggplot_rsi to apply all above functions on a data set:
    * septic_patients %>% select(tobr, gent) %>% ggplot_rsi will show portions of S, I and R immediately in a pretty plot
    * ?ggplot_rsi
* as.bactid and is.bactid to transform/look up microbial IDs.
* guess_bactid is now an alias of as.bactid
* kurtosis and skewness that are lacking in base R - they are generic functions and have support for vectors, data.frames and matrices
* g.test to perform the χ²-distributed G-test, whose usage is the same as chisq.test
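For readers unfamiliar with the G-test, the statistic for a contingency table can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the package's g.test: the function name `g_test_sketch` is hypothetical, no continuity correction is applied, and only the test of independence for a two-way table is covered.

```r
# Hypothetical sketch of a G-test of independence for a two-way table.
# G = 2 * sum(observed * log(observed / expected)), compared against a
# chi-squared distribution with (rows - 1) * (cols - 1) degrees of freedom.
g_test_sketch <- function(x) {
  x <- as.matrix(x)
  expected <- outer(rowSums(x), colSums(x)) / sum(x)
  # cells with a zero count contribute nothing to the sum
  terms <- ifelse(x > 0, x * log(x / expected), 0)
  statistic <- 2 * sum(terms)
  df <- (nrow(x) - 1) * (ncol(x) - 1)
  list(
    statistic = statistic,
    df = df,
    p.value = pchisq(statistic, df = df, lower.tail = FALSE)
  )
}

# Rows and columns perfectly independent: observed equals expected
independent <- g_test_sketch(matrix(c(10, 20, 20, 40), nrow = 2))

# Clear association between rows and columns
associated <- g_test_sketch(matrix(c(30, 10, 10, 30), nrow = 2))
```

As with chisq.test, the input is a matrix of counts; the result is interpreted through the p value of the chi-squared approximation.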
* ratio to transform a vector of values to a preset ratio
  * ratio(c(10, 500, 10), ratio = "1:2:1") would return 130, 260, 130
* %in% or %like% (and give them keyboard shortcuts), or to view the datasets that come with this package
* p.symbol to transform p values to their related symbols: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
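The arithmetic behind the ratio example in this list (a total of 520 redistributed as 1:2:1) can be reproduced with a few lines of base R. The helper name `ratio_sketch` is hypothetical, and this is only a sketch of the idea, not the package's own function.

```r
# Hypothetical sketch: redistribute the total of x according to a preset
# ratio given as a colon-separated string such as "1:2:1".
ratio_sketch <- function(x, ratio) {
  parts <- as.numeric(strsplit(ratio, ":", fixed = TRUE)[[1]])
  stopifnot(length(parts) == length(x))  # one ratio part per value
  sum(x) * parts / sum(parts)
}

redistributed <- ratio_sketch(c(10, 500, 10), ratio = "1:2:1")
redistributed
# 130 260 130
```

The total stays the same (520); only its division over the elements changes.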
* clipboard_import and clipboard_export as helper functions to quickly copy and paste from/to software like Excel and SPSS. These functions use the clipr package, but are a little altered to also support headless Linux servers (so you can use it in RStudio Server)
* freq):
  * rsi (antimicrobial resistance) to use as input
  * table to use as input: freq(table(x, y))
  * hist and plot to use a frequency table as input: hist(freq(df$age))
  * as.vector, as.data.frame, as_tibble and format
  * freq(mydata, mycolumn) is the same as mydata %>% freq(mycolumn)
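To make the shape of a frequency table concrete, here is a base R sketch that builds the columns named elsewhere in this changelog (item, count, percent, cum_count, cum_percent). The function name `freq_sketch` is hypothetical and the header information printed by freq is left out; it is an illustration, not the package's implementation.

```r
# Hypothetical sketch of a frequency table, sorted by descending count.
# NA values are dropped by table() in this sketch.
freq_sketch <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)
  n <- as.integer(counts)
  data.frame(
    item = names(counts),
    count = n,
    percent = n / sum(n),
    cum_count = cumsum(n),
    cum_percent = cumsum(n) / sum(n),
    stringsAsFactors = FALSE
  )
}

ft <- freq_sketch(c("F", "M", "M", "F", "M"))
ft
```

Because the result is a plain data.frame, columns can be dropped or reordered with ordinary subsetting.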
* top_freq function to return the top/below n items as vector
* options(max.print.freq = n) where n is your preset value
* resistance_predict and added more examples
* septic_patients data set to better reflect the reality
* mic and rsi classes now returns all values - use freq to check distributions
* key_antibiotics function are now generic: 6 for broadspectrum ABs, 6 for Gram-positive specific and 6 for Gram-negative specific ABs
* abname function
* %like% now supports multiple patterns
* data.frames with altered console printing to make it look like a frequency table. Because of this, the parameter toConsole is no longer needed.
* freq where the class of an item would be lost
* septic_patients dataset and the column bactid now has the new class "bactid"
* microorganisms dataset (especially for Salmonella) and the column bactid now has the new class "bactid"
* rsi and mic functions:
  * as.rsi("<=0.002; S") will return S
  * as.mic("<=0.002; S") will return <=0.002
  * as.mic("<= 0.002") now works
  * rsi and mic do not add the attribute package.version anymore
* "groups" option for atc_property(..., property). It will return a vector of the ATC hierarchy as defined by the WHO. The new function atc_groups is a convenient wrapper around this.
* atc_property as it requires the host set by url to be responsive
* first_isolate algorithm to exclude isolates where bacteria ID or genus is unavailable
* 924b62) from the dplyr package v0.7.5 and above
* guess_bactid (now called as.bactid)
  * yourdata %>% select(genus, species) %>% as.bactid() now also works
* n_rsi to count cases where antibiotic test results were available, to be used in conjunction with dplyr::summarise, see ?rsi
* guess_bactid to determine the ID of a microorganism based on genus/species or known abbreviations like MRSA
* guess_atc to determine the ATC of an antibiotic based on name, trade name, or known abbreviations
* freq to create frequency tables, with additional info in a header
* MDRO to determine Multi Drug Resistant Organisms (MDRO) with support for country-specific guidelines.
  * BRMO and MRGN are wrappers for Dutch and German guidelines, respectively
* "points" or "keyantibiotics", see ?first_isolate
* tibbles and data.tables
* rsi class for vectors that contain only invalid antimicrobial interpretations
* ablist to antibiotics
* bactlist to microorganisms
* antibiotics dataset
* microorganisms dataset
* septic_patients
* join functions
* %like% to make it case insensitive
* first_isolate and EUCAST_rules column names are now case-insensitive
* as.rsi and as.mic now add the package name and version as attributes
* README.md with more examples
* testthat package