Function rsi_df()
to transform a data.frame
to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions count_df()
and portion_df()
to immediately show resistance percentages and number of available isolates:
Support for all scientifically published pathotypes of E. coli to date. Supported are: AIEC (Adherent-Invasive E. coli), ATEC (Atypical Entero-pathogenic E. coli), DAEC (Diffusely Adhering E. coli), EAEC (Entero-Aggregative E. coli), EHEC (Entero-Haemorrhagic E. coli), EIEC (Entero-Invasive E. coli), EPEC (Entero-Pathogenic E. coli), ETEC (Entero-Toxigenic E. coli), NMEC (Neonatal Meningitis-causing E. coli), STEC (Shiga-toxin producing E. coli) and UPEC (Uropathogenic E. coli). All these lead to the microbial ID of E. coli:
Function mo_info()
as an analogy to ab_info()
. The mo_info()
prints a list with the full taxonomy, authors, and the URL to the online database of a microorganism
count_df()
and portion_df()
are now lowercase
as.ab()
and as.mo()
to understand even more severely misspelled input
as.ab()
now allows spaces for coercing antibiotic names
ggplot2
methods for automatically determining the scale type of classes mo
and ab
"bacteria"
from getting coerced by as.ab()
because Bacterial is a brand name of trimethoprim (TMP)
eucast_rules()
and mdro()
latest_annual_release
from the catalogue_of_life_version()
function
PVM1
from the antibiotics
data set as this was a duplicate of PME
as.mo()
as.rsi()
on an MIC value (created with as.mic()
), a disk diffusion value (created with the new as.disk()
) or on a complete data set containing columns with MIC or disk diffusion values.
mo_name()
as alias of mo_fullname()
mdr_tb()
) and added a new vignette about MDR. Read this tutorial here on our website.
first_isolate()
where missing species would lead to incorrect FALSEs. This bug was not present in AMR v0.5.0, but was in v0.6.0 and v0.6.1.
eucast_rules()
where antibiotics from WHONET software would not be recognised
antibiotics
data set:
ab
contains a human readable EARS-Net code, used by ECDC and WHO/WHONET - this is the primary identifier used in this package
atc
contains the ATC code, used by WHO/WHOCC
cid
contains the CID code (Compound ID), used by PubChem
AMX
for amoxicillin
atc_certe
, ab_umcg
and atc_trivial_nl
have been removed
atc_*
functions are superseded by ab_*
functions
All output will be translated by using an included translation file which can be viewed here.
Please create an issue in one of our repositories if you want additions in this file.
ggplot_rsi()
:
colours
to set the bar colours
title
, subtitle
, caption
, x.title
and y.title
to set titles and axis descriptions
guess_ab_col()
microorganisms.old
data set, which leads to better results when using the as.mo()
function
portion_df()
and count_df()
this means that their new parameter combine_SI
is TRUE at default. Our plotting function ggplot_rsi()
also reflects this change since it uses count_df()
internally.
age()
function gained a new parameter exact
to determine ages with decimals
guess_mo()
, guess_atc()
, EUCAST_rules()
, interpretive_reading()
, rsi()
freq()
):
support for boxplots:
Removed all hardcoded EUCAST rules and replaced them with a new reference file which can be viewed here.
Please create an issue in one of our repositories if you want changes in this file.
age_groups()
, to let groups of fives and tens end with 100+ instead of 120+
freq()
for when all values are NA
first_isolate()
for when dates are missing
guess_ab_col()
as.mo()
now gently interprets any number of whitespace characters (like tabs) as one space
as.mo()
now returns UNKNOWN
for "con"
(WHONET ID of ‘contamination’) and returns NA
for "xxx"
(WHONET ID of ‘no growth’)
as.mo()
microorganisms.codes
and cleaned it up
Fix for mo_shortname()
where species would not be determined correctly
eucast_rules()
with verbose = TRUE
New website!
We’ve got a new website: https://msberends.gitlab.io/AMR (built with the great pkgdown
)
as.mo()
to identify an MO code.
microorganisms
data set now contains:
The responsible author(s) and year of scientific publication
This data is updated annually - check the included version with the new function
catalogue_of_life_version()
.
mo
codes changed (e.g. Streptococcus changed from B_STRPTC
to B_STRPT
). A translation table is used internally to support older microorganism IDs, so users will not notice this difference.
mo_rank()
for the taxonomic rank (genus, species, infraspecies, etc.)
mo_url()
to get the direct URL of a species from the Catalogue of Life
first_isolate()
and eucast_rules()
, all parameters will be filled in automatically.
antibiotics
data set now contains a column ears_net
.
as.mo()
now knows all WHONET species abbreviations too, because almost 2,000 microbial abbreviations were added to the microorganisms.codes
data set.
New filters for antimicrobial classes. Use these functions to filter isolates on results in one or more antibiotics from a specific class:
filter_aminoglycosides()
filter_carbapenems()
filter_cephalosporins()
filter_1st_cephalosporins()
filter_2nd_cephalosporins()
filter_3rd_cephalosporins()
filter_4th_cephalosporins()
filter_fluoroquinolones()
filter_glycopeptides()
filter_macrolides()
filter_tetracyclines()
The antibiotics
data set will be searched, after which the input data will be checked for column names with a value in any abbreviations, codes or official names found in the antibiotics
data set. For example:
All ab_*
functions are deprecated and replaced by atc_*
functions:
ab_property -> atc_property()
ab_name -> atc_name()
ab_official -> atc_official()
ab_trivial_nl -> atc_trivial_nl()
ab_certe -> atc_certe()
ab_umcg -> atc_umcg()
ab_tradenames -> atc_tradenames()
as.atc()
internally. The old atc_property
has been renamed atc_online_property()
. This is done for two reasons: firstly, not all ATC codes are of antibiotics (ab) but can also be of antivirals or antifungals. Secondly, the input must have class atc
or must be coercible to this class. Properties of these classes should start with the same class name, analogous to as.mo()
and e.g. mo_genus
.
set_mo_source()
and get_mo_source()
to use your own predefined MO codes as input for as.mo()
and consequently all mo_*
functions
dplyr
version 0.8.0
guess_ab_col()
to find an antibiotic column in a table
mo_failures()
to review values that could not be coerced to a valid MO code, using as.mo()
. This latter function will now only show a maximum of 10 uncoerced values and will refer to mo_failures()
.
mo_uncertainties()
to review values that could be coerced to a valid MO code using as.mo()
, but with uncertainty.
mo_renamed()
to get a list of all returned values from as.mo()
that have had taxonomic renaming
age()
to calculate the (patient's) age in years
age_groups()
to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic antimicrobial resistance analysis per age group.
New function ggplot_rsi_predict()
as well as the base R plot()
function can now be used for resistance prediction calculated with resistance_predict()
:
Functions filter_first_isolate()
and filter_first_weighted_isolate()
to shorten and speed up filtering on data sets with antimicrobial results, e.g.:
is equal to:
availability()
to check the number of available (non-empty) results in a data.frame
New vignettes about how to conduct AMR analysis, predict antimicrobial resistance, use the G-test and more. These are also available (and even easier to read) on our website: https://msberends.gitlab.io/AMR.
eucast_rules()
:
septic_patients
now reflects these changes
eucast_rules(..., verbose = TRUE)
to get a data set with all changes per bug and drug combination.
microorganisms.oldDT
, microorganisms.prevDT
, microorganisms.unprevDT
and microorganismsDT
since they were no longer needed and only contained info already available in the microorganisms
data set
antibiotics
data set, from the Pharmaceuticals Community Register of the European Commission
atc_group1_nl
and atc_group2_nl
from the antibiotics
data set
atc_ddd()
and atc_groups()
have been renamed atc_online_ddd()
and atc_online_groups()
. The old functions are deprecated and will be removed in a future version.
guess_mo()
is now deprecated in favour of as.mo()
and will be removed in future versions
guess_atc()
is now deprecated in favour of as.atc()
and will be removed in future versions
as.mo()
:
Now handles incorrect spelling, like i
instead of y
and f
instead of ph
:
Uncertainty of the algorithm is now divided into four levels, 0 to 3, where the default allow_uncertain = TRUE
is equal to uncertainty level 2. Run ?as.mo
for more info about these levels.
# equal:
as.mo(..., allow_uncertain = TRUE)
as.mo(..., allow_uncertain = 2)
# also equal:
as.mo(..., allow_uncertain = FALSE)
as.mo(..., allow_uncertain = 0)
as.mo(..., allow_uncertain = 3)
could lead to very unreliable results.
~/.Rhistory_mo
. Use the new function clean_mo_history()
to delete this file, which resets the algorithms.
Incoercible results will now be considered ‘unknown’, MO code UNKNOWN
. On foreign systems, properties of these will be translated to all languages already previously supported: German, Dutch, French, Italian, Spanish and Portuguese:
first_isolate()
:
septic_patients
data set this yielded a difference of 0.15% more isolates
col_patientid
), when this parameter was left blank
col_keyantibiotics()
), when this parameter was left blank
output_logical
, the function will now always return a logical value
filter_specimen
to specimen_group
, although using filter_specimen
will still work
portion
functions, that low counts can influence the outcome and that the portion
functions may camouflage this, since they only return the portion (albeit being dependent on the minimum
parameter)
microorganisms.certe
and microorganisms.umcg
into microorganisms.codes
mo_taxonomy()
now contains the kingdom too
is.rsi.eligible()
using the new threshold
parameter
scale_rsi_colours()
mo
will now return the top 3 and the unique count, e.g. using summary(mo)
rsi
and mic
as.rsi()
:
"HIGH S"
will return S
freq()
function):
Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
header
function
header
is now set to TRUE
at default, even for markdown
mo
to show unique count of families, genera and species
decimal.mark
setting, which just like format
defaults to getOption("OutDec")
big.mark
parameter will at default be ","
when decimal.mark = "."
and "."
otherwise
NA
droplevels
to exclude empty factor levels when input is a factor
select()
on frequency tables
scale_y_percent()
now contains the limits
parameter
mdro()
, key_antibiotics()
and eucast_rules()
resistance_predict()
function)
as.mic()
to support more values ending in (several) zeroes
%like%
, it will now return the call
count_all
to get all available isolates (that like all portion_*
and count_*
functions also supports summarise
and group_by
), the old n_rsi
is now an alias of count_all
get_locale
to determine language for language-dependent output for some mo_*
functions. This is now the default value for their language
parameter, by which the system language will be used at default.
microorganismsDT
, microorganisms.prevDT
, microorganisms.unprevDT
and microorganisms.oldDT
to improve the speed of as.mo
. They are for reference only, since they are primarily for internal use of as.mo
.
read.4D
to read from the 4D database of the MMB department of the UMCG
mo_authors
and mo_year
to get specific values about the scientific reference of a taxonomic entry
MDRO
, BRMO
, MRGN
and EUCAST_exceptional_phenotypes
were renamed to mdro
, brmo
, mrgn
and eucast_exceptional_phenotypes
EUCAST_rules
was renamed to eucast_rules
, the old function still exists as a deprecated function
eucast_rules
function:
rules
to specify which rules should be applied (expert rules, breakpoints, others or all)
verbose
which can be set to TRUE
to get very specific messages about which columns and rows were affected
septic_patients
now reflects these changes
pipe
for piperacillin (J01CA12), also to the mdro
function
kingdom
to the microorganisms data set, and function mo_kingdom
to look up values
as.mo
(and subsequently all mo_*
functions), as empty values will be ignored a priori
as.mo
will return NA
Function as.mo
(and all mo_*
wrappers) now supports genus abbreviations with “species” attached
combine_IR
(TRUE/FALSE) to functions portion_df
and count_df
, to indicate that all values of I and R must be merged into one, so the output only consists of S vs. IR (susceptible vs. non-susceptible)
portion_*(..., as_percent = TRUE)
when minimal number of isolates would not be met
also_single_tested
for portion_*
and count_*
functions to also include cases where not all antibiotics were tested but at least one of the tested antibiotics includes the target antimicrobial interpretation, see ?portion
portion_*
functions now throw a warning when the total number of available isolates is below parameter minimum
as.mo
, as.rsi
, as.mic
, as.atc
and freq
will not set package name as attribute anymore
freq()
:
Support for grouping variables, test with:
Support for (un)selecting columns:
hms::is.hms
difftime
na
, to choose which character to print for empty values
header
to turn the header info off (default when markdown = TRUE
)
title
to manually set the title of the frequency table
first_isolate
now tries to find columns to use as input when parameters are left blank
mdro
)
septic_patients
is now a data.frame
, not a tibble anymore
microorganisms$ref
and microorganisms.old$ref
) to comply with CRAN policy to only allow ASCII characters
mo_property
not working properly
eucast_rules
where some Streptococci would become ceftazidime R in EUCAST rule 4.5
mo
, useful for top_freq()
ggplot_rsi
and scale_y_percent
have breaks
parameter
as.mo
:
"CRS"
-> Stenotrophomonas maltophilia
"CRSM"
-> Stenotrophomonas maltophilia
"MSSA"
-> Staphylococcus aureus
"MSSE"
-> Staphylococcus epidermidis
join
functions
is.rsi.eligible
, now 15-20 times faster
g.test
, when sum(x)
is below 1000 or any of the expected values is below 5, Fisher’s Exact Test will be suggested
ab_name
will try to fall back on as.atc
when no results are found
Percentages will now be rounded more logically (e.g. in freq
function)
microorganisms
now contains all microbial taxonomic data from ITIS (kingdoms Bacteria, Fungi and Protozoa), the Integrated Taxonomic Information System, available via https://itis.gov. The data set now contains more than 18,000 microorganisms with all known bacteria, fungi and protozoa according to ITIS with genus, species, subspecies, family, order, class, phylum and subkingdom. The new data set microorganisms.old
contains all previously known taxonomic names from those kingdoms.
mo_property
:
mo_phylum
, mo_class
, mo_order
, mo_family
, mo_genus
, mo_species
, mo_subspecies
mo_fullname
, mo_shortname
mo_type
, mo_gramstain
mo_ref
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:
mo_gramstain("E. coli")
# [1] "Gram negative"
mo_gramstain("E. coli", language = "de") # German
# [1] "Gramnegativ"
mo_gramstain("E. coli", language = "es") # Spanish
# [1] "Gram negativo"
mo_fullname("S. group A", language = "pt") # Portuguese
# [1] "Streptococcus grupo A"
Furthermore, former taxonomic names will give a note about the current taxonomic name:
count_R
, count_IR
, count_I
, count_SI
and count_S
to selectively count resistant or susceptible isolates
count_df
(which works like portion_df
) to get all counts of S, I and R of a data set with antibiotic columns, with support for grouped variables
is.rsi.eligible
to check for columns that have valid antimicrobial results, but do not have the rsi
class yet. Transform the columns of your raw data with: data %>% mutate_if(is.rsi.eligible, as.rsi)
Functions as.mo
and is.mo
as replacements for as.bactid
and is.bactid
(since the microorganisms
data set not only contains bacteria). These last two functions are deprecated and will be removed in a future release. The as.mo
function determines microbial IDs using intelligent rules:
as.mo("E. coli")
# [1] B_ESCHR_COL
as.mo("MRSA")
# [1] B_STPHY_AUR
as.mo("S group A")
# [1] B_STRPTC_GRA
And with great speed too - on a quite regular Linux server from 2007 it takes us less than 0.02 seconds to transform 25,000 items:
reference_df
for as.mo
, so users can supply their own microbial IDs, name or codes as a reference table
bactid
to mo
, like:
EUCAST_rules
, first_isolate
and key_antibiotics
microorganisms
and septic_patients
labels_rsi_count
to print data labels on an RSI ggplot2
model
Functions as.atc
and is.atc
to transform/look up antibiotic ATC codes as defined by the WHO. The existing function guess_atc
is now an alias of as.atc
.
ab_property
and its aliases: ab_name
, ab_tradenames
, ab_certe
, ab_umcg
and ab_trivial_nl
Renamed septic_patients$sex
to septic_patients$gender
antibiotics
data set: Terbinafine (D01BA02), Rifaximin (A07AA11) and Isoconazole (D01AC05)
Added 163 trade names to the antibiotics
data set, it now contains 298 different trade names in total, e.g.:
first_isolate
, rows will be ignored when there’s no species available
ratio
is now deprecated and will be removed in a future release, as it is not really the scope of this package
as.mic
for values ending in zeroes after a real number
microorganisms.umcg
data set
prevalence
column to the microorganisms
data setminimum
and as_percent
to portion_df
Support for quasiquotation in the functions series count_*
and portion_*
, and n_rsi
. This allows checking more than 2 vectors or columns.
ggplot_rsi
and geom_rsi
so they can cope with count_df
. The new fun
parameter has value portion_df
at default, but can be set to count_df
.
ggplot_rsi
when the ggplot2
package was not loaded
labels_rsi_count
to ggplot_rsi
geom_rsi
(and ggplot_rsi
) so you can set your own preferences
quote
to the freq
function
diff
for frequency tables
freq
) header of class character
Support for types (classes) list and matrix for freq
For lists, subsetting is possible:
rsi_df
was removed in favour of new functions portion_R
, portion_IR
, portion_I
, portion_SI
and portion_S
to selectively calculate resistance or susceptibility. These functions are 20 to 30 times faster than the old rsi
function. The old function still works, but is deprecated.
portion_df
to get all portions of S, I and R of a data set with antibiotic columns, with support for grouped variables
ggplot2
geom_rsi
, facet_rsi
, scale_y_percent
, scale_rsi_colours
and theme_rsi
ggplot_rsi
to apply all above functions on a data set:
septic_patients %>% select(tobr, gent) %>% ggplot_rsi
will show portions of S, I and R immediately in a pretty plot
?ggplot_rsi
as.bactid
and is.bactid
to transform/look up microbial IDs.
guess_bactid
is now an alias of as.bactid
kurtosis
and skewness
that are lacking in base R - they are generic functions and have support for vectors, data.frames and matrices
g.test
to perform the χ²-distributed G-test, whose usage is the same as chisq.test
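For readers unfamiliar with the G-test, the statistic behind it can be sketched in base R. This is an illustration only: g_test_sketch() is a hypothetical helper and is not the package's g.test(). It assumes a goodness-of-fit setting with known expected proportions, computing G = 2 * sum(O * log(O / E)) and comparing it against a χ² distribution.

```r
# Illustrative sketch only: a goodness-of-fit G-test in base R.
# g_test_sketch() is a hypothetical name, not part of the package.
g_test_sketch <- function(observed,
                          p = rep(1 / length(observed), length(observed))) {
  stopifnot(all(observed >= 0), abs(sum(p) - 1) < 1e-8)
  expected <- sum(observed) * p
  # Categories with zero observations contribute nothing to the sum
  nonzero <- observed > 0
  statistic <- 2 * sum(observed[nonzero] *
                         log(observed[nonzero] / expected[nonzero]))
  df <- length(observed) - 1
  list(statistic = statistic,
       df = df,
       p.value = pchisq(statistic, df = df, lower.tail = FALSE))
}

# Observed counts in four categories, tested against equal proportions
result <- g_test_sketch(c(43, 52, 54, 40))
balanced <- g_test_sketch(c(10, 10, 10))
```

When observed and expected counts coincide, as in the second call, the statistic is zero and the p value is 1.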
ratio
to transform a vector of values to a preset ratio
ratio(c(10, 500, 10), ratio = "1:2:1")
would return 130, 260, 130
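The arithmetic behind that example can be reproduced in base R. The sketch below is an assumption about how such a rescaling could work, not the package's own code, and ratio_sketch() is a hypothetical name: the total of the input (520) is redistributed over the parts of the ratio (1 + 2 + 1 = 4).

```r
# Illustrative sketch only: redistribute the sum of x according to a ratio.
# ratio_sketch() is a hypothetical helper, not the package's ratio().
ratio_sketch <- function(x, ratio) {
  parts <- as.numeric(strsplit(ratio, ":", fixed = TRUE)[[1]])
  stopifnot(length(parts) == length(x), !anyNA(parts))
  sum(x) * parts / sum(parts)
}

scaled <- ratio_sketch(c(10, 500, 10), ratio = "1:2:1")
print(scaled)  # 130 260 130
```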
%in%
or %like%
(and give them keyboard shortcuts), or to view the datasets that come with this packagep.symbol
to transform p values to their related symbols: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
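The mapping from p values to symbols can be sketched with base R's cut(). This is a hedged illustration using the cut-offs listed above; p_symbol_sketch() is a hypothetical name, and the treatment of values exactly on a boundary (assigned to the lower interval here) is an assumption.

```r
# Illustrative sketch only: map p values to significance symbols.
# p_symbol_sketch() is a hypothetical helper, not the package's p.symbol().
p_symbol_sketch <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", " "),
                   include.lowest = TRUE))
}

symbols <- p_symbol_sketch(c(0.0004, 0.03, 0.07, 0.2))
print(symbols)  # "***" "*" "." " "
```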
clipboard_import
and clipboard_export
as helper functions to quickly copy and paste from/to software like Excel and SPSS. These functions use the clipr
package, but are a little altered to also support headless Linux servers (so you can use it in RStudio Server)
freq
):
rsi
(antimicrobial resistance) to use as input
table
to use as input: freq(table(x, y))
hist
and plot
to use a frequency table as input: hist(freq(df$age))
as.vector
, as.data.frame
, as_tibble
and format
freq(mydata, mycolumn)
is the same as mydata %>% freq(mycolumn)
top_freq
function to return the top/below n items as vector
options(max.print.freq = n)
where n is your preset value
resistance_predict
and added more examples
septic_patients
data set to better reflect the reality
mic
and rsi
classes now returns all values - use freq
to check distributions
key_antibiotics
function are now generic: 6 for broadspectrum ABs, 6 for Gram-positive specific and 6 for Gram-negative specific ABs
abname
function
%like%
now supports multiple patterns
data.frame
s with altered console printing to make it look like a frequency table. Because of this, the parameter toConsole
is no longer needed.
freq
where the class of an item would be lost
septic_patients
dataset and the column bactid
now has the new class "bactid"
microorganisms
dataset (especially for Salmonella) and the column bactid
now has the new class "bactid"
rsi
and mic
functions:
as.rsi("<=0.002; S")
will return S
as.mic("<=0.002; S")
will return <=0.002
as.mic("<= 0.002")
now works
rsi
and mic
do not add the attribute package.version
anymore
"groups"
option for atc_property(..., property)
. It will return a vector of the ATC hierarchy as defined by the WHO. The new function atc_groups
is a convenient wrapper around this.
atc_property
as it requires the host set by url
to be responsive
first_isolate
algorithm to exclude isolates where bacteria ID or genus is unavailable
924b62
) from the dplyr
package v0.7.5 and above
guess_bactid
(now called as.bactid
)
yourdata %>% select(genus, species) %>% as.bactid()
now also works
n_rsi
to count cases where antibiotic test results were available, to be used in conjunction with dplyr::summarise
, see ?rsi
guess_bactid
to determine the ID of a microorganism based on genus/species or known abbreviations like MRSA
guess_atc
to determine the ATC of an antibiotic based on name, trade name, or known abbreviations
freq
to create frequency tables, with additional info in a header
MDRO
to determine Multi Drug Resistant Organisms (MDRO) with support for country-specific guidelines.
BRMO
and MRGN
are wrappers for Dutch and German guidelines, respectively
"points"
or "keyantibiotics"
, see ?first_isolate
tibble
s and data.table
s
rsi
class for vectors that contain only invalid antimicrobial interpretations
ablist
to antibiotics
bactlist
to microorganisms
antibiotics
dataset
microorganisms
dataset
septic_patients
join
functions
%like%
to make it case insensitive
first_isolate
and EUCAST_rules
column names are now case-insensitive
as.rsi
and as.mic
now add the package name and version as attributes
README.md
with more examples
testthat
package