NEWS.md
- Function ab_from_text() to retrieve antimicrobial drugs from clinical texts in e.g. health care records, which also corrects for misspelling since it uses as.ab() internally:
ab_from_text("28/03/2020 regular amoxiciliin 500mg po tds")
+#> [1] "Amoxicillin"
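How as.ab() matches misspelled names is not shown in this changelog. As a rough base R illustration of the idea only, the sketch below assumes that words in the text are compared with known drug names by Levenshtein distance. The helper `find_drugs_in_text()`, its `max_dist` parameter and the `known` vector are hypothetical and not part of the package.

```r
# Hypothetical helper, not part of the AMR package: scan free text for
# drug names, tolerating small spelling errors via edit distance.
find_drugs_in_text <- function(text, drug_names, max_dist = 2) {
  # split the text into lower-case alphabetic words
  words <- unlist(strsplit(tolower(text), "[^a-z]+"))
  words <- words[nchar(words) >= 4]  # skip short tokens such as "po" or "tds"
  if (length(words) == 0) {
    return(character(0))
  }
  # Levenshtein distance between every word and every candidate name
  d <- utils::adist(words, tolower(drug_names))
  # keep the names that have at least one word within max_dist edits
  drug_names[apply(d, 2, min) <= max_dist]
}

known <- c("Amoxicillin", "Ciprofloxacin", "Vancomycin")
find_drugs_in_text("28/03/2020 regular amoxiciliin 500mg po tds", known)
```

A fixed distance threshold is the simplest choice here; a real implementation would likely also need abbreviations and trade names, which this sketch ignores.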
Tidyverse selections for antibiotic classes, which help to select the columns of antibiotics that belong to a specific antibiotic class, without the need to define the columns or antibiotic abbreviations. They can be used in any function that allows Tidyverse selections, like dplyr::select() and tidyr::pivot_longer():
library(dplyr)
+library(dplyr)
# Columns 'IPM' and 'MEM' are in the example_isolates data set
example_isolates %>%
@@ -269,6 +274,8 @@
Changed
+
- Fixed a bug for using susceptibility() or resistance() outside summarise()
+- Fixed a bug where eucast_rules() would not work on a tibble when the tibble or dplyr package was loaded
- All *_join_microorganisms() functions and bug_drug_combinations() now return the original data class (e.g. tibbles and data.tables)
- Fixed a bug where
@@ -278,6 +285,9 @@
as.ab() would return an error on invalid input values
- Fixed a bug in bug_drug_combinations() for when only one antibiotic was in the input data
- Changed the summary for class <mo>, to highlight the %SI vs. %R
- Improved error handling, giving more useful info when functions return an error
+- Algorithm improvements to as.ab()
+- Added Monuril as trade name for fosfomycin
Fixed important floating point error for some MIC comparisons in EUCAST 2020 guideline
Interpretation from MIC values (and disk zones) to R/SI can now be used with mutate_at() of the dplyr package:
yourdata %>%
+yourdata %>%
  mutate_at(vars(antibiotic1:antibiotic25), as.rsi, mo = "E. coli")
yourdata %>%
@@ -426,7 +436,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Support for LOINC codes in the
-antibiotics data set. Use ab_loinc() to retrieve LOINC codes, or use a LOINC code for input in any ab_* function:
ab_loinc("ampicillin")
+ab_loinc("ampicillin")
#> [1] "21066-6" "3355-5" "33562-0" "33919-2" "43883-8" "43884-6" "87604-5"
ab_name("21066-6")
#> [1] "Ampicillin"
@@ -435,7 +445,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Support for SNOMED CT codes in the
-microorganisms data set. Use mo_snomed() to retrieve SNOMED codes, or use a SNOMED code for input in any mo_* function:
mo_snomed("S. aureus")
+mo_snomed("S. aureus")
#> [1] 115329001 3092008 113961008
mo_name(115329001)
#> [1] "Staphylococcus aureus"
@@ -498,9 +508,9 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
@@ -512,7 +522,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
If you depended on the old Enterobacteriaceae family, e.g. by using this in your code:
-
+if (mo_family(somebugs) == "Enterobacteriaceae") ...
if (mo_family(somebugs) == "Enterobacteriaceae") ...
then please adjust this to:
-
+if (mo_order(somebugs) == "Enterobacterales") ...
if (mo_order(somebugs) == "Enterobacterales") ...
Functions
-susceptibility() and resistance() as aliases of proportion_SI() and proportion_R(), respectively. These functions were added to make it clearer that “I” should be considered susceptible and not resistant.
library(dplyr)
+library(dplyr)
example_isolates %>%
  group_by(bug = mo_name(mo)) %>%
  summarise(amoxicillin = resistance(AMX),
@@ -539,7 +549,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
More intelligent way of coping with some consonants like “l” and “r”
Added a score (a certainty percentage) to
-mo_uncertainties(), which is calculated using the Levenshtein distance:
as.mo(c("Stafylococcus aureus",
+as.mo(c("Stafylococcus aureus",
        "staphylokok aureuz"))
#> Warning:
#> Results of two values were guessed with uncertainty. Use mo_uncertainties() to review them.
@@ -596,12 +606,12 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
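The changelog does not give the formula behind this score. The sketch below shows one plausible way to turn a Levenshtein distance into a certainty between 0 and 1, assuming the distance is normalised by the length of the longer string. The function `similarity_score()` is hypothetical and only illustrates the principle.

```r
# Hypothetical scoring function: the exact formula used by the package is
# not given in this changelog, so this is only one plausible normalisation.
similarity_score <- function(input, candidate) {
  # number of single-character edits needed, ignoring case
  d <- utils::adist(tolower(input), tolower(candidate))[1, 1]
  longest <- max(nchar(input), nchar(candidate))
  1 - d / longest  # 1 = identical, lower = less certain
}

# a mild misspelling scores higher than a heavy one
similarity_score("Stafylococcus aureus", "Staphylococcus aureus")
similarity_score("staphylokok aureuz", "Staphylococcus aureus")
```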
Determination of first isolates now excludes all ‘unknown’ microorganisms by default, i.e. microbial code
-"UNKNOWN". They can be included with the new parameter include_unknown:
+first_isolate(..., include_unknown = TRUE)
first_isolate(..., include_unknown = TRUE)
For WHONET users, this means that all records/isolates with organism code "con" (contamination) will be excluded by default, since as.mo("con") = "UNKNOWN". The function always shows a note with the number of ‘unknown’ microorganisms that were included or excluded.
For code consistency, classes
-ab and mo will now be preserved in any subsetting or assignment. For the sake of data integrity, this means that invalid assignments will now result in NA:
# how it works in base R:
+# how it works in base R:
x <- factor("A")
x[1] <- "B"
#> Warning message:
@@ -624,7 +634,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
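To illustrate what preserving a class on subsetting and assignment can look like, here is a minimal S3 sketch in base R. The class name `drugcode`, the `valid_codes` vector and the helper `as_drugcode()` are made up for this example. It mimics the factor behaviour described above and is not the package's implementation of the ab or mo classes.

```r
# Hypothetical class "drugcode" (not the package's <ab> class): values must
# come from a fixed set of valid codes.
valid_codes <- c("AMX", "CIP", "VAN")

as_drugcode <- function(x) {
  x <- as.character(x)
  x[!x %in% valid_codes] <- NA_character_
  structure(x, class = "drugcode")
}

# subsetting keeps the class
`[.drugcode` <- function(x, ...) {
  structure(NextMethod(), class = "drugcode")
}

# assignment validates the new value and keeps the class
`[<-.drugcode` <- function(x, i, value) {
  value <- as.character(value)
  invalid <- !is.na(value) & !value %in% valid_codes
  if (any(invalid)) {
    warning("invalid code(s), NA generated")
    value[invalid] <- NA_character_
  }
  y <- unclass(x)
  y[i] <- value
  structure(y, class = "drugcode")
}

x <- as_drugcode(c("AMX", "CIP"))
x[1] <- "VAN"         # valid: stored as-is
x[2] <- "not a drug"  # invalid: becomes NA, with a warning
class(x[1])           # the subset still has class "drugcode"
```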
Function
-bug_drug_combinations() to quickly get a data.frame with the results of all bug-drug combinations in a data set. The column containing microorganism codes is guessed automatically and its input is transformed with mo_shortname() by default:
x <- bug_drug_combinations(example_isolates)
+x <- bug_drug_combinations(example_isolates)
#> NOTE: Using column `mo` as input for `col_mo`.
x[1:4, ]
#> mo ab S I R total
@@ -645,11 +655,11 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
#> 4 Gram-negative AMX 227 0 405 632
#> NOTE: Use 'format()' on this result to get a publicable/printable format.
You can format this to a printable format, ready for reporting or exporting to e.g. Excel with the base R
-format() function:
+format(x, combine_IR = FALSE)
format(x, combine_IR = FALSE)
Additional way to calculate co-resistance, i.e. when using multiple antimicrobials as input for
-portion_* functions or count_* functions. This can be used to determine the empiric susceptibility of a combination therapy. A new parameter only_all_tested (which defaults to FALSE) replaces the old also_single_tested and can be used to select one of the two methods to count isolates and calculate portions. The difference can be seen in this example table (which is also on the portion and count help pages), where the %SI is being determined:
# --------------------------------------------------------------------
+# --------------------------------------------------------------------
#                     only_all_tested = FALSE  only_all_tested = TRUE
#                     -----------------------  -----------------------
#  Drug A    Drug B   include as  include as   include as  include as
@@ -669,7 +679,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
-
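The example table is cut off by the diff, so the exact rules are not visible here. The base R sketch below assumes the following reading of the two methods: an isolate counts as susceptible when at least one drug is S or I; with only_all_tested = TRUE only isolates tested for both drugs are counted; with FALSE an isolate tested for just one drug is also counted when that result is S or I. The function `proportion_si_combined()` is hypothetical and not the package's code.

```r
# Hypothetical helper, written from the description above; it is not the
# package's implementation. Each drug is a character vector of "S", "I",
# "R" or NA, one element per isolate.
proportion_si_combined <- function(drug_a, drug_b, only_all_tested = FALSE) {
  si_a <- drug_a %in% c("S", "I")
  si_b <- drug_b %in% c("S", "I")
  tested_a <- !is.na(drug_a)
  tested_b <- !is.na(drug_b)

  # numerator: at least one of the two drugs is S or I
  susceptible <- si_a | si_b

  if (only_all_tested) {
    # count only isolates that were tested for both drugs
    included <- tested_a & tested_b
  } else {
    # also count isolates tested for one drug, as long as the outcome is
    # known: either some drug is S/I, or both were tested (and both are R)
    included <- susceptible | (tested_a & tested_b)
  }
  sum(susceptible & included) / sum(included)
}

drug_a <- c("S", "R", NA,  "R", "R")
drug_b <- c("S", "S", "S", "R", NA)
proportion_si_combined(drug_a, drug_b)                          # 3 of 4 isolates
proportion_si_combined(drug_a, drug_b, only_all_tested = TRUE)  # 2 of 3 isolates
```

In this toy data the third isolate (not tested for drug A, susceptible to drug B) is what makes the two methods differ.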
tibble printing support for classes rsi, mic, disk, ab and mo. When using tibbles containing antimicrobial columns, values S will print in green, values I will print in yellow and values R will print in red. Microbial IDs (class mo) will emphasise the genus and species, not the kingdom.
# (run this on your own console, as this page does not support colour printing)
+# (run this on your own console, as this page does not support colour printing)
library(dplyr)
example_isolates %>%
  select(mo:AMC) %>%
@@ -750,7 +760,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Function
-rsi_df() to transform a data.frame to a data set containing only the microbial interpretation (S, I, R), the antibiotic, the percentage of S/I/R and the number of available isolates. This is a convenient combination of the existing functions count_df() and portion_df() to immediately show resistance percentages and number of available isolates:
septic_patients %>%
+septic_patients %>%
  select(AMX, CIP) %>%
  rsi_df()
# antibiotic interpretation value isolates
@@ -775,7 +785,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
- UPEC (Uropathogenic E. coli)
All these lead to the microbial ID of E. coli:
-as.mo("UPEC")
+as.mo("UPEC")
# B_ESCHR_COL
mo_name("UPEC")
# "Escherichia coli"
@@ -882,7 +892,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
when all values are unique it now shows a message instead of a warning
support for boxplots:
-septic_patients %>%
+septic_patients %>%
  freq(age) %>%
  boxplot()
# grouped boxplots:
@@ -973,7 +983,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New filters for antimicrobial classes. Use these functions to filter isolates on results in one or more antibiotics from a specific class:
-filter_aminoglycosides()
+filter_aminoglycosides()
filter_carbapenems()
filter_cephalosporins()
filter_1st_cephalosporins()
@@ -985,14 +995,14 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
filter_macrolides()
filter_tetracyclines()
The
-antibiotics data set will be searched, after which the input data will be checked for column names with a value in any abbreviations, codes or official names found in the antibiotics data set. For example:
septic_patients %>% filter_glycopeptides(result = "R")
+septic_patients %>% filter_glycopeptides(result = "R")
# Filtering on glycopeptide antibacterials: any of `vanc` or `teic` is R
septic_patients %>% filter_glycopeptides(result = "R", scope = "all")
# Filtering on glycopeptide antibacterials: all of `vanc` and `teic` is R
All
-ab_* functions are deprecated and replaced by atc_* functions:
ab_property -> atc_property()
+ab_property -> atc_property()
ab_name -> atc_name()
ab_official -> atc_official()
ab_trivial_nl -> atc_trivial_nl()
@@ -1011,17 +1021,17 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
New function
age_groups() to split ages into custom or predefined groups (like children or elderly). This allows for easier demographic antimicrobial resistance analysis per age group.
New function
-ggplot_rsi_predict() as well as the base R plot() function can now be used for resistance prediction calculated with resistance_predict():
x <- resistance_predict(septic_patients, col_ab = "amox")
+x <- resistance_predict(septic_patients, col_ab = "amox")
plot(x)
ggplot_rsi_predict(x)
Functions
-filter_first_isolate() and filter_first_weighted_isolate() to shorten and speed up filtering on data sets with antimicrobial results, e.g.:
septic_patients %>% filter_first_isolate(...)
+septic_patients %>% filter_first_isolate(...)
# or
filter_first_isolate(septic_patients, ...)
is equal to:
-septic_patients %>%
+
@@ -1052,7 +1062,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
septic_patients %>%
  mutate(only_firsts = first_isolate(septic_patients, ...)) %>%
  filter(only_firsts == TRUE) %>%
  select(-only_firsts)
Now handles incorrect spelling, like
-i instead of y and f instead of ph:
# mo_fullname() uses as.mo() internally
+# mo_fullname() uses as.mo() internally
mo_fullname("Sthafilokockus aaureuz")
#> [1] "Staphylococcus aureus"
@@ -1062,7 +1072,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Uncertainty of the algorithm is now divided into four levels, 0 to 3, where the default
-allow_uncertain = TRUE is equal to uncertainty level 2. Run ?as.mo for more info about these levels.
# equal:
+# equal:
as.mo(..., allow_uncertain = TRUE)
as.mo(..., allow_uncertain = 2)
@@ -1075,7 +1085,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
All microbial IDs that were found are now saved to a local file ~/.Rhistory_mo. Use the new function clean_mo_history() to delete this file, which resets the algorithms.
Incoercible results will now be considered ‘unknown’, MO code
-UNKNOWN. On foreign systems, properties of these will be translated to all languages already supported: German, Dutch, French, Italian, Spanish and Portuguese:
mo_genus("qwerty", language = "es")
+
@@ -1123,7 +1133,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
mo_genus("qwerty", language = "es")
# Warning:
# one unique value (^= 100.0%) could not be coerced and is considered 'unknown': "qwerty". Use mo_failures() to review it.
#> [1] "(género desconocido)"
Support for tidyverse quasiquotation! Now you can create frequency tables of function outcomes:
-# Determine genus of microorganisms (mo) in `septic_patients` data set:
+# Determine genus of microorganisms (mo) in `septic_patients` data set:
# OLD WAY
septic_patients %>%
  mutate(genus = mo_genus(mo)) %>%
@@ -1206,7 +1216,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Fewer than 3 characters as input for as.mo will return NA
Function
-as.mo (and all mo_* wrappers) now supports genus abbreviations with “species” attached
as.mo("E. species") # B_ESCHR
+
@@ -1221,13 +1231,13 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
as.mo("E. species") # B_ESCHR
mo_fullname("E. spp.") # "Escherichia species"
as.mo("S. spp") # B_STPHY
mo_fullname("S. species") # "Staphylococcus species"
Support for grouping variables, test with:
-septic_patients %>%
+septic_patients %>%
  group_by(hospital_id) %>%
  freq(gender)
Support for (un)selecting columns:
-septic_patients %>%
+
@@ -1305,7 +1315,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
septic_patients %>%
  freq(hospital_id) %>%
  select(-count, -cum_count) # only get item, percent, cum_percent
They also come with support for German, Dutch, French, Italian, Spanish and Portuguese:
-mo_gramstain("E. coli")
+mo_gramstain("E. coli")
# [1] "Gram negative"
mo_gramstain("E. coli", language = "de") # German
# [1] "Gramnegativ"
@@ -1314,7 +1324,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
mo_fullname("S. group A", language = "pt") # Portuguese
# [1] "Streptococcus grupo A"
Furthermore, former taxonomic names will give a note about the current taxonomic name:
-mo_gramstain("Esc blattae")
+
@@ -1327,14 +1337,14 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
mo_gramstain("Esc blattae")
# Note: 'Escherichia blattae' (Burgess et al., 1973) was renamed 'Shimwellia blattae' (Priest and Barker, 2010)
# [1] "Gram negative"
Function
is.rsi.eligible to check for columns that have valid antimicrobial results, but do not have the rsi class yet. Transform the columns of your raw data with:
data %>% mutate_if(is.rsi.eligible, as.rsi)
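The changelog does not say what counts as a valid result. As a sketch of such an eligibility check, the base R function below assumes a column qualifies when nearly all of its non-missing values are S, I or R (case-insensitive). The name `looks_like_rsi()` and the `max_invalid` threshold are hypothetical.

```r
# Hypothetical check, not the package's implementation: treat a column as
# "eligible" when nearly all of its non-missing values are S, I or R.
looks_like_rsi <- function(x, max_invalid = 0.05) {
  if (inherits(x, "rsi")) {
    return(FALSE)  # already converted, nothing to do
  }
  values <- toupper(trimws(as.character(x)))
  values <- values[!is.na(values) & values != ""]
  if (length(values) == 0) {
    return(FALSE)
  }
  # proportion of values that are not a valid interpretation
  mean(!values %in% c("S", "I", "R")) <= max_invalid
}

raw <- data.frame(
  patient = c("A1", "B2", "C3"),
  amox    = c("S", "r", NA),
  cipr    = c("R", "I", "S"),
  stringsAsFactors = FALSE
)
vapply(raw, looks_like_rsi, logical(1))
```

A predicate of this shape returns one logical per column, which is the form that mutate_if() expects.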
Functions
-as.mo and is.mo as replacements for as.bactid and is.bactid (since the microorganisms data set not only contains bacteria). These last two functions are deprecated and will be removed in a future release. The as.mo function determines microbial IDs using intelligent rules:
as.mo("E. coli")
+as.mo("E. coli")
# [1] B_ESCHR_COL
as.mo("MRSA")
# [1] B_STPHY_AUR
as.mo("S group A")
# [1] B_STRPTC_GRA
And with great speed too - on a quite regular Linux server from 2007 it takes us less than 0.02 seconds to transform 25,000 items:
-thousands_of_E_colis <- rep("E. coli", 25000)
+thousands_of_E_colis <- rep("E. coli", 25000)
microbenchmark::microbenchmark(as.mo(thousands_of_E_colis), unit = "s")
# Unit: seconds
# min median max neval
@@ -1366,7 +1376,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Added three antimicrobial agents to the
antibiotics data set: Terbinafine (D01BA02), Rifaximin (A07AA11) and Isoconazole (D01AC05)
Added 163 trade names to the
-antibiotics data set, it now contains 298 different trade names in total, e.g.:
ab_official("Bactroban")
+ab_official("Bactroban")
# [1] "Mupirocin"
ab_name(c("Bactroban", "Amoxil", "Zithromax", "Floxapen"))
# [1] "Mupirocin" "Amoxicillin" "Azithromycin" "Flucloxacillin"
@@ -1381,7 +1391,7 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Added parameters minimum and as_percent to portion_df
Support for quasiquotation in the functions series
-count_* and portions_*, and n_rsi. This allows checking more than 2 vectors or columns.
septic_patients %>% select(amox, cipr) %>% count_IR()
+
@@ -280,7 +280,7 @@ count_resistant() should be used to count resistant isolates, count_susceptible(
septic_patients %>% select(amox, cipr) %>% count_IR()
# which is the same as:
septic_patients %>% count_IR(amox, cipr)
@@ -1399,10 +1409,10 @@ This works for all drug combinations, such as ampicillin/sulbactam, ceftazidime/
Added longest and shortest character length in the frequency table (freq) header of class character
Support for types (classes) list and matrix for
-freq
diff --git a/docs/reference/count.html b/docs/reference/count.html
index c2ddf447..b077baca 100644
--- a/docs/reference/count.html
+++ b/docs/reference/count.html
@@ -83,7 +83,7 @@ count_resistant() should be used to count resistant isolates, count_susceptible(
my_matrix = with(septic_patients, matrix(c(age, gender), ncol = 2))
+For lists, subsetting is possible:
-
@@ -288,7 +288,12 @@ This package contains all ~550 antibiotic, antimycotic and antiviral dru
my_list = list(age = septic_patients$age, gender = septic_patients$gender)
+
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
index 31d91fe1..bce17854 100644
--- a/docs/pkgdown.yml
+++ b/docs/pkgdown.yml
@@ -10,7 +10,7 @@ articles:
   WHONET: WHONET.html
   benchmarks: benchmarks.html
   resistance_predict: resistance_predict.html
-last_built: 2020-06-22T11:18Z
+last_built: 2020-06-25T15:34Z
 urls:
   reference: https://msberends.gitlab.io/AMR/reference
   article: https://msberends.gitlab.io/AMR/articles
diff --git a/docs/reference/ab_from_text.html b/docs/reference/ab_from_text.html
new file mode 100644
index 00000000..6cac6cf6
--- /dev/null
+++ b/docs/reference/ab_from_text.html
@@ -0,0 +1,305 @@
my_list = list(age = septic_patients$age, gender = septic_patients$gender)
my_list %>% freq(age)
my_list %>% freq(gender)
Retrieve antimicrobial drugs from text — ab_from_text • AMR (for R)
diff --git a/docs/reference/as.ab.html b/docs/reference/as.ab.html
index 566ceae5..31309376 100644
--- a/docs/reference/as.ab.html
+++ b/docs/reference/as.ab.html
@@ -82,7 +82,7 @@
+Use this function on e.g. clinical texts from health care records. It returns a vector of antimicrobial drugs found in the texts.
+ab_from_text(text, collapse = NULL, translate_ab = "name", ...)
+
+Arguments
+
+text          text to analyse
+collapse      character to pass on to paste(..., collapse = ...) to only return one character per element of text, see Examples
+translate_ab  a column name of the antibiotics data set to translate the antibiotic abbreviations to, using ab_property(). Defaults to "name", which is equal to using TRUE. Use a value FALSE, NULL or NA to prevent translation of the <ab> code.
+...           parameters passed on to as.ab()
+
+Details
+
+To use this for creating a new variable in a data set (e.g. with mutate()), it could be convenient to paste the outcome together with the collapse parameter so every value in your new variable will be a character of length 1:
+df %>% mutate(abx = ab_from_text(clinical_text, collapse = "|"))
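Why collapsing matters for a data set column can be shown in base R alone. The sketch below uses made-up hits per text record (the list `hits` is an assumption about the shape of the result, not output of the package) and collapses each element to exactly one string with paste().

```r
# Assumed input shape: one character vector of drug names per text record
# (hypothetical data; in practice this would come from ab_from_text()).
hits <- list(
  c("Amoxicillin/clavulanic acid", "Ciprofloxacin"),
  "Amoxicillin",
  character(0)
)

# collapse every element to exactly one string so it fits a data.frame column
collapsed <- vapply(hits, paste, character(1), collapse = "|")

df <- data.frame(record = 1:3, abx = collapsed, stringsAsFactors = FALSE)
df$abx
```

Records without any hit end up as an empty string here; whether the package returns "" or NA in that case is not stated in the text above.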
+
+This function is also internally used by as.ab(), although it then only returns the first hit.
+
+Examples
+# mind the bad spelling of amoxicillin in this line,
+# straight from a true health care record:
+ab_from_text("28/03/2020 regular amoxicilliin 500mg po tds")
+
+ab_from_text("administered amoxi/clav and cipro")
+ab_from_text("administered amoxi/clav and cipro", collapse = ", ")
+
+# if you want to know which antibiotic groups were administered, check it:
+abx <- ab_from_text("administered amoxi/clav and cipro")
+ab_group(abx)
+On our website https://msberends.gitlab.io/AMR you can find a comprehensive tutorial about how to conduct AMR analysis, the complete documentation of all functions (which reads a lot easier than here in R) and an example analysis using WHONET data.
See also
-antibiotics for the dataframe that is being used to determine ATCs.
+- antibiotics for the dataframe that is being used to determine ATCs
+- ab_from_text() for a function to retrieve antimicrobial drugs from clinical text (from health care records)
Examples
# these examples all return "ERY", the ID of erythromycin: diff --git a/docs/reference/as.rsi.html b/docs/reference/as.rsi.html index 18b4879e..87c8ae26 100644 --- a/docs/reference/as.rsi.html +++ b/docs/reference/as.rsi.html @@ -82,7 +82,7 @@translate_ab -+ a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
. Use a valuelanguage @@ -400,7 +400,7 @@ A microorganism is categorised as Susceptible, Increased exposure when S = count_S(CIP), n1 = count_all(CIP), # the actual total; sum of all three n2 = n_rsi(CIP), # same - analogous to n_distinct - total = n()) # NOT the number of tested isolates! + total = n()) # NOT the number of tested isolates! # Count co-resistance between amoxicillin/clav acid and gentamicin, # so we can see that combination therapy does a lot more than mono therapy. diff --git a/docs/reference/ggplot_rsi.html b/docs/reference/ggplot_rsi.html index e060340d..20599a41 100644 --- a/docs/reference/ggplot_rsi.html +++ b/docs/reference/ggplot_rsi.html @@ -82,7 +82,7 @@ @@ -326,7 +326,7 @@translate_ab -+ a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
. Use a valuecombine_SI diff --git a/docs/reference/index.html b/docs/reference/index.html index fb04aa9c..6f6480ea 100644 --- a/docs/reference/index.html +++ b/docs/reference/index.html @@ -81,7 +81,7 @@ diff --git a/docs/reference/mo_property.html b/docs/reference/mo_property.html index 53d40444..7e9d3b80 100644 --- a/docs/reference/mo_property.html +++ b/docs/reference/mo_property.html @@ -82,7 +82,7 @@ diff --git a/docs/reference/proportion.html b/docs/reference/proportion.html index 9bd826a5..79f5d734 100644 --- a/docs/reference/proportion.html +++ b/docs/reference/proportion.html @@ -83,7 +83,7 @@ resistance() should be used to calculate resistance, susceptibility() should be @@ -296,7 +296,7 @@ resistance() should be used to calculate resistance, susceptibility() should betranslate_ab -+ a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
a column name of the antibiotics data set to translate the antibiotic abbreviations to, using
ab_property()
. Use a value
language
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 320f21f0..5dd3ef71 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -15,6 +15,9 @@
+ https://msberends.gitlab.io/AMR/reference/WHONET.html
+ https://msberends.gitlab.io/AMR/reference/ab_from_text.html
+
diff --git a/man/ab_from_text.Rd b/man/ab_from_text.Rd
new file mode 100644
index 00000000..be1bdcb8
--- /dev/null
+++ b/man/ab_from_text.Rd
@@ -0,0 +1,38 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/ab_from_text.R
+\name{ab_from_text}
+\alias{ab_from_text}
+\title{Retrieve antimicrobial drugs from text}
+\usage{
+ab_from_text(text, collapse = NULL, translate_ab = "name", ...)
+}
+\arguments{
+\item{text}{text to analyse}
+
+\item{collapse}{character to pass on to \code{paste(..., collapse = ...)} to only return one character per element of \code{text}, see Examples}
+
+\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}. Defaults to "name", which is equal to using \code{TRUE}. Use a value \code{FALSE}, \code{NULL} or \code{NA} to prevent translation of the \verb{<ab>} code.}
+
+\item{...}{parameters passed on to \code{\link[=as.ab]{as.ab()}}}
+}
+\description{
+Use this function on e.g. clinical texts from health care records. It returns a vector of antimicrobial drugs found in the texts.
+}
+\details{
+To use this for creating a new variable in a data set (e.g. with \code{mutate()}), it could be convenient to paste the outcome together with the \code{collapse} parameter so every value in your new variable will be a character of length 1:\cr
+\code{df \%>\% mutate(abx = ab_from_text(clinical_text, collapse = "|"))}
+
+This function is also internally used by \code{\link[=as.ab]{as.ab()}}, although it then only returns the first hit.
+}
+\examples{
+# mind the bad spelling of amoxicillin in this line,
+# straight from a true health care record:
+ab_from_text("28/03/2020 regular amoxicilliin 500mg po tds")
+
+ab_from_text("administered amoxi/clav and cipro")
+ab_from_text("administered amoxi/clav and cipro", collapse = ", ")
+
+# if you want to know which antibiotic groups were administered, check it:
+abx <- ab_from_text("administered amoxi/clav and cipro")
+ab_group(abx)
+}
diff --git a/man/as.ab.Rd b/man/as.ab.Rd
index fc3d097a..e2a9d9d3 100644
--- a/man/as.ab.Rd
+++ b/man/as.ab.Rd
@@ -83,5 +83,8 @@ ab_name("J01FA01") # "Erythromycin"
 ab_name("eryt") # "Erythromycin"
 }
 \seealso{
-\link{antibiotics} for the dataframe that is being used to determine ATCs.
+\itemize{
+\item \link{antibiotics} for the dataframe that is being used to determine ATCs
+\item \code{\link[=ab_from_text]{ab_from_text()}} for a function to retrieve antimicrobial drugs from clinical text (from health care records)
+}
 }
diff --git a/man/count.Rd b/man/count.Rd
index 74e7a636..e772a1c7 100644
--- a/man/count.Rd
+++ b/man/count.Rd
@@ -47,7 +47,7 @@ count_df(

 \item{data}{a \code{\link{data.frame}} containing columns with class \code{\link{rsi}} (see \code{\link[=as.rsi]{as.rsi()}})}

-\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}}
+\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}. Use a value}

 \item{language}{language of the returned text, defaults to system language (see \code{\link[=get_locale]{get_locale()}}) and can also be set with \code{getOption("AMR_locale")}.
 Use \code{language = NULL} or \code{language = ""} to prevent translation.}
diff --git a/man/ggplot_rsi.Rd b/man/ggplot_rsi.Rd
index 392861c2..2116a72f 100644
--- a/man/ggplot_rsi.Rd
+++ b/man/ggplot_rsi.Rd
@@ -83,7 +83,7 @@ labels_rsi_count(

 \item{limits}{numeric vector of length two providing limits of the scale, use \code{NA} to refer to the existing minimum or maximum}

-\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}}
+\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}. Use a value}

 \item{combine_SI}{a logical to indicate whether all values of S and I must be merged into one, so the output only consists of S+I vs. R (susceptible vs. resistant). This used to be the parameter \code{combine_IR}, but this now follows the redefinition by EUCAST about the interpretion of I (increased exposure) in 2019, see section 'Interpretation of S, I and R' below. Default is \code{TRUE}.}
diff --git a/man/proportion.Rd b/man/proportion.Rd
index bd7be656..22ea1a55 100644
--- a/man/proportion.Rd
+++ b/man/proportion.Rd
@@ -62,7 +62,7 @@ rsi_df(

 \item{data}{a \code{\link{data.frame}} containing columns with class \code{\link{rsi}} (see \code{\link[=as.rsi]{as.rsi()}})}

-\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}}
+\item{translate_ab}{a column name of the \link{antibiotics} data set to translate the antibiotic abbreviations to, using \code{\link[=ab_property]{ab_property()}}. Use a value}

 \item{language}{language of the returned text, defaults to system language (see \code{\link[=get_locale]{get_locale()}}) and can also be set with \code{getOption("AMR_locale")}.
 Use \code{language = NULL} or \code{language = ""} to prevent translation.}
diff --git a/tests/testthat/test-ab_from_text.R b/tests/testthat/test-ab_from_text.R
new file mode 100644
index 00000000..ce9be46e
--- /dev/null
+++ b/tests/testthat/test-ab_from_text.R
@@ -0,0 +1,32 @@
+# ==================================================================== #
+# TITLE #
+# Antimicrobial Resistance (AMR) Analysis #
+# #
+# SOURCE #
+# https://gitlab.com/msberends/AMR #
+# #
+# LICENCE #
+# (c) 2018-2020 Berends MS, Luz CF et al. #
+# #
+# This R package is free software; you can freely use and distribute #
+# it for both personal and commercial purposes under the terms of the #
+# GNU General Public License version 2.0 (GNU GPL-2), as published by #
+# the Free Software Foundation. #
+# #
+# We created this package for both routine data analysis and academic #
+# research and it was publicly released in the hope that it will be #
+# useful, but it comes WITHOUT ANY WARRANTY OR LIABILITY. #
+# Visit our website for more info: https://msberends.gitlab.io/AMR. #
+# ==================================================================== #
+
+context("ab_from_text.R")
+
+test_that("ab_from_text works", {
+
+  expect_identical(ab_from_text("28/03/2020 regular amoxicilliin 500mg po tds"),
+                   "Amoxicillin")
+  expect_identical(ab_from_text("28/03/2020 regular amoxicilliin 500mg po tds", translate_ab = FALSE),
+                   as.ab("AMX"))
+  expect_identical(ab_from_text("administered amoxi/clav and cipro", collapse = ", "),
+                   "Amoxicillin, Ciprofloxacin")
+})