septic_patients.Rd
An anonymised data set containing 2,000 microbial blood culture isolates, with their full antibiograms, found in septic patients in 4 different hospitals in the Netherlands between 2001 and 2017. The data are genuine. This data.frame can be used to practice AMR analysis. For examples, press F1.
septic_patients
A data.frame with 2,000 observations and 49 variables:
date
date of receipt at the laboratory
hospital_id
ID of the hospital, from A to D
ward_icu
logical indicating whether the ward is an intensive care unit
ward_clinical
logical indicating whether the ward is a regular clinical ward
ward_outpatient
logical indicating whether the ward is an outpatient clinic
age
age of the patient
gender
gender of the patient
patient_id
ID of the patient, first 10 characters of an SHA hash containing irretrievable information
mo
ID of microorganism created with as.mo, see also microorganisms
peni:rifa
40 different antibiotics with class rsi (see as.rsi); these column names occur in the antibiotics data set and can be translated with abname
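The peni:rifa notation above denotes a contiguous range of columns, from peni through rifa. As a minimal sketch in base R, the same kind of range can be selected by position between two column names. The toy data frame below is a hypothetical stand-in with only three antibiotic columns and made-up values, and it uses plain factors rather than the package's rsi class.

```r
# Toy stand-in for septic_patients (hypothetical values, NOT the real data).
# Plain factors are used here in place of the package's rsi class.
sir <- c("S", "I", "R")
toy <- data.frame(
  patient_id = c("a1", "b2", "c3"),
  peni = factor(c("R", "R", "S"), levels = sir),
  amox = factor(c("S", "I", "R"), levels = sir),
  rifa = factor(c("R", "S", "S"), levels = sir),
  stringsAsFactors = FALSE
)

# Select every column from `peni` through `rifa` by position
first_col <- match("peni", names(toy))
last_col  <- match("rifa", names(toy))
ab_cols   <- names(toy)[first_col:last_col]

# Count S, I and R results for each antibiotic column
counts <- vapply(toy[ab_cols], function(x) as.integer(table(x)), integer(3))
rownames(counts) <- sir
counts
```

With dplyr loaded, select(toy, peni:rifa) picks the same columns.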
# ----------- #
# PREPARATION #
# ----------- #

# Save this example data set to an object, so we can edit it:
my_data <- septic_patients

# load the dplyr package to make data science A LOT easier
library(dplyr)

# Add first isolates to our data set:
my_data <- my_data %>%
  mutate(first_isolates = first_isolate(my_data, "date", "patient_id", "mo"))

# -------- #
# ANALYSIS #
# -------- #

# 1. Get the amoxicillin resistance percentages (p)
#    and numbers (n) of E. coli, divided by hospital:
my_data %>%
  filter(mo == guess_mo("E. coli"),
         first_isolates == TRUE) %>%
  group_by(hospital_id) %>%
  summarise(n = n_rsi(amox),
            p = portion_IR(amox))
#> Warning: Introducing NA: only 19 results available (minimum set to 30).
#> # A tibble: 4 x 3
#>   hospital_id     n      p
#>   <fct>       <int>  <dbl>
#> 1 A              19 NA
#> 2 B              65  0.477
#> 3 C              35  0.543
#> 4 D              94  0.5

# 2. Get the amoxicillin/clavulanic acid resistance
#    percentages of E. coli, trend over the years:
my_data %>%
  filter(mo == guess_mo("E. coli"),
         first_isolates == TRUE) %>%
  group_by(year = format(date, "%Y")) %>%
  summarise(n = n_rsi(amcl),
            p = portion_IR(amcl, minimum = 20))
#> Warning: Introducing NA: only 13 results available (minimum set to 20).
#> Warning: Introducing NA: only 14 results available (minimum set to 20).
#> Warning: Introducing NA: only 13 results available (minimum set to 20).
#> Warning: Introducing NA: only 15 results available (minimum set to 20).
#> Warning: Introducing NA: only 16 results available (minimum set to 20).
#> Warning: Introducing NA: only 17 results available (minimum set to 20).
#> Warning: Introducing NA: only 17 results available (minimum set to 20).
#> Warning: Introducing NA: only 18 results available (minimum set to 20).
#> Warning: Introducing NA: only 13 results available (minimum set to 20).
#> Warning: Introducing NA: only 10 results available (minimum set to 20).
#> Warning: Introducing NA: only 13 results available (minimum set to 20).
#> Warning: Introducing NA: only 14 results available (minimum set to 20).
#> # A tibble: 16 x 3
#>    year      n       p
#>    <chr> <int>   <dbl>
#>  1 2002     13 NA
#>  2 2003     14 NA
#>  3 2004     13 NA
#>  4 2005     15 NA
#>  5 2006     16 NA
#>  6 2007     17 NA
#>  7 2008     17 NA
#>  8 2009     18 NA
#>  9 2010     13 NA
#> 10 2011     21  0.0952
#> 11 2012     10 NA
#> 12 2013     13 NA
#> 13 2014     20  0.2
#> 14 2015     14 NA
#> 15 2016     21  0.190
#> 16 2017     20  0.4
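
# The NA values and warnings in the output above come from the minimum
# number of results required before a resistance portion is reported.
# The following base R sketch illustrates that thresholding. The helper
# portion_ir_sketch() is hypothetical and is not the package's implementation
# of portion_IR(); the input vectors are constructed to mirror counts shown in
# the output, not taken from the data set.

```r
# Hypothetical helper: share of I and R among available results,
# or NA (with a warning) when fewer than `minimum` results are available.
portion_ir_sketch <- function(x, minimum = 30) {
  x <- x[!is.na(x)]          # ignore missing results
  n <- length(x)
  if (n < minimum) {
    warning("Introducing NA: only ", n, " results available ",
            "(minimum set to ", minimum, ").", call. = FALSE)
    return(NA_real_)
  }
  sum(x %in% c("I", "R")) / n
}

# 94 results, 47 of them I or R: enough for the default minimum of 30
large <- c(rep("R", 40), rep("I", 7), rep("S", 47))
p_large <- portion_ir_sketch(large)

# 19 results: below the default minimum, so the result is NA
small <- c(rep("R", 10), rep("S", 9))
p_small <- suppressWarnings(portion_ir_sketch(small))

# Exactly 20 results with minimum = 20: the portion is still reported
edge <- c(rep("R", 8), rep("S", 12))
p_edge <- portion_ir_sketch(edge, minimum = 20)
```

# In this sketch a group whose size equals the minimum is kept, which matches
# the rows above where n is 20 and a portion is still shown with minimum = 20.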