resistance_predict.Rd
Create a prediction model to predict antimicrobial resistance for the coming years on statistically solid ground. Standard errors (SE) are returned as the columns se_min and se_max. See Examples for a real-life example.
resistance_predict(tbl, col_ab, col_date, year_min = NULL, year_max = NULL,
  year_every = 1, minimum = 30, model = "binomial", I_as_R = TRUE,
  preserve_measurements = TRUE, info = TRUE)

rsi_predict(tbl, col_ab, col_date, year_min = NULL, year_max = NULL,
  year_every = 1, minimum = 30, model = "binomial", I_as_R = TRUE,
  preserve_measurements = TRUE, info = TRUE)
Argument | Description |
---|---|
tbl | a |
col_ab | column name of |
col_date | column name of the date, will be used to calculate years if this column doesn't consist of years already |
year_min | lowest year to use in the prediction model, defaults to the lowest year in
year_max | highest year to use in the prediction model, defaults to 15 years after today |
year_every | unit of sequence between lowest year found in the data and |
minimum | minimum number of available isolates per year to include. Years with fewer observations will be estimated by the model. |
model | the statistical model of choice. Valid values are |
I_as_R | treat |
preserve_measurements | logical to indicate whether predictions of years that are actually available in the data should be overwritten with the original data. The standard errors of those years will be |
info | print textual analysis with the name and |
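
As a rough illustration of how col_date, I_as_R and minimum plausibly interact, the base R sketch below derives years from a date column, tabulates results per year and flags years with too few isolates. The data frame `isolates` and its columns are invented for the example, and counting `I` as resistant when `I_as_R = TRUE` is an assumption based on the argument name, not a description of the package's internal code.

```r
# Hypothetical input: one row per isolate, with a date and an S/I/R result
isolates <- data.frame(
  date = as.Date(c("2015-03-01", "2015-07-12", "2015-11-30",
                   "2016-01-15", "2016-02-20",
                   "2017-05-05")),
  amox = c("S", "I", "R", "S", "S", "R"),
  stringsAsFactors = FALSE
)

# Derive the year from the date column
isolates$year <- as.integer(format(isolates$date, "%Y"))

# Assumption: with I_as_R = TRUE, intermediate results count as resistant
I_as_R <- TRUE
isolates$resistant <- if (I_as_R) {
  isolates$amox %in% c("I", "R")
} else {
  isolates$amox == "R"
}

# Count resistant isolates and total observations per year
n_resistant <- aggregate(resistant ~ year, data = isolates, FUN = sum)
n_total     <- aggregate(resistant ~ year, data = isolates, FUN = length)
per_year <- data.frame(
  year         = n_total$year,
  R            = n_resistant$resistant,
  observations = n_total$resistant
)

# Years below the threshold would be left to the model instead of being
# used as measurements (threshold lowered here to suit the tiny data set)
minimum <- 2
per_year$included <- per_year$observations >= minimum

print(per_year)
```

With the real function, the default threshold is `minimum = 30`; the value 2 is used here only because the toy data set is tiny.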
data.frame with columns:
- year
- value, the same as estimated when preserve_measurements = FALSE, and a combination of observed and estimated otherwise
- se_min, the lower bound of the standard error with a minimum of 0
- se_max, the upper bound of the standard error with a maximum of 1
- observations, the total number of observations, i.e. S + I + R
- observed, the original observed values
- estimated, the estimated values, calculated by the model
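
The returned columns can be pictured with a small base R sketch. It reuses the `glm(cbind(R, S) ~ year, family = binomial)` call that appears in the Examples output; everything else is an assumption made for illustration. The yearly counts are made up, the standard errors are taken from `predict(..., se.fit = TRUE)` on the response scale, and the package's own computation may differ.

```r
# Made-up yearly counts of resistant (R) and susceptible (S) isolates
counts <- data.frame(
  year = 2010:2017,
  R    = c(1, 2, 2, 4, 5, 7, 8, 11),
  S    = c(39, 38, 38, 36, 35, 33, 32, 29)
)

# Logistic regression with a binomial distribution, as in the printed Call
fit <- glm(cbind(R, S) ~ year, family = binomial, data = counts)

# Predict for observed and future years, with standard errors
years <- data.frame(year = 2010:2020)
pred  <- predict(fit, newdata = years, type = "response", se.fit = TRUE)

result <- data.frame(
  year      = years$year,
  estimated = as.numeric(pred$fit),
  se_min    = pmax(as.numeric(pred$fit - pred$se.fit), 0),  # floor at 0
  se_max    = pmin(as.numeric(pred$fit + pred$se.fit), 1)   # cap at 1
)

# Observed proportion of resistance per year (NA for future years)
observed_by_year <- with(counts, R / (R + S))
result$observed  <- observed_by_year[match(result$year, counts$year)]

preserve_measurements <- TRUE
if (preserve_measurements) {
  has_obs <- !is.na(result$observed)
  # Keep measured values where they exist, model estimates elsewhere
  result$value <- ifelse(has_obs, result$observed, result$estimated)
  # The Examples output shows NA standard errors for years with observed data
  result$se_min[has_obs] <- NA
  result$se_max[has_obs] <- NA
} else {
  result$value <- result$estimated
}

print(result)
```

Clamping with `pmax()` and `pmin()` keeps the error band inside the valid range for a proportion, which matches the stated minimum of 0 for se_min and maximum of 1 for se_max.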
# NOT RUN {
# use it with base R:
resistance_predict(tbl = tbl[which(tbl$first_isolate == TRUE & tbl$genus == "Haemophilus"), ],
                   col_ab = "amcl", col_date = "date")

# or use dplyr so you can actually read it:
library(dplyr)
tbl %>%
  filter(first_isolate == TRUE,
         genus == "Haemophilus") %>%
  resistance_predict(amcl, date)
# }

# real-life example:
library(dplyr)
septic_patients %>%
  # get bacteria properties like genus and species
  left_join_microorganisms("mo") %>%
  # calculate first isolates
  mutate(first_isolate = first_isolate(.)) %>%
  # filter on first E. coli isolates
  filter(genus == "Escherichia",
         species == "coli",
         first_isolate == TRUE) %>%
  # predict resistance of cefotaxime for next years
  resistance_predict(col_ab = "cfot",
                     col_date = "date",
                     year_max = 2025,
                     preserve_measurements = TRUE,
                     minimum = 0)
#> NOTE: Using column `date` as input for `col_date`.
#> NOTE: Using column `patient_id` as input for `col_patient_id`.
#> 1,317 first isolates (65.9% of total)
#>
#> Logistic regression model (logit) with binomial distribution
#> ------------------------------------------------------------
#>
#> Call:
#> glm(formula = cbind(R, S) ~ year, family = binomial)
#>
#> Deviance Residuals:
#>     Min       1Q   Median       3Q      Max
#> -1.0751  -0.4675  -0.2840  -0.1530   1.5028
#>
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -686.7518   342.1219  -2.007   0.0447 *
#> year           0.3393     0.1698   1.998   0.0457 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#>     Null deviance: 13.5547  on 15  degrees of freedom
#> Residual deviance:  6.8145  on 14  degrees of freedom
#> AIC: 19.128
#>
#> Number of Fisher Scoring iterations: 6
#>
#>    year      value     se_min    se_max observations   observed    estimated
#> 1  2002 0.00000000         NA        NA           12 0.00000000 0.0005265096
#> 2  2003 0.00000000         NA        NA           13 0.00000000 0.0007390158
#> 3  2004 0.00000000         NA        NA           12 0.00000000 0.0010372031
#> 4  2005 0.00000000         NA        NA           15 0.00000000 0.0014555316
#> 5  2006 0.00000000         NA        NA           16 0.00000000 0.0020422369
#> 6  2007 0.00000000         NA        NA           17 0.00000000 0.0028647568
#> 7  2008 0.00000000         NA        NA           17 0.00000000 0.0040172166
#> 8  2009 0.00000000         NA        NA           18 0.00000000 0.0056306802
#> 9  2010 0.00000000         NA        NA           13 0.00000000 0.0078870391
#> 10 2011 0.04761905         NA        NA           21 0.04761905 0.0110375429
#> 11 2012 0.00000000         NA        NA           10 0.00000000 0.0154269569
#> 12 2013 0.00000000         NA        NA           13 0.00000000 0.0215239636
#> 13 2014 0.00000000         NA        NA           19 0.00000000 0.0299572977
#> 14 2015 0.14285714         NA        NA           14 0.14285714 0.0415545799
#> 15 2016 0.04761905         NA        NA           21 0.04761905 0.0573759332
#> 16 2017 0.05000000         NA        NA           20 0.05000000 0.0787262663
#> 17 2018 0.10711851 0.03829468 0.1759423           NA         NA 0.1071185079
#> 18 2019 0.14414813 0.03838570 0.2499106           NA         NA 0.1441481336
#> 19 2020 0.19123682 0.03582031 0.3466533           NA         NA 0.1912368226
#> 20 2021 0.24922848 0.03244999 0.4660070           NA         NA 0.2492284792
#> 21 2022 0.31789357 0.03249431 0.6032928           NA         NA 0.3178935725
#> 22 2023 0.39551054 0.04249603 0.7485251           NA         NA 0.3955105423
#> 23 2024 0.47877663 0.06981495 0.8877383           NA         NA 0.4787766284
#> 24 2025 0.56323896 0.11983506 1.0000000           NA         NA 0.5632389556

# create nice plots with ggplot (only runs when ggplot2 is available)
if (require(ggplot2)) {

  data <- septic_patients %>%
    filter(mo == as.mo("E. coli")) %>%
    resistance_predict(col_ab = "amox",
                       col_date = "date",
                       info = FALSE,
                       minimum = 15)

  ggplot(data,
         aes(x = year)) +
    geom_col(aes(y = value),
             fill = "grey75") +
    geom_errorbar(aes(ymin = se_min,
                      ymax = se_max),
                  colour = "grey50") +
    scale_y_continuous(limits = c(0, 1),
                       breaks = seq(0, 1, 0.1),
                       labels = paste0(seq(0, 100, 10), "%")) +
    labs(title = expression(paste("Forecast of amoxicillin resistance in ",
                                  italic("E. coli"))),
         y = "%IR",
         x = "Year") +
    theme_minimal(base_size = 13)
}