resistance_predict.Rd
Create a prediction model to predict antimicrobial resistance for the next years on statistically solid ground. Standard errors (SE) will be returned as columns se_min and se_max. See Examples for a real-life example.
resistance_predict(tbl, col_ab, col_date, year_min = NULL, year_max = NULL,
  year_every = 1, minimum = 30, model = "binomial", I_as_R = TRUE,
  preserve_measurements = TRUE, info = TRUE)

rsi_predict(tbl, col_ab, col_date, year_min = NULL, year_max = NULL,
  year_every = 1, minimum = 30, model = "binomial", I_as_R = TRUE,
  preserve_measurements = TRUE, info = TRUE)
Argument | Description |
---|---|
tbl | a |
col_ab | column name of |
col_date | column name of the date, will be used to calculate years if this column doesn't consist of years already |
year_min | lowest year to use in the prediction model, defaults to the lowest year in |
year_max | highest year to use in the prediction model, defaults to 15 years after today |
year_every | unit of sequence between lowest year found in the data and |
minimum | minimum number of available isolates per year to include. Years containing fewer observations will be estimated by the model. |
model | the statistical model of choice. Valid values are |
I_as_R | treat |
preserve_measurements | logical to indicate whether predictions of years that are actually available in the data should be overwritten with the original data. The standard errors of those years will be |
info | print textual analysis with the name and |
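The arguments col_date, I_as_R and minimum together decide which yearly counts reach the model. The following is a minimal base R sketch of that preparation step, not the package's actual implementation: the `isolates` data frame is made up, the threshold of 2 is chosen only so the toy data shows the effect (the documented default is 30), and the way I results are folded into R is an assumption based on the argument descriptions above.

```r
# Toy isolate-level data (made up for illustration only)
isolates <- data.frame(
  date = as.Date(c("2015-03-01", "2015-07-12", "2015-11-30",
                   "2016-01-15", "2016-02-20",
                   "2017-05-05")),
  amox = c("R", "S", "I", "S", "R", "S"),
  stringsAsFactors = FALSE
)

I_as_R  <- TRUE  # assumption: intermediate (I) results are counted as resistant
minimum <- 2     # hypothetical threshold for this toy data; documented default is 30

# Derive the year from the date column
isolates$year <- as.integer(format(isolates$date, "%Y"))

# Optionally fold I into R before counting
result <- isolates$amox
if (I_as_R) result[result == "I"] <- "R"

# Count resistant, susceptible and total isolates per year
counts <- data.frame(year = sort(unique(isolates$year)))
counts$R <- as.integer(tapply(result == "R", isolates$year, sum))
counts$S <- as.integer(tapply(result == "S", isolates$year, sum))
counts$observations <- as.integer(tapply(result, isolates$year, length))

# Keep only years with enough isolates; the other years would be
# left for the model to estimate
usable <- counts[counts$observations >= minimum, ]
print(usable)
```

In this toy data 2017 has a single isolate, so it falls below the threshold and is dropped from `usable`.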
data.frame with columns:

year
value, the same as estimated when preserve_measurements = FALSE, and a combination of observed and estimated otherwise
se_min, the lower bound of the standard error with a minimum of 0
se_max, the upper bound of the standard error with a maximum of 1
observations, the total number of observations, i.e. S + I + R
observed, the original observed values
estimated, the estimated values, calculated by the model
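The relationship between these columns can be sketched with base R. This is an illustration under stated assumptions, not the package's own code: the yearly `counts` are made up, the model formula mirrors the `glm(cbind(R, S) ~ year, family = binomial)` call printed in the Examples, and the bounds are assumed to be the fitted value minus or plus one standard error on the response scale, clamped to the range 0 to 1.

```r
# Toy yearly counts (made up for illustration only)
counts <- data.frame(
  year = 2010:2017,
  R    = c(1, 2, 2, 4, 5, 5, 8, 9),
  S    = c(39, 38, 38, 36, 35, 35, 32, 31)
)

# Logistic regression on resistant vs. susceptible counts per year
fit <- glm(cbind(R, S) ~ year, family = binomial, data = counts)

# Predict on the response (probability) scale for observed and future years
years <- data.frame(year = 2010:2020)
pred  <- predict(fit, newdata = years, type = "response", se.fit = TRUE)

out <- data.frame(
  year      = years$year,
  estimated = as.numeric(pred$fit),
  # assumption: bounds are fit -/+ one SE, clamped to [0, 1]
  se_min    = pmax(as.numeric(pred$fit - pred$se.fit), 0),
  se_max    = pmin(as.numeric(pred$fit + pred$se.fit), 1)
)

# Attach observed proportions where data exist (NA for future years)
out$observed <- (counts$R / (counts$R + counts$S))[match(out$year, counts$year)]

# Sketch of preserve_measurements = TRUE: observed values take precedence
# and, as in the example output, those years carry no standard error
preserve_measurements <- TRUE
if (preserve_measurements) {
  has_obs    <- !is.na(out$observed)
  out$value  <- ifelse(has_obs, out$observed, out$estimated)
  out$se_min[has_obs] <- NA
  out$se_max[has_obs] <- NA
} else {
  out$value <- out$estimated
}
print(out)
```

Setting `preserve_measurements <- FALSE` in this sketch makes `value` identical to `estimated` for every year, which matches the description of the value column above.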
# NOT RUN {
# use it with base R:
resistance_predict(tbl = tbl[which(first_isolate == TRUE & genus == "Haemophilus"),],
                   col_ab = "amcl", col_date = "date")

# or use dplyr so you can actually read it:
library(dplyr)
tbl %>%
  filter(first_isolate == TRUE,
         genus == "Haemophilus") %>%
  resistance_predict(amcl, date)
# }

# real-life example:
library(dplyr)
septic_patients %>%
  # get bacteria properties like genus and species
  left_join_microorganisms("mo") %>%
  # calculate first isolates
  mutate(first_isolate = first_isolate(.)) %>%
  # filter on first E. coli isolates
  filter(genus == "Escherichia",
         species == "coli",
         first_isolate == TRUE) %>%
  # predict resistance of cefotaxime for next years
  resistance_predict(col_ab = "cfot",
                     col_date = "date",
                     year_max = 2025,
                     preserve_measurements = TRUE,
                     minimum = 0)
#> NOTE: Using column `date` as input for `col_date`.
#> NOTE: Using column `patient_id` as input for `col_patient_id`.
#> 1,315 first isolates (65.8% of total)
#>
#> Logistic regression model (logit) with binomial distribution
#> ------------------------------------------------------------
#>
#> Call:
#> glm(formula = cbind(R, S) ~ year, family = binomial)
#>
#> Deviance Residuals:
#>     Min       1Q   Median       3Q      Max
#> -1.0796  -0.4714  -0.2849  -0.1534   1.5711
#>
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -687.3844   340.9570  -2.016   0.0438 *
#> year           0.3396     0.1692   2.007   0.0448 *
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#>     Null deviance: 13.8352  on 15  degrees of freedom
#> Residual deviance:  7.0501  on 14  degrees of freedom
#> AIC: 19.351
#>
#> Number of Fisher Scoring iterations: 6
#>
#>    year      value     se_min    se_max observations   observed    estimated
#> 1  2002 0.00000000         NA        NA           12 0.00000000 0.0005290038
#> 2  2003 0.00000000         NA        NA           13 0.00000000 0.0007427521
#> 3  2004 0.00000000         NA        NA           12 0.00000000 0.0010427770
#> 4  2005 0.00000000         NA        NA           15 0.00000000 0.0014638156
#> 5  2006 0.00000000         NA        NA           16 0.00000000 0.0020545058
#> 6  2007 0.00000000         NA        NA           17 0.00000000 0.0028828678
#> 7  2008 0.00000000         NA        NA           17 0.00000000 0.0040438659
#> 8  2009 0.00000000         NA        NA           18 0.00000000 0.0056697665
#> 9  2010 0.00000000         NA        NA           13 0.00000000 0.0079441719
#> 10 2011 0.04761905         NA        NA           21 0.04761905 0.0111207424
#> 11 2012 0.00000000         NA        NA           10 0.00000000 0.0155475955
#> 12 2013 0.00000000         NA        NA           13 0.00000000 0.0216979872
#> 13 2014 0.00000000         NA        NA           19 0.00000000 0.0302067265
#> 14 2015 0.15384615         NA        NA           13 0.15384615 0.0419091804
#> 15 2016 0.04761905         NA        NA           21 0.04761905 0.0578747498
#> 16 2017 0.05000000         NA        NA           20 0.05000000 0.0794183382
#> 17 2018 0.10806159 0.03882360 0.1772996           NA         NA 0.1080615861
#> 18 2019 0.14540370 0.03910717 0.2517002           NA         NA 0.1454037037
#> 19 2020 0.19285970 0.03682518 0.3488942           NA         NA 0.1928597004
#> 20 2021 0.25125053 0.03388540 0.4686157           NA         NA 0.2512505261
#> 21 2022 0.32030440 0.03456588 0.6060429           NA         NA 0.3203043985
#> 22 2023 0.39824269 0.04543193 0.7510534           NA         NA 0.3982426891
#> 23 2024 0.48170517 0.07378334 0.8896270           NA         NA 0.4817051695
#> 24 2025 0.56620119 0.12484502 1.0000000           NA         NA 0.5662011910

# create nice plots with ggplot
# (run the block only when ggplot2 can be loaded)
if (require(ggplot2)) {

  data <- septic_patients %>%
    filter(mo == as.mo("E. coli")) %>%
    resistance_predict(col_ab = "amox",
                       col_date = "date",
                       info = FALSE,
                       minimum = 15)

  ggplot(data,
         aes(x = year)) +
    geom_col(aes(y = value),
             fill = "grey75") +
    geom_errorbar(aes(ymin = se_min,
                      ymax = se_max),
                  colour = "grey50") +
    scale_y_continuous(limits = c(0, 1),
                       breaks = seq(0, 1, 0.1),
                       labels = paste0(seq(0, 100, 10), "%")) +
    labs(title = expression(paste("Forecast of amoxicillin resistance in ",
                                  italic("E. coli"))),
         y = "%IR",
         x = "Year") +
    theme_minimal(base_size = 13)
}