From 15fc72fc6628c87968176cba409becc51d08ce41 Mon Sep 17 00:00:00 2001 From: Matthijs Berends Date: Thu, 19 Dec 2024 20:17:15 +0100 Subject: [PATCH] (v2.1.1.9121) support tidymodels --- .Rbuildignore | 1 + DESCRIPTION | 5 +- NEWS.md | 8 +- PythonPackage/AMR/AMR.egg-info/PKG-INFO | 2 +- PythonPackage/AMR/AMR/functions.py | 4 +- ...ny.whl => AMR-2.1.1.9121-py3-none-any.whl} | Bin 9959 -> 9961 bytes PythonPackage/AMR/dist/amr-2.1.1.9120.tar.gz | Bin 9730 -> 0 bytes PythonPackage/AMR/dist/amr-2.1.1.9121.tar.gz | Bin 0 -> 9721 bytes PythonPackage/AMR/setup.py | 2 +- R/aa_helper_functions.R | 5 +- R/mdro.R | 243 +++++++++++++----- _pkgdown.yml | 3 + ....txt => gpt_training_text_v2.1.1.9121.txt} | 230 ++++++++++++++++- inst/tinytest/test-mdro.R | 2 +- man/mdro.Rd | 31 ++- vignettes/AMR_with_tidymodels.Rmd | 191 ++++++++++++++ 16 files changed, 638 insertions(+), 89 deletions(-) rename PythonPackage/AMR/dist/{AMR-2.1.1.9120-py3-none-any.whl => AMR-2.1.1.9121-py3-none-any.whl} (75%) delete mode 100644 PythonPackage/AMR/dist/amr-2.1.1.9120.tar.gz create mode 100644 PythonPackage/AMR/dist/amr-2.1.1.9121.tar.gz rename data-raw/{gpt_training_text_v2.1.1.9120.txt => gpt_training_text_v2.1.1.9121.txt} (98%) create mode 100644 vignettes/AMR_with_tidymodels.Rmd diff --git a/.Rbuildignore b/.Rbuildignore index 19b952f7..6fd7fa1c 100755 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -25,6 +25,7 @@ ^tests/testthat/_snaps$ ^vignettes/AMR\.Rmd$ ^vignettes/AMR_intro\.png$ +^vignettes/AMR_with_tidymodels\.Rmd$ ^vignettes/benchmarks\.Rmd$ ^vignettes/benchmarks\.Rmd\.not$ ^vignettes/datasets\.Rmd$ diff --git a/DESCRIPTION b/DESCRIPTION index 0a43bb80..696a98db 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,6 +1,6 @@ Package: AMR -Version: 2.1.1.9120 -Date: 2024-12-15 +Version: 2.1.1.9121 +Date: 2024-12-19 Title: Antimicrobial Resistance Data Analysis Description: Functions to simplify and standardise antimicrobial resistance (AMR) data analysis and to work with microbial and antimicrobial 
properties by @@ -47,6 +47,7 @@ Suggests: rvest, skimr, tibble, + tidymodels, tidyselect, tinytest, vctrs, diff --git a/NEWS.md b/NEWS.md index ae6ba381..0d8ce2f6 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,4 +1,4 @@ -# AMR 2.1.1.9120 +# AMR 2.1.1.9121 *(this beta version will eventually become v3.0. We're happy to reach a new major milestone soon, which will be all about the new One Health support! Install this beta using [the instructions here](https://msberends.github.io/AMR/#latest-development-version).)* @@ -30,6 +30,8 @@ This package now supports not only tools for AMR data analysis in clinical setti * New function `rescale_mic()`, which allows users to rescale MIC values to a manually set range. This is the powerhouse behind the `scale_*_mic()` functions, but it can be used independently to, for instance, compare equality in MIC distributions by rescaling them to the same range first. * **Support for Python** * While using R for the heavy lifting, [our 'AMR' Python Package](https://pypi.org/project/AMR/) was developed to run the AMR R package natively in Python. The Python package will always have the same version number as the R package, as it is built automatically with every code change. +* **Support for `tidymodels`** + * All antimicrobial selectors (such as `aminoglycosides()` and `betalactams()`) are now supported in `tidymodels` packages such as `recipes` and `parsnip`. For more info, see [our tutorial](https://msberends.github.io/AMR/articles/AMR_with_tidymodels.html) on using AMR functions for predictive modelling. * **Other** * New function `mo_group_members()` to retrieve the member microorganisms of a microorganism group. For example, `mo_group_members("Strep group C")` returns a vector of all microorganisms that belong to that group. 
@@ -73,7 +75,9 @@ This package now supports not only tools for AMR data analysis in clinical setti * Updated the prevalence calculation to include genera from the World Health Organization's (WHO) Priority Pathogen List * Improved algorithm of `first_isolate()` when using the phenotype-based method, to prioritise records with the highest availability of SIR values * `scale_y_percent()` can now cope with ranges outside the 0-100% range -* Support for new Dutch national MDRO guideline (SRI-richtlijn BRMO, Nov 2024) +* MDRO determination (using `mdro()`) + * Implemented the new Dutch national MDRO guideline (SRI-richtlijn BRMO, Nov 2024) + * Added arguments `esbl`, `carbapenemase`, `mecA`, `mecC`, `vanA`, `vanB` to denote column names or logical values indicating presence of these genes (or production of their proteins) ## Other * Greatly improved `vctrs` integration, a Tidyverse package working in the background for many Tidyverse functions. For users, this means that functions such as `dplyr`'s `bind_rows()`, `rowwise()` and `c_across()` are now supported for e.g. columns of class `mic`. Despite this, this `AMR` package is still zero-dependent on any other package, including `dplyr` and `vctrs`. 
diff --git a/PythonPackage/AMR/AMR.egg-info/PKG-INFO b/PythonPackage/AMR/AMR.egg-info/PKG-INFO index d41c9396..347e5180 100644 --- a/PythonPackage/AMR/AMR.egg-info/PKG-INFO +++ b/PythonPackage/AMR/AMR.egg-info/PKG-INFO @@ -1,6 +1,6 @@ Metadata-Version: 2.1 Name: AMR -Version: 2.1.1.9120 +Version: 2.1.1.9121 Summary: A Python wrapper for the AMR R package Home-page: https://github.com/msberends/AMR Author: Matthijs Berends diff --git a/PythonPackage/AMR/AMR/functions.py b/PythonPackage/AMR/AMR/functions.py index a2f07d44..25a4e528 100644 --- a/PythonPackage/AMR/AMR/functions.py +++ b/PythonPackage/AMR/AMR/functions.py @@ -441,9 +441,9 @@ def mdr_tb(x = None, *args, **kwargs): def mdr_cmi2012(x = None, *args, **kwargs): """See our website of the R package for the manual: https://msberends.github.io/AMR/index.html""" return convert_to_python(amr_r.mdr_cmi2012(x = None, *args, **kwargs)) -def eucast_exceptional_phenotypes(x = None, *args, **kwargs): +def eucast_exceptional_phenotypes(*args, **kwargs): """See our website of the R package for the manual: https://msberends.github.io/AMR/index.html""" - return convert_to_python(amr_r.eucast_exceptional_phenotypes(x = None, *args, **kwargs)) + return convert_to_python(amr_r.eucast_exceptional_phenotypes(*args, **kwargs)) def mean_amr_distance(x, *args, **kwargs): """See our website of the R package for the manual: https://msberends.github.io/AMR/index.html""" return convert_to_python(amr_r.mean_amr_distance(x, *args, **kwargs))
diff --git a/PythonPackage/AMR/dist/AMR-2.1.1.9120-py3-none-any.whl b/PythonPackage/AMR/dist/AMR-2.1.1.9121-py3-none-any.whl similarity index 75% rename from PythonPackage/AMR/dist/AMR-2.1.1.9120-py3-none-any.whl rename to PythonPackage/AMR/dist/AMR-2.1.1.9121-py3-none-any.whl index 65c72e0da6dcd51a615c3e064314ae5dccd7ee60..29d89122cefe5142eaf5b681049849d0be5debcd 100644 Binary files a/PythonPackage/AMR/dist/AMR-2.1.1.9120-py3-none-any.whl and b/PythonPackage/AMR/dist/AMR-2.1.1.9121-py3-none-any.whl differ
diff --git a/PythonPackage/AMR/dist/amr-2.1.1.9120.tar.gz b/PythonPackage/AMR/dist/amr-2.1.1.9120.tar.gz deleted file mode 100644 index fc8f95c68b4653fd466d1b871897ce174ccf8f29..0000000000000000000000000000000000000000 Binary files a/PythonPackage/AMR/dist/amr-2.1.1.9120.tar.gz and /dev/null differ
diff --git a/PythonPackage/AMR/dist/amr-2.1.1.9121.tar.gz b/PythonPackage/AMR/dist/amr-2.1.1.9121.tar.gz new file mode 100644 index 0000000000000000000000000000000000000000..e6ad65a71454ff69557133d61bc9773452d81f0f Binary files /dev/null and b/PythonPackage/AMR/dist/amr-2.1.1.9121.tar.gz differ
diff --git a/PythonPackage/AMR/setup.py b/PythonPackage/AMR/setup.py index 1854d100..8ea2c7a8
100644 --- a/PythonPackage/AMR/setup.py +++ b/PythonPackage/AMR/setup.py @@ -2,7 +2,7 @@ from setuptools import setup, find_packages setup( name='AMR', - version='2.1.1.9120', + version='2.1.1.9121', packages=find_packages(), install_requires=[ 'rpy2', diff --git a/R/aa_helper_functions.R b/R/aa_helper_functions.R index 7cea3406..36008387 100644 --- a/R/aa_helper_functions.R +++ b/R/aa_helper_functions.R @@ -988,7 +988,7 @@ get_current_data <- function(arg_name, call) { for (env in frms[which(with_mask)]) { if (is.function(env$mask$current_rows) && (valid_df(env$data) || valid_df(env$`.data`))) { # an element `.data` or `data` (containing all data) and `mask` (containing functions) will be in the environment when using dplyr verbs - # we use their mask$current_rows() to get the group rows, since dplyr::cur_data_all() is deprecated and will be removed in the future + # we use their mask$current_rows() below to get the group rows, since dplyr::cur_data_all() is deprecated and will be removed in the future # e.g. for `example_isolates %>% group_by(ward) %>% mutate(first = first_isolate(.))` if (valid_df(env$data)) { # support for dplyr 1.1.x @@ -1008,6 +1008,9 @@ get_current_data <- function(arg_name, call) { if (valid_df(env$`.data`)) { # an element `.data` will be in the environment when using dplyr::select() return(env$`.data`) + } else if (valid_df(env$data)) { + # an element `data` will be in the environment when using older dplyr versions, or tidymodels + return(env$data) } else if (valid_df(env$xx)) { # an element `xx` will be in the environment for rows + cols in base R, e.g. `example_isolates[c(1:3), carbapenems()]` return(env$xx) diff --git a/R/mdro.R b/R/mdro.R index b7afff1e..eeddb29c 100755 --- a/R/mdro.R +++ b/R/mdro.R @@ -32,6 +32,12 @@ #' Determine which isolates are multidrug-resistant organisms (MDRO) according to international, national, or custom guidelines. #' @param x a [data.frame] with antibiotics columns, like `AMX` or `amox`. 
Can be left blank for automatic determination. #' @param guideline a specific guideline to follow, see sections *Supported international / national guidelines* and *Using Custom Guidelines* below. When left empty, the publication by Magiorakos *et al.* (see below) will be followed. +#' @param esbl [logical] values, or a column name containing logical values, indicating the presence of an ESBL gene (or production of its proteins) +#' @param carbapenemase [logical] values, or a column name containing logical values, indicating the presence of a carbapenemase gene (or production of its proteins) +#' @param mecA [logical] values, or a column name containing logical values, indicating the presence of a *mecA* gene (or production of its proteins) +#' @param mecC [logical] values, or a column name containing logical values, indicating the presence of a *mecC* gene (or production of its proteins) +#' @param vanA [logical] values, or a column name containing logical values, indicating the presence of a *vanA* gene (or production of its proteins) +#' @param vanB [logical] values, or a column name containing logical values, indicating the presence of a *vanB* gene (or production of its proteins) #' @param ... in case of [custom_mdro_guideline()]: a set of rules, see section *Using Custom Guidelines* below. Otherwise: column name of an antibiotic, see section *Antibiotics* below. 
#' @param as_factor a [logical] to indicate whether the returned value should be an ordered [factor] (`TRUE`, default), or otherwise a [character] vector #' @inheritParams eucast_rules @@ -177,6 +183,12 @@ mdro <- function(x = NULL, guideline = "CMI2012", col_mo = NULL, + esbl = NA, + carbapenemase = NA, + mecA = NA, + mecC = NA, + vanA = NA, + vanB = NA, info = interactive(), pct_required_classes = 0.5, combine_SI = TRUE, @@ -190,9 +202,13 @@ mdro <- function(x = NULL, } meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0 meet_criteria(guideline, allow_class = c("list", "character"), allow_NULL = TRUE) - if (!is.list(guideline)) { - meet_criteria(guideline, allow_class = "character", has_length = 1, allow_NULL = TRUE) - } + if (!is.list(guideline)) meet_criteria(guideline, allow_class = "character", has_length = 1, allow_NULL = TRUE) + meet_criteria(esbl, allow_class = c("logical", "character"), allow_NA = TRUE) + meet_criteria(carbapenemase, allow_class = c("logical", "character"), allow_NA = TRUE) + meet_criteria(mecA, allow_class = c("logical", "character"), allow_NA = TRUE) + meet_criteria(mecC, allow_class = c("logical", "character"), allow_NA = TRUE) + meet_criteria(vanA, allow_class = c("logical", "character"), allow_NA = TRUE) + meet_criteria(vanB, allow_class = c("logical", "character"), allow_NA = TRUE) meet_criteria(col_mo, allow_class = "character", has_length = 1, is_in = colnames(x), allow_NULL = TRUE) meet_criteria(info, allow_class = "logical", has_length = 1) meet_criteria(pct_required_classes, allow_class = "numeric", has_length = 1) @@ -203,7 +219,51 @@ mdro <- function(x = NULL, if (!any(is_sir_eligible(x))) { stop_("There were no possible SIR columns found in the data set. 
Transform columns with `as.sir()` for valid antimicrobial interpretations.") } - + + # get gene values as TRUE/FALSE + if (is.character(esbl)) { + meet_criteria(esbl, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + esbl <- x[[esbl]] + meet_criteria(esbl, allow_class = "logical", allow_NA = TRUE) + } else if (length(esbl) == 1) { + esbl <- rep(esbl, NROW(x)) + } + if (is.character(carbapenemase)) { + meet_criteria(carbapenemase, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + carbapenemase <- x[[carbapenemase]] + meet_criteria(carbapenemase, allow_class = "logical", allow_NA = TRUE) + } else if (length(carbapenemase) == 1) { + carbapenemase <- rep(carbapenemase, NROW(x)) + } + if (is.character(mecA)) { + meet_criteria(mecA, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + mecA <- x[[mecA]] + meet_criteria(mecA, allow_class = "logical", allow_NA = TRUE) + } else if (length(mecA) == 1) { + mecA <- rep(mecA, NROW(x)) + } + if (is.character(mecC)) { + meet_criteria(mecC, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + mecC <- x[[mecC]] + meet_criteria(mecC, allow_class = "logical", allow_NA = TRUE) + } else if (length(mecC) == 1) { + mecC <- rep(mecC, NROW(x)) + } + if (is.character(vanA)) { + meet_criteria(vanA, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + vanA <- x[[vanA]] + meet_criteria(vanA, allow_class = "logical", allow_NA = TRUE) + } else if (length(vanA) == 1) { + vanA <- rep(vanA, NROW(x)) + } + if (is.character(vanB)) { + meet_criteria(vanB, is_in = colnames(x), allow_NA = FALSE, has_length = 1) + vanB <- x[[vanB]] + meet_criteria(vanB, allow_class = "logical", allow_NA = TRUE) + } else if (length(vanB) == 1) { + vanB <- rep(vanB, NROW(x)) + } + info.bak <- info # don't throw info's more than once per call if (isTRUE(info)) { @@ -476,7 +536,7 @@ mdro <- function(x = NULL, if (!"AMP" %in% names(cols_ab) && "AMX" %in% names(cols_ab)) { # ampicillin column is missing, but amoxicillin is available if (isTRUE(info)) { 
- message_("Using column '", cols_ab[names(cols_ab) == "AMX"], "' as input for ampicillin since many EUCAST rules depend on it.") + message_("Using column '", cols_ab[names(cols_ab) == "AMX"], "' as input for ampicillin since many MDRO rules depend on it.") } cols_ab <- c(cols_ab, c(AMP = unname(cols_ab[names(cols_ab) == "AMX"]))) } @@ -663,6 +723,17 @@ mdro <- function(x = NULL, out[is.na(out)] <- FALSE out } + col_values <- function(df, col, return_if_lacking = "") { + if (col %in% colnames(df)) { + df[[col]] + } else { + rep(return_if_lacking, NROW(df)) + } + } + NA_as_FALSE <- function(x) { + x[is.na(x)] <- FALSE + x + } # antibiotic classes # nolint start @@ -677,6 +748,10 @@ mdro <- function(x = NULL, # helper function for editing the table trans_tbl <- function(to, rows, cols, any_all, reason = NULL) { + cols.bak <- cols + if (identical(cols, "any")) { + cols <- unique(cols_ab) + } cols <- cols[!ab_missing(cols)] cols <- cols[!is.na(cols)] if (length(rows) > 0 && length(cols) > 0) { @@ -690,7 +765,7 @@ mdro <- function(x = NULL, x[rows, "columns_nonsusceptible"] <<- vapply( FUN.VALUE = character(1), rows, - function(row, group_vct = cols) { + function(row, group_vct = cols_ab) { cols_nonsus <- vapply( FUN.VALUE = logical(1), x[row, group_vct, drop = FALSE], @@ -717,7 +792,7 @@ mdro <- function(x = NULL, rows_affected <- vapply( FUN.VALUE = logical(1), x_transposed, - function(y) search_function(y %in% search_result, na.rm = TRUE) + function(y) search_function(y %in% search_result, na.rm = TRUE) | identical(cols.bak, "any") ) rows_affected <- x[which(rows_affected), "row_number", drop = TRUE] rows_to_change <- rows[rows %in% rows_affected] @@ -1449,62 +1524,83 @@ mdro <- function(x = NULL, if (length(ESBLs) > 0) { trans_tbl( 2, # positive, unconfirmed - which(x$order == "Enterobacterales" & x[[ESBLs[1]]] == "R" & x[[ESBLs[2]]] == "R"), - c(AMX %or% AMP, cephalosporins_3rd), - "all", - reason = "Enterobacterales: ESBL" + rows = which(x$order == 
"Enterobacterales" & x[[ESBLs[1]]] == "R" & x[[ESBLs[2]]] == "R" & is.na(esbl)), + cols = c(AMX %or% AMP, cephalosporins_3rd), + any_all = "all", + reason = "Enterobacterales: potential ESBL" ) } trans_tbl( 3, # positive - which(x$order == "Enterobacterales" & (x$genus %in% c("Proteus", "Providencia") | paste(x$genus, x$species) %in% c("Serratia marcescens", "Morganella morganii"))), - carbapenems_without_imipenem, - "any", - reason = "Enterobacterales: carbapenem or carbapenemase" + rows = which(x$order == "Enterobacterales" & esbl == TRUE), + cols = "any", + any_all = "any", + reason = "Enterobacterales: ESBL" ) trans_tbl( 3, - which(x$order == "Enterobacterales" & !(x$genus %in% c("Proteus", "Providencia") | paste(x$genus, x$species) %in% c("Serratia marcescens", "Morganella morganii"))), - carbapenems, - "any", - reason = "Enterobacterales: carbapenem or carbapenemase" + rows = which(x$order == "Enterobacterales" & (x$genus %in% c("Proteus", "Providencia") | paste(x$genus, x$species) %in% c("Serratia marcescens", "Morganella morganii"))), + cols = carbapenems_without_imipenem, + any_all = "any", + reason = "Enterobacterales: carbapenem resistance" ) trans_tbl( 3, - which(x[[SXT]] == "R" & + rows = which(x$order == "Enterobacterales" & !(x$genus %in% c("Proteus", "Providencia") | paste(x$genus, x$species) %in% c("Serratia marcescens", "Morganella morganii"))), + cols = carbapenems, + any_all = "any", + reason = "Enterobacterales: carbapenem resistance" + ) + trans_tbl( + 3, + rows = which(x$order == "Enterobacterales" & carbapenemase == TRUE), + cols = "any", + any_all = "any", + reason = "Enterobacterales: carbapenemase" + ) + trans_tbl( + 3, + rows = which(x[[SXT]] == "R" & (x[[GEN]] == "R" | x[[TOB]] == "R" | x[[AMK]] == "R") & (x[[CIP]] == "R" | x[[NOR]] == "R" | x[[LVX]] == "R") & (x$genus %in% c("Enterobacter", "Providencia") | paste(x$genus, x$species) %in% c("Citrobacter freundii", "Klebsiella aerogenes", "Hafnia alvei", "Morganella morganii"))), - 
c(SXT, aminoglycosides, fluoroquinolones), - "any", + cols = c(SXT, aminoglycosides, fluoroquinolones), + any_all = "any", reason = "Enterobacterales group II: aminoglycoside + fluoroquinolone + cotrimoxazol" ) trans_tbl( 3, - which(x[[SXT]] == "R" & + rows = which(x[[SXT]] == "R" & x[[GEN]] == "R" & (x[[CIP]] == "R" | x[[NOR]] == "R" | x[[LVX]] == "R") & paste(x$genus, x$species) == "Serratia marcescens"), - c(SXT, aminoglycosides_serratia_marcescens, fluoroquinolones), - "any", + cols = c(SXT, aminoglycosides_serratia_marcescens, fluoroquinolones), + any_all = "any", reason = "Enterobacterales group II: aminoglycoside + fluoroquinolone + cotrimoxazol" ) # Acinetobacter baumannii-calcoaceticus complex trans_tbl( 3, - which((x[[GEN]] == "R" | x[[TOB]] == "R" | x[[AMK]] == "R") & + rows = which((x[[GEN]] == "R" | x[[TOB]] == "R" | x[[AMK]] == "R") & (x[[CIP]] == "R" | x[[LVX]] == "R") & x[[col_mo]] %in% AMR::microorganisms.groups$mo[AMR::microorganisms.groups$mo_group_name == "Acinetobacter baumannii complex"]), - c(aminoglycosides, CIP, LVX), - "any", + cols = c(aminoglycosides, CIP, LVX), + any_all = "any", reason = "A. baumannii-calcoaceticus complex: aminoglycoside + ciprofloxacin or levofloxacin" ) trans_tbl( 2, # unconfirmed - which(x[[col_mo]] %in% AMR::microorganisms.groups$mo[AMR::microorganisms.groups$mo_group_name == "Acinetobacter baumannii complex"]), - carbapenems, - "any", + rows = which(x[[col_mo]] %in% AMR::microorganisms.groups$mo[AMR::microorganisms.groups$mo_group_name == "Acinetobacter baumannii complex"] & is.na(carbapenemase)), + cols = carbapenems, + any_all = "any", + reason = "A. baumannii-calcoaceticus complex: potential carbapenemase" + ) + trans_tbl( + 3, + rows = which(x[[col_mo]] %in% AMR::microorganisms.groups$mo[AMR::microorganisms.groups$mo_group_name == "Acinetobacter baumannii complex"] & carbapenemase == TRUE), + cols = carbapenems, + any_all = "any", reason = "A. 
baumannii-calcoaceticus complex: carbapenemase" ) @@ -1513,59 +1609,65 @@ mdro <- function(x = NULL, # take pip/tazo if just pip is not available - many labs only test for pip/tazo because of availability on a Vitek card PIP <- TZP } - if (!ab_missing(MEM) && !ab_missing(IPM) && - !ab_missing(GEN) && !ab_missing(TOB) && - !ab_missing(CIP) && - !ab_missing(CAZ) && - !ab_missing(PIP)) { - x$psae <- 0 - x[which(x[, MEM, drop = TRUE] == "R" | x[, IPM, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, MEM, drop = TRUE] == "R" | x[, IPM, drop = TRUE] == "R"), "psae"] - x[which(x[, GEN, drop = TRUE] == "R" & x[, TOB, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, GEN, drop = TRUE] == "R" & x[, TOB, drop = TRUE] == "R"), "psae"] - x[which(x[, CIP, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, CIP, drop = TRUE] == "R"), "psae"] - x[which(x[, CAZ, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, CAZ, drop = TRUE] == "R"), "psae"] - x[which(x[, PIP, drop = TRUE] == "R"), "psae"] <- 1 + x[which(x[, PIP, drop = TRUE] == "R"), "psae"] - } else { - x$psae <- 0 - } + x$psae <- 0 + x$psae <- x$psae + ifelse(NA_as_FALSE(col_values(x, TOB) == "R" | col_values(x, AMK) == "R"), 1, 0) + x$psae <- x$psae + ifelse(NA_as_FALSE(col_values(x, IPM) == "R" | col_values(x, MEM) == "R"), 1, 0) + x$psae <- x$psae + ifelse(NA_as_FALSE(col_values(x, PIP) == "R"), 1, 0) + x$psae <- x$psae + ifelse(NA_as_FALSE(col_values(x, CAZ) == "R"), 1, 0) + x$psae <- x$psae + ifelse(NA_as_FALSE(col_values(x, CIP) == "R" | col_values(x, NOR) == "R" | col_values(x, LVX) == "R"), 1, 0) trans_tbl( 3, - which(x$genus == "Pseudomonas" & x$species == "aeruginosa"), - c(CAZ, CIP, GEN, IPM, MEM, TOB, PIP), - "all", # this will set all negatives to "guideline criteria not met" instead of "not covered by guideline" + rows = which(x$genus == "Pseudomonas" & x$species == "aeruginosa"), + cols = c(CAZ, CIP, GEN, IPM, MEM, TOB, PIP), + any_all = "all", # this will set all negatives to "guideline criteria not met" instead 
of "not covered by guideline" reason = "P. aeruginosa: at least 3 classes contain R" ) trans_tbl( 3, - which(x$genus == "Pseudomonas" & x$species == "aeruginosa" & x$psae >= 3), - c(CAZ, CIP, GEN, IPM, MEM, TOB, PIP), - "any", # this is the actual one, changing the ones with x$psae >= 3 + rows = which(x$genus == "Pseudomonas" & x$species == "aeruginosa" & x$psae >= 3), + cols = c(CAZ, CIP, GEN, IPM, MEM, TOB, PIP), + any_all = "any", # this is the actual one, changing the ones with x$psae >= 3 reason = "P. aeruginosa: at least 3 classes contain R" ) # Enterococcus faecium trans_tbl( 3, - which(x$genus == "Enterococcus" & x$species == "faecium"), - c(PEN %or% AMX %or% AMP, VAN), - "all", - reason = "E. faecium: vancomycin or vanA/vanB gene + penicillin group" + rows = which(x$genus == "Enterococcus" & x$species == "faecium"), + cols = c(PEN %or% AMX %or% AMP, VAN), + any_all = "all", + reason = "E. faecium: vancomycin + penicillin group" + ) + trans_tbl( + 3, + rows = which(x$genus == "Enterococcus" & x$species == "faecium" & (vanA == TRUE | vanB == TRUE)), + cols = c(PEN, AMX, AMP, VAN), + any_all = "any", + reason = "E. faecium: vanA/vanB gene + penicillin group" ) # Staphylococcus aureus trans_tbl( 2, - which(x$genus == "Staphylococcus" & x$species == "aureus"), - c(PEN, AMX, AMP, FLC, OXA, FOX, FOX1), - "any", - reason = "S. aureus: MRSA" + rows = which(x$genus == "Staphylococcus" & x$species == "aureus" & (is.na(mecA) | is.na(mecC))), + cols = c(AMC, TZP, FLC, OXA, FOX, FOX1), + any_all = "any", + reason = "S. aureus: potential MRSA" + ) + trans_tbl( + 3, + rows = which(x$genus == "Staphylococcus" & x$species == "aureus" & (mecA == TRUE | mecC == TRUE)), + cols = "any", + any_all = "any", + reason = "S. aureus: mecA/mecC gene" ) # Candida auris trans_tbl( 3, - which(x$genus == "Candida" & x$species == "auris"), - character(0), - "any", + rows = which(x$genus == "Candida" & x$species == "auris"), + cols = "any", + any_all = "any", reason = "C. 
auris: regardless of resistance" ) } @@ -2040,50 +2142,51 @@ brmo <- function(x = NULL, only_sir_columns = FALSE, ...) { mdro(x = x, only_sir_columns = only_sir_columns, guideline = "BRMO", ...) } + #' @rdname mdro #' @export -mrgn <- function(x = NULL, only_sir_columns = FALSE, ...) { +mrgn <- function(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) { meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) stop_if( "guideline" %in% names(list(...)), "argument `guideline` must not be set since this is a guideline-specific function" ) - mdro(x = x, only_sir_columns = only_sir_columns, guideline = "MRGN", ...) + mdro(x = x, only_sir_columns = only_sir_columns, verbose = verbose, guideline = "MRGN", ...) } #' @rdname mdro #' @export -mdr_tb <- function(x = NULL, only_sir_columns = FALSE, ...) { +mdr_tb <- function(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) { meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) stop_if( "guideline" %in% names(list(...)), "argument `guideline` must not be set since this is a guideline-specific function" ) - mdro(x = x, only_sir_columns = only_sir_columns, guideline = "TB", ...) + mdro(x = x, only_sir_columns = only_sir_columns, verbose = verbose, guideline = "TB", ...) } #' @rdname mdro #' @export -mdr_cmi2012 <- function(x = NULL, only_sir_columns = FALSE, ...) { +mdr_cmi2012 <- function(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) { meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) stop_if( "guideline" %in% names(list(...)), "argument `guideline` must not be set since this is a guideline-specific function" ) - mdro(x = x, only_sir_columns = only_sir_columns, guideline = "CMI2012", ...) 
+ mdro(x = x, only_sir_columns = only_sir_columns, verbose = verbose, guideline = "CMI2012", ...) } #' @rdname mdro #' @export -eucast_exceptional_phenotypes <- function(x = NULL, only_sir_columns = FALSE, ...) { +eucast_exceptional_phenotypes <- function(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) { meet_criteria(x, allow_class = "data.frame", allow_NULL = TRUE) meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) stop_if( "guideline" %in% names(list(...)), "argument `guideline` must not be set since this is a guideline-specific function" ) - mdro(x = x, only_sir_columns = only_sir_columns, guideline = "EUCAST", ...) + mdro(x = x, only_sir_columns = only_sir_columns, verbose = verbose, guideline = "EUCAST", ...) } diff --git a/_pkgdown.yml b/_pkgdown.yml index 47ef7619..7bb3f42f 100644 --- a/_pkgdown.yml +++ b/_pkgdown.yml @@ -80,6 +80,9 @@ navbar: - text: "Download Data Sets for Own Use" icon: "fa-database" href: "articles/datasets.html" + - text: "Use AMR for Predictive Modelling (tidymodels)" + icon: "fa-square-root-variable" + href: "articles/AMR_with_tidymodels.html" - text: "Set User- Or Team-specific Package Settings" icon: "fa-gear" href: "reference/AMR-options.html" diff --git a/data-raw/gpt_training_text_v2.1.1.9120.txt b/data-raw/gpt_training_text_v2.1.1.9121.txt similarity index 98% rename from data-raw/gpt_training_text_v2.1.1.9120.txt rename to data-raw/gpt_training_text_v2.1.1.9121.txt index 6ae8f2a0..17f9b3e3 100644 --- a/data-raw/gpt_training_text_v2.1.1.9120.txt +++ b/data-raw/gpt_training_text_v2.1.1.9121.txt @@ -1,5 +1,5 @@ This files contains all context you must know about the AMR package for R. -First and foremost, you are trained on version 2.1.1.9120. Remember this whenever someone asks which AMR package version you’re at. +First and foremost, you are trained on version 2.1.1.9121. Remember this whenever someone asks which AMR package version you’re at. 
-------------------------------- THE PART HEREAFTER CONTAINS CONTENTS FROM FILE 'NAMESPACE': @@ -6086,6 +6086,12 @@ mdro( x = NULL, guideline = "CMI2012", col_mo = NULL, + esbl = NA, + carbapenemase = NA, + mecA = NA, + mecC = NA, + vanA = NA, + vanB = NA, info = interactive(), pct_required_classes = 0.5, combine_SI = TRUE, @@ -6098,13 +6104,18 @@ custom_mdro_guideline(..., as_factor = TRUE) brmo(x = NULL, only_sir_columns = FALSE, ...) -mrgn(x = NULL, only_sir_columns = FALSE, ...) +mrgn(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -mdr_tb(x = NULL, only_sir_columns = FALSE, ...) +mdr_tb(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -mdr_cmi2012(x = NULL, only_sir_columns = FALSE, ...) +mdr_cmi2012(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -eucast_exceptional_phenotypes(x = NULL, only_sir_columns = FALSE, ...) +eucast_exceptional_phenotypes( + x = NULL, + only_sir_columns = FALSE, + verbose = FALSE, + ... +) } \arguments{ \item{x}{a \link{data.frame} with antibiotics columns, like \code{AMX} or \code{amox}. Can be left blank for automatic determination.} @@ -6113,6 +6124,18 @@ eucast_exceptional_phenotypes(x = NULL, only_sir_columns = FALSE, ...) \item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. 
Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} +\item{esbl}{\link{logical} values, or a column name containing logical values, indicating the presence of an ESBL gene (or production of its proteins)} + +\item{carbapenemase}{\link{logical} values, or a column name containing logical values, indicating the presence of a carbapenemase gene (or production of its proteins)} + +\item{mecA}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{mecA} gene (or production of its proteins)} + +\item{mecC}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{mecC} gene (or production of its proteins)} + +\item{vanA}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{vanA} gene (or production of its proteins)} + +\item{vanB}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{vanB} gene (or production of its proteins)} + \item{info}{a \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions} \item{pct_required_classes}{minimal required percentage of antimicrobial classes that must be available per isolate, rounded down. For example, with the default guideline, 17 antimicrobial classes must be available for \emph{S. aureus}. Setting this \code{pct_required_classes} argument to \code{0.5} (default) means that for every \emph{S. aureus} isolate at least 8 different classes must be available. 
Any lower number of available classes will return \code{NA} for that isolate.} @@ -8885,6 +8908,203 @@ Whether you're cleaning data or analysing resistance patterns, the `AMR` Python +THE PART HEREAFTER CONTAINS CONTENTS FROM FILE 'vignettes/AMR_with_tidymodels.Rmd': + + +--- +title: "`AMR` with `tidymodels`" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{`AMR` with `tidymodels`} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 5 +) +``` + +Antimicrobial resistance (AMR) is a global health crisis, and understanding resistance patterns is crucial for managing effective treatments. The `AMR` R package provides robust tools for analysing AMR data, including convenient antibiotic selector functions like `aminoglycosides()` and `betalactams()`. In this post, we will explore how to use the `tidymodels` framework to predict resistance patterns in the `example_isolates` dataset. + +By leveraging the power of `tidymodels` and the `AMR` package, we’ll build a reproducible machine learning workflow to predict resistance to two important antibiotic classes: aminoglycosides and beta-lactams. + +--- + +### **Objective** + +Our goal is to build a predictive model using the `tidymodels` framework to determine resistance patterns based on microbial data. We will: + +1. Preprocess data using the selector functions `aminoglycosides()` and `betalactams()`. +2. Define a logistic regression model for prediction. +3. Use a structured `tidymodels` workflow to preprocess, train, and evaluate the model. + +--- + +### **Data Preparation** + +We begin by loading the required libraries and preparing the `example_isolates` dataset from the `AMR` package. 
+ +```{r} +# Load required libraries +library(tidymodels) # For machine learning workflows, and data manipulation (dplyr, tidyr, ...) +library(AMR) # For AMR data analysis + +# Load the example_isolates dataset +data("example_isolates") # Preloaded dataset with AMR results + +# Select relevant columns for prediction +data <- example_isolates %>% + # select AB results dynamically + select(mo, aminoglycosides(), betalactams()) %>% + # replace NAs with NI (not-interpretable) + mutate(across(where(is.sir), + ~replace_na(.x, "NI")), + # make factors of SIR columns + across(where(is.sir), + as.integer), + # get Gramstain of microorganisms + mo = as.factor(mo_gramstain(mo))) %>% + # drop NAs - the ones without a Gramstain (fungi, etc.) + drop_na() # %>% + # Cefepime is not reliable + #select(-FEP) +``` + +**Explanation:** +- `aminoglycosides()` and `betalactams()` dynamically select columns for antibiotics in these classes. +- `drop_na()` ensures the model receives complete cases for training. + +--- + +### **Defining the Workflow** + +We now define the `tidymodels` workflow, which consists of three steps: preprocessing, model specification, and fitting. + +#### 1. Preprocessing with a Recipe + +We create a recipe to preprocess the data for modelling. This includes: +- Encoding resistance results (`S`, `I`, `R`) as binary (resistant or not resistant). +- Converting microbial organism names (`mo`) into numerical features using one-hot encoding. + +```{r} +# Define the recipe for data preprocessing +resistance_recipe <- recipe(mo ~ ., data = data) %>% + step_corr(c(aminoglycosides(), betalactams()), threshold = 0.9) +resistance_recipe +``` + +**Explanation:** +- `step_mutate()` transforms resistance results (`R`) into binary variables (TRUE/FALSE). +- `step_dummy()` converts categorical organism (`mo`) names into one-hot encoded numerical features, making them compatible with the model. + +#### 2. 
Specifying the Model + +We define a logistic regression model since resistance prediction is a binary classification task. + +```{r} +# Specify a logistic regression model +logistic_model <- logistic_reg() %>% + set_engine("glm") # Use the Generalized Linear Model engine +logistic_model +``` + +**Explanation:** +- `logistic_reg()` sets up a logistic regression model. +- `set_engine("glm")` specifies the use of R's built-in GLM engine. + +#### 3. Building the Workflow + +We bundle the recipe and model together into a `workflow`, which organizes the entire modeling process. + +```{r} +# Combine the recipe and model into a workflow +resistance_workflow <- workflow() %>% + add_recipe(resistance_recipe) %>% # Add the preprocessing recipe + add_model(logistic_model) # Add the logistic regression model +resistance_workflow +``` + +--- + +### **Training and Evaluating the Model** + +To train the model, we split the data into training and testing sets. Then, we fit the workflow on the training set and evaluate its performance. + +```{r} +# Split data into training and testing sets +set.seed(123) # For reproducibility +data_split <- initial_split(data, prop = 0.8) # 80% training, 20% testing +training_data <- training(data_split) # Training set +testing_data <- testing(data_split) # Testing set + +# Fit the workflow to the training data +fitted_workflow <- resistance_workflow %>% + fit(training_data) # Train the model + +fitted_workflow +``` + +**Explanation:** +- `initial_split()` splits the data into training and testing sets. +- `fit()` trains the workflow on the training set. + +Next, we evaluate the model on the testing data. 
+ +```{r} +# Make predictions on the testing set +predictions <- fitted_workflow %>% + predict(testing_data) # Generate predictions +probabilities <- fitted_workflow %>% + predict(testing_data, type = "prob") # Generate probabilities + +predictions <- predictions %>% + bind_cols(probabilities) %>% + bind_cols(testing_data) # Combine with true labels + +predictions + +# Evaluate model performance +metrics <- predictions %>% + metrics(truth = mo, estimate = .pred_class) # Calculate performance metrics + +metrics +``` + +**Explanation:** +- `predict()` generates predictions on the testing set. +- `metrics()` computes evaluation metrics like accuracy and AUC. + +It appears we can predict the Gram based on AMR results with a `r round(metrics$.estimate[1], 3)` accuracy. The ROC curve looks like: + +```{r} +predictions %>% + roc_curve(mo, `.pred_Gram-negative`) %>% + autoplot() +``` + +--- + +### **Conclusion** + +In this post, we demonstrated how to build a machine learning pipeline with the `tidymodels` framework and the `AMR` package. By combining selector functions like `aminoglycosides()` and `betalactams()` with `tidymodels`, we efficiently prepared data, trained a model, and evaluated its performance. + +This workflow is extensible to other antibiotic classes and resistance patterns, empowering users to analyse AMR data systematically and reproducibly. 
+ +--- + + + THE PART HEREAFTER CONTAINS CONTENTS FROM FILE 'vignettes/EUCAST.Rmd': diff --git a/inst/tinytest/test-mdro.R b/inst/tinytest/test-mdro.R index 8dda697a..8abaaae3 100755 --- a/inst/tinytest/test-mdro.R +++ b/inst/tinytest/test-mdro.R @@ -45,7 +45,7 @@ expect_identical(class(outcome), c("ordered", "factor")) # example_isolates should have these finding using Dutch guidelines expect_equal( as.double(table(outcome)), - c(1994, 0, 6) + c(1977, 23, 0) ) expect_equal( diff --git a/man/mdro.Rd b/man/mdro.Rd index 7a9b8357..5e112647 100644 --- a/man/mdro.Rd +++ b/man/mdro.Rd @@ -23,6 +23,12 @@ mdro( x = NULL, guideline = "CMI2012", col_mo = NULL, + esbl = NA, + carbapenemase = NA, + mecA = NA, + mecC = NA, + vanA = NA, + vanB = NA, info = interactive(), pct_required_classes = 0.5, combine_SI = TRUE, @@ -35,13 +41,18 @@ custom_mdro_guideline(..., as_factor = TRUE) brmo(x = NULL, only_sir_columns = FALSE, ...) -mrgn(x = NULL, only_sir_columns = FALSE, ...) +mrgn(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -mdr_tb(x = NULL, only_sir_columns = FALSE, ...) +mdr_tb(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -mdr_cmi2012(x = NULL, only_sir_columns = FALSE, ...) +mdr_cmi2012(x = NULL, only_sir_columns = FALSE, verbose = FALSE, ...) -eucast_exceptional_phenotypes(x = NULL, only_sir_columns = FALSE, ...) +eucast_exceptional_phenotypes( + x = NULL, + only_sir_columns = FALSE, + verbose = FALSE, + ... +) } \arguments{ \item{x}{a \link{data.frame} with antibiotics columns, like \code{AMX} or \code{amox}. Can be left blank for automatic determination.} @@ -50,6 +61,18 @@ eucast_exceptional_phenotypes(x = NULL, only_sir_columns = FALSE, ...) \item{col_mo}{column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. 
Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} +\item{esbl}{\link{logical} values, or a column name containing logical values, indicating the presence of an ESBL gene (or production of its proteins)} + +\item{carbapenemase}{\link{logical} values, or a column name containing logical values, indicating the presence of a carbapenemase gene (or production of its proteins)} + +\item{mecA}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{mecA} gene (or production of its proteins)} + +\item{mecC}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{mecC} gene (or production of its proteins)} + +\item{vanA}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{vanA} gene (or production of its proteins)} + +\item{vanB}{\link{logical} values, or a column name containing logical values, indicating the presence of a \emph{vanB} gene (or production of its proteins)} + \item{info}{a \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions} \item{pct_required_classes}{minimal required percentage of antimicrobial classes that must be available per isolate, rounded down. For example, with the default guideline, 17 antimicrobial classes must be available for \emph{S. aureus}. Setting this \code{pct_required_classes} argument to \code{0.5} (default) means that for every \emph{S. aureus} isolate at least 8 different classes must be available. 
Any lower number of available classes will return \code{NA} for that isolate.} diff --git a/vignettes/AMR_with_tidymodels.Rmd b/vignettes/AMR_with_tidymodels.Rmd new file mode 100644 index 00000000..e3edd17e --- /dev/null +++ b/vignettes/AMR_with_tidymodels.Rmd @@ -0,0 +1,191 @@ +--- +title: "`AMR` with `tidymodels`" +output: + rmarkdown::html_vignette: + toc: true + toc_depth: 3 +vignette: > + %\VignetteIndexEntry{`AMR` with `tidymodels`} + %\VignetteEncoding{UTF-8} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + chunk_output_type: console +--- + +```{r setup, include = FALSE, results = 'markup'} +knitr::opts_chunk$set( + warning = FALSE, + collapse = TRUE, + comment = "#>", + fig.width = 7.5, + fig.height = 5 +) +``` + +Antimicrobial resistance (AMR) is a global health crisis, and understanding resistance patterns is crucial for managing effective treatments. The `AMR` R package provides robust tools for analysing AMR data, including convenient antibiotic selector functions like `aminoglycosides()` and `betalactams()`. In this post, we will explore how to use the `tidymodels` framework to predict resistance patterns in the `example_isolates` dataset. + +By leveraging the power of `tidymodels` and the `AMR` package, we’ll build a reproducible machine learning workflow to predict resistance to two important antibiotic classes: aminoglycosides and beta-lactams. + +--- + +### **Objective** + +Our goal is to build a predictive model using the `tidymodels` framework to determine resistance patterns based on microbial data. We will: + +1. Preprocess data using the selector functions `aminoglycosides()` and `betalactams()`. +2. Define a logistic regression model for prediction. +3. Use a structured `tidymodels` workflow to preprocess, train, and evaluate the model. + +--- + +### **Data Preparation** + +We begin by loading the required libraries and preparing the `example_isolates` dataset from the `AMR` package. 
+ +```{r} +# Load required libraries +library(tidymodels) # For machine learning workflows, and data manipulation (dplyr, tidyr, ...) +library(AMR) # For AMR data analysis + +# Load the example_isolates dataset +data("example_isolates") # Preloaded dataset with AMR results + +# Select relevant columns for prediction +data <- example_isolates %>% + # select AB results dynamically + select(mo, aminoglycosides(), betalactams()) %>% + # replace NAs with NI (not-interpretable) + mutate(across(where(is.sir), + ~replace_na(.x, "NI")), + # make factors of SIR columns + across(where(is.sir), + as.integer), + # get Gramstain of microorganisms + mo = as.factor(mo_gramstain(mo))) %>% + # drop NAs - the ones without a Gramstain (fungi, etc.) + drop_na() # %>% + # Cefepime is not reliable + #select(-FEP) +``` + +**Explanation:** +- `aminoglycosides()` and `betalactams()` dynamically select columns for antibiotics in these classes. +- `drop_na()` ensures the model receives complete cases for training. + +--- + +### **Defining the Workflow** + +We now define the `tidymodels` workflow, which consists of three steps: preprocessing, model specification, and fitting. + +#### 1. Preprocessing with a Recipe + +We create a recipe to preprocess the data for modelling. This includes: +- Encoding resistance results (`S`, `I`, `R`) as binary (resistant or not resistant). +- Converting microbial organism names (`mo`) into numerical features using one-hot encoding. + +```{r} +# Define the recipe for data preprocessing +resistance_recipe <- recipe(mo ~ ., data = data) %>% + step_corr(c(aminoglycosides(), betalactams()), threshold = 0.9) +resistance_recipe +``` + +**Explanation:** +- `step_mutate()` transforms resistance results (`R`) into binary variables (TRUE/FALSE). +- `step_dummy()` converts categorical organism (`mo`) names into one-hot encoded numerical features, making them compatible with the model. + +#### 2. 
Specifying the Model + +We define a logistic regression model since resistance prediction is a binary classification task. + +```{r} +# Specify a logistic regression model +logistic_model <- logistic_reg() %>% + set_engine("glm") # Use the Generalized Linear Model engine +logistic_model +``` + +**Explanation:** +- `logistic_reg()` sets up a logistic regression model. +- `set_engine("glm")` specifies the use of R's built-in GLM engine. + +#### 3. Building the Workflow + +We bundle the recipe and model together into a `workflow`, which organizes the entire modeling process. + +```{r} +# Combine the recipe and model into a workflow +resistance_workflow <- workflow() %>% + add_recipe(resistance_recipe) %>% # Add the preprocessing recipe + add_model(logistic_model) # Add the logistic regression model +resistance_workflow +``` + +--- + +### **Training and Evaluating the Model** + +To train the model, we split the data into training and testing sets. Then, we fit the workflow on the training set and evaluate its performance. + +```{r} +# Split data into training and testing sets +set.seed(123) # For reproducibility +data_split <- initial_split(data, prop = 0.8) # 80% training, 20% testing +training_data <- training(data_split) # Training set +testing_data <- testing(data_split) # Testing set + +# Fit the workflow to the training data +fitted_workflow <- resistance_workflow %>% + fit(training_data) # Train the model + +fitted_workflow +``` + +**Explanation:** +- `initial_split()` splits the data into training and testing sets. +- `fit()` trains the workflow on the training set. + +Next, we evaluate the model on the testing data. 
+ +```{r} +# Make predictions on the testing set +predictions <- fitted_workflow %>% + predict(testing_data) # Generate predictions +probabilities <- fitted_workflow %>% + predict(testing_data, type = "prob") # Generate probabilities + +predictions <- predictions %>% + bind_cols(probabilities) %>% + bind_cols(testing_data) # Combine with true labels + +predictions + +# Evaluate model performance +metrics <- predictions %>% + metrics(truth = mo, estimate = .pred_class) # Calculate performance metrics + +metrics +``` + +**Explanation:** +- `predict()` generates predictions on the testing set. +- `metrics()` computes evaluation metrics like accuracy and AUC. + +It appears we can predict the Gram based on AMR results with a `r round(metrics$.estimate[1], 3)` accuracy. The ROC curve looks like: + +```{r} +predictions %>% + roc_curve(mo, `.pred_Gram-negative`) %>% + autoplot() +``` + +--- + +### **Conclusion** + +In this post, we demonstrated how to build a machine learning pipeline with the `tidymodels` framework and the `AMR` package. By combining selector functions like `aminoglycosides()` and `betalactams()` with `tidymodels`, we efficiently prepared data, trained a model, and evaluated its performance. + +This workflow is extensible to other antibiotic classes and resistance patterns, empowering users to analyse AMR data systematically and reproducibly. + +---