From 1a6314769ba41281a8f02f006e76dbbfeebec1d7 Mon Sep 17 00:00:00 2001
From: "Matthijs S. Berends"
Note: values on this page will change with every website update since they are based on randomly created values and the page was written in RMarkdown. However, the methodology remains unchanged. This page was generated on 22 February 2019.
Note: values on this page will change with every website update since they are based on randomly created values and the page was written in RMarkdown. However, the methodology remains unchanged. This page was generated on 23 February 2019.
So, we can draw at least two conclusions immediately. From a data scientist perspective, the data looks clean: only values
The data is already quite clean, but we still need to transform some variables. The
So only 28.4% is suitable for resistance analysis! We can now filter on it with the
So only 28.3% is suitable for resistance analysis! We can now filter on it with the
For future use, the above two syntaxes can be shortened with the
Only 2 isolates are marked as ‘first’ according to the CLSI guideline. But when reviewing the antibiogram, it is obvious that some isolates are absolutely different strains and should be included too. This is why we weigh isolates, based on their antibiogram. The
Instead of 2, now 8 isolates are flagged. In total, 79.3% of all isolates are marked ‘first weighted’ - 50.9% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
Instead of 2, now 10 isolates are flagged. In total, 79.3% of all isolates are marked ‘first weighted’ - 50.9% more than when using the CLSI guideline. In real life, this novel algorithm will yield 5-10% more isolates than the classic CLSI guideline.
As with
So we end up with 15,854 isolates for analysis.
So we end up with 15,851 isolates for analysis.
We can remove unneeded columns:
Or can be used like the
Frequency table of
Frequency table of
Columns: 2
Shortest: 16
The functions
Or can be used in conjunction with
How to conduct AMR analysis
Matthijs S. Berends
- 22 February 2019
+ 23 February 2019
AMR.Rmd
Introduction
@@ -217,21 +217,21 @@
-
2019-02-22
+2019-02-23
abcd
Escherichia coli
S
S
-
2019-02-22
+2019-02-23
abcd
Escherichia coli
S
R
-
2019-02-22
+2019-02-23
efgh
Escherichia coli
R
@@ -327,70 +327,70 @@
-
-2014-11-25
-O2
-Hospital D
-Escherichia coli
-R
-S
-R
-S
-F
-
-
-2016-11-18
-I10
+2011-09-14
+N3
Hospital B
Escherichia coli
-R
S
-R
+S
+S
S
M
- 2014-08-15
-G9
-Hospital D
-Staphylococcus aureus
+
+
-2011-01-09
+I3
+Hospital A
+Escherichia coli
R
S
S
S
M
- 2017-07-26
-S2
-Hospital B
-Staphylococcus aureus
+
+
+2015-06-02
+E8
+Hospital A
+Streptococcus pneumoniae
+R
S
R
S
+M
+
+
2011-02-06
+S1
+Hospital D
+Escherichia coli
+S
+S
+S
S
F
-
2017-01-25
-H5
+2010-01-27
+N7
Hospital C
Escherichia coli
R
+I
+R
S
-S
-S
-M
+F
-
@@ -411,8 +411,8 @@
#>
#> Item Count Percent Cum. Count Cum. Percent
#> --- ----- ------- -------- ----------- -------------
-#> 1 M 10,377 51.9% 10,377 51.9%
-#> 2 F 9,623 48.1% 20,000 100.0%
+#> 1 M 10,364 51.8% 10,364 51.8%
+#> 2 F 9,636 48.2% 20,000 100.0%
2017-03-12
-B9
-Hospital C
+2017-08-11
+U3
+Hospital B
Escherichia coli
S
S
S
S
-M
+F
M and F. From a researcher perspective: there are slightly more men. Nothing we didn’t already know.
bacteria column now consists of text, and we want to add more variables based on microbial IDs later on. So, we will transform this column to valid IDs. The mutate() function of the dplyr package makes this really easy:
data <- data %>%
@@ -443,10 +443,10 @@
#> Kingella kingae (no changes)
#>
#> EUCAST Expert Rules, Intrinsic Resistance and Exceptional Phenotypes (v3.1, 2016)
-#> Table 1: Intrinsic resistance in Enterobacteriaceae (1284 changes)
+#> Table 1: Intrinsic resistance in Enterobacteriaceae (1334 changes)
#> Table 2: Intrinsic resistance in non-fermentative Gram-negative bacteria (no changes)
#> Table 3: Intrinsic resistance in other Gram-negative bacteria (no changes)
-#> Table 4: Intrinsic resistance in Gram-positive bacteria (2790 changes)
+#> Table 4: Intrinsic resistance in Gram-positive bacteria (2731 changes)
#> Table 8: Interpretive rules for B-lactam agents and Gram-positive cocci (no changes)
#> Table 9: Interpretive rules for B-lactam agents and Gram-negative rods (no changes)
#> Table 10: Interpretive rules for B-lactam agents and other Gram-negative bacteria (no changes)
@@ -462,9 +462,9 @@
#> Non-EUCAST: piperacillin/tazobactam = S where piperacillin = S (no changes)
#> Non-EUCAST: trimethoprim/sulfa = S where trimethoprim = S (no changes)
#>
-#> => EUCAST rules affected 7,321 out of 20,000 rows
+#> => EUCAST rules affected 7,419 out of 20,000 rows
#> -> added 0 test results
-#> -> changed 4,074 test results (0 to S; 0 to I; 4,074 to R)
@@ -489,8 +489,8 @@
#> NOTE: Using column `bacteria` as input for `col_mo`.
#> NOTE: Using column `date` as input for `col_date`.
#> NOTE: Using column `patient_id` as input for `col_patient_id`.
-#> => Found 5,680 first isolates (28.4% of total)
filter() function, also from the dplyr package:
filter() function, also from the dplyr package:
filter_first_isolate() function:
1
-2010-01-10
-X9
+2010-02-08
+H1
B_ESCHR_COL
-R
+S
S
S
S
@@ -527,43 +527,43 @@
2
-2010-04-18
-X9
+2010-04-06
+H1
B_ESCHR_COL
R
-I
+S
S
S
FALSE
3
-2010-07-02
-X9
+2010-04-25
+H1
B_ESCHR_COL
+S
R
S
-S
-S
+R
FALSE
4
-2010-09-21
-X9
+2010-10-05
+H1
B_ESCHR_COL
-R
+I
+S
S
R
-S
FALSE
5
-2010-09-22
-X9
+2010-11-09
+H1
B_ESCHR_COL
-R
+S
S
S
S
@@ -571,59 +571,59 @@
6
-2010-10-06
-X9
+2010-11-23
+H1
B_ESCHR_COL
+R
S
-S
-S
+R
S
FALSE
7
-2010-10-14
-X9
+2010-12-26
+H1
B_ESCHR_COL
R
-S
+I
S
S
FALSE
8
-2011-01-09
-X9
+2011-01-01
+H1
B_ESCHR_COL
S
-I
S
-R
+S
+S
FALSE
+9
-2011-03-31
-X9
+2011-01-21
+H1
B_ESCHR_COL
R
+I
+S
+S
+FALSE
+
+
-10
+2011-02-28
+H1
+B_ESCHR_COL
+S
S
S
S
TRUE
-
10
-2011-03-31
-X9
-B_ESCHR_COL
-S
-S
-R
-S
-FALSE
-key_antibiotics()
function adds a vector with 18 key antibiotics: 6 broad-spectrum ones, 6 narrow-spectrum ones for Gram-negatives and 6 narrow-spectrum ones for Gram-positives. These can be defined by the user.
-
isolate
@@ -654,10 +654,10 @@
1
-2010-01-10
-X9
+2010-02-08
+H1
B_ESCHR_COL
-R
+S
S
S
S
@@ -666,47 +666,47 @@
-2
-2010-04-18
-X9
-B_ESCHR_COL
-R
-I
-S
-S
-FALSE
-FALSE
-
-
-3
-2010-07-02
-X9
+2010-04-06
+H1
B_ESCHR_COL
R
S
S
S
FALSE
-FALSE
-
-
4
-2010-09-21
-X9
-B_ESCHR_COL
-R
-S
-R
-S
-FALSE
TRUE
-
+5
-2010-09-22
-X9
+3
+2010-04-25
+H1
B_ESCHR_COL
+S
R
S
+R
+FALSE
+TRUE
+
+
+4
+2010-10-05
+H1
+B_ESCHR_COL
+I
+S
+S
+R
+FALSE
+TRUE
+
+
5
+2010-11-09
+H1
+B_ESCHR_COL
+S
+S
S
S
FALSE
@@ -714,23 +714,23 @@
6
-2010-10-06
-X9
+2010-11-23
+H1
B_ESCHR_COL
+R
S
-S
-S
+R
S
FALSE
TRUE
7
-2010-10-14
-X9
+2010-12-26
+H1
B_ESCHR_COL
R
-S
+I
S
S
FALSE
@@ -738,47 +738,47 @@
8
-2011-01-09
-X9
+2011-01-01
+H1
B_ESCHR_COL
S
-I
S
-R
+S
+S
FALSE
TRUE
-9
-2011-03-31
-X9
+2011-01-21
+H1
B_ESCHR_COL
R
+I
S
S
-S
-TRUE
-TRUE
-
-
+10
-2011-03-31
-X9
-B_ESCHR_COL
-S
-S
-R
-S
FALSE
TRUE
+
10
+2011-02-28
+H1
+B_ESCHR_COL
+S
+S
+S
+S
+TRUE
+TRUE
+filter_first_isolate()
, there’s a shortcut for this new algorithm too:
@@ -1014,23 +1007,23 @@ Longest: 24
-
date
patient_id
hospital
@@ -803,14 +802,13 @@
-
2
-2016-11-18
-I10
+2011-09-14
+N3
Hospital B
B_ESCHR_COL
-R
S
-R
+S
+S
S
M
Gram negative
@@ -819,10 +817,9 @@
TRUE
-
5
-2017-01-25
-H5
-Hospital C
+2011-01-09
+I3
+Hospital A
B_ESCHR_COL
R
S
@@ -835,67 +832,63 @@
TRUE
-
6
-2017-03-12
-B9
-Hospital C
-B_ESCHR_COL
-S
-S
-S
+2015-06-02
+E8
+Hospital A
+B_STRPT_PNE
+R
S
+R
+R
M
-Gram negative
-Escherichia
-coli
+Gram positive
+Streptococcus
+pneumoniae
TRUE
-
7
-2015-08-12
-Y4
-Hospital B
-B_STPHY_AUR
-R
+2011-02-06
+S1
+Hospital D
+B_ESCHR_COL
+S
S
S
S
F
-Gram positive
-Staphylococcus
-aureus
+Gram negative
+Escherichia
+coli
TRUE
-
9
-2016-01-24
-L10
-Hospital A
+2010-01-27
+N7
+Hospital C
B_ESCHR_COL
+R
+I
+R
S
-S
-S
-S
-M
+F
Gram negative
Escherichia
coli
TRUE
-
@@ -915,9 +908,9 @@
12
-2013-09-11
-H6
+2017-08-11
+U3
Hospital B
-B_STPHY_AUR
+B_ESCHR_COL
S
S
-R
S
-M
-Gram positive
-Staphylococcus
-aureus
+S
+F
+Gram negative
+Escherichia
+coli
TRUE
dplyr way, which is easier to read:
genus and species from a data.frame (15,854 x 13)
genus and species from a data.frame (15,851 x 13)
-Length: 15,854 (of which NA: 0 = 0.00%)
+Length: 15,851 (of which NA: 0 = 0.00%)
Unique: 4
Longest: 24
1
Escherichia coli
-7,918
-49.9%
-7,918
-49.9%
+7,800
+49.2%
+7,800
+49.2%
2
Staphylococcus aureus
-3,930
-24.8%
-11,848
-74.7%
+4,008
+25.3%
+11,808
+74.5%
3
Streptococcus pneumoniae
-2,498
-15.8%
-14,346
-90.5%
+2,445
+15.4%
+14,253
+89.9%
@@ -971,7 +964,7 @@ Longest: 24
Resistance percentages
4
Klebsiella pneumoniae
-1,508
-9.5%
-15,854
+1,598
+10.1%
+15,851
100.0%
portion_R, portion_RI, portion_I, portion_IS and portion_S can be used to determine the portion of a specific antimicrobial outcome. They can be used on their own:
group_by() and summarise(), both from the dplyr package:
data_1st %>%
group_by(hospital) %>%
@@ -984,19 +977,19 @@ Longest: 24
Hospital A
-0.4737395
+0.4877378
Hospital B
-0.4763709
+0.4750000
Hospital C
-0.4739257
+0.4869240
Hospital D
-0.4636854
+0.4860406