Apparently, there was some uncertainty about the translation to
taxonomic codes. Let’s check this:
mo_uncertainties()
-#> Matching scores are based on the resemblance between the input and the full
-#> taxonomic name, and the pathogenicity in humans. See ?mo_matching_score.
+#> Matching scores are based on the resemblance between the input and the full
+#> taxonomic name, and the pathogenicity in humans. See ?mo_matching_score.
+#> Colour keys: 0.000-0.549 0.550-0.649 0.650-0.749 0.750-1.000
#>
-#> --------------------------------------------------------------------------------
-#> "E. coli" -> Escherichia coli (B_ESCHR_COLI, 0.688)
-#> Also matched: Enterococcus crotali (0.650), Escherichia coli coli
-#> (0.643), Escherichia coli expressing (0.611), Enterobacter cowanii
-#> (0.600), Enterococcus columbae (0.595), Enterococcus camelliae (0.591),
-#> Enterococcus casseliflavus (0.577), Enterobacter cloacae cloacae
-#> (0.571), Enterobacter cloacae complex (0.571), and Enterobacter cloacae
-#> dissolvens (0.565)
-#> --------------------------------------------------------------------------------
-#> "K. pneumoniae" -> Klebsiella pneumoniae (B_KLBSL_PNMN, 0.786)
-#> Also matched: Klebsiella pneumoniae ozaenae (0.707), Klebsiella
-#> pneumoniae pneumoniae (0.688), Klebsiella pneumoniae rhinoscleromatis
-#> (0.658), Klebsiella pasteurii (0.500), Klebsiella planticola (0.500),
-#> Kingella potus (0.400), Kluyveromyces pseudotropicale (0.386),
-#> Kluyveromyces pseudotropicalis (0.363), Kosakonia pseudosacchari
-#> (0.361), and Kluyveromyces pseudotropicalis pseudotropicalis (0.361)
-#> --------------------------------------------------------------------------------
-#> "S. aureus" -> Staphylococcus aureus (B_STPHY_AURS, 0.690)
-#> Also matched: Staphylococcus aureus aureus (0.643), Staphylococcus
-#> argenteus (0.625), Staphylococcus aureus anaerobius (0.625),
-#> Staphylococcus auricularis (0.615), Salmonella Aurelianis (0.595),
-#> Salmonella Aarhus (0.588), Salmonella Amounderness (0.587),
-#> Staphylococcus argensis (0.587), Streptococcus australis (0.587), and
-#> Salmonella choleraesuis arizonae (0.562)
-#> --------------------------------------------------------------------------------
-#> "S. pneumoniae" -> Streptococcus pneumoniae (B_STRPT_PNMN, 0.750)
-#> Also matched: Streptococcus pseudopneumoniae (0.700), Streptococcus
-#> phocae salmonis (0.552), Serratia proteamaculans quinovora (0.545),
-#> Streptococcus pseudoporcinus (0.536), Staphylococcus piscifermentans
-#> (0.533), Staphylococcus pseudintermedius (0.532), Serratia
-#> proteamaculans proteamaculans (0.526), Streptococcus gallolyticus
-#> pasteurianus (0.526), Salmonella Portanigra (0.524), and Streptococcus
-#> periodonticum (0.519)
+#> --------------------------------------------------------------------------------
+#> "E. coli" -> Escherichia coli (B_ESCHR_COLI, 0.688)
+#> Also matched: Enterococcus crotali (0.650), Escherichia coli coli
+#> (0.643), Escherichia coli expressing (0.611), Enterobacter cowanii
+#> (0.600), Enterococcus columbae (0.595), Enterococcus camelliae (0.591),
+#> Enterococcus casseliflavus (0.577), Enterobacter cloacae cloacae
+#> (0.571), Enterobacter cloacae complex (0.571), and Enterobacter cloacae
+#> dissolvens (0.565)
+#> --------------------------------------------------------------------------------
+#> "K. pneumoniae" -> Klebsiella pneumoniae (B_KLBSL_PNMN, 0.786)
+#> Also matched: Klebsiella pneumoniae ozaenae (0.707), Klebsiella
+#> pneumoniae pneumoniae (0.688), Klebsiella pneumoniae rhinoscleromatis
+#> (0.658), Klebsiella pasteurii (0.500), Klebsiella planticola (0.500),
+#> Kingella potus (0.400), Kluyveromyces pseudotropicale (0.386),
+#> Kluyveromyces pseudotropicalis (0.363), Kosakonia pseudosacchari
+#> (0.361), and Kluyveromyces pseudotropicalis pseudotropicalis (0.361)
+#> --------------------------------------------------------------------------------
+#> "S. aureus" -> Staphylococcus aureus (B_STPHY_AURS, 0.690)
+#> Also matched: Staphylococcus aureus aureus (0.643), Staphylococcus
+#> argenteus (0.625), Staphylococcus aureus anaerobius (0.625),
+#> Staphylococcus auricularis (0.615), Salmonella Aurelianis (0.595),
+#> Salmonella Aarhus (0.588), Salmonella Amounderness (0.587),
+#> Staphylococcus argensis (0.587), Streptococcus australis (0.587), and
+#> Salmonella choleraesuis arizonae (0.562)
+#> --------------------------------------------------------------------------------
+#> "S. pneumoniae" -> Streptococcus pneumoniae (B_STRPT_PNMN, 0.750)
+#> Also matched: Streptococcus pseudopneumoniae (0.700), Streptococcus
+#> phocae salmonis (0.552), Serratia proteamaculans quinovora (0.545),
+#> Streptococcus pseudoporcinus (0.536), Staphylococcus piscifermentans
+#> (0.533), Staphylococcus pseudintermedius (0.532), Serratia
+#> proteamaculans proteamaculans (0.526), Streptococcus gallolyticus
+#> pasteurianus (0.526), Salmonella Portanigra (0.524), and Streptococcus
+#> periodonticum (0.519)
#>
-#> Only the first 10 other matches of each record are shown. Run
-#> print(mo_uncertainties(), n = ...) to view more entries, or save
-#> mo_uncertainties() to an object.
+That’s all good.
That is basically it for the cleaning; time to start the data
inclusion.
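To make the matching scores above less of a black box, here is a minimal base R sketch of how such a score could be computed. It assumes the formula described in ?mo_matching_score, (l_n - 0.5 * min(l_n, lev)) / (l_n * p_n * k_n), where l_n is the length of the full taxonomic name and lev is the Levenshtein distance between input and name. The function name matching_score_sketch() is hypothetical, and the pathogenicity factor p and kingdom factor k are simply assumed to be 1 for these four species. This is not the package's own implementation.

```r
# Hypothetical helper, not part of the AMR package.
# p and k are ASSUMED to be 1 (common human pathogen, kingdom Bacteria).
matching_score_sketch <- function(input, full_name, p = 1, k = 1) {
  l_n <- nchar(full_name)                           # length of the full name
  lev <- as.vector(utils::adist(input, full_name))  # Levenshtein distance
  (l_n - 0.5 * pmin(l_n, lev)) / (l_n * p * k)
}

inputs <- c("E. coli", "K. pneumoniae", "S. aureus", "S. pneumoniae")
full_names <- c(
  "Escherichia coli", "Klebsiella pneumoniae",
  "Staphylococcus aureus", "Streptococcus pneumoniae"
)

# one score per input/name pair
scores <- mapply(matching_score_sketch, inputs, full_names)
print(scores)
```

Under these assumptions the four values agree, to three decimals, with the top scores reported by mo_uncertainties() above. Lower-ranked candidates may carry other p or k factors, so this sketch should not be expected to reproduce those.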
@@ -395,14 +396,14 @@ the methods on the
So only 91% is suitable for resistance analysis! We can now filter on
it with the filter() function, also from the dplyr package:
@@ -420,16 +421,16 @@ like:
Time for the analysis.
@@ -519,39 +520,39 @@ in:
our_data_1st %>%
select(date, aminoglycosides())
-#> ℹ For aminoglycosides() using column 'GEN' (gentamicin)
+#> ℹ For aminoglycosides() using column 'GEN' (gentamicin)
#> # A tibble: 2,724 × 2
#> date GEN
#> <date> <sir>
-#> 1 2012-11-21 S
-#> 2 2018-04-03 S
-#> 3 2014-09-19 S
-#> 4 2015-12-10 S
-#> 5 2015-03-02 S
-#> 6 2018-03-31 S
-#> 7 2015-10-25 S
-#> 8 2019-06-19 S
-#> 9 2015-04-27 S
-#> 10 2011-06-21 S
+#> 1 2012-11-21 S
+#> 2 2018-04-03 S
+#> 3 2014-09-19 S
+#> 4 2015-12-10 S
+#> 5 2015-03-02 S
+#> 6 2018-03-31 S
+#> 7 2015-10-25 S
+#> 8 2019-06-19 S
+#> 9 2015-04-27 S
+#> 10 2011-06-21 S
#> # ℹ 2,714 more rows
our_data_1st %>%
select(bacteria, betalactams())
-#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
-#> (amoxicillin/clavulanic acid)
+#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
+#> (amoxicillin/clavulanic acid)
#> # A tibble: 2,724 × 3
#> bacteria AMX AMC
#> <mo> <sir> <sir>
-#> 1 B_ESCHR_COLI R I
-#> 2 B_KLBSL_PNMN R I
-#> 3 B_ESCHR_COLI R S
-#> 4 B_ESCHR_COLI S I
-#> 5 B_ESCHR_COLI S S
-#> 6 B_STPHY_AURS R S
-#> 7 B_ESCHR_COLI R S
-#> 8 B_ESCHR_COLI S S
-#> 9 B_STPHY_AURS S S
-#> 10 B_ESCHR_COLI S S
+#> 1 B_ESCHR_COLI R I
+#> 2 B_KLBSL_PNMN R I
+#> 3 B_ESCHR_COLI R S
+#> 4 B_ESCHR_COLI S I
+#> 5 B_ESCHR_COLI S S
+#> 6 B_STPHY_AURS R S
+#> 7 B_ESCHR_COLI R S
+#> 8 B_ESCHR_COLI S S
+#> 9 B_STPHY_AURS S S
+#> 10 B_ESCHR_COLI S S
#> # ℹ 2,714 more rows
our_data_1st %>%
@@ -559,73 +560,73 @@ in:
#> # A tibble: 2,724 × 5
#> bacteria AMX AMC CIP GEN
#> <mo> <sir> <sir> <sir> <sir>
-#> 1 B_ESCHR_COLI R I S S
-#> 2 B_KLBSL_PNMN R I S S
-#> 3 B_ESCHR_COLI R S S S
-#> 4 B_ESCHR_COLI S I S S
-#> 5 B_ESCHR_COLI S S S S
-#> 6 B_STPHY_AURS R S R S
-#> 7 B_ESCHR_COLI R S S S
-#> 8 B_ESCHR_COLI S S S S
-#> 9 B_STPHY_AURS S S S S
-#> 10 B_ESCHR_COLI S S S S
+#> 1 B_ESCHR_COLI R I S S
+#> 2 B_KLBSL_PNMN R I S S
+#> 3 B_ESCHR_COLI R S S S
+#> 4 B_ESCHR_COLI S I S S
+#> 5 B_ESCHR_COLI S S S S
+#> 6 B_STPHY_AURS R S R S
+#> 7 B_ESCHR_COLI R S S S
+#> 8 B_ESCHR_COLI S S S S
+#> 9 B_STPHY_AURS S S S S
+#> 10 B_ESCHR_COLI S S S S
#> # ℹ 2,714 more rows
# filtering using AB selectors is also possible:
our_data_1st %>%
filter(any(aminoglycosides() == "R"))
-#> ℹ For aminoglycosides() using column 'GEN' (gentamicin)
+#> ℹ For aminoglycosides() using column 'GEN' (gentamicin)
#> # A tibble: 981 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
-#> 1 J5 A 2017-12-25 B_STRPT_PNMN R S S R TRUE
-#> 2 X1 A 2017-07-04 B_STPHY_AURS R S S R TRUE
-#> 3 B3 A 2016-07-24 B_ESCHR_COLI S S S R TRUE
-#> 4 V7 A 2012-04-03 B_ESCHR_COLI S S S R TRUE
-#> 5 C9 A 2017-03-23 B_ESCHR_COLI S S S R TRUE
-#> 6 R1 A 2018-06-10 B_STPHY_AURS S S S R TRUE
-#> 7 S2 A 2013-07-19 B_STRPT_PNMN S S S R TRUE
-#> 8 P5 A 2019-03-09 B_STPHY_AURS S S S R TRUE
-#> 9 Q8 A 2019-08-10 B_STPHY_AURS S S S R TRUE
-#> 10 K5 A 2013-03-15 B_STRPT_PNMN S S S R TRUE
+#> 1 J5 A 2017-12-25 B_STRPT_PNMN R S S R TRUE
+#> 2 X1 A 2017-07-04 B_STPHY_AURS R S S R TRUE
+#> 3 B3 A 2016-07-24 B_ESCHR_COLI S S S R TRUE
+#> 4 V7 A 2012-04-03 B_ESCHR_COLI S S S R TRUE
+#> 5 C9 A 2017-03-23 B_ESCHR_COLI S S S R TRUE
+#> 6 R1 A 2018-06-10 B_STPHY_AURS S S S R TRUE
+#> 7 S2 A 2013-07-19 B_STRPT_PNMN S S S R TRUE
+#> 8 P5 A 2019-03-09 B_STPHY_AURS S S S R TRUE
+#> 9 Q8 A 2019-08-10 B_STPHY_AURS S S S R TRUE
+#> 10 K5 A 2013-03-15 B_STRPT_PNMN S S S R TRUE
#> # ℹ 971 more rows
our_data_1st %>%
filter(all(betalactams() == "R"))
-#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
-#> (amoxicillin/clavulanic acid)
+#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
+#> (amoxicillin/clavulanic acid)
#> # A tibble: 462 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
-#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
-#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
-#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
-#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
-#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
-#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
-#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
-#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
-#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
-#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
+#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
+#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
+#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
+#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
+#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
+#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
+#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
+#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
+#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
+#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
#> # ℹ 452 more rows
# even works in base R (since R 3.0):
our_data_1st[all(betalactams() == "R"), ]
-#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
-#> (amoxicillin/clavulanic acid)
+#> ℹ For betalactams() using columns 'AMX' (amoxicillin) and 'AMC'
+#> (amoxicillin/clavulanic acid)
#> # A tibble: 462 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
-#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
-#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
-#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
-#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
-#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
-#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
-#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
-#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
-#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
-#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
+#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
+#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
+#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
+#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
+#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
+#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
+#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
+#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
+#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
+#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
#> # ℹ 452 more rows
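Judging from the output above, any() and all() on a selector work row by row: a row is kept when at least one, or every one, of the selected columns equals "R". The sketch below mimics that logic in plain base R, without the AMR package. The toy data frame and the hard-coded betalactam_cols vector are made up for illustration; the real selectors find the matching columns automatically.

```r
# Made-up example data; values are NOT taken from our_data_1st.
toy <- data.frame(
  bacteria = c("B_ESCHR_COLI", "B_STPHY_AURS", "B_KLBSL_PNMN"),
  AMX = c("R", "R", "S"),
  AMC = c("R", "S", "S"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: columns are listed by hand instead of using betalactams()
betalactam_cols <- c("AMX", "AMC")

# logical matrix: TRUE where a result is "R"
is_r <- toy[, betalactam_cols] == "R"

all_r <- rowSums(is_r) == ncol(is_r)  # every selected column is "R"
any_r <- rowSums(is_r) > 0            # at least one selected column is "R"

toy[all_r, ]  # rows resistant to all selected drugs
toy[any_r, ]  # rows resistant to at least one selected drug
```

Missing values are not handled here; with NA in the data, rowSums() would need na.rm = TRUE and a decision on how to count untested drugs.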