Join the data set microorganisms easily to an existing data set or to a character vector.
Usage
inner_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...)
left_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...)
right_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...)
full_join_microorganisms(x, by = NULL, suffix = c("2", ""), ...)
semi_join_microorganisms(x, by = NULL, ...)
anti_join_microorganisms(x, by = NULL, ...)
Arguments
- x
existing data set to join, or character vector. In case of a character vector, the resulting data.frame will contain a column 'x' with these values.
- by
a variable to join by. If left empty, the function searches for a column with class mo (created with as.mo()), or uses "mo" if that column name exists in x. Otherwise, it can be a column name of x with values that exist in microorganisms$mo (such as by = "bacteria_id"), or another column in microorganisms (but then it should be named, like by = c("bacteria_id" = "fullname")).
- suffix
if there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
- ...
ignored, only in place to allow future extensions
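The named form of by can be read as "column in x" = "column in microorganisms". The sketch below illustrates that mapping with base R merge() on a toy table. The lookup data frame is a hypothetical three-column stand-in for the real microorganisms data set, and the merge() call only mimics the idea; it is not the package's own implementation.

```r
# Toy stand-in for the microorganisms data set (hypothetical, three columns only)
lookup <- data.frame(
  mo = c("B_ESCHR_COLI", "B_KLBSL_PNMN"),
  fullname = c("Escherichia coli", "Klebsiella pneumoniae"),
  genus = c("Escherichia", "Klebsiella"),
  stringsAsFactors = FALSE
)

# Existing data whose key column holds full names rather than mo codes
x <- data.frame(
  bacteria_id = c("Klebsiella pneumoniae", "Escherichia coli"),
  n = c(3, 5),
  stringsAsFactors = FALSE
)

# Same idea as by = c("bacteria_id" = "fullname"):
# the name is the column in x, the value is the column in the lookup table
by <- c("bacteria_id" = "fullname")
joined <- merge(x, lookup, by.x = names(by), by.y = unname(by), all.x = TRUE)

colnames(joined)
```

The key column keeps its name from x, and the remaining lookup columns (here mo and genus) are appended after the columns of x.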
Details
Note: As opposed to the join() functions of dplyr, character vectors are supported, and by default existing columns will get the suffix "2" while the newly joined columns will not get a suffix.
If the dplyr package is installed, its join functions will be used. Otherwise, the much slower merge() and interaction() functions from base R will be used.
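The effect of the default suffix = c("2", "") can be sketched with base R merge(), which accepts a comparable suffixes argument. This is only an illustration under assumptions: lookup is a hypothetical toy replacement for the microorganisms data set, and the call is not taken from the package source.

```r
# Toy stand-in for the microorganisms data set (hypothetical)
lookup <- data.frame(
  mo = c("B_ESCHR_COLI", "B_KLBSL_PNMN"),
  fullname = c("Escherichia coli", "Klebsiella pneumoniae"),
  genus = c("Escherichia", "Klebsiella"),
  stringsAsFactors = FALSE
)

# Existing data that already has its own 'fullname' column
x <- data.frame(
  mo = "B_KLBSL_PNMN",
  fullname = "my own label",
  stringsAsFactors = FALSE
)

# Left join: the duplicate column from x gets the suffix "2",
# the newly joined column from the lookup table keeps its name
joined <- merge(x, lookup, by = "mo", all.x = TRUE, suffixes = c("2", ""))

colnames(joined)
joined$fullname2
joined$fullname
```

With this ordering of suffixes, code written against the column names of the joined data set (such as fullname) keeps working, while the original values stay available under the suffixed name.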
Examples
left_join_microorganisms(as.mo("K. pneumoniae"))
#> ℹ Function `as.mo()` is uncertain about "K. pneumoniae" (assuming
#> Klebsiella pneumoniae). Run `mo_uncertainties()` to review this.
#>             mo              fullname  kingdom         phylum
#> 1 B_KLBSL_PNMN Klebsiella pneumoniae Bacteria Proteobacteria
#>                 class            order             family      genus    species
#> 1 Gammaproteobacteria Enterobacterales Enterobacteriaceae Klebsiella pneumoniae
#>   subspecies    rank            ref species_id source prevalence
#> 1            species Trevisan, 1887     777151   LPSN          1
#>                                                                           snomed
#> 1 1098101000112102, 446870005, 1098201000112108, 409801009, 56415008, 714315002
left_join_microorganisms("B_KLBSL_PNMN")
#>             mo              fullname  kingdom         phylum
#> 1 B_KLBSL_PNMN Klebsiella pneumoniae Bacteria Proteobacteria
#>                 class            order             family      genus    species
#> 1 Gammaproteobacteria Enterobacterales Enterobacteriaceae Klebsiella pneumoniae
#>   subspecies    rank            ref species_id source prevalence
#> 1            species Trevisan, 1887     777151   LPSN          1
#>                                                                           snomed
#> 1 1098101000112102, 446870005, 1098201000112108, 409801009, 56415008, 714315002
# \donttest{
if (require("dplyr")) {
  example_isolates %>%
    left_join_microorganisms() %>%
    colnames()
  df <- data.frame(date = seq(from = as.Date("2018-01-01"),
                              to = as.Date("2018-01-07"),
                              by = 1),
                   bacteria = as.mo(c("S. aureus", "MRSA", "MSSA", "STAAUR",
                                      "E. coli", "E. coli", "E. coli")),
                   stringsAsFactors = FALSE)
  colnames(df)
  df_joined <- left_join_microorganisms(df, "bacteria")
  colnames(df_joined)
}
#> Joining, by = "mo"
#> ℹ Function `as.mo()` is uncertain about "E. coli" (assuming Escherichia
#> coli) and "S. aureus" (assuming Staphylococcus aureus). Run
#> `mo_uncertainties()` to review these uncertainties.
#>  [1] "date"       "bacteria"   "fullname"   "kingdom"    "phylum"
#>  [6] "class"      "order"      "family"     "genus"      "species"
#> [11] "subspecies" "rank"       "ref"        "species_id" "source"
#> [16] "prevalence" "snomed"
# }