#' Gets data from the WHO to determine properties of an ATC (e.g. an antibiotic) like name, defined daily dose (DDD) or standard unit. \strong{This function requires an internet connection.}
#' @param atc_code a character or character vector with ATC code(s) of antibiotic(s)
#' @param property property of an ATC code. Valid values are \code{"ATC code"}, \code{"Name"}, \code{"DDD"}, \code{"U"} (\code{"unit"}), \code{"Adm.R"} en \code{"Note"}.
#' @param administration type of administration, see \emph{Details}
#' @param url url of website of the WHO. The sign \code{\%s} can be used as a placeholder for ATC codes.
#' @details
#' Abbreviations for the property \code{"Adm.R"} (parameter \code{administration}):
#' \itemize{
#' \item{\code{"Implant"}}{ = Implant}
#' \item{\code{"Inhal"}}{ = Inhalation}
#' \item{\code{"Instill"}}{ = Instillation}
#' \item{\code{"N"}}{ = nasal}
#' \item{\code{"O"}}{ = oral}
#' \item{\code{"P"}}{ = parenteral}
#' \item{\code{"R"}}{ = rectal}
#' \item{\code{"SL"}}{ = sublingual/buccal}
#' \item{\code{"TD"}}{ = transdermal}
#' \item{\code{"V"}}{ = vaginal}
#' }
#'
#' Abbreviations for the property \code{"U"} (unit):
#' Convert antibiotic codes (from a laboratory information system like MOLIS or GLIMS) to a (trivial) antibiotic name or ATC code, or vice versa. This uses the data from \code{\link{antibiotics}}.
#' @param from,to type to transform from and to. See \code{\link{antibiotics}} for its column names. WIth \code{from = "guess"} the from will be guessed from \code{"atc"}, \code{"molis"} and \code{"umcg"}.
#' Use this function to determine a valid ID based on a genus (and species). This input could be a full name (like \code{"Staphylococcus aureus"}), an abbreviated name (like \code{"S. aureus"}), or just a genus. You could also use a \code{\link{paste}} of a genus and species column to use the full name as input: \code{x = paste(df$genus, df$species)}, where \code{df} is your dataframe.
#' @param x character vector to determine \code{bactid}
#' @export
#' @importFrom dplyr %>% filter slice pull
#' @return Character (vector).
#' @seealso \code{\link{microorganisms}} for the dataframe that is being used to determine ID's.
#' @examples
#' # These examples all return "STAAUR", the ID of S. aureus:
#' guess_bactid("stau")
#' guess_bactid("STAU")
#' guess_bactid("staaur")
#' guess_bactid("S. aureus")
#' guess_bactid("S aureus")
#' guess_bactid("Staphylococcus aureus")
#' guess_bactid("MRSA") # Methicillin-resistant S. aureus
#' guess_bactid("VISA") # Vancomycin Intermediate S. aureus
guess_bactid<-function(x){
# remove dots and other non-text in case of "E. coli" except spaces
x<-gsub("[^a-zA-Z ]+","",x)
# but spaces before and after should be omitted
x<-trimws(x,which="both")
x.bak<-x
# replace space by regex sign
x<-gsub(" ",".*",x,fixed=TRUE)
# add start and stop
x_species<-paste(x,'species')
x<-paste0('^',x,'$')
for(iin1:length(x)){
if(tolower(x[i])=='^e.*coli$'){
# avoid detection of Entamoeba coli in case of E. coli
x[i]<-'Escherichia coli'
}
if(tolower(x[i])=='^h.*influenzae$'){
# avoid detection of Haematobacter influenzae in case of H. influenzae
x[i]<-'Haemophilus influenzae'
}
if(tolower(x[i])=='^st.*au$'
|tolower(x[i])=='^stau$'
|tolower(x[i])=='^staaur$'){
# avoid detection of Staphylococcus auricularis in case of S. aureus
x[i]<-'Staphylococcus aureus'
}
if(tolower(x[i])=='^p.*aer$'){
# avoid detection of Pasteurella aerogenes in case of Pseudomonas aeruginosa