---
title: "Introduction to the AMR package"
author: "Matthijs S. Berends"
output: 
  rmarkdown::html_vignette:
    toc: false
vignette: >
  %\VignetteIndexEntry{Introduction to the AMR package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE, results = 'markup'}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#"
)
```

This R package is intended **to make microbial epidemiology easier**. Most functions come with extensive help pages to get you started.

The `AMR` package does four important things:

1. It **cleanses existing data** by transforming it into reproducible, well-defined *classes*, making the most efficient use of R. These functions all use artificial intelligence to guess the results you would expect:

   * Use `as.mo` to get an ID of a microorganism. The IDs are human readable for the trained eye - the ID of *Klebsiella pneumoniae* is "B_KLBSL_PNE" (B stands for Bacteria) and the ID of *S. aureus* is "B_STPHY_AUR". The function accepts almost any text that looks like the name or code of a microorganism, such as "E. coli", "esco" and "esccol". Even `as.mo("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase-negative and coagulase-positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, it uses artificial intelligence to look up values in the included ITIS data, consisting of more than 18,000 microorganisms.
   * Use `as.rsi` to transform values to valid antimicrobial results. It produces just S, I or R based on your input and warns about invalid values. Even values like "<=0.002; S" (combined MIC/RSI) will result in "S".
   * Use `as.mic` to cleanse your MIC values. It produces a so-called factor (called *ordinal* in SPSS) with valid MIC values as levels. A value like "<=0.002; S" (combined MIC/RSI) will result in "<=0.002".
   * Use `as.atc` to get the ATC code of an antibiotic as defined by the WHO. This package contains a database with most LIS codes, official names, DDDs and even trade names of antibiotics. For example, the values "Furabid", "Furadantin" and "nitro" all return the ATC code of nitrofurantoin.
   
2. It **enhances existing data** and **adds new data** from data sets included in this package.

   * Use `EUCAST_rules` to apply [EUCAST expert rules to isolates](http://www.eucast.org/expert_rules_and_intrinsic_resistance/).
   * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute).
     * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them.
   * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported.
   * The data set `microorganisms` contains the complete taxonomic tree of more than 18,000 microorganisms (bacteria, fungi/yeasts and protozoa). Furthermore, the colloquial name and Gram stain are available, which enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set like `mo_genus`, `mo_family`, `mo_gramstain` or even `mo_phylum`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. They also come with support for German, Dutch, French, Italian, Spanish and Portuguese. These functions can be used to add new variables to your data.
   * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_name` and `ab_tradenames` to look up values. The `ab_*` functions use `as.atc` internally, so they also use AI to guess the result you expect. For example, `ab_name("Fluclox")`, `ab_name("Floxapen")` and `ab_name("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.

3. It **analyses the data** with convenient functions that use well-known methods.

   * Calculate the resistance (and even co-resistance) of microbial isolates with the `portion_R`, `portion_IR`, `portion_I`, `portion_SI` and `portion_S` functions. Similarly, the *number* of isolates can be determined with the `count_R`, `count_IR`, `count_I`, `count_SI` and `count_S` functions. All these functions can be used [with the `dplyr` package](https://dplyr.tidyverse.org/#usage) (e.g. in conjunction with [`summarise`](https://dplyr.tidyverse.org/reference/summarise.html)).
   * Plot AMR results with `geom_rsi`, a function made for the `ggplot2` package.
   * Predict antimicrobial resistance for the coming years using logistic regression models with the `resistance_predict` function.
   * Extend base R with descriptive statistics: calculate kurtosis and skewness, and create frequency tables.

4. It **teaches the user** how to use all of the above.

   * The package contains extensive help pages with many examples.
   * It also contains an example data set called `septic_patients`. This data set contains:
     * 2,000 blood culture isolates from anonymised septic patients between 2001 and 2017 in the Northern Netherlands
     * Results of 40 antibiotics (each antibiotic in its own column) with a total of 38,414 antimicrobial results
     * Real and genuine data
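
The cleansing functions from the first point can be tried directly on the inputs mentioned there. The following is a minimal sketch, assuming the package version described in this vignette is installed. The variable names `ids` and `combined` are only for illustration, and the results noted in the comments are the ones stated in the text above, not separately verified output.

```r
library(AMR)

# Different spellings and abbreviations resolve to a microorganism ID
ids <- as.mo(c("Klebsiella pneumoniae", "S. aureus", "MRSA"))
ids  # per the text: "B_KLBSL_PNE" first, "B_STPHY_AUR" for the other two

# A combined MIC/RSI value is reduced to the part each class needs
combined <- "<=0.002; S"
as.rsi(combined)  # per the text: keeps the interpretation, S
as.mic(combined)  # per the text: keeps the MIC value, <=0.002

# Trade names and abbreviations resolve to an ATC code
as.atc(c("Furabid", "Furadantin", "nitro"))
```

Because each function returns a vector of the same length as its input, the same calls can be applied to whole columns of a data set.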
     
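The counting and proportion functions from the third point combine with `dplyr` as sketched below. The data frame is made up for illustration: the `ward` and `cipr` columns and their values are hypothetical and not part of the package. The sketch also assumes that `portion_R` accepts a `minimum` argument for the smallest number of isolates it will calculate on; it is set to 1 here only because the toy groups are tiny.

```r
library(AMR)
library(dplyr)

# Hypothetical toy data, not taken from the package
isolates <- data.frame(
  ward = c("ICU", "ICU", "ICU", "ICU",
           "Clinical", "Clinical", "Clinical", "Clinical"),
  cipr = as.rsi(c("R", "R", "S", "S",
                  "S", "I", "R", "S")),
  stringsAsFactors = FALSE
)

# One row per ward: number and share of resistant isolates
res <- isolates %>%
  group_by(ward) %>%
  summarise(n_resistant     = count_R(cipr),
            share_resistant = portion_R(cipr, minimum = 1))
res
```

With real data, the default `minimum` is usually the better choice, since proportions based on a handful of isolates are not meaningful.
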
### ITIS

This package contains the **complete microbial taxonomic data** (with all seven taxonomic ranks, from subkingdom to subspecies) from the publicly available Integrated Taxonomic Information System (ITIS, https://www.itis.gov).

All (sub)species from the taxonomic kingdoms Bacteria, Fungi and Protozoa are included in this package, as well as all previously accepted names known to ITIS. Furthermore, the responsible authors and year of publication are available. This allows users to use authoritative taxonomic information for their data analysis on any microorganism, not only human pathogens.

ITIS is a partnership of U.S., Canadian, and Mexican agencies and taxonomic specialists.

**Get a note when a species was renamed**
```r
mo_shortname("Chlamydia psittaci")
# Note: 'Chlamydia psittaci' (Page, 1968) was renamed 'Chlamydophila psittaci' (Everett et al., 1999)
# [1] "C. psittaci"
```

**Get any property from the entire taxonomic tree for all included species**
```r
mo_class("E. coli")
# [1] "Gammaproteobacteria"

mo_family("E. coli")
# [1] "Enterobacteriaceae"

mo_ref("E. coli")
# [1] "Castellani and Chalmers, 1919"
```

**Make no mistake - the package only includes microorganisms**
```r
mo_phylum("C. elegans")
# [1] "Cyanobacteria"                   # Bacteria?!
mo_fullname("C. elegans")
# [1] "Chroococcus limneticus elegans"  # Because a microorganism was found 
```
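
**Add taxonomic properties as new variables**

The lookup functions above accept vectors, so they can be used to add columns to a data set. The following is a minimal sketch with a made-up data frame; the `isolates` object and its `organism` column are hypothetical and only serve as an example.

```r
library(AMR)

# Hypothetical input: free-text organism names as they might come from a lab system
isolates <- data.frame(
  organism = c("E. coli", "MRSA", "Klebsiella pneumoniae"),
  stringsAsFactors = FALSE
)

# Each mo_* function returns one value per input value
isolates$genus     <- mo_genus(isolates$organism)
isolates$family    <- mo_family(isolates$organism)
isolates$gramstain <- mo_gramstain(isolates$organism)

isolates
```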

----
```{r, echo = FALSE}
# this will print "2018" in 2018, and "2018-yyyy" after 2018.
yrs <- paste(unique(c(2018, format(Sys.Date(), "%Y"))), collapse = "-")
```
AMR, (c) `r yrs`, `r packageDescription("AMR")$URL`

Licensed under the [GNU General Public License v2.0](https://github.com/msberends/AMR/blob/master/LICENSE).