---
title: "Introduction to the AMR package"
author: "Matthijs S. Berends"
output: 
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{Introduction to the AMR package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE, results = 'markup'}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#"
)
```

This R package is intended to make microbial epidemiology easier. Most functions come with extensive help pages to help you get started.

This `AMR` package basically does four important things:

1. It **cleanses existing data** by transforming it into reproducible, well-defined *classes*, making the most efficient use of R. These functions all use artificial intelligence to guess the results that you would expect:

   * Use `as.mo` to get the ID of a microorganism. The IDs are easy to recognise: the ID of *E. coli* is "ESCCOL" and the ID of *S. aureus* is "STAAUR". The function accepts almost any text that looks like the name or code of a microorganism, such as "E. coli", "esco" and "esccol". Even `as.mo("MRSA")` will return the ID of *S. aureus*. Moreover, it can group all coagulase-negative and coagulase-positive *Staphylococci*, and can transform *Streptococci* into Lancefield groups. To find bacteria based on your input, this package contains a freely available database of almost 3,000 different (potential) human pathogenic microorganisms.
   * Use `as.rsi` to transform values to valid antimicrobial results. It produces just S, I or R based on your input and warns about invalid values. Even values like "<=0.002; S" (combined MIC/RSI) will result in "S".
   * Use `as.mic` to cleanse your MIC values. It produces a so-called factor (called *ordinal* in SPSS) with valid MIC values as levels. A value like "<=0.002; S" (combined MIC/RSI) will result in "<=0.002".
   * Use `as.atc` to get the ATC code of an antibiotic as defined by the WHO. This package contains a database with most LIS codes, official names, DDDs and even trade names of antibiotics. For example, the values "Furabid", "Furadantin" and "nitro" all return the ATC code of nitrofurantoin.
   
2. It **enhances existing data** and **adds new data** from data sets included in this package.

   * Use `EUCAST_rules` to apply [EUCAST expert rules to isolates](http://www.eucast.org/expert_rules_and_intrinsic_resistance/).
   * Use `first_isolate` to identify the first isolates of every patient [using guidelines from the CLSI](https://clsi.org/standards/products/microbiology/documents/m39/) (Clinical and Laboratory Standards Institute).
     * You can also identify first *weighted* isolates of every patient, an adjusted version of the CLSI guideline. This takes into account key antibiotics of every strain and compares them.
   * Use `MDRO` (abbreviation of Multi Drug Resistant Organisms) to check your isolates for exceptional resistance with country-specific guidelines or EUCAST rules. Currently, national guidelines for Germany and the Netherlands are supported.
   * The data set `microorganisms` contains the taxonomic properties of almost 3,000 potential human pathogenic microorganisms (bacteria, fungi/yeasts and parasites). Taxonomic names were downloaded from ITIS (Integrated Taxonomic Information System, http://www.itis.gov). Furthermore, the colloquial name and Gram stain are available, which enables resistance analysis of e.g. different antibiotics per Gram stain. The package also contains functions to look up values in this data set, like `mo_genus`, `mo_family`, `mo_gramstain` or even `mo_phylum`. As they use `as.mo` internally, they also use artificial intelligence. For example, `mo_genus("MRSA")` and `mo_genus("S. aureus")` will both return `"Staphylococcus"`. They also come with support for German, Dutch, French, Italian, Spanish and Portuguese. These functions can be used to add new variables to your data.
   * The data set `antibiotics` contains the ATC code, LIS codes, official name, trivial name and DDD of both oral and parenteral administration. It also contains a total of 298 trade names. Use functions like `ab_name` and `ab_tradenames` to look up values. The `ab_*` functions use `as.atc` internally so they support AI to guess your expected result. For example, `ab_name("Fluclox")`, `ab_name("Floxapen")` and `ab_name("J01CF05")` will all return `"Flucloxacillin"`. These functions can again be used to add new variables to your data.

3. It **analyses the data** with convenient functions that use well-known methods.

   * Calculate the resistance (and even co-resistance) of microbial isolates with the `portion_R`, `portion_IR`, `portion_I`, `portion_SI` and `portion_S` functions. Similarly, the *number* of isolates can be determined with the `count_R`, `count_IR`, `count_I`, `count_SI` and `count_S` functions. All these functions can be used [with the `dplyr` package](https://dplyr.tidyverse.org/#usage) (e.g. in conjunction with [`summarise`](https://dplyr.tidyverse.org/reference/summarise.html)).
   * Plot AMR results with `geom_rsi`, a function made for the `ggplot2` package.
   * Predict antimicrobial resistance for the coming years using logistic regression models with the `resistance_predict` function.
   * Compute descriptive statistics that extend base R: calculate kurtosis and skewness, and create frequency tables.

4. It **teaches the user** how to use all of the above functions.

   * The package contains extensive help pages with many examples.
   * It also contains an example data set called `septic_patients`. This data set contains:
     * 2,000 blood culture isolates from anonymised septic patients between 2001 and 2017 in the Northern Netherlands
     * Results of 40 antibiotics (each antibiotic in its own column) with a total of 38,414 antimicrobial results
     * Real and genuine data
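
The four steps above can be sketched in a few lines of R. This is a minimal, non-evaluated illustration rather than a definitive workflow: it assumes the package version described in this vignette is installed (function names may differ in other releases), and the `septic_patients` column names `hospital_id` and `amox` are assumptions that are not stated in the text above. The return values noted in the comments are the ones quoted in the list.

```r
# Assumption: the AMR version described in this vignette is installed.
library(AMR)
library(dplyr)

# 1. Cleanse: turn free text into standardised classes
mo_id <- as.mo("MRSA")          # ID of S. aureus ("STAAUR", per the text)
rsi   <- as.rsi("<=0.002; S")   # combined MIC/RSI input, kept as "S"
mic   <- as.mic("<=0.002; S")   # same input, kept as "<=0.002"
atc   <- as.atc("Furadantin")   # trade name to the ATC code of nitrofurantoin

# 2. Enhance: look up properties of microorganisms and antibiotics
genus <- mo_genus("MRSA")       # "Staphylococcus", per the text
drug  <- ab_name("Fluclox")     # "Flucloxacillin", per the text

# 3. Analyse: count and portion of resistant isolates per hospital.
# `hospital_id` and `amox` are assumed column names of `septic_patients`.
resistance_by_hospital <- septic_patients %>%
  group_by(hospital_id) %>%
  summarise(
    resistant_n       = count_R(amox),
    resistant_portion = portion_R(amox)
  )

# 4. Learn: every function has a help page with examples
# ?as.mo
# ?portion_R
```

Because the `count_*` and `portion_*` functions take a column of antimicrobial results and return a single value, they fit naturally inside `summarise`, which is why the grouped example needs no extra helper code.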

----
```{r, echo = FALSE}
# this will print "2018" in 2018, and "2018-yyyy" after 2018.
yrs <- paste(unique(c(2018, format(Sys.Date(), "%Y"))), collapse = "-")
```
AMR, (c) `r yrs`, `r packageDescription("AMR")$URL`

Licensed under the [GNU General Public License v2.0](https://github.com/msberends/AMR/blob/master/LICENSE).