Import of data

This tutorial assumes you have already imported the WHONET data, for example with the readxl package. In RStudio, this can be done with the menu button ‘Import Dataset’ in the ‘Environment’ tab. Choose the option ‘From Excel’ and select your exported file. Make sure date fields are imported correctly.

An example syntax could look like this:

library(readxl)
data <- read_excel(path = "path/to/your/file.xlsx")
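
Whether date fields came through correctly can be checked right after import. The sketch below uses a small made-up data frame in place of a real file. The column names `specimen_date` and `organism` and their values are assumptions for illustration only and do not reflect WHONET's actual export layout. It shows a date that arrived as text and converts it with base R's as.Date():

```r
# Hypothetical stand-in for an imported file; column names and values are
# illustrative only and do not reflect a real WHONET export.
imported <- data.frame(
  specimen_date = c("2019-01-15", "2019-02-03", "2019-03-21"),
  organism = c("eco", "sau", "kpn"),
  stringsAsFactors = FALSE
)

# A correctly imported date column has class "Date" or "POSIXct";
# a column that arrived as text has class "character" instead.
is_date <- vapply(imported, inherits, logical(1), what = c("Date", "POSIXct"))
print(is_date)

# Convert explicitly and state the expected format. Values that do not
# match the format become NA, so check for them afterwards.
imported$specimen_date <- as.Date(imported$specimen_date, format = "%Y-%m-%d")
stopifnot(!anyNA(imported$specimen_date))
class(imported$specimen_date)
```

Stating the format explicitly is a deliberate choice here: it makes a mismatch show up as NA instead of being silently guessed.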

This package comes with an example data set WHONET. We will use it for this analysis.

Preparation

First, load the relevant packages if you have not done so yet. I use the tidyverse for all of my analyses. All of them. If you don’t know it yet, I suggest you read about it on their website: https://www.tidyverse.org/.

library(dplyr)   # part of tidyverse
library(ggplot2) # part of tidyverse
library(AMR)     # this package

We will have to transform some variables to simplify and automate the analysis:

Microorganisms should be transformed to our own microorganism IDs (called an mo) using our Catalogue of Life reference data set, which contains all ~70,000 microorganisms from the taxonomic kingdoms Bacteria, Fungi and Protozoa. We do the transformation with as.mo(). This function also recognises almost all WHONET abbreviations of microorganisms.
Antimicrobial results or interpretations have to be clean and valid. In other words, they should only contain the values "S", "I" or "R". That is exactly what the as.rsi() function is for.

# transform variables
data <- WHONET %>%
  # get microbial ID based on given organism
  mutate(mo = as.mo(Organism)) %>% 
  # transform everything from "AMP_ND10" to "CIP_EE" to the new `rsi` class
  mutate_at(vars(AMP_ND10:CIP_EE), as.rsi)

No errors or warnings, so all values were transformed successfully.
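
To make concrete what ‘clean and valid’ means for interpretations, here is a minimal base R sketch of that kind of check. It is not how as.rsi() is implemented, and as.rsi() accepts many more input variants than this. The helper name `to_sir_strict` and the sample values are hypothetical.

```r
# Hypothetical helper, NOT the AMR package's as.rsi(): keep only exact
# "S", "I" or "R" (after trimming whitespace and upper-casing) and turn
# everything else into NA.
to_sir_strict <- function(x) {
  cleaned <- toupper(trimws(as.character(x)))
  cleaned[!cleaned %in% c("S", "I", "R")] <- NA_character_
  factor(cleaned, levels = c("S", "I", "R"), ordered = TRUE)
}

# Made-up raw values, including some that cannot be interpreted
raw <- c("S", "s", " R ", "I", "", "resistant", NA)
result <- to_sir_strict(raw)
print(result)

# Number of values that ended up as NA (missing input counts too)
sum(is.na(result))
```

A strict rule like this makes invalid entries visible as NA, which is the situation the absence of warnings above rules out for the WHONET example data.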

We created a package dedicated to data cleaning and checking, called the clean package. It is installed automatically together with the AMR package, so we only have to load it:

library(clean)

It contains the freq() function, to create frequency tables.

So let’s check our data, with a couple of frequency tables:

# our newly created `mo` variable
data %>% freq(mo, nmax = 10)

Frequency table

Class: mo (character)
Length: 500 (of which NA: 0 = 0%)
Unique: 39

Gram-negative: 281 (56.20%)
Gram-positive: 219 (43.80%)
Nr of genera: 17
Nr of species: 39

	Item	Count	Percent	Cum. Count	Cum. Percent
1	B_ESCHR_COLI	245	49.0%	245	49.0%
2	B_STPHY_CONS	74	14.8%	319	63.8%
3	B_STPHY_EPDR	38	7.6%	357	71.4%
4	B_STRPT_PNMN	31	6.2%	388	77.6%
5	B_STPHY_HMNS	21	4.2%	409	81.8%
6	B_PROTS_MRBL	9	1.8%	418	83.6%
7	B_ENTRC_FACM	8	1.6%	426	85.2%
8	B_STPHY_CPTS	8	1.6%	434	86.8%
9	B_ENTRB_CLOC	5	1.0%	439	87.8%
10	B_ENTRC_CLMB	4	0.8%	443	88.6%

(omitted 29 entries, n = 57 [11.40%])


# our transformed antibiotic columns
# amoxicillin/clavulanic acid (J01CR02) as an example
data %>% freq(AMC_ND2)

Frequency table

Class: factor > ordered > rsi (numeric)
Length: 500 (of which NA: 19 = 3.8%)
Levels: 3: S < I < R
Unique: 3

%SI: 78.6%

	Item	Count	Percent	Cum. Count	Cum. Percent
1	S	356	74.01%	356	74.01%
2	R	103	21.41%	459	95.43%
3	I	22	4.57%	481	100.00%
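
The percentages in this table can be reproduced from the counts alone, which is a useful sanity check. The sketch below uses only base R and the three counts printed above; the variable names are illustrative. It assumes, as the numbers suggest, that percentages are taken over the 481 non-missing results and that %SI is the share of S and I combined.

```r
# Counts copied from the frequency table above (NA values excluded)
counts <- c(S = 356, R = 103, I = 22)
n_valid <- sum(counts)  # 500 rows in total minus 19 NA

# Percent and cumulative percent, in the order shown in the table
percent <- 100 * counts / n_valid
cum_percent <- cumsum(percent)

round(percent, 2)
round(cum_percent, 2)

# %SI: share of results that are S or I
pct_si <- 100 * sum(counts[c("S", "I")]) / n_valid
round(pct_si, 1)
```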

Analysis

(more will be available soon)

How to work with WHONET data

Matthijs S. Berends

16 October 2019
