mirror of https://github.com/msberends/AMR.git synced 2025-12-17 07:00:19 +01:00

Built site for AMR@3.0.1.9003: ba30b08

This commit is contained in:
github-actions
2025-11-24 10:42:21 +00:00
parent 7d16891987
commit 141fc468f8
161 changed files with 21798 additions and 313 deletions


@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
@@ -91,7 +91,7 @@
website update since they are based on randomly created values and the
page was written in <a href="https://rmarkdown.rstudio.com/" class="external-link">R
Markdown</a>. However, the methodology remains unchanged. This page was
generated on 13 October 2025.</p>
generated on 24 November 2025.</p>
<div class="section level2">
<h2 id="introduction">Introduction<a class="anchor" aria-label="anchor" href="#introduction"></a>
</h2>
@@ -147,21 +147,21 @@ make the structure of your data generally look like this:</p>
</tr></thead>
<tbody>
<tr class="odd">
<td align="center">2025-10-13</td>
<td align="center">2025-11-24</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">S</td>
</tr>
<tr class="even">
<td align="center">2025-10-13</td>
<td align="center">2025-11-24</td>
<td align="center">abcd</td>
<td align="center">Escherichia coli</td>
<td align="center">S</td>
<td align="center">R</td>
</tr>
<tr class="odd">
<td align="center">2025-10-13</td>
<td align="center">2025-11-24</td>
<td align="center">efgh</td>
<td align="center">Escherichia coli</td>
<td align="center">R</td>
@@ -1254,7 +1254,7 @@ function on a grouped <code>tibble</code>, i.e., using
provides an extension to that function:</p>
<div class="sourceCode" id="cb22"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/autoplot.html" class="external-link">autoplot</a></span><span class="op">(</span><span class="va">combined_ab</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_files/figure-html/unnamed-chunk-10-1.png" width="720"></p>
<p><img src="AMR_files/figure-html/unnamed-chunk-10-1.png" class="r-plt" width="720"></p>
<p>To calculate antimicrobial resistance in a more sensible way, also by
correcting for too few results, we use the <code><a href="../reference/proportion.html">resistance()</a></code> and
<code><a href="../reference/proportion.html">susceptibility()</a></code> functions.</p>
@@ -1348,7 +1348,7 @@ categories.</p>
<span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html" class="external-link">labs</a></span><span class="op">(</span>title <span class="op">=</span> <span class="st">"MIC Distribution and SIR Interpretation"</span>,</span>
<span> x <span class="op">=</span> <span class="st">"Sample Groups"</span>,</span>
<span> y <span class="op">=</span> <span class="st">"MIC (mg/L)"</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_files/figure-html/mic_plot-1.png" width="720"></p>
<p><img src="AMR_files/figure-html/mic_plot-1.png" class="r-plt" width="720"></p>
<p>This plot provides an intuitive way to assess susceptibility patterns
across different groups while incorporating clinical breakpoints.</p>
<p>For a more straightforward and less manual approach,
@@ -1357,12 +1357,12 @@ extended by this package to directly plot MIC and disk diffusion
values:</p>
<div class="sourceCode" id="cb27"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/autoplot.html" class="external-link">autoplot</a></span><span class="op">(</span><span class="va">mic_values</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_files/figure-html/autoplot-1.png" width="720"></p>
<p><img src="AMR_files/figure-html/autoplot-1.png" class="r-plt" width="720"></p>
<div class="sourceCode" id="cb28"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span></span>
<span><span class="co"># by providing `mo` and `ab`, colours will indicate the SIR interpretation:</span></span>
<span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/autoplot.html" class="external-link">autoplot</a></span><span class="op">(</span><span class="va">mic_values</span>, mo <span class="op">=</span> <span class="st">"K. pneumoniae"</span>, ab <span class="op">=</span> <span class="st">"cipro"</span>, guideline <span class="op">=</span> <span class="st">"EUCAST 2024"</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_files/figure-html/autoplot-2.png" width="720"></p>
<p><img src="AMR_files/figure-html/autoplot-2.png" class="r-plt" width="720"></p>
<hr>
<p><em>Author: Dr. Matthijs Berends, 23rd Feb 2025</em></p>
</div>

951
articles/AMR.md Normal file

@@ -0,0 +1,951 @@
# Conduct AMR data analysis
**Note:** values on this page will change with every website update
since they are based on randomly created values and the page was written
in [R Markdown](https://rmarkdown.rstudio.com/). However, the
methodology remains unchanged. This page was generated on 24 November
2025.
## Introduction
Conducting AMR data analysis unfortunately requires in-depth knowledge
from different scientific fields, which makes it hard to do right. At
least, it requires:
- Good questions (always start with those!) and reliable data
- A thorough understanding of (clinical) epidemiology, to understand the
clinical and epidemiological relevance and possible bias of results
- A thorough understanding of (clinical) microbiology/infectious
diseases, to understand which microorganisms are causal to which
infections and the implications of pharmaceutical treatment, as well
as understanding intrinsic and acquired microbial resistance
- Experience with data analysis with microbiological tests and their
results, to understand the determination and limitations of MIC values
and their interpretations to SIR values
- Availability of the biological taxonomy of microorganisms and probably
normalisation factors for pharmaceuticals, such as defined daily doses
(DDD)
- Available (inter-)national guidelines, and sound methods to apply
them
Of course, we cannot instantly provide you with knowledge and
experience. But with this `AMR` package, we aimed at providing (1) tools
to simplify antimicrobial resistance data cleaning, transformation and
analysis, (2) methods to easily incorporate international guidelines and
(3) scientifically reliable reference data, including the requirements
mentioned above.
The `AMR` package enables standardised and reproducible AMR data
analysis, with the application of evidence-based rules, determination of
first isolates, translation of various codes for microorganisms and
antimicrobial agents, determination of (multi-drug) resistant
microorganisms, and calculation of antimicrobial resistance, prevalence
and future trends.
## Preparation
For this tutorial, we will create fake demonstration data to work with.
You can skip to [Cleaning the data](#cleaning-the-data) if you already
have your own data ready. When you start your analysis, try to make the
structure of your data generally look like this:
| date | patient_id | mo | AMX | CIP |
|:----------:|:----------:|:----------------:|:---:|:---:|
| 2025-11-24 | abcd | Escherichia coli | S | S |
| 2025-11-24 | abcd | Escherichia coli | S | R |
| 2025-11-24 | efgh | Escherichia coli | R | S |
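As a minimal sketch, a data set in this structure could be built in base R as follows. The object name `my_data` and the values are made up for illustration; they only mirror the table above.

``` r
# Hypothetical starting data: one row per isolate,
# one column per antimicrobial agent (AMX, CIP).
my_data <- data.frame(
  date       = as.Date(c("2025-11-24", "2025-11-24", "2025-11-24")),
  patient_id = c("abcd", "abcd", "efgh"),
  mo         = "Escherichia coli", # recycled for all three rows
  AMX        = c("S", "S", "R"),
  CIP        = c("S", "R", "S"),
  stringsAsFactors = FALSE
)

str(my_data)
```

Keeping the date as a real `Date` and the results as plain text makes the later cleaning steps straightforward.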
### Needed R packages
As with many uses in R, we need some additional packages for AMR data
analysis. Our package works closely together with the [tidyverse
packages](https://www.tidyverse.org)
[`dplyr`](https://dplyr.tidyverse.org/) and
[`ggplot2`](https://ggplot2.tidyverse.org) by RStudio. The tidyverse
tremendously improves the way we conduct data science - it allows for a
very natural way of writing syntaxes and creating beautiful plots in R.
We will also use the `cleaner` package, that can be used for cleaning
data and creating frequency tables.
``` r
library(dplyr)
library(ggplot2)
library(AMR)
# (if not yet installed, install with:)
# install.packages(c("dplyr", "ggplot2", "AMR"))
```
The `AMR` package contains a data set `example_isolates_unclean`, which
might look like data that users have extracted from their laboratory systems:
``` r
example_isolates_unclean
#> # A tibble: 3,000 × 8
#> patient_id hospital date bacteria AMX AMC CIP GEN
#> <chr> <chr> <date> <chr> <chr> <chr> <chr> <chr>
#> 1 J3 A 2012-11-21 E. coli R I S S
#> 2 R7 A 2018-04-03 K. pneumoniae R I S S
#> 3 P3 A 2014-09-19 E. coli R S S S
#> 4 P10 A 2015-12-10 E. coli S I S S
#> 5 B7 A 2015-03-02 E. coli S S S S
#> 6 W3 A 2018-03-31 S. aureus R S R S
#> 7 J8 A 2016-06-14 E. coli R S S S
#> 8 M3 A 2015-10-25 E. coli R S S S
#> 9 J3 A 2019-06-19 E. coli S S S S
#> 10 G6 A 2015-04-27 S. aureus S S S S
#> # 2,990 more rows
# we will use 'our_data' as the data set name for this tutorial
our_data <- example_isolates_unclean
```
For AMR data analysis, we would like the microorganism column to contain
valid, up-to-date taxonomy, and the antibiotic columns to be cleaned as
SIR values as well.
### Taxonomy of microorganisms
With [`as.mo()`](https://amr-for-r.org/reference/as.mo.md), users can
transform arbitrary microorganism names or codes to current taxonomy.
The `AMR` package contains up-to-date taxonomic data. To be specific,
currently included data were retrieved on 24 Jun 2024.
The codes of the `AMR` package that come from
[`as.mo()`](https://amr-for-r.org/reference/as.mo.md) are short, but
still human-readable. More importantly,
[`as.mo()`](https://amr-for-r.org/reference/as.mo.md) supports all kinds
of input:
``` r
as.mo("Klebsiella pneumoniae")
#> Class 'mo'
#> [1] B_KLBSL_PNMN
as.mo("K. pneumoniae")
#> Class 'mo'
#> [1] B_KLBSL_PNMN
as.mo("KLEPNE")
#> Class 'mo'
#> [1] B_KLBSL_PNMN
as.mo("KLPN")
#> Class 'mo'
#> [1] B_KLBSL_PNMN
```
The first character in the above codes denotes their taxonomic kingdom, such
as Bacteria (B), Fungi (F), and Protozoa (P).
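To illustrate the prefix convention, the sketch below reads the first character of a few codes with base R. The lookup vector `kingdom_lookup` is a hypothetical helper covering only the three kingdoms named above; in practice the package's `mo_kingdom()` function is the appropriate tool.

``` r
# Codes taken from the examples on this page
codes <- c("B_KLBSL_PNMN", "B_ESCHR_COLI", "B_STPHY_AURS")

# The first character encodes the taxonomic kingdom
kingdom_letter <- substr(codes, 1, 1)

# Hypothetical lookup, limited to the kingdoms mentioned in the text
kingdom_lookup <- c(B = "Bacteria", F = "Fungi", P = "Protozoa")
kingdom_names <- unname(kingdom_lookup[kingdom_letter])

kingdom_names
```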
The `AMR` package also contains functions to directly retrieve taxonomic
properties, such as the name, genus, species, family, order, and even
Gram-stain. They all start with `mo_` and they use
[`as.mo()`](https://amr-for-r.org/reference/as.mo.md) internally, so
any arbitrary user input can still be used:
``` r
mo_family("K. pneumoniae")
#> [1] "Enterobacteriaceae"
mo_genus("K. pneumoniae")
#> [1] "Klebsiella"
mo_species("K. pneumoniae")
#> [1] "pneumoniae"
mo_gramstain("Klebsiella pneumoniae")
#> [1] "Gram-negative"
mo_ref("K. pneumoniae")
#> [1] "Trevisan, 1887"
mo_snomed("K. pneumoniae")
#> [[1]]
#> [1] "1098101000112102" "446870005" "1098201000112108" "409801009"
#> [5] "56415008" "714315002" "713926009"
```
We can now clean our data:
``` r
our_data$bacteria <- as.mo(our_data$bacteria, info = TRUE)
#> Retrieved values from the `microorganisms.codes` data set for "ESCCOL",
#> "KLEPNE", "STAAUR", and "STRPNE".
#> Microorganism translation was uncertain for four microorganisms. Run
#> `mo_uncertainties()` to review these uncertainties, or use
#> `add_custom_microorganisms()` to add custom entries.
```
Apparently, there was some uncertainty about the translation to
taxonomic codes. Let's check this:
``` r
mo_uncertainties()
#> Matching scores are based on the resemblance between the input and the full
#> taxonomic name, and the pathogenicity in humans. See `?mo_matching_score`.
#> Colour keys: 0.000-0.549 0.550-0.649 0.650-0.749 0.750-1.000
#>
#> --------------------------------------------------------------------------------
#> "E. coli" -> Escherichia coli (B_ESCHR_COLI, 0.688)
#> Also matched: Enterococcus crotali (0.650), Escherichia coli coli
#> (0.643), Escherichia coli expressing (0.611), Enterobacter cowanii
#> (0.600), Enterococcus columbae (0.595), Enterococcus camelliae (0.591),
#> Enterococcus casseliflavus (0.577), Enterobacter cloacae cloacae
#> (0.571), Enterobacter cloacae complex (0.571), and Enterobacter cloacae
#> dissolvens (0.565)
#> --------------------------------------------------------------------------------
#> "K. pneumoniae" -> Klebsiella pneumoniae (B_KLBSL_PNMN, 0.786)
#> Also matched: Klebsiella pneumoniae complex (0.707), Klebsiella
#> pneumoniae ozaenae (0.707), Klebsiella pneumoniae pneumoniae (0.688),
#> Klebsiella pneumoniae rhinoscleromatis (0.658), Klebsiella pasteurii
#> (0.500), Klebsiella planticola (0.500), Kingella potus (0.400),
#> Kluyveromyces pseudotropicale (0.386), Kluyveromyces pseudotropicalis
#> (0.363), and Kosakonia pseudosacchari (0.361)
#> --------------------------------------------------------------------------------
#> "S. aureus" -> Staphylococcus aureus (B_STPHY_AURS, 0.690)
#> Also matched: Staphylococcus aureus aureus (0.643), Staphylococcus
#> argenteus (0.625), Staphylococcus aureus anaerobius (0.625),
#> Staphylococcus auricularis (0.615), Salmonella Aurelianis (0.595),
#> Salmonella Aarhus (0.588), Salmonella Amounderness (0.587),
#> Staphylococcus argensis (0.587), Streptococcus australis (0.587), and
#> Salmonella choleraesuis arizonae (0.562)
#> --------------------------------------------------------------------------------
#> "S. pneumoniae" -> Streptococcus pneumoniae (B_STRPT_PNMN, 0.750)
#> Also matched: Streptococcus pseudopneumoniae (0.700), Streptococcus
#> phocae salmonis (0.552), Serratia proteamaculans quinovora (0.545),
#> Streptococcus pseudoporcinus (0.536), Staphylococcus piscifermentans
#> (0.533), Staphylococcus pseudintermedius (0.532), Serratia
#> proteamaculans proteamaculans (0.526), Streptococcus gallolyticus
#> pasteurianus (0.526), Salmonella Portanigra (0.524), and Streptococcus
#> periodonticum (0.519)
#>
#> Only the first 10 other matches of each record are shown. Run
#> `print(mo_uncertainties(), n = ...)` to view more entries, or save
#> `mo_uncertainties()` to an object.
```
That's all good.
### Antibiotic results
The column with antibiotic test results must also be cleaned. The `AMR`
package comes with three new data types to work with such test results:
`mic` for minimal inhibitory concentrations (MIC), `disk` for disk
diffusion diameters, and `sir` for SIR data that have been interpreted
already. This package can also determine SIR values based on MIC or disk
diffusion values, read more about that on the
[`as.sir()`](https://amr-for-r.org/reference/as.sir.md) page.
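A minimal sketch of the three classes, assuming the `AMR` package is installed. The input values and the object names (`example_mic`, `example_disk`, `example_sir`) are made up for illustration.

``` r
library(AMR)

# Minimal inhibitory concentrations, including an operator such as ">="
example_mic <- as.mic(c("0.5", "2", ">=32"))

# Disk diffusion zone diameters in millimetres
example_disk <- as.disk(c(18, 22, 30))

# Results that were already interpreted as S, I or R
example_sir <- as.sir(c("S", "I", "R"))

# Each object carries its own class, which printing and summaries rely on
class(example_mic)
class(example_disk)
class(example_sir)
```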
For now, we will just clean the SIR columns in our data using dplyr:
``` r
# method 1, be explicit about the columns:
our_data <- our_data %>%
mutate_at(vars(AMX:GEN), as.sir)
# method 2, let the AMR package determine the eligible columns
our_data <- our_data %>%
mutate_if(is_sir_eligible, as.sir)
# result:
our_data
#> # A tibble: 3,000 × 8
#> patient_id hospital date bacteria AMX AMC CIP GEN
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir>
#> 1 J3 A 2012-11-21 B_ESCHR_COLI R I S S
#> 2 R7 A 2018-04-03 B_KLBSL_PNMN R I S S
#> 3 P3 A 2014-09-19 B_ESCHR_COLI R S S S
#> 4 P10 A 2015-12-10 B_ESCHR_COLI S I S S
#> 5 B7 A 2015-03-02 B_ESCHR_COLI S S S S
#> 6 W3 A 2018-03-31 B_STPHY_AURS R S R S
#> 7 J8 A 2016-06-14 B_ESCHR_COLI R S S S
#> 8 M3 A 2015-10-25 B_ESCHR_COLI R S S S
#> 9 J3 A 2019-06-19 B_ESCHR_COLI S S S S
#> 10 G6 A 2015-04-27 B_STPHY_AURS S S S S
#> # 2,990 more rows
```
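`mutate_at()` and `mutate_if()` are superseded in dplyr 1.0.0 and later. As a sketch of an alternative, the same cleaning can be written with `across()`. This assumes dplyr 1.0.0 or newer and starts again from `example_isolates_unclean` so that it runs on its own; the object names are hypothetical.

``` r
library(dplyr)
library(AMR)

# explicit column range, equivalent in intent to mutate_at(vars(AMX:GEN), as.sir)
cleaned_explicit <- example_isolates_unclean %>%
  mutate(across(AMX:GEN, as.sir))

# let the AMR package decide which columns are eligible,
# equivalent in intent to mutate_if(is_sir_eligible, as.sir)
cleaned_eligible <- example_isolates_unclean %>%
  mutate(across(where(is_sir_eligible), as.sir))
```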
That is basically it for the cleaning; time to start the data inclusion.
### First isolates
We need to know which isolates we can *actually* use for analysis
without repetition bias.
To conduct an analysis of antimicrobial resistance, you must [only
include the first isolate of every patient per
episode](https://pubmed.ncbi.nlm.nih.gov/17304462/) (Hindler *et al.*,
Clin Infect Dis. 2007). If you do not do this, you could easily
overestimate or underestimate the resistance to an antibiotic.
Imagine that a patient was admitted with an MRSA and that it was found
in 5 different blood cultures in the following weeks (yes, some countries
like the Netherlands have these blood drawing policies). The resistance
percentage to oxacillin across all isolates would be overestimated, because
you included this MRSA more than once. It would clearly be [selection
bias](https://en.wikipedia.org/wiki/Selection_bias).
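The size of this bias can be shown with a small worked example in base R. The data are entirely made up: one patient with an MRSA found in five cultures, and nine other patients with one oxacillin-susceptible isolate each. Keeping the first row per patient is a deliberately simple stand-in and not the algorithm that the package applies.

``` r
# Hypothetical cultures: patient "p01" contributes five resistant isolates
cultures <- data.frame(
  patient_id = c(rep("p01", 5), sprintf("p%02d", 2:10)),
  OXA        = c(rep("R", 5), rep("S", 9)),
  stringsAsFactors = FALSE
)

# Naive estimate: every culture counts, so the MRSA is counted five times
naive_resistance <- mean(cultures$OXA == "R") # 5 of 14 isolates

# Simplified de-duplication: keep only the first row of each patient
first_only <- cultures[!duplicated(cultures$patient_id), ]
first_isolate_resistance <- mean(first_only$OXA == "R") # 1 of 10 patients

c(naive = naive_resistance, first_only = first_isolate_resistance)
```

In this toy case the naive estimate is more than three times the de-duplicated one, purely because one patient was sampled repeatedly.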
The Clinical and Laboratory Standards Institute (CLSI) describes this as
follows:
> *(…) When preparing a cumulative antibiogram to guide clinical
> decisions about empirical antimicrobial therapy of initial infections,
> **only the first isolate of a given species per patient, per analysis
> period (eg, one year) should be included, irrespective of body site,
> antimicrobial susceptibility profile, or other phenotypical
> characteristics (eg, biotype)**. The first isolate is easily
> identified, and cumulative antimicrobial susceptibility test data
> prepared using the first isolate are generally comparable to
> cumulative antimicrobial susceptibility test data calculated by other
> methods, providing duplicate isolates are excluded.*
> [M39-A4 Analysis and Presentation of Cumulative Antimicrobial
> Susceptibility Test Data, 4th Edition. CLSI, 2014. Chapter
> 6.4](https://clsi.org/standards/products/microbiology/documents/m39/)
This `AMR` package includes this methodology with the
[`first_isolate()`](https://amr-for-r.org/reference/first_isolate.md)
function and is able to apply the four different methods as defined by
[Hindler *et al.* in
2007](https://academic.oup.com/cid/article/44/6/867/364325):
phenotype-based, episode-based, patient-based, isolate-based. The right
method depends on your goals and analysis, but the default
phenotype-based method is in any case the method to properly correct for
most duplicate isolates. Read more about the methods on the
[`first_isolate()`](https://amr-for-r.org/reference/first_isolate.md)
page.
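As a sketch of how the four methods compare, the `method` argument of `first_isolate()` can be varied on the package's `example_isolates` data set. This assumes the `AMR` package is installed and that the columns for microorganism, date and patient are guessed automatically; the object names are hypothetical and no counts are given here because they depend on the package version.

``` r
library(AMR)

methods <- c("phenotype-based", "episode-based", "patient-based", "isolate-based")

# Number of isolates that each method would keep
n_first <- vapply(
  methods,
  function(m) {
    sum(first_isolate(example_isolates, method = m, info = FALSE), na.rm = TRUE)
  },
  integer(1)
)

n_first
```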
The outcome of the function can easily be added to our data:
``` r
our_data <- our_data %>%
mutate(first = first_isolate(info = TRUE))
#> Determining first isolates using an episode length of 365 days
#> Using column 'bacteria' as input for `col_mo`.
#> Using column 'date' as input for `col_date`.
#> Using column 'patient_id' as input for `col_patient_id`.
#> Basing inclusion on all antimicrobial results, using a points threshold
#> of 2
#> => Found 2,724 'phenotype-based' first isolates (90.8% of total where a
#> microbial ID was available)
```
So only 91% is suitable for resistance analysis! We can now filter on it
with the [`filter()`](https://dplyr.tidyverse.org/reference/filter.html)
function, also from the `dplyr` package:
``` r
our_data_1st <- our_data %>%
filter(first == TRUE)
```
For future use, the above two syntaxes can be shortened:
``` r
our_data_1st <- our_data %>%
filter_first_isolate()
```
So we end up with 2,724 isolates for analysis. Our data now looks like this:
``` r
our_data_1st
#> # A tibble: 2,724 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
#> 1 J3 A 2012-11-21 B_ESCHR_COLI R I S S TRUE
#> 2 R7 A 2018-04-03 B_KLBSL_PNMN R I S S TRUE
#> 3 P3 A 2014-09-19 B_ESCHR_COLI R S S S TRUE
#> 4 P10 A 2015-12-10 B_ESCHR_COLI S I S S TRUE
#> 5 B7 A 2015-03-02 B_ESCHR_COLI S S S S TRUE
#> 6 W3 A 2018-03-31 B_STPHY_AURS R S R S TRUE
#> 7 M3 A 2015-10-25 B_ESCHR_COLI R S S S TRUE
#> 8 J3 A 2019-06-19 B_ESCHR_COLI S S S S TRUE
#> 9 G6 A 2015-04-27 B_STPHY_AURS S S S S TRUE
#> 10 P4 A 2011-06-21 B_ESCHR_COLI S S S S TRUE
#> # 2,714 more rows
```
Time for the analysis.
## Analysing the data
The base R [`summary()`](https://rdrr.io/r/base/summary.html) function
gives a good first impression, as it comes with support for the new `mo`
and `sir` classes that we now have in our data set:
``` r
summary(our_data_1st)
#> patient_id hospital date
#> Length:2724 Length:2724 Min. :2011-01-01
#> Class :character Class :character 1st Qu.:2013-04-07
#> Mode :character Mode :character Median :2015-06-03
#> Mean :2015-06-09
#> 3rd Qu.:2017-08-11
#> Max. :2019-12-27
#> bacteria AMX AMC
#> Class :mo Class:sir Class:sir
#> <NA> :0 %S :41.6% (n=1133) %S :52.6% (n=1432)
#> Unique:4 %SDD : 0.0% (n=0) %SDD : 0.0% (n=0)
#> #1 :B_ESCHR_COLI %I :16.4% (n=446) %I :12.2% (n=333)
#> #2 :B_STPHY_AURS %R :42.0% (n=1145) %R :35.2% (n=959)
#> #3 :B_STRPT_PNMN %NI : 0.0% (n=0) %NI : 0.0% (n=0)
#> CIP GEN first
#> Class:sir Class:sir Mode:logical
#> %S :52.5% (n=1431) %S :61.0% (n=1661) TRUE:2724
#> %SDD : 0.0% (n=0) %SDD : 0.0% (n=0)
#> %I : 6.5% (n=176) %I : 3.0% (n=82)
#> %R :41.0% (n=1117) %R :36.0% (n=981)
#> %NI : 0.0% (n=0) %NI : 0.0% (n=0)
glimpse(our_data_1st)
#> Rows: 2,724
#> Columns: 9
#> $ patient_id <chr> "J3", "R7", "P3", "P10", "B7", "W3", "M3", "J3", "G6", "P4"…
#> $ hospital <chr> "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A",…
#> $ date <date> 2012-11-21, 2018-04-03, 2014-09-19, 2015-12-10, 2015-03-02…
#> $ bacteria <mo> "B_ESCHR_COLI", "B_KLBSL_PNMN", "B_ESCHR_COLI", "B_ESCHR_COL…
#> $ AMX <sir> R, R, R, S, S, R, R, S, S, S, S, R, S, S, R, R, R, R, S, R,…
#> $ AMC <sir> I, I, S, I, S, S, S, S, S, S, S, S, S, S, S, S, S, R, S, S,…
#> $ CIP <sir> S, S, S, S, S, R, S, S, S, S, S, S, S, S, S, S, S, S, S, S,…
#> $ GEN <sir> S, S, S, S, S, S, S, S, S, S, S, R, S, S, S, S, S, S, S, S,…
#> $ first <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,…
# number of unique values per column:
sapply(our_data_1st, n_distinct)
#> patient_id hospital date bacteria AMX AMC CIP
#> 260 3 1854 4 3 3 3
#> GEN first
#> 3 1
```
### Availability of species
To just get an idea how the species are distributed, create a frequency
table with [`count()`](https://amr-for-r.org/reference/count.md) based
on the name of the microorganisms:
``` r
our_data %>%
count(mo_name(bacteria), sort = TRUE)
#> # A tibble: 4 × 2
#> `mo_name(bacteria)` n
#> <chr> <int>
#> 1 Escherichia coli 1518
#> 2 Staphylococcus aureus 730
#> 3 Streptococcus pneumoniae 426
#> 4 Klebsiella pneumoniae 326
our_data_1st %>%
count(mo_name(bacteria), sort = TRUE)
#> # A tibble: 4 × 2
#> `mo_name(bacteria)` n
#> <chr> <int>
#> 1 Escherichia coli 1321
#> 2 Staphylococcus aureus 682
#> 3 Streptococcus pneumoniae 402
#> 4 Klebsiella pneumoniae 319
```
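The two frequency tables can be put side by side to see how many isolates per species survived the first-isolate filter. The sketch below simply re-types the counts printed above (they will differ on another render of this page) and uses hypothetical object names.

``` r
# Counts copied from the two outputs above
species <- c("Escherichia coli", "Staphylococcus aureus",
             "Streptococcus pneumoniae", "Klebsiella pneumoniae")
n_all   <- c(1518, 730, 426, 326) # all isolates
n_first <- c(1321, 682, 402, 319) # first isolates only

retained <- data.frame(
  species      = species,
  n_all        = n_all,
  n_first      = n_first,
  pct_retained = round(100 * n_first / n_all, 1)
)
retained

# Overall share of isolates kept for analysis
overall_pct <- round(100 * sum(n_first) / sum(n_all), 1)
overall_pct
```

The overall share corresponds to the percentage that `first_isolate()` reported earlier.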
### Select and filter with antibiotic selectors
Using so-called antibiotic class selectors, you can select or filter
columns based on the antibiotic class that your antibiotic results are
in:
``` r
our_data_1st %>%
select(date, aminoglycosides())
#> For `aminoglycosides()` using column 'GEN' (gentamicin)
#> # A tibble: 2,724 × 2
#> date GEN
#> <date> <sir>
#> 1 2012-11-21 S
#> 2 2018-04-03 S
#> 3 2014-09-19 S
#> 4 2015-12-10 S
#> 5 2015-03-02 S
#> 6 2018-03-31 S
#> 7 2015-10-25 S
#> 8 2019-06-19 S
#> 9 2015-04-27 S
#> 10 2011-06-21 S
#> # 2,714 more rows
our_data_1st %>%
select(bacteria, betalactams())
#> For `betalactams()` using columns 'AMX' (amoxicillin) and 'AMC'
#> (amoxicillin/clavulanic acid)
#> # A tibble: 2,724 × 3
#> bacteria AMX AMC
#> <mo> <sir> <sir>
#> 1 B_ESCHR_COLI R I
#> 2 B_KLBSL_PNMN R I
#> 3 B_ESCHR_COLI R S
#> 4 B_ESCHR_COLI S I
#> 5 B_ESCHR_COLI S S
#> 6 B_STPHY_AURS R S
#> 7 B_ESCHR_COLI R S
#> 8 B_ESCHR_COLI S S
#> 9 B_STPHY_AURS S S
#> 10 B_ESCHR_COLI S S
#> # 2,714 more rows
our_data_1st %>%
select(bacteria, where(is.sir))
#> # A tibble: 2,724 × 5
#> bacteria AMX AMC CIP GEN
#> <mo> <sir> <sir> <sir> <sir>
#> 1 B_ESCHR_COLI R I S S
#> 2 B_KLBSL_PNMN R I S S
#> 3 B_ESCHR_COLI R S S S
#> 4 B_ESCHR_COLI S I S S
#> 5 B_ESCHR_COLI S S S S
#> 6 B_STPHY_AURS R S R S
#> 7 B_ESCHR_COLI R S S S
#> 8 B_ESCHR_COLI S S S S
#> 9 B_STPHY_AURS S S S S
#> 10 B_ESCHR_COLI S S S S
#> # 2,714 more rows
# filtering using AB selectors is also possible:
our_data_1st %>%
filter(any(aminoglycosides() == "R"))
#> For `aminoglycosides()` using column 'GEN' (gentamicin)
#> # A tibble: 981 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
#> 1 J5 A 2017-12-25 B_STRPT_PNMN R S S R TRUE
#> 2 X1 A 2017-07-04 B_STPHY_AURS R S S R TRUE
#> 3 B3 A 2016-07-24 B_ESCHR_COLI S S S R TRUE
#> 4 V7 A 2012-04-03 B_ESCHR_COLI S S S R TRUE
#> 5 C9 A 2017-03-23 B_ESCHR_COLI S S S R TRUE
#> 6 R1 A 2018-06-10 B_STPHY_AURS S S S R TRUE
#> 7 S2 A 2013-07-19 B_STRPT_PNMN S S S R TRUE
#> 8 P5 A 2019-03-09 B_STPHY_AURS S S S R TRUE
#> 9 Q8 A 2019-08-10 B_STPHY_AURS S S S R TRUE
#> 10 K5 A 2013-03-15 B_STRPT_PNMN S S S R TRUE
#> # 971 more rows
our_data_1st %>%
filter(all(betalactams() == "R"))
#> For `betalactams()` using columns 'AMX' (amoxicillin) and 'AMC'
#> (amoxicillin/clavulanic acid)
#> # A tibble: 462 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
#> # 452 more rows
# even works in base R (since R 3.0):
our_data_1st[all(betalactams() == "R"), ]
#> For `betalactams()` using columns 'AMX' (amoxicillin) and 'AMC'
#> (amoxicillin/clavulanic acid)
#> # A tibble: 462 × 9
#> patient_id hospital date bacteria AMX AMC CIP GEN first
#> <chr> <chr> <date> <mo> <sir> <sir> <sir> <sir> <lgl>
#> 1 M7 A 2013-07-22 B_STRPT_PNMN R R S S TRUE
#> 2 R10 A 2013-12-20 B_STPHY_AURS R R S S TRUE
#> 3 R7 A 2015-10-25 B_STPHY_AURS R R S S TRUE
#> 4 R8 A 2019-10-25 B_STPHY_AURS R R S S TRUE
#> 5 B6 A 2016-11-20 B_ESCHR_COLI R R R R TRUE
#> 6 I7 A 2015-08-19 B_ESCHR_COLI R R S S TRUE
#> 7 N3 A 2014-12-29 B_STRPT_PNMN R R R S TRUE
#> 8 Q2 A 2019-09-22 B_ESCHR_COLI R R S S TRUE
#> 9 X7 A 2011-03-20 B_ESCHR_COLI R R S R TRUE
#> 10 V1 A 2018-08-07 B_STPHY_AURS R R S S TRUE
#> # 452 more rows
```
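In the filters above, `any()` and `all()` are evaluated per row across the selected columns, as the returned rows show. The base R sketch below reproduces that row-wise logic on a small made-up data frame. It is an illustration of the idea, not the package's implementation, and it ignores missing values.

``` r
# Hypothetical results for two beta-lactams in four isolates
sir_results <- data.frame(
  AMX = c("R", "R", "S", "S"),
  AMC = c("R", "S", "R", "S"),
  stringsAsFactors = FALSE
)

# Comparing a data frame with a value gives a logical matrix
is_resistant <- sir_results == "R"

# Row-wise "any": at least one selected column is resistant
any_resistant <- unname(rowSums(is_resistant) > 0)

# Row-wise "all": every selected column is resistant
all_resistant <- unname(rowSums(is_resistant) == ncol(is_resistant))

sir_results[all_resistant, ]
```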
### Generate antibiograms
Since AMR v2.0 (March 2023), it is very easy to create different types
of antibiograms, with support for 28 different languages.
There are four antibiogram types, as proposed by Klinker *et al.* (2021,
[DOI
10.1177/20499361211011373](https://doi.org/10.1177/20499361211011373)),
and they are all supported by the new
[`antibiogram()`](https://amr-for-r.org/reference/antibiogram.md)
function:
1. **Traditional Antibiogram (TA)** e.g., for the susceptibility of
*Pseudomonas aeruginosa* to piperacillin/tazobactam (TZP)
2. **Combination Antibiogram (CA)** e.g., for the additional
susceptibility of *Pseudomonas aeruginosa* to TZP + tobramycin
versus TZP alone
3. **Syndromic Antibiogram (SA)** e.g., for the susceptibility of
*Pseudomonas aeruginosa* to TZP among respiratory specimens
(obtained among ICU patients only)
4. **Weighted-Incidence Syndromic Combination Antibiogram (WISCA)**
e.g., for the susceptibility of *Pseudomonas aeruginosa* to TZP among
respiratory specimens (obtained among ICU patients only) for male
patients aged \>=65 years with heart failure
In this section, we show how to use the
[`antibiogram()`](https://amr-for-r.org/reference/antibiogram.md)
function to create any of the above antibiogram types. For starters,
this is what the included `example_isolates` data set looks like:
``` r
example_isolates
#> # A tibble: 2,000 × 46
#> date patient age gender ward mo PEN OXA FLC AMX
#> <date> <chr> <dbl> <chr> <chr> <mo> <sir> <sir> <sir> <sir>
#> 1 2002-01-02 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 2 2002-01-03 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 3 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 4 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 5 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 6 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 7 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 8 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 9 2002-01-16 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 10 2002-01-17 858515 79 F ICU B_STPHY_EPDR R NA S NA
#> # 1,990 more rows
#> # 36 more variables: AMC <sir>, AMP <sir>, TZP <sir>, CZO <sir>, FEP <sir>,
#> # CXM <sir>, FOX <sir>, CTX <sir>, CAZ <sir>, CRO <sir>, GEN <sir>,
#> # TOB <sir>, AMK <sir>, KAN <sir>, TMP <sir>, SXT <sir>, NIT <sir>,
#> # FOS <sir>, LNZ <sir>, CIP <sir>, MFX <sir>, VAN <sir>, TEC <sir>,
#> # TCY <sir>, TGC <sir>, DOX <sir>, ERY <sir>, CLI <sir>, AZM <sir>,
#> # IPM <sir>, MEM <sir>, MTR <sir>, CHL <sir>, COL <sir>, MUP <sir>, …
```
#### Traditional Antibiogram
To create a traditional antibiogram, simply state which antibiotics
should be used. The `antibiotics` argument in the
[`antibiogram()`](https://amr-for-r.org/reference/antibiogram.md)
function supports any (combination of) the previously mentioned
antibiotic class selectors:
``` r
antibiogram(example_isolates,
antibiotics = c(aminoglycosides(), carbapenems()))
#> For `aminoglycosides()` using columns 'GEN' (gentamicin), 'TOB'
#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)
#> For `carbapenems()` using columns 'IPM' (imipenem) and 'MEM' (meropenem)
```
| Pathogen | Amikacin | Gentamicin | Imipenem | Kanamycin | Meropenem | Tobramycin |
|:-----------------|:---------------------|:--------------------|:---------------------|:----------------|:---------------------|:--------------------|
| CoNS | 0% (0-8%,N=43) | 86% (82-90%,N=309) | 52% (37-67%,N=48) | 0% (0-8%,N=43) | 52% (37-67%,N=48) | 22% (12-35%,N=55) |
| *E. coli* | 100% (98-100%,N=171) | 98% (96-99%,N=460) | 100% (99-100%,N=422) | NA | 100% (99-100%,N=418) | 97% (96-99%,N=462) |
| *E. faecalis* | 0% (0-9%,N=39) | 0% (0-9%,N=39) | 100% (91-100%,N=38) | 0% (0-9%,N=39) | NA | 0% (0-9%,N=39) |
| *K. pneumoniae* | NA | 90% (79-96%,N=58) | 100% (93-100%,N=51) | NA | 100% (93-100%,N=53) | 90% (79-96%,N=58) |
| *P. aeruginosa* | NA | 100% (88-100%,N=30) | NA | 0% (0-12%,N=30) | NA | 100% (88-100%,N=30) |
| *P. mirabilis* | NA | 94% (80-99%,N=34) | 94% (79-99%,N=32) | NA | NA | 94% (80-99%,N=34) |
| *S. aureus* | NA | 99% (97-100%,N=233) | NA | NA | NA | 98% (92-100%,N=86) |
| *S. epidermidis* | 0% (0-8%,N=44) | 79% (71-85%,N=163) | NA | 0% (0-8%,N=44) | NA | 51% (40-61%,N=89) |
| *S. hominis* | NA | 92% (84-97%,N=80) | NA | NA | NA | 85% (74-93%,N=62) |
| *S. pneumoniae* | 0% (0-3%,N=117) | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) |
Notice that the
[`antibiogram()`](https://amr-for-r.org/reference/antibiogram.md)
function automatically prints in the right format when using Quarto or R
Markdown (such as this page), and even applies italics for taxonomic
names (by using
[`italicise_taxonomy()`](https://amr-for-r.org/reference/italicise_taxonomy.md)
internally).
It also uses the language of your OS if this is either English, Arabic,
Bengali, Chinese, Czech, Danish, Dutch, Finnish, French, German, Greek,
Hindi, Indonesian, Italian, Japanese, Korean, Norwegian, Polish,
Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Turkish,
Ukrainian, Urdu, or Vietnamese. In this next example, we force the
language to be Spanish using the `language` argument:
``` r
antibiogram(example_isolates,
mo_transform = "gramstain",
antibiotics = aminoglycosides(),
ab_transform = "name",
language = "es")
#> For `aminoglycosides()` using columns 'GEN' (gentamicin), 'TOB'
#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)
```
| Patógeno | Amikacina | Gentamicina | Kanamicina | Tobramicina |
|:--------------|:-------------------|:--------------------|:----------------|:-------------------|
| Gram negativo | 98% (96-99%,N=256) | 96% (95-98%,N=684) | 0% (0-10%,N=35) | 96% (94-97%,N=686) |
| Gram positivo | 0% (0-1%,N=436) | 63% (60-66%,N=1170) | 0% (0-1%,N=436) | 34% (31-38%,N=665) |
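The `language` argument is not specific to antibiograms. As a small sketch, assuming the `AMR` package is installed, the `mo_*` functions accept it as well, which makes it easy to check a translation for a single organism. The object names are hypothetical.

``` r
library(AMR)

# Same property, requested in English and in Spanish
gram_en <- mo_gramstain("E. coli", language = "en")
gram_es <- mo_gramstain("E. coli", language = "es")

c(en = gram_en, es = gram_es)
```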
#### Combined Antibiogram
To create a combined antibiogram, use antibiotic codes or names with a
plus `+` character like this:
``` r
combined_ab <- antibiogram(example_isolates,
antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"),
ab_transform = NULL)
combined_ab
```
| Pathogen | TZP | TZP + GEN | TZP + TOB |
|:-----------------|:---------------------|:---------------------|:---------------------|
| CoNS | 30% (16-49%,N=33) | 97% (95-99%,N=274) | NA |
| *E. coli* | 94% (92-96%,N=416) | 100% (98-100%,N=459) | 99% (97-100%,N=461) |
| *K. pneumoniae* | 89% (77-96%,N=53) | 93% (83-98%,N=58) | 93% (83-98%,N=58) |
| *P. aeruginosa* | NA | 100% (88-100%,N=30) | 100% (88-100%,N=30) |
| *P. mirabilis* | NA | 100% (90-100%,N=34) | 100% (90-100%,N=34) |
| *S. aureus* | NA | 100% (98-100%,N=231) | 100% (96-100%,N=91) |
| *S. epidermidis* | NA | 100% (97-100%,N=128) | 100% (92-100%,N=46) |
| *S. hominis* | NA | 100% (95-100%,N=74) | 100% (93-100%,N=53) |
| *S. pneumoniae* | 100% (97-100%,N=112) | 100% (97-100%,N=112) | 100% (97-100%,N=112) |
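The scoring of a combination such as `TZP+TOB` can be sketched in plain Python. This is a minimal illustration, assuming that an isolate counts as covered when it is susceptible (S or I) to at least one agent, and that isolates without a result for any agent are left out. The helper `combined_susceptibility()` and the sample data are hypothetical and are not the package's own code:

``` python
def combined_susceptibility(isolates, agents):
    """Share of isolates susceptible (S or I) to at least one agent.

    Returns (share, number_of_isolates_counted).
    """
    tested = covered = 0
    for isolate in isolates:
        results = [isolate.get(agent) for agent in agents]
        if all(r is None for r in results):
            continue  # no result for any agent: isolate is not counted
        tested += 1
        if any(r in ("S", "I") for r in results):
            covered += 1
    return (covered / tested if tested else None), tested


# Hypothetical isolates; None stands for a missing test result
isolates = [
    {"TZP": "R", "TOB": "S"},
    {"TZP": "S", "TOB": None},
    {"TZP": "R", "TOB": "R"},
    {"TZP": None, "TOB": None},
]
share, n = combined_susceptibility(isolates, ["TZP", "TOB"])
print(f"{share:.0%} (N={n})")
```

Under these assumptions a combination can only match or exceed the coverage of its single agents on the same isolates, which agrees with the pattern in the table above.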
#### Syndromic Antibiogram
To create a syndromic antibiogram, the `syndromic_group` argument must
be used. This can be any column in the data, or e.g. an
[`ifelse()`](https://rdrr.io/r/base/ifelse.html) with calculations based
on certain columns:
``` r
antibiogram(example_isolates,
antibiotics = c(aminoglycosides(), carbapenems()),
syndromic_group = "ward")
#> For `aminoglycosides()` using columns 'GEN' (gentamicin), 'TOB'
#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)
#> For `carbapenems()` using columns 'IPM' (imipenem) and 'MEM' (meropenem)
```
| Syndromic Group | Pathogen | Amikacin | Gentamicin | Imipenem | Kanamycin | Meropenem | Tobramycin |
|:----------------|:-----------------|:---------------------|:--------------------|:---------------------|:----------------|:---------------------|:--------------------|
| Clinical | CoNS | NA | 89% (84-93%,N=205) | 57% (39-74%,N=35) | NA | 57% (39-74%,N=35) | 26% (12-45%,N=31) |
| ICU | CoNS | NA | 79% (68-88%,N=73) | NA | NA | NA | NA |
| Outpatient | CoNS | NA | 84% (66-95%,N=31) | NA | NA | NA | NA |
| Clinical | *E. coli* | 100% (97-100%,N=104) | 98% (96-99%,N=297) | 100% (99-100%,N=266) | NA | 100% (99-100%,N=276) | 98% (96-99%,N=299) |
| ICU | *E. coli* | 100% (93-100%,N=52) | 99% (95-100%,N=137) | 100% (97-100%,N=133) | NA | 100% (97-100%,N=118) | 96% (92-99%,N=137) |
| Clinical | *K. pneumoniae* | NA | 92% (81-98%,N=51) | 100% (92-100%,N=44) | NA | 100% (92-100%,N=46) | 92% (81-98%,N=51) |
| Clinical | *P. mirabilis* | NA | 100% (88-100%,N=30) | NA | NA | NA | 100% (88-100%,N=30) |
| Clinical | *S. aureus* | NA | 99% (95-100%,N=150) | NA | NA | NA | 97% (89-100%,N=63) |
| ICU | *S. aureus* | NA | 100% (95-100%,N=66) | NA | NA | NA | NA |
| Clinical | *S. epidermidis* | NA | 82% (72-90%,N=79) | NA | NA | NA | 55% (39-70%,N=44) |
| ICU | *S. epidermidis* | NA | 72% (60-82%,N=75) | NA | NA | NA | 41% (26-58%,N=41) |
| Clinical | *S. hominis* | NA | 96% (85-99%,N=45) | NA | NA | NA | 94% (79-99%,N=31) |
| Clinical | *S. pneumoniae* | 0% (0-5%,N=78) | 0% (0-5%,N=78) | NA | 0% (0-5%,N=78) | NA | 0% (0-5%,N=78) |
| ICU | *S. pneumoniae* | 0% (0-12%,N=30) | 0% (0-12%,N=30) | NA | 0% (0-12%,N=30) | NA | 0% (0-12%,N=30) |
#### Weighted-Incidence Syndromic Combination Antibiogram (WISCA)
To create a **Weighted-Incidence Syndromic Combination Antibiogram
(WISCA)**, simply set `wisca = TRUE` in the
[`antibiogram()`](https://amr-for-r.org/reference/antibiogram.md)
function, or use the dedicated
[`wisca()`](https://amr-for-r.org/reference/antibiogram.md) function.
Unlike traditional antibiograms, WISCA provides syndrome-based
susceptibility estimates, weighted by pathogen incidence and
antimicrobial susceptibility patterns.
``` r
example_isolates %>%
wisca(antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"),
minimum = 10) # Recommended threshold: ≥30
```
| Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
|:------------------------|:-------------------------------------|:-------------------------------------|
| 69.4% (64.3-74.3%) | 92.6% (91.1-93.9%) | 88.7% (85.8-91.2%) |
WISCA uses a **Bayesian decision model** to integrate data from multiple
pathogens, improving empirical therapy guidance, especially for
low-incidence infections. It is **pathogen-agnostic**, meaning results
are syndrome-based rather than stratified by microorganism.
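The weighting idea behind a syndrome-based estimate can be sketched as follows. This is only a point-estimate illustration in plain Python with made-up numbers; it does not reproduce the Bayesian model or the credible intervals that `wisca()` reports, and the helper `weighted_coverage()` is hypothetical:

``` python
def weighted_coverage(pathogens):
    """Incidence-weighted susceptibility across pathogens (point estimate only)."""
    total = sum(p["n"] for p in pathogens)
    return sum(p["n"] / total * p["susceptibility"] for p in pathogens)


# Hypothetical syndrome with three causative pathogens
pathogens = [
    {"name": "E. coli", "n": 60, "susceptibility": 0.95},
    {"name": "K. pneumoniae", "n": 30, "susceptibility": 0.90},
    {"name": "P. aeruginosa", "n": 10, "susceptibility": 0.50},
]
coverage = weighted_coverage(pathogens)
print(f"Expected coverage: {coverage:.1%}")
```

A rare but poorly covered pathogen lowers the estimate only in proportion to its incidence, which is why the result is reported per syndrome rather than per microorganism.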
For reliable results, ensure your data includes **only first isolates**
(use
[`first_isolate()`](https://amr-for-r.org/reference/first_isolate.md))
and consider filtering for **the top *n* species** (use
[`top_n_microorganisms()`](https://amr-for-r.org/reference/top_n_microorganisms.md)),
as WISCA outcomes are most meaningful when based on robust incidence
estimates.
For **patient- or syndrome-specific WISCA**, run the function on a
grouped `tibble`, i.e., using
[`group_by()`](https://dplyr.tidyverse.org/reference/group_by.html)
first:
``` r
example_isolates %>%
top_n_microorganisms(n = 10) %>%
group_by(age_group = age_groups(age, c(25, 50, 75)),
gender) %>%
wisca(antibiotics = c("TZP", "TZP+TOB", "TZP+GEN"))
```
| age_group | gender | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
|:----------|:-------|:------------------------|:-------------------------------------|:-------------------------------------|
| 0-24 | F | 56.6% (25.2-83.9%) | 73.6% (48-91.6%) | 68.6% (42.9-89.5%) |
| 0-24 | M | 60.3% (28.4-87.1%) | 79.7% (57.6-94.2%) | 60.1% (29.5-87.7%) |
| 25-49 | F | 66.6% (45.6-85.5%) | 91.7% (84.6-96.7%) | 83% (67.9-94%) |
| 25-49 | M | 56.4% (29.1-81.7%) | 89.2% (80.3-95.7%) | 72.4% (49.7-90%) |
| 50-74 | F | 67.8% (55.8-80.1%) | 95.6% (93.2-97.5%) | 88.1% (80.4-94.6%) |
| 50-74 | M | 66.2% (54.8-75.8%) | 95.2% (92.4-97.4%) | 84.4% (74.4-92.5%) |
| 75+ | F | 71.7% (61-81.7%) | 96.6% (94.4-98.2%) | 90.6% (84.6-95.3%) |
| 75+ | M | 72.9% (63.8-82%) | 96.6% (94.6-98.1%) | 92.8% (87.8-96.5%) |
#### Plotting antibiograms
Antibiograms can be plotted using
[`autoplot()`](https://ggplot2.tidyverse.org/reference/autoplot.html)
from the `ggplot2` package, since this `AMR` package provides an
extension to that function:
``` r
autoplot(combined_ab)
```
![](AMR_files/figure-html/unnamed-chunk-10-1.png)
To calculate antimicrobial resistance in a more sensible way, also by
correcting for too few results, we use the
[`resistance()`](https://amr-for-r.org/reference/proportion.md) and
[`susceptibility()`](https://amr-for-r.org/reference/proportion.md)
functions.
### Resistance percentages
The functions
[`resistance()`](https://amr-for-r.org/reference/proportion.md) and
[`susceptibility()`](https://amr-for-r.org/reference/proportion.md) can
be used to calculate antimicrobial resistance or susceptibility. For
more specific analyses, the functions
[`proportion_S()`](https://amr-for-r.org/reference/proportion.md),
[`proportion_SI()`](https://amr-for-r.org/reference/proportion.md),
[`proportion_I()`](https://amr-for-r.org/reference/proportion.md),
[`proportion_IR()`](https://amr-for-r.org/reference/proportion.md) and
[`proportion_R()`](https://amr-for-r.org/reference/proportion.md) can be
used to determine the proportion of a specific antimicrobial outcome.
All these functions contain a `minimum` argument, denoting the minimum
required number of test results for returning a value. These functions
will otherwise return `NA`. The default is `minimum = 30`, following the
[CLSI M39-A4
guideline](https://clsi.org/standards/products/microbiology/documents/m39/)
for applying microbial epidemiology.
As per the EUCAST guideline of 2019, we calculate resistance as the
proportion of R
([`proportion_R()`](https://amr-for-r.org/reference/proportion.md),
equal to
[`resistance()`](https://amr-for-r.org/reference/proportion.md)) and
susceptibility as the proportion of S and I
([`proportion_SI()`](https://amr-for-r.org/reference/proportion.md),
equal to
[`susceptibility()`](https://amr-for-r.org/reference/proportion.md)).
These functions can be used on their own:
``` r
our_data_1st %>% resistance(AMX)
#> [1] 0.4203377
```
Or can be used in conjunction with
[`group_by()`](https://dplyr.tidyverse.org/reference/group_by.html) and
[`summarise()`](https://dplyr.tidyverse.org/reference/summarise.html),
both from the `dplyr` package:
``` r
our_data_1st %>%
group_by(hospital) %>%
summarise(amoxicillin = resistance(AMX))
#> # A tibble: 3 × 2
#> hospital amoxicillin
#> <chr> <dbl>
#> 1 A 0.340
#> 2 B 0.551
#> 3 C 0.370
```
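The definitions above, including the `minimum` rule, can be sketched in plain Python. This is a simplified illustration that only knows the categories S, I and R; the function names mirror the R functions for readability but are hypothetical stand-ins, not the package's implementation:

``` python
def proportion(results, categories, minimum=30):
    """Share of valid results that fall in `categories`, or None below `minimum`."""
    valid = [r for r in results if r in ("S", "I", "R")]
    if len(valid) < minimum:
        return None  # too few results, analogous to NA in R
    return sum(r in categories for r in valid) / len(valid)


def resistance(results, minimum=30):
    """Proportion of R."""
    return proportion(results, {"R"}, minimum)


def susceptibility(results, minimum=30):
    """Proportion of S and I together."""
    return proportion(results, {"S", "I"}, minimum)


# Hypothetical results; None stands for a missing test result
results = ["S"] * 20 + ["I"] * 5 + ["R"] * 15 + [None] * 3
print(resistance(results))       # 15 of 40 valid results
print(susceptibility(results))   # 25 of 40 valid results
print(resistance(results[:10]))  # only 10 valid results: below the minimum
```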
### Interpreting MIC and Disk Diffusion Values
Minimal inhibitory concentration (MIC) values and disk diffusion
diameters can be interpreted as clinical categories (SIR), based on
clinical breakpoints, using
[`as.sir()`](https://amr-for-r.org/reference/as.sir.md). Here's an
example with randomly generated MIC values for *Klebsiella pneumoniae*
and ciprofloxacin:
``` r
set.seed(123)
mic_values <- random_mic(100)
sir_values <- as.sir(mic_values, mo = "K. pneumoniae", ab = "cipro", guideline = "EUCAST 2024")
my_data <- tibble(MIC = mic_values, SIR = sir_values)
my_data
#> # A tibble: 100 × 2
#> MIC SIR
#> <mic> <sir>
#> 1 <=0.0001 S
#> 2 0.0160 S
#> 3 >=8.0000 R
#> 4 0.0320 S
#> 5 0.0080 S
#> 6 64.0000 R
#> 7 0.0080 S
#> 8 0.1250 S
#> 9 0.0320 S
#> 10 0.0002 S
#> # 90 more rows
```
This allows direct interpretation according to EUCAST or CLSI
breakpoints, facilitating automated AMR data processing.
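The core of such an interpretation is a comparison against two breakpoints. The sketch below is a plain-Python illustration, assuming breakpoints of S ≤ 0.25 mg/L and R > 0.5 mg/L (illustrative values only; always take the real ones from the guideline tables). Censored values such as `<=0.0001` or `>=8` are taken at face value, which is a simplification, and both helpers are hypothetical:

``` python
def parse_mic(value):
    """Split an MIC string such as '<=0.0001' or '>=8' into (operator, number)."""
    text = str(value).strip()
    for op in ("<=", ">=", "<", ">"):
        if text.startswith(op):
            return op, float(text[len(op):])
    return "", float(text)


def interpret_mic(value, s_max, r_above):
    """Return 'R' if MIC > r_above, 'S' if MIC <= s_max, otherwise 'I'."""
    _, mic = parse_mic(value)  # the operator is ignored in this sketch
    if mic > r_above:
        return "R"
    if mic <= s_max:
        return "S"
    return "I"


# Assumed breakpoints for illustration: S <= 0.25 mg/L, R > 0.5 mg/L
mics = ["<=0.0001", "0.125", "0.5", ">=8", "64"]
sir = [interpret_mic(m, s_max=0.25, r_above=0.5) for m in mics]
print(list(zip(mics, sir)))
```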
### Plotting MIC and SIR Interpretations
We can visualise MIC distributions and their SIR interpretations with
`ggplot2`, using the new
[`scale_y_mic()`](https://amr-for-r.org/reference/plot.md) for the
y-axis and
[`scale_colour_sir()`](https://amr-for-r.org/reference/plot.md) to
colour-code SIR categories.
``` r
# add a group
my_data$group <- rep(c("A", "B", "C", "D"), each = 25)
ggplot(my_data,
aes(x = group, y = MIC, colour = SIR)) +
geom_jitter(width = 0.2, size = 2) +
geom_boxplot(fill = NA, colour = "grey40") +
scale_y_mic() +
scale_colour_sir() +
labs(title = "MIC Distribution and SIR Interpretation",
x = "Sample Groups",
y = "MIC (mg/L)")
```
![](AMR_files/figure-html/mic_plot-1.png)
This plot provides an intuitive way to assess susceptibility patterns
across different groups while incorporating clinical breakpoints.
For a more straightforward and less manual approach, `ggplot2`'s
function
[`autoplot()`](https://ggplot2.tidyverse.org/reference/autoplot.html)
has been extended by this package to directly plot MIC and disk
diffusion values:
``` r
autoplot(mic_values)
```
![](AMR_files/figure-html/autoplot-1.png)
``` r
# by providing `mo` and `ab`, colours will indicate the SIR interpretation:
autoplot(mic_values, mo = "K. pneumoniae", ab = "cipro", guideline = "EUCAST 2024")
```
![](AMR_files/figure-html/autoplot-2.png)
------------------------------------------------------------------------
*Author: Dr. Matthijs Berends, 23rd Feb 2025*


@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">

217
articles/AMR_for_Python.md Normal file

@@ -0,0 +1,217 @@
# AMR for Python
## Introduction
The `AMR` package for R is a powerful tool for antimicrobial resistance
(AMR) analysis. It provides extensive features for handling microbial
and antimicrobial data. However, for those who work primarily in Python,
we now have a more intuitive option available: the [`AMR` Python
package](https://pypi.org/project/AMR/).
This Python package is a wrapper around the `AMR` R package. It uses the
`rpy2` package internally. Despite the need to have R installed, Python
users can now easily work with AMR data directly through Python code.
## Prerequisites
This package was only tested with a [virtual environment
(venv)](https://docs.python.org/3/library/venv.html). You can set up
such an environment by running:
``` bash
# linux and macOS:
python -m venv /path/to/new/virtual/environment
# Windows:
python -m venv C:\path\to\new\virtual\environment
```
Then you can [activate the
environment](https://docs.python.org/3/library/venv.html#how-venvs-work),
after which the venv is ready to work with.
## Install AMR
1. Since the Python package is available on the official [Python
Package Index](https://pypi.org/project/AMR/), you can just run:
``` bash
pip install AMR
```
2. Make sure you have R installed. There is **no need to install the
`AMR` R package**, as it will be installed automatically.
For Linux:
``` bash
# Ubuntu / Debian
sudo apt install r-base
# Fedora:
sudo dnf install R
# CentOS/RHEL
sudo yum install R
```
For macOS (using [Homebrew](https://brew.sh)):
``` bash
brew install r
```
For Windows, visit the [CRAN download
page](https://cran.r-project.org) to download and install R.
## Examples of Usage
### Cleaning Taxonomy
Here's an example that demonstrates how to clean microorganism and drug
names using the `AMR` Python package:
``` python
import pandas as pd
import AMR
# Sample data
data = {
"MOs": ['E. coli', 'ESCCOL', 'esco', 'Esche coli'],
"Drug": ['Cipro', 'CIP', 'J01MA02', 'Ciproxin']
}
df = pd.DataFrame(data)
# Use AMR functions to clean microorganism and drug names
df['MO_clean'] = AMR.mo_name(df['MOs'])
df['Drug_clean'] = AMR.ab_name(df['Drug'])
# Display the results
print(df)
```
| MOs | Drug | MO_clean | Drug_clean |
|------------|----------|------------------|---------------|
| E. coli | Cipro | Escherichia coli | Ciprofloxacin |
| ESCCOL | CIP | Escherichia coli | Ciprofloxacin |
| esco | J01MA02 | Escherichia coli | Ciprofloxacin |
| Esche coli | Ciproxin | Escherichia coli | Ciprofloxacin |
#### Explanation
- **mo_name**: This function standardises microorganism names. Here,
different variations of *Escherichia coli* (such as “E. coli”,
“ESCCOL”, “esco”, and “Esche coli”) are all converted into the
correct, standardised form, “Escherichia coli”.
- **ab_name**: Similarly, this function standardises antimicrobial
names. The different representations of ciprofloxacin (e.g., “Cipro”,
“CIP”, “J01MA02”, and “Ciproxin”) are all converted to the standard
name, “Ciprofloxacin”.
### Calculating AMR
``` python
import AMR
import pandas as pd
df = AMR.example_isolates
result = AMR.resistance(df["AMX"])
print(result)
```
[0.59555556]
### Generating Antibiograms
One of the core functions of the `AMR` package is generating an
antibiogram, a table that summarises the antimicrobial susceptibility of
bacterial isolates. Here's how you can generate an antibiogram from
Python:
``` python
result2a = AMR.antibiogram(df[["mo", "AMX", "CIP", "TZP"]])
print(result2a)
```
| Pathogen | Amoxicillin | Ciprofloxacin | Piperacillin/tazobactam |
|----------------|----------------|---------------|-------------------------|
| CoNS | 7% (10/142) | 73% (183/252) | 30% (10/33) |
| E. coli | 50% (196/392) | 88% (399/456) | 94% (393/416) |
| K. pneumoniae | 0% (0/58) | 96% (53/55) | 89% (47/53) |
| P. aeruginosa | 0% (0/30) | 100% (30/30) | None |
| P. mirabilis | None | 94% (34/36) | None |
| S. aureus | 6% (8/131) | 90% (171/191) | None |
| S. epidermidis | 1% (1/91) | 64% (87/136) | None |
| S. hominis | None | 80% (56/70) | None |
| S. pneumoniae | 100% (112/112) | None | 100% (112/112) |
``` python
result2b = AMR.antibiogram(df[["mo", "AMX", "CIP", "TZP"]], mo_transform = "gramstain")
print(result2b)
```
| Pathogen | Amoxicillin | Ciprofloxacin | Piperacillin/tazobactam |
|---------------|---------------|---------------|-------------------------|
| Gram-negative | 36% (226/631) | 91% (621/684) | 88% (565/641) |
| Gram-positive | 43% (305/703) | 77% (560/724) | 86% (296/345) |
In this example, we generate an antibiogram by selecting various
antibiotics.
### Taxonomic Data Sets Now in Python!
As a Python user, you might like that the most important data sets of
the `AMR` R package, `microorganisms`, `antimicrobials`,
`clinical_breakpoints`, and `example_isolates`, are now available as
regular Python data frames:
``` python
AMR.microorganisms
```
| mo | fullname | status | kingdom | gbif | gbif_parent | gbif_renamed_to | prevalence |
|--------------|------------------------------------|----------|----------|----------|-------------|-----------------|------------|
| B_GRAMN | (unknown Gram-negatives) | unknown | Bacteria | None | None | None | 2.0 |
| B_GRAMP | (unknown Gram-positives) | unknown | Bacteria | None | None | None | 2.0 |
| B_ANAER-NEG | (unknown anaerobic Gram-negatives) | unknown | Bacteria | None | None | None | 2.0 |
| B_ANAER-POS | (unknown anaerobic Gram-positives) | unknown | Bacteria | None | None | None | 2.0 |
| B_ANAER | (unknown anaerobic bacteria) | unknown | Bacteria | None | None | None | 2.0 |
| … | … | … | … | … | … | … | … |
| B_ZYMMN_POMC | Zymomonas pomaceae | accepted | Bacteria | 10744418 | 3221412 | None | 2.0 |
| B_ZYMPH | Zymophilus | synonym | Bacteria | None | 9475166 | None | 2.0 |
| B_ZYMPH_PCVR | Zymophilus paucivorans | synonym | Bacteria | None | None | None | 2.0 |
| B_ZYMPH_RFFN | Zymophilus raffinosivorans | synonym | Bacteria | None | None | None | 2.0 |
| F_ZYZYG | Zyzygomyces | unknown | Fungi | None | 7581 | None | 2.0 |
``` python
AMR.antimicrobials
```
| ab | cid | name | group | oral_ddd | oral_units | iv_ddd | iv_units |
|-----|------------|-----------------------|--------------------------|----------|------------|--------|----------|
| AMA | 4649.0 | 4-aminosalicylic acid | Antimycobacterials | 12.00 | g | NaN | None |
| ACM | 6450012.0 | Acetylmidecamycin | Macrolides/lincosamides | NaN | None | NaN | None |
| ASP | 49787020.0 | Acetylspiramycin | Macrolides/lincosamides | NaN | None | NaN | None |
| ALS | 8954.0 | Aldesulfone sodium | Other antibacterials | 0.33 | g | NaN | None |
| AMK | 37768.0 | Amikacin | Aminoglycosides | NaN | None | 1.0 | g |
| … | … | … | … | … | … | … | … |
| VIR | 11979535.0 | Virginiamycine | Other antibacterials | NaN | None | NaN | None |
| VOR | 71616.0 | Voriconazole | Antifungals/antimycotics | 0.40 | g | 0.4 | g |
| XBR | 72144.0 | Xibornol | Other antibacterials | NaN | None | NaN | None |
| ZID | 77846445.0 | Zidebactam | Other antibacterials | NaN | None | NaN | None |
| ZFD | NaN | Zoliflodacin | None | NaN | None | NaN | None |
## Conclusion
With the `AMR` Python package, Python users can now effortlessly call R
functions from the `AMR` R package. This eliminates the need for complex
`rpy2` configurations and provides a clean, easy-to-use interface for
antimicrobial resistance analysis. The examples provided above
demonstrate how this can be applied to typical workflows, such as
standardising microorganism and antimicrobial names or calculating
resistance.
By just running `import AMR`, users can seamlessly integrate the robust
features of the R `AMR` package into Python workflows.
Whether you're cleaning data or analysing resistance patterns, the `AMR`
Python package makes it easy to work with AMR data in Python.


@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
@@ -413,7 +413,7 @@ ROC curve looks like this:</p>
<code class="sourceCode R"><span><span class="va">predictions</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%&gt;%</a></span></span>
<span> <span class="fu">roc_curve</span><span class="op">(</span><span class="va">mo</span>, <span class="va">`.pred_Gram-negative`</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%&gt;%</a></span></span>
<span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/autoplot.html" class="external-link">autoplot</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-8-1.png" width="720"></p>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-8-1.png" class="r-plt" width="720"></p>
</div>
<div class="section level3">
<h3 id="conclusion">
@@ -677,7 +677,7 @@ sets.</li>
<span> x <span class="op">=</span> <span class="st">"Year"</span>,</span>
<span> y <span class="op">=</span> <span class="st">"Resistance Proportion"</span><span class="op">)</span> <span class="op">+</span></span>
<span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html" class="external-link">theme_minimal</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png" width="720"></p>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png" class="r-plt" width="720"></p>
<p>Additionally, we can visualise resistance trends in
<code>ggplot2</code> and directly add linear models there:</p>
<div class="sourceCode" id="cb16"><pre class="downlit sourceCode r">
@@ -691,7 +691,7 @@ sets.</li>
<span> formula <span class="op">=</span> <span class="va">y</span> <span class="op">~</span> <span class="va">x</span>,</span>
<span> alpha <span class="op">=</span> <span class="fl">0.25</span><span class="op">)</span> <span class="op">+</span></span>
<span> <span class="fu"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html" class="external-link">theme_minimal</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png" width="720"></p>
<p><img src="AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png" class="r-plt" width="720"></p>
</div>
<div class="section level3">
<h3 id="conclusion-1">


@@ -0,0 +1,606 @@
# AMR with tidymodels
> This page was entirely written by our [AMR for R
> Assistant](https://chat.amr-for-r.org), a manually-trained ChatGPT
> model able to answer any question about the `AMR` package.
Antimicrobial resistance (AMR) is a global health crisis, and
understanding resistance patterns is crucial for managing effective
treatments. The `AMR` R package provides robust tools for analysing AMR
data, including convenient antimicrobial selector functions like
[`aminoglycosides()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
and
[`betalactams()`](https://amr-for-r.org/reference/antimicrobial_selectors.md).
In this post, we will explore how to use the `tidymodels` framework to
predict resistance patterns in the `example_isolates` dataset.
This post contains the following examples:
1. Using Antimicrobial Selectors
2. Predicting ESBL Presence Using Raw MICs
3. Predicting AMR Over Time
## Example 1: Using Antimicrobial Selectors
By leveraging the power of `tidymodels` and the `AMR` package, we'll
build a reproducible machine learning workflow to predict the Gram stain
of the microorganism from test results for two important antibiotic
classes: aminoglycosides and beta-lactams.
### **Objective**
Our goal is to build a predictive model using the `tidymodels` framework
to determine the Gram stain of the microorganism based on microbial data.
We will:
1. Preprocess data using the selector functions
[`aminoglycosides()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
and
[`betalactams()`](https://amr-for-r.org/reference/antimicrobial_selectors.md).
2. Define a logistic regression model for prediction.
3. Use a structured `tidymodels` workflow to preprocess, train, and
evaluate the model.
### **Data Preparation**
We begin by loading the required libraries and preparing the
`example_isolates` dataset from the `AMR` package.
``` r
# Load required libraries
library(AMR) # For AMR data analysis
library(tidymodels) # For machine learning workflows, and data manipulation (dplyr, tidyr, ...)
```
Prepare the data:
``` r
# Your data could look like this:
example_isolates
#> # A tibble: 2,000 × 46
#> date patient age gender ward mo PEN OXA FLC AMX
#> <date> <chr> <dbl> <chr> <chr> <mo> <sir> <sir> <sir> <sir>
#> 1 2002-01-02 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 2 2002-01-03 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 3 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 4 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 5 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 6 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 7 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 8 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 9 2002-01-16 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 10 2002-01-17 858515 79 F ICU B_STPHY_EPDR R NA S NA
#> # 1,990 more rows
#> # 36 more variables: AMC <sir>, AMP <sir>, TZP <sir>, CZO <sir>, FEP <sir>,
#> # CXM <sir>, FOX <sir>, CTX <sir>, CAZ <sir>, CRO <sir>, GEN <sir>,
#> # TOB <sir>, AMK <sir>, KAN <sir>, TMP <sir>, SXT <sir>, NIT <sir>,
#> # FOS <sir>, LNZ <sir>, CIP <sir>, MFX <sir>, VAN <sir>, TEC <sir>,
#> # TCY <sir>, TGC <sir>, DOX <sir>, ERY <sir>, CLI <sir>, AZM <sir>,
#> # IPM <sir>, MEM <sir>, MTR <sir>, CHL <sir>, COL <sir>, MUP <sir>, …
# Select relevant columns for prediction
data <- example_isolates %>%
# select AB results dynamically
select(mo, aminoglycosides(), betalactams()) %>%
# replace NAs with NI (not-interpretable)
mutate(across(where(is.sir),
~replace_na(.x, "NI")),
# convert SIR columns to integer codes
across(where(is.sir),
as.integer),
# get Gramstain of microorganisms
mo = as.factor(mo_gramstain(mo))) %>%
# drop NAs - the ones without a Gramstain (fungi, etc.)
drop_na()
#> For `aminoglycosides()` using columns 'GEN' (gentamicin), 'TOB'
#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)
#> For `betalactams()` using columns 'PEN' (benzylpenicillin), 'OXA'
#> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'
#> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'
#> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'
#> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),
#> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)
```
**Explanation:**
- [`aminoglycosides()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
and
[`betalactams()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
dynamically select columns for antimicrobials in these classes.
- `drop_na()` ensures the model receives complete cases for training.
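The encoding step can be illustrated in plain Python. This sketch assumes the category order S, SDD, I, R, NI, so that the integer codes run from 1 to 5; that order is an assumption made for illustration, and `encode_sir()` is a hypothetical helper rather than a package function:

``` python
SIR_LEVELS = ["S", "SDD", "I", "R", "NI"]  # assumed level order


def encode_sir(values):
    """Replace missing results by 'NI' and map each category to a 1-based integer."""
    codes = {level: i for i, level in enumerate(SIR_LEVELS, start=1)}
    return [codes["NI" if v is None else v] for v in values]


# None stands for a missing (NA) test result
encoded = encode_sir(["S", None, "R", "I"])
print(encoded)
```

Keeping missing results as their own code, instead of dropping those rows, lets the model use "not tested" as information in its own right.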
### **Defining the Workflow**
We now define the `tidymodels` workflow, which consists of three steps:
preprocessing, model specification, and fitting.
#### 1. Preprocessing with a Recipe
We create a recipe to preprocess the data for modelling.
``` r
# Define the recipe for data preprocessing
resistance_recipe <- recipe(mo ~ ., data = data) %>%
step_corr(c(aminoglycosides(), betalactams()), threshold = 0.9)
resistance_recipe
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 20
#>
#> ── Operations
#> • Correlation filter on: c(aminoglycosides(), betalactams())
```
For a recipe that includes at least one preprocessing operation, like we
have with `step_corr()`, the necessary parameters can be estimated from
a training set using `prep()`:
``` r
prep(resistance_recipe)
#> For `aminoglycosides()` using columns 'GEN' (gentamicin), 'TOB'
#> (tobramycin), 'AMK' (amikacin), and 'KAN' (kanamycin)
#> For `betalactams()` using columns 'PEN' (benzylpenicillin), 'OXA'
#> (oxacillin), 'FLC' (flucloxacillin), 'AMX' (amoxicillin), 'AMC'
#> (amoxicillin/clavulanic acid), 'AMP' (ampicillin), 'TZP'
#> (piperacillin/tazobactam), 'CZO' (cefazolin), 'FEP' (cefepime), 'CXM'
#> (cefuroxime), 'FOX' (cefoxitin), 'CTX' (cefotaxime), 'CAZ' (ceftazidime),
#> 'CRO' (ceftriaxone), 'IPM' (imipenem), and 'MEM' (meropenem)
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 20
#>
#> ── Training information
#> Training data contained 1968 data points and no incomplete rows.
#>
#> ── Operations
#> • Correlation filter on: AMX CTX | Trained
```
**Explanation:**
- `recipe(mo ~ ., data = data)` will take the `mo` column as outcome and
all other columns as predictors.
- `step_corr()` removes predictors (i.e., antibiotic columns) that have
a higher correlation than 90%.
Notice how the recipe contains just the antimicrobial selector
functions - no need to define the columns specifically. In the
preparation (retrieved with `prep()`) we can see that the columns or
variables AMX and CTX were removed as they correlate too much with
existing, other variables.
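The idea of a correlation filter can be sketched in plain Python. This is a simplified greedy variant for illustration: it keeps a column only if its absolute Pearson correlation with every column kept so far does not exceed the threshold. `step_corr()` may choose which columns to drop differently, and the helper names and data below are hypothetical:

``` python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / sqrt(var_x * var_y)


def correlation_filter(columns, threshold=0.9):
    """Greedily keep columns whose |r| with every kept column is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical integer-coded results for three antibiotics
columns = {
    "GEN": [1, 1, 4, 5, 1, 4],
    "TOB": [1, 1, 4, 5, 1, 4],  # identical to GEN, so perfectly correlated
    "AMK": [5, 1, 5, 1, 4, 1],
}
kept = correlation_filter(columns)
print(kept)
```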
#### 2. Specifying the Model
We define a logistic regression model since predicting the Gram stain
(Gram-negative or Gram-positive) is a binary classification task.
``` r
# Specify a logistic regression model
logistic_model <- logistic_reg() %>%
set_engine("glm") # Use the Generalised Linear Model engine
logistic_model
#> Logistic Regression Model Specification (classification)
#>
#> Computational engine: glm
```
**Explanation:**
- `logistic_reg()` sets up a logistic regression model.
- `set_engine("glm")` specifies the use of R's built-in GLM engine.
#### 3. Building the Workflow
We bundle the recipe and model together into a `workflow`, which
organises the entire modelling process.
``` r
# Combine the recipe and model into a workflow
resistance_workflow <- workflow() %>%
add_recipe(resistance_recipe) %>% # Add the preprocessing recipe
add_model(logistic_model) # Add the logistic regression model
resistance_workflow
#> ══ Workflow ════════════════════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: logistic_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> 1 Recipe Step
#>
#> • step_corr()
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Logistic Regression Model Specification (classification)
#>
#> Computational engine: glm
```
### **Training and Evaluating the Model**
To train the model, we split the data into training and testing sets.
Then, we fit the workflow on the training set and evaluate its
performance.
``` r
# Split data into training and testing sets
set.seed(123) # For reproducibility
data_split <- initial_split(data, prop = 0.8) # 80% training, 20% testing
training_data <- training(data_split) # Training set
testing_data <- testing(data_split) # Testing set
# Fit the workflow to the training data
fitted_workflow <- resistance_workflow %>%
fit(training_data) # Train the model
```
**Explanation:**
- `initial_split()` splits the data into training and testing sets.
- `fit()` trains the workflow on the training set.
Notice how in `fit()`, the antimicrobial selector functions are
internally called again: they are stored in the recipe, so they are
evaluated on the training data when the workflow is fitted.
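A random 80/20 split can be sketched in plain Python. The function below borrows the name `initial_split()` only for readability; it is a hypothetical stand-in and does not reproduce the sampling of the R function, so the selected rows will differ even with the same seed:

``` python
import random


def initial_split(rows, prop=0.8, seed=123):
    """Shuffle row indices and split the rows into training and testing parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(len(rows) * prop)
    train = [rows[i] for i in indices[:n_train]]
    test = [rows[i] for i in indices[n_train:]]
    return train, test


rows = list(range(1968))  # stand-in for the 1968 complete cases
training_data, testing_data = initial_split(rows)
print(len(training_data), len(testing_data))
```

Every row ends up in exactly one of the two parts, so the testing set stays unseen during training.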
Next, we evaluate the model on the testing data.
``` r
# Make predictions on the testing set
predictions <- fitted_workflow %>%
predict(testing_data) # Generate predictions
probabilities <- fitted_workflow %>%
predict(testing_data, type = "prob") # Generate probabilities
predictions <- predictions %>%
bind_cols(probabilities) %>%
bind_cols(testing_data) # Combine with true labels
predictions
#> # A tibble: 394 × 24
#> .pred_class `.pred_Gram-negative` `.pred_Gram-positive` mo GEN TOB
#> <fct> <dbl> <dbl> <fct> <int> <int>
#> 1 Gram-positive 1.07e- 1 8.93 e- 1 Gram-p… 5 5
#> 2 Gram-positive 3.17e- 8 1.000e+ 0 Gram-p… 5 1
#> 3 Gram-negative 9.99e- 1 1.42 e- 3 Gram-n… 5 5
#> 4 Gram-positive 2.22e-16 1 e+ 0 Gram-p… 5 5
#> 5 Gram-negative 9.46e- 1 5.42 e- 2 Gram-n… 5 5
#> 6 Gram-positive 1.07e- 1 8.93 e- 1 Gram-p… 5 5
#> 7 Gram-positive 2.22e-16 1 e+ 0 Gram-p… 1 5
#> 8 Gram-positive 2.22e-16 1 e+ 0 Gram-p… 4 4
#> 9 Gram-negative 1 e+ 0 2.22 e-16 Gram-n… 1 1
#> 10 Gram-positive 6.05e-11 1.000e+ 0 Gram-p… 4 4
#> # 384 more rows
#> # 18 more variables: AMK <int>, KAN <int>, PEN <int>, OXA <int>, FLC <int>,
#> # AMX <int>, AMC <int>, AMP <int>, TZP <int>, CZO <int>, FEP <int>,
#> # CXM <int>, FOX <int>, CTX <int>, CAZ <int>, CRO <int>, IPM <int>, MEM <int>
# Evaluate model performance
metrics <- predictions %>%
metrics(truth = mo, estimate = .pred_class) # Calculate performance metrics
metrics
#> # A tibble: 2 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 accuracy binary 0.995
#> 2 kap binary 0.989
# To assess some other model properties, you can make your own `metrics()` function
our_metrics <- metric_set(accuracy, kap, ppv, npv) # add Positive Predictive Value and Negative Predictive Value
metrics2 <- predictions %>%
our_metrics(truth = mo, estimate = .pred_class) # run again on our `our_metrics()` function
metrics2
#> # A tibble: 4 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 accuracy binary 0.995
#> 2 kap binary 0.989
#> 3 ppv binary 0.987
#> 4 npv binary 1
```
**Explanation:**
- [`predict()`](https://rdrr.io/r/stats/predict.html) generates
predictions on the testing set.
- `metrics()` computes evaluation metrics like accuracy and kappa.
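To make these metrics less of a black box, here is a minimal base R
sketch of how accuracy, kappa, PPV and NPV follow from a 2 x 2 confusion
table. The ten labels below are invented for illustration and are not
taken from `predictions`; the sketch assumes, as `yardstick` does by
default, that the first factor level (`Gram-negative`) is the positive
class.

``` r
# Hypothetical truth and predicted labels (NOT from the real model)
lvls <- c("Gram-negative", "Gram-positive")
truth <- factor(c(rep("Gram-negative", 6), rep("Gram-positive", 4)), levels = lvls)
estimate <- factor(
  c(rep("Gram-negative", 5), "Gram-positive", # 5 correct, 1 missed
    rep("Gram-positive", 3), "Gram-negative"), # 3 correct, 1 false alarm
  levels = lvls
)

# Confusion table: rows are predictions, columns are the truth
tab <- table(estimate, truth)
tp <- tab["Gram-negative", "Gram-negative"]
fp <- tab["Gram-negative", "Gram-positive"]
fn <- tab["Gram-positive", "Gram-negative"]
tn <- tab["Gram-positive", "Gram-positive"]
n <- sum(tab)

accuracy <- (tp + tn) / n
ppv <- tp / (tp + fp) # how often a "Gram-negative" prediction is right
npv <- tn / (tn + fn) # how often a "Gram-positive" prediction is right

# Cohen's kappa: observed agreement corrected for chance agreement
p_expected <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2
kap <- (accuracy - p_expected) / (1 - p_expected)

c(accuracy = accuracy, kap = kap, ppv = ppv, npv = npv)
```

On the real predictions, `conf_mat(predictions, truth = mo, estimate = .pred_class)`
from `yardstick` should return the corresponding confusion table.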
It appears we can predict the Gram stain with 99.5% accuracy based only
on AMR results for aminoglycosides and beta-lactam antibiotics. The ROC
curve looks like this:
``` r
predictions %>%
roc_curve(mo, `.pred_Gram-negative`) %>%
autoplot()
```
![](AMR_with_tidymodels_files/figure-html/unnamed-chunk-8-1.png)
### **Conclusion**
In this post, we demonstrated how to build a machine learning pipeline
with the `tidymodels` framework and the `AMR` package. By combining
selector functions like
[`aminoglycosides()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
and
[`betalactams()`](https://amr-for-r.org/reference/antimicrobial_selectors.md)
with `tidymodels`, we efficiently prepared data, trained a model, and
evaluated its performance.
This workflow is extensible to other antimicrobial classes and
resistance patterns, empowering users to analyse AMR data systematically
and reproducibly.
------------------------------------------------------------------------
## Example 2: Predicting ESBL Presence Using Raw MICs
In this second example, we demonstrate how to use `<mic>` columns
directly in `tidymodels` workflows using AMR-specific recipe steps. This
includes a transformation to `log2` scale using `step_mic_log2()`, which
prepares MIC values for use in classification models.
This approach and idea formed the basis for the publication [DOI:
10.3389/fmicb.2025.1582703](https://doi.org/10.3389/fmicb.2025.1582703)
to model the presence of extended-spectrum beta-lactamases (ESBL).
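As a rough illustration of the arithmetic behind such a transformation
(not of `step_mic_log2()` itself, which is not shown here), the base R
sketch below converts a few hypothetical MIC strings to the `log2`
scale. Stripping the `<=` and `>=` operators is a simplifying assumption
that ignores censoring.

``` r
# Hypothetical raw MIC values in mg/L, as they might appear in lab exports
mic_raw <- c("<=0.25", "0.5", "1", "2", "4", ">=8")

# Drop a leading <, >, <= or >= (simplification: censoring is ignored)
mic_num <- as.numeric(gsub("^[<>]=?", "", mic_raw))

# Two-fold dilution steps become evenly spaced on the log2 scale
mic_log2 <- log2(mic_num)
mic_log2
```

On this scale every doubling of the MIC is one unit, which is usually
easier for a classification model to work with than the raw
concentrations.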
> NOTE: THIS EXAMPLE WILL BE AVAILABLE IN A NEXT VERSION (#TODO)
>
> The new AMR package version will contain new tidymodels selectors such
> as `step_mic_log2()`.
------------------------------------------------------------------------
## Example 3: Predicting AMR Over Time
In this third example, we aim to predict antimicrobial resistance (AMR)
trends over time using `tidymodels`. We will model resistance to three
antibiotics (amoxicillin `AMX`, amoxicillin-clavulanic acid `AMC`, and
ciprofloxacin `CIP`), based on historical data grouped by year and
Gram stain.
### **Objective**
Our goal is to:
1. Prepare the dataset by aggregating resistance data over time.
2. Define a regression model to predict AMR trends.
3. Use `tidymodels` to preprocess, train, and evaluate the model.
### **Data Preparation**
We start by transforming the `example_isolates` dataset into a
structured time-series format.
``` r
# Load required libraries
library(AMR)
library(tidymodels)
# Transform dataset
data_time <- example_isolates %>%
top_n_microorganisms(n = 10) %>% # Filter on the top #10 species
mutate(year = as.integer(format(date, "%Y")), # Extract year from date
gramstain = mo_gramstain(mo)) %>% # Get the Gram stain
group_by(year, gramstain) %>%
summarise(across(c(AMX, AMC, CIP),
function(x) resistance(x, minimum = 0),
.names = "res_{.col}"),
.groups = "drop") %>%
filter(!is.na(res_AMX) & !is.na(res_AMC) & !is.na(res_CIP)) # Drop missing values
#> Using column 'mo' as input for `col_mo`.
data_time
#> # A tibble: 32 × 5
#> year gramstain res_AMX res_AMC res_CIP
#> <int> <chr> <dbl> <dbl> <dbl>
#> 1 2002 Gram-negative 1 0.105 0.0606
#> 2 2002 Gram-positive 0.838 0.182 0.162
#> 3 2003 Gram-negative 1 0.0714 0
#> 4 2003 Gram-positive 0.714 0.244 0.154
#> 5 2004 Gram-negative 0.464 0.0938 0
#> 6 2004 Gram-positive 0.849 0.299 0.244
#> 7 2005 Gram-negative 0.412 0.132 0.0588
#> 8 2005 Gram-positive 0.882 0.382 0.154
#> 9 2006 Gram-negative 0.379 0 0.1
#> 10 2006 Gram-positive 0.778 0.333 0.353
#> # 22 more rows
```
**Explanation:**
- `mo_gramstain(mo)`: Converts microbial codes into their Gram stain
(Gram-negative or Gram-positive).
- [`resistance()`](https://amr-for-r.org/reference/proportion.md):
Converts AMR results into numeric values (proportion of resistant
isolates).
- `group_by(year, gramstain)`: Aggregates resistance rates by year
and Gram stain.
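The aggregation step can be sketched in base R as follows. This is a
simplified stand-in for `resistance()`, not its implementation: the data
frame `toy` and the helper `prop_resistant()` are hypothetical, and the
real function additionally enforces a `minimum` number of isolates (set
to 0 in the code above).

``` r
# Hypothetical SIR results for one antibiotic, with the year of each isolate
toy <- data.frame(
  year = c(2002, 2002, 2002, 2003, 2003, 2003, 2003),
  AMX  = c("R", "S", NA, "R", "R", "I", "R")
)

# Proportion resistant = number of "R" / number of tested (non-missing) results
prop_resistant <- function(x) {
  tested <- x[!is.na(x)]
  sum(tested == "R") / length(tested)
}

# One value per year, comparable to group_by(year) followed by summarise()
res_by_year <- tapply(toy$AMX, toy$year, prop_resistant)
res_by_year
```

Missing results are excluded from the denominator, so a year with few
tested isolates can yield extreme proportions such as 0 or 1.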
### **Defining the Workflow**
We now define the modelling workflow, which consists of a preprocessing
step, a model specification, and the fitting process.
#### 1. Preprocessing with a Recipe
``` r
# Define the recipe
resistance_recipe_time <- recipe(res_AMX ~ year + gramstain, data = data_time) %>%
step_dummy(gramstain, one_hot = TRUE) %>% # Convert categorical to numerical
step_normalize(year) %>% # Normalise year for better model performance
step_nzv(all_predictors()) # Remove near-zero variance predictors
resistance_recipe_time
#>
#> ── Recipe ──────────────────────────────────────────────────────────────────────
#>
#> ── Inputs
#> Number of variables by role
#> outcome: 1
#> predictor: 2
#>
#> ── Operations
#> • Dummy variables from: gramstain
#> • Centering and scaling for: year
#> • Sparse, unbalanced variable filter on: all_predictors()
```
**Explanation:**
- `step_dummy()`: Encodes categorical variables (here `gramstain`) as
numerical indicators.
- `step_normalize()`: Normalises the `year` variable.
- `step_nzv()`: Removes near-zero variance predictors.
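What the first two steps do to the data can be approximated in base R.
The sketch below uses a small hypothetical data frame `toy` and is only
meant to show the idea of one-hot encoding and centring/scaling; it does
not reproduce the exact column names or internals of `recipes`.

``` r
# Hypothetical input with the same two predictors as the recipe
toy <- data.frame(
  year = c(2002, 2002, 2003, 2003, 2004, 2004),
  gramstain = rep(c("Gram-negative", "Gram-positive"), times = 3)
)

# One-hot encoding: one 0/1 column per category, no reference level dropped
one_hot <- model.matrix(~ gramstain - 1, data = toy)

# Normalising: subtract the mean and divide by the standard deviation
year_norm <- (toy$year - mean(toy$year)) / sd(toy$year)

cbind(year_norm, one_hot)
```

Note that with `one_hot = TRUE` the two indicator columns always sum to
1, so they are perfectly collinear with the intercept of a linear model.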
#### 2. Specifying the Model
We use a linear regression model to predict resistance trends.
``` r
# Define the linear regression model
lm_model <- linear_reg() %>%
set_engine("lm") # Use linear regression
lm_model
#> Linear Regression Model Specification (regression)
#>
#> Computational engine: lm
```
**Explanation:**
- `linear_reg()`: Defines a linear regression model.
- `set_engine("lm")`: Uses R's built-in linear regression engine.
#### 3. Building the Workflow
We combine the preprocessing recipe and model into a workflow.
``` r
# Create workflow
resistance_workflow_time <- workflow() %>%
add_recipe(resistance_recipe_time) %>%
add_model(lm_model)
resistance_workflow_time
#> ══ Workflow ════════════════════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> 3 Recipe Steps
#>
#> • step_dummy()
#> • step_normalize()
#> • step_nzv()
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear Regression Model Specification (regression)
#>
#> Computational engine: lm
```
### **Training and Evaluating the Model**
We split the data into training and testing sets, fit the model, and
evaluate performance.
``` r
# Split the data
set.seed(123)
data_split_time <- initial_split(data_time, prop = 0.8)
train_time <- training(data_split_time)
test_time <- testing(data_split_time)
# Train the model
fitted_workflow_time <- resistance_workflow_time %>%
fit(train_time)
# Make predictions
predictions_time <- fitted_workflow_time %>%
predict(test_time) %>%
bind_cols(test_time)
# Evaluate model
metrics_time <- predictions_time %>%
metrics(truth = res_AMX, estimate = .pred)
metrics_time
#> # A tibble: 3 × 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 rmse standard 0.0774
#> 2 rsq standard 0.711
#> 3 mae standard 0.0704
```
**Explanation:**
- `initial_split()`: Splits data into training and testing sets.
- `fit()`: Trains the workflow.
- [`predict()`](https://rdrr.io/r/stats/predict.html): Generates
resistance predictions.
- `metrics()`: Evaluates model performance.
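For a regression model, `metrics()` reports RMSE, R-squared and MAE. A
minimal base R sketch of these three quantities is given below; the
`truth` and `pred` vectors are invented for illustration, and the
R-squared is computed as the squared correlation, which is how
`yardstick::rsq()` is understood to define it.

``` r
# Hypothetical observed and predicted resistance proportions
truth <- c(0.40, 0.85, 0.45, 0.80)
pred  <- c(0.50, 0.80, 0.40, 0.90)

err <- truth - pred

rmse <- sqrt(mean(err^2)) # root mean squared error, penalises large errors
mae  <- mean(abs(err))    # mean absolute error, same unit as the outcome
rsq  <- cor(truth, pred)^2 # squared correlation between truth and prediction

c(rmse = rmse, rsq = rsq, mae = mae)
```

RMSE and MAE are expressed on the scale of the outcome, so a value of
0.08 corresponds to roughly 8 percentage points of resistance.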
### **Visualising Predictions**
We plot resistance trends over time for amoxicillin.
``` r
library(ggplot2)
# Plot actual vs predicted resistance over time
ggplot(predictions_time, aes(x = year)) +
geom_point(aes(y = res_AMX, color = "Actual")) +
geom_line(aes(y = .pred, color = "Predicted")) +
labs(title = "Predicted vs Actual AMX Resistance Over Time",
x = "Year",
y = "Resistance Proportion") +
theme_minimal()
```
![](AMR_with_tidymodels_files/figure-html/unnamed-chunk-14-1.png)
Additionally, we can visualise resistance trends in `ggplot2` and
directly add linear models there:
``` r
ggplot(data_time, aes(x = year, y = res_AMX, color = gramstain)) +
geom_line() +
labs(title = "AMX Resistance Trends",
x = "Year",
y = "Resistance Proportion") +
# add a linear model directly in ggplot2:
geom_smooth(method = "lm",
formula = y ~ x,
alpha = 0.25) +
theme_minimal()
```
![](AMR_with_tidymodels_files/figure-html/unnamed-chunk-15-1.png)
### **Conclusion**
In this example, we demonstrated how to analyse AMR trends over time
using `tidymodels`. By aggregating resistance rates by year and Gram
stain, we built a predictive model to track changes in resistance to
amoxicillin (`AMX`); the aggregated data also contain
amoxicillin-clavulanic acid (`AMC`) and ciprofloxacin (`CIP`), which can
be modelled in the same way.
This method can be extended to other antibiotics and resistance
patterns, providing valuable insights into AMR dynamics in healthcare
settings.

View File

@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">

126
articles/EUCAST.md Normal file
View File

@@ -0,0 +1,126 @@
# Apply EUCAST rules
## Introduction
What are EUCAST rules? The European Committee on Antimicrobial
Susceptibility Testing (EUCAST) states [on their
website](https://www.eucast.org/expert_rules_and_expected_phenotypes):
> *EUCAST expert rules (see below) are a tabulated collection of expert
> knowledge on interpretive rules, expected resistant phenotypes and
> expected susceptible phenotypes which should be applied to
> antimicrobial susceptibility testing in order to reduce testing,
> reduce errors and make appropriate recommendations for reporting
> particular resistances.*
In Europe, a lot of medical microbiological laboratories already apply
these rules ([Brown *et al.*,
2015](https://www.eurosurveillance.org/content/10.2807/1560-7917.ES2015.20.2.21008)).
Our package features their latest insights on expected resistant
phenotypes (v1.2, 2023).
## Examples
These rules can be used to discard improbable bug-drug combinations in
your data. For example, *Klebsiella* produces beta-lactamase that
prevents ampicillin (or amoxicillin) from working against it. In other
words, practically every strain of *Klebsiella* is resistant to
ampicillin.
Sometimes, laboratory data can still contain such strains with
*Klebsiella* being susceptible to ampicillin. This could be because an
antibiogram is available before an identification is available, and the
antibiogram is then not re-interpreted based on the identification. The
[`eucast_rules()`](https://amr-for-r.org/reference/eucast_rules.md)
function resolves this, by applying the latest EUCAST Expected
Resistant Phenotypes guideline:
``` r
oops <- tibble::tibble(
mo = c(
"Klebsiella pneumoniae",
"Escherichia coli"
),
ampicillin = as.sir("S")
)
oops
#> # A tibble: 2 × 2
#> mo ampicillin
#> <chr> <sir>
#> 1 Klebsiella pneumoniae S
#> 2 Escherichia coli S
eucast_rules(oops, info = FALSE, overwrite = TRUE)
#> # A tibble: 2 × 2
#> mo ampicillin
#> <chr> <sir>
#> 1 Klebsiella pneumoniae R
#> 2 Escherichia coli S
```
A more convenient function is
[`mo_is_intrinsic_resistant()`](https://amr-for-r.org/reference/mo_property.md)
that uses the same guideline, but lets you check for one or more
specific microorganisms or antimicrobials:
``` r
mo_is_intrinsic_resistant(
c("Klebsiella pneumoniae", "Escherichia coli"),
"ampicillin"
)
#> [1] TRUE FALSE
mo_is_intrinsic_resistant(
"Klebsiella pneumoniae",
c("ampicillin", "kanamycin")
)
#> [1] TRUE FALSE
```
EUCAST rules can be used not only for correction, but also for filling
in known resistance and susceptibility based on results of other
antimicrobial drugs. This process is called *interpretive reading*, and
is basically a form of imputation:
``` r
data <- tibble::tibble(
mo = c(
"Staphylococcus aureus",
"Enterococcus faecalis",
"Escherichia coli",
"Klebsiella pneumoniae",
"Pseudomonas aeruginosa"
),
VAN = "-", # Vancomycin
AMX = "-", # Amoxicillin
COL = "-", # Colistin
CAZ = "-", # Ceftazidime
CXM = "-", # Cefuroxime
PEN = "S", # Benzylpenicillin
FOX = "S" # Cefoxitin
)
```
``` r
data
```
| mo | VAN | AMX | COL | CAZ | CXM | PEN | FOX |
|:-----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Staphylococcus aureus | \- | \- | \- | \- | \- | S | S |
| Enterococcus faecalis | \- | \- | \- | \- | \- | S | S |
| Escherichia coli | \- | \- | \- | \- | \- | S | S |
| Klebsiella pneumoniae | \- | \- | \- | \- | \- | S | S |
| Pseudomonas aeruginosa | \- | \- | \- | \- | \- | S | S |
``` r
eucast_rules(data, overwrite = TRUE)
```
| mo | VAN | AMX | COL | CAZ | CXM | PEN | FOX |
|:-----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| Staphylococcus aureus | \- | S | R | R | S | S | S |
| Enterococcus faecalis | \- | \- | R | R | R | S | R |
| Escherichia coli | R | \- | \- | \- | \- | R | S |
| Klebsiella pneumoniae | R | R | \- | \- | \- | R | S |
| Pseudomonas aeruginosa | R | R | \- | \- | R | R | R |

View File

@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
@@ -210,18 +210,18 @@ per drug explain the difference per microorganism.</p>
</h2>
<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/stats/biplot.html" class="external-link">biplot</a></span><span class="op">(</span><span class="va">pca_result</span><span class="op">)</span></span></code></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-5-1.png" width="750"></p>
<p><img src="PCA_files/figure-html/unnamed-chunk-5-1.png" class="r-plt" width="750"></p>
<p>But we can't see the explanation of the points. Perhaps this works
better with our new <code><a href="../reference/ggplot_pca.html">ggplot_pca()</a></code> function, that
automatically adds the right labels and even groups:</p>
<div class="sourceCode" id="cb7"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span><span class="op">(</span><span class="va">pca_result</span><span class="op">)</span></span></code></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-6-1.png" width="750"></p>
<p><img src="PCA_files/figure-html/unnamed-chunk-6-1.png" class="r-plt" width="750"></p>
<p>You can also print an ellipse per group, and edit the appearance:</p>
<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
<code class="sourceCode R"><span><span class="fu"><a href="../reference/ggplot_pca.html">ggplot_pca</a></span><span class="op">(</span><span class="va">pca_result</span>, ellipse <span class="op">=</span> <span class="cn">TRUE</span><span class="op">)</span> <span class="op">+</span></span>
<span> <span class="fu">ggplot2</span><span class="fu">::</span><span class="fu"><a href="https://ggplot2.tidyverse.org/reference/labs.html" class="external-link">labs</a></span><span class="op">(</span>title <span class="op">=</span> <span class="st">"An AMR/PCA biplot!"</span><span class="op">)</span></span></code></pre></div>
<p><img src="PCA_files/figure-html/unnamed-chunk-7-1.png" width="750"></p>
<p><img src="PCA_files/figure-html/unnamed-chunk-7-1.png" class="r-plt" width="750"></p>
</div>
</main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
</nav></aside>

157
articles/PCA.md Normal file
View File

@@ -0,0 +1,157 @@
# Conduct principal component analysis (PCA) for AMR
**NOTE: This page will be updated soon, as the pca() function is
currently being developed.**
## Introduction
## Transforming
For PCA, we need to transform our AMR data first. This is what the
`example_isolates` data set in this package looks like:
``` r
library(AMR)
library(dplyr)
glimpse(example_isolates)
#> Rows: 2,000
#> Columns: 46
#> $ date <date> 2002-01-02, 2002-01-03, 2002-01-07, 2002-01-07, 2002-01-13, 2…
#> $ patient <chr> "A77334", "A77334", "067927", "067927", "067927", "067927", "4…
#> $ age <dbl> 65, 65, 45, 45, 45, 45, 78, 78, 45, 79, 67, 67, 71, 71, 75, 50…
#> $ gender <chr> "F", "F", "F", "F", "F", "F", "M", "M", "F", "F", "M", "M", "M…
#> $ ward <chr> "Clinical", "Clinical", "ICU", "ICU", "ICU", "ICU", "Clinical"…
#> $ mo <mo> "B_ESCHR_COLI", "B_ESCHR_COLI", "B_STPHY_EPDR", "B_STPHY_EPDR",…
#> $ PEN <sir> R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, R, S,…
#> $ OXA <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ FLC <sir> NA, NA, R, R, R, R, S, S, R, S, S, S, NA, NA, NA, NA, NA, R, R…
#> $ AMX <sir> NA, NA, NA, NA, NA, NA, R, R, NA, NA, NA, NA, NA, NA, R, NA, N…
#> $ AMC <sir> I, I, NA, NA, NA, NA, S, S, NA, NA, S, S, I, I, R, I, I, NA, N…
#> $ AMP <sir> NA, NA, NA, NA, NA, NA, R, R, NA, NA, NA, NA, NA, NA, R, NA, N…
#> $ TZP <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ CZO <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, NA,…
#> $ FEP <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ CXM <sir> I, I, R, R, R, R, S, S, R, S, S, S, S, S, NA, S, S, R, R, S, S…
#> $ FOX <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, NA,…
#> $ CTX <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, S, NA, S, S…
#> $ CAZ <sir> NA, NA, R, R, R, R, R, R, R, R, R, R, NA, NA, NA, S, S, R, R, …
#> $ CRO <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, S, NA, S, S…
#> $ GEN <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ TOB <sir> NA, NA, NA, NA, NA, NA, S, S, NA, NA, NA, NA, S, S, NA, NA, NA…
#> $ AMK <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ KAN <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ TMP <sir> R, R, S, S, R, R, R, R, S, S, NA, NA, S, S, S, S, S, R, R, R, …
#> $ SXT <sir> R, R, S, S, NA, NA, NA, NA, S, S, NA, NA, S, S, S, S, S, NA, N…
#> $ NIT <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R,…
#> $ FOS <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ LNZ <sir> R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R, R, R, N…
#> $ CIP <sir> NA, NA, NA, NA, NA, NA, NA, NA, S, S, NA, NA, NA, NA, NA, S, S…
#> $ MFX <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ VAN <sir> R, R, S, S, S, S, S, S, S, S, NA, NA, R, R, R, R, R, S, S, S, …
#> $ TEC <sir> R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R, R, R, N…
#> $ TCY <sir> R, R, S, S, S, S, S, S, S, I, S, S, NA, NA, I, R, R, S, I, R, …
#> $ TGC <sir> NA, NA, S, S, S, S, S, S, S, NA, S, S, NA, NA, NA, R, R, S, NA…
#> $ DOX <sir> NA, NA, S, S, S, S, S, S, S, NA, S, S, NA, NA, NA, R, R, S, NA…
#> $ ERY <sir> R, R, R, R, R, R, S, S, R, S, S, S, R, R, R, R, R, R, R, R, S,…
#> $ CLI <sir> R, R, NA, NA, NA, R, NA, NA, NA, NA, NA, NA, R, R, R, R, R, NA…
#> $ AZM <sir> R, R, R, R, R, R, S, S, R, S, S, S, R, R, R, R, R, R, R, R, S,…
#> $ IPM <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, S, NA, S, S…
#> $ MEM <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ MTR <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ CHL <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ COL <sir> NA, NA, R, R, R, R, R, R, R, R, R, R, NA, NA, NA, R, R, R, R, …
#> $ MUP <sir> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ RIF <sir> R, R, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, R, R, R, R, R, N…
```
Now to transform this to a data set with only resistance percentages per
taxonomic order and genus:
``` r
resistance_data <- example_isolates %>%
group_by(
order = mo_order(mo), # group on anything, like order
genus = mo_genus(mo)
) %>% # and genus as we do here
summarise_if(is.sir, resistance) %>% # then get resistance of all drugs
select(
order, genus, AMC, CXM, CTX,
CAZ, GEN, TOB, TMP, SXT
) # and select only relevant columns
head(resistance_data)
#> # A tibble: 6 × 10
#> # Groups: order [5]
#> order genus AMC CXM CTX CAZ GEN TOB TMP SXT
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 (unknown order) (unknown ge… NA NA NA NA NA NA NA NA
#> 2 Actinomycetales Schaalia NA NA NA NA NA NA NA NA
#> 3 Bacteroidales Bacteroides NA NA NA NA NA NA NA NA
#> 4 Campylobacterales Campylobact… NA NA NA NA NA NA NA NA
#> 5 Caryophanales Gemella NA NA NA NA NA NA NA NA
#> 6 Caryophanales Listeria NA NA NA NA NA NA NA NA
```
## Perform principal component analysis
The new [`pca()`](https://amr-for-r.org/reference/pca.md) function will
automatically filter on rows that contain numeric values in all selected
variables, so we now only need to do:
``` r
pca_result <- pca(resistance_data)
#> Columns selected for PCA: "AMC", "CAZ", "CTX", "CXM", "GEN", "SXT",
#> "TMP", and "TOB". Total observations available: 7.
```
The result can be reviewed with the good old
[`summary()`](https://rdrr.io/r/base/summary.html) function:
``` r
summary(pca_result)
#> Groups (n=4, named as 'order'):
#> [1] "Caryophanales" "Enterobacterales" "Lactobacillales" "Pseudomonadales"
#> Importance of components:
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7
#> Standard deviation 2.1539 1.6807 0.6138 0.33879 0.20808 0.03140 1.232e-16
#> Proportion of Variance 0.5799 0.3531 0.0471 0.01435 0.00541 0.00012 0.000e+00
#> Cumulative Proportion 0.5799 0.9330 0.9801 0.99446 0.99988 1.00000 1.000e+00
```
Good news. The first two components explain a total of 93.3% of the
variance (see the PC1 and PC2 values of the *Proportion of Variance*). We
can create a so-called biplot with the base R
[`biplot()`](https://rdrr.io/r/stats/biplot.html) function, to see which
antimicrobial resistances explain the differences between
microorganisms.
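The *Proportion of Variance* row can be reproduced from the *Standard
deviation* row: each component's variance is its squared standard
deviation, divided by the total. The short sketch below does this with
the values printed in the summary above (so results match only up to
rounding).

``` r
# Standard deviations of PC1..PC7, copied from the summary() output
sdev <- c(2.1539, 1.6807, 0.6138, 0.33879, 0.20808, 0.03140, 1.232e-16)

# Variance per component, as a share of the total variance
prop_var <- sdev^2 / sum(sdev^2)

# Running total: how much the first k components explain together
cum_prop <- cumsum(prop_var)

round(rbind(prop_var, cum_prop), 4)
```

The second entry of `cum_prop` is the 93.3% mentioned above.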
## Plotting the results
``` r
biplot(pca_result)
```
![](PCA_files/figure-html/unnamed-chunk-5-1.png)
But we can't see the explanation of the points. Perhaps this works
better with our new
[`ggplot_pca()`](https://amr-for-r.org/reference/ggplot_pca.md)
function, that automatically adds the right labels and even groups:
``` r
ggplot_pca(pca_result)
```
![](PCA_files/figure-html/unnamed-chunk-6-1.png)
You can also print an ellipse per group, and edit the appearance:
``` r
ggplot_pca(pca_result, ellipse = TRUE) +
ggplot2::labs(title = "An AMR/PCA biplot!")
```
![](PCA_files/figure-html/unnamed-chunk-7-1.png)

View File

@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
@@ -311,7 +311,7 @@ using the included <code><a href="../reference/ggplot_sir.html">ggplot_sir()</a>
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/group_by.html" class="external-link">group_by</a></span><span class="op">(</span><span class="va">Country</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%&gt;%</a></span></span>
<span> <span class="fu"><a href="https://dplyr.tidyverse.org/reference/select.html" class="external-link">select</a></span><span class="op">(</span><span class="va">Country</span>, <span class="va">AMP_ND2</span>, <span class="va">AMC_ED20</span>, <span class="va">CAZ_ED10</span>, <span class="va">CIP_ED5</span><span class="op">)</span> <span class="op"><a href="https://magrittr.tidyverse.org/reference/pipe.html" class="external-link">%&gt;%</a></span></span>
<span> <span class="fu"><a href="../reference/ggplot_sir.html">ggplot_sir</a></span><span class="op">(</span>translate_ab <span class="op">=</span> <span class="st">"ab"</span>, facet <span class="op">=</span> <span class="st">"Country"</span>, datalabels <span class="op">=</span> <span class="cn">FALSE</span><span class="op">)</span></span></code></pre></div>
<p><img src="WHONET_files/figure-html/unnamed-chunk-7-1.png" width="720"></p>
<p><img src="WHONET_files/figure-html/unnamed-chunk-7-1.png" class="r-plt" width="720"></p>
</div>
</main><aside class="col-md-3"><nav id="toc" aria-label="Table of contents"><h2>On this page</h2>
</nav></aside>

137
articles/WHONET.md Normal file
View File

@@ -0,0 +1,137 @@
# Work with WHONET data
### Import of data
This tutorial assumes you have already imported the WHONET data with
e.g. the [`readxl` package](https://readxl.tidyverse.org/). In RStudio,
this can be done using the menu button 'Import Dataset' in the tab
'Environment'. Choose the option 'From Excel' and select your exported
file. Make sure date fields are imported correctly.
An example syntax could look like this:
``` r
library(readxl)
data <- read_excel(path = "path/to/your/file.xlsx")
```
This package comes with an [example data set
`WHONET`](https://amr-for-r.org/reference/WHONET.html). We will use it
for this analysis.
### Preparation
First, load the relevant packages if you have not done so yet. I use the
tidyverse for all of my analyses. All of them. If you don't know it yet,
I suggest you read about it on their website:
<https://www.tidyverse.org/>.
``` r
library(dplyr) # part of tidyverse
library(ggplot2) # part of tidyverse
library(AMR) # this package
library(cleaner) # to create frequency tables
```
We will have to transform some variables to simplify and automate the
analysis:
- Microorganisms should be transformed to our own microorganism codes
(called an `mo`) using [our Catalogue of Life reference data
set](https://amr-for-r.org/reference/catalogue_of_life), which
contains all ~70,000 microorganisms from the taxonomic kingdoms
Bacteria, Fungi and Protozoa. We do the transformation with
[`as.mo()`](https://amr-for-r.org/reference/as.mo.md). This function
also recognises almost all WHONET abbreviations of microorganisms.
- Antimicrobial results or interpretations have to be clean and valid.
In other words, they should only contain values `"S"`, `"I"` or `"R"`.
That is exactly what the
[`as.sir()`](https://amr-for-r.org/reference/as.sir.md) function is
for.
``` r
# transform variables
data <- WHONET %>%
# get microbial ID based on given organism
mutate(mo = as.mo(Organism)) %>%
# transform everything from "AMP_ND10" to "CIP_EE" to the new `sir` class
mutate_at(vars(AMP_ND10:CIP_EE), as.sir)
```
No errors or warnings, so all values are transformed successfully.
We also created a package dedicated to data cleaning and checking,
called the `cleaner` package. Its
[`freq()`](https://msberends.github.io/cleaner/reference/freq.html)
function can be used to create frequency tables.
So let's check our data, with a couple of frequency tables:
``` r
# our newly created `mo` variable, put in the mo_name() function
data %>% freq(mo_name(mo), nmax = 10)
```
**Frequency table**
Class: character
Length: 500
Available: 500 (100%, NA: 0 = 0%)
Unique: 38
Shortest: 11
Longest: 40
| | Item | Count | Percent | Cum. Count | Cum. Percent |
|:----|:-----------------------------------------|------:|--------:|-----------:|-------------:|
| 1 | Escherichia coli | 245 | 49.0% | 245 | 49.0% |
| 2 | Coagulase-negative Staphylococcus (CoNS) | 74 | 14.8% | 319 | 63.8% |
| 3 | Staphylococcus epidermidis | 38 | 7.6% | 357 | 71.4% |
| 4 | Streptococcus pneumoniae | 31 | 6.2% | 388 | 77.6% |
| 5 | Staphylococcus hominis | 21 | 4.2% | 409 | 81.8% |
| 6 | Proteus mirabilis | 9 | 1.8% | 418 | 83.6% |
| 7 | Enterococcus faecium | 8 | 1.6% | 426 | 85.2% |
| 8 | Staphylococcus capitis urealyticus | 8 | 1.6% | 434 | 86.8% |
| 9 | Enterobacter cloacae | 5 | 1.0% | 439 | 87.8% |
| 10 | Enterococcus columbae | 4 | 0.8% | 443 | 88.6% |
(omitted 28 entries, n = 57 \[11.4%\])
``` r
# our transformed antibiotic columns
# amoxicillin/clavulanic acid (J01CR02) as an example
data %>% freq(AMC_ND2)
```
**Frequency table**
Class: factor \> ordered \> sir (numeric)
Length: 500
Levels: 5: S \< SDD \< I \< R \< NI
Available: 481 (96.2%, NA: 19 = 3.8%)
Unique: 3
Drug: Amoxicillin/clavulanic acid (AMC, J01CR02/QJ01CR02)
Drug group: Beta-lactams/penicillins
%SI: 78.59%
| | Item | Count | Percent | Cum. Count | Cum. Percent |
|:----|:-----|------:|--------:|-----------:|-------------:|
| 1 | S | 356 | 74.01% | 356 | 74.01% |
| 2 | R | 103 | 21.41% | 459 | 95.43% |
| 3 | I | 22 | 4.57% | 481 | 100.00% |
### A first glimpse at results
An easy `ggplot` will already give a lot of information, using the
included [`ggplot_sir()`](https://amr-for-r.org/reference/ggplot_sir.md)
function:
``` r
data %>%
group_by(Country) %>%
select(Country, AMP_ND2, AMC_ED20, CAZ_ED10, CIP_ED5) %>%
ggplot_sir(translate_ab = "ab", facet = "Country", datalabels = FALSE)
```
![](WHONET_files/figure-html/unnamed-chunk-7-1.png)

View File

@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">

252
articles/WISCA.md Normal file
View File

@@ -0,0 +1,252 @@
# Estimating Empirical Coverage with WISCA
> This explainer was largely written by our [AMR for R
> Assistant](https://chat.amr-for-r.org), a ChatGPT manually-trained
> model able to answer any question about the `AMR` package.
## Introduction
Clinical guidelines for empirical antimicrobial therapy require
*probabilistic reasoning*: what is the chance that a regimen will cover
the likely infecting organisms, before culture results are available?
This is the purpose of **WISCA**, or **Weighted-Incidence Syndromic
Combination Antibiogram**.
WISCA is a Bayesian approach that integrates:
- **Pathogen prevalence** (how often each species causes the syndrome),
- **Regimen susceptibility** (how often a regimen works *if* the
pathogen is known),
to estimate the **overall empirical coverage** of antimicrobial
regimens, with quantified uncertainty.
This vignette explains how WISCA works, why it is useful, and how to
apply it using the `AMR` package.
## Why traditional antibiograms fall short
A standard antibiogram gives you:
Species → Antibiotic → Susceptibility %
But clinicians don't know the species *a priori*. They need to choose a
regimen that covers the **likely pathogens**, without knowing which one
is present.
Traditional antibiograms calculate the susceptibility % simply as the
number of susceptible isolates divided by the total number of tested
isolates. Therefore, traditional antibiograms:
- Fragment information by organism,
- Do not weight by real-world prevalence,
- Do not account for combination therapy or sample size,
- Do not provide uncertainty.
## The idea of WISCA
WISCA asks:
> “What is the **probability** that this regimen **will cover** the
> pathogen, given the syndrome?”
This means combining two things:
- **Incidence** of each pathogen in the syndrome,
- **Susceptibility** of each pathogen to the regimen.
We can write this as:
$$\text{Coverage} = \sum\limits_{i}\left( \text{Incidence}_{i} \times \text{Susceptibility}_{i} \right)$$
For example, suppose:
- *E. coli* causes 60% of cases, and 90% of *E. coli* are susceptible to
a drug.
- *Klebsiella* causes 40% of cases, and 70% of *Klebsiella* are
susceptible.
Then:
$$\text{Coverage} = (0.6 \times 0.9) + (0.4 \times 0.7) = 0.82$$
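The same arithmetic can be written as a weighted sum in base R. This is only a sketch of the formula above; the vector names are illustrative and the numbers are the ones from the example:

``` r
# Illustrative numbers from the example above
incidence      <- c(E_coli = 0.6, Klebsiella = 0.4)  # share of cases per pathogen
susceptibility <- c(E_coli = 0.9, Klebsiella = 0.7)  # share susceptible per pathogen

# Incidence-weighted coverage: sum over pathogens of incidence * susceptibility
coverage <- sum(incidence * susceptibility)
coverage
#> [1] 0.82
```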
But in real data, incidence and susceptibility are **estimated from
samples**, so they carry uncertainty. WISCA models this
**probabilistically**, using conjugate Bayesian distributions.
## The Bayesian engine behind WISCA
### Pathogen incidence
Let:
- $K$ be the number of pathogens,
- $\alpha = (1,1,\ldots,1)$ be a **Dirichlet** prior (uniform),
- $n = \left( n_{1},\ldots,n_{K} \right)$ be the observed counts per
species.
Then the posterior incidence is:
$$p \sim \text{Dirichlet}\left( \alpha_{1} + n_{1},\ldots,\alpha_{K} + n_{K} \right)$$
To simulate from this, we use:
$$x_{i} \sim \text{Gamma}\left( \alpha_{i} + n_{i},\ 1 \right),\quad p_{i} = \frac{x_{i}}{\sum\limits_{j = 1}^{K}x_{j}}$$
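As a rough illustration of this Gamma construction, the sketch below draws one incidence vector in base R. The counts are hypothetical and the helper `rdirichlet_one()` is not part of the `AMR` package; it only mirrors the two formulas above.

``` r
set.seed(42)  # only for reproducibility

# Hypothetical observed counts for K = 3 pathogens
n     <- c(60, 30, 10)
alpha <- rep(1, length(n))  # uniform Dirichlet prior

# One draw from Dirichlet(shape) via independent Gamma(shape, 1) variates
rdirichlet_one <- function(shape) {
  x <- rgamma(length(shape), shape = shape, rate = 1)
  x / sum(x)
}

p <- rdirichlet_one(alpha + n)
p       # simulated incidence per pathogen
sum(p)  # equals 1 up to floating-point error
```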
### Susceptibility
Each pathogen–regimen pair has a prior and data:
- Prior: $\text{Beta}\left( \alpha_{0},\beta_{0} \right)$, with default
$\alpha_{0} = \beta_{0} = 1$
- Data: $S$ susceptible out of $N$ tested
The $S$ category could also include values SDD (susceptible,
dose-dependent) and I (intermediate \[CLSI\], or susceptible, increased
exposure \[EUCAST\]).
Then the posterior is:
$$\theta \sim \text{Beta}\left( \alpha_{0} + S,\ \beta_{0} + N - S \right)$$
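To make the update concrete, here is a small base R sketch with hypothetical data (45 susceptible out of 50 tested) and the default prior. It only evaluates the posterior stated above and is not taken from the package internals.

``` r
# Hypothetical data: 45 susceptible out of 50 tested
S <- 45
N <- 50
alpha0 <- 1
beta0  <- 1

# Posterior parameters: Beta(alpha0 + S, beta0 + N - S)
shape1 <- alpha0 + S
shape2 <- beta0 + N - S

# Posterior mean and a 95% equal-tailed credible interval
post_mean <- shape1 / (shape1 + shape2)
post_ci   <- qbeta(c(0.025, 0.975), shape1, shape2)

post_mean
post_ci
```

With little data the interval is wide; as $N$ grows it narrows around the observed proportion.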
### Final coverage estimate
Putting it together:
1. Simulate pathogen incidence:
$\mathbf{p} \sim \text{Dirichlet}\left( \alpha_{1} + n_{1},\ldots,\alpha_{K} + n_{K} \right)$
2. Simulate susceptibility:
$\theta_{i} \sim \text{Beta}\left( 1 + S_{i},\ 1 + N_{i} - S_{i} \right)$
3. Combine:
$$\text{Coverage} = \sum\limits_{i = 1}^{K}p_{i} \cdot \theta_{i}$$
Repeat this simulation (e.g. 1000×) and summarise:
- **Mean** = expected coverage
- **Quantiles** = credible interval
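The three steps can be sketched in a few lines of base R. This is a simplified illustration with hypothetical counts for two pathogens, assuming the default priors described above; it is not the code that `wisca()` runs internally.

``` r
set.seed(123)  # only for reproducibility

# Hypothetical inputs for K = 2 pathogens
n <- c(60, 40)  # observed isolates per pathogen
S <- c(54, 28)  # susceptible isolates per pathogen
N <- c(60, 40)  # tested isolates per pathogen
n_sim <- 1000

coverage <- replicate(n_sim, {
  # Step 1: incidence from Dirichlet(1 + n), via Gamma draws
  x <- rgamma(length(n), shape = 1 + n, rate = 1)
  p <- x / sum(x)
  # Step 2: susceptibility from Beta(1 + S, 1 + N - S)
  theta <- rbeta(length(S), 1 + S, 1 + N - S)
  # Step 3: incidence-weighted coverage for this draw
  sum(p * theta)
})

mean(coverage)                       # expected coverage
quantile(coverage, c(0.025, 0.975))  # 95% credible interval
```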
## Practical use in the `AMR` package
### Prepare data and simulate synthetic syndrome
``` r
library(AMR)
data <- example_isolates
# Structure of our data
data
#> # A tibble: 2,000 × 46
#> date patient age gender ward mo PEN OXA FLC AMX
#> <date> <chr> <dbl> <chr> <chr> <mo> <sir> <sir> <sir> <sir>
#> 1 2002-01-02 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 2 2002-01-03 A77334 65 F Clinical B_ESCHR_COLI R NA NA NA
#> 3 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 4 2002-01-07 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 5 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 6 2002-01-13 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 7 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 8 2002-01-14 462729 78 M Clinical B_STPHY_AURS R NA S R
#> 9 2002-01-16 067927 45 F ICU B_STPHY_EPDR R NA R NA
#> 10 2002-01-17 858515 79 F ICU B_STPHY_EPDR R NA S NA
#> # 1,990 more rows
#> # 36 more variables: AMC <sir>, AMP <sir>, TZP <sir>, CZO <sir>, FEP <sir>,
#> # CXM <sir>, FOX <sir>, CTX <sir>, CAZ <sir>, CRO <sir>, GEN <sir>,
#> # TOB <sir>, AMK <sir>, KAN <sir>, TMP <sir>, SXT <sir>, NIT <sir>,
#> # FOS <sir>, LNZ <sir>, CIP <sir>, MFX <sir>, VAN <sir>, TEC <sir>,
#> # TCY <sir>, TGC <sir>, DOX <sir>, ERY <sir>, CLI <sir>, AZM <sir>,
#> # IPM <sir>, MEM <sir>, MTR <sir>, CHL <sir>, COL <sir>, MUP <sir>, …
# Add a fake syndrome column
data$syndrome <- ifelse(data$mo %like% "coli", "UTI", "No UTI")
```
### Basic WISCA antibiogram
``` r
wisca(data,
      antimicrobials = c("AMC", "CIP", "GEN"))
```
| Amoxicillin/clavulanic acid | Ciprofloxacin | Gentamicin |
|:----------------------------|:-----------------|:-------------------|
| 73.7% (71.7-75.8%) | 77% (74.3-79.4%) | 72.8% (70.7-74.8%) |
### Use combination regimens
``` r
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"))
```
| Amoxicillin/clavulanic acid | Amoxicillin/clavulanic acid + Ciprofloxacin | Amoxicillin/clavulanic acid + Gentamicin |
|:----------------------------|:--------------------------------------------|:-----------------------------------------|
| 73.8% (71.8-75.7%) | 87.5% (85.9-89%) | 89.7% (88.2-91.1%) |
### Stratify by syndrome
``` r
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"),
      syndromic_group = "syndrome")
```
| Syndromic Group | Amoxicillin/clavulanic acid | Amoxicillin/clavulanic acid + Ciprofloxacin | Amoxicillin/clavulanic acid + Gentamicin |
|:----------------|:----------------------------|:--------------------------------------------|:-----------------------------------------|
| No UTI | 70.1% (67.8-72.3%) | 85.2% (83.1-87.2%) | 87.1% (85.3-88.7%) |
| UTI | 80.9% (77.7-83.8%) | 88.2% (85.7-90.5%) | 90.9% (88.7-93%) |
The `AMR` package is available in 28 languages, which can all be used
for the [`wisca()`](https://amr-for-r.org/reference/antibiogram.md)
function too:
``` r
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"),
      syndromic_group = gsub("UTI", "UCI", data$syndrome),
      language = "Spanish")
```
| Grupo sindrómico | Amoxicilina/ácido clavulánico | Amoxicilina/ácido clavulánico + Ciprofloxacina | Amoxicilina/ácido clavulánico + Gentamicina |
|:-----------------|:------------------------------|:-----------------------------------------------|:--------------------------------------------|
| No UCI | 70% (67.8-72.4%) | 85.3% (83.3-87.2%) | 87% (85.3-88.8%) |
| UCI | 80.9% (77.7-83.9%) | 88.2% (85.5-90.6%) | 90.9% (88.7-93%) |
## Sensible defaults, which can be customised
- `simulations = 1000`: number of Monte Carlo draws
- `conf_interval = 0.95`: coverage interval width
- `combine_SI = TRUE`: count “I” and “SDD” as susceptible
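As a hedged example of overriding these defaults, the call below asks for more Monte Carlo draws and a 90% interval. The argument names are the ones listed above; the chosen values are arbitrary, and the `requireNamespace()` guard is only there so the snippet can be pasted without the package installed.

``` r
if (requireNamespace("AMR", quietly = TRUE)) {
  library(AMR)
  # More draws and a narrower (90%) interval than the defaults
  out <- wisca(example_isolates,
               antimicrobials = c("AMC", "CIP", "GEN"),
               simulations = 2000,
               conf_interval = 0.90)
  print(out)
}
```

Because the estimates are simulated, the printed percentages will differ slightly between runs.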
## Limitations
- It assumes your data are representative
- No adjustment for patient-level covariates, although these could be
passed on to the `syndromic_group` argument
- WISCA does not model resistance over time; you might want to use
`tidymodels` for that, for which we [wrote a basic
introduction](https://amr-for-r.org/articles/AMR_with_tidymodels.html)
## Summary
WISCA enables:
- Empirical regimen comparison,
- Syndrome-specific coverage estimation,
- Fully probabilistic interpretation.
It is available in the `AMR` package via either:
``` r
wisca(...)
antibiogram(..., wisca = TRUE)
```
## Reference
Bielicki, JA, et al. (2016). *Selecting appropriate empirical antibiotic
regimens for paediatric bloodstream infections: application of a
Bayesian decision model to local and pooled antimicrobial resistance
surveillance data.* **J Antimicrob Chemother**. 71(3):794-802.
<https://doi.org/10.1093/jac/dkv397>


@@ -30,7 +30,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
@@ -80,7 +80,7 @@
<main id="main" class="col-md-9"><div class="page-header">
<img src="../logo.svg" class="logo" alt=""><h1>Download data sets for download / own use</h1>
<h4 data-toc-skip class="date">13 October 2025</h4>
<h4 data-toc-skip class="date">24 November 2025</h4>
<small class="dont-index">Source: <a href="https://github.com/msberends/AMR/blob/main/vignettes/datasets.Rmd" class="external-link"><code>vignettes/datasets.Rmd</code></a></small>
<div class="d-none name"><code>datasets.Rmd</code></div>
@@ -417,14 +417,14 @@ all SNOMED codes as comma separated values.</p>
<h2 id="antimicrobials-antibiotic-and-antifungal-drugs">
<code>antimicrobials</code>: Antibiotic and Antifungal Drugs<a class="anchor" aria-label="anchor" href="#antimicrobials-antibiotic-and-antifungal-drugs"></a>
</h2>
<p>A data set with 496 rows and 14 columns, containing the following
<p>A data set with 498 rows and 14 columns, containing the following
column names:<br><em>ab</em>, <em>cid</em>, <em>name</em>, <em>group</em>, <em>atc</em>,
<em>atc_group1</em>, <em>atc_group2</em>, <em>abbreviations</em>,
<em>synonyms</em>, <em>oral_ddd</em>, <em>oral_units</em>,
<em>iv_ddd</em>, <em>iv_units</em>, and <em>loinc</em>.</p>
<p>This data set is in R available as <code>antimicrobials</code>, after
you load the <code>AMR</code> package.</p>
<p>It was last updated on 1 September 2025 14:56:55 UTC. Find more info
<p>It was last updated on 24 November 2025 10:24:02 UTC. Find more info
about the contents, (scientific) source, and structure of this <a href="https://amr-for-r.org/reference/antimicrobials.html">data set
here</a>.</p>
<p><strong>Direct download links:</strong></p>

articles/datasets.md Normal file

@@ -0,0 +1,561 @@
# Download data sets for download / own use
All reference data (about microorganisms, antimicrobials, SIR
interpretation, EUCAST rules, etc.) in this `AMR` package are reliable,
up-to-date and freely available. We continually export our data sets to
formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS,
and Stata. We also provide tab-separated text files that are
machine-readable and suitable for input in any software program, such as
laboratory information systems.
> If you are working in Python, be sure to use our [AMR for
> Python](https://amr-for-r.org/articles/AMR_for_Python.html) package.
> It allows all relevant AMR data sets to be natively available in
> Python.
## `microorganisms`: Full Microbial Taxonomy
A data set with 78 679 rows and 26 columns, containing the following
column names:
*mo*, *fullname*, *status*, *kingdom*, *phylum*, *class*, *order*,
*family*, *genus*, *species*, *subspecies*, *rank*, *ref*,
*oxygen_tolerance*, *source*, *lpsn*, *lpsn_parent*, *lpsn_renamed_to*,
*mycobank*, *mycobank_parent*, *mycobank_renamed_to*, *gbif*,
*gbif_parent*, *gbif_renamed_to*, *prevalence*, and *snomed*.
This data set is available in R as `microorganisms`, after you load the
`AMR` package.
It was last updated on 18 September 2025 12:58:34 UTC. Find more info
about the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/microorganisms.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.rds)
(1.8 MB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.txt)
(17.7 MB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.xlsx)
(8.8 MB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.feather)
(8.4 MB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.parquet)
(3.8 MB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.sav)
(28.4 MB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.dta)
(89.5 MB)
**NOTE: The exported files for SPSS and Stata contain only the first 50
SNOMED codes per record, as their file size would otherwise exceed 100
MB, the file size limit of GitHub.** Their file structures and
compression techniques are very inefficient. Advice? Use R instead. It's
free and much better in many ways.
The tab-separated text file and Microsoft Excel workbook both contain
all SNOMED codes as comma-separated values.
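After importing one of these files, such a cell is a single string that still has to be split. A minimal base R sketch, using only the first three codes shown for *Escherichia* in the example content below (the real cell holds more, and the variable names are illustrative):

``` r
# A 'snomed' cell as it might appear in the text or Excel export
snomed_cell <- "407310004, 407251000, 407281008"

# Split on commas and drop surrounding whitespace
snomed_codes <- trimws(strsplit(snomed_cell, ",", fixed = TRUE)[[1]])
snomed_codes
#> [1] "407310004" "407251000" "407281008"
```

Keeping the codes as character strings is deliberate: some SNOMED codes are too long to be stored as R integers.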
**Example content**
Included (sub)species per taxonomic kingdom:
| Kingdom | Number of (sub)species |
|:-----------------:|:----------------------:|
| (unknown kingdom) | 1 |
| Animalia | 1 628 |
| Archaea | 1 419 |
| Bacteria | 39 249 |
| Chromista | 178 |
| Fungi | 28 137 |
First 6 rows when filtering on genus *Escherichia*:
| mo | fullname | status | kingdom | phylum | class | order | family | genus | species | subspecies | rank | ref | oxygen_tolerance | source | lpsn | lpsn_parent | lpsn_renamed_to | mycobank | mycobank_parent | mycobank_renamed_to | gbif | gbif_parent | gbif_renamed_to | prevalence | snomed |
|:-----------------:|:--------------------------:|:--------:|:--------:|:--------------:|:-------------------:|:----------------:|:------------------:|:-----------:|:--------------:|:----------:|:----------:|:-----------------------:|:---------------------------:|:------:|:------:|:-----------:|:---------------:|:--------:|:---------------:|:-------------------:|:--------:|:-----------:|:---------------:|:----------:|:-----------------------------------------:|
| B_ESCHR | Escherichia | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | | | genus | Castellani et al., 1919 | facultative anaerobe | LPSN | 515602 | 482 | | | | | | 11158430 | | 1 | 407310004, 407251000, 407281008, … |
| B_ESCHR_ADCR | Escherichia adecarboxylata | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | adecarboxylata | | species | Leclerc, 1962 | likely facultative anaerobe | LPSN | 776052 | 515602 | 777447 | | | | | | | 1 | |
| B_ESCHR_ALBR | Escherichia albertii | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | albertii | | species | Huys et al., 2003 | facultative anaerobe | LPSN | 776053 | 515602 | | | | | 5427575 | | | 1 | 419388003 |
| B_ESCHR_BLTT | Escherichia blattae | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | blattae | | species | Burgess et al., 1973 | likely facultative anaerobe | LPSN | 776056 | 515602 | 788468 | | | | | | | 1 | |
| B_ESCHR_COLI | Escherichia coli | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | coli | | species | Castellani et al., 1919 | facultative anaerobe | LPSN | 776057 | 515602 | | | | | 11286021 | | | 1 | 1095001000112106, 715307006, 737528008, … |
| B_ESCHR_COLI_COLI | Escherichia coli coli | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | coli | coli | subspecies | | | GBIF | | 776057 | | | | | 12233256 | 11286021 | | 1 | |
------------------------------------------------------------------------
## `antimicrobials`: Antibiotic and Antifungal Drugs
A data set with 498 rows and 14 columns, containing the following column
names:
*ab*, *cid*, *name*, *group*, *atc*, *atc_group1*, *atc_group2*,
*abbreviations*, *synonyms*, *oral_ddd*, *oral_units*, *iv_ddd*,
*iv_units*, and *loinc*.
This data set is available in R as `antimicrobials`, after you load the
`AMR` package.
It was last updated on 24 November 2025 10:24:02 UTC. Find more info
about the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/antimicrobials.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.rds)
(45 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.txt)
(0.1 MB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.xlsx)
(78 kB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.feather)
(0.1 MB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.parquet)
(0.1 MB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.sav)
(0.4 MB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antimicrobials.dta)
(10 kB)
The tab-separated text, Microsoft Excel, SPSS, and Stata files all
contain the ATC codes, common abbreviations, trade names and LOINC codes
as comma-separated values.
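For use outside the package, the tab-separated export can be read straight from its download link. The sketch below is one possible way to do that in base R; the helper names `amr_dataset_url()` and `read_amr_txt()` are made up for this example, and the import options (quoting, encoding) are assumptions that may need adjusting for your setup.

``` r
# Build the download URL for one of the exported files
# (helper names are illustrative, not part of the AMR package)
amr_dataset_url <- function(name, ext = "txt") {
  base <- "https://github.com/msberends/AMR/raw/main/data-raw/datasets"
  paste0(base, "/", name, ".", ext)
}

# Read the tab-separated export into a data frame (requires internet access)
read_amr_txt <- function(name) {
  read.delim(amr_dataset_url(name), stringsAsFactors = FALSE)
}

amr_dataset_url("antimicrobials")
# antimicrobials <- read_amr_txt("antimicrobials")  # uncomment to download
```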
**Example content**
| ab | cid | name | group | atc | atc_group1 | atc_group2 | abbreviations | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | loinc |
|:---:|:--------:|:---------------------------:|:------------------------:|:------------------------------:|:-------------------------------------------:|:------------------------------------------------------------:|:-------------------:|:-------------------------------------------------------:|:--------:|:----------:|:------:|:--------:|:------------------------------:|
| AMK | 37768 | Amikacin | Aminoglycosides | D06AX12, J01GB06, QD06AX12, … | Aminoglycoside antibacterials | Other aminoglycosides | ak, ami, amik, … | amikacillin, amikacina, amikacine, … | | | 1.0 | g | 101493-5, 11-7, 12-5, … |
| AMX | 33613 | Amoxicillin | Beta-lactams/penicillins | J01CA04, QG51AA03, QJ01CA04 | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | ac, amox, amoxic, … | acuotricina, alfamox, alfida, … | 1.5 | g | 3.0 | g | 101498-4, 15-8, 16-6, … |
| AMC | 23665637 | Amoxicillin/clavulanic acid | Beta-lactams/penicillins | J01CR02, QJ01CR02 | Beta-lactam antibacterials, penicillins | Combinations of penicillins, incl. beta-lactamase inhibitors | a/c, amcl, aml, … | amocla, amoclan, amoclav, … | 1.5 | g | 3.0 | g | |
| AMP | 6249 | Ampicillin | Beta-lactams/penicillins | J01CA01, QJ01CA01, QJ51CA01, … | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | am, amp, amp100, … | adobacillin, alpen, amblosin, … | 2.0 | g | 6.0 | g | 101477-8, 101478-6, 18864-9, … |
| AZM | 447043 | Azithromycin | Macrolides/lincosamides | J01FA10, QJ01FA10, QS01AA26, … | Macrolides, lincosamides and streptogramins | Macrolides | az, azi, azit, … | aritromicina, aruzilina, azasite, … | 0.3 | g | 0.5 | g | 100043-9, 16420-2, 16421-0, … |
| PEN | 5904 | Benzylpenicillin | Beta-lactams/penicillins | J01CE01, QJ01CE01, QJ51CE01, … | Combinations of antibacterials | Combinations of antibacterials | bepe, pen, peni, … | bencilpenicilina, benzopenicillin, benzylpenicilline, … | | | 3.6 | g | |
------------------------------------------------------------------------
## `clinical_breakpoints`: Interpretation from MIC values & disk diameters to SIR
A data set with 40 217 rows and 14 columns, containing the following
column names:
*guideline*, *type*, *host*, *method*, *site*, *mo*, *rank_index*, *ab*,
*ref_tbl*, *disk_dose*, *breakpoint_S*, *breakpoint_R*, *uti*, and
*is_SDD*.
This data set is available in R as `clinical_breakpoints`, after you
load the `AMR` package.
It was last updated on 20 April 2025 10:55:31 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/clinical_breakpoints.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.rds)
(88 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.txt)
(3.7 MB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.xlsx)
(2.4 MB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.feather)
(1.8 MB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.parquet)
(0.1 MB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.sav)
(6.6 MB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/clinical_breakpoints.dta)
(11.1 MB)
**Example content**
| guideline | type | host | method | site | mo | mo_name | rank_index | ab | ab_name | ref_tbl | disk_dose | breakpoint_S | breakpoint_R | uti | is_SDD |
|:-----------:|:-----:|:-----:|:------:|:----:|:-------------:|:--------------------------:|:----------:|:---:|:-----------------------------:|:---------------:|:--------------:|:------------:|:------------:|:-----:|:------:|
| EUCAST 2025 | human | human | DISK | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | MEM | Meropenem | A. xylosoxidans | 10 mcg | 26.000 | 20.000 | FALSE | FALSE |
| EUCAST 2025 | human | human | MIC | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | MEM | Meropenem | A. xylosoxidans | | 1.000 | 4.000 | FALSE | FALSE |
| EUCAST 2025 | human | human | DISK | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | SXT | Trimethoprim/sulfamethoxazole | A. xylosoxidans | 1.25/23.75 mcg | 26.000 | 26.000 | FALSE | FALSE |
| EUCAST 2025 | human | human | MIC | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | SXT | Trimethoprim/sulfamethoxazole | A. xylosoxidans | | 0.125 | 0.125 | FALSE | FALSE |
| EUCAST 2025 | human | human | DISK | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | TZP | Piperacillin/tazobactam | A. xylosoxidans | 30/6 mcg | 26.000 | 26.000 | FALSE | FALSE |
| EUCAST 2025 | human | human | MIC | | B_ACHRMB_XYLS | Achromobacter xylosoxidans | 2 | TZP | Piperacillin/tazobactam | A. xylosoxidans | | 4.000 | 4.000 | FALSE | FALSE |
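To show how the `breakpoint_S` and `breakpoint_R` columns are meant to be read, here is a deliberately simplified base R sketch for MIC values. It assumes the EUCAST convention (S if MIC ≤ `breakpoint_S`, R if MIC > `breakpoint_R`, otherwise I); the helper `interpret_mic()` is hypothetical, and for real data the package's own `as.sir()` should be used, since it also handles other guidelines, disk diameters, and special cases.

``` r
# Simplified MIC interpretation, ASSUMING the EUCAST convention:
#   S if MIC <= breakpoint_S, R if MIC > breakpoint_R, otherwise I
interpret_mic <- function(mic, breakpoint_S, breakpoint_R) {
  ifelse(mic <= breakpoint_S, "S",
         ifelse(mic > breakpoint_R, "R", "I"))
}

# Meropenem MIC breakpoints for A. xylosoxidans from the table above (S = 1, R = 4)
interpret_mic(c(0.5, 2, 8), breakpoint_S = 1, breakpoint_R = 4)
#> [1] "S" "I" "R"
```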
------------------------------------------------------------------------
## `microorganisms.groups`: Species Groups and Microbiological Complexes
A data set with 534 rows and 4 columns, containing the following column
names:
*mo_group*, *mo*, *mo_group_name*, and *mo_name*.
This data set is available in R as `microorganisms.groups`, after you
load the `AMR` package.
It was last updated on 26 March 2025 16:19:17 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/microorganisms.groups.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.rds)
(6 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.txt)
(50 kB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.xlsx)
(20 kB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.feather)
(19 kB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.parquet)
(13 kB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.sav)
(65 kB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.groups.dta)
(83 kB)
**Example content**
| mo_group | mo | mo_group_name | mo_name |
|:--------------:|:------------:|:-------------------------------:|:---------------------------:|
| B_ACNTB_BMNN-C | B_ACNTB_BMNN | Acinetobacter baumannii complex | Acinetobacter baumannii |
| B_ACNTB_BMNN-C | B_ACNTB_CLCC | Acinetobacter baumannii complex | Acinetobacter calcoaceticus |
| B_ACNTB_BMNN-C | B_ACNTB_LCTC | Acinetobacter baumannii complex | Acinetobacter dijkshoorniae |
| B_ACNTB_BMNN-C | B_ACNTB_NSCM | Acinetobacter baumannii complex | Acinetobacter nosocomialis |
| B_ACNTB_BMNN-C | B_ACNTB_PITT | Acinetobacter baumannii complex | Acinetobacter pittii |
| B_ACNTB_BMNN-C | B_ACNTB_SFRT | Acinetobacter baumannii complex | Acinetobacter seifertii |
------------------------------------------------------------------------
## `intrinsic_resistant`: Intrinsic Bacterial Resistance
A data set with 271 905 rows and 2 columns, containing the following
column names:
*mo* and *ab*.
This data set is available in R as `intrinsic_resistant`, after you load
the `AMR` package.
It was last updated on 28 March 2025 10:17:49 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/intrinsic_resistant.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.rds)
(0.1 MB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.txt)
(10.1 MB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.xlsx)
(2.9 MB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.feather)
(2.3 MB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.parquet)
(0.3 MB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.sav)
(14.8 MB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/intrinsic_resistant.dta)
(22.6 MB)
**Example content**
Example rows when filtering on *Enterobacter cloacae*:
| microorganism | antibiotic |
|:--------------------:|:---------------------------:|
| Enterobacter cloacae | Acetylmidecamycin |
| Enterobacter cloacae | Acetylspiramycin |
| Enterobacter cloacae | Amoxicillin |
| Enterobacter cloacae | Amoxicillin/clavulanic acid |
| Enterobacter cloacae | Ampicillin |
| Enterobacter cloacae | Ampicillin/sulbactam |
| Enterobacter cloacae | Avoparcin |
| Enterobacter cloacae | Azithromycin |
| Enterobacter cloacae | Benzylpenicillin |
| Enterobacter cloacae | Bleomycin |
| Enterobacter cloacae | Cadazolid |
| Enterobacter cloacae | Cefadroxil |
| Enterobacter cloacae | Cefalexin |
| Enterobacter cloacae | Cefalotin |
| Enterobacter cloacae | Cefazolin |
| Enterobacter cloacae | Cefoxitin |
| Enterobacter cloacae | Clarithromycin |
| Enterobacter cloacae | Clindamycin |
| Enterobacter cloacae | Cycloserine |
| Enterobacter cloacae | Dalbavancin |
| Enterobacter cloacae | Dirithromycin |
| Enterobacter cloacae | Erythromycin |
| Enterobacter cloacae | Flurithromycin |
| Enterobacter cloacae | Fusidic acid |
| Enterobacter cloacae | Gamithromycin |
| Enterobacter cloacae | Josamycin |
| Enterobacter cloacae | Kitasamycin |
| Enterobacter cloacae | Lincomycin |
| Enterobacter cloacae | Linezolid |
| Enterobacter cloacae | Meleumycin |
| Enterobacter cloacae | Midecamycin |
| Enterobacter cloacae | Miocamycin |
| Enterobacter cloacae | Nafithromycin |
| Enterobacter cloacae | Norvancomycin |
| Enterobacter cloacae | Oleandomycin |
| Enterobacter cloacae | Oritavancin |
| Enterobacter cloacae | Pirlimycin |
| Enterobacter cloacae | Pristinamycin |
| Enterobacter cloacae | Quinupristin/dalfopristin |
| Enterobacter cloacae | Ramoplanin |
| Enterobacter cloacae | Rifampicin |
| Enterobacter cloacae | Rokitamycin |
| Enterobacter cloacae | Roxithromycin |
| Enterobacter cloacae | Solithromycin |
| Enterobacter cloacae | Spiramycin |
| Enterobacter cloacae | Tedizolid |
| Enterobacter cloacae | Teicoplanin |
| Enterobacter cloacae | Telavancin |
| Enterobacter cloacae | Telithromycin |
| Enterobacter cloacae | Thiacetazone |
| Enterobacter cloacae | Tildipirosin |
| Enterobacter cloacae | Tilmicosin |
| Enterobacter cloacae | Troleandomycin |
| Enterobacter cloacae | Tulathromycin |
| Enterobacter cloacae | Tylosin |
| Enterobacter cloacae | Tylvalosin |
| Enterobacter cloacae | Vancomycin |
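A two-column table like this works as a simple lookup: a pair is intrinsically resistant if it is listed. The sketch below uses a toy excerpt with names instead of codes, as in the example table; the data frame `intrinsic` and the helper `is_intrinsic()` are illustrative and not part of the package.

``` r
# Toy excerpt of the example table above (names instead of mo/ab codes)
intrinsic <- data.frame(
  microorganism = "Enterobacter cloacae",
  antibiotic    = c("Amoxicillin", "Cefazolin", "Vancomycin"),
  stringsAsFactors = FALSE
)

# TRUE if the organism-drug pair is listed as intrinsically resistant
is_intrinsic <- function(mo, ab, tbl = intrinsic) {
  any(tbl$microorganism == mo & tbl$antibiotic == ab)
}

is_intrinsic("Enterobacter cloacae", "Vancomycin")  # listed above
is_intrinsic("Enterobacter cloacae", "Meropenem")   # not listed above
```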
------------------------------------------------------------------------
## `dosage`: Dosage Guidelines from EUCAST
A data set with 759 rows and 9 columns, containing the following column
names:
*ab*, *name*, *type*, *dose*, *dose_times*, *administration*, *notes*,
*original_txt*, and *eucast_version*.
This data set is available in R as `dosage`, after you load the `AMR`
package.
It was last updated on 20 April 2025 10:55:31 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/dosage.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.rds)
(4 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.txt)
(66 kB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.xlsx)
(37 kB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.feather)
(28 kB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.parquet)
(9 kB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.sav)
(97 kB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/dosage.dta)
(0.2 MB)
**Example content**
| ab | name | type | dose | dose_times | administration | notes | original_txt | eucast_version |
|:---:|:-----------:|:-----------------:|:-----------:|:----------:|:--------------:|:-----:|:------------------:|:--------------:|
| AMK | Amikacin | standard_dosage | 25-30 mg/kg | 1 | iv | | 25-30 mg/kg x 1 iv | 15 |
| AMX | Amoxicillin | high_dosage | 2 g | 6 | iv | | 2 g x 6 iv | 15 |
| AMX | Amoxicillin | standard_dosage | 1 g | 3 | iv | | 1 g x 3-4 iv | 15 |
| AMX | Amoxicillin | high_dosage | 0.75-1 g | 3 | oral | | 0.75-1 g x 3 oral | 15 |
| AMX | Amoxicillin | standard_dosage | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 15 |
| AMX | Amoxicillin | uncomplicated_uti | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 15 |
------------------------------------------------------------------------
## `example_isolates`: Example Data for Practice
A data set with 2 000 rows and 46 columns, containing the following
column names:
*date*, *patient*, *age*, *gender*, *ward*, *mo*, *PEN*, *OXA*, *FLC*,
*AMX*, *AMC*, *AMP*, *TZP*, *CZO*, *FEP*, *CXM*, *FOX*, *CTX*, *CAZ*,
*CRO*, *GEN*, *TOB*, *AMK*, *KAN*, *TMP*, *SXT*, *NIT*, *FOS*, *LNZ*,
*CIP*, *MFX*, *VAN*, *TEC*, *TCY*, *TGC*, *DOX*, *ERY*, *CLI*, *AZM*,
*IPM*, *MEM*, *MTR*, *CHL*, *COL*, *MUP*, and *RIF*.
This data set is available in R as `example_isolates`, after you load
the `AMR` package.
It was last updated on 15 June 2024 13:33:49 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/example_isolates.html).
**Example content**
| date | patient | age | gender | ward | mo | PEN | OXA | FLC | AMX | AMC | AMP | TZP | CZO | FEP | CXM | FOX | CTX | CAZ | CRO | GEN | TOB | AMK | KAN | TMP | SXT | NIT | FOS | LNZ | CIP | MFX | VAN | TEC | TCY | TGC | DOX | ERY | CLI | AZM | IPM | MEM | MTR | CHL | COL | MUP | RIF |
|:----------:|:-------:|:---:|:------:|:--------:|:------------:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 2002-01-02 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | | | | I | | | | | I | | | | | | | | | R | R | | | R | | | R | R | R | | | R | R | R | | | | | | | R |
| 2002-01-03 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | | | | I | | | | | I | | | | | | | | | R | R | | | R | | | R | R | R | | | R | R | R | | | | | | | R |
| 2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | | R | | | | | | | R | | | R | | | | | | S | S | | | | | | S | | S | S | S | R | | R | | | | | R | | |
| 2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | | R | | | | | | | R | | | R | | | | | | S | S | | | | | | S | | S | S | S | R | | R | | | | | R | | |
| 2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | | R | | | | | | | R | | | R | | | | | | R | | | | | | | S | | S | S | S | R | | R | | | | | R | | |
| 2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | | R | | | | | | | R | | | R | | | | | | R | | | | | | | S | | S | S | S | R | R | R | | | | | R | | |
------------------------------------------------------------------------
## `example_isolates_unclean`: Example Data for Practice
A data set with 3 000 rows and 8 columns, containing the following
column names:
*patient_id*, *hospital*, *date*, *bacteria*, *AMX*, *AMC*, *CIP*, and
*GEN*.
This data set is available in R as `example_isolates_unclean` after you
load the `AMR` package.
It was last updated on 27 August 2022 18:49:37 UTC. Find more info about
the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/example_isolates_unclean.html).
**Example content**
| patient_id | hospital | date | bacteria | AMX | AMC | CIP | GEN |
|:----------:|:--------:|:----------:|:-------------:|:---:|:---:|:---:|:---:|
| J3 | A | 2012-11-21 | E. coli | R | I | S | S |
| R7 | A | 2018-04-03 | K. pneumoniae | R | I | S | S |
| P3 | A | 2014-09-19 | E. coli | R | S | S | S |
| P10 | A | 2015-12-10 | E. coli | S | I | S | S |
| B7 | A | 2015-03-02 | E. coli | S | S | S | S |
| W3 | A | 2018-03-31 | S. aureus | R | S | R | S |
------------------------------------------------------------------------
## `microorganisms.codes`: Common Laboratory Codes
A data set with 6 036 rows and 2 columns, containing the following
column names:
*code* and *mo*.
This data set is available in R as `microorganisms.codes` after you
load the `AMR` package.
It was last updated on 4 May 2025 16:50:25 UTC. Find more info about the
contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/microorganisms.codes.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.rds)
(27 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.txt)
(0.1 MB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.xlsx)
(98 kB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.feather)
(0.1 MB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.parquet)
(68 kB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.sav)
(0.2 MB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/microorganisms.codes.dta)
(0.2 MB)
**Example content**
| code | mo |
|:----:|:------------:|
| 1011 | B_GRAMP |
| 1012 | B_GRAMP |
| 1013 | B_GRAMN |
| 1014 | B_GRAMN |
| 1015 | F_YEAST |
| 103 | B_ESCHR_COLI |
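
Outside R, the two-column layout lends itself to a plain lookup table. The following is a minimal sketch, assuming the tab-separated download has a header row with the column names *code* and *mo* as listed above; the inline sample and the helper name `load_code_map` are illustrative only and not part of the `AMR` package.

```python
import csv
import io

# Inline sample mirroring a few rows of the example content above.
# Assumption: the real microorganisms.codes.txt uses the same header.
SAMPLE = (
    "code\tmo\n"
    "1011\tB_GRAMP\n"
    "1013\tB_GRAMN\n"
    "1015\tF_YEAST\n"
    "103\tB_ESCHR_COLI\n"
)


def load_code_map(handle):
    """Return a dict mapping each laboratory code to its `mo` code.

    Codes are kept as strings so that values such as "103" are not
    silently converted to numbers.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["code"]: row["mo"] for row in reader}


code_map = load_code_map(io.StringIO(SAMPLE))
print(code_map.get("103"))
```

To use a downloaded file instead of the sample, pass an open file handle, for example `open("microorganisms.codes.txt", newline="", encoding="utf-8")`, where the local file name is whatever you saved the download as.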
------------------------------------------------------------------------
## `antivirals`: Antiviral Drugs
A data set with 120 rows and 11 columns, containing the following column
names:
*av*, *name*, *atc*, *cid*, *atc_group*, *synonyms*, *oral_ddd*,
*oral_units*, *iv_ddd*, *iv_units*, and *loinc*.
This data set is available in R as `antivirals` after you load the
`AMR` package.
It was last updated on 20 October 2023 12:51:48 UTC. Find more info
about the contents, (scientific) source, and structure of this [data set
here](https://amr-for-r.org/reference/antimicrobials.html).
**Direct download links:**
- Download as [original R Data Structure (RDS)
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.rds)
(6 kB)
- Download as [tab-separated text
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.txt)
(17 kB)
- Download as [Microsoft Excel
workbook](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.xlsx)
(16 kB)
- Download as [Apache Feather
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.feather)
(16 kB)
- Download as [Apache Parquet
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.parquet)
(13 kB)
- Download as [IBM SPSS Statistics data
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.sav)
(32 kB)
- Download as [Stata DTA
file](https://github.com/msberends/AMR/raw/main/data-raw/datasets/antivirals.dta)
(78 kB)
The tab-separated text, Microsoft Excel, SPSS, and Stata files all
contain the trade names and LOINC codes as comma-separated values.
**Example content**
| av | name | atc | cid | atc_group | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | loinc |
|:---:|:------------------:|:-------:|:---------:|:------------------------------------------------------------------:|:-----------------------------------------------------:|:--------:|:----------:|:------:|:--------:|:----------------------------:|
| ABA | Abacavir | J05AF06 | 441300 | Nucleoside and nucleotide reverse transcriptase inhibitors | abacavir sulfate, avacavir, ziagen | 0.6 | g | | | 29113-8, 30273-7, 30287-7, … |
| ACI | Aciclovir | J05AB01 | 135398513 | Nucleosides and nucleotides excl. reverse transcriptase inhibitors | acicloftal, aciclovier, aciclovirum, … | 4.0 | g | 4 | g | |
| ADD | Adefovir dipivoxil | J05AF08 | 60871 | Nucleoside and nucleotide reverse transcriptase inhibitors | adefovir di, adefovir di ester, adefovir dipivoxyl, … | 10.0 | mg | | | |
| AME | Amenamevir | J05AX26 | 11397521 | Other antivirals | amenalief | 0.4 | g | | | |
| AMP | Amprenavir | J05AE05 | 65016 | Protease inhibitors | agenerase, carbamate, prozei | 1.2 | g | | | 29114-6, 30296-8, 30297-6, … |
| ASU | Asunaprevir | J05AP06 | 16076883 | Antivirals for treatment of HCV infections | sunvepra, sunvepratrade | 0.2 | g | | | |
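
Because the text, Excel, SPSS, and Stata files store several values in one cell, a reader outside R typically has to split those cells again. The sketch below shows one way to do that for the tab-separated file. It assumes a header row with the column names listed above and that *synonyms* and *loinc* are the comma-separated columns; the helper names `split_list` and `read_antivirals` are hypothetical, and the sample holds only a few columns and the LOINC codes visible in the (truncated) example content.

```python
import csv
import io

# Reduced inline sample based on the example content above.
# Assumption: the real antivirals.txt has the same column names.
SAMPLE = (
    "av\tname\tatc\tsynonyms\tloinc\n"
    "ABA\tAbacavir\tJ05AF06\tabacavir sulfate, avacavir, ziagen\t"
    "29113-8, 30273-7, 30287-7\n"
    "AME\tAmenamevir\tJ05AX26\tamenalief\t\n"
)

LIST_COLUMNS = ("synonyms", "loinc")


def split_list(cell):
    """Split a comma-separated cell into a list of trimmed, non-empty items."""
    return [part.strip() for part in (cell or "").split(",") if part.strip()]


def read_antivirals(handle):
    """Read tab-separated rows and turn the list-like columns into lists."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for column in LIST_COLUMNS:
            row[column] = split_list(row.get(column))
        rows.append(row)
    return rows


rows = read_antivirals(io.StringIO(SAMPLE))
for row in rows:
    print(row["av"], row["synonyms"], row["loinc"])
```

Splitting on the comma and trimming whitespace works whether or not the file puts a space after each comma; empty cells become empty lists rather than a list holding one empty string.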


@@ -7,7 +7,7 @@
<a class="navbar-brand me-2" href="../index.html">AMR (for R)</a>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9002</small>
<small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.0.1.9003</small>
<button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">

16
articles/index.md Normal file

@@ -0,0 +1,16 @@
# Articles
### All vignettes
- [AMR for Python](https://amr-for-r.org/articles/AMR_for_Python.md)
- [AMR with
tidymodels](https://amr-for-r.org/articles/AMR_with_tidymodels.md)
- [Conduct AMR data analysis](https://amr-for-r.org/articles/AMR.md)
- [Download data sets for download / own
use](https://amr-for-r.org/articles/datasets.md)
- [Apply EUCAST rules](https://amr-for-r.org/articles/EUCAST.md)
- [Conduct principal component analysis (PCA) for
AMR](https://amr-for-r.org/articles/PCA.md)
- [Work with WHONET data](https://amr-for-r.org/articles/WHONET.md)
- [Estimating Empirical Coverage with
WISCA](https://amr-for-r.org/articles/WISCA.md)