---
title: "Estimating Empirical Coverage with WISCA"
output: 
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 3
vignette: >
  %\VignetteIndexEntry{Estimating Empirical Coverage with WISCA}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  warning = FALSE,
  collapse = TRUE,
  comment = "#>",
  fig.width = 7.5,
  fig.height = 5
)
```

> This explainer was largely written by our [AMR for R Assistant](https://chat.amr-for-r.org), a ChatGPT manually-trained model able to answer any question about the `AMR` package.

## Introduction

Clinical guidelines for empirical antimicrobial therapy require *probabilistic reasoning*: what is the chance that a regimen will cover the likely infecting organisms, before culture results are available?

This is the purpose of **WISCA**, or **Weighted-Incidence Syndromic Combination Antibiogram**.

WISCA is a Bayesian approach that integrates:

- **Pathogen prevalence** (how often each species causes the syndrome),
- **Regimen susceptibility** (how often a regimen works *if* the pathogen is known),

to estimate the **overall empirical coverage** of antimicrobial regimens, with quantified uncertainty.

This vignette explains how WISCA works, why it is useful, and how to apply it using the `AMR` package.

## Why traditional antibiograms fall short

A standard antibiogram gives you:

```
Species → Antibiotic → Susceptibility %
```

But clinicians don’t know the species *a priori*. They need to choose a regimen that covers the **likely pathogens**, without knowing which one is present.

Traditional antibiograms calculate the susceptibility % as just the number of resistant isolates divided by the total number of tested isolates. Therefore, traditional antibiograms:

- Fragment information by organism,
- Do not weight by real-world prevalence,
- Do not account for combination therapy or sample size,
- Do not provide uncertainty.

## The idea of WISCA

WISCA asks:

> "What is the **probability** that this regimen **will cover** the pathogen, given the syndrome?"

This means combining two things:

- **Incidence** of each pathogen in the syndrome,
- **Susceptibility** of each pathogen to the regimen.

We can write this as:

$$\text{Coverage} = \sum_i (\text{Incidence}_i \times \text{Susceptibility}_i)$$

For example, suppose:

- *E. coli* causes 60% of cases, and 90% of *E. coli* are susceptible to a drug.
- *Klebsiella* causes 40% of cases, and 70% of *Klebsiella* are susceptible.

Then:

$$\text{Coverage} = (0.6 \times 0.9) + (0.4 \times 0.7) = 0.82$$

But in real data, incidence and susceptibility are **estimated from samples**, so they carry uncertainty. WISCA models this **probabilistically**, using conjugate Bayesian distributions.

## The Bayesian engine behind WISCA

### Pathogen incidence

Let:

- $K$ be the number of pathogens,
- $\alpha = (1, 1, \ldots, 1)$ be a **Dirichlet** prior (uniform),
- $n = (n_1, \ldots, n_K)$ be the observed counts per species.

Then the posterior incidence is:

$$p \sim \text{Dirichlet}(\alpha_1 + n_1, \ldots, \alpha_K + n_K)$$

To simulate from this, we use:

$$x_i \sim \text{Gamma}(\alpha_i + n_i,\ 1), \quad p_i = \frac{x_i}{\sum_{j=1}^{K} x_j}$$

### Susceptibility

Each pathogen–regimen pair has a prior and data:

- Prior: $\text{Beta}(\alpha_0, \beta_0)$, with default $\alpha_0 = \beta_0 = 1$
- Data: $S$ susceptible out of $N$ tested

The $S$ category could also include values SDD (susceptible, dose-dependent) and I (intermediate [CLSI], or susceptible, increased exposure [EUCAST]).

Then the posterior is:

$$\theta \sim \text{Beta}(\alpha_0 + S,\ \beta_0 + N - S)$$

### Final coverage estimate

Putting it together:

1. Simulate pathogen incidence: $\boldsymbol{p} \sim \text{Dirichlet}$
2. Simulate susceptibility: $\theta_i \sim \text{Beta}(1 + S_i,\ 1 + R_i)$
3. Combine:

$$\text{Coverage} = \sum_{i=1}^{K} p_i \cdot \theta_i$$

Repeat this simulation (e.g. 1000×) and summarise:

- **Mean** = expected coverage
- **Quantiles** = credible interval

## Practical use in the `AMR` package

### Prepare data and simulate synthetic syndrome

```{r}
library(AMR)
data <- example_isolates

# Structure of our data
data

# Add a fake syndrome column
data$syndrome <- ifelse(data$mo %like% "coli", "UTI", "No UTI")
```

### Basic WISCA antibiogram

```{r}
wisca(data,
      antimicrobials = c("AMC", "CIP", "GEN"))
```

### Use combination regimens

```{r}
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"))
```

### Stratify by syndrome

```{r}
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"),
      syndromic_group = "syndrome")
```

The `AMR` package is available in `r length(AMR:::LANGUAGES_SUPPORTED)` languages, which can all be used for the `wisca()` function too:

```{r}
wisca(data,
      antimicrobials = c("AMC", "AMC + CIP", "AMC + GEN"),
      syndromic_group = gsub("UTI", "UCI", data$syndrome),
      language = "Spanish")
```


## Sensible defaults, which can be customised

- `simulations = 1000`: number of Monte Carlo draws
- `conf_interval = 0.95`: coverage interval width
- `combine_SI = TRUE`: count "I" and "SDD" as susceptible

## Limitations

- It assumes your data are representative
- No adjustment for patient-level covariates, although these could be passed onto the `syndromic_group` argument
- WISCA does not model resistance over time, you might want to use `tidymodels` for that, for which we [wrote a basic introduction](https://amr-for-r.org/articles/AMR_with_tidymodels.html)

## Summary

WISCA enables:

- Empirical regimen comparison,
- Syndrome-specific coverage estimation,
- Fully probabilistic interpretation.

It is available in the `AMR` package via either:

```r
wisca(...)

antibiogram(..., wisca = TRUE)
```

## Reference

Bielicki, JA, et al. (2016). *Selecting appropriate empirical antibiotic regimens for paediatric bloodstream infections: application of a Bayesian decision model to local and pooled antimicrobial resistance surveillance data.* **J Antimicrob Chemother**. 71(3):794-802. https://doi.org/10.1093/jac/dkv397