mirror of
https://github.com/msberends/AMR.git
synced 2026-03-11 15:47:54 +01:00
* mdro(): infer base drug resistance from drug+inhibitor combination columns (#209)

  When a base beta-lactam column (e.g., piperacillin/PIP) is absent but a corresponding drug+inhibitor combination (e.g., piperacillin/tazobactam/TZP) is present and resistant, resistance in the base drug is now correctly inferred. This is clinically sound: resistance in a combination implies the inhibitor provided no benefit, so the base drug is also resistant. Susceptibility in a combination is NOT propagated to the base drug (the inhibitor may be responsible for susceptibility), so only R values are inferred; missing base drugs remain NA otherwise.

  Implementation details:
  - Uses AB_BETALACTAMS_WITH_INHIBITOR to identify all beta-lactam+inhibitor combinations present in the user's data
  - Derives base drug AB codes by stripping the "/inhibitor" part from names
  - Creates synthetic proxy columns (.sir_proxy_<AB>) in x, set to "R" when any matching combination is R, otherwise NA
  - Proxy columns are added to cols_ab before drug variable assignment, so all existing guideline logic benefits without any changes
  - Multiple combos for the same base drug are OR-ed (any R → R)
  - Adds internal ab_without_inhibitor() helper for the name->base mapping
  - Verbose mode reports which combinations are used for inference

  Bumps version: 3.0.1.9028 -> 3.0.1.9029

  https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Add sir.R/mic.R fixes and mdro() unit tests; bump to 3.0.1.9030

  R/sir.R (line 571): Guard purely numeric strings (e.g. "1", "8") from the Unicode letter filter. Values matching the broad SIR regex but consisting only of digits must not be stripped; add `x %unlike% "^[0-9+]$"` predicate.

  R/mic.R (lines 220-222): Preserve the letter 'e' during Unicode-letter removal so that MIC values in scientific notation (e.g. "1e-3", "2.5e-2") survive the cleaning step.
  - Line 220: [\\p{L}] → [^e\\P{L}] (remove all letters except 'e')
  - Line 222: [^0-9.><= -]+ → [^0-9e.><= -]+ (allow 'e' in whitelist)

  tests/testthat/test-mdro.R: New tests for the drug+inhibitor inference added in the previous commit (issue #209):
  - TZP=R with no PIP column → PIP inferred R → MDRO class elevated
  - TZP=S with no PIP column → proxy col is NA (not S) → class lower
  - verbose mode emits "Inferring resistance" message
  - AMC=R with no AMX column runs without error (Enterococcus faecium)

  https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix version to single bump (9029) and update CLAUDE.md versioning rules

  CLAUDE.md: Rewrite the "Version and date bump" subsection to document that:
  - Exactly ONE version bump is allowed per PR (PRs are squash-merged into one commit on the default branch, so one commit = one version increment)
  - The correct version is computed from git history: currentversion="${currenttag}.$((commits_since_tag + 9001 + 1))" with the +1 accounting for the PR's own squash commit not yet on the default branch
  - Fall back to incrementing DESCRIPTION's version by 1 if git describe fails
  - The Date: field tracks the date of the *last* PR commit (updated each time)

  DESCRIPTION / NEWS.md: Correct the version from 3.0.1.9030 back to 3.0.1.9029. Two version bumps were made across two commits in this PR; since it will be squash-merged as one commit only one bump is correct. Also update Date to today (2026-03-07).

  https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix stats::setNames, test accessor bug, and version script verification

  R/mdro.R: Qualify setNames() as stats::setNames() in the drug+inhibitor inference block to satisfy R CMD CHECK's global-function checks.

  tests/testthat/test-mdro.R: mdro() with verbose=FALSE returns an atomic ordered factor, not a data.frame. Fix three test errors introduced in the previous commit:
  - Line 320: result_no_pip$MDRO -> result_no_pip (factor, no $ accessor)
  - Line 328: result_tzp_s$MDRO / result_no_pip$MDRO -> direct factor refs
  - Line 347: expect_inherits(..., "data.frame") -> c("factor","ordered")

  Also fix the comment on line 347 to match the actual return type.

  Version: confirmed at 3.0.1.9029 (no further bump; one bump already made this PR). git describe failed (no tags in dev environment); fallback applies. The +1 in CLAUDE.md's formula is correct for tagged repos: currentcommit + 9001 + 1 = 27 + 9001 + 1 = 9029 ✓

  https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix unit tests: use mrgn guideline and expect_message() for proxy tests

  Three failures corrected:

  1. Classification tests (lines 321, 329): The EUCAST guideline for P. aeruginosa already has OR logic (PIP OR TZP), so TZP=R alone satisfies it regardless of whether the PIP proxy exists. Switch to guideline="mrgn": the MRGN 4MRGN criterion for P. aeruginosa requires PIP=R explicitly (lines 1488-1496 of mdro.R), with no TZP fallback. Without the proxy: PIP missing -> not 4MRGN -> level 1. With the proxy (TZP=R infers PIP=R): 4MRGN reached -> level 3. The TZP=S case leaves proxy=NA, so PIP is still absent effectively -> level 1, which is < level 3 as expected.
  2. Verbose/message test (line 335): message_() routes through message() to stderr, not cat() to stdout. expect_output() only captures stdout so it always saw nothing. Fix: use expect_message() instead, and remove the inner suppressMessages() that was swallowing the message before expect_message() could capture it.

  Also trim two stale lines left over from the old expect_output block.

  https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

---------

Co-authored-by: Claude <noreply@anthropic.com>

# CLAUDE.md — AMR R Package

This file provides context for Claude Code when working in this repository.

## Project Overview

**AMR** is a zero-dependency R package for antimicrobial resistance (AMR) data analysis using a One Health approach. It is peer-reviewed, used in 175+ countries, and supports 28 languages.

Key capabilities:

- SIR (Susceptible/Intermediate/Resistant) classification using EUCAST 2011–2025 and CLSI 2011–2025 breakpoints
- Antibiogram generation: traditional, combined, syndromic, and WISCA
- Microorganism taxonomy database (~79,000 species)
- Antimicrobial drug database (~620 drugs)
- Multi-drug resistant organism (MDRO) classification
- First-isolate identification
- Minimum Inhibitory Concentration (MIC) and disk diffusion handling
- Multilingual output (28 languages)

## Common Commands

All commands run inside an R session:

```r
# Rebuild documentation (roxygen2 → .Rd files + NAMESPACE)
devtools::document()

# Run all tests
devtools::test()

# Full package check (CRAN-level: docs + tests + checks)
devtools::check()

# Build pkgdown website locally
pkgdown::build_site()

# Code coverage report
covr::package_coverage()
```

From the shell:

```bash
# CRAN check from parent directory
R CMD check AMR
```

## Repository Structure

```
R/                  # All R source files (62 files, ~28,000 lines)
man/                # Auto-generated .Rd documentation (do not edit manually)
tests/testthat/     # testthat test files (test-*.R) and helper-functions.R
data/               # Pre-compiled .rda datasets
data-raw/           # Scripts used to generate data/ files
vignettes/          # Rmd vignette articles
inst/               # Installed files (translations, etc.)
_pkgdown.yml        # pkgdown website configuration
```

## R Source File Conventions

**Naming conventions in `R/`:**

| Prefix/Name | Purpose |
|---|---|
| `aa_*.R` | Loaded first (helpers, globals, options, package docs) |
| `zz_deprecated.R` | Deprecated function wrappers |
| `zzz.R` | `.onLoad` / `.onAttach` initialization |

**Key source files:**

- `aa_helper_functions.R` / `aa_helper_pm_functions.R`: internal utility functions (large; ~63 KB and ~37 KB)
- `aa_globals.R`: global constants and breakpoint lookup structures
- `aa_options.R`: `amr_options()` / `get_AMR_option()` system
- `mo.R` / `mo_property.R`: microorganism lookup and properties
- `ab.R` / `ab_property.R`: antimicrobial drug functions
- `av.R` / `av_property.R`: antiviral drug functions
- `sir.R` / `sir_calc.R` / `sir_df.R`: SIR classification engine
- `mic.R` / `disk.R`: MIC and disk diffusion classes
- `antibiogram.R`: antibiogram generation (traditional, combined, syndromic, WISCA)
- `first_isolate.R`: first-isolate identification algorithms
- `mdro.R`: MDRO classification (EUCAST, CLSI, CDC, custom guidelines)
- `amr_selectors.R`: tidyselect helpers for selecting AMR columns
- `interpretive_rules.R` / `custom_eucast_rules.R`: clinical interpretation rules
- `translate.R`: 28-language translation system
- `ggplot_sir.R` / `ggplot_pca.R` / `plotting.R`: visualisation functions

## Custom S3 Classes

The package defines six S3 classes with full print/format/plot/vctrs support:

| Class | Created by | Represents |
|---|---|---|
| `<mo>` | `as.mo()` | Microorganism code |
| `<ab>` | `as.ab()` | Antimicrobial drug code |
| `<av>` | `as.av()` | Antiviral drug code |
| `<sir>` | `as.sir()` | SIR value (S/I/R/SDD) |
| `<mic>` | `as.mic()` | Minimum inhibitory concentration |
| `<disk>` | `as.disk()` | Disk diffusion diameter |

## Data Files

Pre-compiled in `data/` (do not edit directly; regenerate via `data-raw/` scripts):

| File | Contents |
|---|---|
| `microorganisms.rda` | ~79,000 microbial species with full taxonomy |
| `antimicrobials.rda` | ~620 antimicrobial drugs with ATC codes |
| `antivirals.rda` | Antiviral drugs |
| `clinical_breakpoints.rda` | EUCAST + CLSI breakpoints (2011–2025) |
| `intrinsic_resistant.rda` | Intrinsic resistance patterns |
| `example_isolates.rda` | Example AMR dataset for documentation/testing |
| `WHONET.rda` | Example WHONET-format dataset |

## Zero-Dependency Design

The package has **no `Imports`** in `DESCRIPTION`. All optional integrations (ggplot2, dplyr, data.table, tidymodels, cli, crayon, etc.) are listed in `Suggests` and guarded with:

```r
if (requireNamespace("pkg", quietly = TRUE)) { ... }
```

Never add packages to `Imports`. If new functionality requires an external package, add it to `Suggests` and guard usage appropriately.

## Testing

- **Framework:** `testthat` (R ≥ 3.1); legacy `tinytest` used for R 3.0–3.6 CI
- **Test files:** `tests/testthat/test-*.R`
- **Helpers:** `tests/testthat/helper-functions.R`
- **CI matrix:** GitHub Actions across Windows / macOS / Linux × R devel / release / oldrel-1 through oldrel-4
- **Coverage:** `covr` (some files excluded: `atc_online.R`, `mo_source.R`, `translate.R`, `resistance_predict.R`, `zz_deprecated.R`, helper files, `zzz.R`)

## Documentation

- All exported functions use **roxygen2** blocks (`RoxygenNote: 7.3.3`, markdown enabled)
- Run `devtools::document()` after any change to roxygen comments
- Never edit files in `man/` directly; they are auto-generated
- Vignettes live in `vignettes/` as `.Rmd` files
- The pkgdown website is configured in `_pkgdown.yml`

## Versioning

Version format: `major.minor.patch.dev` (e.g., `3.0.1.9021`)

- Development versions use a `.9xxx` suffix
- Stable CRAN releases drop the dev suffix (e.g., `3.0.1`)
- `NEWS.md` uses sections **New**, **Fixes**, **Updates** with GitHub issue references (`#NNN`)

### Version and date bump required for every PR

All PRs are **squash-merged**, so each PR lands as exactly **one commit** on the default branch. Version numbers are kept in sync with the cumulative commit count since the last released tag. Therefore **exactly one version bump is allowed per PR**, regardless of how many intermediate commits are made on the branch.

#### Computing the correct version number

Run the following from the repo root to determine the version string to use:

```bash
# Latest tag without its "v" prefix (e.g. v3.0.1 becomes 3.0.1)
currenttag=$(git describe --tags --abbrev=0 | sed 's/v//')
# The same tag, verbatim, for use in the revision range below
currenttagfull=$(git describe --tags --abbrev=0)
# Name of the default branch: master or main
defaultbranch=$(git branch | cut -c 3- | grep -E '^master$|^main$')
# Number of commits on the default branch since that tag
currentcommit=$(git rev-list --count ${currenttagfull}..${defaultbranch})
currentversion="${currenttag}.$((currentcommit + 9001 + 1))"
echo "$currentversion"
```

The `+ 1` accounts for the fact that this PR's squash commit is not yet on the default branch. Write the resulting version string to **both** of these locations (and only once per PR, even across multiple commits):

1. **`DESCRIPTION`**: the `Version:` field
2. **`NEWS.md`**: the top-level heading `# AMR <version>`
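
As a worked instance of the formula, the following is a minimal sketch rather than repository code: `compute_dev_version` is a hypothetical helper name, and the inputs (tag `v3.0.1`, 27 commits since the tag) are illustrative values only.

```bash
# Hypothetical helper (not part of the repository): compose the dev version
# from a release tag and the number of commits since that tag.
compute_dev_version() {
  tag="${1#v}"              # strip a leading "v" from the tag, if present
  commits_since_tag="$2"
  # 9001 is the dev-suffix offset; the extra 1 is this PR's squash commit
  echo "${tag}.$((commits_since_tag + 9001 + 1))"
}

compute_dev_version v3.0.1 27   # prints 3.0.1.9029
```

With 27 commits since the tag, the suffix is 27 + 9001 + 1 = 9029.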

If `git describe` fails (e.g. no tags exist in the environment), fall back to reading the current version from `DESCRIPTION` and adding 1 to the last numeric component, but only if no bump has already been made in this PR.
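
The fallback could be sketched as follows, assuming a standard `Version:` line in `DESCRIPTION`. `bump_last_component` is a hypothetical helper name, and the demonstration runs against a throwaway file rather than the real `DESCRIPTION`. The sketch does not check whether a bump was already made in the PR; that still has to be verified by hand.

```bash
# Hypothetical helper (not part of the repository): add 1 to the last
# numeric component of a dotted version string.
bump_last_component() {
  version="$1"
  prefix="${version%.*}"    # everything before the final dot
  last="${version##*.}"     # the final numeric component
  echo "${prefix}.$((last + 1))"
}

# Demonstration on a temporary DESCRIPTION-style file with made-up contents
tmpdir=$(mktemp -d)
printf 'Package: AMR\nVersion: 3.0.1.9028\nDate: 2026-03-07\n' > "$tmpdir/DESCRIPTION"

# Read the value of the Version: field, then bump it
current=$(sed -n 's/^Version:[[:space:]]*//p' "$tmpdir/DESCRIPTION")
bump_last_component "$current"   # prints 3.0.1.9029

rm -rf "$tmpdir"
```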

#### Date field

The `Date:` field in `DESCRIPTION` must reflect the date of the **last commit to the PR** (not the first), in ISO 8601 format (`YYYY-MM-DD`). Update it with every commit so it is always current:

```
Date: 2026-03-07
```
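
One way to keep the field current is sketched below. It assumes the commit is made on the current day, so today's date equals the date of the last commit. `set_description_date` is a hypothetical helper name, and the demonstration edits a temporary file rather than the real `DESCRIPTION`.

```bash
# Hypothetical helper (not part of the repository): rewrite the Date: line
# of a DESCRIPTION-style file, leaving all other lines untouched.
set_description_date() {
  file="$1"
  new_date="$2"
  tmp=$(mktemp)
  # Write to a temp file and move it into place (avoids non-portable sed -i)
  sed "s/^Date:.*/Date: ${new_date}/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demonstration on a temporary file with made-up contents
demo=$(mktemp)
printf 'Package: AMR\nVersion: 3.0.1.9029\nDate: 2026-03-01\n' > "$demo"
set_description_date "$demo" "$(date +%Y-%m-%d)"
grep '^Date:' "$demo"   # shows the Date: line with today's date
rm -f "$demo"
```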

## Internal State

The package uses a private `AMR_env` environment (created in `aa_globals.R`) for caching expensive lookups (e.g., microorganism matching scores, breakpoint tables). This avoids re-computation within a session.