mirror of https://github.com/msberends/AMR.git synced 2026-03-11 15:47:54 +01:00
AMR/CLAUDE.md
Matthijs Berends 9af726dcaa mdro(): infer base drug resistance from drug+inhibitor combination co… (#263)
* mdro(): infer base drug resistance from drug+inhibitor combination columns (#209)

When a base beta-lactam column (e.g., piperacillin/PIP) is absent but a
corresponding drug+inhibitor combination (e.g., piperacillin/tazobactam/TZP)
is present and resistant, resistance in the base drug is now correctly
inferred. This is clinically sound: resistance in a combination implies the
inhibitor provided no benefit, so the base drug is also resistant.

Susceptibility in a combination is NOT propagated to the base drug (the
inhibitor may be responsible for susceptibility), so only R values are
inferred; missing base drugs remain NA otherwise.

Implementation details:
- Uses AB_BETALACTAMS_WITH_INHIBITOR to identify all beta-lactam+inhibitor
  combinations present in the user's data
- Derives base drug AB codes by stripping the "/inhibitor" part from names
- Creates synthetic proxy columns (.sir_proxy_<AB>) in x, set to "R" when
  any matching combination is R, otherwise NA
- Proxy columns are added to cols_ab before drug variable assignment,
  so all existing guideline logic benefits without any changes
- Multiple combos for the same base drug are OR-ed (any R → R)
- Adds internal ab_without_inhibitor() helper for the name->base mapping
- Verbose mode reports which combinations are used for inference

Bumps version: 3.0.1.9028 -> 3.0.1.9029

https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Add sir.R/mic.R fixes and mdro() unit tests; bump to 3.0.1.9030

R/sir.R (line 571):
  Guard purely numeric strings (e.g. "1", "8") from the Unicode letter
  filter. Values matching the broad SIR regex but consisting only of digits
  must not be stripped; add `x %unlike% "^[0-9+]$"` predicate.

R/mic.R (lines 220-222):
  Preserve the letter 'e' during Unicode-letter removal so that MIC values
  in scientific notation (e.g. "1e-3", "2.5e-2") survive the cleaning step.
  - Line 220: [\\p{L}] → [^e\\P{L}]  (remove all letters except 'e')
  - Line 222: [^0-9.><= -]+ → [^0-9e.><= -]+  (allow 'e' in whitelist)

tests/testthat/test-mdro.R:
  New tests for the drug+inhibitor inference added in the previous commit
  (issue #209):
  - TZP=R with no PIP column → PIP inferred R → MDRO class elevated
  - TZP=S with no PIP column → proxy col is NA (not S) → class lower
  - verbose mode emits "Inferring resistance" message
  - AMC=R with no AMX column runs without error (Enterococcus faecium)

https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix version to single bump (9029) and update CLAUDE.md versioning rules

CLAUDE.md: Rewrite the "Version and date bump" subsection to document that:
- Exactly ONE version bump is allowed per PR (PRs are squash-merged into one
  commit on the default branch, so one commit = one version increment)
- The correct version is computed from git history:
    currentversion="${currenttag}.$((commits_since_tag + 9001 + 1))"
  with the +1 accounting for the PR's own squash commit not yet on the
  default branch
- Fall back to incrementing DESCRIPTION's version by 1 if git describe fails
- The Date: field tracks the date of the *last* PR commit (updated each time)

DESCRIPTION / NEWS.md: Correct the version from 3.0.1.9030 back to 3.0.1.9029.
Two version bumps were made across two commits in this PR; since it will be
squash-merged as one commit only one bump is correct. Also update Date to
today (2026-03-07).

https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix stats::setNames, test accessor bug, and version script verification

R/mdro.R:
  Qualify setNames() as stats::setNames() in the drug+inhibitor inference
  block to satisfy R CMD CHECK's global-function checks.

tests/testthat/test-mdro.R:
  mdro() with verbose=FALSE returns an atomic ordered factor, not a
  data.frame. Fix three test errors introduced in the previous commit:
  - Line 320: result_no_pip$MDRO -> result_no_pip (factor, no $ accessor)
  - Line 328: result_tzp_s$MDRO / result_no_pip$MDRO -> direct factor refs
  - Line 347: expect_inherits(..., "data.frame") -> c("factor","ordered")
  Also fix the comment on line 347 to match the actual return type.

Version: confirmed at 3.0.1.9029 (no further bump; one bump already made
this PR). git describe failed (no tags in dev environment) — fallback
applies. The +1 in CLAUDE.md's formula is correct for tagged repos:
currentcommit + 9001 + 1 = 27 + 9001 + 1 = 9029 ✓

https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

* Fix unit tests: use mrgn guideline and expect_message() for proxy tests

Three failures corrected:

1. Classification tests (lines 321, 329): The EUCAST guideline for
   P. aeruginosa already has OR logic (PIP OR TZP), so TZP=R alone
   satisfies it regardless of whether the PIP proxy exists. Switch to
   guideline="mrgn": the MRGN 4MRGN criterion for P. aeruginosa
   requires PIP=R explicitly (lines 1488-1496 of mdro.R), with no TZP
   fallback. Without the proxy: PIP missing -> not 4MRGN -> level 1.
   With the proxy (TZP=R infers PIP=R): 4MRGN reached -> level 3.
   The TZP=S case leaves proxy=NA, so PIP is still absent effectively
   -> level 1, which is < level 3 as expected.

2. Verbose/message test (line 335): message_() routes through message()
   to stderr, not cat() to stdout. expect_output() only captures stdout
   so it always saw nothing. Fix: use expect_message() instead, and
   remove the inner suppressMessages() that was swallowing the message
   before expect_message() could capture it.

Also trim two stale lines left over from the old expect_output block.

https://claude.ai/code/session_01Cp154UtssHg84bw38xiiTG

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-07 18:06:55 +01:00


CLAUDE.md — AMR R Package

This file provides context for Claude Code when working in this repository.

Project Overview

AMR is a zero-dependency R package for antimicrobial resistance (AMR) data analysis using a One Health approach. It is peer-reviewed, used in 175+ countries, and supports 28 languages.

Key capabilities:

  • SIR (Susceptible/Intermediate/Resistant) classification using EUCAST 2011–2025 and CLSI 2011–2025 breakpoints
  • Antibiogram generation: traditional, combined, syndromic, and WISCA
  • Microorganism taxonomy database (~79,000 species)
  • Antimicrobial drug database (~620 drugs)
  • Multi-drug resistant organism (MDRO) classification
  • First-isolate identification
  • Minimum Inhibitory Concentration (MIC) and disk diffusion handling
  • Multilingual output (28 languages)

Common Commands

All commands run inside an R session:

# Rebuild documentation (roxygen2 → .Rd files + NAMESPACE)
devtools::document()

# Run all tests
devtools::test()

# Full package check (CRAN-level: docs + tests + checks)
devtools::check()

# Build pkgdown website locally
pkgdown::build_site()

# Code coverage report
covr::package_coverage()

From the shell:

# CRAN check from parent directory
R CMD check AMR

Repository Structure

R/              # All R source files (62 files, ~28,000 lines)
man/            # Auto-generated .Rd documentation (do not edit manually)
tests/testthat/ # testthat test files (test-*.R) and helper-functions.R
data/           # Pre-compiled .rda datasets
data-raw/       # Scripts used to generate data/ files
vignettes/      # Rmd vignette articles
inst/           # Installed files (translations, etc.)
_pkgdown.yml    # pkgdown website configuration

R Source File Conventions

Naming conventions in R/:

Prefix/Name       Purpose
aa_*.R            Loaded first (helpers, globals, options, package docs)
zz_deprecated.R   Deprecated function wrappers
zzz.R             .onLoad / .onAttach initialization

Key source files:

  • aa_helper_functions.R / aa_helper_pm_functions.R — internal utility functions (large; ~63 KB and ~37 KB)
  • aa_globals.R — global constants and breakpoint lookup structures
  • aa_options.R – amr_options() / get_AMR_option() system
  • mo.R / mo_property.R — microorganism lookup and properties
  • ab.R / ab_property.R — antimicrobial drug functions
  • av.R / av_property.R — antiviral drug functions
  • sir.R / sir_calc.R / sir_df.R — SIR classification engine
  • mic.R / disk.R — MIC and disk diffusion classes
  • antibiogram.R — antibiogram generation (traditional, combined, syndromic, WISCA)
  • first_isolate.R — first-isolate identification algorithms
  • mdro.R — MDRO classification (EUCAST, CLSI, CDC, custom guidelines)
  • amr_selectors.R — tidyselect helpers for selecting AMR columns
  • interpretive_rules.R / custom_eucast_rules.R — clinical interpretation rules
  • translate.R — 28-language translation system
  • ggplot_sir.R / ggplot_pca.R / plotting.R — visualisation functions

Custom S3 Classes

The package defines six S3 classes with full print/format/plot/vctrs support:

Class    Created by   Represents
<mo>     as.mo()      Microorganism code
<ab>     as.ab()      Antimicrobial drug code
<av>     as.av()      Antiviral drug code
<sir>    as.sir()     SIR value (S/I/R/SDD)
<mic>    as.mic()     Minimum inhibitory concentration
<disk>   as.disk()    Disk diffusion diameter

Data Files

Pre-compiled in data/ (do not edit directly; regenerate via data-raw/ scripts):

File                       Contents
microorganisms.rda         ~79,000 microbial species with full taxonomy
antimicrobials.rda         ~620 antimicrobial drugs with ATC codes
antivirals.rda             Antiviral drugs
clinical_breakpoints.rda   EUCAST + CLSI breakpoints (2011–2025)
intrinsic_resistant.rda    Intrinsic resistance patterns
example_isolates.rda       Example AMR dataset for documentation/testing
WHONET.rda                 Example WHONET-format dataset

Zero-Dependency Design

The package has no Imports in DESCRIPTION. All optional integrations (ggplot2, dplyr, data.table, tidymodels, cli, crayon, etc.) are listed in Suggests and guarded with:

if (requireNamespace("pkg", quietly = TRUE)) { ... }

Never add packages to Imports. If new functionality requires an external package, add it to Suggests and guard usage appropriately.
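
A minimal shell sketch of how the no-Imports rule could be checked before committing. The function name check_no_imports is hypothetical and not part of the repository; the sketch only assumes that an Imports field, if present, starts a line of DESCRIPTION as `Imports:`.

```shell
# Hypothetical helper (not part of the AMR repository).
# Returns 0 when the given DESCRIPTION file declares no Imports field,
# and 1 (with a message on stderr) when it does.
check_no_imports() {
  desc_file="${1:-DESCRIPTION}"
  if grep -q '^Imports:' "$desc_file"; then
    echo "Imports field found in ${desc_file}; move the package to Suggests" >&2
    return 1
  fi
  return 0
}
```

Calling check_no_imports from the repo root would then act as a quick guard alongside devtools::check().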

Testing

  • Framework: testthat (R ≥ 3.1); legacy tinytest used for R 3.0–3.6 CI
  • Test files: tests/testthat/test-*.R
  • Helpers: tests/testthat/helper-functions.R
  • CI matrix: GitHub Actions across Windows / macOS / Linux × R devel / release / oldrel-1 through oldrel-4
  • Coverage: covr (some files excluded: atc_online.R, mo_source.R, translate.R, resistance_predict.R, zz_deprecated.R, helper files, zzz.R)

Documentation

  • All exported functions use roxygen2 blocks (RoxygenNote: 7.3.3, markdown enabled)
  • Run devtools::document() after any change to roxygen comments
  • Never edit files in man/ directly — they are auto-generated
  • Vignettes live in vignettes/ as .Rmd files
  • The pkgdown website is configured in _pkgdown.yml

Versioning

Version format: major.minor.patch.dev (e.g., 3.0.1.9021)

  • Development versions use a .9xxx suffix
  • Stable CRAN releases drop the dev suffix (e.g., 3.0.1)
  • NEWS.md uses sections New, Fixes, Updates with GitHub issue references (#NNN)

Version and date bump required for every PR

All PRs are squash-merged, so each PR lands as exactly one commit on the default branch. Version numbers are kept in sync with the cumulative commit count since the last released tag. Therefore exactly one version bump is allowed per PR, regardless of how many intermediate commits are made on the branch.

Computing the correct version number

Run the following from the repo root to determine the version string to use:

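# Latest release tag, with and without its leading "v"
currenttagfull=$(git describe --tags --abbrev=0)
currenttag=$(echo "$currenttagfull" | sed 's/^v//')
# Name of the default branch (master or main)
defaultbranch=$(git branch | cut -c 3- | grep -E '^master$|^main$')
# Number of commits on the default branch since that tag
currentcommit=$(git rev-list --count "${currenttagfull}..${defaultbranch}")
# + 1 for this PR's squash commit, which is not on the default branch yet
currentversion="${currenttag}.$((currentcommit + 9001 + 1))"
echo "$currentversion"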

The + 1 accounts for the fact that this PR's squash commit is not yet on the default branch. Set both of these files to the resulting version string (and only once per PR, even across multiple commits):

  1. DESCRIPTION — the Version: field
  2. NEWS.md — the top-level heading # AMR <version>
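
As a worked instance of the formula, the sketch below wraps the arithmetic in a hypothetical helper (compute_dev_version is not part of the repository). The inputs are illustrative: a release tag of 3.0.1 and 27 commits on the default branch since that tag.

```shell
# Hypothetical helper (not part of the AMR repository).
# $1 = release tag without the leading "v", $2 = commits since that tag.
compute_dev_version() {
  tag="$1"
  commits="$2"
  # 9001 is the offset of the dev suffix; + 1 is this PR's squash commit
  echo "${tag}.$((commits + 9001 + 1))"
}

compute_dev_version 3.0.1 27   # prints 3.0.1.9029
```

Because the result depends only on the commit count on the default branch, pushing further commits to the PR branch does not change it.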

If git describe fails (e.g. no tags exist in the environment), fall back to reading the current version from DESCRIPTION and adding 1 to the last numeric component — but only if no bump has already been made in this PR.
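
The fallback could be sketched in shell as follows. Both function names are hypothetical and not part of the repository, and the sketch assumes the last component has no leading zero (dev suffixes are in the 9xxx range), since shell arithmetic would otherwise read it as octal.

```shell
# Hypothetical helpers for the fallback path (not part of the AMR repository).

# Print the value of the Version: field of a DESCRIPTION file.
read_description_version() {
  sed -n 's/^Version:[[:space:]]*//p' "${1:-DESCRIPTION}"
}

# Increment the last dot-separated component of a version string.
bump_last_component() {
  version="$1"
  prefix="${version%.*}"   # everything before the last dot
  last="${version##*.}"    # the last numeric component
  echo "${prefix}.$((last + 1))"
}
```

From the repo root, bump_last_component "$(read_description_version)" would print the candidate version. The helpers do not know whether a bump was already made in the PR, so that check remains manual.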

Date field

The Date: field in DESCRIPTION must reflect the date of the last commit to the PR (not the first), in ISO format. Update it with every commit so it is always current:

Date: 2026-03-07
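
One way to keep the field current is a small shell helper run before each commit. This is only a sketch: set_description_date is a hypothetical name, and it assumes DESCRIPTION already contains a single line starting with `Date:`.

```shell
# Hypothetical helper (not part of the AMR repository).
# $1 = path to DESCRIPTION (default: ./DESCRIPTION)
# $2 = date in ISO format (default: today)
set_description_date() {
  desc_file="${1:-DESCRIPTION}"
  new_date="${2:-$(date +%Y-%m-%d)}"
  tmp_file=$(mktemp)
  # Rewrite only the Date: line; all other fields are copied unchanged.
  # Writing back with cat keeps the original file's permissions.
  sed "s/^Date:.*/Date: ${new_date}/" "$desc_file" > "$tmp_file" &&
    cat "$tmp_file" > "$desc_file"
  rm -f "$tmp_file"
}
```

Calling set_description_date with no arguments from the repo root would set the field to today's date.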

Internal State

The package uses a private AMR_env environment (created in aa_globals.R) for caching expensive lookups (e.g., microorganism matching scores, breakpoint tables). This avoids re-computation within a session.