mirror of https://github.com/msberends/AMR.git synced 2026-05-14 00:30:50 +02:00

Matthijs Berends f7e9294bea Add parallel computing support to antibiogram() and wisca() (#281) (#282)

* Add parallel computing support to antibiogram() and wisca() (#281)

For WISCA: simulations are distributed across (group, chunk) job pairs
via future.apply::future_lapply(), keeping all workers active even when
the regimen count is smaller than nbrOfWorkers(). Sequential fallback
with progress ticker is preserved when parallel = FALSE or workers = 1.

For grouped antibiograms: each group is processed by a separate worker,
mirroring the row-batch approach in as.sir().

Same gate pattern as as.sir() (PR #280): requires a non-sequential
future::plan() to be active; auto-upgrades to parallel = TRUE when a
parallel plan is detected; throws an informative error otherwise.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix version to 3.0.1.9055 and update CLAUDE.md version formula

Uses origin/${defaultbranch} (with a fetch) instead of the local
branch ref so the commit count is never stale after a merge.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix non-ASCII characters in antibiogram.R

Replace en/em dashes and non-breaking spaces with ASCII equivalents
to satisfy R CMD check portability requirement.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Update auto-generated Rd files after documentation rebuild

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Move parallel gate to top of antibiogram.default() like sir.R

The gate was inside the wisca==TRUE block, so parallel=TRUE with a
sequential plan was silently ignored for non-WISCA antibiograms.
Now the gate runs unconditionally at the top of the function,
identical to the as.sir() pattern: error on explicit parallel=TRUE
with sequential plan, auto-upgrade when a non-sequential plan is
already active.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix parallel WISCA returning all NA; strengthen tests; add sequential hint

Bug: lapply() over a factor yields length-1 factor elements (integer
codes), while for() over a factor yields character strings.  The job
list stored j$group as a factor integer, but the reassembly loop
compared it with identical(j$group, g) where g was character -- always
FALSE, so no simulation chunks were ever assembled and coverage stayed
NA throughout.

Fix: convert unique_groups to character before building jobs so both
the job list and the reassembly loop use the same type.

Tests: replaced na.rm = TRUE guards with explicit anyNA() checks so the
test suite would have caught the all-NA result immediately.

Also adds a sequential-mode performance hint (analogous to sir.R
lines 1116-1127) when simulations >= 500 and >= 3 regimens.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

---------

Co-authored-by: Claude <noreply@anthropic.com>

2026-04-30 18:41:56 +01:00

CLAUDE.md — AMR R Package

This file provides context for Claude Code when working in this repository.

Project Overview

AMR is a zero-dependency R package for antimicrobial resistance (AMR) data analysis using a One Health approach. It is peer-reviewed, used in 175+ countries, and supports 28 languages.

Key capabilities:

SIR (Susceptible/Intermediate/Resistant) classification using EUCAST 2011–2025 and CLSI 2011–2025 breakpoints
Antibiogram generation: traditional, combined, syndromic, and WISCA
Microorganism taxonomy database (~79,000 species)
Antimicrobial drug database (~620 drugs)
Multi-drug resistant organism (MDRO) classification
First-isolate identification
Minimum Inhibitory Concentration (MIC) and disk diffusion handling
Multilingual output (28 languages)

Common Commands

All commands run inside an R session:

# Rebuild documentation (roxygen2 → .Rd files + NAMESPACE)
devtools::document()

# Run all tests
devtools::test()

# Full package check (CRAN-level: docs + tests + checks)
devtools::check()

# Build pkgdown website locally
pkgdown::build_site()

# Code coverage report
covr::package_coverage()

From the shell:

# CRAN check from parent directory
R CMD check AMR

Repository Structure

R/              # All R source files (62 files, ~28,000 lines)
man/            # Auto-generated .Rd documentation (do not edit manually)
tests/testthat/ # testthat test files (test-*.R) and helper-functions.R
data/           # Pre-compiled .rda datasets
data-raw/       # Scripts used to generate data/ files
vignettes/      # Rmd vignette articles
inst/           # Installed files (translations, etc.)
_pkgdown.yml    # pkgdown website configuration

R Source File Conventions

Naming conventions in R/:

| Prefix/Name | Purpose |
| --- | --- |
| `aa_*.R` | Loaded first (helpers, globals, options, package docs) |
| `zz_deprecated.R` | Deprecated function wrappers |
| `zzz.R` | `.onLoad` / `.onAttach` initialization |

Key source files:

aa_helper_functions.R / aa_helper_pm_functions.R — internal utility functions (large; ~63 KB and ~37 KB)
aa_globals.R — global constants and breakpoint lookup structures
aa_options.R — amr_options() / get_AMR_option() system
mo.R / mo_property.R — microorganism lookup and properties
ab.R / ab_property.R — antimicrobial drug functions
av.R / av_property.R — antiviral drug functions
sir.R / sir_calc.R / sir_df.R — SIR classification engine
mic.R / disk.R — MIC and disk diffusion classes
antibiogram.R — antibiogram generation (traditional, combined, syndromic, WISCA)
first_isolate.R — first-isolate identification algorithms
mdro.R — MDRO classification (EUCAST, CLSI, CDC, custom guidelines)
amr_selectors.R — tidyselect helpers for selecting AMR columns
interpretive_rules.R / custom_eucast_rules.R — clinical interpretation rules
translate.R — 28-language translation system
ggplot_sir.R / ggplot_pca.R / plotting.R — visualisation functions

Custom S3 Classes

The package defines six S3 classes with full print/format/plot/vctrs support:

| Class | Created by | Represents |
| --- | --- | --- |
| `<mo>` | `as.mo()` | Microorganism code |
| `<ab>` | `as.ab()` | Antimicrobial drug code |
| `<av>` | `as.av()` | Antiviral drug code |
| `<sir>` | `as.sir()` | SIR value (S/I/R/SDD) |
| `<mic>` | `as.mic()` | Minimum inhibitory concentration |
| `<disk>` | `as.disk()` | Disk diffusion diameter |

Data Files

Pre-compiled in data/ (do not edit directly; regenerate via data-raw/ scripts):

| File | Contents |
| --- | --- |
| `microorganisms.rda` | ~79,000 microbial species with full taxonomy |
| `antimicrobials.rda` | ~620 antimicrobial drugs with ATC codes |
| `antivirals.rda` | Antiviral drugs |
| `clinical_breakpoints.rda` | EUCAST + CLSI breakpoints (2011–2025) |
| `intrinsic_resistant.rda` | Intrinsic resistance patterns |
| `example_isolates.rda` | Example AMR dataset for documentation/testing |
| `WHONET.rda` | Example WHONET-format dataset |

Zero-Dependency Design

The package has no Imports in DESCRIPTION. All optional integrations (ggplot2, dplyr, data.table, tidymodels, cli, crayon, etc.) are listed in Suggests and guarded with:

if (requireNamespace("pkg", quietly = TRUE)) { ... }

Never add packages to Imports. If new functionality requires an external package, add it to Suggests and guard usage appropriately.
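The rule above could be enforced mechanically with a small shell check that flags any Imports field in DESCRIPTION. The sketch below is illustrative only: the `check_no_imports` function and the throwaway DESCRIPTION file are hypothetical and not part of the repository.

```shell
set -eu

# Hypothetical helper (not part of the repository): prints "ok" when the given
# DESCRIPTION file has no Imports field, and "violation" otherwise.
check_no_imports() {
  if grep -q -E '^Imports:' "$1"; then
    echo "violation"
  else
    echo "ok"
  fi
}

# Demonstration on a throwaway file rather than the real DESCRIPTION.
demo_description=$(mktemp)
printf 'Package: AMR\nSuggests:\n    cli,\n    dplyr\n' > "$demo_description"
imports_status=$(check_no_imports "$demo_description")
echo "$imports_status"
```

Against the real repository, the same function would be called as `check_no_imports DESCRIPTION` from the repo root.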

Testing

Framework: testthat (R ≥ 3.1); legacy tinytest used for R 3.0–3.6 CI
Test files: tests/testthat/test-*.R
Helpers: tests/testthat/helper-functions.R
CI matrix: GitHub Actions across Windows / macOS / Linux × R devel / release / oldrel-1 through oldrel-4
Coverage: covr (some files excluded: atc_online.R, mo_source.R, translate.R, resistance_predict.R, zz_deprecated.R, helper files, zzz.R)

Documentation

All exported functions use roxygen2 blocks (RoxygenNote: 7.3.3, markdown enabled)
Run devtools::document() after any change to roxygen comments
Never edit files in man/ directly — they are auto-generated
Vignettes live in vignettes/ as .Rmd files
The pkgdown website is configured in _pkgdown.yml

Versioning

Version format: major.minor.patch.dev (e.g., 3.0.1.9021)

Development versions use a .9xxx suffix
Stable CRAN releases drop the dev suffix (e.g., 3.0.1)
NEWS.md uses sections New, Fixes, Updates with GitHub issue references (#NNN)

Version and date bump required for every PR

All PRs are squash-merged, so each PR lands as exactly one commit on the default branch. Version numbers are kept in sync with the cumulative commit count since the last released tag. Therefore exactly one version bump is allowed per PR, regardless of how many intermediate commits are made on the branch.

Computing the correct version number

First, ensure git and gh are installed — both are required for the version computation and for pushing changes. Install them if missing before doing anything else:

which git || apt-get install -y git
which gh   || apt-get install -y gh
# Also ensure all tags are fetched so git describe works
git fetch --tags

Then run the following from the repo root to determine the version string to use:

# Latest tag, with and without a leading "v" (e.g. v3.0.1 and 3.0.1)
currenttagfull=$(git describe --tags --abbrev=0)
currenttag=${currenttagfull#v}
# Default branch: whichever of master/main exists locally
defaultbranch=$(git branch | cut -c 3- | grep -E '^(master|main)$')
git fetch origin "${defaultbranch}" --quiet
# Number of commits on the remote default branch since the tag
currentcommit=$(git rev-list --count "${currenttagfull}..origin/${defaultbranch}")
currentversion="${currenttag}.$((currentcommit + 9001 + 1))"
echo "$currentversion"

The + 1 accounts for the fact that this PR's squash commit is not yet on the default branch. Set both of these files to the resulting version string (and only once per PR, even across multiple commits):

DESCRIPTION — the Version: field
NEWS.md — only replace line 1 (the # AMR <version> heading) with the new version number; do not create a new section. NEWS.md is a continuous log for the entire current x.y.z.9nnn development series: all changes since the last stable release accumulate under that single heading. After updating line 1, append the new change as a bullet under the appropriate sub-heading (### New, ### Fixes, or ### Updates).
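The version arithmetic and the two file updates above can be sketched as follows. This is a minimal illustration, not the project's tooling: the `set_version` helper, the commit count of 53, the tag 3.0.1, and the throwaway files are all assumptions made for the example. `sed -i.bak` is used because that form is accepted by both GNU and BSD sed.

```shell
set -eu

# Hypothetical helper: write one version string into DESCRIPTION and NEWS.md.
# Arguments: <version> <path to DESCRIPTION> <path to NEWS.md>
set_version() {
  new_version="$1"; description_file="$2"; news_file="$3"
  # Replace the Version: field in DESCRIPTION.
  sed -i.bak -E "s/^Version:.*/Version: ${new_version}/" "$description_file"
  # Replace only line 1 of NEWS.md (the "# AMR <version>" heading).
  sed -i.bak -E "1s/.*/# AMR ${new_version}/" "$news_file"
  rm -f "${description_file}.bak" "${news_file}.bak"
}

# Worked arithmetic with made-up inputs: tag 3.0.1 and 53 commits since the tag.
commit_count=53
demo_version="3.0.1.$((commit_count + 9001 + 1))"
echo "$demo_version"

# Demonstration on throwaway copies, not the real files.
demo_dir=$(mktemp -d)
printf 'Package: AMR\nVersion: 3.0.1.9054\nDate: 2026-03-07\n' > "$demo_dir/DESCRIPTION"
printf '# AMR 3.0.1.9054\n\n### New\n* Existing item\n' > "$demo_dir/NEWS.md"
set_version "$demo_version" "$demo_dir/DESCRIPTION" "$demo_dir/NEWS.md"
```

The sketch deliberately leaves the rest of NEWS.md untouched; the new bullet under `### New`, `### Fixes`, or `### Updates` is still added by hand.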

Style rules for NEWS.md entries:
- Be extremely concise — one short line per item
- Do not end with a full stop (period)
- No verbose explanations; just the essential fact

If git describe fails (e.g. no tags exist in the environment), fall back to reading the current version from DESCRIPTION and adding 1 to the last numeric component — but only if no bump has already been made in this PR.
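The fallback described above might look like the sketch below. The `bump_dev_version` function and the demo file are hypothetical. The sketch assumes that DESCRIPTION already holds a four-part development version; applied to a three-part release version it would bump the patch number instead.

```shell
set -eu

# Hypothetical helper: print the Version from a DESCRIPTION file with its
# last numeric component increased by 1.
bump_dev_version() {
  current=$(sed -n -E 's/^Version:[[:space:]]*//p' "$1")
  prefix="${current%.*}"   # everything before the last dot, e.g. 3.0.1
  last="${current##*.}"    # last numeric component, e.g. 9054
  echo "${prefix}.$((last + 1))"
}

# Demonstration on a throwaway file rather than the real DESCRIPTION.
demo_file=$(mktemp)
printf 'Package: AMR\nVersion: 3.0.1.9054\n' > "$demo_file"
fallback_version=$(bump_dev_version "$demo_file")
echo "$fallback_version"   # prints 3.0.1.9055
```

The function only prints the candidate version. Whether a bump has already been made in the PR still has to be checked before writing it back.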

Date field

The Date: field in DESCRIPTION must reflect the date of the last commit to the PR (not the first), in ISO format. Update it with every commit so it is always current:

Date: 2026-03-07
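Keeping the field current could be scripted along these lines. This is a sketch under assumptions: the `set_date` helper and the demo file are hypothetical, and it takes today's date as the commit date, which holds only if the commit is made on the same day.

```shell
set -eu

# Hypothetical helper: set the Date: field of a DESCRIPTION file.
# Arguments: <path to DESCRIPTION> <date in ISO format, YYYY-MM-DD>
set_date() {
  sed -i.bak -E "s/^Date:.*/Date: $2/" "$1"
  rm -f "$1.bak"
}

# Demonstration on a throwaway file rather than the real DESCRIPTION.
demo_file=$(mktemp)
printf 'Package: AMR\nVersion: 3.0.1.9055\nDate: 2026-03-07\n' > "$demo_file"
today=$(date +%Y-%m-%d)
set_date "$demo_file" "$today"
```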

Internal State

The package uses a private AMR_env environment (created in aa_globals.R) for caching expensive lookups (e.g., microorganism matching scores, breakpoint tables). This avoids re-computation within a session.
