1
0
mirror of https://github.com/msberends/AMR.git synced 2026-05-31 13:41:42 +02:00

16 Commits

Author SHA1 Message Date
Claude
3d3f06be24 Fix candidates_formatted not found in print.mo_uncertainties()
When there is only one match (candidates field is empty), the "Also
matched:" block must be skipped. Wrapping it in if (candidates != "")
prevents the error that arose after #288 narrowed filtr to one entry.

https://claude.ai/code/session_01VH4Ju4Xq9aW1AHuoVbjGEo
2026-05-06 18:43:00 +00:00
Claude
d23004641f Revert mo_matching_score.R; fix #288 in mo.R pre-filter instead
Issue #288: when the abbreviated-genus pre-filter (≤3 char genus) yields
exactly one candidate whose species (and subspecies) word(s) exactly match
the input, narrow filtr to that single candidate and set
minimum_matching_score = 0. This bypasses the automatic 0.55 cutoff that
only runs in the is.null() branch, so "S. apiospermum" resolves to
Scedosporium apiospermum without touching the validated scoring formula.

https://claude.ai/code/session_01VH4Ju4Xq9aW1AHuoVbjGEo
2026-05-06 18:07:37 +00:00
Claude
b3b8d301ff Fix #287 (complex fallback) and #288 (species epithet scoring bias)
#287: as.mo() now strips " complex" from input when that exact complex
is not in the taxonomy and retries with the bare name, so inputs like
"Proteus vulgaris complex" no longer return NA.

#288: mo_matching_score() applies a ×2 bonus when the input has an
abbreviated genus (≤3 chars) and the candidate's species epithet exactly
matches the input species epithet. This ensures "S. apiospermum" resolves
to Scedosporium apiospermum rather than Staphylococcus aureus, overcoming
the kingdom/prevalence denominator bias in favour of common bacteria.

https://claude.ai/code/session_01VH4Ju4Xq9aW1AHuoVbjGEo
2026-05-06 15:11:31 +00:00
155c2707ce add fn 2026-05-05 12:41:45 +02:00
cead31bed0 prepare for morphology 2026-05-04 22:57:42 +02:00
a5e8beff69 prepare for documentation 2026-05-03 11:50:05 +02:00
Claude
9707450b89 Fix emend. author bug in get_author_year() and update ref documentation
Strip emend. and everything after it so the ref column retains the
combination authority, not the emendation author. Update data.R and
mo_property.R docs to describe the correct semantics of the ref field.

https://claude.ai/code/session_01VH4Ju4Xq9aW1AHuoVbjGEo
2026-05-02 14:34:45 +00:00
0af3f84655 (v3.0.1.9057) website fix 2026-05-02 14:56:26 +02:00
Matthijs Berends
64753e9a16 Revamp beta NEWS.md section: concise, user-focused, telegram style (#284)
* Revamp beta NEWS.md section: concise, user-focused, telegram style

https://claude.ai/code/session_01CnwgKsQqnKNdSD2KjwFAdH

* Loosen NEWS.md telegram style: add context words for clarity

https://claude.ai/code/session_01CnwgKsQqnKNdSD2KjwFAdH

* Add issue/PR references to beta NEWS.md entries

https://claude.ai/code/session_01CnwgKsQqnKNdSD2KjwFAdH

* Add antimicrobial codes to new antimicrobials list in NEWS.md

https://claude.ai/code/session_01CnwgKsQqnKNdSD2KjwFAdH

* Replace em dashes with natural English punctuation in NEWS.md

https://claude.ai/code/session_01CnwgKsQqnKNdSD2KjwFAdH

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-05-02 10:31:33 +01:00
Matthijs Berends
24f24ecaf8 Generalise interpretive rules for multi-guideline support (#268) (#283)
* Generalise interpretive rules for multi-guideline support (#268)

- Rename data-raw/eucast_rules.tsv → interpretive_rules.tsv; add rule.provider
  column (value: "EUCAST") to distinguish future CLSI rows
- Rename EUCAST_RULES_DF → INTERPRETIVE_RULES_DF in _pre_commit_checks.R;
  filter by rule.provider == guideline when applying rules in interpretive_rules()
- Rename custom_eucast_rules() → custom_interpretive_rules() with new S3 class
  "custom_interpretive_rules"; old function becomes a deprecated wrapper in
  zz_deprecated.R; backward-compat S3 dispatch shims added for old class
- Remove stop_if(guideline == "CLSI", ...) so clsi_rules() no longer errors
- Add .onLoad shim in zzz.R to create INTERPRETIVE_RULES_DF from EUCAST_RULES_DF
  for transitional compatibility until sysdata.rda is regenerated

https://claude.ai/code/session_01D46BTsfJSPo3HnLWp3PRkP

* Fix namespace load failure: remove assignInNamespace from .onLoad (#268)

assignInNamespace cannot add NEW bindings to a locked package namespace
(R locks namespace bindings before .onLoad runs). Replace the .onLoad
shim with a runtime fallback inside interpretive_rules(): if
INTERPRETIVE_RULES_DF is absent (pre-regeneration sysdata.rda), derive
it from EUCAST_RULES_DF by adding the rule.provider column. This also
fixes the screening_abx line to reuse the already-resolved
interpretive_rules_df_total instead of a bare INTERPRETIVE_RULES_DF
reference.

https://claude.ai/code/session_01D46BTsfJSPo3HnLWp3PRkP

* fixes

* fixes

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-05-01 18:38:51 +01:00
Matthijs Berends
f7e9294bea Add parallel computing support to antibiogram() and wisca() (#281) (#282)
* Add parallel computing support to antibiogram() and wisca() (#281)

For WISCA: simulations are distributed across (group, chunk) job pairs
via future.apply::future_lapply(), keeping all workers active even when
the regimen count is smaller than nbrOfWorkers(). Sequential fallback
with progress ticker is preserved when parallel = FALSE or workers = 1.

For grouped antibiograms: each group is processed by a separate worker,
mirroring the row-batch approach in as.sir().

Same gate pattern as as.sir() (PR #280): requires a non-sequential
future::plan() to be active; auto-upgrades to parallel = TRUE when a
parallel plan is detected; throws an informative error otherwise.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix version to 3.0.1.9055 and update CLAUDE.md version formula

Uses origin/${defaultbranch} (with a fetch) instead of the local
branch ref so the commit count is never stale after a merge.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix non-ASCII characters in antibiogram.R

Replace en/em dashes and non-breaking spaces with ASCII equivalents
to satisfy R CMD check portability requirement.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Update auto-generated Rd files after documentation rebuild

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Move parallel gate to top of antibiogram.default() like sir.R

The gate was inside the wisca==TRUE block, so parallel=TRUE with a
sequential plan was silently ignored for non-WISCA antibiograms.
Now the gate runs unconditionally at the top of the function,
identical to the as.sir() pattern: error on explicit parallel=TRUE
with sequential plan, auto-upgrade when a non-sequential plan is
already active.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

* Fix parallel WISCA returning all NA; strengthen tests; add sequential hint

Bug: lapply() over a factor yields length-1 factor elements (integer
codes), while for() over a factor yields character strings.  The job
list stored j\$group as a factor integer, but the reassembly loop
compared it with identical(j\$group, g) where g was character -- always
FALSE, so no simulation chunks were ever assembled and coverage stayed
NA throughout.

Fix: convert unique_groups to character before building jobs so both
the job list and the reassembly loop use the same type.

Tests: replaced na.rm = TRUE guards with explicit anyNA() checks so the
test suite would have caught the all-NA result immediately.

Also adds a sequential-mode performance hint (analogous to sir.R
lines 1116-1127) when simulations >= 500 and >= 3 regimens.

https://claude.ai/code/session_01FC43syPbzhGmKgrrVNHjnF

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-30 18:41:56 +01:00
Matthijs Berends
23beebc6c3 Migrate parallel computing in as.sir() from parallel:: to future/future.apply (#280)
* Migrate parallel computing in as.sir() from parallel:: to future/future.apply

Replace parallel::mclapply() and parallel::parLapply() with
future.apply::future_lapply(), enabling transparent support for any
future backend (multisession, multicore, mirai_multisession, cluster)
on all platforms including Windows.

When parallel = TRUE the function now: (1) respects an active
future::plan() set by the user without overriding it on exit, or
(2) sets a temporary multisession plan with parallelly::availableCores()
and tears it down on exit. The max_cores argument controls worker count
only when no user plan is active.

future and future.apply are added to Suggests in DESCRIPTION.

https://claude.ai/code/session_01M1Jvf2Miu6JL4TQrEh1wS8

* Require user plan() for parallel=TRUE; fix as_wt_nwt false-positive warnings

- parallel = TRUE now errors with a cli-styled message if no non-sequential
  future::plan() is active; users must call e.g. future::plan(future::multisession)
  before using parallel = TRUE (breaking change)
- Removed auto-setup/teardown of multisession plan inside as.sir(), which was
  slow and caused version-mismatch issues with load_all() workflows
- Added as_wt_nwt to the exclusion list in as_sir_method() to suppress
  false-positive "no longer used" warnings during parallel runs
- Fixed pieces_per_col row-batch calculation to use n_workers (total available
  workers from the active plan) instead of n_cores (workers clipped to n_cols),
  so row-batch mode activates correctly when n_cols < n_workers
- Updated @param parallel and @param max_cores roxygen docs; regenerated man/as.sir.Rd
- Updated sequential-mode hint to instruct users to set plan() first

https://claude.ai/code/session_01M1Jvf2Miu6JL4TQrEh1wS8

* fix parallel

* fix parallel

* unit tests

* unit tedts

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-30 08:57:19 +01:00
3f1b20c304 (v3.0.1.9052) fix NEWS 2026-04-25 16:21:31 +02:00
905dea2cf1 (v3.0.1.9051) fix NEWS 2026-04-25 16:20:34 +02:00
Matthijs Berends
8261b91b24 Fix custom reference_data support in as.sir() (#239) (PR #279)
* Fix custom reference_data support in as.sir() (#239)

- custom guideline names now correctly classify values as R: CLSI convention
  (>= breakpoint_R for MIC, <= for disk) applies only when guideline contains
  "CLSI"; all other guidelines including custom ones use the EUCAST convention
  (> breakpoint_R for MIC, < for disk)
- guideline argument is now optional when reference_data is manually set: if
  omitted or if its value does not match any row in the custom data, all rows
  in reference_data are used; if set to a value present in the data, only
  matching rows are filtered — useful for multi-guideline custom tables
- host = NA in custom reference_data now acts as a host-agnostic fallback
  when no host-specific breakpoint row exists for the current animal species
- updated reference_data argument documentation to explain these conventions

https://claude.ai/code/session_01Q8KtFFGG9qrjAgLJBbxG2U

* Refactor R-classification logic using custom_breakpoints_set flag

Introduce custom_breakpoints_set <- !identical(reference_data, AMR::clinical_breakpoints)
at the top of as_sir_method() and replace all identical() calls inside that
function with this variable.

In the case_when_AMR interpretation blocks (MIC and disk), the R-classification
now has three explicit arms:
- !custom_breakpoints_set & EUCAST guideline -> open interval (> / <)
- !custom_breakpoints_set & CLSI guideline  -> closed interval (>= / <=)
- custom_breakpoints_set                    -> open interval (> / <), always,
  regardless of the guideline name in the custom data (e.g. "CLSI_custom"
  must not accidentally trigger CLSI convention)

https://claude.ai/code/session_01Q8KtFFGG9qrjAgLJBbxG2U

* Fix unit tests for custom reference_data (#239)

- Do not override my_bp$mo / my_bp$ab in tests: assigning plain character
  strips the <mo>/<ab> class, which check_reference_data() rejects. Use the
  mo/ab values already present in the source row instead.
- Use NA_character_ instead of NA for my_bp$host so the host column keeps
  its character class.
- Pass breakpoint_type = "animal" explicitly in the host-fallback test since
  the custom reference_data only contains animal-type breakpoints.

https://claude.ai/code/session_01Q8KtFFGG9qrjAgLJBbxG2U

* Add coerce_reference_data_columns() for lenient reference_data validation

check_reference_data() now returns the (possibly coerced) reference_data and
the call site captures the result so downstream code sees the fixed columns.

A new coerce_reference_data_columns() helper is called before the strict class
check inside check_reference_data(). It coerces columns to the expected types:
- mo  -> as.mo() if not already <mo> class
- ab  -> as.ab() if not already <ab> class
- character columns -> as.character() (e.g. host = NA becomes NA_character_)
- numeric columns  -> as.double()
- logical columns  -> as.logical()

This allows users to build a custom reference_data from a plain data.frame
without having to pre-apply as.mo()/as.ab() or worry about NA column types.

Updated the reference_data roxygen argument to document the auto-coercion and
restored the tests to the simpler form that uses plain character assignments,
relying on the new coercion instead of workarounds.

https://claude.ai/code/session_01Q8KtFFGG9qrjAgLJBbxG2U

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-25 14:38:01 +02:00
Matthijs Berends
19157ce718 Fix parallel computing in as.sir.data.frame (#276)
* Fix parallel computing in as.sir.data.frame

Six bugs in parallel = TRUE mode:

1. PSOCK workers (Windows / R < 4.0) never had AMR loaded, so every
   exported/AMR function call failed. Added clusterEvalQ(cl, library(AMR))
   with a graceful fallback to sequential when the package cannot be loaded
   (e.g. dev-only load_all() environments).

2. clusterExport'd AMR_env was a frozen serialised copy; as.sir() on the
   worker wrote to AMR:::AMR_env while run_as_sir_column read from the stale
   copy, so the captured log was always wrong. Fixed by resolving AMR_env
   dynamically via get("AMR_env", envir = asNamespace("AMR")) inside the
   worker function, and removing AMR_env from clusterExport.

3. In the fork-based (mclapply) path each worker inherited the parent's full
   sir_interpretation_history. Capturing the whole log then combining across
   workers duplicated every pre-existing entry. Fixed by recording the log
   row count before the as.sir() call and slicing only the new rows
   afterwards.

4. run_as_sir_column used non-exported internals (%pm>%, pm_pull,
   as.sir.default) that are inaccessible on PSOCK workers after library(AMR).
   Replaced pipe chains with direct as.mic(as.character(x[, col, drop=TRUE]))
   and as.disk(...) calls, and changed as.sir.default() to as.sir() which
   dispatches correctly via S3.

5. With info = TRUE, worker forks printed per-column progress messages
   simultaneously, producing garbled interleaved console output. Per-column
   messages are now suppressed inside workers (effective_info = FALSE) while
   the outer "Running in parallel" / "DONE" messages still appear.

6. Malformed Unicode escape \u00a (3 hex digits) in the "DONE" banner was
   parsed by R as U+00AD (soft hyphen) + "ONE"; corrected to  .

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Add parallel computing tests to test-sir.R

Eight targeted tests verify correctness of the parallel as.sir() path:
identical SIR output vs sequential, matching log row counts, no
pre-existing history duplication, reproducibility across runs, results
consistency across max_cores values, single-column fallback, and no
per-column worker messages leaking when info = TRUE. All pass when only
1 core is available (parallel silently falls back to sequential).

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Fix as.sir() data.frame: preserve already-<sir> columns, exclude metadata

Issue #278: two related bugs in the column-detection / type-assignment pipeline.

Bug 1 – already-<sir> columns deleted on re-run
  Line 886 excluded already-sir columns from the type assignment (they
  stayed type "") causing the result loop to do x[,col] <- NULL, deleting
  them.  Fix: drop the !is.sir() guard so all untyped columns fall through
  to type "sir" and are re-processed correctly.

Bug 2 – metadata columns treated as antibiotics
  as.ab("patient") -> OXY, as.ab("ward") -> PRU.  The column detector
  accepted any column whose name matched an antibiotic code, regardless of
  content.  Fix: for name-matched columns that do not already carry an AMR
  class, also verify content looks like AMR data (all_valid_mics, all-
  numeric, or any SIR-like string).  all_valid_disks() is intentionally
  avoided here because it strips letters from strings (as.disk("Pt_1")==1).

Also adds tools/benchmark_parallel.R: a standalone script that times
sequential vs parallel as.sir() across n=20/200/2000/20000 rows and
saves a ggplot2 PNG to tools/benchmark_parallel.png.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Update benchmark: two-panel script with warm-up and column-count sweep

Previous single-panel benchmark was misleading: the first sequential run
paid one-time cache-warm-up cost (skewing n=20), and only 6 columns were
used so only 6 cores were ever active on a 16-core machine.

New two-panel design:
  Left  – vary rows with 16 fixed AB columns (shows memory-bandwidth
          saturation for large n)
  Right – vary columns with fixed rows (shows the real speedup profile:
          parallel wins when n_cols >> 1)

Also adds a warm-up pass before measurements to eliminate first-call bias.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Optimise parallel as.sir(): row-batch mode when n_cols < n_cores

Previously parallel dispatch only parallelised by column, so a 6-column
dataset on a 16-core machine used at most 6 cores with the other 10 idle.
For large n this also caused memory-bandwidth saturation (each worker did
a full n-row scan of clinical_breakpoints simultaneously).

New row-batch mode (fork path, R >= 4.0, non-Windows):
  pieces_per_col = ceil(n_cores / n_cols)
  Jobs = n_cols × pieces_per_col  (≈ n_cores jobs total)
  Each job: one column × one row slice

Benefits:
  - All cores stay busy regardless of column count
  - Per-worker memory footprint shrinks by pieces_per_col ×
  - Breakpoints lookup cache pressure reduced per worker

PSOCK path (Windows / R < 4.0) is unchanged: per-job serialisation
overhead makes row batching unprofitable there.

run_as_sir_column() gains an optional `rows` parameter (NULL = all rows,
backward-compatible). Results are reassembled via as.sir(c(as.character(.)))
which is safe for already-clean SIR values.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Fix info=FALSE ignored when no breakpoints found in as_sir_method

Operator-precedence bug at line 1601:

  if (isTRUE(info) && nrow(df_unique) < 10 || nrow(breakpoints) == 0)

R evaluates && before ||, so this was equivalent to:

  (isTRUE(info) && nrow(df_unique) < 10) || (nrow(breakpoints) == 0)

When nrow(breakpoints) == 0 (e.g. cefoxitin / flucloxacillin / mupirocin
against E. coli in EUCAST) the intro message was always printed regardless
of info. Fix: add parentheses so info gates both conditions:

  isTRUE(info) && (nrow(df_unique) < 10 || nrow(breakpoints) == 0)

Also pass print = isTRUE(info) to progress_ticker so the progress bar
(which prints intro_txt as its title) is suppressed when info = FALSE.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Fix cli formatting in as.sir() messages

- stop_if for empty ab_cols: wrap as.mic() and as.disk() in
  {.help [{.fun ...}](...)} for clickable links in cli output
- Parallel mode message: use {.field col} formatting for column names
  and quotes = FALSE in vector_and(), consistent with the rest of the
  codebase (avoids double-quoting from both font_bold and quotes="'")

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Use font_bold() inside {.field} for column names in parallel message

Convention: paste0("{.field ", font_bold(col), "}") gives bold green
column names without quotation marks, consistent with the rest of the
codebase (e.g. the 'Cleaning values' message in run_as_sir_column).

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Add collapse = NULL to font_bold() for column name vectors

font_bold() without collapse = NULL joins a vector with "" into a single
string, breaking paste0() element-wise formatting for length > 1 vectors.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

* Add tools/ to .Rbuildignore

Keeps the benchmark script out of the built package tarball.

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-04-25 00:34:38 +02:00
48 changed files with 2952 additions and 598 deletions

View File

@@ -41,4 +41,5 @@
^CRAN-SUBMISSION$ ^CRAN-SUBMISSION$
^PythonPackage$ ^PythonPackage$
^README\.Rmd$ ^README\.Rmd$
^tools$
\.no_include$ \.no_include$

View File

@@ -167,7 +167,8 @@ Then run the following from the repo root to determine the version string to use
currenttag=$(git describe --tags --abbrev=0 | sed 's/v//') currenttag=$(git describe --tags --abbrev=0 | sed 's/v//')
currenttagfull=$(git describe --tags --abbrev=0) currenttagfull=$(git describe --tags --abbrev=0)
defaultbranch=$(git branch | cut -c 3- | grep -E '^master$|^main$') defaultbranch=$(git branch | cut -c 3- | grep -E '^master$|^main$')
currentcommit=$(git rev-list --count ${currenttagfull}..${defaultbranch}) git fetch origin ${defaultbranch} --quiet
currentcommit=$(git rev-list --count ${currenttagfull}..origin/${defaultbranch})
currentversion="${currenttag}.$((currentcommit + 9001 + 1))" currentversion="${currenttag}.$((currentcommit + 9001 + 1))"
echo "$currentversion" echo "$currentversion"
``` ```

View File

@@ -1,6 +1,6 @@
Package: AMR Package: AMR
Version: 3.0.1.9048 Version: 3.0.1.9059
Date: 2026-04-22 Date: 2026-05-06
Title: Antimicrobial Resistance Data Analysis Title: Antimicrobial Resistance Data Analysis
Description: Functions to simplify and standardise antimicrobial resistance (AMR) Description: Functions to simplify and standardise antimicrobial resistance (AMR)
data analysis and to work with microbial and antimicrobial properties by data analysis and to work with microbial and antimicrobial properties by
@@ -37,17 +37,18 @@ Authors@R: c(
person(given = c("Casper", "J."), family = "Albers", role = "ths", comment = c(ORCID = "0000-0002-9213-6743")), person(given = c("Casper", "J."), family = "Albers", role = "ths", comment = c(ORCID = "0000-0002-9213-6743")),
person(given = c("Corinna"), family = "Glasner", role = "ths", comment = c(ORCID = "0000-0003-1241-1328"))) person(given = c("Corinna"), family = "Glasner", role = "ths", comment = c(ORCID = "0000-0003-1241-1328")))
Depends: R (>= 3.0.0) Depends: R (>= 3.0.0)
Suggests: Suggests:
cleaner, cleaner,
cli, cli,
crayon, crayon,
curl, curl,
data.table, data.table,
dplyr, dplyr,
future,
future.apply,
ggplot2, ggplot2,
knitr, knitr,
openxlsx, openxlsx,
parallelly,
pillar, pillar,
progress, progress,
readxl, readxl,

View File

@@ -49,6 +49,7 @@ S3method(as.data.frame,mo)
S3method(as.double,mic) S3method(as.double,mic)
S3method(as.double,sir) S3method(as.double,sir)
S3method(as.list,custom_eucast_rules) S3method(as.list,custom_eucast_rules)
S3method(as.list,custom_interpretive_rules)
S3method(as.list,custom_mdro_guideline) S3method(as.list,custom_mdro_guideline)
S3method(as.list,mic) S3method(as.list,mic)
S3method(as.matrix,mic) S3method(as.matrix,mic)
@@ -66,6 +67,7 @@ S3method(c,ab)
S3method(c,amr_selector) S3method(c,amr_selector)
S3method(c,av) S3method(c,av)
S3method(c,custom_eucast_rules) S3method(c,custom_eucast_rules)
S3method(c,custom_interpretive_rules)
S3method(c,custom_mdro_guideline) S3method(c,custom_mdro_guideline)
S3method(c,disk) S3method(c,disk)
S3method(c,mic) S3method(c,mic)
@@ -96,6 +98,7 @@ S3method(print,amr_selector)
S3method(print,av) S3method(print,av)
S3method(print,bug_drug_combinations) S3method(print,bug_drug_combinations)
S3method(print,custom_eucast_rules) S3method(print,custom_eucast_rules)
S3method(print,custom_interpretive_rules)
S3method(print,custom_mdro_guideline) S3method(print,custom_mdro_guideline)
S3method(print,deprecated_amr_dataset) S3method(print,deprecated_amr_dataset)
S3method(print,disk) S3method(print,disk)
@@ -228,6 +231,7 @@ export(count_df)
export(count_resistant) export(count_resistant)
export(count_susceptible) export(count_susceptible)
export(custom_eucast_rules) export(custom_eucast_rules)
export(custom_interpretive_rules)
export(custom_mdro_guideline) export(custom_mdro_guideline)
export(eucast_dosage) export(eucast_dosage)
export(eucast_exceptional_phenotypes) export(eucast_exceptional_phenotypes)
@@ -296,6 +300,7 @@ export(mo_is_yeast)
export(mo_kingdom) export(mo_kingdom)
export(mo_lpsn) export(mo_lpsn)
export(mo_matching_score) export(mo_matching_score)
export(mo_morphology)
export(mo_mycobank) export(mo_mycobank)
export(mo_name) export(mo_name)
export(mo_order) export(mo_order)

76
NEWS.md
View File

@@ -1,56 +1,38 @@
# AMR 3.0.1.9048 # AMR 3.0.1.9059
Planned as v3.1.0, May 2026.
### New ### New
* Support for clinical breakpoints of 2026 of both CLSI and EUCAST, by adding all of their over 5,700 new clinical breakpoints to the `clinical_breakpoints` data set for usage in `as.sir()`. EUCAST 2026 is now the new default guideline for all MIC and disk diffusion interpretations. * EUCAST 2026 and CLSI 2026 breakpoints: over 5,700 new breakpoints added to the `clinical_breakpoints` data set; EUCAST 2026 is now the default for all MIC and disk diffusion interpretations
* Integration with the **tidymodels** framework to allow seamless use of SIR, MIC and disk data in modelling pipelines via `recipes` * Wildtype/Non-wildtype (WT/NWT) output when using ECOFF-based interpretation, by setting `breakpoint_type = "ECOFF"` in `as.sir()`; WT/NWT results are fully supported in all resistance/susceptibility functions and plots (#254)
- `step_mic_log2()` to transform `<mic>` columns with log2, and `step_sir_numeric()` to convert `<sir>` columns to numeric * Faster parallel computing via the `future` package; **breaking change**: a non-sequential plan (e.g. `future::plan(future::multisession)`) must be active before using `parallel = TRUE`; `antibiogram()` and `wisca()` now also support `parallel = TRUE` (#281)
- New `tidyselect` helpers: * *tidymodels* integration for using SIR, MIC and disk data in modelling pipelines: `step_mic_log2()`, `step_sir_numeric()`, and new column selectors `all_sir()`, `all_mic()`, `all_disk()`
- `all_sir()`, `all_sir_predictors()` * New `esbl_isolates` data set for practising AMR modelling
- `all_mic()`, `all_mic_predictors()` * New antimicrobial selectors: `ionophores()`, `peptides()`, `phosphonics()`, `spiropyrimidinetriones()`
- `all_disk()`, `all_disk_predictors()` * New `interpretive_rules()`, a unified function for EUCAST and CLSI interpretive rules; `eucast_rules()` is now a wrapper around it (#235, #259)
* Data set `esbl_isolates` to practise with AMR modelling * New `amr_course()` to download and unpack course or webinar materials from GitHub in one call
* AMR selectors `ionophores()`, `peptides()`, `phosphonics()` and `spiropyrimidinetriones()` * Typed missing value constants `NA_ab_` and `NA_mo_`, for use in pipelines that need missing values of a specific class
* Support for Wildtype (WT) / Non-wildtype (NWT) in `as.sir()`, all plotting functions, and all susceptibility/resistance functions.
- `as.sir()` gained an argument `as_wt_nwt`, which defaults to `TRUE` only when `breakpoint_type = "ECOFF"` (#254)
- This transforms the output from S/R to WT/NWT
- Functions such as `susceptibility()` count WT as S and NWT as R
* Function `interpretive_rules()`, which allows future implementation of CLSI interpretive rules (#235)
- `eucast_rules()` has become a wrapper around that function
- Gained argument `add_if_missing` (default: `TRUE`). When set to `FALSE`, rules are only applied to cells that already contain an SIR value; `NA` cells are left untouched. This is useful with `overwrite = TRUE` to update reported results without imputing values for drugs that were not tested (#259)
* Function `amr_course()`, which allows for automated download and unpacking of a GitHub repository for e.g. webinar use
* Two new `NA` objects, `NA_ab_` and `NA_mo_`, analogous to base R's `NA_character_` and `NA_integer_`, for use in pipelines that require typed missing values
### Fixes ### Fixes
* Fixed a bug in `as.sir()` where values that were purely numeric (e.g., `"1"`) and matched the broad SIR-matching regex would be incorrectly stripped of all content by the Unicode letter filter * `as.sir()` on data frames: already-converted SIR columns no longer dropped on re-run (#278); metadata columns (e.g. `patient`, `ward`) no longer misidentified as antibiotic columns; `info = FALSE` now suppresses all messages, including for columns without breakpoints
* Fixed a bug in `as.mic()` where MIC values in scientific notation (e.g., `"1e-3"`) were incorrectly handled because the letter `e` was removed along with other Unicode letters; scientific notation `e` is now preserved * `as.mic()`: values in scientific notation (e.g. `1e-3`) now handled correctly
* Fixed a bug in `as.ab()` where certain AB codes containing "PH" or "TH" (such as `ETH`, `MTH`, `PHE`, `PHN`, `STH`, `THA`, `THI1`) would incorrectly return `NA` when combined in a vector with any untranslatable value (#245) * `as.ab()`: codes containing "PH" or "TH" (e.g. `ETH`, `PHE`) no longer return `NA` when mixed with unrecognised input (#245)
* Fixed a bug in `antibiogram()` for when no antimicrobials are set * Combined MIC/SIR input values (e.g. `"<= 0.002; S"` or `"S; 0.002"`) now parsed correctly (#252)
* Fixed a bug in `as.sir()` where for numeric input the arguments `S`, `I`, and `R` would not be considered (#244) * `as.mo()`: input of the form `"X complex"` now falls back to `"X"` when the complex is not a distinct taxon in the database, preventing `NA` results for valid clinical descriptions such as `"Proteus vulgaris complex"` (#287)
* Fixed a bug in plotting MIC values when `keep_operators = "all"` * `mo_matching_score()`: abbreviated-genus input (e.g. `"S. apiospermum"`) now correctly ranks candidates whose species epithet exactly matches the input above more-prevalent organisms whose species does not match; fixes `"S. apiospermum"` resolving to *Staphylococcus* instead of *Scedosporium apiospermum* (#288)
* Fixed some foreign translations of antimicrobial drugs * `get_author_year()` in the microorganism reproduction script now strips `emend.` and everything after it, so `ref` reflects the combination authority rather than the emendation author (e.g. *Rhodococcus equi* now returns "Goodfellow et al., 1977" instead of "Nouioui et al., 2018")
* Fixed a bug for printing column names to the console when using `mutate_at(vars(...), as.mic)` (#249) * BRMO classification now includes bacterial complexes (#275)
* Fixed a bug to disregard `NI` for susceptibility proportion functions * Translation fixes for Italian CoNS/CoPS names (#256), Dutch antimicrobials, and `sir_df()` foreign-language output (#272)
* Fixed Italian translation of CoNS to Stafilococco coagulasi-negativo and CoPS to Stafilococco coagulasi-positivo (#256)
* Fixed SIR and MIC coercion of combined values, e.g. `as.sir("<= 0.002; S") ` or `as.mic("S; 0.002")` (#252)
* Fixed translation of foreign languages in `sir_df()` (#272)
* Fixed BRMO classification by including bacterial complexes (#275)
### Updates ### Updates
* Extensive `cli` integration for better message handling and clickable links in messages and warnings (#191, #265) * `custom_eucast_rules()` renamed to `custom_interpretive_rules()`; old name deprecated but still works (#268)
* `mdro()` now infers resistance for a _missing_ base drug column from an _available_ corresponding drug+inhibitor combination showing resistance (e.g., piperacillin is absent but required, while piperacillin/tazobactam available and resistant). Can be set with the new argument `infer_from_combinations`, which defaults to `TRUE` (#209). Note that this can yield a higher MDRO detection (which is a good thing as it has become more reliable). * `mdro()` can now infer resistance from a drug+inhibitor combination when the base drug column is absent (e.g. piperacillin inferred from piperacillin/tazobactam); controlled via new `infer_from_combinations` argument (default `TRUE`) (#209)
* `susceptibility()` and `resistance()` gained the argument `guideline`, which defaults to EUCAST, for interpreting the 'I' category correctly. * `susceptibility()` / `resistance()`: new `guideline` argument (default EUCAST) to ensure the 'I' category is interpreted correctly per guideline
* Added to the `antimicrobials` data set: cefepime/taniborbactam (`FTA`), ceftibuten/avibactam (`CTA`), clorobiocin (`CLB`), kasugamycin (`KAS`), ostreogrycin (`OST`), taniborbactam (`TAN`), thiostrepton (`THS`), xeruborbactam (`XER`), and zorbamycin (`ZOR`) * Capped MIC handling in `as.sir()` reworked into four clearly defined options: `"none"`, `"conservative"` (new default), `"standard"`, `"lenient"` (#243)
* `as.mic()` and `rescale_mic()` gained the argument `round_to_next_log2`, which can be set to `TRUE` to round all values up to the nearest next log2 level (#255) * `as.mic()` / `rescale_mic()`: new `round_to_next_log2` argument to round values up to the nearest log2 dilution level (#255)
* `antimicrobials$group` is now a `list` instead of a `character`, to contain any group the drug is in (#246) * `antimicrobials$group` now a `list`, so drugs belonging to multiple groups are fully represented; use `ab_group(all_groups = TRUE)` to retrieve all groups for a drug (#246)
* `ab_group()` gained an argument `all_groups` to return all groups the antimicrobial drug is in (#246) * New antimicrobials added: cefepime/taniborbactam (`FTA`), ceftibuten/avibactam (`CTA`), clorobiocin (`CLB`), kasugamycin (`KAS`), ostreogrycin (`OST`), taniborbactam (`TAN`), thiostrepton (`THS`), xeruborbactam (`XER`), zorbamycin (`ZOR`)
* Added explaining message to `as.sir()` when interpreting numeric values (e.g., 1 for S, 2 for I, 3 for R) (#244) * Improved console messages with clickable links throughout, powered by `cli` (#191, #265)
* Updated handling of capped MIC values (`<`, `<=`, `>`, `>=`) in `as.sir()` in the argument `capped_mic_handling`: (#243)
* Introduced four clearly defined options: `"none"`, `"conservative"` (default), `"standard"`, and `"lenient"`
* Interpretation of capped MIC values now consistently returns `"NI"` (non-interpretable) when the true MIC could be at either side of a breakpoint, depending on the selected handling mode
* This results in more reliable behaviour compared to previous versions for capped MIC values
* Removed the `"inverse"` option, which has now become redundant
* `ab_group()` now returns values consist with the AMR selectors (#246)
* Added two new `NA` objects, `NA_ab_` and `NA_mo_`, analogous to base R's `NA_character_` and `NA_integer_`, for use in pipelines that require typed missing values
# AMR 3.0.1 # AMR 3.0.1

View File

@@ -27,7 +27,7 @@
# how to conduct AMR data analysis: https://amr-for-r.org # # how to conduct AMR data analysis: https://amr-for-r.org #
# ==================================================================== # # ==================================================================== #
# add new version numbers here, and add the rules themselves to "data-raw/eucast_rules.tsv" and clinical_breakpoints # add new version numbers here, and add the rules themselves to "data-raw/interpretive_rules.tsv" and clinical_breakpoints
# (sourcing "data-raw/_pre_commit_checks.R" will process the TSV file) # (sourcing "data-raw/_pre_commit_checks.R" will process the TSV file)
EUCAST_VERSION_BREAKPOINTS <- list( EUCAST_VERSION_BREAKPOINTS <- list(
"16.0" = list( "16.0" = list(
@@ -221,6 +221,7 @@ globalVariables(c(
"reference.rule", "reference.rule",
"reference.rule_group", "reference.rule_group",
"reference.version", "reference.version",
"rule.provider",
"rowid", "rowid",
"rule_group", "rule_group",
"rule_name", "rule_name",

22
R/aa_helper_functions.R Normal file → Executable file
View File

@@ -1681,28 +1681,6 @@ readRDS_AMR <- function(file, refhook = NULL) {
readRDS(con, refhook = refhook) readRDS(con, refhook = refhook)
} }
get_n_cores <- function(max_cores = Inf) {
if (pkg_is_available("parallelly", min_version = "0.8.0", also_load = FALSE)) {
available_cores <- import_fn("availableCores", "parallelly")
n_cores <- min(available_cores(), na.rm = TRUE)
} else {
# `parallel` is part of base R since 2.14.0, but detectCores() is not very precise on exotic systems like Docker and quota-set Linux environments
n_cores <- parallel::detectCores()[1]
if (is.na(n_cores)) {
n_cores <- 1
}
}
max_cores <- floor(max_cores)
if (max_cores == 0) {
n_cores <- 1
} else if (max_cores < 0) {
n_cores <- max(1, n_cores - abs(max_cores))
} else if (max_cores > 0) {
n_cores <- min(n_cores, max_cores)
}
n_cores
}
# Support `where()` if tidyselect not installed ---- # Support `where()` if tidyselect not installed ----
if (!is.null(import_fn("where", "tidyselect", error_on_fail = FALSE))) { if (!is.null(import_fn("where", "tidyselect", error_on_fail = FALSE))) {
# tidyselect::where() exists, retrieve from their namespace to make `where()`s work across the package in default arguments # tidyselect::where() exists, retrieve from their namespace to make `where()`s work across the package in default arguments

0
R/amr_course.R Normal file → Executable file
View File

View File

@@ -54,7 +54,7 @@
#' @param add_total_n *(deprecated in favour of `formatting_type`)* A [logical] to indicate whether `n_tested` available numbers per pathogen should be added to the table (default is `TRUE`). This will add the lowest and highest number of available isolates per antimicrobial (e.g, if for *E. coli* 200 isolates are available for ciprofloxacin and 150 for amoxicillin, the returned number will be "150-200"). This option is unavailable when `wisca = TRUE`; in that case, use [retrieve_wisca_parameters()] to get the parameters used for WISCA. #' @param add_total_n *(deprecated in favour of `formatting_type`)* A [logical] to indicate whether `n_tested` available numbers per pathogen should be added to the table (default is `TRUE`). This will add the lowest and highest number of available isolates per antimicrobial (e.g, if for *E. coli* 200 isolates are available for ciprofloxacin and 150 for amoxicillin, the returned number will be "150-200"). This option is unavailable when `wisca = TRUE`; in that case, use [retrieve_wisca_parameters()] to get the parameters used for WISCA.
#' @param only_all_tested (for combination antibiograms): a [logical] to indicate that isolates must be tested for all antimicrobials, see *Details*. #' @param only_all_tested (for combination antibiograms): a [logical] to indicate that isolates must be tested for all antimicrobials, see *Details*.
#' @param digits Number of digits to use for rounding the antimicrobial coverage, defaults to 1 for WISCA and 0 otherwise. #' @param digits Number of digits to use for rounding the antimicrobial coverage, defaults to 1 for WISCA and 0 otherwise.
#' @param formatting_type Numeric value (122 for WISCA, 1-12 for non-WISCA) indicating how the 'cells' of the antibiogram table should be formatted. See *Details* > *Formatting Type* for a list of options. #' @param formatting_type Numeric value (1-22 for WISCA, 1-12 for non-WISCA) indicating how the 'cells' of the antibiogram table should be formatted. See *Details* > *Formatting Type* for a list of options.
#' @param col_mo Column name of the names or codes of the microorganisms (see [as.mo()]) - the default is the first column of class [`mo`]. Values will be coerced using [as.mo()]. #' @param col_mo Column name of the names or codes of the microorganisms (see [as.mo()]) - the default is the first column of class [`mo`]. Values will be coerced using [as.mo()].
#' @param language Language to translate text, which defaults to the system language (see [get_AMR_locale()]). #' @param language Language to translate text, which defaults to the system language (see [get_AMR_locale()]).
#' @param minimum The minimum allowed number of available (tested) isolates. Any isolate count lower than `minimum` will return `NA` with a warning. The default number of `30` isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see *Source*. #' @param minimum The minimum allowed number of available (tested) isolates. Any isolate count lower than `minimum` will return `NA` with a warning. The default number of `30` isolates is advised by the Clinical and Laboratory Standards Institute (CLSI) as best practice, see *Source*.
@@ -65,6 +65,7 @@
#' @param simulations (for WISCA) a numerical value to set the number of Monte Carlo simulations. #' @param simulations (for WISCA) a numerical value to set the number of Monte Carlo simulations.
#' @param conf_interval A numerical value to set confidence interval (default is `0.95`). #' @param conf_interval A numerical value to set confidence interval (default is `0.95`).
#' @param interval_side The side of the confidence interval, either `"two-tailed"` (default), `"left"` or `"right"`. #' @param interval_side The side of the confidence interval, either `"two-tailed"` (default), `"left"` or `"right"`.
#' @param parallel A [logical] to indicate if parallel computing must be used, defaults to `FALSE`. Requires the [`future.apply`][future.apply::future_lapply()] package. For WISCA, Monte Carlo simulations are distributed across workers; for grouped antibiograms, each group is processed by a separate worker. **A non-sequential [future::plan()] must already be active before setting `parallel = TRUE`** -- for example, `future::plan(future::multisession)`. An error is thrown if `parallel = TRUE` is used without a plan set by the user.
#' @param info A [logical] to indicate info should be printed - the default is `TRUE` only in interactive mode. #' @param info A [logical] to indicate info should be printed - the default is `TRUE` only in interactive mode.
#' @param object An [antibiogram()] object. #' @param object An [antibiogram()] object.
#' @param ... When used in [R Markdown or Quarto][knitr::kable()]: arguments passed on to [knitr::kable()] (otherwise, has no use). #' @param ... When used in [R Markdown or Quarto][knitr::kable()]: arguments passed on to [knitr::kable()] (otherwise, has no use).
@@ -413,6 +414,7 @@ antibiogram <- function(x,
conf_interval = 0.95, conf_interval = 0.95,
interval_side = "two-tailed", interval_side = "two-tailed",
info = interactive(), info = interactive(),
parallel = FALSE,
...) { ...) {
UseMethod("antibiogram") UseMethod("antibiogram")
} }
@@ -439,6 +441,7 @@ antibiogram.default <- function(x,
conf_interval = 0.95, conf_interval = 0.95,
interval_side = "two-tailed", interval_side = "two-tailed",
info = interactive(), info = interactive(),
parallel = FALSE,
...) { ...) {
meet_criteria(x, allow_class = "data.frame") meet_criteria(x, allow_class = "data.frame")
x <- ascertain_sir_classes(x, "x") x <- ascertain_sir_classes(x, "x")
@@ -478,6 +481,35 @@ antibiogram.default <- function(x,
meet_criteria(conf_interval, allow_class = c("numeric", "integer"), has_length = 1, is_finite = TRUE, is_positive = TRUE) meet_criteria(conf_interval, allow_class = c("numeric", "integer"), has_length = 1, is_finite = TRUE, is_positive = TRUE)
meet_criteria(interval_side, allow_class = "character", has_length = 1, is_in = c("two-tailed", "left", "right")) meet_criteria(interval_side, allow_class = "character", has_length = 1, is_in = c("two-tailed", "left", "right"))
meet_criteria(info, allow_class = "logical", has_length = 1) meet_criteria(info, allow_class = "logical", has_length = 1)
meet_criteria(parallel, allow_class = "logical", has_length = 1)
# parallel gate - identical pattern to as.sir()
if (requireNamespace("future.apply", quietly = TRUE) && !inherits(future::plan(), "sequential")) {
if (isFALSE(parallel)) {
message_("Assuming {.code parallel = TRUE} since parallel computing has been set up using the {.pkg future} package before. Set {.help [{.fun plan}](future::plan)} to sequential to prevent this.")
}
parallel <- TRUE
}
if (isTRUE(parallel)) {
stop_ifnot(
requireNamespace("future.apply", quietly = TRUE),
"Setting {.code parallel = TRUE} requires the {.pkg future.apply} package.\n",
"Install it with {.code install.packages(\"future.apply\")}."
)
stop_if(inherits(future::plan(), "sequential"),
"Setting {.code parallel = TRUE} requires a non-sequential {.help [{.fun future::plan}](future::plan)} to be active.\n",
"For your system, you could first run: {.code library(future); ",
ifelse(.Platform$OS.type == "windows" || in_rstudio(),
"plan(multisession)",
"plan(multicore)"
),
"}",
call = FALSE
)
n_workers <- future::nbrOfWorkers()
} else {
n_workers <- 1L
}
# try to find columns based on type # try to find columns based on type
if (is.null(col_mo)) { if (is.null(col_mo)) {
@@ -705,52 +737,97 @@ antibiogram.default <- function(x,
wisca_parameters <- out wisca_parameters <- out
progress <- progress_ticker( # quantile probabilities are constant across all groups
n = length(unique(wisca_parameters$group)) * simulations, probs <- if (interval_side == "two-tailed") {
n_min = 25, c((1 - conf_interval) / 2, 1 - (1 - conf_interval) / 2)
print = info, } else if (interval_side == "left") {
title = paste("Calculating WISCA for", length(unique(wisca_parameters$group)), "regimens") c(0, conf_interval)
) } else {
on.exit(close(progress)) c(1 - conf_interval, 1)
# run WISCA per group
for (group in unique(wisca_parameters$group)) {
params_current <- wisca_parameters[wisca_parameters$group == group, , drop = FALSE]
if (sum(params_current$n_tested, na.rm = TRUE) == 0) {
next
}
# prepare priors
priors_current <- create_wisca_priors(params_current)
# Monte Carlo simulations
coverage_simulations <- vapply(
FUN.VALUE = double(1),
seq_len(simulations), function(i) {
progress$tick()
simulate_coverage(priors_current)
}
)
# summarise results
coverage_mean <- mean(coverage_simulations)
if (interval_side == "two-tailed") {
probs <- c((1 - conf_interval) / 2, 1 - (1 - conf_interval) / 2)
} else if (interval_side == "left") {
probs <- c(0, conf_interval)
} else if (interval_side == "right") {
probs <- c(1 - conf_interval, 1)
}
coverage_ci <- unname(stats::quantile(coverage_simulations, probs = probs))
out_wisca$coverage[out_wisca$group == group] <- coverage_mean
out_wisca$lower_ci[out_wisca$group == group] <- coverage_ci[1]
out_wisca$upper_ci[out_wisca$group == group] <- coverage_ci[2]
} }
close(progress) unique_groups <- as.character(unique(wisca_parameters$group))
use_parallel_wisca <- isTRUE(parallel) && n_workers > 1L && length(unique_groups) > 0L
if (use_parallel_wisca) {
if (isTRUE(info)) {
message_("Running WISCA in parallel mode using ", n_workers, " workers...", as_note = FALSE, appendLF = FALSE)
}
# chunks_per_group gives ~n_workers total jobs so all workers stay busy
# even when the number of regimens is smaller than n_workers
chunks_per_group <- max(1L, ceiling(n_workers / length(unique_groups)))
chunk_sizes <- diff(c(0L, round(seq_len(chunks_per_group) * simulations / chunks_per_group)))
# precompute priors per group and build (group, chunk) job list
jobs <- unlist(lapply(unique_groups, function(g) {
params_g <- wisca_parameters[wisca_parameters$group == g, , drop = FALSE]
if (sum(params_g$n_tested, na.rm = TRUE) == 0L) {
return(NULL)
}
priors_g <- create_wisca_priors(params_g)
lapply(seq_along(chunk_sizes), function(ch) {
list(group = g, priors = priors_g, n_sims = chunk_sizes[ch])
})
}), recursive = FALSE)
jobs <- Filter(Negate(is.null), jobs)
flat <- future.apply::future_lapply(jobs, function(job) {
vapply(FUN.VALUE = double(1), seq_len(job$n_sims), function(i) {
simulate_coverage(job$priors)
})
}, future.seed = TRUE)
# reassemble per group: concatenate chunks, then summarise
for (g in unique_groups) {
g_idx <- vapply(jobs, function(j) identical(j$group, g), logical(1))
if (!any(g_idx)) next
sims <- unlist(flat[g_idx], use.names = FALSE)
out_wisca$coverage[out_wisca$group == g] <- mean(sims)
ci_vals <- unname(stats::quantile(sims, probs = probs))
out_wisca$lower_ci[out_wisca$group == g] <- ci_vals[1]
out_wisca$upper_ci[out_wisca$group == g] <- ci_vals[2]
}
if (isTRUE(info)) message_(font_green_bg(" DONE "), as_note = FALSE)
} else {
progress <- progress_ticker(
n = length(unique_groups) * simulations,
n_min = 25,
print = info,
title = paste("Calculating WISCA for", length(unique_groups), "regimens")
)
on.exit(close(progress), add = TRUE)
for (group in unique_groups) {
params_current <- wisca_parameters[wisca_parameters$group == group, , drop = FALSE]
if (sum(params_current$n_tested, na.rm = TRUE) == 0) next
priors_current <- create_wisca_priors(params_current)
coverage_simulations <- vapply(
FUN.VALUE = double(1),
seq_len(simulations), function(i) {
progress$tick()
simulate_coverage(priors_current)
}
)
out_wisca$coverage[out_wisca$group == group] <- mean(coverage_simulations)
ci_vals <- unname(stats::quantile(coverage_simulations, probs = probs))
out_wisca$lower_ci[out_wisca$group == group] <- ci_vals[1]
out_wisca$upper_ci[out_wisca$group == group] <- ci_vals[2]
}
close(progress)
if (isTRUE(info) && simulations >= 500 && length(unique_groups) >= 3) {
suggest <- ifelse(.Platform$OS.type == "windows" || in_rstudio(),
"plan(multisession)",
"plan(multicore)"
)
if (requireNamespace("future.apply", quietly = TRUE)) {
message_("Running in sequential mode. To speed up WISCA, set a parallel {.help [{.fun future::plan}](future::plan)} such as {.code ", suggest, "} and use {.code parallel = TRUE}.")
} else {
message_("Running in sequential mode. To speed up WISCA, install the {.pkg future.apply} package and then set {.code parallel = TRUE}.")
}
}
}
# final output preparation # final output preparation
out <- out_wisca out <- out_wisca
@@ -997,30 +1074,52 @@ antibiogram.grouped_df <- function(x,
conf_interval = 0.95, conf_interval = 0.95,
interval_side = "two-tailed", interval_side = "two-tailed",
info = interactive(), info = interactive(),
parallel = FALSE,
...) { ...) {
stop_ifnot(is.null(mo_transform), "{.arg mo_transform} must not be set if creating an antibiogram using a grouped tibble. The groups will become the variables over which the antimicrobials are calculated, which could include the pathogen information (though not necessary). Nonetheless, this makes {.arg mo_transform} redundant.", call = FALSE) stop_ifnot(is.null(mo_transform), "{.arg mo_transform} must not be set if creating an antibiogram using a grouped tibble. The groups will become the variables over which the antimicrobials are calculated, which could include the pathogen information (though not necessary). Nonetheless, this makes {.arg mo_transform} redundant.", call = FALSE)
stop_ifnot(is.null(syndromic_group), "{.arg syndromic_group} must not be set if creating an antibiogram using a grouped tibble. The groups will become the variables over which the antimicrobials are calculated, making {.arg syndromic_group} redundant.", call = FALSE) stop_ifnot(is.null(syndromic_group), "{.arg syndromic_group} must not be set if creating an antibiogram using a grouped tibble. The groups will become the variables over which the antimicrobials are calculated, making {.arg syndromic_group} redundant.", call = FALSE)
meet_criteria(parallel, allow_class = "logical", has_length = 1)
groups <- attributes(x)$groups groups <- attributes(x)$groups
n_groups <- NROW(groups) n_groups <- NROW(groups)
progress <- progress_ticker(
n = n_groups,
n_min = 5,
print = info,
title = paste("Calculating AMR for", n_groups, "groups")
)
on.exit(close(progress))
out <- NULL # parallel gate - identical pattern to as.sir()
wisca_parameters <- NULL if (requireNamespace("future.apply", quietly = TRUE) && !inherits(future::plan(), "sequential")) {
long_numeric <- NULL if (isFALSE(parallel)) {
message_("Assuming {.code parallel = TRUE} since parallel computing has been set up using the {.pkg future} package before. Set {.help [{.fun plan}](future::plan)} to sequential to prevent this.")
for (i in seq_len(n_groups)) {
progress$tick()
rows <- unlist(groups[i, ]$.rows)
if (length(rows) == 0) {
next
} }
new_out <- antibiogram(as.data.frame(x)[rows, , drop = FALSE], parallel <- TRUE
}
if (isTRUE(parallel)) {
stop_ifnot(
requireNamespace("future.apply", quietly = TRUE),
"Setting {.code parallel = TRUE} requires the {.pkg future.apply} package.\n",
"Install it with {.code install.packages(\"future.apply\")}."
)
stop_if(inherits(future::plan(), "sequential"),
"Setting {.code parallel = TRUE} requires a non-sequential {.help [{.fun future::plan}](future::plan)} to be active.\n",
"For your system, you could first run: {.code library(future); ",
ifelse(.Platform$OS.type == "windows" || in_rstudio(),
"plan(multisession)",
"plan(multicore)"
),
"}",
call = FALSE
)
n_workers <- future::nbrOfWorkers()
} else {
n_workers <- 1L
}
use_parallel <- isTRUE(parallel) && n_workers > 1L && n_groups > 1L
x_df <- as.data.frame(x)
run_group <- function(i) {
rows <- unlist(groups[i, ]$.rows)
if (length(rows) == 0L) {
return(NULL)
}
antibiogram(x_df[rows, , drop = FALSE],
antimicrobials = antimicrobials, antimicrobials = antimicrobials,
mo_transform = NULL, mo_transform = NULL,
ab_transform = ab_transform, ab_transform = ab_transform,
@@ -1040,12 +1139,42 @@ antibiogram.grouped_df <- function(x,
conf_interval = conf_interval, conf_interval = conf_interval,
interval_side = interval_side, interval_side = interval_side,
info = FALSE, info = FALSE,
... parallel = FALSE # never nest parallelism in workers
) )
}
if (use_parallel) {
if (isTRUE(info)) {
message_("Running antibiogram for ", n_groups, " groups in parallel using ", n_workers, " workers...", as_note = FALSE, appendLF = FALSE)
}
results_raw <- future.apply::future_lapply(seq_len(n_groups), run_group, future.seed = TRUE)
if (isTRUE(info)) message_(font_green_bg(" DONE "), as_note = FALSE)
} else {
progress <- progress_ticker(
n = n_groups,
n_min = 5,
print = info,
title = paste("Calculating AMR for", n_groups, "groups")
)
on.exit(close(progress), add = TRUE)
results_raw <- vector("list", n_groups)
for (i in seq_len(n_groups)) {
progress$tick()
results_raw[[i]] <- run_group(i)
}
close(progress)
}
out <- NULL
wisca_parameters <- NULL
long_numeric <- NULL
for (i in seq_len(n_groups)) {
new_out <- results_raw[[i]]
new_wisca_parameters <- attributes(new_out)$wisca_parameters new_wisca_parameters <- attributes(new_out)$wisca_parameters
new_long_numeric <- attributes(new_out)$long_numeric new_long_numeric <- attributes(new_out)$long_numeric
if (NROW(new_out) == 0) { if (is.null(new_out) || NROW(new_out) == 0) {
next next
} }
@@ -1071,8 +1200,7 @@ antibiogram.grouped_df <- function(x,
new_long_numeric <- new_long_numeric[, c(col_name, setdiff(names(new_long_numeric), col_name))] # set place to 1st col new_long_numeric <- new_long_numeric[, c(col_name, setdiff(names(new_long_numeric), col_name))] # set place to 1st col
} }
if (i == 1) { if (is.null(out)) {
# the first go
out <- new_out out <- new_out
wisca_parameters <- new_wisca_parameters wisca_parameters <- new_wisca_parameters
long_numeric <- new_long_numeric long_numeric <- new_long_numeric
@@ -1083,8 +1211,6 @@ antibiogram.grouped_df <- function(x,
} }
} }
close(progress)
out <- structure(as_original_data_class(out, class(x), extra_class = "antibiogram"), out <- structure(as_original_data_class(out, class(x), extra_class = "antibiogram"),
has_syndromic_group = FALSE, has_syndromic_group = FALSE,
combine_SI = isTRUE(combine_SI), combine_SI = isTRUE(combine_SI),
@@ -1116,6 +1242,7 @@ wisca <- function(x,
conf_interval = 0.95, conf_interval = 0.95,
interval_side = "two-tailed", interval_side = "two-tailed",
info = interactive(), info = interactive(),
parallel = FALSE,
...) { ...) {
antibiogram( antibiogram(
x = x, x = x,
@@ -1137,6 +1264,7 @@ wisca <- function(x,
conf_interval = conf_interval, conf_interval = conf_interval,
interval_side = interval_side, interval_side = interval_side,
info = info, info = info,
parallel = parallel,
... ...
) )
} }
@@ -1206,7 +1334,7 @@ retrieve_wisca_parameters <- function(wisca_model, ...) {
#' @rawNamespace if(getRversion() >= "3.0.0") S3method(pillar::tbl_sum, antibiogram) #' @rawNamespace if(getRversion() >= "3.0.0") S3method(pillar::tbl_sum, antibiogram)
tbl_sum.antibiogram <- function(x, ...) { tbl_sum.antibiogram <- function(x, ...) {
dims <- paste(format(NROW(x), big.mark = ","), AMR_env$cross_icon, format(NCOL(x), big.mark = ",")) dims <- paste(format(NROW(x), big.mark = ","), AMR_env$cross_icon, format(NCOL(x), big.mark = ","))
names(dims) <- "An Antibiogram" names(dims) <- "An antibiogram"
if (isTRUE(attributes(x)$wisca)) { if (isTRUE(attributes(x)$wisca)) {
dims <- c(dims, Type = paste0("WISCA with ", attributes(x)$conf_interval * 100, "% CI")) dims <- c(dims, Type = paste0("WISCA with ", attributes(x)$conf_interval * 100, "% CI"))
} else if (isTRUE(attributes(x)$formatting_type >= 13)) { } else if (isTRUE(attributes(x)$formatting_type >= 13)) {
@@ -1226,8 +1354,7 @@ tbl_format_footer.antibiogram <- function(x, ...) {
} }
c(footer, font_subtle(paste0( c(footer, font_subtle(paste0(
"# Use `ggplot2::autoplot()` or base R `plot()` to create a plot of this antibiogram,\n", "# Use `ggplot2::autoplot()` or base R `plot()` to create a plot of this antibiogram,\n",
"# or use it directly in R Markdown or ", "# or use it directly in R Markdown or Quarto, see ", word_wrap("?antibiogram")
font_url("https://quarto.org", "Quarto"), ", see ", word_wrap("?antibiogram")
))) )))
} }

View File

@@ -129,16 +129,21 @@ bug_drug_combinations <- function(x,
# turn and merge everything # turn and merge everything
pivot <- lapply(x_mo_filter, function(x) { pivot <- lapply(x_mo_filter, function(x) {
m <- as.matrix(table(as.sir(x), useNA = "always")) m <- as.matrix(table(as.sir(x), useNA = "always"))
na_idx <- which(is.na(rownames(m)))
get_row <- function(lbl) {
idx <- which(rownames(m) == lbl)
if (length(idx) == 1L) unname(m[idx, ]) else rep(0L, ncol(m))
}
data.frame( data.frame(
S = m["S", ], S = get_row("S"),
SDD = m["SDD", ], SDD = get_row("SDD"),
I = m["I", ], I = get_row("I"),
R = m["R", ], R = get_row("R"),
NI = m["NI", ], NI = get_row("NI"),
WT = m["WT", ], WT = get_row("WT"),
NWT = m["NWT", ], NWT = get_row("NWT"),
NS = m["NS", ], NS = get_row("NS"),
na = m[which(is.na(rownames(m))), ], na = if (length(na_idx) == 1L) unname(m[na_idx, ]) else rep(0L, ncol(m)),
stringsAsFactors = FALSE stringsAsFactors = FALSE
) )
}) })

View File

@@ -27,27 +27,27 @@
# how to conduct AMR data analysis: https://amr-for-r.org # # how to conduct AMR data analysis: https://amr-for-r.org #
# ==================================================================== # # ==================================================================== #
#' Define Custom EUCAST Rules #' Define Custom Interpretive Rules
#' #'
#' Define custom EUCAST rules for your organisation or specific analysis and use the output of this function in [eucast_rules()]. #' Define custom interpretive rules for your organisation or specific analysis and use the output of this function in [interpretive_rules()].
#' @param ... Rules in [formula][base::tilde] notation, see below for instructions, and in *Examples*. #' @param ... Rules in [formula][base::tilde] notation, see below for instructions, and in *Examples*.
#' @details #' @details
#' Some organisations have their own adoption of EUCAST rules. This function can be used to define custom EUCAST rules to be used in the [eucast_rules()] function. #' Some organisations have their own adoption of interpretive rules. This function can be used to define custom rules to be used in the [interpretive_rules()] function.
#' #'
#' ### Basics #' ### Basics
#' #'
#' If you are familiar with the [`case_when()`][dplyr::case_when()] function of the `dplyr` package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written *before* the tilde (`~`) and the consequence of the rule is written *after* the tilde: #' If you are familiar with the [`case_when()`][dplyr::case_when()] function of the `dplyr` package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written *before* the tilde (`~`) and the consequence of the rule is written *after* the tilde:
#' #'
#' ```r #' ```r
#' x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S", #' x <- custom_interpretive_rules(TZP == "S" ~ aminopenicillins == "S",
#' TZP == "R" ~ aminopenicillins == "R") #' TZP == "R" ~ aminopenicillins == "R")
#' ``` #' ```
#' #'
#' These are two custom EUCAST rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work: #' These are two custom interpretive rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work:
#' #'
#' ```r #' ```r
#' x #' x
#' #> A set of custom EUCAST rules: #' #> A set of custom interpretive rules:
#' #> #' #>
#' #> 1. If TZP is "S" then set to S : #' #> 1. If TZP is "S" then set to S :
#' #> amoxicillin (AMX), ampicillin (AMP) #' #> amoxicillin (AMX), ampicillin (AMP)
@@ -68,11 +68,11 @@
#' #> 1 Escherichia coli R S S #' #> 1 Escherichia coli R S S
#' #> 2 Klebsiella pneumoniae R S S #' #> 2 Klebsiella pneumoniae R S S
#' #'
#' eucast_rules(df, #' interpretive_rules(df,
#' rules = "custom", #' rules = "custom",
#' custom_rules = x, #' custom_rules = x,
#' info = FALSE, #' info = FALSE,
#' overwrite = TRUE) #' overwrite = TRUE)
#' #> mo TZP ampi cipro #' #> mo TZP ampi cipro
#' #> 1 Escherichia coli R R S #' #> 1 Escherichia coli R R S
#' #> 2 Klebsiella pneumoniae R R S #' #> 2 Klebsiella pneumoniae R R S
@@ -83,16 +83,16 @@
#' There is one exception in columns used for the rules: all column names of the [microorganisms] data set can also be used, but do not have to exist in the data set. These column names are: `r vector_and(colnames(microorganisms), sort = FALSE, documentation = TRUE)`. Thus, this next example will work as well, despite the fact that the `df` data set does not contain a column `genus`: #' There is one exception in columns used for the rules: all column names of the [microorganisms] data set can also be used, but do not have to exist in the data set. These column names are: `r vector_and(colnames(microorganisms), sort = FALSE, documentation = TRUE)`. Thus, this next example will work as well, despite the fact that the `df` data set does not contain a column `genus`:
#' #'
#' ```r #' ```r
#' y <- custom_eucast_rules( #' y <- custom_interpretive_rules(
#' TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S", #' TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S",
#' TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R" #' TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R"
#' ) #' )
#' #'
#' eucast_rules(df, #' interpretive_rules(df,
#' rules = "custom", #' rules = "custom",
#' custom_rules = y, #' custom_rules = y,
#' info = FALSE, #' info = FALSE,
#' overwrite = TRUE) #' overwrite = TRUE)
#' #> mo TZP ampi cipro #' #> mo TZP ampi cipro
#' #> 1 Escherichia coli R S S #' #> 1 Escherichia coli R S S
#' #> 2 Klebsiella pneumoniae R R S #' #> 2 Klebsiella pneumoniae R R S
@@ -109,9 +109,9 @@
#' Rules can also be applied to multiple antimicrobials and antimicrobial groups simultaneously. Use the `c()` function to combine multiple antimicrobials. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R": #' Rules can also be applied to multiple antimicrobials and antimicrobial groups simultaneously. Use the `c()` function to combine multiple antimicrobials. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R":
#' #'
#' ```r #' ```r
#' x <- custom_eucast_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R") #' x <- custom_interpretive_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R")
#' x #' x
#' #> A set of custom EUCAST rules: #' #> A set of custom interpretive rules:
#' #> #' #>
#' #> 1. If TZP is "R" then set to "R": #' #> 1. If TZP is "R" then set to "R":
#' #> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP) #' #> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP)
@@ -123,7 +123,7 @@
#' @returns A [list] containing the custom rules #' @returns A [list] containing the custom rules
#' @export #' @export
#' @examples #' @examples
#' x <- custom_eucast_rules( #' x <- custom_interpretive_rules(
#' AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", #' AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
#' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I" #' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I"
#' ) #' )
@@ -141,24 +141,24 @@
#' # combine rule sets #' # combine rule sets
#' x2 <- c( #' x2 <- c(
#' x, #' x,
#' custom_eucast_rules(TZP == "R" ~ carbapenems == "R") #' custom_interpretive_rules(TZP == "R" ~ carbapenems == "R")
#' ) #' )
#' x2 #' x2
custom_eucast_rules <- function(...) { custom_interpretive_rules <- function(...) {
dots <- tryCatch(list(...), dots <- tryCatch(list(...),
error = function(e) "error" error = function(e) "error"
) )
stop_if( stop_if(
identical(dots, "error"), identical(dots, "error"),
"rules must be a valid formula inputs (e.g., using '~'), see {.help [{.fun custom_eucast_rules}](AMR::custom_eucast_rules)}" "rules must be a valid formula inputs (e.g., using '~'), see {.help [{.fun custom_interpretive_rules}](AMR::custom_interpretive_rules)}"
) )
n_dots <- length(dots) n_dots <- length(dots)
stop_if(n_dots == 0, "no custom rules were set. Please read the documentation using {.help [{.fun custom_eucast_rules}](AMR::custom_eucast_rules)}.") stop_if(n_dots == 0, "no custom rules were set. Please read the documentation using {.help [{.fun custom_interpretive_rules}](AMR::custom_interpretive_rules)}.")
out <- vector("list", n_dots) out <- vector("list", n_dots)
for (i in seq_len(n_dots)) { for (i in seq_len(n_dots)) {
stop_ifnot( stop_ifnot(
inherits(dots[[i]], "formula"), inherits(dots[[i]], "formula"),
"rule ", i, " must be a valid formula input (e.g., using '~'), see {.help [{.fun custom_eucast_rules}](AMR::custom_eucast_rules)}" "rule ", i, " must be a valid formula input (e.g., using '~'), see {.help [{.fun custom_interpretive_rules}](AMR::custom_interpretive_rules)}"
) )
# Query # Query
@@ -180,7 +180,7 @@ custom_eucast_rules <- function(...) {
result <- dots[[i]][[3]] result <- dots[[i]][[3]]
stop_ifnot( stop_ifnot(
deparse(result) %like% "==", deparse(result) %like% "==",
"the result of rule ", i, " (the part after the `~`) must contain `==`, such as in `... ~ ampicillin == \"R\"`, see {.help [{.fun custom_eucast_rules}](AMR::custom_eucast_rules)}" "the result of rule ", i, " (the part after the `~`) must contain `==`, such as in `... ~ ampicillin == \"R\"`, see {.help [{.fun custom_interpretive_rules}](AMR::custom_interpretive_rules)}"
) )
result_group <- as.character(result)[[2]] result_group <- as.character(result)[[2]]
result_group <- as.character(str2lang(result_group)) result_group <- as.character(str2lang(result_group))
@@ -230,13 +230,13 @@ custom_eucast_rules <- function(...) {
} }
names(out) <- paste0("rule", seq_len(n_dots)) names(out) <- paste0("rule", seq_len(n_dots))
set_clean_class(out, new_class = c("custom_eucast_rules", "list")) set_clean_class(out, new_class = c("custom_interpretive_rules", "list"))
} }
#' @method c custom_eucast_rules #' @method c custom_interpretive_rules
#' @noRd #' @noRd
#' @export #' @export
c.custom_eucast_rules <- function(x, ...) { c.custom_interpretive_rules <- function(x, ...) {
if (length(list(...)) == 0) { if (length(list(...)) == 0) {
return(x) return(x)
} }
@@ -245,21 +245,21 @@ c.custom_eucast_rules <- function(x, ...) {
out <- c(out, unclass(e)) out <- c(out, unclass(e))
} }
names(out) <- paste0("rule", seq_len(length(out))) names(out) <- paste0("rule", seq_len(length(out)))
set_clean_class(out, new_class = c("custom_eucast_rules", "list")) set_clean_class(out, new_class = c("custom_interpretive_rules", "list"))
} }
#' @method as.list custom_eucast_rules #' @method as.list custom_interpretive_rules
#' @noRd #' @noRd
#' @export #' @export
as.list.custom_eucast_rules <- function(x, ...) { as.list.custom_interpretive_rules <- function(x, ...) {
c(x, ...) c(x, ...)
} }
#' @method print custom_eucast_rules #' @method print custom_interpretive_rules
#' @export #' @export
#' @noRd #' @noRd
print.custom_eucast_rules <- function(x, ...) { print.custom_interpretive_rules <- function(x, ...) {
cat("A set of custom EUCAST rules:\n") cat("A set of custom interpretive rules:\n")
for (i in seq_len(length(x))) { for (i in seq_len(length(x))) {
rule <- x[[i]] rule <- x[[i]]
rule$query <- format_custom_query_rule(rule$query) rule$query <- format_custom_query_rule(rule$query)
@@ -291,3 +291,19 @@ print.custom_eucast_rules <- function(x, ...) {
cat("\n ", rule_if, "\n", rule_then, "\n", sep = "") cat("\n ", rule_if, "\n", rule_then, "\n", sep = "")
} }
} }
# Backward-compat S3 dispatch for objects created with the old custom_eucast_rules() function
#' @method c custom_eucast_rules
#' @noRd
#' @export
c.custom_eucast_rules <- function(x, ...) c.custom_interpretive_rules(x, ...)
#' @method as.list custom_eucast_rules
#' @noRd
#' @export
as.list.custom_eucast_rules <- function(x, ...) as.list.custom_interpretive_rules(x, ...)
#' @method print custom_eucast_rules
#' @export
#' @noRd
print.custom_eucast_rules <- function(x, ...) print.custom_interpretive_rules(x, ...)

View File

@@ -109,8 +109,9 @@
#' - `status` \cr Status of the taxon, either `r vector_or(microorganisms$status, documentation = TRUE)` #' - `status` \cr Status of the taxon, either `r vector_or(microorganisms$status, documentation = TRUE)`
#' - `kingdom`, `phylum`, `class`, `order`, `family`, `genus`, `species`, `subspecies`\cr Taxonomic rank of the microorganism. Note that for fungi, *phylum* is equal to their taxonomic *division*. Also, for fungi, *subkingdom* and *subdivision* were left out since they do not occur in the bacterial taxonomy. #' - `kingdom`, `phylum`, `class`, `order`, `family`, `genus`, `species`, `subspecies`\cr Taxonomic rank of the microorganism. Note that for fungi, *phylum* is equal to their taxonomic *division*. Also, for fungi, *subkingdom* and *subdivision* were left out since they do not occur in the bacterial taxonomy.
#' - `rank`\cr Text of the taxonomic rank of the microorganism, such as `"species"` or `"genus"` #' - `rank`\cr Text of the taxonomic rank of the microorganism, such as `"species"` or `"genus"`
#' - `ref`\cr Author(s) and year of related scientific publication. This contains only the *first surname* and year of the *latest* authors, e.g. "Wallis *et al.* 2006 *emend.* Smith and Jones 2018" becomes "Smith *et al.*, 2018". This field is directly retrieved from the source specified in the column `source`. Moreover, accents were removed to comply with CRAN that only allows ASCII characters. #' - `ref`\cr Abbreviated authority citation for the nomenclatural act that established the current name combination, following ICNP conventions. For species described in their current genus (*sp. nov.*), this is the original description author(s) and year. For species transferred to a different genus (*comb. nov.*), this is the reclassification author(s) and year. Emendations are excluded. For synonyms, this is the authority under which the synonym was originally published. This field is directly retrieved from the source specified in the column `source`. Diacritics were removed to comply with CRAN, that only allows ASCII characters.
#' - `oxygen_tolerance` \cr Oxygen tolerance, either `r vector_or(microorganisms$oxygen_tolerance, documentation = TRUE)`. These data were retrieved from BacDive (see *Source*). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently `r round(length(microorganisms$oxygen_tolerance[which(!is.na(microorganisms$oxygen_tolerance))]) / nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]) * 100, 1)`% of all `r format_included_data_number(nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]))` bacteria in the data set contain an oxygen tolerance. #' - `oxygen_tolerance` \cr Oxygen tolerance, either `r vector_or(microorganisms$oxygen_tolerance, documentation = TRUE)`. These data were retrieved from BacDive (see *Source*). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently `r round(length(microorganisms$oxygen_tolerance[which(!is.na(microorganisms$oxygen_tolerance))]) / nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]) * 100, 1)`% of all `r format_included_data_number(nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]))` bacteria in the data set contain an oxygen tolerance.
#' - `morphology` \cr Morphology (cell shape), either `r vector_or(microorganisms$morphology, documentation = TRUE)`. These data were retrieved from BacDive (see *Source*). Genera that are clinically established as coccobacilli (the HACEK group and beyond, such as *Haemophilus* and *Acinetobacter*) are classified as such regardless of BacDive majority vote. Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus. Currently `r round(length(microorganisms$morphology[which(!is.na(microorganisms$morphology))]) / nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]) * 100, 1)`% of all `r format_included_data_number(nrow(microorganisms[which(microorganisms$kingdom == "Bacteria"), ]))` bacteria in the data set contain a morphology.
#' - `source`\cr Either `r vector_or(microorganisms$source, documentation = TRUE)` (see *Source*) #' - `source`\cr Either `r vector_or(microorganisms$source, documentation = TRUE)` (see *Source*)
#' - `lpsn`\cr Identifier ('Record number') of `r TAXONOMY_VERSION$LPSN$name`. This will be the first/highest LPSN identifier to keep one identifier per row. For example, *Acetobacter ascendens* has LPSN Record number 7864 and 11011. Only the first is available in the `microorganisms` data set. ***This is a unique identifier***, though available for only `r format_included_data_number(sum(!is.na(microorganisms$lpsn)))` records. #' - `lpsn`\cr Identifier ('Record number') of `r TAXONOMY_VERSION$LPSN$name`. This will be the first/highest LPSN identifier to keep one identifier per row. For example, *Acetobacter ascendens* has LPSN Record number 7864 and 11011. Only the first is available in the `microorganisms` data set. ***This is a unique identifier***, though available for only `r format_included_data_number(sum(!is.na(microorganisms$lpsn)))` records.
#' - `lpsn_parent`\cr LPSN identifier of the parent taxon #' - `lpsn_parent`\cr LPSN identifier of the parent taxon

0
R/first_isolate.R Normal file → Executable file
View File

0
R/get_episode.R Normal file → Executable file
View File

View File

@@ -62,17 +62,17 @@ format_eucast_version_nr <- function(version, markdown = TRUE) {
#' @param x A data set with antimicrobials columns, such as `amox`, `AMX` and `AMC`. #' @param x A data set with antimicrobials columns, such as `amox`, `AMX` and `AMC`.
#' @param info A [logical] to indicate whether progress should be printed to the console - the default is only print while in interactive sessions. #' @param info A [logical] to indicate whether progress should be printed to the console - the default is only print while in interactive sessions.
#' @param guideline A guideline name, either "EUCAST" (default) or "CLSI". This can be set with the package option [`AMR_guideline`][AMR-options]. #' @param guideline A guideline name, either "EUCAST" (default) or "CLSI". This can be set with the package option [`AMR_guideline`][AMR-options].
#' @param rules A [character] vector that specifies which rules should be applied. Must be one or more of `"breakpoints"`, `"expected_phenotypes"`, `"expert"`, `"other"`, `"custom"`, `"all"`, and defaults to `c("breakpoints", "expected_phenotypes")`. The default value can be set to another value using the package option [`AMR_interpretive_rules`][AMR-options]: `options(AMR_interpretive_rules = "all")`. If using `"custom"`, be sure to fill in argument `custom_rules` too. Custom rules can be created with [custom_eucast_rules()]. #' @param rules A [character] vector that specifies which rules should be applied. Must be one or more of `"breakpoints"`, `"expected_phenotypes"`, `"expert"`, `"other"`, `"custom"`, `"all"`, and defaults to `c("breakpoints", "expected_phenotypes")`. The default value can be set to another value using the package option [`AMR_interpretive_rules`][AMR-options]: `options(AMR_interpretive_rules = "all")`. If using `"custom"`, be sure to fill in argument `custom_rules` too. Custom rules can be created with [custom_interpretive_rules()].
#' @param verbose A [logical] to turn Verbose mode on and off (default is off). In Verbose mode, the function does not apply rules to the data, but instead returns a data set in logbook form with extensive info about which rows and columns would be effected and in which way. Using Verbose mode takes a lot more time. #' @param verbose A [logical] to turn Verbose mode on and off (default is off). In Verbose mode, the function does not apply rules to the data, but instead returns a data set in logbook form with extensive info about which rows and columns would be effected and in which way. Using Verbose mode takes a lot more time.
#' @param version_breakpoints The version number to use for the EUCAST Clinical Breakpoints guideline. Can be `r vector_or(names(EUCAST_VERSION_BREAKPOINTS), documentation = TRUE, reverse = TRUE)`. #' @param version_breakpoints The version number to use for the EUCAST Clinical Breakpoints guideline. Can be `r vector_or(names(EUCAST_VERSION_BREAKPOINTS), documentation = TRUE, reverse = TRUE)`.
#' @param version_expected_phenotypes The version number to use for the EUCAST Expected Phenotypes. Can be `r vector_or(names(EUCAST_VERSION_EXPECTED_PHENOTYPES), documentation = TRUE, reverse = TRUE)`. #' @param version_expected_phenotypes The version number to use for the EUCAST Expected Phenotypes. Can be `r vector_or(names(EUCAST_VERSION_EXPECTED_PHENOTYPES), documentation = TRUE, reverse = TRUE)`.
#' @param version_expertrules The version number to use for the EUCAST Expert Rules and Intrinsic Resistance guideline. Can be `r vector_or(names(EUCAST_VERSION_EXPERT_RULES), documentation = TRUE, reverse = TRUE)`. #' @param version_expertrules The version number to use for the EUCAST Expert Rules and Intrinsic Resistance guideline. Can be `r vector_or(names(EUCAST_VERSION_EXPERT_RULES), documentation = TRUE, reverse = TRUE)`.
#' @param ampc_cephalosporin_resistance (only applies when `rules` contains `"expert"` or `"all"`) a [character] value that should be applied to cefotaxime, ceftriaxone and ceftazidime for AmpC de-repressed cephalosporin-resistant mutants - the default is `NA`. Currently only works when `version_expertrules` is `3.2` and higher; these versions of '*EUCAST Expert Rules on Enterobacterales*' state that results of cefotaxime, ceftriaxone and ceftazidime should be reported with a note, or results should be suppressed (emptied) for these three drugs. A value of `NA` (the default) for this argument will remove results for these three drugs, while e.g. a value of `"R"` will make the results for these drugs resistant. Use `NULL` or `FALSE` to not alter results for these three drugs of AmpC de-repressed cephalosporin-resistant mutants. Using `TRUE` is equal to using `"R"`. \cr For *EUCAST Expert Rules* v3.2, this rule applies to: `r vector_and(gsub("[^a-zA-Z ]+", "", unlist(strsplit(EUCAST_RULES_DF[which(EUCAST_RULES_DF$reference.version %in% c(3.2, 3.3) & EUCAST_RULES_DF$reference.rule %like% "ampc"), "this_value"][1], "|", fixed = TRUE))), quotes = "*")`. #' @param ampc_cephalosporin_resistance (only applies when `rules` contains `"expert"` or `"all"`) a [character] value that should be applied to cefotaxime, ceftriaxone and ceftazidime for AmpC de-repressed cephalosporin-resistant mutants - the default is `NA`. Currently only works when `version_expertrules` is `3.2` and higher; these versions of '*EUCAST Expert Rules on Enterobacterales*' state that results of cefotaxime, ceftriaxone and ceftazidime should be reported with a note, or results should be suppressed (emptied) for these three drugs. A value of `NA` (the default) for this argument will remove results for these three drugs, while e.g. a value of `"R"` will make the results for these drugs resistant. Use `NULL` or `FALSE` to not alter results for these three drugs of AmpC de-repressed cephalosporin-resistant mutants. Using `TRUE` is equal to using `"R"`. \cr For *EUCAST Expert Rules* v3.2, this rule applies to: `r vector_and(gsub("[^a-zA-Z ]+", "", unlist(strsplit(INTERPRETIVE_RULES_DF[which(INTERPRETIVE_RULES_DF$reference.version %in% c(3.2, 3.3) & INTERPRETIVE_RULES_DF$reference.rule %like% "ampc"), "this_value"][1], "|", fixed = TRUE))), quotes = "*")`.
#' @param ... Column names of antimicrobials. To automatically detect antimicrobial column names, do not provide any named arguments; [guess_ab_col()] will then be used for detection. To manually specify a column, provide its name (case-insensitive) as an argument, e.g. `AMX = "amoxicillin"`. To skip a specific antimicrobial, set it to `NULL`, e.g. `TIC = NULL` to exclude ticarcillin. If a manually defined column does not exist in the data, it will be skipped with a warning. #' @param ... Column names of antimicrobials. To automatically detect antimicrobial column names, do not provide any named arguments; [guess_ab_col()] will then be used for detection. To manually specify a column, provide its name (case-insensitive) as an argument, e.g. `AMX = "amoxicillin"`. To skip a specific antimicrobial, set it to `NULL`, e.g. `TIC = NULL` to exclude ticarcillin. If a manually defined column does not exist in the data, it will be skipped with a warning.
#' @param ab Any (vector of) text that can be coerced to a valid antimicrobial drug code with [as.ab()]. #' @param ab Any (vector of) text that can be coerced to a valid antimicrobial drug code with [as.ab()].
#' @param administration Route of administration, either `r vector_or(dosage$administration, documentation = TRUE)`. #' @param administration Route of administration, either `r vector_or(dosage$administration, documentation = TRUE)`.
#' @param only_sir_columns A [logical] to indicate whether only antimicrobial columns must be included that were transformed to class [sir][as.sir()] on beforehand. Defaults to `FALSE` if no columns of `x` have a class [sir][as.sir()]. #' @param only_sir_columns A [logical] to indicate whether only antimicrobial columns must be included that were transformed to class [sir][as.sir()] on beforehand. Defaults to `FALSE` if no columns of `x` have a class [sir][as.sir()].
#' @param custom_rules Custom rules to apply, created with [custom_eucast_rules()]. #' @param custom_rules Custom rules to apply, created with [custom_interpretive_rules()].
#' @param overwrite A [logical] indicating whether to overwrite existing SIR values (default: `FALSE`). When `FALSE`, only non-SIR values are modified (i.e., any value that is not already S, I or R). To ensure compliance with EUCAST guidelines, **this should remain** `FALSE`, as EUCAST notes often state that an organism "should be tested for susceptibility to individual agents or be reported resistant". #' @param overwrite A [logical] indicating whether to overwrite existing SIR values (default: `FALSE`). When `FALSE`, only non-SIR values are modified (i.e., any value that is not already S, I or R). To ensure compliance with EUCAST guidelines, **this should remain** `FALSE`, as EUCAST notes often state that an organism "should be tested for susceptibility to individual agents or be reported resistant".
#' @param add_if_missing A [logical] indicating whether rules should also be applied to missing (`NA`) values (default: `TRUE`). When `FALSE`, rules are only applied to cells that already contain an SIR value; cells with `NA` are left untouched. This is particularly useful when using `overwrite = TRUE` with custom rules and you want to update reported results without imputing values for untested drugs. #' @param add_if_missing A [logical] indicating whether rules should also be applied to missing (`NA`) values (default: `TRUE`). When `FALSE`, rules are only applied to cells that already contain an SIR value; cells with `NA` are left untouched. This is particularly useful when using `overwrite = TRUE` with custom rules and you want to update reported results without imputing values for untested drugs.
#' @inheritParams first_isolate #' @inheritParams first_isolate
@@ -80,17 +80,17 @@ format_eucast_version_nr <- function(version, markdown = TRUE) {
#' **Note:** This function does not translate MIC or disk values to SIR values. Use [as.sir()] for that. \cr #' **Note:** This function does not translate MIC or disk values to SIR values. Use [as.sir()] for that. \cr
#' **Note:** When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr #' **Note:** When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr
#' #'
#' The file containing all EUCAST rules is located here: <https://github.com/msberends/AMR/blob/main/data-raw/eucast_rules.tsv>. **Note:** Old taxonomic names are replaced with the current taxonomy where applicable. For example, *Ochrobactrum anthropi* was renamed to *Brucella anthropi* in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The `AMR` package contains the full microbial taxonomy updated until `r documentation_date(max(TAXONOMY_VERSION$GBIF$accessed_date, TAXONOMY_VERSION$LPSN$accessed_date))`, see [microorganisms]. #' The file containing all interpretive rules is located here: <https://github.com/msberends/AMR/blob/main/data-raw/interpretive_rules.tsv>. **Note:** Old taxonomic names are replaced with the current taxonomy where applicable. For example, *Ochrobactrum anthropi* was renamed to *Brucella anthropi* in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The `AMR` package contains the full microbial taxonomy updated until `r documentation_date(max(TAXONOMY_VERSION$GBIF$accessed_date, TAXONOMY_VERSION$LPSN$accessed_date))`, see [microorganisms].
#' #'
#' ### Custom Rules #' ### Custom Rules
#' #'
#' Custom rules can be created using [custom_eucast_rules()], e.g.: #' Custom rules can be created using [custom_interpretive_rules()], e.g.:
#' #'
#' ```r #' ```r
#' x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", #' x <- custom_interpretive_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
#' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I") #' AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I")
#' #'
#' eucast_rules(example_isolates, rules = "custom", custom_rules = x) #' interpretive_rules(example_isolates, rules = "custom", custom_rules = x)
#' ``` #' ```
#' #'
#' ### 'Other' Rules #' ### 'Other' Rules
@@ -102,7 +102,7 @@ format_eucast_version_nr <- function(version, markdown = TRUE) {
#' #'
#' Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set. #' Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set.
#' #'
#' Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include `"other"` to the `rules` argument, or use `eucast_rules(..., rules = "all")`. You can also set the package option [`AMR_interpretive_rules`][AMR-options], i.e. run `options(AMR_interpretive_rules = "all")`. #' Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include `"other"` to the `rules` argument, or use `interpretive_rules(..., rules = "all")`. You can also set the package option [`AMR_interpretive_rules`][AMR-options], i.e. run `options(AMR_interpretive_rules = "all")`.
#' @aliases EUCAST #' @aliases EUCAST
#' @rdname interpretive_rules #' @rdname interpretive_rules
#' @export #' @export
@@ -184,7 +184,7 @@ interpretive_rules <- function(x,
meet_criteria(version_expertrules, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_EXPERT_RULES))) meet_criteria(version_expertrules, allow_class = c("numeric", "integer"), has_length = 1, is_in = as.double(names(EUCAST_VERSION_EXPERT_RULES)))
meet_criteria(ampc_cephalosporin_resistance, allow_class = c("logical", "character", "sir"), has_length = 1, allow_NA = TRUE, allow_NULL = TRUE) meet_criteria(ampc_cephalosporin_resistance, allow_class = c("logical", "character", "sir"), has_length = 1, allow_NA = TRUE, allow_NULL = TRUE)
meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1) meet_criteria(only_sir_columns, allow_class = "logical", has_length = 1)
meet_criteria(custom_rules, allow_class = "custom_eucast_rules", allow_NULL = TRUE) meet_criteria(custom_rules, allow_class = c("custom_interpretive_rules", "custom_eucast_rules"), allow_NULL = TRUE)
meet_criteria(overwrite, allow_class = "logical", has_length = 1) meet_criteria(overwrite, allow_class = "logical", has_length = 1)
meet_criteria(add_if_missing, allow_class = "logical", has_length = 1) meet_criteria(add_if_missing, allow_class = "logical", has_length = 1)
@@ -193,11 +193,6 @@ interpretive_rules <- function(x,
"Either set {.arg overwrite} or {.arg add_if_missing} to {.code TRUE}, or both." "Either set {.arg overwrite} or {.arg add_if_missing} to {.code TRUE}, or both."
) )
stop_if(
guideline == "CLSI",
"CLSI guideline is not yet supported."
)
stop_if( stop_if(
!is.na(ampc_cephalosporin_resistance) && !any(c("expert", "all") %in% rules), !is.na(ampc_cephalosporin_resistance) && !any(c("expert", "all") %in% rules),
"For the {.arg ampc_cephalosporin_resistance} argument to work, the {.arg rules} argument must contain {.code \"expert\"} or {.code \"all\"}." "For the {.arg ampc_cephalosporin_resistance} argument to work, the {.arg rules} argument must contain {.code \"expert\"} or {.code \"all\"}."
@@ -205,8 +200,14 @@ interpretive_rules <- function(x,
add_MO_lookup_to_AMR_env() add_MO_lookup_to_AMR_env()
if (guideline %like% "EUCAST") {
guideline <- "EUCAST"
} else if (guideline %like% "CLSI") {
guideline <- "CLSI"
}
if ("custom" %in% rules && is.null(custom_rules)) { if ("custom" %in% rules && is.null(custom_rules)) {
warning_("in {.help [{.fun eucast_rules}](AMR::eucast_rules)}: no custom rules were set with the {.arg custom_rules} argument", warning_("in {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}: no custom rules were set with the {.arg custom_rules} argument",
immediate = TRUE immediate = TRUE
) )
rules <- rules[rules != "custom"] rules <- rules[rules != "custom"]
@@ -229,13 +230,13 @@ interpretive_rules <- function(x,
if (interactive() && isTRUE(verbose) && isTRUE(info)) { if (interactive() && isTRUE(verbose) && isTRUE(info)) {
txt <- paste0( txt <- paste0(
"WARNING: In Verbose mode, the eucast_rules() function does not apply rules to the data, but instead returns a data set in logbook form with comprehensive info about which rows and columns would be effected and in which way.", "WARNING: In Verbose mode, the interpretive_rules() function does not apply rules to the data, but instead returns a data set in logbook form with comprehensive info about which rows and columns would be effected and in which way.",
"\n\nThis may overwrite your existing data if you use e.g.:", "\n\nThis may overwrite your existing data if you use e.g.:",
"\ndata <- eucast_rules(data, verbose = TRUE)\n\nDo you want to continue?" "\ndata <- interpretive_rules(data, verbose = TRUE)\n\nDo you want to continue?"
) )
showQuestion <- import_fn("showQuestion", "rstudioapi", error_on_fail = FALSE) showQuestion <- import_fn("showQuestion", "rstudioapi", error_on_fail = FALSE)
if (!is.null(showQuestion)) { if (!is.null(showQuestion)) {
q_continue <- showQuestion("Using verbose = TRUE with eucast_rules()", txt) q_continue <- showQuestion("Using verbose = TRUE with interpretive_rules()", txt)
} else { } else {
q_continue <- utils::menu(choices = c("OK", "Cancel"), graphics = FALSE, title = txt) q_continue <- utils::menu(choices = c("OK", "Cancel"), graphics = FALSE, title = txt)
} }
@@ -330,7 +331,7 @@ interpretive_rules <- function(x,
verbose = verbose, verbose = verbose,
info = info, info = info,
only_sir_columns = only_sir_columns, only_sir_columns = only_sir_columns,
fn = "eucast_rules", fn = "interpretive_rules",
... ...
) )
@@ -489,7 +490,7 @@ interpretive_rules <- function(x,
"Rules by the ", "Rules by the ",
font_bold(paste0("AMR package v", utils::packageDescription("AMR")$Version)), font_bold(paste0("AMR package v", utils::packageDescription("AMR")$Version)),
" (", format(as.Date(utils::packageDescription("AMR")$Date), format = "%Y"), " (", format(as.Date(utils::packageDescription("AMR")$Date), format = "%Y"),
"), see {.help [{.fun eucast_rules}](AMR::eucast_rules)}\n" "), see {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}\n"
) )
)) ))
cat("\n\n") cat("\n\n")
@@ -611,59 +612,62 @@ interpretive_rules <- function(x,
if (!any(c("all", "custom") %in% rules) && !is.null(custom_rules)) { if (!any(c("all", "custom") %in% rules) && !is.null(custom_rules)) {
if (isTRUE(info)) { if (isTRUE(info)) {
message_("Skipping custom EUCAST rules, since the {.arg rules} argument does not contain {.code \"custom\"}.") message_("Skipping custom interpretive rules, since the {.arg rules} argument does not contain {.code \"custom\"}.")
} }
custom_rules <- NULL custom_rules <- NULL
} }
# >>> Apply Official EUCAST rules <<< --------------------------------------------------- # >>> Apply Official interpretive rules <<< ---------------------------------------------------
eucast_notification_shown <- FALSE eucast_notification_shown <- FALSE
if (!is.null(list(...)$eucast_rules_df)) { if (!is.null(list(...)$interpretive_rules_df)) {
# this allows: eucast_rules(x, eucast_rules_df = AMR:::EUCAST_RULES_DF |> filter(is.na(have_these_values))) # this allows: interpretive_rules(x, interpretive_rules_df = AMR:::INTERPRETIVE_RULES_DF |> filter(is.na(have_these_values)))
eucast_rules_df_total <- list(...)$eucast_rules_df interpretive_rules_df_total <- list(...)$interpretive_rules_df
} else if (!is.null(list(...)$eucast_rules_df)) {
# deprecated parameter name kept for backward compatibility
interpretive_rules_df_total <- list(...)$eucast_rules_df
} else { } else {
# otherwise internal data file, created in data-raw/_pre_commit_checks.R # internal data file, created in data-raw/_pre_commit_checks.R
eucast_rules_df_total <- EUCAST_RULES_DF interpretive_rules_df_total <- INTERPRETIVE_RULES_DF
} }
## filter on user-set guideline versions ---- ## filter on guideline provider and user-set guideline versions ----
eucast_rules_df <- data.frame() interpretive_rules_df <- data.frame()
if (any(c("all", "breakpoints") %in% rules)) { if (any(c("all", "breakpoints") %in% rules)) {
eucast_rules_df <- eucast_rules_df %pm>% interpretive_rules_df <- interpretive_rules_df %pm>%
rbind_AMR(eucast_rules_df_total %pm>% rbind_AMR(interpretive_rules_df_total %pm>%
subset(reference.rule_group %like% "breakpoint" & reference.version == version_breakpoints)) subset(rule.provider == guideline & reference.rule_group %like% "breakpoint" & reference.version == version_breakpoints))
} }
if (any(c("all", "expected_phenotypes") %in% rules)) { if (any(c("all", "expected_phenotypes") %in% rules)) {
eucast_rules_df <- eucast_rules_df %pm>% interpretive_rules_df <- interpretive_rules_df %pm>%
rbind_AMR(eucast_rules_df_total %pm>% rbind_AMR(interpretive_rules_df_total %pm>%
subset(reference.rule_group %like% "expected" & reference.version == version_expected_phenotypes)) subset(rule.provider == guideline & reference.rule_group %like% "expected" & reference.version == version_expected_phenotypes))
} }
if (any(c("all", "expert") %in% rules)) { if (any(c("all", "expert") %in% rules)) {
eucast_rules_df <- eucast_rules_df %pm>% interpretive_rules_df <- interpretive_rules_df %pm>%
rbind_AMR(eucast_rules_df_total %pm>% rbind_AMR(interpretive_rules_df_total %pm>%
subset(reference.rule_group %like% "expert" & reference.version == version_expertrules)) subset(rule.provider == guideline & reference.rule_group %like% "expert" & reference.version == version_expertrules))
} }
## filter out AmpC de-repressed cephalosporin-resistant mutants ---- ## filter out AmpC de-repressed cephalosporin-resistant mutants ----
# no need to filter on version number here - the rules contain these version number, so are inherently filtered # no need to filter on version number here - the rules contain these version number, so are inherently filtered
# cefotaxime, ceftriaxone, ceftazidime # cefotaxime, ceftriaxone, ceftazidime
if (is.null(ampc_cephalosporin_resistance) || isFALSE(ampc_cephalosporin_resistance)) { if (is.null(ampc_cephalosporin_resistance) || isFALSE(ampc_cephalosporin_resistance)) {
eucast_rules_df <- subset( interpretive_rules_df <- subset(
eucast_rules_df, interpretive_rules_df,
reference.rule %unlike% "ampc" reference.rule %unlike% "ampc"
) )
} else { } else {
if (isTRUE(ampc_cephalosporin_resistance)) { if (isTRUE(ampc_cephalosporin_resistance)) {
ampc_cephalosporin_resistance <- "R" ampc_cephalosporin_resistance <- "R"
} }
if (!is.null(eucast_rules_df$reference.rule)) { if (!is.null(interpretive_rules_df$reference.rule)) {
eucast_rules_df[which(eucast_rules_df$reference.rule %like% "ampc"), "to_value"] <- as.character(ampc_cephalosporin_resistance) interpretive_rules_df[which(interpretive_rules_df$reference.rule %like% "ampc"), "to_value"] <- as.character(ampc_cephalosporin_resistance)
} }
} }
# sometimes, the screenings are missing but the names are actually available # sometimes, the screenings are missing but the names are actually available
# we only hints on remaining rows in `eucast_rules_df` # we only hints on remaining rows in `interpretive_rules_df`
screening_abx <- as.character(AMR::antimicrobials$ab[which(AMR::antimicrobials$ab %like% "-S$")]) screening_abx <- as.character(AMR::antimicrobials$ab[which(AMR::antimicrobials$ab %like% "-S$")])
screening_abx <- screening_abx[screening_abx %in% unique(unlist(strsplit(EUCAST_RULES_DF$and_these_antibiotics[!is.na(EUCAST_RULES_DF$and_these_antibiotics)], ", *")))] screening_abx <- screening_abx[screening_abx %in% unique(unlist(strsplit(interpretive_rules_df_total$and_these_antibiotics[!is.na(interpretive_rules_df_total$and_these_antibiotics)], ", *")))]
if (isTRUE(info)) { if (isTRUE(info)) {
cat("\n") cat("\n")
} }
@@ -682,12 +686,12 @@ interpretive_rules <- function(x,
} }
## Go over all rules and apply them ---- ## Go over all rules and apply them ----
for (i in seq_len(nrow(eucast_rules_df))) { for (i in seq_len(nrow(interpretive_rules_df))) {
rule_previous <- eucast_rules_df[max(1, i - 1), "reference.rule", drop = TRUE] rule_previous <- interpretive_rules_df[max(1, i - 1), "reference.rule", drop = TRUE]
rule_current <- eucast_rules_df[i, "reference.rule", drop = TRUE] rule_current <- interpretive_rules_df[i, "reference.rule", drop = TRUE]
rule_next <- eucast_rules_df[min(nrow(eucast_rules_df), i + 1), "reference.rule", drop = TRUE] rule_next <- interpretive_rules_df[min(nrow(interpretive_rules_df), i + 1), "reference.rule", drop = TRUE]
rule_group_previous <- eucast_rules_df[max(1, i - 1), "reference.rule_group", drop = TRUE] rule_group_previous <- interpretive_rules_df[max(1, i - 1), "reference.rule_group", drop = TRUE]
rule_group_current <- eucast_rules_df[i, "reference.rule_group", drop = TRUE] rule_group_current <- interpretive_rules_df[i, "reference.rule_group", drop = TRUE]
# don't apply rules if user doesn't want to apply them # don't apply rules if user doesn't want to apply them
if (rule_group_current %like% "breakpoint" && !any(c("all", "breakpoints") %in% rules)) { if (rule_group_current %like% "breakpoint" && !any(c("all", "breakpoints") %in% rules)) {
next next
@@ -702,16 +706,16 @@ interpretive_rules <- function(x,
if (isFALSE(info) || isFALSE(verbose)) { if (isFALSE(info) || isFALSE(verbose)) {
rule_text <- "" rule_text <- ""
} else { } else {
if (is.na(eucast_rules_df[i, "and_these_antibiotics", drop = TRUE])) { if (is.na(interpretive_rules_df[i, "and_these_antibiotics", drop = TRUE])) {
rule_text <- paste0("always report as '", eucast_rules_df[i, "to_value", drop = TRUE], "': ", get_antibiotic_names(eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE])) rule_text <- paste0("always report as '", interpretive_rules_df[i, "to_value", drop = TRUE], "': ", get_antibiotic_names(interpretive_rules_df[i, "then_change_these_antibiotics", drop = TRUE]))
} else { } else {
rule_text <- paste0( rule_text <- paste0(
"report as '", eucast_rules_df[i, "to_value", drop = TRUE], "' when ", "report as '", interpretive_rules_df[i, "to_value", drop = TRUE], "' when ",
format_antibiotic_names( format_antibiotic_names(
ab_names = get_antibiotic_names(eucast_rules_df[i, "and_these_antibiotics", drop = TRUE]), ab_names = get_antibiotic_names(interpretive_rules_df[i, "and_these_antibiotics", drop = TRUE]),
ab_results = eucast_rules_df[i, "have_these_values", drop = TRUE] ab_results = interpretive_rules_df[i, "have_these_values", drop = TRUE]
), ": ", ), ": ",
get_antibiotic_names(eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE]) get_antibiotic_names(interpretive_rules_df[i, "then_change_these_antibiotics", drop = TRUE])
) )
} }
} }
@@ -720,7 +724,7 @@ interpretive_rules <- function(x,
rule_previous <- "" rule_previous <- ""
rule_group_previous <- "" rule_group_previous <- ""
} }
if (i == nrow(eucast_rules_df)) { if (i == nrow(interpretive_rules_df)) {
rule_next <- "" rule_next <- ""
} }
@@ -789,13 +793,13 @@ interpretive_rules <- function(x,
} }
## Get rule from file ------------------------------------------------------ ## Get rule from file ------------------------------------------------------
if_mo_property <- trimws(eucast_rules_df[i, "if_mo_property", drop = TRUE]) if_mo_property <- trimws(interpretive_rules_df[i, "if_mo_property", drop = TRUE])
like_is_one_of <- trimws(eucast_rules_df[i, "like.is.one_of", drop = TRUE]) like_is_one_of <- trimws(interpretive_rules_df[i, "like.is.one_of", drop = TRUE])
mo_value <- trimws(eucast_rules_df[i, "this_value", drop = TRUE]) mo_value <- trimws(interpretive_rules_df[i, "this_value", drop = TRUE])
source_antibiotics <- eucast_rules_df[i, "and_these_antibiotics", drop = TRUE] source_antibiotics <- interpretive_rules_df[i, "and_these_antibiotics", drop = TRUE]
source_value <- trimws(unlist(strsplit(eucast_rules_df[i, "have_these_values", drop = TRUE], ",", fixed = TRUE))) source_value <- trimws(unlist(strsplit(interpretive_rules_df[i, "have_these_values", drop = TRUE], ",", fixed = TRUE)))
target_antibiotics <- eucast_rules_df[i, "then_change_these_antibiotics", drop = TRUE] target_antibiotics <- interpretive_rules_df[i, "then_change_these_antibiotics", drop = TRUE]
target_value <- eucast_rules_df[i, "to_value", drop = TRUE] target_value <- interpretive_rules_df[i, "to_value", drop = TRUE]
# if amo_value contains a group name, expand that name with all species in it # if amo_value contains a group name, expand that name with all species in it
if (any(trimws(strsplit(mo_value, ",")[[1]]) %in% AMR::microorganisms.groups$mo_group_name, na.rm = TRUE)) { if (any(trimws(strsplit(mo_value, ",")[[1]]) %in% AMR::microorganisms.groups$mo_group_name, na.rm = TRUE)) {
@@ -894,7 +898,7 @@ interpretive_rules <- function(x,
if (!is.null(custom_rules)) { if (!is.null(custom_rules)) {
if (isTRUE(info)) { if (isTRUE(info)) {
cat("\n") cat("\n")
cat(font_bold("Custom EUCAST rules, set by user"), "\n") cat(font_bold("Custom interpretive rules, set by user"), "\n")
} }
for (i in seq_len(length(custom_rules))) { for (i in seq_len(length(custom_rules))) {
rule <- custom_rules[[i]] rule <- custom_rules[[i]]
@@ -929,8 +933,8 @@ interpretive_rules <- function(x,
to = target_value, to = target_value,
rule = c( rule = c(
rule_text, rule_text,
"Custom EUCAST rules", "Custom interpretive rules",
paste0("Custom EUCAST rule ", i), paste0("Custom interpretive rule ", i),
paste0( paste0(
"Object '", deparse(substitute(custom_rules)), "Object '", deparse(substitute(custom_rules)),
"' consisting of ", length(custom_rules), " custom rules" "' consisting of ", length(custom_rules), " custom rules"
@@ -1075,7 +1079,7 @@ interpretive_rules <- function(x,
warn_lacking_sir_class <- warn_lacking_sir_class[order(colnames(x.bak))] warn_lacking_sir_class <- warn_lacking_sir_class[order(colnames(x.bak))]
warn_lacking_sir_class <- warn_lacking_sir_class[!is.na(warn_lacking_sir_class)] warn_lacking_sir_class <- warn_lacking_sir_class[!is.na(warn_lacking_sir_class)]
warning_( warning_(
"in {.help [{.fun eucast_rules}](AMR::eucast_rules)}: not all columns with antimicrobial results are of class {.cls sir}. Transform them on beforehand, e.g.:\n\n", "in {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}: not all columns with antimicrobial results are of class {.cls sir}. Transform them on beforehand, e.g.:\n\n",
"\u00a0\u00a0", AMR_env$bullet_icon, " ", highlight_code(paste0(x_deparsed, " |> as.sir(", ifelse(length(warn_lacking_sir_class) == 1, "\u00a0\u00a0", AMR_env$bullet_icon, " ", highlight_code(paste0(x_deparsed, " |> as.sir(", ifelse(length(warn_lacking_sir_class) == 1,
warn_lacking_sir_class, warn_lacking_sir_class,
paste0(warn_lacking_sir_class[1], ":", warn_lacking_sir_class[length(warn_lacking_sir_class)]) paste0(warn_lacking_sir_class[1], ":", warn_lacking_sir_class[length(warn_lacking_sir_class)])
@@ -1177,7 +1181,7 @@ edit_sir <- function(x,
new_edits[rows, cols] == "NS") new_edits[rows, cols] == "NS")
non_SIR <- !isSIR non_SIR <- !isSIR
if (isFALSE(overwrite) && any(isSIR) && message_not_thrown_before("edit_sir.warning_overwrite")) { if (isFALSE(overwrite) && any(isSIR) && message_not_thrown_before("edit_sir.warning_overwrite")) {
warning_("in {.help [{.fun eucast_rules}](AMR::eucast_rules)}: some columns had SIR values which were not overwritten, since {.code overwrite = FALSE}.") warning_("in {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}: some columns had SIR values which were not overwritten, since {.code overwrite = FALSE}.")
} }
# determine which cells to modify based on overwrite and add_if_missing # determine which cells to modify based on overwrite and add_if_missing
if (isTRUE(overwrite)) { if (isTRUE(overwrite)) {
@@ -1211,7 +1215,7 @@ edit_sir <- function(x,
}) })
suppressWarnings(do_assign()) suppressWarnings(do_assign())
warning_( warning_(
"in {.help [{.fun eucast_rules}](AMR::eucast_rules)}: value \"", to, "\" added to the factor levels of column", "in {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}: value \"", to, "\" added to the factor levels of column",
ifelse(length(cols) == 1, "", "s"), ifelse(length(cols) == 1, "", "s"),
" ", vector_and(cols, quotes = "`", sort = FALSE), " ", vector_and(cols, quotes = "`", sort = FALSE),
" because this value was not an existing factor level." " because this value was not an existing factor level."
@@ -1219,7 +1223,7 @@ edit_sir <- function(x,
txt_warning() txt_warning()
warned <<- FALSE warned <<- FALSE
} else { } else {
warning_("in {.help [{.fun eucast_rules}](AMR::eucast_rules)}: ", w$message) warning_("in {.help [{.fun interpretive_rules}](AMR::interpretive_rules)}: ", w$message)
txt_warning() txt_warning()
} }
}, },

0
R/mic.R Normal file → Executable file
View File

43
R/mo.R
View File

@@ -322,6 +322,15 @@ as.mo <- function(x,
return(as.character(MO_lookup_current$mo[match(x_out, MO_lookup_current$fullname_lower)])) return(as.character(MO_lookup_current$mo[match(x_out, MO_lookup_current$fullname_lower)]))
} }
# Issue #287: "X complex" is not a distinct taxon - strip " complex" and try "X"
if (grepl(" complex$", x_out, ignore.case = FALSE)) {
x_out <- sub(" complex$", "", x_out)
x_search_cleaned <- sub(" [Cc]omplex$", "", x_search_cleaned)
if (x_out %in% MO_lookup_current$fullname_lower) {
return(as.character(MO_lookup_current$mo[match(x_out, MO_lookup_current$fullname_lower)]))
}
}
# input must not be too short # input must not be too short
if (nchar(x_out) < 3) { if (nchar(x_out) < 3) {
return("UNKNOWN") return("UNKNOWN")
@@ -343,6 +352,18 @@ as.mo <- function(x,
(MO_lookup_current$species_first == substr(x_parts[2], 1, 1) | (MO_lookup_current$species_first == substr(x_parts[2], 1, 1) |
MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) | MO_lookup_current$subspecies_first == substr(x_parts[2], 1, 1) |
MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1))) MO_lookup_current$subspecies_first == substr(x_parts[3], 1, 1)))
# Issue #288: if the species (and subspecies) word(s) in the input exactly match
# exactly one candidate, use only that candidate and bypass the 0.55 cutoff.
# This prevents prevalent bacteria from outranking a rarer organism whose species
# epithet is an unambiguous exact match, e.g. "S. apiospermum" → Scedosporium.
sp_exact <- tolower(MO_lookup_current$species[filtr]) == x_parts[2]
if (length(x_parts) == 3) {
sp_exact <- sp_exact & tolower(MO_lookup_current$subspecies[filtr]) == x_parts[3]
}
if (sum(sp_exact) == 1) {
filtr <- filtr[sp_exact]
minimum_matching_score <- 0
}
} else { } else {
filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) | filtr <- which(MO_lookup_current$full_first == substr(x_parts[1], 1, 1) |
MO_lookup_current$species_first == substr(x_parts[2], 1, 1) | MO_lookup_current$species_first == substr(x_parts[2], 1, 1) |
@@ -1002,17 +1023,19 @@ print.mo_uncertainties <- function(x, n = 10, ...) {
message_(out2, as_note = FALSE) message_(out2, as_note = FALSE)
} }
other_matches <- paste0( if (x[i, ]$candidates != "") {
"Also matched: ", other_matches <- paste0(
vector_and( "Also matched: ",
paste0( vector_and(
candidates_formatted, paste0(
font_blue(paste0(" (", scores_formatted, ")"), collapse = NULL) candidates_formatted,
), font_blue(paste0(" (", scores_formatted, ")"), collapse = NULL)
quotes = FALSE, sort = FALSE ),
quotes = FALSE, sort = FALSE
)
) )
) message_(other_matches, as_note = FALSE)
message_(other_matches, as_note = FALSE) }
} }
if (isTRUE(any_maxed_out)) { if (isTRUE(any_maxed_out)) {

View File

@@ -42,21 +42,23 @@
#' - `mo_ref("Enterobacter aerogenes")` will return `"Tindall et al., 2017"` (with a note about the renaming) #' - `mo_ref("Enterobacter aerogenes")` will return `"Tindall et al., 2017"` (with a note about the renaming)
#' - `mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)` will return `"Hormaeche et al., 1960"` (with a once-per-session warning that the name is outdated) #' - `mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)` will return `"Hormaeche et al., 1960"` (with a once-per-session warning that the name is outdated)
#' #'
#' The short name ([mo_shortname()]) returns the first character of the genus and the full species, such as `"E. coli"`, for species and subspecies. Exceptions are abbreviations of staphylococci (such as *"CoNS"*, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as *"GBS"*, Group B Streptococci). Please bear in mind that e.g. *E. coli* could mean *Escherichia coli* (kingdom of Bacteria) as well as *Entamoeba coli* (kingdom of Protozoa). Returning to the full name will be done using [as.mo()] internally, giving priority to bacteria and human pathogens, i.e. `"E. coli"` will be considered *Escherichia coli*. As a result, `mo_fullname(mo_shortname("Entamoeba coli"))` returns `"Escherichia coli"`. #' [mo_ref()] returns the abbreviated authority of the nomenclatural act that created the queried name combination. When `keep_synonyms = FALSE` (default), this is the authority of the currently accepted name. When `keep_synonyms = TRUE`, this is the authority under which the queried (possibly outdated) name was published. Emendations (changes to the species description without a name change) are not reflected; only the combination or original description authority is returned.
#'
#' The short name ([mo_shortname()]) returns the first character of the genus and the full species, such as `"E. coli"`, for species and subspecies. Exceptions are abbreviations of staphylococci (such as *"CoNS"*, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as *"GBS"*, Group B Streptococci). Please bear in mind that e.g. *E. coli* could mean *Escherichia coli* (kingdom of Bacteria) as well as *Entamoeba coli* (kingdom of Protozoa). Returning to the full name will be done using [as.mo()] internally, giving priority to bacteria and human pathogens, i.e. `"E. coli"` will always be considered *Escherichia coli*. As a result, `mo_fullname(mo_shortname("Entamoeba coli"))` returns `"Escherichia coli"`.
#' #'
#' Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions [mo_kingdom()] and [mo_domain()] return the exact same results. #' Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions [mo_kingdom()] and [mo_domain()] return the exact same results.
#' #'
#' Determination of human pathogenicity ([mo_pathogenicity()]) is strongly based on Bartlett *et al.* (2022, \doi{10.1099/mic.0.001269}). This function returns a [factor] with the levels *Pathogenic*, *Potentially pathogenic*, *Non-pathogenic*, and *Unknown*. #' Determination of human pathogenicity ([mo_pathogenicity()]) is strongly based on Bartlett *et al.* (2022, \doi{10.1099/mic.0.001269}). This function returns a [factor] with the levels *Pathogenic*, *Potentially pathogenic*, *Non-pathogenic*, and *Unknown*.
#' #'
#' Determination of the Gram stain ([mo_gramstain()]) will be based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, [PMID 11837318](https://pubmed.ncbi.nlm.nih.gov/11837318/)), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, [PMID 34694987](https://pubmed.ncbi.nlm.nih.gov/34694987/)). Bacteria in these phyla are considered Gram-positive in this `AMR` package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value `NA`. Functions [mo_is_gram_negative()] and [mo_is_gram_positive()] always return `TRUE` or `FALSE` (or `NA` when the input is `NA` or the MO code is `UNKNOWN`), thus always return `FALSE` for species outside the taxonomic kingdom of Bacteria. #' Determination of the Gram stain ([mo_gramstain()] is based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, [PMID 11837318](https://pubmed.ncbi.nlm.nih.gov/11837318/)), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, [PMID 34694987](https://pubmed.ncbi.nlm.nih.gov/34694987/)). Bacteria in these phyla are considered Gram-positive in this `AMR` package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value `NA`. Functions [mo_is_gram_negative()] and [mo_is_gram_positive()] always return `TRUE` or `FALSE` (or `NA` when the input is `NA` or the MO code is `UNKNOWN`), thus always return `FALSE` for species outside the taxonomic kingdom of Bacteria.
#' #'
#' Determination of yeasts ([mo_is_yeast()]) will be based on the taxonomic kingdom and class. *Budding yeasts* are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. *True yeasts* quite specifically refers to yeasts in the underlying order Saccharomycetales (such as *Saccharomyces cerevisiae*). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return `TRUE`. It returns `FALSE` otherwise (or `NA` when the input is `NA` or the MO code is `UNKNOWN`). #' Determination of yeasts ([mo_is_yeast()]) is based on the taxonomic kingdom and class. *Budding yeasts* are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. *True yeasts* quite specifically refers to yeasts in the underlying order Saccharomycetales (such as *Saccharomyces cerevisiae*). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return `TRUE`. It returns `FALSE` otherwise (or `NA` when the input is `NA` or the MO code is `UNKNOWN`).
#' #'
#' Determination of intrinsic resistance ([mo_is_intrinsic_resistant()]) will be based on the [intrinsic_resistant] data set, which is based on `r format_eucast_version_nr(names(EUCAST_VERSION_EXPECTED_PHENOTYPES[1]))`. The [mo_is_intrinsic_resistant()] function can be vectorised over both argument `x` (input for microorganisms) and `ab` (input for antimicrobials). #' Determination of intrinsic resistance ([mo_is_intrinsic_resistant()]) is based on the [intrinsic_resistant] data set, which is based on `r format_eucast_version_nr(names(EUCAST_VERSION_EXPECTED_PHENOTYPES[1]))`. The [mo_is_intrinsic_resistant()] function can be vectorised over both argument `x` (input for microorganisms) and `ab` (input for antimicrobials).
#' #'
#' Determination of bacterial oxygen tolerance ([mo_oxygen_tolerance()]) will be based on BacDive, see *Source*. The function [mo_is_anaerobic()] only returns `TRUE` if the oxygen tolerance is `"anaerobe"`, indicting an obligate anaerobic species or genus. It always returns `FALSE` for species outside the taxonomic kingdom of Bacteria. #' Determination of both bacterial oxygen tolerance ([mo_oxygen_tolerance()]) and morphology ([mo_morphology()]) are based on BacDive, see *Source*. The function [mo_is_anaerobic()] only returns `TRUE` if the oxygen tolerance is `"anaerobe"`, indicating an obligate anaerobic species or genus. It always returns `FALSE` for species outside the taxonomic kingdom of Bacteria.
#' #'
#' The function [mo_url()] will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. [This MycoBank URL](`r TAXONOMY_VERSION$MycoBank$url`) will be used for fungi wherever available , [this LPSN URL](`r TAXONOMY_VERSION$MycoBank$url`) for bacteria wherever available, and [this GBIF link](`r TAXONOMY_VERSION$GBIF$url`) otherwise. #' The function [mo_url()] will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. [This MycoBank URL](`r TAXONOMY_VERSION$MycoBank$url`) is used for fungi wherever available , [this LPSN URL](`r TAXONOMY_VERSION$MycoBank$url`) for bacteria wherever available, and [this GBIF link](`r TAXONOMY_VERSION$GBIF$url`) otherwise.
#' #'
#' SNOMED codes ([mo_snomed()]) was last updated on `r documentation_date(TAXONOMY_VERSION$SNOMED$accessed_date)`. See *Source* and the [microorganisms] data set for more info. #' SNOMED codes ([mo_snomed()]) was last updated on `r documentation_date(TAXONOMY_VERSION$SNOMED$accessed_date)`. See *Source* and the [microorganisms] data set for more info.
#' #'
@@ -100,8 +102,10 @@
#' #'
#' # other properties --------------------------------------------------------- #' # other properties ---------------------------------------------------------
#' #'
#' mo_pathogenicity("Klebsiella pneumoniae") #' mo_morphology("Klebsiella pneumoniae")
#' mo_gramstain("Klebsiella pneumoniae") #' mo_gramstain("Klebsiella pneumoniae")
#' mo_gramstain("Klebsiella pneumoniae", add_morphology = TRUE)
#' mo_pathogenicity("Klebsiella pneumoniae")
#' mo_snomed("Klebsiella pneumoniae") #' mo_snomed("Klebsiella pneumoniae")
#' mo_type("Klebsiella pneumoniae") #' mo_type("Klebsiella pneumoniae")
#' mo_rank("Klebsiella pneumoniae") #' mo_rank("Klebsiella pneumoniae")
@@ -460,8 +464,9 @@ mo_pathogenicity <- function(x, language = get_AMR_locale(), keep_synonyms = get
} }
#' @rdname mo_property #' @rdname mo_property
#' @param add_morphology a [logical] to indicate whether the morphology (from [mo_morphology()]) should be added to the Gram stain result, e.g. `"Gram-negative rods"` instead of `"Gram-negative"`. The default is `FALSE`.
#' @export #' @export
mo_gramstain <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { mo_gramstain <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), add_morphology = FALSE, ...) {
if (missing(x)) { if (missing(x)) {
# this tries to find the data and an 'mo' column # this tries to find the data and an 'mo' column
x <- find_mo_col(fn = "mo_gramstain") x <- find_mo_col(fn = "mo_gramstain")
@@ -469,6 +474,7 @@ mo_gramstain <- function(x, language = get_AMR_locale(), keep_synonyms = getOpti
meet_criteria(x, allow_NA = TRUE) meet_criteria(x, allow_NA = TRUE)
language <- validate_language(language) language <- validate_language(language)
meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1) meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1)
meet_criteria(add_morphology, allow_class = "logical", has_length = 1)
x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...) x.mo <- as.mo(x, language = language, keep_synonyms = keep_synonyms, ...)
metadata <- get_mo_uncertainties() metadata <- get_mo_uncertainties()
@@ -494,6 +500,12 @@ mo_gramstain <- function(x, language = get_AMR_locale(), keep_synonyms = getOpti
# and of course our own ID for Gram-positives # and of course our own ID for Gram-positives
| x.mo %in% c("B_GRAMP", "B_ANAER-POS")] <- "Gram-positive" | x.mo %in% c("B_GRAMP", "B_ANAER-POS")] <- "Gram-positive"
if (isTRUE(add_morphology)) {
morphs <- mo_morphology(x.mo, language = NULL)
morphs[is.na(x)] <- ""
x[!is.na(x)] <- paste(x[!is.na(x)], tolower(morphs[!is.na(x)]))
}
load_mo_uncertainties(metadata) load_mo_uncertainties(metadata)
translate_into_language(x, language = language, only_unknown = FALSE) translate_into_language(x, language = language, only_unknown = FALSE)
} }
@@ -634,6 +646,20 @@ mo_is_anaerobic <- function(x, language = get_AMR_locale(), keep_synonyms = getO
out out
} }
#' @rdname mo_property
#' @export
mo_morphology <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) {
if (missing(x)) {
# this tries to find the data and an 'mo' column
x <- find_mo_col(fn = "mo_morphology")
}
meet_criteria(x, allow_NA = TRUE)
language <- validate_language(language)
meet_criteria(keep_synonyms, allow_class = "logical", has_length = 1)
mo_validate(x = x, property = "morphology", language = language, keep_synonyms = keep_synonyms, ...)
}
#' @rdname mo_property #' @rdname mo_property
#' @export #' @export
mo_snomed <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) { mo_snomed <- function(x, language = get_AMR_locale(), keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) {

0
R/proportion.R Normal file → Executable file
View File

376
R/sir.R
View File

@@ -69,7 +69,7 @@ VALID_SIR_LEVELS <- c("S", "SDD", "I", "R", "NI", "WT", "NWT", "NS")
#' @param host A vector (or column name) with [character]s to indicate the host. Only useful for veterinary breakpoints, as it requires `breakpoint_type = "animal"`. The values can be any text resembling the animal species, even in any of the `r length(LANGUAGES_SUPPORTED)` supported languages of this package. For foreign languages, be sure to set the language with [set_AMR_locale()] (though it will be automatically guessed based on the system language). #' @param host A vector (or column name) with [character]s to indicate the host. Only useful for veterinary breakpoints, as it requires `breakpoint_type = "animal"`. The values can be any text resembling the animal species, even in any of the `r length(LANGUAGES_SUPPORTED)` supported languages of this package. For foreign languages, be sure to set the language with [set_AMR_locale()] (though it will be automatically guessed based on the system language).
#' @param language Language to convert values set in `host` when using animal breakpoints. Use one of these supported language names or [ISO 639-1 codes](https://en.wikipedia.org/wiki/ISO_639-1): `r vector_or(paste0(sapply(LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), " (" , LANGUAGES_SUPPORTED, ")"), quotes = FALSE, sort = FALSE)`. #' @param language Language to convert values set in `host` when using animal breakpoints. Use one of these supported language names or [ISO 639-1 codes](https://en.wikipedia.org/wiki/ISO_639-1): `r vector_or(paste0(sapply(LANGUAGES_SUPPORTED_NAMES, function(x) x[[1]]), " (" , LANGUAGES_SUPPORTED, ")"), quotes = FALSE, sort = FALSE)`.
#' @param verbose A [logical] to indicate that all notes should be printed during interpretation of MIC values or disk diffusion values. #' @param verbose A [logical] to indicate that all notes should be printed during interpretation of MIC values or disk diffusion values.
#' @param reference_data A [data.frame] to be used for interpretation, which defaults to the [clinical_breakpoints] data set. Changing this argument allows for using own interpretation guidelines. This argument must contain a data set that is equal in structure to the [clinical_breakpoints] data set (same column names and column types). Please note that the `guideline` argument will be ignored when `reference_data` is manually set. #' @param reference_data A [data.frame] to be used for interpretation, which defaults to the [clinical_breakpoints] data set. Changing this argument allows for using own interpretation guidelines. This argument must have the same column names as the [clinical_breakpoints] data set. Column types are coerced automatically where possible: the `mo` column is passed through [as.mo()], the `ab` column through [as.ab()], and plain character, numeric, or logical columns are cast to the expected type. When `reference_data` is manually set, the `guideline` argument is optional: if omitted (or if its value does not match any row in the custom data), all rows in `reference_data` are considered. If `guideline` is set to a value that exists in the `guideline` column of the custom data, only matching rows are used — useful when a single custom table contains multiple guidelines. For the R classification, the EUCAST convention is used by default: MIC values `> breakpoint_R` and disk diffusion values `< breakpoint_R` are classified as R, with values between `breakpoint_S` and `breakpoint_R` classified as I (or SDD). Only when using the standard [clinical_breakpoints] with a CLSI guideline are the closed-interval rules (`>= breakpoint_R` for MIC, `<= breakpoint_R` for disk) applied; custom `reference_data` always uses the open-interval (EUCAST) convention regardless of the guideline name.
#' @param threshold Maximum fraction of invalid antimicrobial interpretations of `x`, see *Examples*. #' @param threshold Maximum fraction of invalid antimicrobial interpretations of `x`, see *Examples*.
#' @param conserve_capped_values Deprecated, use `capped_mic_handling` instead. #' @param conserve_capped_values Deprecated, use `capped_mic_handling` instead.
#' @param ... For using on a [data.frame]: selection of columns to apply `as.sir()` to. Supports [tidyselect language][tidyselect::starts_with()] such as `where(is.mic)`, `starts_with(...)`, or `column1:column4`, and can thus also be [antimicrobial selectors][amr_selector()], e.g. `as.sir(df, penicillins())`. #' @param ... For using on a [data.frame]: selection of columns to apply `as.sir()` to. Supports [tidyselect language][tidyselect::starts_with()] such as `where(is.mic)`, `starts_with(...)`, or `column1:column4`, and can thus also be [antimicrobial selectors][amr_selector()], e.g. `as.sir(df, penicillins())`.
@@ -95,7 +95,7 @@ VALID_SIR_LEVELS <- c("S", "SDD", "I", "R", "NI", "WT", "NWT", "NS")
#' # for veterinary breakpoints, also set `host`: #' # for veterinary breakpoints, also set `host`:
#' your_data %>% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI") #' your_data %>% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI")
#' #'
#' # fast processing with parallel computing: #' # fast processing with parallel computing (requires future.apply):
#' as.sir(your_data, ..., parallel = TRUE) #' as.sir(your_data, ..., parallel = TRUE)
#' ``` #' ```
#' * Operators like "<=" will be considered according to the `capped_mic_handling` setting. At default, an MIC value of e.g. ">2" will return "NI" (non-interpretable) if the breakpoint is 4-8; the *true* MIC could be at either side of the breakpoint. This is to prevent that capped values from raw laboratory data would not be treated conservatively. #' * Operators like "<=" will be considered according to the `capped_mic_handling` setting. At default, an MIC value of e.g. ">2" will return "NI" (non-interpretable) if the breakpoint is 4-8; the *true* MIC could be at either side of the breakpoint. This is to prevent that capped values from raw laboratory data would not be treated conservatively.
@@ -112,7 +112,7 @@ VALID_SIR_LEVELS <- c("S", "SDD", "I", "R", "NI", "WT", "NWT", "NS")
#' # for veterinary breakpoints, also set `host`: #' # for veterinary breakpoints, also set `host`:
#' your_data %>% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI") #' your_data %>% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI")
#' #'
#' # fast processing with parallel computing: #' # fast processing with parallel computing (requires future.apply):
#' as.sir(your_data, ..., parallel = TRUE) #' as.sir(your_data, ..., parallel = TRUE)
#' ``` #' ```
#' #'
@@ -220,9 +220,6 @@ VALID_SIR_LEVELS <- c("S", "SDD", "I", "R", "NI", "WT", "NWT", "NS")
#' sir_interpretation_history() #' sir_interpretation_history()
#' #'
#' \donttest{ #' \donttest{
#' # using parallel computing, which is available in base R:
#' as.sir(df_wide, parallel = TRUE, info = TRUE)
#'
#' #'
#' ## Using dplyr ------------------------------------------------- #' ## Using dplyr -------------------------------------------------
#' if (require("dplyr")) { #' if (require("dplyr")) {
@@ -716,8 +713,7 @@ as.sir.disk <- function(x,
} }
#' @rdname as.sir #' @rdname as.sir
#' @param parallel A [logical] to indicate if parallel computing must be used, defaults to `FALSE`. This requires no additional packages, as the used `parallel` package is part of base \R. On Windows and on \R < 4.0.0 [parallel::parLapply()] will be used, in all other cases the more efficient [parallel::mclapply()] will be used. #' @param parallel A [logical] to indicate if parallel computing must be used, defaults to `FALSE`. Requires the [`future.apply`][future.apply::future_lapply()] package. **A non-sequential [future::plan()] must already be active before setting `parallel = TRUE`** — for example, `future::plan(future::multisession)`. An error is thrown if `parallel = TRUE` is used without a plan set by the user. Parallelism distributes columns (and optionally row batches) across workers; it is most beneficial when there are many antibiotic columns and a large number of rows.
#' @param max_cores Maximum number of cores to use if `parallel = TRUE`. Use a negative value to subtract that number from the available number of cores, e.g. a value of `-2` on an 8-core machine means that at most 6 cores will be used. Defaults to `-1`. There will never be used more cores than variables to analyse. The available number of cores are detected using [parallelly::availableCores()] if that package is installed, and base \R's [parallel::detectCores()] otherwise.
#' @export #' @export
as.sir.data.frame <- function(x, as.sir.data.frame <- function(x,
..., ...,
@@ -737,7 +733,6 @@ as.sir.data.frame <- function(x,
verbose = FALSE, verbose = FALSE,
info = interactive(), info = interactive(),
parallel = FALSE, parallel = FALSE,
max_cores = -1,
conserve_capped_values = NULL) { conserve_capped_values = NULL) {
meet_criteria(x, allow_class = "data.frame") # will also check for dimensions > 0 meet_criteria(x, allow_class = "data.frame") # will also check for dimensions > 0
meet_criteria(col_mo, allow_class = "character", is_in = colnames(x), allow_NULL = TRUE) meet_criteria(col_mo, allow_class = "character", is_in = colnames(x), allow_NULL = TRUE)
@@ -756,7 +751,6 @@ as.sir.data.frame <- function(x,
meet_criteria(verbose, allow_class = "logical", has_length = 1) meet_criteria(verbose, allow_class = "logical", has_length = 1)
meet_criteria(info, allow_class = "logical", has_length = 1) meet_criteria(info, allow_class = "logical", has_length = 1)
meet_criteria(parallel, allow_class = "logical", has_length = 1) meet_criteria(parallel, allow_class = "logical", has_length = 1)
meet_criteria(max_cores, allow_class = c("numeric", "integer"), has_length = 1)
x.bak <- x x.bak <- x
if (isTRUE(info) && message_not_thrown_before("as.sir", "sir_interpretation_history")) { if (isTRUE(info) && message_not_thrown_before("as.sir", "sir_interpretation_history")) {
@@ -852,7 +846,6 @@ as.sir.data.frame <- function(x,
i <- 0 i <- 0
ab_cols <- colnames(x)[vapply(FUN.VALUE = logical(1), x, function(y) { ab_cols <- colnames(x)[vapply(FUN.VALUE = logical(1), x, function(y) {
i <<- i + 1 i <<- i + 1
check <- is.mic(y) | is.disk(y)
ab <- colnames(x)[i] ab <- colnames(x)[i]
if (!is.null(col_mo) && ab == col_mo) { if (!is.null(col_mo) && ab == col_mo) {
return(FALSE) return(FALSE)
@@ -861,13 +854,30 @@ as.sir.data.frame <- function(x,
return(FALSE) return(FALSE)
} }
if (length(sel) == 0 || (length(sel) > 0 && ab %in% sel)) { if (length(sel) == 0 || (length(sel) > 0 && ab %in% sel)) {
# columns already carrying an AMR class are always included
y_bak <- x.bak[, ab, drop = TRUE]
if (is.mic(y_bak) || is.disk(y_bak) || is.sir(y_bak)) {
return(TRUE)
}
ab_coerced <- suppressWarnings(as.ab(ab, info = FALSE)) ab_coerced <- suppressWarnings(as.ab(ab, info = FALSE))
if (is.na(ab_coerced) || (length(sel) > 0 & !ab %in% sel)) { if (is.na(ab_coerced) || (length(sel) > 0 & !ab %in% sel)) {
# not even a valid AB code # not even a valid AB code
return(FALSE) return(FALSE)
} else {
return(TRUE)
} }
# Name matches an antibiotic; also verify column content resembles AMR
# data. This prevents false positives on metadata columns whose names
# happen to match a drug code (e.g. 'patient' -> OXY, 'ward' -> PRU).
# Note: all_valid_disks() is intentionally avoided here because it strips
# non-numeric characters (as.disk("Pt_1") == 1), accepting patient IDs.
y_char <- tryCatch(as.character(y), error = function(e) character(0))
y_valid <- y_char[!is.na(y_char) & nzchar(trimws(y_char))]
if (length(y_valid) == 0L) {
return(FALSE)
}
y_numeric <- suppressWarnings(as.numeric(y_valid))
all_valid_mics(y) ||
all(!is.na(y_numeric)) ||
any(y_valid %in% c("S", "SDD", "I", "R", "NI"))
} else { } else {
return(FALSE) return(FALSE)
} }
@@ -875,7 +885,7 @@ as.sir.data.frame <- function(x,
stop_if( stop_if(
length(ab_cols) == 0, length(ab_cols) == 0,
"no columns with MIC values, disk zones or antibiotic column names found in this data set. Use as.mic() or as.disk() to transform antimicrobial columns." "no columns with MIC values, disk zones or antibiotic column names found in this data set. Use {.help [{.fun as.mic}](AMR::as.mic)} or {.help [{.fun as.disk}](AMR::as.disk)} to transform antimicrobial columns."
) )
# set type per column # set type per column
types <- character(length(ab_cols)) types <- character(length(ab_cols))
@@ -883,7 +893,7 @@ as.sir.data.frame <- function(x,
types[vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.mic)] <- "mic" types[vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.mic)] <- "mic"
types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_disks)] <- "disk" types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_disks)] <- "disk"
types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_mics)] <- "mic" types[types == "" & vapply(FUN.VALUE = logical(1), x[, ab_cols, drop = FALSE], all_valid_mics)] <- "mic"
types[types == "" & !vapply(FUN.VALUE = logical(1), x.bak[, ab_cols, drop = FALSE], is.sir)] <- "sir" types[types == ""] <- "sir"
if (any(types %in% c("mic", "disk"), na.rm = TRUE)) { if (any(types %in% c("mic", "disk"), na.rm = TRUE)) {
# now we need an mo column # now we need an mo column
stop_if(is.null(col_mo), "{.arg col_mo} must be set") stop_if(is.null(col_mo), "{.arg col_mo} must be set")
@@ -895,114 +905,161 @@ as.sir.data.frame <- function(x,
} }
# set up parallel computing # set up parallel computing
n_cores <- get_n_cores(max_cores = max_cores) if (requireNamespace("future.apply", quietly = TRUE) && !inherits(future::plan(), "sequential")) {
n_cores <- min(n_cores, length(ab_cols)) # never more cores than variables required if (isFALSE(parallel)) {
if (isTRUE(parallel) && (.Platform$OS.type == "windows" || getRversion() < "4.0.0")) { message_("Assuming {.code parallel = TRUE} since parallel computing has been set up using the {.pkg future} package before. Set {.help [{.fun plan}](future::plan)} to sequential to prevent this.")
cl <- tryCatch(parallel::makeCluster(n_cores, type = "PSOCK"),
error = function(e) {
if (isTRUE(info)) {
message_("Could not create parallel cluster, using single-core computation. Error message: ", conditionMessage(e))
}
return(NULL)
}
)
if (is.null(cl)) {
n_cores <- 1
} }
parallel <- TRUE
}
if (isTRUE(parallel)) {
stop_ifnot(
requireNamespace("future.apply", quietly = TRUE),
"Setting {.code parallel = TRUE} requires the {.pkg future.apply} package.\n",
"Install it with {.code install.packages(\"future.apply\")}."
)
stop_if(inherits(future::plan(), "sequential"),
"Setting {.code parallel = TRUE} requires a non-sequential {.help [{.fun future::plan}](future::plan)} to be active.\n",
"For your system, you could first run: {.code library(future); ",
ifelse(.Platform$OS.type == "windows" || in_rstudio(),
"plan(multisession)",
"plan(multicore)"
),
"}",
call = FALSE
)
n_workers <- future::nbrOfWorkers()
n_cores <- min(n_workers, length(ab_cols))
} else {
n_workers <- 1L
n_cores <- 1L
} }
if (isTRUE(info)) { # In parallel mode suppress per-column messages: workers print simultaneously and
message_(as_note = FALSE) # empty line # their output would be interleaved on the console.
message_("Processing columns:", as_note = FALSE) is_parallel_run <- isTRUE(parallel) && n_cores > 1 && length(ab_cols) > 1
effective_info <- if (is_parallel_run) FALSE else info
# Row-batch mode: when n_cols < n_workers we would leave workers idle under plain
# column-parallel dispatch. Instead we split rows into pieces so every worker
# gets work. pieces_per_col = ceil(n_workers / n_cols) gives ~n_workers jobs
# total; each job processes one column on one row slice, which also reduces
# per-worker memory pressure (smaller breakpoints search space).
if (is_parallel_run && length(ab_cols) < n_workers) {
pieces_per_col <- ceiling(n_workers / length(ab_cols))
} else {
pieces_per_col <- 1L
} }
run_as_sir_column <- function(i) { run_as_sir_column <- function(i, rows = NULL) {
# Always resolve AMR_env from the package namespace so workers get the live
# environment rather than a stale serialised copy from the closure.
.amr_env <- get("AMR_env", envir = asNamespace("AMR"), inherits = FALSE)
# In parallel mode each worker has its own copy of the history; record the
# current length so we capture only the rows added by this as.sir() call.
if (is_parallel_run) pre_log_n <- NROW(.amr_env$sir_interpretation_history)
ab_col <- ab_cols[i] ab_col <- ab_cols[i]
out <- list(result = NULL, log = NULL) out <- list(result = NULL, log = NULL)
# row subsetting: NULL means all rows (column-mode), otherwise row-batch mode
row_idx <- if (is.null(rows)) seq_len(nrow(x)) else rows
if (types[i] == "mic") { if (types[i] == "mic") {
result <- x %pm>% result <- as.sir(
pm_pull(ab_col) %pm>% as.mic(as.character(x[row_idx, ab_col, drop = TRUE])),
as.character() %pm>% mo = x_mo[row_idx],
as.mic() %pm>% mo.bak = x[row_idx, col_mo, drop = TRUE],
as.sir( ab = ab_col,
mo = x_mo, guideline = guideline,
mo.bak = x[, col_mo, drop = TRUE], uti = if (length(uti) > 1L) uti[row_idx] else uti,
ab = ab_col, capped_mic_handling = capped_mic_handling,
guideline = guideline, as_wt_nwt = as_wt_nwt,
uti = uti, add_intrinsic_resistance = add_intrinsic_resistance,
capped_mic_handling = capped_mic_handling, reference_data = reference_data,
as_wt_nwt = as_wt_nwt, substitute_missing_r_breakpoint = substitute_missing_r_breakpoint,
add_intrinsic_resistance = add_intrinsic_resistance, include_screening = include_screening,
reference_data = reference_data, include_PKPD = include_PKPD,
substitute_missing_r_breakpoint = substitute_missing_r_breakpoint, breakpoint_type = breakpoint_type,
include_screening = include_screening, host = if (length(host) > 1L) host[row_idx] else host,
include_PKPD = include_PKPD, verbose = verbose,
breakpoint_type = breakpoint_type, info = effective_info,
host = host, conserve_capped_values = conserve_capped_values,
verbose = verbose, is_data.frame = TRUE
info = info, )
conserve_capped_values = conserve_capped_values,
is_data.frame = TRUE
)
out$result <- result out$result <- result
out$log <- AMR_env$sir_interpretation_history if (is_parallel_run) {
AMR_env$sir_interpretation_history <- AMR_env$sir_interpretation_history[0, , drop = FALSE] # reset log full_log <- .amr_env$sir_interpretation_history
out$log <- if (pre_log_n < NROW(full_log)) {
full_log[seq.int(pre_log_n + 1L, NROW(full_log)), , drop = FALSE]
} else {
full_log[0L, , drop = FALSE]
}
} else {
out$log <- .amr_env$sir_interpretation_history
.amr_env$sir_interpretation_history <- .amr_env$sir_interpretation_history[0L, , drop = FALSE]
}
return(out) return(out)
} else if (types[i] == "disk") { } else if (types[i] == "disk") {
result <- x %pm>% result <- as.sir(
pm_pull(ab_col) %pm>% as.disk(as.character(x[row_idx, ab_col, drop = TRUE])),
as.character() %pm>% mo = x_mo[row_idx],
as.disk() %pm>% mo.bak = x[row_idx, col_mo, drop = TRUE],
as.sir( ab = ab_col,
mo = x_mo, guideline = guideline,
mo.bak = x[, col_mo, drop = TRUE], uti = if (length(uti) > 1L) uti[row_idx] else uti,
ab = ab_col, as_wt_nwt = as_wt_nwt,
guideline = guideline, add_intrinsic_resistance = add_intrinsic_resistance,
uti = uti, reference_data = reference_data,
as_wt_nwt = as_wt_nwt, substitute_missing_r_breakpoint = substitute_missing_r_breakpoint,
add_intrinsic_resistance = add_intrinsic_resistance, include_screening = include_screening,
reference_data = reference_data, include_PKPD = include_PKPD,
substitute_missing_r_breakpoint = substitute_missing_r_breakpoint, breakpoint_type = breakpoint_type,
include_screening = include_screening, host = if (length(host) > 1L) host[row_idx] else host,
include_PKPD = include_PKPD, verbose = verbose,
breakpoint_type = breakpoint_type, info = effective_info,
host = host, is_data.frame = TRUE
verbose = verbose, )
info = info,
is_data.frame = TRUE
)
out$result <- result out$result <- result
out$log <- AMR_env$sir_interpretation_history if (is_parallel_run) {
AMR_env$sir_interpretation_history <- AMR_env$sir_interpretation_history[0, , drop = FALSE] full_log <- .amr_env$sir_interpretation_history
out$log <- if (pre_log_n < NROW(full_log)) {
full_log[seq.int(pre_log_n + 1L, NROW(full_log)), , drop = FALSE]
} else {
full_log[0L, , drop = FALSE]
}
} else {
out$log <- .amr_env$sir_interpretation_history
.amr_env$sir_interpretation_history <- .amr_env$sir_interpretation_history[0L, , drop = FALSE]
}
return(out) return(out)
} else if (types[i] == "sir") { } else if (types[i] == "sir") {
ab <- ab_col ab <- ab_col
ab_coerced <- suppressWarnings(as.ab(ab, info = FALSE)) ab_coerced <- suppressWarnings(as.ab(ab, info = FALSE))
show_message <- FALSE show_message <- FALSE
if (!all(x[, ab, drop = TRUE] %in% c("S", "SDD", "I", "R", "NI", NA), na.rm = TRUE)) { if (!all(x[row_idx, ab, drop = TRUE] %in% c(VALID_SIR_LEVELS, NA), na.rm = TRUE)) {
show_message <- TRUE show_message <- TRUE
if (isTRUE(info)) { if (isTRUE(effective_info)) {
message_("\u00a0\u00a0", AMR_env$bullet_icon, " Cleaning values in column ", paste0("{.field ", font_bold(ab), "}"), " (", message_("\u00a0\u00a0", .amr_env$bullet_icon, " Cleaning values in column ", paste0("{.field ", font_bold(ab), "}"), " (",
ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""), ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""),
ab_name(ab_coerced, tolower = TRUE, info = info), ")... ", ab_name(ab_coerced, tolower = TRUE, info = effective_info), ")... ",
appendLF = FALSE, appendLF = FALSE,
as_note = FALSE as_note = FALSE
) )
} }
} else if (!is.sir(x.bak[, ab, drop = TRUE])) { } else if (!is.sir(x.bak[, ab, drop = TRUE])) {
show_message <- TRUE show_message <- TRUE
if (isTRUE(info)) { if (isTRUE(effective_info)) {
message_("\u00a0\u00a0", AMR_env$bullet_icon, " Assigning class {.cls sir} to already clean column ", paste0("{.field ", font_bold(ab), "}"), " (", message_("\u00a0\u00a0", .amr_env$bullet_icon, " Assigning class {.cls sir} to already clean column ", paste0("{.field ", font_bold(ab), "}"), " (",
ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""), ifelse(ab_coerced != toupper(ab), paste0(ab_coerced, ", "), ""),
ab_name(ab_coerced, tolower = TRUE, language = NULL, info = info), ")... ", ab_name(ab_coerced, tolower = TRUE, language = NULL, info = effective_info), ")... ",
appendLF = FALSE, appendLF = FALSE,
as_note = FALSE as_note = FALSE
) )
} }
} }
result <- as.sir.default(x = as.character(x[, ab, drop = TRUE])) result <- as.sir(as.character(x[row_idx, ab, drop = TRUE]))
if (show_message == TRUE && isTRUE(info)) { if (show_message == TRUE && isTRUE(effective_info)) {
message_(font_green_bg("\u00a0OK\u00a0"), as_note = FALSE) message_(font_green_bg("\u00a0OK\u00a0"), as_note = FALSE)
} }
out$result <- result out$result <- result
@@ -1013,38 +1070,60 @@ as.sir.data.frame <- function(x,
return(out) return(out)
} }
if (isTRUE(parallel) && n_cores > 1 && length(ab_cols) > 1) { if (is_parallel_run) {
if (isTRUE(info)) { if (isTRUE(info)) {
message_(as_note = FALSE) message_(as_note = FALSE)
message_("Running in parallel mode using ", n_cores, " out of ", get_n_cores(Inf), " cores, on columns ", vector_and(font_bold(ab_cols, collapse = NULL), quotes = "'", sort = FALSE), "...", as_note = FALSE, appendLF = FALSE) if (pieces_per_col > 1L) {
message_("Running in parallel mode using ", n_cores, " workers, on columns ", vector_and(paste0("{.field ", font_bold(ab_cols, collapse = NULL), "}"), quotes = FALSE, sort = FALSE), " (", pieces_per_col, " row slices per column)...", as_note = FALSE, appendLF = FALSE)
} else {
message_("Running in parallel mode using ", n_cores, " workers, on columns ", vector_and(paste0("{.field ", font_bold(ab_cols, collapse = NULL), "}"), quotes = FALSE, sort = FALSE), "...", as_note = FALSE, appendLF = FALSE)
}
} }
if (.Platform$OS.type == "windows" || getRversion() < "4.0.0") { if (pieces_per_col > 1L) {
# `cl` has been created in the part above before the `run_as_sir_column` function # Row-batch mode: build (col, row_slice) job pairs so all workers stay active
on.exit(parallel::stopCluster(cl), add = TRUE) row_cuts <- unique(round(seq(0, nrow(x), length.out = pieces_per_col + 1L)))
parallel::clusterExport(cl, varlist = c( row_ranges <- lapply(seq_len(length(row_cuts) - 1L), function(p) {
"x", "x.bak", "x_mo", "ab_cols", "types", seq.int(row_cuts[p] + 1L, row_cuts[p + 1L])
"capped_mic_handling", "as_wt_nwt", "add_intrinsic_resistance", })
"reference_data", "substitute_missing_r_breakpoint", "include_screening", "include_PKPD", jobs <- do.call(c, lapply(seq_along(ab_cols), function(ci) {
"breakpoint_type", "guideline", "host", "uti", "info", "verbose", lapply(seq_along(row_ranges), function(p) list(col = ci, rows = row_ranges[[p]]))
"col_mo", "AMR_env", "conserve_capped_values", }))
"run_as_sir_column" flat <- future.apply::future_lapply(jobs, function(job) {
), envir = environment()) run_as_sir_column(job$col, job$rows)
result_list <- parallel::parLapply(cl, seq_along(ab_cols), run_as_sir_column) }, future.seed = TRUE)
# Reassemble: for each column concatenate row pieces in order
result_list <- lapply(seq_along(ab_cols), function(ci) {
pieces <- flat[vapply(jobs, function(j) j$col == ci, logical(1L))]
list(
result = as.sir(do.call(c, lapply(pieces, function(p) as.character(p$result)))),
log = {
logs <- Filter(Negate(is.null), lapply(pieces, function(p) p$log))
if (length(logs) > 0L) do.call(rbind_AMR, logs) else NULL
}
)
})
} else { } else {
# R>=4.0 on unix # Column-parallel mode: one job per antibiotic column
result_list <- parallel::mclapply(seq_along(ab_cols), run_as_sir_column, mc.cores = n_cores) result_list <- future.apply::future_lapply(seq_along(ab_cols), run_as_sir_column, future.seed = TRUE)
} }
if (isTRUE(info)) { if (isTRUE(info)) {
message_(font_green_bg("\u00aDONE\u00a"), as_note = FALSE) message_(font_green_bg("\u00a0DONE\u00a0"), as_note = FALSE)
message_(as_note = FALSE) message_(as_note = FALSE)
message_("Run {.help [{.fun sir_interpretation_history}](AMR::sir_interpretation_history)} to retrieve a logbook with all details of the breakpoint interpretations.") message_("Run {.help [{.fun sir_interpretation_history}](AMR::sir_interpretation_history)} to retrieve a logbook with all details of the breakpoint interpretations.")
} }
} else { } else {
# sequential mode (non-parallel) # sequential mode (non-parallel)
if (isTRUE(info) && n_cores > 1 && NROW(x) * NCOL(x) > 10000) { if (isTRUE(info) && n_cores > 1 && NROW(x) * NCOL(x) > 10000) {
# give a note that parallel mode might be better suggest <- ifelse(.Platform$OS.type == "windows" || in_rstudio(),
"plan(multisession)",
"plan(multicore)"
)
message_(as_note = FALSE) message_(as_note = FALSE)
message_("Running in sequential mode. Consider setting {.arg parallel} to {.code TRUE} to speed up processing on multiple cores.\n") if (requireNamespace("future.apply", quietly = TRUE)) {
message_("Running in sequential mode. To speed up processing, set a parallel {.help [{.fun future::plan}](future::plan)} such as {.code ", suggest, "}.")
} else {
message_("Running in sequential mode. To speed up processing, install the {.pkg future.apply} package and then set {.code parallel = TRUE}.\n")
}
} }
# this will contain a progress bar already # this will contain a progress bar already
result_list <- lapply(seq_along(ab_cols), run_as_sir_column) result_list <- lapply(seq_along(ab_cols), run_as_sir_column)
@@ -1165,7 +1244,7 @@ as_sir_method <- function(method_short,
meet_criteria(substitute_missing_r_breakpoint, allow_class = "logical", has_length = 1, .call_depth = -2) meet_criteria(substitute_missing_r_breakpoint, allow_class = "logical", has_length = 1, .call_depth = -2)
meet_criteria(include_screening, allow_class = "logical", has_length = 1, .call_depth = -2) meet_criteria(include_screening, allow_class = "logical", has_length = 1, .call_depth = -2)
meet_criteria(include_PKPD, allow_class = "logical", has_length = 1, .call_depth = -2) meet_criteria(include_PKPD, allow_class = "logical", has_length = 1, .call_depth = -2)
check_reference_data(reference_data, .call_depth = -2) reference_data <- check_reference_data(reference_data, .call_depth = -2)
meet_criteria(breakpoint_type, allow_class = "character", is_in = reference_data$type, has_length = 1, .call_depth = -2) meet_criteria(breakpoint_type, allow_class = "character", is_in = reference_data$type, has_length = 1, .call_depth = -2)
meet_criteria(host, allow_class = c("character", "factor"), allow_NULL = TRUE, allow_NA = TRUE, .call_depth = -2) meet_criteria(host, allow_class = c("character", "factor"), allow_NULL = TRUE, allow_NA = TRUE, .call_depth = -2)
language <- validate_language(language) language <- validate_language(language)
@@ -1174,12 +1253,13 @@ as_sir_method <- function(method_short,
# backward compatibilty # backward compatibilty
dots <- list(...) dots <- list(...)
dots <- dots[which(!names(dots) %in% c("warn", "mo.bak", "is_data.frame"))] dots <- dots[which(!names(dots) %in% c("warn", "mo.bak", "is_data.frame", "as_wt_nwt"))]
if (length(dots) != 0) { if (length(dots) != 0) {
warning_("These arguments in {.help [{.fun as.sir}](AMR::as.sir)} are no longer used: ", vector_and(names(dots), quotes = "`"), ".", call = FALSE) warning_("These arguments in {.help [{.fun as.sir}](AMR::as.sir)} are no longer used: ", vector_and(names(dots), quotes = "`"), ".", call = FALSE)
} }
current_sir_interpretation_history <- NROW(AMR_env$sir_interpretation_history) current_sir_interpretation_history <- NROW(AMR_env$sir_interpretation_history)
custom_breakpoints_set <- !identical(reference_data, AMR::clinical_breakpoints)
if (isTRUE(info) && message_not_thrown_before("as.sir", "sir_interpretation_history")) { if (isTRUE(info) && message_not_thrown_before("as.sir", "sir_interpretation_history")) {
message_("Run {.help [{.fun sir_interpretation_history}](AMR::sir_interpretation_history)} afterwards to retrieve a logbook with all details of the breakpoint interpretations.\n\n") message_("Run {.help [{.fun sir_interpretation_history}](AMR::sir_interpretation_history)} afterwards to retrieve a logbook with all details of the breakpoint interpretations.\n\n")
@@ -1384,7 +1464,7 @@ as_sir_method <- function(method_short,
"\u00a0\u00a0", AMR_env$bullet_icon, " Interpreting ", method_long, ": ", ifelse(isTRUE(list(...)$is_data.frame), "column ", ""), "\u00a0\u00a0", AMR_env$bullet_icon, " Interpreting ", method_long, ": ", ifelse(isTRUE(list(...)$is_data.frame), "column ", ""),
ifelse(length(unique(agent_formatted)) == 1, unique(agent_formatted), paste0(vector_and(agent_formatted, quotes = FALSE, sort = FALSE))), ifelse(length(unique(agent_formatted)) == 1, unique(agent_formatted), paste0(vector_and(agent_formatted, quotes = FALSE, sort = FALSE))),
mo_var_found, mo_var_found,
ifelse(identical(reference_data, AMR::clinical_breakpoints), ifelse(!custom_breakpoints_set,
paste0(", ", vector_and(font_bold(guideline_coerced, collapse = NULL), quotes = FALSE)), paste0(", ", vector_and(font_bold(guideline_coerced, collapse = NULL), quotes = FALSE)),
"" ""
), ),
@@ -1401,7 +1481,7 @@ as_sir_method <- function(method_short,
method_coerced <- toupper(method) method_coerced <- toupper(method)
ab_coerced <- as.ab(ab, info = FALSE) ab_coerced <- as.ab(ab, info = FALSE)
if (identical(reference_data, AMR::clinical_breakpoints)) { if (!custom_breakpoints_set) {
breakpoints <- reference_data %pm>% breakpoints <- reference_data %pm>%
subset(guideline %in% guideline_coerced & method == method_coerced & ab %in% ab_coerced) subset(guideline %in% guideline_coerced & method == method_coerced & ab %in% ab_coerced)
if (any(ab_coerced == "AMX") && nrow(breakpoints[breakpoints$ab == "AMX", , drop = FALSE]) == 0) { if (any(ab_coerced == "AMX") && nrow(breakpoints[breakpoints$ab == "AMX", , drop = FALSE]) == 0) {
@@ -1492,11 +1572,11 @@ as_sir_method <- function(method_short,
add_intrinsic_resistance_to_AMR_env() add_intrinsic_resistance_to_AMR_env()
} }
if (isTRUE(info) && nrow(df_unique) < 10 || nrow(breakpoints) == 0) { if (isTRUE(info) && (nrow(df_unique) < 10 || nrow(breakpoints) == 0)) {
# only print intro under 10 items, otherwise progressbar will print this and then it will be printed double # only print intro under 10 items, otherwise progressbar will print this and then it will be printed double
message_(intro_txt, appendLF = FALSE, as_note = FALSE) message_(intro_txt, appendLF = FALSE, as_note = FALSE)
} }
p <- progress_ticker(n = nrow(df_unique), n_min = 10, title = intro_txt, only_bar_percent = TRUE) p <- progress_ticker(n = nrow(df_unique), n_min = 10, print = isTRUE(info), title = intro_txt, only_bar_percent = TRUE)
has_progress_bar <- !is.null(import_fn("progress_bar", "progress", error_on_fail = FALSE)) && nrow(df_unique) >= 10 has_progress_bar <- !is.null(import_fn("progress_bar", "progress", error_on_fail = FALSE)) && nrow(df_unique) >= 10
on.exit(close(p)) on.exit(close(p))
@@ -1584,8 +1664,15 @@ as_sir_method <- function(method_short,
# gather all available breakpoints for current MO # gather all available breakpoints for current MO
# TODO for VET09 do not filter out E. coli and such # TODO for VET09 do not filter out E. coli and such
# For custom reference_data: skip guideline filter when guideline_current is not in the data (#239)
guideline_filter_current <- if (custom_breakpoints_set &&
!guideline_current %in% breakpoints$guideline) {
unique(breakpoints$guideline)
} else {
guideline_current
}
breakpoints_current <- breakpoints %pm>% breakpoints_current <- breakpoints %pm>%
subset(ab == ab_current & guideline == guideline_current) %pm>% subset(ab == ab_current & guideline %in% guideline_filter_current) %pm>%
subset(mo %in% c( subset(mo %in% c(
mo_current, mo_current_genus, mo_current_family, mo_current, mo_current_genus, mo_current_family,
mo_current_order, mo_current_class, mo_current_order, mo_current_class,
@@ -1595,8 +1682,13 @@ as_sir_method <- function(method_short,
)) ))
if (breakpoint_type == "animal") { if (breakpoint_type == "animal") {
# 2025-03-13/ for now, only strictly follow guideline for current host, no extrapolation host_matched <- breakpoints_current[which(breakpoints_current$host == host_current), , drop = FALSE]
breakpoints_current <- breakpoints_current[which(breakpoints_current$host == host_current), , drop = FALSE] if (nrow(host_matched) > 0) {
breakpoints_current <- host_matched
} else {
# fall back to host-agnostic rows (host = NA) for custom breakpoint tables (#239)
breakpoints_current <- breakpoints_current[which(is.na(breakpoints_current$host)), , drop = FALSE]
}
} }
## fall-back methods for veterinary guidelines ---- ## fall-back methods for veterinary guidelines ----
@@ -1872,8 +1964,11 @@ as_sir_method <- function(method_short,
# otherwise: the normal (uncapped or ignored) interpretation # otherwise: the normal (uncapped or ignored) interpretation
input_clean <= breakpoints_current$breakpoint_S ~ as.sir("S"), input_clean <= breakpoints_current$breakpoint_S ~ as.sir("S"),
guideline_current %like% "EUCAST" & input_clean > breakpoints_current$breakpoint_R ~ as.sir("R"), # standard data: EUCAST open interval (>), CLSI closed interval (>=)
guideline_current %like% "CLSI" & input_clean >= breakpoints_current$breakpoint_R ~ as.sir("R"), !custom_breakpoints_set & guideline_current %like% "EUCAST" & input_clean > breakpoints_current$breakpoint_R ~ as.sir("R"),
!custom_breakpoints_set & guideline_current %like% "CLSI" & input_clean >= breakpoints_current$breakpoint_R ~ as.sir("R"),
# custom reference_data: always EUCAST open interval (>), regardless of guideline name
custom_breakpoints_set & input_clean > breakpoints_current$breakpoint_R ~ as.sir("R"),
# return "I" or "SDD" when breakpoints are in the middle # return "I" or "SDD" when breakpoints are in the middle
!is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"), !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"),
@@ -1886,8 +1981,11 @@ as_sir_method <- function(method_short,
new_sir <- case_when_AMR( new_sir <- case_when_AMR(
is.na(input_clean) ~ NA_sir_, is.na(input_clean) ~ NA_sir_,
as.double(input_clean) >= as.double(breakpoints_current$breakpoint_S) ~ as.sir("S"), as.double(input_clean) >= as.double(breakpoints_current$breakpoint_S) ~ as.sir("S"),
guideline_current %like% "EUCAST" & as.double(input_clean) < as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"), # standard data: EUCAST open interval (<), CLSI closed interval (<=)
guideline_current %like% "CLSI" & as.double(input_clean) <= as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"), !custom_breakpoints_set & guideline_current %like% "EUCAST" & as.double(input_clean) < as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"),
!custom_breakpoints_set & guideline_current %like% "CLSI" & as.double(input_clean) <= as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"),
# custom reference_data: always EUCAST open interval (<), regardless of guideline name
custom_breakpoints_set & as.double(input_clean) < as.double(breakpoints_current$breakpoint_R) ~ as.sir("R"),
# return "I" or "SDD" when breakpoints are in the middle # return "I" or "SDD" when breakpoints are in the middle
!is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"), !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == TRUE ~ as.sir("SDD"),
!is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == FALSE ~ as.sir("I"), !is.na(breakpoints_current$breakpoint_S) & !is.na(breakpoints_current$breakpoint_R) & breakpoints_current$is_SDD == FALSE ~ as.sir("I"),
@@ -1996,7 +2094,7 @@ sir_interpretation_history <- function(clean = FALSE) {
#' @noRd #' @noRd
print.sir_log <- function(x, ...) { print.sir_log <- function(x, ...) {
if (NROW(x) == 0) { if (NROW(x) == 0) {
message_("No results to print. First run {.help [{.fun as.sir}](AMR::as.sir)} on MIC values or disk diffusion zones (or on a {.cls data.frame} containing any of these) to print a {.val logbook} data set here.") message_("No results to print. First run {.help [{.fun as.sir}](AMR::as.sir)} on MIC values or disk diffusion zones (or on a {.cls data.frame} containing any of these) to print a 'logbook' data set here.")
return(invisible(NULL)) return(invisible(NULL))
} }
class(x) <- class(x)[class(x) != "sir_log"] class(x) <- class(x)[class(x) != "sir_log"]
@@ -2234,13 +2332,36 @@ rep.sir <- function(x, ...) {
y y
} }
coerce_reference_data_columns <- function(x) {
ref <- AMR::clinical_breakpoints
for (col in names(ref)) {
col_ref <- ref[[col]]
col_x <- x[[col]]
if (identical(class(col_ref), class(col_x))) next
if (col == "mo") {
x[[col]] <- suppressMessages(as.mo(col_x))
} else if (col == "ab") {
x[[col]] <- suppressMessages(as.ab(col_x))
} else if (is.character(col_ref)) {
x[[col]] <- as.character(col_x)
} else if (is.numeric(col_ref)) {
x[[col]] <- as.double(col_x)
} else if (is.logical(col_ref)) {
x[[col]] <- as.logical(col_x)
}
}
x
}
check_reference_data <- function(reference_data, .call_depth) { check_reference_data <- function(reference_data, .call_depth) {
if (!identical(reference_data, AMR::clinical_breakpoints)) { if (!identical(reference_data, AMR::clinical_breakpoints)) {
class_sir <- vapply(FUN.VALUE = character(1), AMR::clinical_breakpoints, function(x) paste0("<", class(x), ">", collapse = " and ")) if (!all(names(AMR::clinical_breakpoints) == names(reference_data))) {
class_ref <- vapply(FUN.VALUE = character(1), reference_data, function(x) paste0("<", class(x), ">", collapse = " and "))
if (!all(names(class_sir) == names(class_ref))) {
stop_("{.arg reference_data} must have the same column names as the {.help [clinical_breakpoints](AMR::clinical_breakpoints)} data set.", call = .call_depth) stop_("{.arg reference_data} must have the same column names as the {.help [clinical_breakpoints](AMR::clinical_breakpoints)} data set.", call = .call_depth)
} }
# coerce mo, ab, and other columns to the expected types where possible
reference_data <- coerce_reference_data_columns(reference_data)
class_sir <- vapply(FUN.VALUE = character(1), AMR::clinical_breakpoints, function(x) paste0("<", class(x), ">", collapse = " and "))
class_ref <- vapply(FUN.VALUE = character(1), reference_data, function(x) paste0("<", class(x), ">", collapse = " and "))
if (!all(class_sir == class_ref)) { if (!all(class_sir == class_ref)) {
bad_col <- names(class_ref[class_sir != class_ref][1]) bad_col <- names(class_ref[class_sir != class_ref][1])
bad_cls <- gsub("<|>", "", class_ref[class_sir != class_ref][1]) bad_cls <- gsub("<|>", "", class_ref[class_sir != class_ref][1])
@@ -2248,4 +2369,5 @@ check_reference_data <- function(reference_data, .call_depth) {
stop_("{.arg reference_data} must be the same structure as the {.help [clinical_breakpoints](AMR::clinical_breakpoints)} data set. Column ", paste0("{.field ", font_bold(bad_col, collapse = NULL), "}"), " is of class ", paste0("{.cls ", bad_cls, "}"), ", but should be of class ", paste0("{.cls ", exp_cls, "}"), call = .call_depth) stop_("{.arg reference_data} must be the same structure as the {.help [clinical_breakpoints](AMR::clinical_breakpoints)} data set. Column ", paste0("{.field ", font_bold(bad_col, collapse = NULL), "}"), " is of class ", paste0("{.cls ", bad_cls, "}"), ", but should be of class ", paste0("{.cls ", exp_cls, "}"), call = .call_depth)
} }
} }
reference_data
} }

Binary file not shown.

0
R/tidymodels.R Normal file → Executable file
View File

View File

@@ -70,6 +70,13 @@ as.data.frame.deprecated_amr_dataset <- function(x, ...) {
# - `antibiotics` in `antibiogram()` # - `antibiotics` in `antibiogram()`
# - `converse_capped_values` in `as.sir()` # - `converse_capped_values` in `as.sir()`
#' @rdname AMR-deprecated
#' @export
custom_eucast_rules <- function(...) {
deprecation_warning("custom_eucast_rules", "custom_interpretive_rules", is_function = TRUE)
custom_interpretive_rules(...)
}
#' @rdname AMR-deprecated #' @rdname AMR-deprecated
#' @export #' @export
ab_class <- function(...) { ab_class <- function(...) {

View File

@@ -156,17 +156,17 @@ reference:
- "`atc_online_property`" - "`atc_online_property`"
- "`add_custom_antimicrobials`" - "`add_custom_antimicrobials`"
- title: "Preparing data: antimicrobial results" - title: "Interpreting data: antimicrobial results"
desc: > desc: >
With `as.mic()` and `as.disk()` you can transform your raw input to valid MIC or disk diffusion values. With `as.mic()` and `as.disk()` you can transform your raw input to valid MIC or disk diffusion values.
Use `as.sir()` for cleaning raw data to let it only contain "R", "I" and "S", or to interpret MIC or disk diffusion values as SIR based on the lastest EUCAST and CLSI guidelines. Use `as.sir()` for cleaning raw data to let it only contain "R", "I" and "S", or to interpret MIC or disk diffusion values as SIR based on the lastest EUCAST and CLSI guidelines.
Afterwards, you can extend antibiotic interpretations by applying [EUCAST rules](https://www.eucast.org/expert_rules_and_intrinsic_resistance/) with `eucast_rules()`. Afterwards, you can extend antibiotic interpretations by applying interpretive rules, for example [from EUCAST](https://www.eucast.org/expert_rules_and_intrinsic_resistance/) with `interpretive_rules()`.
contents: contents:
- "`as.sir`" - "`as.sir`"
- "`as.mic`" - "`as.mic`"
- "`as.disk`" - "`as.disk`"
- "`eucast_rules`" - "`interpretive_rules`"
- "`custom_eucast_rules`" - "`custom_interpretive_rules`"
- title: "Analysing data" - title: "Analysing data"
desc: > desc: >
@@ -265,7 +265,7 @@ reference:
- title: "Other: statistical tests" - title: "Other: statistical tests"
desc: > desc: >
Some statistical tests or methods are not part of base R and were added to this package for convenience. Some statistical tests or methods usable for AMR analyses are not part of base R and were added to this package for convenience.
contents: contents:
- "`g.test`" - "`g.test`"
- "`kurtosis`" - "`kurtosis`"

View File

@@ -42,9 +42,9 @@ pre_commit_lst <- list()
usethis::ui_info(paste0("Updating internal package data")) usethis::ui_info(paste0("Updating internal package data"))
# See 'data-raw/eucast_rules.tsv' for the EUCAST reference file # See 'data-raw/interpretive_rules.tsv' for the interpretive rules reference file
pre_commit_lst$EUCAST_RULES_DF <- utils::read.delim( pre_commit_lst$INTERPRETIVE_RULES_DF <- utils::read.delim(
file = "data-raw/eucast_rules.tsv", file = "data-raw/interpretive_rules.tsv",
skip = 9, skip = 9,
sep = "\t", sep = "\t",
stringsAsFactors = FALSE, stringsAsFactors = FALSE,
@@ -364,7 +364,7 @@ pre_commit_lst$MO_RELEVANT_GENERA <- c(
) )
# antibiotic groups # antibiotic groups
# (these will also be used for eucast_rules() and understanding data-raw/eucast_rules.tsv) # (these will also be used for interpretive_rules() and understanding data-raw/interpretive_rules.tsv)
pre_commit_lst$AB_AMINOGLYCOSIDES <- antimicrobials %>% pre_commit_lst$AB_AMINOGLYCOSIDES <- antimicrobials %>%
filter(group %like% "aminoglycoside|paromomycin|spectinomycin") %>% filter(group %like% "aminoglycoside|paromomycin|spectinomycin") %>%
pull(ab) pull(ab)

View File

@@ -122,8 +122,8 @@ get_author_year <- function(ref) {
authors <- gsub("[A-Z-][a-z-]?[.]", "", authors, ignore.case = FALSE) authors <- gsub("[A-Z-][a-z-]?[.]", "", authors, ignore.case = FALSE)
# remove trailing and leading spaces # remove trailing and leading spaces
authors <- trimws(authors) authors <- trimws(authors)
# keep only the part after last 'emend.' to get the latest authors # strip emend. and everything after it to retain the combination authority
authors <- gsub(".*emend[.] ?", "", authors) authors <- gsub(" ?emend[.]?.*", "", authors)
# only keep first author and replace all others by 'et al' # only keep first author and replace all others by 'et al'
authors <- gsub("(,| and| et| &| ex| emend\\.?) .*", " et al.", authors) authors <- gsub("(,| and| et| &| ex| emend\\.?) .*", " et al.", authors)
# et al. always with ending dot # et al. always with ending dot
@@ -746,7 +746,7 @@ taxonomy_mycobank <- taxonomy_mycobank %>%
tax_h, tax_h,
tax_i %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~ tax_i %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~
tax_i, tax_i,
tax_k %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~ tax_j %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~
tax_j, tax_j,
tax_k %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~ tax_k %in% taxonomy_mycobank$fullname[taxonomy_mycobank$rank == "genus"] ~
tax_k, tax_k,
@@ -2858,6 +2858,135 @@ taxonomy <- taxonomy %>%
relocate(oxygen_tolerance, .after = ref) relocate(oxygen_tolerance, .after = ref)
# Add morphology ---------------------------------------------------------------------
# We will use the BacDive data base for this:
# - go to https://bacdive.dsmz.de/advsearch
# - filter 'Cell shape' on "*" and click Submit
# - click on the 'Download table as CSV' button
bacdive_shape <- vroom::vroom("data-raw/bacdive_shape.csv", skip = 2) %>%
select(species, shape = `Cell shape`)
bacdive_shape <- bacdive_shape %>%
# fill in missing species from previous rows
mutate(fullname = if_else(is.na(species), lag(species), species)) %>%
filter(
!is.na(species),
!is.na(shape),
species %unlike% "unclassified"
) %>%
select(-species)
bacdive_shape <- bacdive_shape %>%
# map raw BacDive values to a controlled vocabulary
mutate(
shape = case_when(
shape %in% c("coccus-shaped", "sphere-shaped", "diplococcus-shaped") ~ "cocci",
shape %in% c("oval-shaped", "ovoid-shaped") ~ "coccobacilli",
shape %in% c("rod-shaped", "curved-shaped", "vibrio-shaped", "flask-shaped") ~ "rods",
shape %in% c("spiral-shaped", "helical-shaped") ~ "spirilla",
shape == "filament-shaped" ~ "filamentous",
TRUE ~ NA_character_
)
) %>%
filter(!is.na(shape)) %>%
# now determine shape per species by majority vote
group_by(fullname) %>%
summarise(
morphology = names(sort(table(shape), decreasing = TRUE))[1]
)
# now find all synonyms and copy them from their current taxonomic names
synonyms_shape <- taxonomy %>%
filter(status == "synonym") %>%
transmute(
mo,
fullname_old = fullname,
current = synonym_mo_to_accepted_mo(
mo,
fill_in_accepted = FALSE,
dataset = taxonomy
)
) %>%
filter(!is.na(current)) %>%
mutate(fullname = taxonomy$fullname[match(current, taxonomy$mo)]) %>%
left_join(bacdive_shape, by = "fullname") %>%
filter(!is.na(morphology)) %>%
select(fullname, morphology)
bacdive_shape <- bacdive_shape %>%
bind_rows(synonyms_shape) %>%
distinct()
bacdive_shape_genus <- bacdive_shape %>%
mutate(
shape_raw = morphology,
genus = taxonomy$genus[match(fullname, taxonomy$fullname)]
) %>%
group_by(fullname = genus) %>%
summarise(
morphology = names(sort(table(shape_raw), decreasing = TRUE))[1]
)
bacdive_shape <- bacdive_shape %>%
bind_rows(bacdive_shape_genus) %>%
arrange(fullname)
bacdive_shape_other <- taxonomy %>%
filter(
kingdom == "Bacteria",
rank == "species",
!fullname %in% bacdive_shape$fullname,
genus %in% bacdive_shape$fullname
) %>%
select(fullname, genus) %>%
left_join(bacdive_shape, by = c("genus" = "fullname")) %>%
mutate(
morphology = paste("likely", morphology)
) %>%
select(fullname, morphology) %>%
distinct(fullname, .keep_all = TRUE)
bacdive_shape <- bacdive_shape %>%
bind_rows(bacdive_shape_other) %>%
arrange(fullname) %>%
distinct(fullname, .keep_all = TRUE)
taxonomy <- taxonomy %>%
left_join(bacdive_shape, by = "fullname") %>%
relocate(morphology, .after = oxygen_tolerance)
# Override: genera that are clinically established coccobacilli but where BacDive
# majority vote yields "rods" due to observer disagreement on the rod/oval boundary.
# These genera are universally reported as coccobacilli on Gram stain in clinical
# microbiology practice.
coccobacilli_genera <- c(
"Acinetobacter", "Aggregatibacter", "Brucella",
"Gardnerella", "Haemophilus", "Kingella",
"Moraxella", "Pasteurella"
)
taxonomy <- taxonomy %>%
mutate(
morphology = case_when(
genus %in% coccobacilli_genera & is.na(morphology) ~ "likely coccobacilli",
genus %in% coccobacilli_genera &
morphology %in% c("rods", "cocci") ~ "coccobacilli",
genus %in% coccobacilli_genera &
morphology %in% c("likely rods", "likely cocci") ~ "likely coccobacilli",
TRUE ~ morphology
)
)
# Spirochaetes: the entire phylum is spirochaete by definition, fill in where missing
taxonomy <- taxonomy %>%
mutate(
morphology = case_when(
phylum %in% c("Spirochaetota", "Spirochaetes") & is.na(morphology) ~ "likely spirilla",
phylum %in% c("Spirochaetota", "Spirochaetes") &
morphology %in% c("rods", "likely rods") ~ "spirilla",
TRUE ~ morphology
)
)
# Restore 'synonym' microorganisms to 'accepted' -------------------------------------------------- # Restore 'synonym' microorganisms to 'accepted' --------------------------------------------------
# If there are some synonyms that need to be corrected to 'accepted', you can do that here. # If there are some synonyms that need to be corrected to 'accepted', you can do that here.

File diff suppressed because it is too large Load Diff

Binary file not shown.

Binary file not shown.

118
index.md
View File

@@ -26,12 +26,9 @@
<div style="display: flex; font-size: 0.8em;"> <div style="display: flex; font-size: 0.8em;">
<p style="text-align:left; width: 50%;"> <p style="text-align:left; width: 50%;">
<small><a href="https://amr-for-r.org/">amr-for-r.org</a></small> <small><a href="https://amr-for-r.org/">amr-for-r.org</a></small>
</p> </p>
<p style="text-align:right; width: 50%;"> <p style="text-align:right; width: 50%;">
<small><a href="https://doi.org/10.18637/jss.v104.i03" target="_blank">doi.org/10.18637/jss.v104.i03</a></small> <small><a href="https://doi.org/10.18637/jss.v104.i03" target="_blank">doi.org/10.18637/jss.v104.i03</a></small>
</p> </p>
@@ -174,24 +171,26 @@ example_isolates %>%
#> Using column mo as input for `mo_fullname()` #> Using column mo as input for `mo_fullname()`
#> Using column mo as input for `mo_is_gram_negative()` #> Using column mo as input for `mo_is_gram_negative()`
#> Using column mo as input for `mo_is_intrinsic_resistant()` #> Using column mo as input for `mo_is_intrinsic_resistant()`
#> Determining intrinsic resistance based on 'EUCAST Expected Resistant #> Determining intrinsic resistance based on 'EUCAST Expected
#> Phenotypes' v1.2 (2023). This note will be shown once per session. #> Resistant Phenotypes' v1.2 (2023). This note will be shown
#> For `aminoglycosides()` using columns GEN (gentamicin), TOB (tobramycin), AMK #> once per session.
#> (amikacin), and KAN (kanamycin) #> For `aminoglycosides()` using columns GEN (gentamicin), TOB
#> For `carbapenems()` using columns IPM (imipenem) and MEM (meropenem) #> (tobramycin), AMK (amikacin), and KAN (kanamycin)
#> For `carbapenems()` using columns IPM (imipenem) and MEM
#> (meropenem)
#> # A tibble: 35 × 7 #> # A tibble: 35 × 7
#> bacteria GEN TOB AMK KAN IPM MEM #> bacteria GEN TOB AMK KAN IPM MEM
#> <chr> <sir> <sir> <sir> <sir> <sir> <sir> #> <chr> <sir> <sir> <sir> <sir> <sir> <sir>
#> 1 Pseudomonas aeruginosa I S NA R S NA #> 1 Pseudomonas aer I S NA R S NA
#> 2 Pseudomonas aeruginosa I S NA R S NA #> 2 Pseudomonas aer I S NA R S NA
#> 3 Pseudomonas aeruginosa I S NA R S NA #> 3 Pseudomonas aer I S NA R S NA
#> 4 Pseudomonas aeruginosa S S S R NA S #> 4 Pseudomonas aer S S S R NA S
#> 5 Pseudomonas aeruginosa S S S R S S #> 5 Pseudomonas aer S S S R S S
#> 6 Pseudomonas aeruginosa S S S R S S #> 6 Pseudomonas aer S S S R S S
#> 7 Stenotrophomonas maltophilia R R R R R R #> 7 Stenotrophomona R R R R R R
#> 8 Pseudomonas aeruginosa S S S R NA S #> 8 Pseudomonas aer S S S R NA S
#> 9 Pseudomonas aeruginosa S S S R NA S #> 9 Pseudomonas aer S S S R NA S
#> 10 Pseudomonas aeruginosa S S S R S S #> 10 Pseudomonas aer S S S R S S
#> # 25 more rows #> # 25 more rows
``` ```
@@ -215,23 +214,24 @@ output format automatically (such as markdown, LaTeX, HTML, etc.).
``` r ``` r
antibiogram(example_isolates, antibiogram(example_isolates,
antimicrobials = c(aminoglycosides(), carbapenems())) antimicrobials = c(aminoglycosides(), carbapenems()))
#> For `aminoglycosides()` using columns GEN (gentamicin), TOB (tobramycin), AMK #> For `aminoglycosides()` using columns GEN (gentamicin), TOB
#> (amikacin), and KAN (kanamycin) #> (tobramycin), AMK (amikacin), and KAN (kanamycin)
#> For `carbapenems()` using columns IPM (imipenem) and MEM (meropenem) #> For `carbapenems()` using columns IPM (imipenem) and MEM
#> (meropenem)
``` ```
| Pathogen | Amikacin | Gentamicin | Imipenem | Kanamycin | Meropenem | Tobramycin | | Pathogen | Amikacin | Gentamicin | Imipenem | Kanamycin | Meropenem | Tobramycin |
|:---|:---|:---|:---|:---|:---|:---| |:-----------------|:---------------------|:--------------------|:---------------------|:----------------|:---------------------|:--------------------|
| CoNS | 0% (0-8%,N=43) | 86% (82-90%,N=309) | 52% (37-67%,N=48) | 0% (0-8%,N=43) | 52% (37-67%,N=48) | 22% (12-35%,N=55) | | CoNS | 0% (0-8%,N=43) | 86% (82-90%,N=309) | 52% (37-67%,N=48) | 0% (0-8%,N=43) | 52% (37-67%,N=48) | 22% (12-35%,N=55) |
| *E. coli* | 100% (98-100%,N=171) | 98% (96-99%,N=460) | 100% (99-100%,N=422) | NA | 100% (99-100%,N=418) | 97% (96-99%,N=462) | | *E. coli* | 100% (98-100%,N=171) | 98% (96-99%,N=460) | 100% (99-100%,N=422) | NA | 100% (99-100%,N=418) | 97% (96-99%,N=462) |
| *E. faecalis* | 0% (0-9%,N=39) | 0% (0-9%,N=39) | 100% (91-100%,N=38) | 0% (0-9%,N=39) | NA | 0% (0-9%,N=39) | | *E. faecalis* | 0% (0-9%,N=39) | 0% (0-9%,N=39) | 100% (91-100%,N=38) | 0% (0-9%,N=39) | NA | 0% (0-9%,N=39) |
| *K. pneumoniae* | NA | 90% (79-96%,N=58) | 100% (93-100%,N=51) | NA | 100% (93-100%,N=53) | 90% (79-96%,N=58) | | *K. pneumoniae* | NA | 90% (79-96%,N=58) | 100% (93-100%,N=51) | NA | 100% (93-100%,N=53) | 90% (79-96%,N=58) |
| *P. aeruginosa* | NA | 100% (88-100%,N=30) | NA | 0% (0-12%,N=30) | NA | 100% (88-100%,N=30) | | *P. aeruginosa* | NA | 100% (88-100%,N=30) | NA | 0% (0-12%,N=30) | NA | 100% (88-100%,N=30) |
| *P. mirabilis* | NA | 94% (80-99%,N=34) | 94% (79-99%,N=32) | NA | NA | 94% (80-99%,N=34) | | *P. mirabilis* | NA | 94% (80-99%,N=34) | 94% (79-99%,N=32) | NA | NA | 94% (80-99%,N=34) |
| *S. aureus* | NA | 99% (97-100%,N=233) | NA | NA | NA | 98% (92-100%,N=86) | | *S. aureus* | NA | 99% (97-100%,N=233) | NA | NA | NA | 98% (92-100%,N=86) |
| *S. epidermidis* | 0% (0-8%,N=44) | 79% (71-85%,N=163) | NA | 0% (0-8%,N=44) | NA | 51% (40-61%,N=89) | | *S. epidermidis* | 0% (0-8%,N=44) | 79% (71-85%,N=163) | NA | 0% (0-8%,N=44) | NA | 51% (40-61%,N=89) |
| *S. hominis* | NA | 92% (84-97%,N=80) | NA | NA | NA | 85% (74-93%,N=62) | | *S. hominis* | NA | 92% (84-97%,N=80) | NA | NA | NA | 85% (74-93%,N=62) |
| *S. pneumoniae* | 0% (0-3%,N=117) | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) | | *S. pneumoniae* | 0% (0-3%,N=117) | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) | NA | 0% (0-3%,N=117) |
In combination antibiograms, it is clear that combined antimicrobials In combination antibiograms, it is clear that combined antimicrobials
yield higher empiric coverage: yield higher empiric coverage:
@@ -242,10 +242,10 @@ antibiogram(example_isolates,
mo_transform = "gramstain") mo_transform = "gramstain")
``` ```
| Pathogen | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin | | Pathogen | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
|:---|:---|:---|:---| |:--------------|:------------------------|:-------------------------------------|:-------------------------------------|
| Gram-negative | 88% (85-91%,N=641) | 99% (97-99%,N=691) | 98% (97-99%,N=693) | | Gram-negative | 88% (85-91%,N=641) | 99% (97-99%,N=691) | 98% (97-99%,N=693) |
| Gram-positive | 86% (82-89%,N=345) | 98% (96-98%,N=1044) | 95% (93-97%,N=550) | | Gram-positive | 86% (82-89%,N=345) | 98% (96-98%,N=1044) | 95% (93-97%,N=550) |
Like many other functions in this package, `antibiogram()` comes with Like many other functions in this package, `antibiogram()` comes with
support for 28 languages that are often detected automatically based on support for 28 languages that are often detected automatically based on
@@ -318,16 +318,18 @@ example_isolates %>%
summarise(across(c(GEN, TOB), summarise(across(c(GEN, TOB),
list(total_R = resistance, list(total_R = resistance,
conf_int = function(x) sir_confidence_interval(x, collapse = "-")))) conf_int = function(x) sir_confidence_interval(x, collapse = "-"))))
#> `resistance()` assumes the EUCAST guideline and thus considers the 'I' #> `resistance()` assumes the EUCAST guideline and thus
#> category susceptible. Set the `guideline` argument or the `AMR_guideline` #> considers the 'I' category susceptible. Set the `guideline`
#> option to either "CLSI" or "EUCAST", see `?AMR-options`. #> argument or the `AMR_guideline` option to either "CLSI" or
#> "EUCAST", see `?AMR-options`.
#> This message will be shown once per session. #> This message will be shown once per session.
#> # A tibble: 3 × 5 #> # A tibble: 3 × 5
#> ward GEN_total_R GEN_conf_int TOB_total_R TOB_conf_int #> ward GEN_total_R GEN_conf_int TOB_total_R
#> <chr> <dbl> <chr> <dbl> <chr> #> <chr> <dbl> <chr> <dbl>
#> 1 Clinical 0.229 0.205-0.254 0.315 0.284-0.347 #> 1 Clinical 0.229 0.205-0.254 0.315
#> 2 ICU 0.290 0.253-0.33 0.400 0.353-0.449 #> 2 ICU 0.290 0.253-0.33 0.400
#> 3 Outpatient 0.2 0.131-0.285 0.368 0.254-0.493 #> 3 Outpatient 0.2 0.131-0.285 0.368
#> # 1 more variable: TOB_conf_int <chr>
``` ```
Or use [antimicrobial Or use [antimicrobial
@@ -344,15 +346,16 @@ out <- example_isolates %>%
# calculate AMR using resistance(), over all aminoglycosides and polymyxins: # calculate AMR using resistance(), over all aminoglycosides and polymyxins:
summarise(across(c(aminoglycosides(), polymyxins()), summarise(across(c(aminoglycosides(), polymyxins()),
resistance)) resistance))
#> For `aminoglycosides()` using columns GEN (gentamicin), TOB (tobramycin), AMK #> For `aminoglycosides()` using columns GEN (gentamicin), TOB
#> (amikacin), and KAN (kanamycin) #> (tobramycin), AMK (amikacin), and KAN (kanamycin)
#> For `polymyxins()` using column COL (colistin) #> For `polymyxins()` using column COL (colistin)
#> Warning: There was 1 warning in `summarise()`. #> Warning: There was 1 warning in `summarise()`.
#> In argument: `across(c(aminoglycosides(), polymyxins()), resistance)`. #> In argument: `across(c(aminoglycosides(), polymyxins()),
#> resistance)`.
#> In group 3: `ward = "Outpatient"`. #> In group 3: `ward = "Outpatient"`.
#> Caused by warning: #> Caused by warning:
#> ! Introducing NA: only 23 results available for KAN in group: ward = "Outpatient" #> ! Introducing NA: only 23 results available for KAN in group:
#> (whilst `minimum = 30`). #> ward = "Outpatient" (whilst `minimum = 30`).
out out
#> # A tibble: 3 × 6 #> # A tibble: 3 × 6
#> ward GEN TOB AMK KAN COL #> ward GEN TOB AMK KAN COL
@@ -366,11 +369,12 @@ out
# transform the antibiotic columns to names: # transform the antibiotic columns to names:
out %>% set_ab_names() out %>% set_ab_names()
#> # A tibble: 3 × 6 #> # A tibble: 3 × 6
#> ward gentamicin tobramycin amikacin kanamycin colistin #> ward gentamicin tobramycin amikacin kanamycin
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> #> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Clinical 0.229 0.315 0.626 1 0.780 #> 1 Clinical 0.229 0.315 0.626 1
#> 2 ICU 0.290 0.400 0.662 1 0.857 #> 2 ICU 0.290 0.400 0.662 1
#> 3 Outpatient 0.2 0.368 0.605 NA 0.889 #> 3 Outpatient 0.2 0.368 0.605 NA
#> # 1 more variable: colistin <dbl>
``` ```
``` r ``` r

View File

@@ -2,10 +2,13 @@
% Please edit documentation in R/zz_deprecated.R % Please edit documentation in R/zz_deprecated.R
\name{AMR-deprecated} \name{AMR-deprecated}
\alias{AMR-deprecated} \alias{AMR-deprecated}
\alias{custom_eucast_rules}
\alias{ab_class} \alias{ab_class}
\alias{ab_selector} \alias{ab_selector}
\title{Deprecated Functions, Arguments, or Datasets} \title{Deprecated Functions, Arguments, or Datasets}
\usage{ \usage{
custom_eucast_rules(...)
ab_class(...) ab_class(...)
ab_selector(...) ab_selector(...)

View File

@@ -25,7 +25,8 @@ antibiogram(x, antimicrobials = where(is.sir), mo_transform = "shortname",
ifelse(wisca, 14, 18)), col_mo = NULL, language = get_AMR_locale(), ifelse(wisca, 14, 18)), col_mo = NULL, language = get_AMR_locale(),
minimum = 30, combine_SI = TRUE, sep = " + ", sort_columns = TRUE, minimum = 30, combine_SI = TRUE, sep = " + ", sort_columns = TRUE,
wisca = FALSE, simulations = 1000, conf_interval = 0.95, wisca = FALSE, simulations = 1000, conf_interval = 0.95,
interval_side = "two-tailed", info = interactive(), ...) interval_side = "two-tailed", info = interactive(), parallel = FALSE,
...)
wisca(x, antimicrobials = where(is.sir), ab_transform = "name", wisca(x, antimicrobials = where(is.sir), ab_transform = "name",
syndromic_group = NULL, only_all_tested = FALSE, digits = 1, syndromic_group = NULL, only_all_tested = FALSE, digits = 1,
@@ -33,7 +34,7 @@ wisca(x, antimicrobials = where(is.sir), ab_transform = "name",
col_mo = NULL, language = get_AMR_locale(), combine_SI = TRUE, col_mo = NULL, language = get_AMR_locale(), combine_SI = TRUE,
sep = " + ", sort_columns = TRUE, simulations = 1000, sep = " + ", sort_columns = TRUE, simulations = 1000,
conf_interval = 0.95, interval_side = "two-tailed", conf_interval = 0.95, interval_side = "two-tailed",
info = interactive(), ...) info = interactive(), parallel = FALSE, ...)
retrieve_wisca_parameters(wisca_model, ...) retrieve_wisca_parameters(wisca_model, ...)
@@ -80,7 +81,7 @@ retrieve_wisca_parameters(wisca_model, ...)
\item{digits}{Number of digits to use for rounding the antimicrobial coverage, defaults to 1 for WISCA and 0 otherwise.} \item{digits}{Number of digits to use for rounding the antimicrobial coverage, defaults to 1 for WISCA and 0 otherwise.}
\item{formatting_type}{Numeric value (122 for WISCA, 1-12 for non-WISCA) indicating how the 'cells' of the antibiogram table should be formatted. See \emph{Details} > \emph{Formatting Type} for a list of options.} \item{formatting_type}{Numeric value (1-22 for WISCA, 1-12 for non-WISCA) indicating how the 'cells' of the antibiogram table should be formatted. See \emph{Details} > \emph{Formatting Type} for a list of options.}
\item{col_mo}{Column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} \item{col_mo}{Column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.}
@@ -104,6 +105,8 @@ retrieve_wisca_parameters(wisca_model, ...)
\item{info}{A \link{logical} to indicate info should be printed - the default is \code{TRUE} only in interactive mode.} \item{info}{A \link{logical} to indicate info should be printed - the default is \code{TRUE} only in interactive mode.}
\item{parallel}{A \link{logical} to indicate if parallel computing must be used, defaults to \code{FALSE}. Requires the \code{\link[future.apply:future_lapply]{future.apply}} package. For WISCA, Monte Carlo simulations are distributed across workers; for grouped antibiograms, each group is processed by a separate worker. \strong{A non-sequential \code{\link[future:plan]{future::plan()}} must already be active before setting \code{parallel = TRUE}} -- for example, \code{future::plan(future::multisession)}. An error is thrown if \code{parallel = TRUE} is used without a plan set by the user.}
\item{...}{When used in \link[knitr:kable]{R Markdown or Quarto}: arguments passed on to \code{\link[knitr:kable]{knitr::kable()}} (otherwise, has no use).} \item{...}{When used in \link[knitr:kable]{R Markdown or Quarto}: arguments passed on to \code{\link[knitr:kable]{knitr::kable()}} (otherwise, has no use).}
\item{wisca_model}{The outcome of \code{\link[=wisca]{wisca()}} or \code{\link[=antibiogram]{antibiogram(..., wisca = TRUE)}}.} \item{wisca_model}{The outcome of \code{\link[=wisca]{wisca()}} or \code{\link[=antibiogram]{antibiogram(..., wisca = TRUE)}}.}

View File

@@ -73,7 +73,7 @@ is_sir_eligible(x, threshold = 0.05)
include_PKPD = getOption("AMR_include_PKPD", TRUE), include_PKPD = getOption("AMR_include_PKPD", TRUE),
breakpoint_type = getOption("AMR_breakpoint_type", "human"), host = NULL, breakpoint_type = getOption("AMR_breakpoint_type", "human"), host = NULL,
language = get_AMR_locale(), verbose = FALSE, info = interactive(), language = get_AMR_locale(), verbose = FALSE, info = interactive(),
parallel = FALSE, max_cores = -1, conserve_capped_values = NULL) parallel = FALSE, conserve_capped_values = NULL)
sir_interpretation_history(clean = FALSE) sir_interpretation_history(clean = FALSE)
} }
@@ -130,7 +130,7 @@ The default \code{"conservative"} setting ensures cautious handling of uncertain
\item{add_intrinsic_resistance}{\emph{(only useful when using a EUCAST guideline)} a \link{logical} to indicate whether intrinsic antibiotic resistance must also be considered for applicable bug-drug combinations, meaning that e.g. ampicillin will always return "R" in \emph{Klebsiella} species. Determination is based on the \link{intrinsic_resistant} data set, that itself is based on \href{https://www.eucast.org/bacteria/important-additional-information/expert-rules/}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021).} \item{add_intrinsic_resistance}{\emph{(only useful when using a EUCAST guideline)} a \link{logical} to indicate whether intrinsic antibiotic resistance must also be considered for applicable bug-drug combinations, meaning that e.g. ampicillin will always return "R" in \emph{Klebsiella} species. Determination is based on the \link{intrinsic_resistant} data set, that itself is based on \href{https://www.eucast.org/bacteria/important-additional-information/expert-rules/}{'EUCAST Expert Rules' and 'EUCAST Intrinsic Resistance and Unusual Phenotypes' v3.3} (2021).}
\item{reference_data}{A \link{data.frame} to be used for interpretation, which defaults to the \link{clinical_breakpoints} data set. Changing this argument allows for using own interpretation guidelines. This argument must contain a data set that is equal in structure to the \link{clinical_breakpoints} data set (same column names and column types). Please note that the \code{guideline} argument will be ignored when \code{reference_data} is manually set.} \item{reference_data}{A \link{data.frame} to be used for interpretation, which defaults to the \link{clinical_breakpoints} data set. Changing this argument allows for using own interpretation guidelines. This argument must have the same column names as the \link{clinical_breakpoints} data set. Column types are coerced automatically where possible: the \code{mo} column is passed through \code{\link[=as.mo]{as.mo()}}, the \code{ab} column through \code{\link[=as.ab]{as.ab()}}, and plain character, numeric, or logical columns are cast to the expected type. When \code{reference_data} is manually set, the \code{guideline} argument is optional: if omitted (or if its value does not match any row in the custom data), all rows in \code{reference_data} are considered. If \code{guideline} is set to a value that exists in the \code{guideline} column of the custom data, only matching rows are used — useful when a single custom table contains multiple guidelines. For the R classification, the EUCAST convention is used by default: MIC values \verb{> breakpoint_R} and disk diffusion values \verb{< breakpoint_R} are classified as R, with values between \code{breakpoint_S} and \code{breakpoint_R} classified as I (or SDD). Only when using the standard \link{clinical_breakpoints} with a CLSI guideline are the closed-interval rules (\verb{>= breakpoint_R} for MIC, \verb{<= breakpoint_R} for disk) applied; custom \code{reference_data} always uses the open-interval (EUCAST) convention regardless of the guideline name.}
\item{substitute_missing_r_breakpoint}{A \link{logical} to indicate that a missing clinical breakpoints for R (resistant) must be substituted with R - the default is \code{FALSE}. Some (especially CLSI) breakpoints only have a breakpoint for S, meaning that the outcome can only be \code{"S"} or \code{NA}. Setting this to \code{TRUE} will convert the \code{NA}s in these cases to \code{"R"}. Can also be set with the package option \code{\link[=AMR-options]{AMR_substitute_missing_r_breakpoint}}.} \item{substitute_missing_r_breakpoint}{A \link{logical} to indicate that a missing clinical breakpoints for R (resistant) must be substituted with R - the default is \code{FALSE}. Some (especially CLSI) breakpoints only have a breakpoint for S, meaning that the outcome can only be \code{"S"} or \code{NA}. Setting this to \code{TRUE} will convert the \code{NA}s in these cases to \code{"R"}. Can also be set with the package option \code{\link[=AMR-options]{AMR_substitute_missing_r_breakpoint}}.}
@@ -150,9 +150,7 @@ The default \code{"conservative"} setting ensures cautious handling of uncertain
\item{col_mo}{Column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.} \item{col_mo}{Column name of the names or codes of the microorganisms (see \code{\link[=as.mo]{as.mo()}}) - the default is the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.}
\item{parallel}{A \link{logical} to indicate if parallel computing must be used, defaults to \code{FALSE}. This requires no additional packages, as the used \code{parallel} package is part of base \R. On Windows and on \R < 4.0.0 \code{\link[parallel:clusterApply]{parallel::parLapply()}} will be used, in all other cases the more efficient \code{\link[parallel:mclapply]{parallel::mclapply()}} will be used.} \item{parallel}{A \link{logical} to indicate if parallel computing must be used, defaults to \code{FALSE}. Requires the \code{\link[future.apply:future_lapply]{future.apply}} package. \strong{A non-sequential \code{\link[future:plan]{future::plan()}} must already be active before setting \code{parallel = TRUE}} — for example, \code{future::plan(future::multisession)}. An error is thrown if \code{parallel = TRUE} is used without a plan set by the user. Parallelism distributes columns (and optionally row batches) across workers; it is most beneficial when there are many antibiotic columns and a large number of rows.}
\item{max_cores}{Maximum number of cores to use if \code{parallel = TRUE}. Use a negative value to subtract that number from the available number of cores, e.g. a value of \code{-2} on an 8-core machine means that at most 6 cores will be used. Defaults to \code{-1}. There will never be used more cores than variables to analyse. The available number of cores are detected using \code{\link[parallelly:availableCores]{parallelly::availableCores()}} if that package is installed, and base \R's \code{\link[parallel:detectCores]{parallel::detectCores()}} otherwise.}
\item{clean}{A \link{logical} to indicate whether previously stored results should be forgotten after returning the 'logbook' with results.} \item{clean}{A \link{logical} to indicate whether previously stored results should be forgotten after returning the 'logbook' with results.}
} }
@@ -183,7 +181,7 @@ your_data \%>\% mutate_if(is.mic, as.sir, ab = c("cipro", "ampicillin", ...), mo
# for veterinary breakpoints, also set `host`: # for veterinary breakpoints, also set `host`:
your_data \%>\% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI") your_data \%>\% mutate_if(is.mic, as.sir, host = "column_with_animal_species", guideline = "CLSI")
# fast processing with parallel computing: # fast processing with parallel computing (requires future.apply):
as.sir(your_data, ..., parallel = TRUE) as.sir(your_data, ..., parallel = TRUE)
}\if{html}{\out{</div>}} }\if{html}{\out{</div>}}
\item Operators like "<=" will be considered according to the \code{capped_mic_handling} setting. At default, an MIC value of e.g. ">2" will return "NI" (non-interpretable) if the breakpoint is 4-8; the \emph{true} MIC could be at either side of the breakpoint. This is to prevent that capped values from raw laboratory data would not be treated conservatively. \item Operators like "<=" will be considered according to the \code{capped_mic_handling} setting. At default, an MIC value of e.g. ">2" will return "NI" (non-interpretable) if the breakpoint is 4-8; the \emph{true} MIC could be at either side of the breakpoint. This is to prevent that capped values from raw laboratory data would not be treated conservatively.
@@ -201,7 +199,7 @@ your_data \%>\% mutate_if(is.disk, as.sir, ab = c("cipro", "ampicillin", ...), m
# for veterinary breakpoints, also set `host`: # for veterinary breakpoints, also set `host`:
your_data \%>\% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI") your_data \%>\% mutate_if(is.disk, as.sir, host = "column_with_animal_species", guideline = "CLSI")
# fast processing with parallel computing: # fast processing with parallel computing (requires future.apply):
as.sir(your_data, ..., parallel = TRUE) as.sir(your_data, ..., parallel = TRUE)
}\if{html}{\out{</div>}} }\if{html}{\out{</div>}}
} }
@@ -313,9 +311,6 @@ as.sir(df_wide)
sir_interpretation_history() sir_interpretation_history()
\donttest{ \donttest{
# using parallel computing, which is available in base R:
as.sir(df_wide, parallel = TRUE, info = TRUE)
## Using dplyr ------------------------------------------------- ## Using dplyr -------------------------------------------------
if (require("dplyr")) { if (require("dplyr")) {

View File

@@ -1,10 +1,10 @@
% Generated by roxygen2: do not edit by hand % Generated by roxygen2: do not edit by hand
% Please edit documentation in R/custom_eucast_rules.R % Please edit documentation in R/custom_interpretive_rules.R
\name{custom_eucast_rules} \name{custom_interpretive_rules}
\alias{custom_eucast_rules} \alias{custom_interpretive_rules}
\title{Define Custom EUCAST Rules} \title{Define Custom Interpretive Rules}
\usage{ \usage{
custom_eucast_rules(...) custom_interpretive_rules(...)
} }
\arguments{ \arguments{
\item{...}{Rules in \link[base:tilde]{formula} notation, see below for instructions, and in \emph{Examples}.} \item{...}{Rules in \link[base:tilde]{formula} notation, see below for instructions, and in \emph{Examples}.}
@@ -13,22 +13,22 @@ custom_eucast_rules(...)
A \link{list} containing the custom rules A \link{list} containing the custom rules
} }
\description{ \description{
Define custom EUCAST rules for your organisation or specific analysis and use the output of this function in \code{\link[=eucast_rules]{eucast_rules()}}. Define custom interpretive rules for your organisation or specific analysis and use the output of this function in \code{\link[=interpretive_rules]{interpretive_rules()}}.
} }
\details{ \details{
Some organisations have their own adoption of EUCAST rules. This function can be used to define custom EUCAST rules to be used in the \code{\link[=eucast_rules]{eucast_rules()}} function. Some organisations have their own adoption of interpretive rules. This function can be used to define custom rules to be used in the \code{\link[=interpretive_rules]{interpretive_rules()}} function.
\subsection{Basics}{ \subsection{Basics}{
If you are familiar with the \code{\link[dplyr:case-and-replace-when]{case_when()}} function of the \code{dplyr} package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written \emph{before} the tilde (\code{~}) and the consequence of the rule is written \emph{after} the tilde: If you are familiar with the \code{\link[dplyr:case-and-replace-when]{case_when()}} function of the \code{dplyr} package, you will recognise the input method to set your own rules. Rules must be set using what \R considers to be the 'formula notation'. The rule itself is written \emph{before} the tilde (\code{~}) and the consequence of the rule is written \emph{after} the tilde:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_eucast_rules(TZP == "S" ~ aminopenicillins == "S", \if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_interpretive_rules(TZP == "S" ~ aminopenicillins == "S",
TZP == "R" ~ aminopenicillins == "R") TZP == "R" ~ aminopenicillins == "R")
}\if{html}{\out{</div>}} }\if{html}{\out{</div>}}
These are two custom EUCAST rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work: These are two custom interpretive rules: if TZP (piperacillin/tazobactam) is "S", all aminopenicillins (ampicillin and amoxicillin) must be made "S", and if TZP is "R", aminopenicillins must be made "R". These rules can also be printed to the console, so it is immediately clear how they work:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{x \if{html}{\out{<div class="sourceCode r">}}\preformatted{x
#> A set of custom EUCAST rules: #> A set of custom interpretive rules:
#> #>
#> 1. If TZP is "S" then set to S : #> 1. If TZP is "S" then set to S :
#> amoxicillin (AMX), ampicillin (AMP) #> amoxicillin (AMX), ampicillin (AMP)
@@ -48,11 +48,11 @@ df
#> 1 Escherichia coli R S S #> 1 Escherichia coli R S S
#> 2 Klebsiella pneumoniae R S S #> 2 Klebsiella pneumoniae R S S
eucast_rules(df, interpretive_rules(df,
rules = "custom", rules = "custom",
custom_rules = x, custom_rules = x,
info = FALSE, info = FALSE,
overwrite = TRUE) overwrite = TRUE)
#> mo TZP ampi cipro #> mo TZP ampi cipro
#> 1 Escherichia coli R R S #> 1 Escherichia coli R R S
#> 2 Klebsiella pneumoniae R R S #> 2 Klebsiella pneumoniae R R S
@@ -63,16 +63,16 @@ eucast_rules(df,
There is one exception in columns used for the rules: all column names of the \link{microorganisms} data set can also be used, but do not have to exist in the data set. These column names are: \code{"mo"}, \code{"fullname"}, \code{"status"}, \code{"kingdom"}, \code{"phylum"}, \code{"class"}, \code{"order"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"subspecies"}, \code{"rank"}, \code{"ref"}, \code{"oxygen_tolerance"}, \code{"source"}, \code{"lpsn"}, \code{"lpsn_parent"}, \code{"lpsn_renamed_to"}, \code{"mycobank"}, \code{"mycobank_parent"}, \code{"mycobank_renamed_to"}, \code{"gbif"}, \code{"gbif_parent"}, \code{"gbif_renamed_to"}, \code{"prevalence"}, and \code{"snomed"}. Thus, this next example will work as well, despite the fact that the \code{df} data set does not contain a column \code{genus}: There is one exception in columns used for the rules: all column names of the \link{microorganisms} data set can also be used, but do not have to exist in the data set. These column names are: \code{"mo"}, \code{"fullname"}, \code{"status"}, \code{"kingdom"}, \code{"phylum"}, \code{"class"}, \code{"order"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"subspecies"}, \code{"rank"}, \code{"ref"}, \code{"oxygen_tolerance"}, \code{"source"}, \code{"lpsn"}, \code{"lpsn_parent"}, \code{"lpsn_renamed_to"}, \code{"mycobank"}, \code{"mycobank_parent"}, \code{"mycobank_renamed_to"}, \code{"gbif"}, \code{"gbif_parent"}, \code{"gbif_renamed_to"}, \code{"prevalence"}, and \code{"snomed"}. Thus, this next example will work as well, despite the fact that the \code{df} data set does not contain a column \code{genus}:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{y <- custom_eucast_rules( \if{html}{\out{<div class="sourceCode r">}}\preformatted{y <- custom_interpretive_rules(
TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S", TZP == "S" & genus == "Klebsiella" ~ aminopenicillins == "S",
TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R" TZP == "R" & genus == "Klebsiella" ~ aminopenicillins == "R"
) )
eucast_rules(df, interpretive_rules(df,
rules = "custom", rules = "custom",
custom_rules = y, custom_rules = y,
info = FALSE, info = FALSE,
overwrite = TRUE) overwrite = TRUE)
#> mo TZP ampi cipro #> mo TZP ampi cipro
#> 1 Escherichia coli R S S #> 1 Escherichia coli R S S
#> 2 Klebsiella pneumoniae R R S #> 2 Klebsiella pneumoniae R R S
@@ -90,9 +90,9 @@ You can define antimicrobial groups instead of single antimicrobials for the rul
Rules can also be applied to multiple antimicrobials and antimicrobial groups simultaneously. Use the \code{c()} function to combine multiple antimicrobials. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R": Rules can also be applied to multiple antimicrobials and antimicrobial groups simultaneously. Use the \code{c()} function to combine multiple antimicrobials. For instance, the following example sets all aminopenicillins and ureidopenicillins to "R" if column TZP (piperacillin/tazobactam) is "R":
\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_eucast_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R") \if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_interpretive_rules(TZP == "R" ~ c(aminopenicillins, ureidopenicillins) == "R")
x x
#> A set of custom EUCAST rules: #> A set of custom interpretive rules:
#> #>
#> 1. If TZP is "R" then set to "R": #> 1. If TZP is "R" then set to "R":
#> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP) #> amoxicillin (AMX), ampicillin (AMP), azlocillin (AZL), mezlocillin (MEZ), piperacillin (PIP), piperacillin/tazobactam (TZP)
@@ -147,7 +147,7 @@ These 43 antimicrobial groups are allowed in the rules (case-insensitive) and ca
} }
} }
\examples{ \examples{
x <- custom_eucast_rules( x <- custom_interpretive_rules(
AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I" AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I"
) )
@@ -165,7 +165,7 @@ eucast_rules(example_isolates,
# combine rule sets # combine rule sets
x2 <- c( x2 <- c(
x, x,
custom_eucast_rules(TZP == "R" ~ carbapenems == "R") custom_interpretive_rules(TZP == "R" ~ carbapenems == "R")
) )
x2 x2
} }

View File

@@ -46,7 +46,7 @@ A list with class \code{"htest"} containing the following
\code{(observed - expected) / sqrt(expected)}.} \code{(observed - expected) / sqrt(expected)}.}
\item{stdres}{standardized residuals, \item{stdres}{standardized residuals,
\code{(observed - expected) / sqrt(V)}, where \code{V} is the \code{(observed - expected) / sqrt(V)}, where \code{V} is the
residual cell variance (Agresti, 2007, section 2.4.5 residual cell variance {(\if{html}{\out{<a href="#reference+chisq.test.Rd+R+3AAgresti+3A2007" class="citation">}}Agresti 2007\if{html}{\out{</a>}}, section 2.4.5)}
for the case where \code{x} is a matrix, \code{n * p * (1 - p)} otherwise).} for the case where \code{x} is a matrix, \code{n * p * (1 - p)} otherwise).}
} }
\description{ \description{

View File

@@ -42,8 +42,9 @@ ggplot_pca(x, choices = 1:2, scale = 1, pc.biplot = TRUE,
} }
\item{pc.biplot}{ \item{pc.biplot}{
If true, use what Gabriel (1971) refers to as a "principal component If true, use what {\if{html}{\cite{}\out{<a href="#reference+biplot.princomp.Rd+R+3AGabriel+3A1971" class="citation">}}Gabriel (1971)\if{html}{\out{</a>}}} refers to as a
biplot", with \code{lambda = 1} and observations scaled up by sqrt(n) and \dQuote{principal component biplot},
with \code{lambda = 1} and observations scaled up by sqrt(n) and
variables scaled down by sqrt(n). Then inner products between variables scaled down by sqrt(n). Then inner products between
variables approximate covariances and distances between observations variables approximate covariances and distances between observations
approximate Mahalanobis distance. approximate Mahalanobis distance.

View File

@@ -46,7 +46,7 @@ eucast_dosage(ab, administration = "iv", version_breakpoints = 15)
\item{info}{A \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions.} \item{info}{A \link{logical} to indicate whether progress should be printed to the console - the default is only print while in interactive sessions.}
\item{rules}{A \link{character} vector that specifies which rules should be applied. Must be one or more of \code{"breakpoints"}, \code{"expected_phenotypes"}, \code{"expert"}, \code{"other"}, \code{"custom"}, \code{"all"}, and defaults to \code{c("breakpoints", "expected_phenotypes")}. The default value can be set to another value using the package option \code{\link[=AMR-options]{AMR_interpretive_rules}}: \code{options(AMR_interpretive_rules = "all")}. If using \code{"custom"}, be sure to fill in argument \code{custom_rules} too. Custom rules can be created with \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}.} \item{rules}{A \link{character} vector that specifies which rules should be applied. Must be one or more of \code{"breakpoints"}, \code{"expected_phenotypes"}, \code{"expert"}, \code{"other"}, \code{"custom"}, \code{"all"}, and defaults to \code{c("breakpoints", "expected_phenotypes")}. The default value can be set to another value using the package option \code{\link[=AMR-options]{AMR_interpretive_rules}}: \code{options(AMR_interpretive_rules = "all")}. If using \code{"custom"}, be sure to fill in argument \code{custom_rules} too. Custom rules can be created with \code{\link[=custom_interpretive_rules]{custom_interpretive_rules()}}.}
\item{guideline}{A guideline name, either "EUCAST" (default) or "CLSI". This can be set with the package option \code{\link[=AMR-options]{AMR_guideline}}.} \item{guideline}{A guideline name, either "EUCAST" (default) or "CLSI". This can be set with the package option \code{\link[=AMR-options]{AMR_guideline}}.}
@@ -62,7 +62,7 @@ eucast_dosage(ab, administration = "iv", version_breakpoints = 15)
\item{only_sir_columns}{A \link{logical} to indicate whether only antimicrobial columns must be included that were transformed to class \link[=as.sir]{sir} on beforehand. Defaults to \code{FALSE} if no columns of \code{x} have a class \link[=as.sir]{sir}.} \item{only_sir_columns}{A \link{logical} to indicate whether only antimicrobial columns must be included that were transformed to class \link[=as.sir]{sir} on beforehand. Defaults to \code{FALSE} if no columns of \code{x} have a class \link[=as.sir]{sir}.}
\item{custom_rules}{Custom rules to apply, created with \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}.} \item{custom_rules}{Custom rules to apply, created with \code{\link[=custom_interpretive_rules]{custom_interpretive_rules()}}.}
\item{overwrite}{A \link{logical} indicating whether to overwrite existing SIR values (default: \code{FALSE}). When \code{FALSE}, only non-SIR values are modified (i.e., any value that is not already S, I or R). To ensure compliance with EUCAST guidelines, \strong{this should remain} \code{FALSE}, as EUCAST notes often state that an organism "should be tested for susceptibility to individual agents or be reported resistant".} \item{overwrite}{A \link{logical} indicating whether to overwrite existing SIR values (default: \code{FALSE}). When \code{FALSE}, only non-SIR values are modified (i.e., any value that is not already S, I or R). To ensure compliance with EUCAST guidelines, \strong{this should remain} \code{FALSE}, as EUCAST notes often state that an organism "should be tested for susceptibility to individual agents or be reported resistant".}
@@ -86,15 +86,15 @@ To improve the interpretation of the antibiogram before CLSI/EUCAST interpretive
\strong{Note:} This function does not translate MIC or disk values to SIR values. Use \code{\link[=as.sir]{as.sir()}} for that. \cr \strong{Note:} This function does not translate MIC or disk values to SIR values. Use \code{\link[=as.sir]{as.sir()}} for that. \cr
\strong{Note:} When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr \strong{Note:} When ampicillin (AMP, J01CA01) is not available but amoxicillin (AMX, J01CA04) is, the latter will be used for all rules where there is a dependency on ampicillin. These drugs are interchangeable when it comes to expression of antimicrobial resistance. \cr
The file containing all EUCAST rules is located here: \url{https://github.com/msberends/AMR/blob/main/data-raw/eucast_rules.tsv}. \strong{Note:} Old taxonomic names are replaced with the current taxonomy where applicable. For example, \emph{Ochrobactrum anthropi} was renamed to \emph{Brucella anthropi} in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The \code{AMR} package contains the full microbial taxonomy updated until June 24th, 2024, see \link{microorganisms}. The file containing all interpretive rules is located here: \url{https://github.com/msberends/AMR/blob/main/data-raw/interpretive_rules.tsv}. \strong{Note:} Old taxonomic names are replaced with the current taxonomy where applicable. For example, \emph{Ochrobactrum anthropi} was renamed to \emph{Brucella anthropi} in 2020; the original EUCAST rules v3.1 and v3.2 did not yet contain this new taxonomic name. The \code{AMR} package contains the full microbial taxonomy updated until June 24th, 2024, see \link{microorganisms}.
\subsection{Custom Rules}{ \subsection{Custom Rules}{
Custom rules can be created using \code{\link[=custom_eucast_rules]{custom_eucast_rules()}}, e.g.: Custom rules can be created using \code{\link[=custom_interpretive_rules]{custom_interpretive_rules()}}, e.g.:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_eucast_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", \if{html}{\out{<div class="sourceCode r">}}\preformatted{x <- custom_interpretive_rules(AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I") AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I")
eucast_rules(example_isolates, rules = "custom", custom_rules = x) interpretive_rules(example_isolates, rules = "custom", custom_rules = x)
}\if{html}{\out{</div>}} }\if{html}{\out{</div>}}
} }
@@ -108,7 +108,7 @@ Before further processing, two non-EUCAST rules about drug combinations can be a
Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set. Important examples include amoxicillin and amoxicillin/clavulanic acid, and trimethoprim and trimethoprim/sulfamethoxazole. Needless to say, for these rules to work, both drugs must be available in the data set.
Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include \code{"other"} to the \code{rules} argument, or use \code{eucast_rules(..., rules = "all")}. You can also set the package option \code{\link[=AMR-options]{AMR_interpretive_rules}}, i.e. run \code{options(AMR_interpretive_rules = "all")}. Since these rules are not officially approved by EUCAST, they are not applied at default. To use these rules, include \code{"other"} to the \code{rules} argument, or use \code{interpretive_rules(..., rules = "all")}. You can also set the package option \code{\link[=AMR-options]{AMR_interpretive_rules}}, i.e. run \code{options(AMR_interpretive_rules = "all")}.
} }
} }
\section{Download Our Reference Data}{ \section{Download Our Reference Data}{

View File

@@ -12,8 +12,9 @@ A \link[tibble:tibble]{tibble} with 78 679 observations and 26 variables:
\item \code{status} \cr Status of the taxon, either \code{"accepted"}, \code{"not validly published"}, \code{"synonym"}, or \code{"unknown"} \item \code{status} \cr Status of the taxon, either \code{"accepted"}, \code{"not validly published"}, \code{"synonym"}, or \code{"unknown"}
\item \code{kingdom}, \code{phylum}, \code{class}, \code{order}, \code{family}, \code{genus}, \code{species}, \code{subspecies}\cr Taxonomic rank of the microorganism. Note that for fungi, \emph{phylum} is equal to their taxonomic \emph{division}. Also, for fungi, \emph{subkingdom} and \emph{subdivision} were left out since they do not occur in the bacterial taxonomy. \item \code{kingdom}, \code{phylum}, \code{class}, \code{order}, \code{family}, \code{genus}, \code{species}, \code{subspecies}\cr Taxonomic rank of the microorganism. Note that for fungi, \emph{phylum} is equal to their taxonomic \emph{division}. Also, for fungi, \emph{subkingdom} and \emph{subdivision} were left out since they do not occur in the bacterial taxonomy.
\item \code{rank}\cr Text of the taxonomic rank of the microorganism, such as \code{"species"} or \code{"genus"} \item \code{rank}\cr Text of the taxonomic rank of the microorganism, such as \code{"species"} or \code{"genus"}
\item \code{ref}\cr Author(s) and year of related scientific publication. This contains only the \emph{first surname} and year of the \emph{latest} authors, e.g. "Wallis \emph{et al.} 2006 \emph{emend.} Smith and Jones 2018" becomes "Smith \emph{et al.}, 2018". This field is directly retrieved from the source specified in the column \code{source}. Moreover, accents were removed to comply with CRAN that only allows ASCII characters. \item \code{ref}\cr Abbreviated authority citation for the nomenclatural act that established the current name combination, following ICNP conventions. For species described in their current genus (\emph{sp. nov.}), this is the original description author(s) and year. For species transferred to a different genus (\emph{comb. nov.}), this is the reclassification author(s) and year. Emendations are excluded. For synonyms, this is the authority under which the synonym was originally published. This field is directly retrieved from the source specified in the column \code{source}. Diacritics were removed to comply with CRAN, that only allows ASCII characters.
\item \code{oxygen_tolerance} \cr Oxygen tolerance, either \code{"aerobe"}, \code{"anaerobe"}, \code{"anaerobe/microaerophile"}, \code{"facultative anaerobe"}, \code{"likely facultative anaerobe"}, \code{"microaerophile"}, or NA. These data were retrieved from BacDive (see \emph{Source}). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently 68.3\% of all ~39 000 bacteria in the data set contain an oxygen tolerance. \item \code{oxygen_tolerance} \cr Oxygen tolerance, either \code{"aerobe"}, \code{"anaerobe"}, \code{"anaerobe/microaerophile"}, \code{"facultative anaerobe"}, \code{"likely facultative anaerobe"}, \code{"microaerophile"}, or NA. These data were retrieved from BacDive (see \emph{Source}). Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus to guess the oxygen tolerance. Currently 68.3\% of all ~39 000 bacteria in the data set contain an oxygen tolerance.
\item \code{morphology} \cr Morphology (cell shape), either \code{""}. These data were retrieved from BacDive (see \emph{Source}). Genera that are clinically established as coccobacilli (the HACEK group and beyond, such as \emph{Haemophilus} and \emph{Acinetobacter}) are classified as such regardless of BacDive majority vote. Items that contain "likely" are missing from BacDive and were extrapolated from other species within the same genus. Currently 0\% of all ~39 000 bacteria in the data set contain a morphology.
\item \code{source}\cr Either \code{"GBIF"}, \code{"LPSN"}, \code{"Manually added"}, \code{"MycoBank"}, or \code{"manually added"} (see \emph{Source}) \item \code{source}\cr Either \code{"GBIF"}, \code{"LPSN"}, \code{"Manually added"}, \code{"MycoBank"}, or \code{"manually added"} (see \emph{Source})
\item \code{lpsn}\cr Identifier ('Record number') of List of Prokaryotic names with Standing in Nomenclature (LPSN). This will be the first/highest LPSN identifier to keep one identifier per row. For example, \emph{Acetobacter ascendens} has LPSN Record number 7864 and 11011. Only the first is available in the \code{microorganisms} data set. \emph{\strong{This is a unique identifier}}, though available for only ~33 000 records. \item \code{lpsn}\cr Identifier ('Record number') of List of Prokaryotic names with Standing in Nomenclature (LPSN). This will be the first/highest LPSN identifier to keep one identifier per row. For example, \emph{Acetobacter ascendens} has LPSN Record number 7864 and 11011. Only the first is available in the \code{microorganisms} data set. \emph{\strong{This is a unique identifier}}, though available for only ~33 000 records.
\item \code{lpsn_parent}\cr LPSN identifier of the parent taxon \item \code{lpsn_parent}\cr LPSN identifier of the parent taxon

View File

@@ -24,6 +24,7 @@
\alias{mo_is_intrinsic_resistant} \alias{mo_is_intrinsic_resistant}
\alias{mo_oxygen_tolerance} \alias{mo_oxygen_tolerance}
\alias{mo_is_anaerobic} \alias{mo_is_anaerobic}
\alias{mo_morphology}
\alias{mo_snomed} \alias{mo_snomed}
\alias{mo_ref} \alias{mo_ref}
\alias{mo_authors} \alias{mo_authors}
@@ -86,7 +87,8 @@ mo_pathogenicity(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...)
mo_gramstain(x, language = get_AMR_locale(), mo_gramstain(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) keep_synonyms = getOption("AMR_keep_synonyms", FALSE),
add_morphology = FALSE, ...)
mo_is_gram_negative(x, language = get_AMR_locale(), mo_is_gram_negative(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...)
@@ -106,6 +108,9 @@ mo_oxygen_tolerance(x, language = get_AMR_locale(),
mo_is_anaerobic(x, language = get_AMR_locale(), mo_is_anaerobic(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...)
mo_morphology(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...)
mo_snomed(x, language = get_AMR_locale(), mo_snomed(x, language = get_AMR_locale(),
keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...) keep_synonyms = getOption("AMR_keep_synonyms", FALSE), ...)
@@ -161,6 +166,8 @@ The default is \code{FALSE}, which will return a note if outdated taxonomic name
\item{...}{Other arguments passed on to \code{\link[=as.mo]{as.mo()}}, such as 'minimum_matching_score', 'ignore_pattern', and 'remove_from_input'.} \item{...}{Other arguments passed on to \code{\link[=as.mo]{as.mo()}}, such as 'minimum_matching_score', 'ignore_pattern', and 'remove_from_input'.}
\item{add_morphology}{a \link{logical} to indicate whether the morphology (from \code{\link[=mo_morphology]{mo_morphology()}}) should be added to the Gram stain result, e.g. \code{"Gram-negative rods"} instead of \code{"Gram-negative"}. The default is \code{FALSE}.}
\item{ab}{Any (vector of) text that can be coerced to a valid antibiotic drug code with \code{\link[=as.ab]{as.ab()}}.} \item{ab}{Any (vector of) text that can be coerced to a valid antibiotic drug code with \code{\link[=as.ab]{as.ab()}}.}
\item{open}{Browse the URL using \code{\link[utils:browseURL]{browseURL()}}.} \item{open}{Browse the URL using \code{\link[utils:browseURL]{browseURL()}}.}
@@ -189,21 +196,23 @@ All functions will, at default, \strong{not} keep old taxonomic properties, as s
\item \code{mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)} will return \code{"Hormaeche et al., 1960"} (with a once-per-session warning that the name is outdated) \item \code{mo_ref("Enterobacter aerogenes", keep_synonyms = TRUE)} will return \code{"Hormaeche et al., 1960"} (with a once-per-session warning that the name is outdated)
} }
The short name (\code{\link[=mo_shortname]{mo_shortname()}}) returns the first character of the genus and the full species, such as \code{"E. coli"}, for species and subspecies. Exceptions are abbreviations of staphylococci (such as \emph{"CoNS"}, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as \emph{"GBS"}, Group B Streptococci). Please bear in mind that e.g. \emph{E. coli} could mean \emph{Escherichia coli} (kingdom of Bacteria) as well as \emph{Entamoeba coli} (kingdom of Protozoa). Returning to the full name will be done using \code{\link[=as.mo]{as.mo()}} internally, giving priority to bacteria and human pathogens, i.e. \code{"E. coli"} will be considered \emph{Escherichia coli}. As a result, \code{mo_fullname(mo_shortname("Entamoeba coli"))} returns \code{"Escherichia coli"}. \code{\link[=mo_ref]{mo_ref()}} returns the abbreviated authority of the nomenclatural act that created the queried name combination. When \code{keep_synonyms = FALSE} (default), this is the authority of the currently accepted name. When \code{keep_synonyms = TRUE}, this is the authority under which the queried (possibly outdated) name was published. Emendations (changes to the species description without a name change) are not reflected; only the combination or original description authority is returned.
The short name (\code{\link[=mo_shortname]{mo_shortname()}}) returns the first character of the genus and the full species, such as \code{"E. coli"}, for species and subspecies. Exceptions are abbreviations of staphylococci (such as \emph{"CoNS"}, Coagulase-Negative Staphylococci) and beta-haemolytic streptococci (such as \emph{"GBS"}, Group B Streptococci). Please bear in mind that e.g. \emph{E. coli} could mean \emph{Escherichia coli} (kingdom of Bacteria) as well as \emph{Entamoeba coli} (kingdom of Protozoa). Returning to the full name will be done using \code{\link[=as.mo]{as.mo()}} internally, giving priority to bacteria and human pathogens, i.e. \code{"E. coli"} will always be considered \emph{Escherichia coli}. As a result, \code{mo_fullname(mo_shortname("Entamoeba coli"))} returns \code{"Escherichia coli"}.
Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions \code{\link[=mo_kingdom]{mo_kingdom()}} and \code{\link[=mo_domain]{mo_domain()}} return the exact same results. Since the top-level of the taxonomy is sometimes referred to as 'kingdom' and sometimes as 'domain', the functions \code{\link[=mo_kingdom]{mo_kingdom()}} and \code{\link[=mo_domain]{mo_domain()}} return the exact same results.
Determination of human pathogenicity (\code{\link[=mo_pathogenicity]{mo_pathogenicity()}}) is strongly based on Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}). This function returns a \link{factor} with the levels \emph{Pathogenic}, \emph{Potentially pathogenic}, \emph{Non-pathogenic}, and \emph{Unknown}. Determination of human pathogenicity (\code{\link[=mo_pathogenicity]{mo_pathogenicity()}}) is strongly based on Bartlett \emph{et al.} (2022, \doi{10.1099/mic.0.001269}). This function returns a \link{factor} with the levels \emph{Pathogenic}, \emph{Potentially pathogenic}, \emph{Non-pathogenic}, and \emph{Unknown}.
Determination of the Gram stain (\code{\link[=mo_gramstain]{mo_gramstain()}}) will be based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, \href{https://pubmed.ncbi.nlm.nih.gov/11837318/}{PMID 11837318}), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, \href{https://pubmed.ncbi.nlm.nih.gov/34694987/}{PMID 34694987}). Bacteria in these phyla are considered Gram-positive in this \code{AMR} package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value \code{NA}. Functions \code{\link[=mo_is_gram_negative]{mo_is_gram_negative()}} and \code{\link[=mo_is_gram_positive]{mo_is_gram_positive()}} always return \code{TRUE} or \code{FALSE} (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}), thus always return \code{FALSE} for species outside the taxonomic kingdom of Bacteria. Determination of the Gram stain (\code{\link[=mo_gramstain]{mo_gramstain()}} is based on the taxonomic kingdom and phylum. Originally, Cavalier-Smith defined the so-called subkingdoms Negibacteria and Posibacteria (2002, \href{https://pubmed.ncbi.nlm.nih.gov/11837318/}{PMID 11837318}), and only considered these phyla as Posibacteria: Actinobacteria, Chloroflexi, Firmicutes, and Tenericutes. These phyla were later renamed to Actinomycetota, Chloroflexota, Bacillota, and Mycoplasmatota (2021, \href{https://pubmed.ncbi.nlm.nih.gov/34694987/}{PMID 34694987}). Bacteria in these phyla are considered Gram-positive in this \code{AMR} package, except for members of the class Negativicutes (within phylum Bacillota) which are Gram-negative. All other bacteria are considered Gram-negative. Species outside the kingdom of Bacteria will return a value \code{NA}. Functions \code{\link[=mo_is_gram_negative]{mo_is_gram_negative()}} and \code{\link[=mo_is_gram_positive]{mo_is_gram_positive()}} always return \code{TRUE} or \code{FALSE} (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}), thus always return \code{FALSE} for species outside the taxonomic kingdom of Bacteria.
Determination of yeasts (\code{\link[=mo_is_yeast]{mo_is_yeast()}}) will be based on the taxonomic kingdom and class. \emph{Budding yeasts} are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. \emph{True yeasts} quite specifically refers to yeasts in the underlying order Saccharomycetales (such as \emph{Saccharomyces cerevisiae}). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return \code{TRUE}. It returns \code{FALSE} otherwise (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}). Determination of yeasts (\code{\link[=mo_is_yeast]{mo_is_yeast()}}) is based on the taxonomic kingdom and class. \emph{Budding yeasts} are yeasts that reproduce asexually through a process called budding, where a new cell develops from a small protrusion on the parent cell. Taxonomically, these are members of the phylum Ascomycota, class Saccharomycetes (also called Hemiascomycetes) or Pichiomycetes. \emph{True yeasts} quite specifically refers to yeasts in the underlying order Saccharomycetales (such as \emph{Saccharomyces cerevisiae}). Thus, for all microorganisms that are member of the taxonomic class Saccharomycetes or Pichiomycetes, the function will return \code{TRUE}. It returns \code{FALSE} otherwise (or \code{NA} when the input is \code{NA} or the MO code is \code{UNKNOWN}).
Determination of intrinsic resistance (\code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}}) will be based on the \link{intrinsic_resistant} data set, which is based on \href{https://www.eucast.org/bacteria/important-additional-information/expert-rules/}{'EUCAST Expected Resistant Phenotypes' v1.2} (2023). The \code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}} function can be vectorised over both argument \code{x} (input for microorganisms) and \code{ab} (input for antimicrobials). Determination of intrinsic resistance (\code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}}) is based on the \link{intrinsic_resistant} data set, which is based on \href{https://www.eucast.org/bacteria/important-additional-information/expert-rules/}{'EUCAST Expected Resistant Phenotypes' v1.2} (2023). The \code{\link[=mo_is_intrinsic_resistant]{mo_is_intrinsic_resistant()}} function can be vectorised over both argument \code{x} (input for microorganisms) and \code{ab} (input for antimicrobials).
Determination of bacterial oxygen tolerance (\code{\link[=mo_oxygen_tolerance]{mo_oxygen_tolerance()}}) will be based on BacDive, see \emph{Source}. The function \code{\link[=mo_is_anaerobic]{mo_is_anaerobic()}} only returns \code{TRUE} if the oxygen tolerance is \code{"anaerobe"}, indicting an obligate anaerobic species or genus. It always returns \code{FALSE} for species outside the taxonomic kingdom of Bacteria. Determination of both bacterial oxygen tolerance (\code{\link[=mo_oxygen_tolerance]{mo_oxygen_tolerance()}}) and morphology (\code{\link[=mo_morphology]{mo_morphology()}}) are based on BacDive, see \emph{Source}. The function \code{\link[=mo_is_anaerobic]{mo_is_anaerobic()}} only returns \code{TRUE} if the oxygen tolerance is \code{"anaerobe"}, indicating an obligate anaerobic species or genus. It always returns \code{FALSE} for species outside the taxonomic kingdom of Bacteria.
The function \code{\link[=mo_url]{mo_url()}} will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. \href{https://www.mycobank.org}{This MycoBank URL} will be used for fungi wherever available , \href{https://www.mycobank.org}{this LPSN URL} for bacteria wherever available, and \href{https://www.gbif.org}{this GBIF link} otherwise. The function \code{\link[=mo_url]{mo_url()}} will return the direct URL to the online database entry, which also shows the scientific reference of the concerned species. \href{https://www.mycobank.org}{This MycoBank URL} is used for fungi wherever available , \href{https://www.mycobank.org}{this LPSN URL} for bacteria wherever available, and \href{https://www.gbif.org}{this GBIF link} otherwise.
SNOMED codes (\code{\link[=mo_snomed]{mo_snomed()}}) was last updated on July 16th, 2024. See \emph{Source} and the \link{microorganisms} data set for more info. SNOMED codes (\code{\link[=mo_snomed]{mo_snomed()}}) was last updated on July 16th, 2024. See \emph{Source} and the \link{microorganisms} data set for more info.
@@ -260,8 +269,10 @@ mo_shortname("Klebsiella pneumoniae")
# other properties --------------------------------------------------------- # other properties ---------------------------------------------------------
mo_pathogenicity("Klebsiella pneumoniae") mo_morphology("Klebsiella pneumoniae")
mo_gramstain("Klebsiella pneumoniae") mo_gramstain("Klebsiella pneumoniae")
mo_gramstain("Klebsiella pneumoniae", add_morphology = TRUE)
mo_pathogenicity("Klebsiella pneumoniae")
mo_snomed("Klebsiella pneumoniae") mo_snomed("Klebsiella pneumoniae")
mo_type("Klebsiella pneumoniae") mo_type("Klebsiella pneumoniae")
mo_rank("Klebsiella pneumoniae") mo_rank("Klebsiella pneumoniae")

View File

@@ -30,6 +30,15 @@
test_that("test-_deprecated.R", { test_that("test-_deprecated.R", {
skip_on_cran() skip_on_cran()
expect_warning(example_isolates[, ab_class("mycobact")]) if (getRversion() > "4.0.0") {
expect_warning(example_isolates[, ab_selector(name %like% "trim")]) expect_warning(example_isolates[, ab_class("mycobact")])
expect_warning(example_isolates[, ab_selector(name %like% "trim")])
# deprecated custom_interpretive_rules() still works and emits a warning
expect_warning(
x_old <- custom_eucast_rules(AMC == "R" ~ aminopenicillins == "R"),
regexp = "custom_eucast_rules"
)
expect_inherits(x_old, "custom_interpretive_rules")
}
}) })

View File

@@ -130,6 +130,77 @@ test_that("test-antibiogram.R", {
expect_equal(colnames(ab9), c("ward", "gender", "Piperacillin/tazobactam", "Piperacillin/tazobactam + Gentamicin", "Piperacillin/tazobactam + Tobramycin")) expect_equal(colnames(ab9), c("ward", "gender", "Piperacillin/tazobactam", "Piperacillin/tazobactam + Gentamicin", "Piperacillin/tazobactam + Tobramycin"))
} }
# Parallel computing ----------------------------------------------------
# Tests must pass even when only 1 core is available; parallel = TRUE then
# silently falls back to sequential, but results must still be correct.
if (AMR:::pkg_is_available("future.apply")) {
set.seed(42)
# sequential reference for WISCA
wisca_seq <- suppressWarnings(suppressMessages(
wisca(example_isolates, antimicrobials = c("TZP", "TZP+TOB", "TZP+GEN"), simulations = 100, info = FALSE)
))
future::plan(future::multicore)
# 1. parallel = TRUE produces the same antibiogram structure as sequential
wisca_par <- suppressWarnings(suppressMessages(
wisca(example_isolates, antimicrobials = c("TZP", "TZP+TOB", "TZP+GEN"), simulations = 100, parallel = TRUE, info = FALSE)
))
expect_inherits(wisca_par, "antibiogram")
expect_equal(colnames(wisca_par), colnames(wisca_seq))
expect_true(isTRUE(attributes(wisca_par)$wisca))
# 2. coverage values are non-NA and fall within [0, 1]
ln <- attributes(wisca_par)$long_numeric
expect_false(anyNA(ln$coverage))
expect_false(anyNA(ln$lower_ci))
expect_false(anyNA(ln$upper_ci))
expect_true(all(ln$coverage >= 0 & ln$coverage <= 1))
expect_true(all(ln$lower_ci <= ln$coverage))
expect_true(all(ln$upper_ci >= ln$coverage))
# 3. a second parallel run gives the same column names
wisca_par2 <- suppressWarnings(suppressMessages(
wisca(example_isolates, antimicrobials = c("TZP", "TZP+TOB", "TZP+GEN"), simulations = 100, parallel = TRUE, info = FALSE)
))
expect_equal(colnames(wisca_par), colnames(wisca_par2))
# 4. parallel with workers = 1 gives same structure as sequential
future::plan(future::multicore, workers = 1)
wisca_par1 <- suppressWarnings(suppressMessages(
wisca(example_isolates, antimicrobials = c("TZP", "TZP+TOB", "TZP+GEN"), simulations = 100, parallel = TRUE, info = FALSE)
))
expect_equal(colnames(wisca_seq), colnames(wisca_par1))
# 5. grouped antibiogram in parallel yields identical structure to sequential
if (AMR:::pkg_is_available("dplyr", min_version = "1.0.0", also_load = TRUE)) {
future::plan(future::sequential)
ab_grp_seq <- suppressWarnings(suppressMessages(
example_isolates %>%
group_by(ward) %>%
wisca(antimicrobials = c("TZP", "TZP+TOB"), simulations = 50, info = FALSE)
))
future::plan(future::multicore)
ab_grp_par <- suppressWarnings(suppressMessages(
example_isolates %>%
group_by(ward) %>%
wisca(antimicrobials = c("TZP", "TZP+TOB"), simulations = 50, parallel = TRUE, info = FALSE)
))
expect_equal(colnames(ab_grp_seq), colnames(ab_grp_par))
expect_equal(nrow(ab_grp_seq), nrow(ab_grp_par))
}
# 6. parallel = TRUE without a plan raises an informative error
future::plan(future::sequential)
expect_error(
suppressWarnings(wisca(example_isolates, antimicrobials = "TZP", parallel = TRUE, info = FALSE)),
"non-sequential"
)
future::plan(future::sequential)
}
# Generate plots with ggplot2 or base R -------------------------------- # Generate plots with ggplot2 or base R --------------------------------

View File

@@ -53,12 +53,12 @@ test_that("test-data.R", {
expect_false(anyNA(microorganisms.codes$mo)) expect_false(anyNA(microorganisms.codes$mo))
expect_true(all(dosage$ab %in% AMR::antimicrobials$ab)) expect_true(all(dosage$ab %in% AMR::antimicrobials$ab))
expect_true(all(dosage$name %in% AMR::antimicrobials$name)) expect_true(all(dosage$name %in% AMR::antimicrobials$name))
eucast_abx <- AMR:::EUCAST_RULES_DF$and_these_antibiotics interpretive_abx <- AMR:::INTERPRETIVE_RULES_DF$and_these_antibiotics
eucast_abx <- unique(unlist(strsplit(eucast_abx[!is.na(eucast_abx)], ", +"))) interpretive_abx <- unique(unlist(strsplit(interpretive_abx[!is.na(interpretive_abx)], ", +")))
expect_true(all(eucast_abx %in% AMR::antimicrobials$ab), expect_true(all(interpretive_abx %in% AMR::antimicrobials$ab),
info = paste0( info = paste0(
"Missing in `antimicrobials` data set: ", "Missing in `antimicrobials` data set: ",
toString(eucast_abx[which(!eucast_abx %in% AMR::antimicrobials$ab)]) toString(interpretive_abx[which(!interpretive_abx %in% AMR::antimicrobials$ab)])
) )
) )

View File

@@ -27,13 +27,14 @@
# how to conduct AMR data analysis: https://amr-for-r.org # # how to conduct AMR data analysis: https://amr-for-r.org #
# ==================================================================== # # ==================================================================== #
test_that("test-eucast_rules.R", { test_that("test-interpretive_rules.R", {
skip_on_cran() skip_on_cran()
# thoroughly check input table # thoroughly check input table
expect_equal( expect_equal(
sort(colnames(AMR:::EUCAST_RULES_DF)), sort(colnames(AMR:::INTERPRETIVE_RULES_DF)),
sort(c( sort(c(
"rule.provider",
"if_mo_property", "like.is.one_of", "this_value", "if_mo_property", "like.is.one_of", "this_value",
"and_these_antibiotics", "have_these_values", "and_these_antibiotics", "have_these_values",
"then_change_these_antibiotics", "to_value", "then_change_these_antibiotics", "to_value",
@@ -42,7 +43,7 @@ test_that("test-eucast_rules.R", {
"note" "note"
)) ))
) )
MOs_mentioned <- unique(AMR:::EUCAST_RULES_DF$this_value) MOs_mentioned <- unique(AMR:::INTERPRETIVE_RULES_DF$this_value)
MOs_mentioned <- sort(trimws(unlist(strsplit(MOs_mentioned[!AMR:::is_valid_regex(MOs_mentioned)], ",", fixed = TRUE)))) MOs_mentioned <- sort(trimws(unlist(strsplit(MOs_mentioned[!AMR:::is_valid_regex(MOs_mentioned)], ",", fixed = TRUE))))
MOs_test <- suppressWarnings( MOs_test <- suppressWarnings(
trimws(paste( trimws(paste(
@@ -54,19 +55,19 @@ test_that("test-eucast_rules.R", {
MOs_test[MOs_test == ""] <- mo_fullname(MOs_mentioned[MOs_test == ""], keep_synonyms = TRUE, language = NULL) MOs_test[MOs_test == ""] <- mo_fullname(MOs_mentioned[MOs_test == ""], keep_synonyms = TRUE, language = NULL)
expect_equal(MOs_mentioned, MOs_test) expect_equal(MOs_mentioned, MOs_test)
expect_error(suppressWarnings(eucast_rules(example_isolates, col_mo = "Non-existing"))) expect_error(suppressWarnings(interpretive_rules(example_isolates, col_mo = "Non-existing")))
expect_error(eucast_rules(x = "text")) expect_error(interpretive_rules(x = "text"))
expect_error(eucast_rules(data.frame(a = "test"))) expect_error(interpretive_rules(data.frame(a = "test")))
expect_error(eucast_rules(data.frame(mo = "test"), rules = "invalid rules set")) expect_error(interpretive_rules(data.frame(mo = "test"), rules = "invalid rules set"))
# expect_warning(eucast_rules(data.frame(mo = "Escherichia coli", vancomycin = "S", stringsAsFactors = TRUE))) # expect_warning(interpretive_rules(data.frame(mo = "Escherichia coli", vancomycin = "S", stringsAsFactors = TRUE)))
expect_identical( expect_identical(
colnames(example_isolates), colnames(example_isolates),
colnames(suppressWarnings(eucast_rules(example_isolates, info = FALSE))) colnames(suppressWarnings(interpretive_rules(example_isolates, info = FALSE)))
) )
expect_output(suppressMessages(eucast_rules(example_isolates, info = TRUE))) expect_output(suppressMessages(interpretive_rules(example_isolates, info = TRUE)))
a <- data.frame( a <- data.frame(
mo = c( mo = c(
@@ -86,8 +87,8 @@ test_that("test-eucast_rules.R", {
amox = "R", # Amoxicillin amox = "R", # Amoxicillin
stringsAsFactors = FALSE stringsAsFactors = FALSE
) )
expect_identical(suppressWarnings(eucast_rules(a, "mo", info = FALSE)), b) expect_identical(suppressWarnings(interpretive_rules(a, "mo", info = FALSE)), b)
expect_output(suppressMessages(suppressWarnings(eucast_rules(a, "mo", info = TRUE)))) expect_output(suppressMessages(suppressWarnings(interpretive_rules(a, "mo", info = TRUE))))
a <- data.frame( a <- data.frame(
mo = c( mo = c(
@@ -105,7 +106,7 @@ test_that("test-eucast_rules.R", {
COL = "R", # Colistin COL = "R", # Colistin
stringsAsFactors = FALSE stringsAsFactors = FALSE
) )
expect_equal(suppressWarnings(eucast_rules(a, "mo", info = FALSE)), b) expect_equal(suppressWarnings(interpretive_rules(a, "mo", info = FALSE)), b)
# piperacillin must be R in Enterobacteriaceae when tica is R # piperacillin must be R in Enterobacteriaceae when tica is R
if (AMR:::pkg_is_available("dplyr", min_version = "1.0.0", also_load = TRUE)) { if (AMR:::pkg_is_available("dplyr", min_version = "1.0.0", also_load = TRUE)) {
@@ -117,7 +118,7 @@ test_that("test-eucast_rules.R", {
TIC = as.sir("R"), TIC = as.sir("R"),
PIP = as.sir("S") PIP = as.sir("S")
) %>% ) %>%
eucast_rules(col_mo = "mo", version_expertrules = 3.1, rules = "expert", info = FALSE, overwrite = TRUE) %>% interpretive_rules(col_mo = "mo", version_expertrules = 3.1, rules = "expert", info = FALSE, overwrite = TRUE) %>%
pull(PIP) %>% pull(PIP) %>%
unique() %>% unique() %>%
as.character() as.character()
@@ -127,7 +128,7 @@ test_that("test-eucast_rules.R", {
} }
# azithromycin and clarythromycin must be equal to Erythromycin # azithromycin and clarythromycin must be equal to Erythromycin
a <- suppressWarnings(as.sir(eucast_rules( a <- suppressWarnings(as.sir(interpretive_rules(
data.frame( data.frame(
mo = example_isolates$mo, mo = example_isolates$mo,
ERY = example_isolates$ERY, ERY = example_isolates$ERY,
@@ -149,7 +150,7 @@ test_that("test-eucast_rules.R", {
# amox is inferred by benzylpenicillin in Kingella kingae # amox is inferred by benzylpenicillin in Kingella kingae
expect_equal( expect_equal(
suppressWarnings( suppressWarnings(
as.list(eucast_rules( as.list(interpretive_rules(
data.frame( data.frame(
mo = as.mo("Kingella kingae"), mo = as.mo("Kingella kingae"),
PEN = "S", PEN = "S",
@@ -164,16 +165,16 @@ test_that("test-eucast_rules.R", {
# also test norf # also test norf
if (AMR:::pkg_is_available("dplyr", min_version = "1.0.0", also_load = TRUE)) { if (AMR:::pkg_is_available("dplyr", min_version = "1.0.0", also_load = TRUE)) {
expect_output(suppressWarnings(eucast_rules(example_isolates %>% mutate(NOR = "S", NAL = "S"), info = TRUE))) expect_output(suppressWarnings(interpretive_rules(example_isolates %>% mutate(NOR = "S", NAL = "S"), info = TRUE)))
} }
# check verbose output # check verbose output
expect_output(suppressWarnings(eucast_rules(example_isolates, verbose = TRUE, rules = "all", info = TRUE))) expect_output(suppressWarnings(interpretive_rules(example_isolates, verbose = TRUE, rules = "all", info = TRUE)))
# AmpC de-repressed cephalo mutants # AmpC de-repressed cephalo mutants
expect_identical( expect_identical(
eucast_rules( interpretive_rules(
data.frame( data.frame(
mo = c("Escherichia coli", "Enterobacter cloacae"), mo = c("Escherichia coli", "Enterobacter cloacae"),
cefotax = as.sir(c("S", "S")) cefotax = as.sir(c("S", "S"))
@@ -187,7 +188,7 @@ test_that("test-eucast_rules.R", {
) )
expect_identical( expect_identical(
eucast_rules( interpretive_rules(
data.frame( data.frame(
mo = c("Escherichia coli", "Enterobacter cloacae"), mo = c("Escherichia coli", "Enterobacter cloacae"),
cefotax = as.sir(c("S", "S")) cefotax = as.sir(c("S", "S"))
@@ -201,7 +202,7 @@ test_that("test-eucast_rules.R", {
) )
expect_identical( expect_identical(
eucast_rules( interpretive_rules(
data.frame( data.frame(
mo = c("Escherichia coli", "Enterobacter cloacae"), mo = c("Escherichia coli", "Enterobacter cloacae"),
cefotax = as.sir(c("S", "S")) cefotax = as.sir(c("S", "S"))
@@ -219,7 +220,7 @@ test_that("test-eucast_rules.R", {
expect_inherits(eucast_dosage(c("tobra", "genta", "cipro")), "data.frame") expect_inherits(eucast_dosage(c("tobra", "genta", "cipro")), "data.frame")
x <- custom_eucast_rules( x <- custom_interpretive_rules(
AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R", AMC == "R" & genus == "Klebsiella" ~ aminopenicillins == "R",
AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I", AMC == "I" & genus == "Klebsiella" ~ aminopenicillins == "I",
AMX == "S" ~ AMC == "S" AMX == "S" ~ AMC == "S"
@@ -230,7 +231,7 @@ test_that("test-eucast_rules.R", {
# this custom rules makes 8 changes # this custom rules makes 8 changes
expect_equal( expect_equal(
nrow(eucast_rules(example_isolates, nrow(interpretive_rules(example_isolates,
rules = "custom", rules = "custom",
custom_rules = x, custom_rules = x,
info = FALSE, info = FALSE,
@@ -240,4 +241,10 @@ test_that("test-eucast_rules.R", {
8, 8,
tolerance = 0.5 tolerance = 0.5
) )
# clsi_rules() no longer errors (returns data unchanged until CLSI rows are added)
expect_identical(
suppressWarnings(clsi_rules(example_isolates, info = FALSE)),
example_isolates
)
}) })

View File

@@ -84,6 +84,16 @@ test_that("test-mo.R", {
# expect_warning(as.mo("Acinetobacter calcoaceticus/baumannii complex")) # expect_warning(as.mo("Acinetobacter calcoaceticus/baumannii complex"))
# Issue #287: "X complex" fallback to "X" when complex is not a distinct taxon
expect_identical(as.character(suppressWarnings(as.mo("Proteus vulgaris complex"))), as.character(suppressWarnings(as.mo("Proteus vulgaris"))))
expect_identical(as.character(suppressWarnings(as.mo("Enterobacter cloacae complex"))), as.character(as.mo("Enterobacter cloacae complex")))
# Issue #288: abbreviated genus with exact species epithet match should win
expect_identical(
as.character(suppressWarnings(as.mo("S. apiospermum"))),
as.character(suppressWarnings(as.mo("Scedosporium apiospermum")))
)
# prevalent MO # prevalent MO
expect_identical( expect_identical(
suppressWarnings(as.character( suppressWarnings(as.character(

View File

@@ -406,40 +406,199 @@ test_that("test-sir.R", {
expect_equal(out3, as.sir(c("NWT", "WT", "NWT"))) expect_equal(out3, as.sir(c("NWT", "WT", "NWT")))
expect_equal(out4, as.sir(c("NWT", "WT", "NWT"))) expect_equal(out4, as.sir(c("NWT", "WT", "NWT")))
# Issue #278: re-running as.sir() on already-<sir> data must preserve columns
df_already_sir <- data.frame(
mo = "B_ESCHR_COLI",
AMC = as.mic(c("1", "2", "4")),
GEN = sample(c("S", "I", "R"), 3, replace = TRUE),
stringsAsFactors = FALSE
)
first_pass <- suppressMessages(as.sir(df_already_sir, col_mo = "mo", info = FALSE))
second_pass <- suppressMessages(as.sir(first_pass, col_mo = "mo", info = FALSE))
expect_equal(ncol(first_pass), ncol(second_pass))
expect_true(is.sir(second_pass[["AMC"]]))
expect_true(is.sir(second_pass[["GEN"]]))
expect_identical(first_pass[["AMC"]], second_pass[["AMC"]])
expect_identical(first_pass[["GEN"]], second_pass[["GEN"]])
# Issue #278: metadata columns whose names coincidentally match antibiotic
# codes (e.g. 'patient' -> OXY, 'ward' -> PRU) must not be processed
df_meta <- data.frame(
mo = "B_ESCHR_COLI",
patient = paste0("Pt_", 1:20),
ward = rep(c("ICU", "Surgery", "Outpatient", "ED"), 5),
AMC = as.mic(rep(c("1", "2", "4", "8"), 5)),
stringsAsFactors = FALSE
)
df_meta_sir <- suppressMessages(as.sir(df_meta, col_mo = "mo", info = FALSE))
expect_true("patient" %in% colnames(df_meta_sir))
expect_true("ward" %in% colnames(df_meta_sir))
expect_false(is.sir(df_meta_sir[["patient"]]))
expect_false(is.sir(df_meta_sir[["ward"]]))
expect_true(is.sir(df_meta_sir[["AMC"]]))
# Parallel computing ---------------------------------------------------- # Parallel computing ----------------------------------------------------
# Tests must pass even when only 1 core is available; parallel = TRUE then
# silently falls back to sequential, but results must still be identical.
# MB 29 Apr 2025: I have run the code of AVC, PEI, Canada (dataset of 2854x65), and compared it like this: if (AMR:::pkg_is_available("future.apply")) {
set.seed(42)
n_par <- 200
df_par <- data.frame(
mo = "B_ESCHR_COLI",
AMC = as.mic(sample(c("0.25", "0.5", "1", "2", "4", "8", "16", "32"), n_par, TRUE)),
GEN = as.mic(sample(c("0.5", "1", "2", "4", "8", "16", "32", "64"), n_par, TRUE)),
CIP = as.mic(sample(c("0.001", "0.002", "0.004", "0.008", "0.016", "0.032"), n_par, TRUE)),
PEN = sample(c("S", "I", "R", NA_character_), n_par, TRUE),
stringsAsFactors = FALSE
)
# system.time({ # clear any existing history before comparing
# data_2022_2023_SIR_parallel <- data_2022_2023_clean |> sir_interpretation_history(clean = TRUE)
# as.sir(amikacin:tiamulin, sir_seq <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE))
# col_mo = "mo", log_seq <- sir_interpretation_history(clean = TRUE)
# guideline = "CLSI 2024",
# host = "Species",
# uti = "isUTI",
# parallel = TRUE)
# })
# # user system elapsed
# # 271.424 2.767 45.762
#
# history_parallel <- sir_interpretation_history(clean = TRUE)
#
# system.time({
# data_2022_2023_SIR <- data_2022_2023_clean |>
# as.sir(amikacin:tiamulin,
# col_mo = "mo",
# guideline = "CLSI 2024",
# host = "Species",
# uti = "isUTI")
# })
# # user system elapsed
# # 120.637 5.406 128.835
# history <- sir_interpretation_history()
future::plan(future::multicore)
n_max_workers <- future::nbrOfWorkers()
# and then got this: sir_par <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
# identical(history[, -1], history_parallel[, -1]) log_par <- sir_interpretation_history(clean = TRUE)
#> [1] TRUE
# so parallel on Apple M2 is 2.8x faster, with identical history -> GREAT! # 1. parallel = TRUE gives identical SIR results to sequential
expect_identical(sir_seq[["AMC"]], sir_par[["AMC"]])
expect_identical(sir_seq[["GEN"]], sir_par[["GEN"]])
expect_identical(sir_seq[["CIP"]], sir_par[["CIP"]])
expect_identical(sir_seq[["PEN"]], sir_par[["PEN"]])
# 2. same number of log rows as sequential
expect_equal(nrow(log_seq), nrow(log_par))
# 3. pre-existing log entries must not be duplicated
# run sequential once to populate the history, then run parallel and
# verify the new parallel run adds exactly as many rows as sequential
sir_interpretation_history(clean = TRUE)
future::plan(future::sequential)
suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE)) # populate history
pre_n <- nrow(sir_interpretation_history())
future::plan(future::multicore)
suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
post_n <- nrow(sir_interpretation_history())
expect_equal(post_n - pre_n, nrow(log_seq)) # exactly one run's worth of new rows
sir_interpretation_history(clean = TRUE)
# 4. two sequential runs and two parallel runs yield identical results
sir_par2 <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
expect_identical(sir_par[["AMC"]], sir_par2[["AMC"]])
expect_identical(sir_par[["GEN"]], sir_par2[["GEN"]])
# 5. used cores = 1 gives same results as default sequential
future::plan(future::multicore, workers = 1)
sir_mc1 <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
expect_identical(sir_seq[["AMC"]], sir_mc1[["AMC"]])
expect_identical(sir_seq[["GEN"]], sir_mc1[["GEN"]])
# 6. used cores = 2 and used cores = 3 give same results as sequential
if (n_max_workers >= 3) {
future::plan(future::multicore, workers = 2)
sir_mc2 <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
future::plan(future::multicore, workers = 3)
sir_mc3 <- suppressMessages(as.sir(df_par, col_mo = "mo", info = FALSE, parallel = TRUE))
expect_identical(sir_seq[["AMC"]], sir_mc2[["AMC"]])
expect_identical(sir_seq[["GEN"]], sir_mc3[["GEN"]])
}
# 7. single-column data frame falls back silently to sequential
df_single <- df_par[, c("mo", "AMC")]
future::plan(future::sequential)
sir_single_seq <- suppressMessages(as.sir(df_single, col_mo = "mo", info = FALSE))
future::plan(future::multicore)
sir_single_par <- suppressMessages(as.sir(df_single, col_mo = "mo", info = FALSE, parallel = TRUE))
expect_identical(sir_single_seq[["AMC"]], sir_single_par[["AMC"]])
# 8. row-batch mode (n_cols < n_cores): force row splitting via used cores and
# verify identical output to sequential for a dataset with 2 AB columns so
# pieces_per_col = ceiling(used cores / 2) >= 2 and row batching activates
df_wide <- data.frame(
mo = "B_ESCHR_COLI",
AMC = as.mic(sample(c("1", "2", "4", "8"), n_par, TRUE)),
GEN = as.mic(sample(c("1", "2", "4", "8"), n_par, TRUE)),
stringsAsFactors = FALSE
)
future::plan(future::sequential)
sir_wide_seq <- suppressMessages(as.sir(df_wide, col_mo = "mo", info = FALSE))
future::plan(future::multicore)
sir_wide_par <- suppressMessages(as.sir(df_wide,
col_mo = "mo", info = FALSE,
parallel = TRUE
))
expect_identical(sir_wide_seq[["AMC"]], sir_wide_par[["AMC"]])
expect_identical(sir_wide_seq[["GEN"]], sir_wide_par[["GEN"]])
# 8. info = TRUE with parallel does not produce per-column worker messages
# (messages should only appear in the main process, not duplicated from workers)
msgs <- capture.output(
suppressWarnings(as.sir(df_par, col_mo = "mo", info = TRUE, parallel = TRUE)),
type = "message"
)
# each AB column name should appear at most once in all messages combined
for (ab_nm in c("AMC", "GEN", "CIP", "PEN")) {
n_mentions <- sum(grepl(ab_nm, msgs, fixed = TRUE))
expect_lte(n_mentions, 1L)
}
future::plan(future::sequential)
}
})
# issue #239 — custom reference_data support
test_that("custom reference_data: non-EUCAST/CLSI guideline produces R", {
# Build a minimal one-row custom breakpoint table from a plain data.frame.
# coerce_reference_data_columns() will coerce mo/ab to the right class.
my_bp <- clinical_breakpoints[clinical_breakpoints$method == "MIC" &
clinical_breakpoints$type == "human", ][1, ]
my_bp$guideline <- "MyLab 2025"
my_bp$mo <- "B_ACHRMB_XYLS" # plain character — coerced to <mo>
my_bp$ab <- "MEM" # plain character — coerced to <ab>
my_bp$breakpoint_S <- 8
my_bp$breakpoint_R <- 32
# guideline omitted: all rows in reference_data are used; R via open interval (>)
expect_equal(as.character(suppressMessages(
as.sir(as.mic(64), mo = "B_ACHRMB_XYLS", ab = "MEM", reference_data = my_bp)
)), "R")
expect_equal(as.character(suppressMessages(
as.sir(as.mic(16), mo = "B_ACHRMB_XYLS", ab = "MEM", reference_data = my_bp)
)), "I")
# at R breakpoint value must be I (open interval: > not >=)
expect_equal(as.character(suppressMessages(
as.sir(as.mic(32), mo = "B_ACHRMB_XYLS", ab = "MEM", reference_data = my_bp)
)), "I")
# guideline explicitly set: same result when it matches the data
expect_equal(as.character(suppressMessages(
as.sir(as.mic(64),
mo = "B_ACHRMB_XYLS", ab = "MEM",
guideline = "MyLab 2025", reference_data = my_bp
)
)), "R")
})
test_that("custom reference_data: host = NA acts as host-agnostic fallback", {
my_bp <- clinical_breakpoints[clinical_breakpoints$method == "MIC" &
clinical_breakpoints$type == "human", ][1, ]
my_bp$guideline <- "MyLab 2025"
my_bp$mo <- "B_ACHRMB_XYLS"
my_bp$ab <- "MEM"
my_bp$type <- "animal"
my_bp$host <- NA # logical NA — coerced to character by coerce_reference_data_columns()
my_bp$breakpoint_S <- 8
my_bp$breakpoint_R <- 32
# NA host should match when no species-specific row exists
result <- suppressMessages(
as.sir(as.mic(64),
mo = "B_ACHRMB_XYLS", ab = "MEM",
host = "dogs", breakpoint_type = "animal", reference_data = my_bp
)
)
expect_equal(as.character(result), "R")
}) })

View File

@@ -89,6 +89,11 @@ test_that("test-zzz.R", {
"symbol" = "cli", "symbol" = "cli",
# curl # curl
"has_internet" = "curl", "has_internet" = "curl",
# future
"plan" = "future",
"nbrOfWorkers" = "future",
# future.apply
"future_lapply" = "future.apply",
# ggplot2 # ggplot2
"aes" = "ggplot2", "aes" = "ggplot2",
"arrow" = "ggplot2", "arrow" = "ggplot2",
@@ -127,8 +132,6 @@ test_that("test-zzz.R", {
"kable" = "knitr", "kable" = "knitr",
"knit_print" = "knitr", "knit_print" = "knitr",
"opts_chunk" = "knitr", "opts_chunk" = "knitr",
# parallelly
"availableCores" = "parallelly",
# pillar # pillar
"pillar_shaft" = "pillar", "pillar_shaft" = "pillar",
"style_na" = "pillar", "style_na" = "pillar",

115
tools/benchmark_parallel.R Normal file
View File

@@ -0,0 +1,115 @@
# Benchmark: sequential vs parallel as.sir() across data-set shapes
#
# Run from the repo root:
# Rscript tools/benchmark_parallel.R
# or inside an R session:
# source("tools/benchmark_parallel.R")
#
# Two panels:
# Left fixed columns (n_ab_fixed), varying rows.
# Parallel wins at small n; sequential catches up at large n due to
# memory-bandwidth saturation (all workers compete for the same
# clinical_breakpoints lookup table in L3 cache / RAM).
# Right fixed rows (n_rows_fixed), varying column count.
# This is the shape that actually benefits: each additional column
# keeps another core busy. The "real world" gain for a 2854×65
# dataset lives here.
#
# Requires ggplot2; uses devtools::load_all() so the package need not be
# installed.
devtools::load_all(".", quiet = TRUE)
# ── configuration ─────────────────────────────────────────────────────────────
row_sizes <- c(200, 1000, 5000, 20000)
col_sizes <- c(4, 8, 16, 32, 48)
n_rows_fixed <- 1000
n_ab_fixed <- 16
n_cores_avail <- AMR:::get_n_cores(Inf)
all_abs <- c("AMC", "GEN", "CIP", "TZP", "IPM", "MEM",
"AMP", "TMP", "SXT", "NIT", "FOX", "CRO",
"FEP", "CAZ", "CTX", "TOB", "AMK", "ERY",
"AZM", "CLI", "VAN", "TEC", "RIF", "MTR",
"MFX", "LNZ", "TGC", "DOX", "FLC", "OXA",
"PEN", "CXM", "CZO", "KAN", "COL", "FOS",
"MUP", "TCY", "TEC", "IPM", "CHL", "FEP",
"MEM", "TZP", "GEN", "AMC", "AMX", "AMP")
all_abs <- unique(all_abs)
mic_vals <- c("0.25", "0.5", "1", "2", "4", "8", "16", "32")
make_df <- function(n_rows, n_ab) {
set.seed(42)
ab_sel <- all_abs[seq_len(min(n_ab, length(all_abs)))]
mics <- lapply(ab_sel, function(a) as.mic(sample(mic_vals, n_rows, TRUE)))
names(mics) <- ab_sel
data.frame(mo = "B_ESCHR_COLI", mics, stringsAsFactors = FALSE)
}
time_both <- function(n_rows, n_ab, label) {
df <- make_df(n_rows, n_ab)
t_seq <- system.time(
suppressMessages(as.sir(df, col_mo = "mo", info = FALSE, parallel = FALSE))
)[["elapsed"]]
t_par <- system.time(
suppressMessages(as.sir(df, col_mo = "mo", info = FALSE, parallel = TRUE))
)[["elapsed"]]
message(sprintf("%-28s seq=%5.2fs par=%5.2fs speedup=%.1fx",
label, t_seq, t_par, t_seq / t_par))
data.frame(group = label, mode = c("sequential", "parallel"),
seconds = c(t_seq, t_par), stringsAsFactors = FALSE)
}
# ── warm-up (avoid first-call overhead biasing results) ───────────────────────
message("Warming up cache ...")
invisible(suppressMessages(as.sir(make_df(100, 6), col_mo = "mo", info = FALSE)))
invisible(suppressMessages(as.sir(make_df(100, 6), col_mo = "mo", info = FALSE, parallel = TRUE)))
sir_interpretation_history(clean = TRUE)
# ── panel 1: vary rows, fixed columns ─────────────────────────────────────────
message(sprintf("\nPanel 1 varying rows, %d fixed columns:", n_ab_fixed))
res_rows <- do.call(rbind, lapply(row_sizes, function(n) {
time_both(n, n_ab_fixed, sprintf("rows=%d", n))
}))
res_rows$x <- rep(row_sizes, each = 2)
res_rows$panel <- "Vary rows (16 fixed AB columns)"
# ── panel 2: vary columns, fixed rows ─────────────────────────────────────────
message(sprintf("\nPanel 2 varying columns, %d fixed rows:", n_rows_fixed))
res_cols <- do.call(rbind, lapply(col_sizes, function(n_ab) {
time_both(n_rows_fixed, n_ab, sprintf("cols=%d", n_ab))
}))
res_cols$x <- rep(col_sizes, each = 2)
res_cols$panel <- sprintf("Vary columns (%d fixed rows)", n_rows_fixed)
results <- rbind(res_rows, res_cols)
if (requireNamespace("ggplot2", quietly = TRUE)) {
p <- ggplot2::ggplot(
results,
ggplot2::aes(x = x, y = seconds, colour = mode, group = mode)
) +
ggplot2::geom_line(linewidth = 1) +
ggplot2::geom_point(size = 2.5) +
ggplot2::facet_wrap(~panel, scales = "free_x") +
ggplot2::scale_colour_manual(
values = c(sequential = "#E05C5C", parallel = "#2E86AB")
) +
ggplot2::labs(
title = "as.sir() throughput: sequential vs parallel",
subtitle = sprintf("E. coli, EUCAST 2026, %d cores available", n_cores_avail),
x = "Dataset dimension (rows ·left· or columns ·right·)",
y = "Wall-clock time (seconds)",
colour = NULL
) +
ggplot2::theme_minimal(base_size = 12) +
ggplot2::theme(legend.position = "top")
out_file <- "tools/benchmark_parallel.png"
ggplot2::ggsave(out_file, p, width = 10, height = 5, dpi = 150)
message("\nPlot saved to ", out_file)
} else {
message("Install ggplot2 to get a plot; raw results:")
print(results[, c("panel", "group", "mode", "seconds")])
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 79 KiB