mirror of https://github.com/msberends/AMR.git synced 2026-05-31 12:21:40 +02:00
AMR/R
Claude 060449e234 Optimise parallel as.sir(): row-batch mode when n_cols < n_cores
Previously parallel dispatch only parallelised by column, so a 6-column
dataset on a 16-core machine used at most 6 cores with the other 10 idle.
For large n this also caused memory-bandwidth saturation (each worker did
a full n-row scan of clinical_breakpoints simultaneously).

New row-batch mode (fork path, R >= 4.0, non-Windows):
  pieces_per_col = ceil(n_cores / n_cols)
  Jobs = n_cols × pieces_per_col  (≈ n_cores jobs total)
  Each job: one column × one row slice
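
The job layout above can be sketched in a few lines of R. This is a minimal illustration only: `plan_row_batches()` and its arguments are hypothetical names, not functions from the package, and the contiguous slicing rule is an assumption.

```r
# Hypothetical helper: plan row-batch jobs (one column x one row slice each).
plan_row_batches <- function(n_rows, n_cols, n_cores) {
  pieces_per_col <- ceiling(n_cores / n_cols)
  # Never create more slices than there are rows (assumed safeguard).
  pieces_per_col <- max(1, min(pieces_per_col, n_rows))
  # Assign every row to one of `pieces_per_col` contiguous slices.
  slice_id <- ceiling(seq_len(n_rows) * pieces_per_col / n_rows)
  row_slices <- unname(split(seq_len(n_rows), slice_id))
  jobs <- list()
  for (col in seq_len(n_cols)) {
    for (rows in row_slices) {
      jobs[[length(jobs) + 1L]] <- list(col = col, rows = rows)
    }
  }
  jobs
}

# 6 columns on 16 cores: ceiling(16 / 6) = 3 slices per column.
jobs <- plan_row_batches(n_rows = 1000, n_cols = 6, n_cores = 16)
length(jobs)  # 18
```

With 6 columns and 16 cores this yields 18 jobs, slightly more than the core count, which is what the "≈ n_cores" above refers to.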

Benefits:
  - All cores stay busy regardless of column count
  - Per-worker memory footprint shrinks by a factor of pieces_per_col
  - Breakpoints lookup cache pressure reduced per worker

PSOCK path (Windows / R < 4.0) is unchanged: per-job serialisation
overhead makes row batching unprofitable there.
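
The choice between the three dispatch modes can be written as a small decision function. This is a sketch under the conditions stated in this message; `choose_dispatch()` and the returned labels are hypothetical and do not appear in the package.

```r
# Hypothetical helper: decide which dispatch mode applies.
choose_dispatch <- function(n_cols, n_cores,
                            os_type = .Platform$OS.type,
                            r_version = getRversion()) {
  # Forking requires a non-Windows platform and R >= 4.0.
  can_fork <- os_type != "windows" &&
    as.numeric_version(r_version) >= "4.0.0"
  if (!can_fork) {
    return("psock-by-column")  # unchanged path
  }
  # Row batching only pays off when columns alone cannot fill the cores.
  if (n_cols < n_cores) "fork-row-batch" else "fork-by-column"
}

choose_dispatch(n_cols = 6, n_cores = 16, os_type = "unix", r_version = "4.3.0")
```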

run_as_sir_column() gains an optional `rows` parameter (NULL = all rows,
backward-compatible). Results are reassembled via as.sir(c(as.character(.))),
which is safe for already-clean SIR values.
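
The reassembly step can be illustrated with base R alone. In this sketch an ordered factor stands in for the sir class, and `as_sir_standin()` and `run_column()` are hypothetical names; the real worker and class live in the package and may differ.

```r
# Simplified stand-in for the sir class (assumption for illustration only).
as_sir_standin <- function(x) {
  factor(x, levels = c("S", "I", "R"), ordered = TRUE)
}

# Hypothetical per-job worker: `rows = NULL` processes all rows.
run_column <- function(values, rows = NULL) {
  if (!is.null(rows)) values <- values[rows]
  as_sir_standin(values)
}

values <- c("S", "R", "I", "S", "R", "S")
slices <- list(1:2, 3:4, 5:6)
parts <- lapply(slices, function(rows) run_column(values, rows))

# Reassemble: convert each part to character, concatenate, re-class once.
combined <- as_sir_standin(unlist(lapply(parts, as.character)))
identical(combined, run_column(values))
```

Going through character before re-classing avoids depending on how c() combines classed vectors, at the cost of one extra conversion over values that are already clean.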

https://claude.ai/code/session_012DXCXbZUC54Zij1z9bFiHR
2026-04-24 22:01:09 +00:00