diff --git a/CLAUDE.md b/CLAUDE.md
index 566f6de84..752d6045e 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -85,6 +85,27 @@ _pkgdown.yml # pkgdown website configuration
- `translate.R` — 28-language translation system
- `ggplot_sir.R` / `ggplot_pca.R` / `plotting.R` — visualisation functions
+## Code Style
+
+Follow the [tidyverse style guide](https://style.tidyverse.org/) precisely. Key rules:
+
+- 2-space indentation; no tabs
+- `<-` for assignment, not `=`
+- Spaces around all binary operators and after commas; no spaces inside parentheses
+- When a function call must break across lines, place the first argument on a new line indented by 2 spaces, and put the closing `)` on its own line — **never align arguments to the opening parenthesis** (no hanging/forced mid-line indentation)
+
+```r
+# good
+stop_(
+ "some long message part one ",
+ "part two"
+)
+
+# bad — forces indentation to match the opening parenthesis
+stop_("some long message part one ",
+ "part two")
+```
+
## Custom S3 Classes
The package defines five S3 classes with full print/format/plot/vctrs support:
diff --git a/DESCRIPTION b/DESCRIPTION
index c34159271..024b62200 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,5 +1,5 @@
Package: AMR
-Version: 3.0.1.9076
+Version: 3.0.1.9077
Date: 2026-06-26
Title: Antimicrobial Resistance Data Analysis
Description: Functions to simplify and standardise antimicrobial resistance (AMR)
diff --git a/NEWS.md b/NEWS.md
index 15789d30d..1a62ed8a8 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,4 +1,4 @@
-# AMR 3.0.1.9076
+# AMR 3.0.1.9077
Planned as v3.1.0, end of June 2026.
@@ -37,6 +37,7 @@ Planned as v3.1.0, end of June 2026.
* Fixed some EUCAST Expert Rules, mostly on *S. pneumoniae*
### Updated
+* `top_n_microorganisms()`: added the `property_for_each` argument for sub-grouping within each of the top *n* groups; `property_for_each` must be at a lower taxonomic rank than `property`; fixed `property = NULL` not being accepted; the inner filter now tracks original row indices to prevent cross-group contamination
* Taxonomic update for all microorganisms, now updated to June 2026
* `mo_kingdom()` now returns the formal taxonomic kingdom; a one-time note per session explains the change when querying bacterial or archaeal records.
* `mo_taxonomy()` and `mo_info()` gained `domain` for the list output
diff --git a/R/sysdata.rda b/R/sysdata.rda
index 3041024d3..6b90621ff 100755
Binary files a/R/sysdata.rda and b/R/sysdata.rda differ
diff --git a/R/tidymodels.R b/R/tidymodels.R
index f357b815f..b2513e0aa 100755
--- a/R/tidymodels.R
+++ b/R/tidymodels.R
@@ -120,13 +120,14 @@ all_disk_predictors <- function() {
#' @rdname amr-tidymodels
#' @export
step_mic_log2 <- function(
- recipe,
- ...,
- role = NA,
- trained = FALSE,
- columns = NULL,
- skip = FALSE,
- id = recipes::rand_id("mic_log2")) {
+ recipe,
+ ...,
+ role = NA,
+ trained = FALSE,
+ columns = NULL,
+ skip = FALSE,
+ id = recipes::rand_id("mic_log2")
+) {
recipes::add_step(
recipe,
step_mic_log2_new(
@@ -195,13 +196,14 @@ tidy.step_mic_log2 <- function(x, ...) {
#' @rdname amr-tidymodels
#' @export
step_sir_numeric <- function(
- recipe,
- ...,
- role = NA,
- trained = FALSE,
- columns = NULL,
- skip = FALSE,
- id = recipes::rand_id("sir_numeric")) {
+ recipe,
+ ...,
+ role = NA,
+ trained = FALSE,
+ columns = NULL,
+ skip = FALSE,
+ id = recipes::rand_id("sir_numeric")
+) {
recipes::add_step(
recipe,
step_sir_numeric_new(
diff --git a/R/top_n_microorganisms.R b/R/top_n_microorganisms.R
index d6dd08c86..9d237d644 100755
--- a/R/top_n_microorganisms.R
+++ b/R/top_n_microorganisms.R
@@ -29,73 +29,88 @@
#' Filter Top *n* Microorganisms
#'
-#' This function filters a data set to include only the top *n* microorganisms based on a specified property, such as taxonomic family or genus. For example, it can filter a data set to the top 3 species, or to any species in the top 5 genera, or to the top 3 species in each of the top 5 genera.
+#' Filters a data set to include only the top *n* microorganisms based on a specified property, such as taxonomic family or genus. For example, it can filter a data set to the top 3 species, to any species in the top 5 genera, or to the top 3 species in each of the top 5 genera.
#' @param x A data frame containing microbial data.
-#' @param n An integer specifying the maximum number of unique values of the `property` to include in the output.
-#' @param property A character string indicating the microorganism property to use for filtering. Must be one of the column names of the [microorganisms] data set: `r vector_or(colnames(microorganisms), sort = FALSE, documentation = TRUE)`. If `NULL`, the raw values from `col_mo` will be used without transformation. When using `"species"` (default) or `"subpecies"`, the genus will be added to make sure each (sub)species still belongs to the right genus.
-#' @param n_for_each An optional integer specifying the maximum number of rows to retain for each value of the selected property. If `NULL`, all rows within the top *n* groups will be included.
+#' @param n A positive whole number specifying the maximum number of unique values of `property` to include in the output.
+#' @param property A character string indicating the microorganism property to use for filtering. Must be one of the column names of the [microorganisms] data set: `r vector_or(colnames(microorganisms), sort = FALSE, documentation = TRUE)`. If `NULL`, the raw values from `col_mo` will be used without transformation. When using `"species"` (default) or `"subspecies"`, the genus is prepended to ensure each name is unambiguous.
+#' @param n_for_each An optional positive whole number specifying the maximum number of distinct microorganism groups at the level of `property_for_each` to retain within each of the top *n* groups. If `NULL` (default), all rows within the top *n* groups are retained.
+#' @param property_for_each The microorganism property to use for sub-grouping within each top *n* group. Must be one of the column names of the [microorganisms] data set and at a strictly lower taxonomic rank than `property` (allowed order: domain > kingdom > phylum > class > order > family > genus > species > subspecies). Defaults to `"species"`. Only relevant when `n_for_each` is set.
#' @param col_mo A character string indicating the column in `x` that contains microorganism names or codes. Defaults to the first column of class [`mo`]. Values will be coerced using [as.mo()].
#' @param ... Additional arguments passed on to [mo_property()] when `property` is not `NULL`.
-#' @details This function is useful for preprocessing data before creating [antibiograms][antibiogram()] or other analyses that require focused subsets of microbial data. For example, it can filter a data set to only include isolates from the top 10 species.
+#' @details This function is useful for preprocessing data before creating [antibiograms][antibiogram()] or other analyses that require focused subsets of microbial data.
#' @export
#' @seealso [mo_property()], [as.mo()], [antibiogram()]
#' @examples
#' # filter to the top 3 species:
-#' top_n_microorganisms(example_isolates,
-#' n = 3
-#' )
+#' top_n_microorganisms(example_isolates, n = 3)
#'
#' # filter to any species in the top 5 genera:
-#' top_n_microorganisms(example_isolates,
-#' n = 5, property = "genus"
-#' )
+#' top_n_microorganisms(example_isolates, n = 5, property = "genus")
#'
#' # filter to the top 3 species in each of the top 5 genera:
#' top_n_microorganisms(example_isolates,
#' n = 5, property = "genus", n_for_each = 3
#' )
-top_n_microorganisms <- function(x, n, property = "species", n_for_each = NULL, col_mo = NULL, ...) {
+#'
+#' # filter to the top 2 genera in each of the top 3 families:
+#' top_n_microorganisms(example_isolates,
+#' n = 3, property = "family", n_for_each = 2, property_for_each = "genus"
+#' )
+top_n_microorganisms <- function(x, n, property = "species", n_for_each = NULL, property_for_each = "species", col_mo = NULL, ...) {
meet_criteria(x, allow_class = "data.frame") # also checks dimensions to be >0
meet_criteria(n, allow_class = c("numeric", "integer"), has_length = 1, is_finite = TRUE, is_positive = TRUE)
- meet_criteria(property, allow_class = "character", has_length = 1, is_in = colnames(AMR::microorganisms))
+ meet_criteria(property, allow_class = "character", has_length = 1, is_in = colnames(AMR::microorganisms), allow_NULL = TRUE)
meet_criteria(n_for_each, allow_class = c("numeric", "integer"), has_length = 1, is_finite = TRUE, is_positive = TRUE, allow_NULL = TRUE)
+ meet_criteria(property_for_each, allow_class = "character", has_length = 1, is_in = colnames(AMR::microorganisms), allow_NULL = TRUE)
meet_criteria(col_mo, allow_class = "character", has_length = 1, allow_NULL = TRUE, is_in = colnames(x))
+
if (is.null(col_mo)) {
col_mo <- search_type_in_df(x = x, type = "mo", info = TRUE)
stop_if(is.null(col_mo), "{.arg col_mo} must be set")
}
- x.bak <- x
+ .taxonomic_ranks <- c("domain", "kingdom", "phylum", "class", "order", "family", "genus", "species", "subspecies")
+ if (!is.null(n_for_each) && !is.null(property) && !is.null(property_for_each)) {
+ prop_rank <- match(property, .taxonomic_ranks)
+ each_rank <- match(property_for_each, .taxonomic_ranks)
+ if (!is.na(prop_rank) && !is.na(each_rank) && each_rank <= prop_rank) {
+ stop_(
+ "`property_for_each` (\"", property_for_each, "\") must be at a lower ",
+ "taxonomic rank than `property` (\"", property, "\")"
+ )
+ }
+ }
+ x.bak <- x
x[, col_mo] <- as.mo(x[, col_mo, drop = TRUE], keep_synonyms = TRUE)
- if (is.null(property)) {
- x$prop_val <- x[[col_mo]]
- } else if (property == "species") {
- x$prop_val <- paste(mo_genus(x[[col_mo]], ...), mo_species(x[[col_mo]], ...))
- } else if (property == "subspecies") {
- x$prop_val <- paste(mo_genus(x[[col_mo]], ...), mo_species(x[[col_mo]], ...), mo_subspecies(x[[col_mo]], ...))
- } else {
- x$prop_val <- mo_property(x[[col_mo]], property = property, ...)
+ get_prop_val <- function(prop) {
+ if (is.null(prop)) {
+ x[[col_mo]]
+ } else if (prop == "species") {
+ paste(mo_genus(x[[col_mo]], ...), mo_species(x[[col_mo]], ...))
+ } else if (prop == "subspecies") {
+ paste(mo_genus(x[[col_mo]], ...), mo_species(x[[col_mo]], ...), mo_subspecies(x[[col_mo]], ...))
+ } else {
+ mo_property(x[[col_mo]], property = prop, ...)
+ }
}
- counts <- sort(table(x$prop_val), decreasing = TRUE)
- n <- as.integer(n)
- if (length(counts) < n) {
- n <- length(counts)
- }
- count_values <- names(counts)[seq_len(n)]
- filtered_rows <- which(x$prop_val %in% count_values)
+ x$prop_val <- get_prop_val(property)
+ counts <- sort(table(x$prop_val), decreasing = TRUE)
+ n <- min(as.integer(n), length(counts))
+ filtered_rows <- which(x$prop_val %in% names(counts)[seq_len(n)])
if (!is.null(n_for_each)) {
n_for_each <- as.integer(n_for_each)
+ x$prop_val_each <- get_prop_val(property_for_each)
filtered_x <- x[filtered_rows, , drop = FALSE]
+ filtered_x$.orig_row <- filtered_rows
filtered_rows <- do.call(
c,
lapply(split(filtered_x, filtered_x$prop_val), function(group) {
- top_values <- names(sort(table(group[[col_mo]]), decreasing = TRUE)[seq_len(n_for_each)])
- top_values <- top_values[!is.na(top_values)]
- which(x[[col_mo]] %in% top_values)
+ top_each <- names(sort(table(group$prop_val_each), decreasing = TRUE)[seq_len(n_for_each)])
+ group$.orig_row[group$prop_val_each %in% top_each[!is.na(top_each)]]
})
)
}
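The row-index tracking in the inner filter above can be illustrated with a minimal base R sketch. It uses a toy data frame whose column names (`group`, `sub`) are hypothetical stand-ins for `prop_val` and `prop_val_each`; it is not the package code, only an illustration of why matching against the full data set can pull in rows from other groups when a sub-group value occurs in more than one group.

```r
# Toy data (hypothetical): `group` plays the role of prop_val,
# `sub` plays the role of prop_val_each.
x <- data.frame(
  group = c("a", "a", "a", "b", "b", "b"),
  sub = c("p", "p", "q", "q", "q", "p"),
  stringsAsFactors = FALSE
)
n_for_each <- 1L

# Most frequent sub-group value(s) within one group
top_sub <- function(g) {
  top <- names(sort(table(g$sub), decreasing = TRUE))[seq_len(n_for_each)]
  top[!is.na(top)]
}

# Naive approach: match each group's top values against the WHOLE data set,
# so "p" (top in group a) also selects the "p" row in group b, and vice versa
naive_rows <- sort(unique(unlist(
  lapply(split(x, x$group), function(g) which(x$sub %in% top_sub(g)))
)))

# Tracked approach: carry the original row index into each group and
# return only indices of rows that belong to that group
x$.orig_row <- seq_len(nrow(x))
tracked_rows <- sort(unlist(
  lapply(split(x, x$group), function(g) g$.orig_row[g$sub %in% top_sub(g)]),
  use.names = FALSE
))

naive_rows
tracked_rows
```

In this toy case the naive variant selects every row, while the tracked variant keeps only the rows of the top sub-group inside each group.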
diff --git a/README.Rmd b/README.Rmd
index a7b2cbce6..0c04b17d9 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -11,6 +11,7 @@ knitr::opts_chunk$set(
# fig.path = "man/figures/README-",
out.width = "100%"
)
+options(width = 100)
AMR:::reset_all_thrown_messages()
```
diff --git a/data/antibiotics.rda b/data/antibiotics.rda
index 9a9e74d3d..4b919f059 100644
Binary files a/data/antibiotics.rda and b/data/antibiotics.rda differ
diff --git a/data/antimicrobials.rda b/data/antimicrobials.rda
index f5953720f..26199aa8c 100644
Binary files a/data/antimicrobials.rda and b/data/antimicrobials.rda differ
diff --git a/index.Rmd b/index.Rmd
index d6b0d8033..e3814f985 100644
--- a/index.Rmd
+++ b/index.Rmd
@@ -13,6 +13,7 @@ knitr::opts_chunk$set(
fig.path = "pkgdown/assets/",
out.width = "100%"
)
+options(width = 100)
AMR:::reset_all_thrown_messages()
```
diff --git a/index.md b/index.md
index 45829bd79..31f49bfda 100644
--- a/index.md
+++ b/index.md
@@ -27,12 +27,9 @@
-
amr-for-r.org
-
-
doi.org/10.18637/jss.v104.i03
@@ -64,7 +61,7 @@ formed the basis of two PhD theses ([DOI
[DOI 10.33612/diss.192486375](https://doi.org/10.33612/diss.192486375)).
After installing this package, R knows [**~97 000 distinct microbial
-species**](./reference/microorganisms.html) (updated May 2026) and all
+species**](./reference/microorganisms.html) (updated May 2026) and all
[**~620 antimicrobial and antiviral
drugs**](./reference/antimicrobials.html) by name and code (including
ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all
@@ -175,11 +172,13 @@ example_isolates %>%
#> ℹ Using column mo as input for `mo_fullname()`
#> ℹ Using column mo as input for `mo_is_gram_negative()`
#> ℹ Using column mo as input for `mo_is_intrinsic_resistant()`
-#> ℹ Determining intrinsic resistance based on 'EUCAST Expected Resistant
-#> Phenotypes' v1.2 (2023). This note will be shown once per session.
-#> ℹ For `aminoglycosides()` using columns GEN (gentamicin), TOB (tobramycin), AMK
-#> (amikacin), and KAN (kanamycin)
-#> ℹ For `carbapenems()` using columns IPM (imipenem) and MEM (meropenem)
+#> ℹ Determining intrinsic resistance based on 'EUCAST Expected
+#> Resistant Phenotypes' v1.2 (2023). This note will be shown
+#> once per session.
+#> ℹ For `aminoglycosides()` using columns GEN (gentamicin), TOB
+#> (tobramycin), AMK (amikacin), and KAN (kanamycin)
+#> ℹ For `carbapenems()` using columns IPM (imipenem) and MEM
+#> (meropenem)
#> # A tibble: 35 × 7
#> bacteria GEN TOB AMK KAN IPM MEM
#>
@@ -229,8 +228,8 @@ wisca(example_isolates,
```
| Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
-|:---|:---|:---|
-| 69.9% (64.7-75.2%) | 93.7% (92.2-95.1%) | 89.8% (86.8-92.3%) |
+|:------------------------|:-------------------------------------|:-------------------------------------|
+| 70% (64.7-75.2%) | 93.6% (92.2-95.1%) | 89.8% (87-92.5%) |
WISCA supports stratification by any clinical variable, so you can
generate syndrome-specific or ward-specific coverage estimates:
@@ -244,10 +243,10 @@ wisca(example_isolates,
```
| Syndromic Group | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
-|:---|:---|:---|:---|
-| Clinical | 74.6% (69-80.1%) | 93.6% (91.9-95.1%) | 90.5% (86.9-93%) |
-| ICU | 57% (48.7-65.8%) | 86.7% (83.7-89.7%) | 82.8% (77.9-87.2%) |
-| Outpatient | 57.5% (46.5-68.7%) | 76.7% (70.6-82.4%) | 67.5% (57.2-76.7%) |
+|:----------------|:------------------------|:-------------------------------------|:-------------------------------------|
+| Clinical | 74.6% (68.6-80.6%) | 93.7% (92.1-95.1%) | 90.4% (87-93.1%) |
+| ICU | 57% (48.6-65.7%) | 86.8% (83.6-89.8%) | 82.9% (78.1-87.3%) |
+| Outpatient | 56.9% (45.9-68.2%) | 76.7% (70.6-82.3%) | 68% (57.6-77.2%) |
**For AMR surveillance**, traditional antibiograms remain the right tool
for tracking resistance per species over time:
@@ -256,13 +255,14 @@ for tracking resistance per species over time:
antibiogram(example_isolates,
mo_transform = "gramstain",
antimicrobials = c("AMC", carbapenems(), "TZP"))
-#> ℹ For `carbapenems()` using columns IPM (imipenem) and MEM (meropenem)
+#> ℹ For `carbapenems()` using columns IPM (imipenem) and MEM
+#> (meropenem)
```
-| Pathogen | Amoxicillin/clavulanic acid | Imipenem | Meropenem | Piperacillin/tazobactam |
-|:---|:---|:---|:---|:---|
-| Gram-negative | 76% (73-79%,N=726) | 99% (98-100%,N=631) | 100% (99-100%,N=626) | 88% (85-91%,N=641) |
-| Gram-positive | 76% (74-79%,N=1138) | 81% (75-85%,N=257) | 77% (70-82%,N=203) | 86% (82-89%,N=345) |
+| Pathogen | Amoxicillin/clavulanic acid | Imipenem | Meropenem | Piperacillin/tazobactam |
+|:--------------|:----------------------------|:--------------------|:---------------------|:------------------------|
+| Gram-negative | 76% (73-79%,N=726) | 99% (98-100%,N=631) | 100% (99-100%,N=626) | 88% (85-91%,N=641) |
+| Gram-positive | 76% (74-79%,N=1138) | 81% (75-85%,N=257) | 77% (70-82%,N=203) | 86% (82-89%,N=345) |
Combination antibiograms show the additional coverage gained by adding a
second agent, stratified by species:
@@ -273,10 +273,10 @@ antibiogram(example_isolates,
antimicrobials = c("TZP", "TZP+TOB", "TZP+GEN"))
```
-| Pathogen | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
-|:---|:---|:---|:---|
-| Gram-negative | 88% (85-91%,N=641) | 99% (97-99%,N=691) | 98% (97-99%,N=693) |
-| Gram-positive | 86% (82-89%,N=345) | 98% (96-98%,N=1044) | 95% (93-97%,N=550) |
+| Pathogen | Piperacillin/tazobactam | Piperacillin/tazobactam + Gentamicin | Piperacillin/tazobactam + Tobramycin |
+|:--------------|:------------------------|:-------------------------------------|:-------------------------------------|
+| Gram-negative | 88% (85-91%,N=641) | 99% (97-99%,N=691) | 98% (97-99%,N=693) |
+| Gram-positive | 86% (82-89%,N=345) | 98% (96-98%,N=1044) | 95% (93-97%,N=550) |
Like many other functions in this package, `antibiogram()` and `wisca()`
come with support for 28 languages that are often detected automatically
@@ -349,9 +349,10 @@ example_isolates %>%
summarise(across(c(GEN, TOB),
list(total_R = resistance,
conf_int = function(x) sir_confidence_interval(x, collapse = "-"))))
-#> ℹ `resistance()` assumes the EUCAST guideline and thus considers the 'I'
-#> category susceptible. Set the `guideline` argument or the `AMR_guideline`
-#> option to either "CLSI" or "EUCAST", see `?AMR-options`.
+#> ℹ `resistance()` assumes the EUCAST guideline and thus
+#> considers the 'I' category susceptible. Set the `guideline`
+#> argument or the `AMR_guideline` option to either "CLSI" or
+#> "EUCAST", see `?AMR-options`.
#> ℹ This message will be shown once per session.
#> # A tibble: 3 × 5
#> ward GEN_total_R GEN_conf_int TOB_total_R TOB_conf_int
@@ -375,15 +376,16 @@ out <- example_isolates %>%
# calculate AMR using resistance(), over all aminoglycosides and polymyxins:
summarise(across(c(aminoglycosides(), polymyxins()),
resistance))
-#> ℹ For `aminoglycosides()` using columns GEN (gentamicin), TOB (tobramycin), AMK
-#> (amikacin), and KAN (kanamycin)
+#> ℹ For `aminoglycosides()` using columns GEN (gentamicin), TOB
+#> (tobramycin), AMK (amikacin), and KAN (kanamycin)
#> ℹ For `polymyxins()` using column COL (colistin)
#> Warning: There was 1 warning in `summarise()`.
-#> ℹ In argument: `across(c(aminoglycosides(), polymyxins()), resistance)`.
+#> ℹ In argument: `across(c(aminoglycosides(), polymyxins()),
+#> resistance)`.
#> ℹ In group 3: `ward = "Outpatient"`.
#> Caused by warning:
-#> ! Introducing NA: only 23 results available for KAN in group: ward = "Outpatient"
-#> (whilst `minimum = 30`).
+#> ! Introducing NA: only 23 results available for KAN in group:
+#> ward = "Outpatient" (whilst `minimum = 30`).
out
#> # A tibble: 3 × 6
#> ward GEN TOB AMK KAN COL
diff --git a/man/AMR.Rd b/man/AMR.Rd
index ccc786ca6..13406b27b 100644
--- a/man/AMR.Rd
+++ b/man/AMR.Rd
@@ -12,7 +12,7 @@ The \code{AMR} package is a peer-reviewed, \href{https://amr-for-r.org/#copyrigh
This work was published in the Journal of Statistical Software (Volume 104(3); \doi{10.18637/jss.v104.i03}) and formed the basis of two PhD theses (\doi{10.33612/diss.177417131} and \doi{10.33612/diss.192486375}).
-After installing this package, R knows \href{https://amr-for-r.org/reference/microorganisms.html}{\strong{~97 000 distinct microbial species}} (updated May 2026) and all \href{https://amr-for-r.org/reference/antimicrobials.html}{\strong{~620 antimicrobial and antiviral drugs}} by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral clinical breakpoint guidelines from CLSI 2011-2026 and EUCAST 2011-2026 are included, even with epidemiological cut-off (ECOFF) values. It supports and can read any data format, including WHONET data. This package works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). \strong{It was designed to work in any setting, including those with very limited resources}. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the \href{https://www.rug.nl}{University of Groningen} and the \href{https://www.umcg.nl}{University Medical Center Groningen}.
+After installing this package, R knows \href{https://amr-for-r.org/reference/microorganisms.html}{\strong{~97 000 distinct microbial species}} (updated May 2026) and all \href{https://amr-for-r.org/reference/antimicrobials.html}{\strong{~620 antimicrobial and antiviral drugs}} by name and code (including ATC, EARS-Net, ASIARS-Net, PubChem, LOINC and SNOMED CT), and knows all about valid SIR and MIC values. The integral clinical breakpoint guidelines from CLSI 2011-2026 and EUCAST 2011-2026 are included, even with epidemiological cut-off (ECOFF) values. It supports and can read any data format, including WHONET data. This package works on Windows, macOS and Linux with all versions of R since R-3.0 (April 2013). \strong{It was designed to work in any setting, including those with very limited resources}. It was created for both routine data analysis and academic research at the Faculty of Medical Sciences of the \href{https://www.rug.nl}{University of Groningen} and the \href{https://www.umcg.nl}{University Medical Center Groningen}.
The \code{AMR} package is available in English, Arabic, Bengali, Chinese, Czech, Danish, Dutch, Finnish, French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Turkish, Ukrainian, Urdu, and Vietnamese. Antimicrobial drug (group) names and colloquial microorganism names are provided in these languages.
}
diff --git a/man/g.test.Rd b/man/g.test.Rd
index 39a42bc1f..8d072b379 100644
--- a/man/g.test.Rd
+++ b/man/g.test.Rd
@@ -46,7 +46,7 @@ A list with class \code{"htest"} containing the following
\code{(observed - expected) / sqrt(expected)}.}
\item{stdres}{standardized residuals,
\code{(observed - expected) / sqrt(V)}, where \code{V} is the
- residual cell variance (Agresti, 2007, section 2.4.5
+ residual cell variance {(\if{html}{\out{}}Agresti 2007\if{html}{\out{}}, section 2.4.5)}
for the case where \code{x} is a matrix, \code{n * p * (1 - p)} otherwise).}
}
\description{
diff --git a/man/ggplot_pca.Rd b/man/ggplot_pca.Rd
index bbbd83e87..671e01287 100644
--- a/man/ggplot_pca.Rd
+++ b/man/ggplot_pca.Rd
@@ -59,8 +59,9 @@ ggplot_pca(
}
\item{pc.biplot}{
- If true, use what Gabriel (1971) refers to as a "principal component
- biplot", with \code{lambda = 1} and observations scaled up by sqrt(n) and
+ If true, use what {\if{html}{\cite{}\out{}}Gabriel (1971)\if{html}{\out{}}} refers to as a
+ \dQuote{principal component biplot},
+ with \code{lambda = 1} and observations scaled up by sqrt(n) and
variables scaled down by sqrt(n). Then inner products between
variables approximate covariances and distances between observations
approximate Mahalanobis distance.
diff --git a/man/top_n_microorganisms.Rd b/man/top_n_microorganisms.Rd
index 1160a136a..93b9126de 100644
--- a/man/top_n_microorganisms.Rd
+++ b/man/top_n_microorganisms.Rd
@@ -9,6 +9,7 @@ top_n_microorganisms(
n,
property = "species",
n_for_each = NULL,
+ property_for_each = "species",
col_mo = NULL,
...
)
@@ -16,37 +17,40 @@ top_n_microorganisms(
\arguments{
\item{x}{A data frame containing microbial data.}
-\item{n}{An integer specifying the maximum number of unique values of the \code{property} to include in the output.}
+\item{n}{A positive whole number specifying the maximum number of unique values of \code{property} to include in the output.}
-\item{property}{A character string indicating the microorganism property to use for filtering. Must be one of the column names of the \link{microorganisms} data set: \code{"mo"}, \code{"fullname"}, \code{"status"}, \code{"domain"}, \code{"kingdom"}, \code{"phylum"}, \code{"class"}, \code{"order"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"subspecies"}, \code{"rank"}, \code{"ref"}, \code{"oxygen_tolerance"}, \code{"morphology"}, \code{"source"}, \code{"lpsn"}, \code{"lpsn_parent"}, \code{"lpsn_renamed_to"}, \code{"mycobank"}, \code{"mycobank_parent"}, \code{"mycobank_renamed_to"}, \code{"gbif"}, \code{"gbif_parent"}, \code{"gbif_renamed_to"}, \code{"prevalence"}, or \code{"snomed"}. If \code{NULL}, the raw values from \code{col_mo} will be used without transformation. When using \code{"species"} (default) or \code{"subpecies"}, the genus will be added to make sure each (sub)species still belongs to the right genus.}
+\item{property}{A character string indicating the microorganism property to use for filtering. Must be one of the column names of the \link{microorganisms} data set: \code{"mo"}, \code{"fullname"}, \code{"status"}, \code{"domain"}, \code{"kingdom"}, \code{"phylum"}, \code{"class"}, \code{"order"}, \code{"family"}, \code{"genus"}, \code{"species"}, \code{"subspecies"}, \code{"rank"}, \code{"ref"}, \code{"oxygen_tolerance"}, \code{"morphology"}, \code{"source"}, \code{"lpsn"}, \code{"lpsn_parent"}, \code{"lpsn_renamed_to"}, \code{"mycobank"}, \code{"mycobank_parent"}, \code{"mycobank_renamed_to"}, \code{"gbif"}, \code{"gbif_parent"}, \code{"gbif_renamed_to"}, \code{"prevalence"}, or \code{"snomed"}. If \code{NULL}, the raw values from \code{col_mo} will be used without transformation. When using \code{"species"} (default) or \code{"subspecies"}, the genus is prepended to ensure each name is unambiguous.}
-\item{n_for_each}{An optional integer specifying the maximum number of rows to retain for each value of the selected property. If \code{NULL}, all rows within the top \emph{n} groups will be included.}
+\item{n_for_each}{An optional positive whole number specifying the maximum number of distinct microorganism groups at the level of \code{property_for_each} to retain within each of the top \emph{n} groups. If \code{NULL} (default), all rows within the top \emph{n} groups are retained.}
+
+\item{property_for_each}{The microorganism property to use for sub-grouping within each top \emph{n} group. Must be one of the column names of the \link{microorganisms} data set and at a strictly lower taxonomic rank than \code{property} (allowed order: domain > kingdom > phylum > class > order > family > genus > species > subspecies). Defaults to \code{"species"}. Only relevant when \code{n_for_each} is set.}
\item{col_mo}{A character string indicating the column in \code{x} that contains microorganism names or codes. Defaults to the first column of class \code{\link{mo}}. Values will be coerced using \code{\link[=as.mo]{as.mo()}}.}
\item{...}{Additional arguments passed on to \code{\link[=mo_property]{mo_property()}} when \code{property} is not \code{NULL}.}
}
\description{
-This function filters a data set to include only the top \emph{n} microorganisms based on a specified property, such as taxonomic family or genus. For example, it can filter a data set to the top 3 species, or to any species in the top 5 genera, or to the top 3 species in each of the top 5 genera.
+Filters a data set to include only the top \emph{n} microorganisms based on a specified property, such as taxonomic family or genus. For example, it can filter a data set to the top 3 species, to any species in the top 5 genera, or to the top 3 species in each of the top 5 genera.
}
\details{
-This function is useful for preprocessing data before creating \link[=antibiogram]{antibiograms} or other analyses that require focused subsets of microbial data. For example, it can filter a data set to only include isolates from the top 10 species.
+This function is useful for preprocessing data before creating \link[=antibiogram]{antibiograms} or other analyses that require focused subsets of microbial data.
}
\examples{
# filter to the top 3 species:
-top_n_microorganisms(example_isolates,
- n = 3
-)
+top_n_microorganisms(example_isolates, n = 3)
# filter to any species in the top 5 genera:
-top_n_microorganisms(example_isolates,
- n = 5, property = "genus"
-)
+top_n_microorganisms(example_isolates, n = 5, property = "genus")
# filter to the top 3 species in each of the top 5 genera:
top_n_microorganisms(example_isolates,
n = 5, property = "genus", n_for_each = 3
)
+
+# filter to the top 2 genera in each of the top 3 families:
+top_n_microorganisms(example_isolates,
+ n = 3, property = "family", n_for_each = 2, property_for_each = "genus"
+)
}
\seealso{
\code{\link[=mo_property]{mo_property()}}, \code{\link[=as.mo]{as.mo()}}, \code{\link[=antibiogram]{antibiogram()}}