All reference data (about microorganisms, antibiotics, SIR interpretation, EUCAST rules, etc.) in this AMR package are reliable, up-to-date and freely available. We continually export our data sets to formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. We also provide tab-separated text files that are machine-readable and suitable for input in any software program, such as laboratory information systems.
On this page, we explain how to download the data sets and what their structure looks like.
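As an illustration of using the tab-separated text files outside R, the sketch below reads such a file with Python's standard library. It assumes the export starts with a header row of column names; the sample text is a shortened stand-in with only four of the columns, and `read_tsv` is a hypothetical helper name, not part of the AMR package.

```python
import csv
import io

# Shortened stand-in for a tab-separated export: a header row followed by
# two records. Only four columns are shown to keep the sample small.
SAMPLE_TSV = (
    "mo\tfullname\tstatus\tkingdom\n"
    "B_ESCHR_COLI\tEscherichia coli\taccepted\tBacteria\n"
    "B_ESCHR_ALBR\tEscherichia albertii\taccepted\tBacteria\n"
)


def read_tsv(handle):
    """Return all records of a tab-separated file as a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


# For a downloaded file, pass open(path, newline="", encoding="utf-8")
# instead of the in-memory sample (UTF-8 is an assumption here).
rows = read_tsv(io.StringIO(SAMPLE_TSV))
for row in rows:
    print(row["mo"], row["fullname"])
```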
microorganisms: Full Microbial Taxonomy
A data set with 52 151 rows and 23 columns, containing the following column names:
mo, fullname, status, kingdom, phylum, class, order, family, genus, species, subspecies, rank, ref, oxygen_tolerance, source, lpsn, lpsn_parent, lpsn_renamed_to, gbif, gbif_parent, gbif_renamed_to, prevalence, and snomed.
This data set is available in R as microorganisms, after you load the AMR package.
It was last updated on 11 May 2023 19:56:27 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (1.2 MB)
- Download as tab-separated text file (11.7 MB)
- Download as Microsoft Excel workbook (5.2 MB)
- Download as Apache Feather file (5.5 MB)
- Download as Apache Parquet file (2.6 MB)
- Download as SAS data (SAS) file (50.9 MB)
- Download as SAS transport (XPT) file (48.4 MB)
- Download as IBM SPSS Statistics data file (17.7 MB)
- Download as Stata DTA file (48.5 MB)
NOTE: The exported files for SAS, SPSS and Stata contain only the first 50 SNOMED codes per record, as their file size would otherwise exceed 100 MB, the file size limit of GitHub. Their file structures and compression techniques are very inefficient. Advice? Use R instead. It’s free and much better in many ways.
The tab-separated text file and Microsoft Excel workbook both contain all SNOMED codes as comma-separated values.
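Because the SNOMED codes arrive as one comma-separated string per record, they usually need to be split before use. The sketch below shows one way to do that, with an optional limit that mimics the 50-code cap of the SAS, SPSS and Stata exports. `split_codes` is a hypothetical helper, and the sample string holds only the three codes visible in the example table further down this page, not a full record.

```python
def split_codes(field, limit=None):
    """Split a comma-separated code string into a list of clean codes.

    Empty input gives an empty list. If `limit` is set, only the first
    `limit` codes are kept.
    """
    codes = [code.strip() for code in field.split(",") if code.strip()]
    return codes if limit is None else codes[:limit]


# Three codes as displayed for the genus Escherichia (truncated sample).
snomed_field = "407310004, 407251000, 407281008"

all_codes = split_codes(snomed_field)
capped_codes = split_codes(snomed_field, limit=2)
print(all_codes)
print(capped_codes)
```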
Source
This data set contains the full microbial taxonomy of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF):
- Parte, AC et al. (2020). List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612. Accessed from https://lpsn.dsmz.de on December 11th, 2022.
- GBIF Secretariat (2022). GBIF Backbone Taxonomy. Checklist dataset. Accessed from https://www.gbif.org on December 11th, 2022.
- Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name ‘Microorganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL: https://phinvads.cdc.gov
Example content
Included (sub)species per taxonomic kingdom:
Kingdom | Number of (sub)species |
---|---|
(unknown kingdom) | 1 |
Animalia | 1 379 |
Archaea | 1 314 |
Bacteria | 36 485 |
Fungi | 7 901 |
Protozoa | 5 071 |
Example rows when filtering on genus Escherichia:
mo | fullname | status | kingdom | phylum | class | order | family | genus | species | subspecies | rank | ref | oxygen_tolerance | source | lpsn | lpsn_parent | lpsn_renamed_to | gbif | gbif_parent | gbif_renamed_to | prevalence | snomed |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
B_ESCHR | Escherichia | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | | | genus | Castellani et al., 1919 | facultative anaerobe | LPSN | 515602 | 482 | | 3221780 | 11158430 | | 1.0 | 407310004, 407251000, 407281008, … |
B_ESCHR_ADCR | Escherichia adecarboxylata | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | adecarboxylata | | species | Leclerc, 1962 | aerobe | LPSN | 776052 | 515602 | 777447 | | | | 1.0 | |
B_ESCHR_ALBR | Escherichia albertii | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | albertii | | species | Huys et al., 2003 | aerobe | LPSN | 776053 | 515602 | | 5427575 | 3221780 | | 1.0 | 419388003 |
B_ESCHR_BLTT | Escherichia blattae | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | blattae | | species | Burgess et al., 1973 | likely facultative anaerobe | LPSN | 776056 | 515602 | 788468 | | | | 1.5 | |
B_ESCHR_COLI | Escherichia coli | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | coli | | species | Castellani et al., 1919 | facultative anaerobe | LPSN | 776057 | 515602 | | 11286021 | 3221780 | | 1.0 | 1095001000112106, 715307006, 737528008, … |
B_ESCHR_DYSN | Escherichia dysenteriae | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | dysenteriae | | species | | likely facultative anaerobe | GBIF | | | | 10862979 | 3221780 | | 1.5 | |
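The example rows above can be reproduced in other software by filtering on the genus column. The sketch below additionally keeps only accepted species, which is one plausible use of the status and rank columns. The records are a hand-copied subset of the columns shown above, and `accepted_species` is a hypothetical helper name.

```python
# Subset of the columns of the six Escherichia example rows.
RECORDS = [
    {"fullname": "Escherichia", "status": "accepted",
     "genus": "Escherichia", "rank": "genus"},
    {"fullname": "Escherichia adecarboxylata", "status": "synonym",
     "genus": "Escherichia", "rank": "species"},
    {"fullname": "Escherichia albertii", "status": "accepted",
     "genus": "Escherichia", "rank": "species"},
    {"fullname": "Escherichia blattae", "status": "synonym",
     "genus": "Escherichia", "rank": "species"},
    {"fullname": "Escherichia coli", "status": "accepted",
     "genus": "Escherichia", "rank": "species"},
    {"fullname": "Escherichia dysenteriae", "status": "accepted",
     "genus": "Escherichia", "rank": "species"},
]


def accepted_species(records, genus):
    """Return full names of accepted species within the given genus."""
    return [
        record["fullname"]
        for record in records
        if record["genus"] == genus
        and record["rank"] == "species"
        and record["status"] == "accepted"
    ]


escherichia = accepted_species(RECORDS, "Escherichia")
print(escherichia)
```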
antibiotics: Antibiotic (+Antifungal) Drugs
A data set with 483 rows and 14 columns, containing the following column names:
ab, cid, name, group, atc, atc_group1, atc_group2, abbreviations, synonyms, oral_ddd, oral_units, iv_ddd, iv_units, and loinc.
This data set is available in R as antibiotics, after you load the AMR package.
It was last updated on 22 February 2023 13:38:57 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (39 kB)
- Download as tab-separated text file (0.1 MB)
- Download as Microsoft Excel workbook (66 kB)
- Download as Apache Feather file (0.1 MB)
- Download as Apache Parquet file (97 kB)
- Download as SAS data (SAS) file (1.9 MB)
- Download as SAS transport (XPT) file (1.4 MB)
- Download as IBM SPSS Statistics data file (0.3 MB)
- Download as Stata DTA file (0.4 MB)
The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the ATC codes, common abbreviations, trade names and LOINC codes as comma-separated values.
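Since one drug can carry several ATC codes in a single comma-separated field, a reverse lookup from ATC code to drug code can be handy. The following is a minimal sketch under the assumption that each ATC code belongs to exactly one drug; `index_by_atc` is a hypothetical helper, and the two records use values from the example content on this page.

```python
def index_by_atc(records):
    """Map every ATC code to the `ab` code of the record that lists it.

    Assumes an ATC code occurs in one record only; a later record would
    silently overwrite an earlier one.
    """
    index = {}
    for record in records:
        for code in record["atc"].split(","):
            code = code.strip()
            if code:
                index[code] = record["ab"]
    return index


records = [
    {"ab": "AMK", "atc": "D06AX12, J01GB06, S01AA21"},
    {"ab": "AMX", "atc": "J01CA04"},
]

atc_to_ab = index_by_atc(records)
print(atc_to_ab["J01GB06"])
```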
Source
This data set contains all EARS-Net and ATC codes gathered from WHO and WHONET, and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.
- ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use)
- PubChem by the US National Library of Medicine
- WHONET software 2019
- LOINC (Logical Observation Identifiers Names and Codes)
Example content
ab | cid | name | group | atc | atc_group1 | atc_group2 | abbreviations | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | loinc |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
AMK | 37768 | Amikacin | Aminoglycosides | D06AX12, J01GB06, S01AA21 | Aminoglycoside antibacterials | Other aminoglycosides | ak, ami, amik, … | amicacin, amikacillin, amikacin, … | | | 1.0 | g | 13546-7, 15098-7, 17798-0, … |
AMX | 33613 | Amoxicillin | Beta-lactams/penicillins | J01CA04 | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | ac, amox, amx | actimoxi, amoclen, amolin, … | 1.5 | g | 3.0 | g | 16365-9, 25274-2, 3344-9, … |
AMC | 23665637 | Amoxicillin/clavulanic acid | Beta-lactams/penicillins | J01CR02 | Beta-lactam antibacterials, penicillins | Combinations of penicillins, incl. beta-lactamase inhibitors | a/c, amcl, aml, … | amocla, amoclan, amoclav, … | 1.5 | g | 3.0 | g | |
AMP | 6249 | Ampicillin | Beta-lactams/penicillins | J01CA01, S01AA19 | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | am, amp, ampi | acillin, adobacillin, amblosin, … | 2.0 | g | 6.0 | g | 21066-6, 3355-5, 33562-0, … |
AZM | 447043 | Azithromycin | Macrolides/lincosamides | J01FA10, S01AA26 | Macrolides, lincosamides and streptogramins | Macrolides | az, azi, azit, … | aritromicina, aruzilina, azasite, … | 0.3 | g | 0.5 | g | 16420-2, 25233-8 |
PEN | 5904 | Benzylpenicillin | Beta-lactams/penicillins | J01CE01, S01AA14 | Combinations of antibacterials | Combinations of antibacterials | bepe, pen, peni, … | abbocillin, ayercillin, bencilpenicilina, … | | | 3.6 | g | |
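The oral_ddd and iv_ddd columns can be used to express a consumed amount as a number of Defined Daily Doses, which is simply the total amount divided by the DDD in the same unit. The sketch below uses the oral DDD of amoxicillin from the table above (1.5 g); the total of 21 g is a made-up figure for illustration, and `number_of_ddds` is a hypothetical helper.

```python
def number_of_ddds(total_amount, ddd):
    """Return how many Defined Daily Doses `total_amount` represents.

    Both arguments must be expressed in the same unit (for example g).
    """
    if ddd <= 0:
        raise ValueError("ddd must be a positive number")
    return total_amount / ddd


AMOXICILLIN_ORAL_DDD_G = 1.5   # oral_ddd of AMX in the example content
total_dispensed_g = 21.0       # hypothetical amount, for illustration only

ddd_count = number_of_ddds(total_dispensed_g, AMOXICILLIN_ORAL_DDD_G)
print(ddd_count)
```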
antivirals: Antiviral Drugs
A data set with 120 rows and 11 columns, containing the following column names:
av, name, atc, cid, atc_group, synonyms, oral_ddd, oral_units, iv_ddd, iv_units, and loinc.
This data set is available in R as antivirals, after you load the AMR package.
It was last updated on 13 November 2022 07:46:10 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (5 kB)
- Download as tab-separated text file (16 kB)
- Download as Microsoft Excel workbook (16 kB)
- Download as Apache Feather file (15 kB)
- Download as Apache Parquet file (13 kB)
- Download as SAS data (SAS) file (84 kB)
- Download as SAS transport (XPT) file (68 kB)
- Download as IBM SPSS Statistics data file (30 kB)
- Download as Stata DTA file (73 kB)
The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the trade names and LOINC codes as comma-separated values.
Source
This data set contains all ATC codes gathered from WHO and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.
- ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use)
- PubChem by the US National Library of Medicine
- LOINC (Logical Observation Identifiers Names and Codes)
Example content
av | name | atc | cid | atc_group | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | loinc |
---|---|---|---|---|---|---|---|---|---|---|
ABA | Abacavir | J05AF06 | 441300 | Nucleoside and nucleotide reverse transcriptase inhibitors | abacavir sulfate, avacavir, ziagen | 0.6 | g | | | 29113-8, 78772-1, 78773-9, … |
ACI | Aciclovir | J05AB01 | 135398513 | Nucleosides and nucleotides excl. reverse transcriptase inhibitors | acicloftal, aciclovier, aciclovirum, … | 4.0 | g | 4 | g | |
ADD | Adefovir dipivoxil | J05AF08 | 60871 | Nucleoside and nucleotide reverse transcriptase inhibitors | adefovir di, adefovir di ester, adefovir dipivoxyl, … | 10.0 | mg | |||
AME | Amenamevir | J05AX26 | 11397521 | Other antivirals | amenalief | 0.4 | g | |||
AMP | Amprenavir | J05AE05 | 65016 | Protease inhibitors | agenerase, carbamate, prozei | 1.2 | g | | | 29114-6, 31028-4, 78791-1 |
ASU | Asunaprevir | J05AP06 | 16076883 | Antivirals for treatment of HCV infections | sunvepra, sunvepratrade | 0.2 | g |
clinical_breakpoints: Interpretation from MIC values & disk diameters to SIR
A data set with 42 599 rows and 12 columns, containing the following column names:
guideline, method, site, mo, rank_index, ab, ref_tbl, disk_dose, breakpoint_S, breakpoint_R, ecoff, and uti.
This data set is available in R as clinical_breakpoints, after you load the AMR package.
It was last updated on 22 June 2023 13:10:59 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (82 kB)
- Download as tab-separated text file (4.4 MB)
- Download as Microsoft Excel workbook (1.8 MB)
- Download as Apache Feather file (1.6 MB)
- Download as Apache Parquet file (0.2 MB)
- Download as SAS data (SAS) file (3.6 MB)
- Download as SAS transport (XPT) file (10.6 MB)
- Download as IBM SPSS Statistics data file (5.8 MB)
- Download as Stata DTA file (10.5 MB)
Source
This data set contains interpretation rules for MIC values and disk diffusion diameters. Included guidelines are CLSI (2011-2023) and EUCAST (2011-2023).
Example content
guideline | method | site | mo | mo_name | rank_index | ab | ab_name | ref_tbl | disk_dose | breakpoint_S | breakpoint_R | ecoff | uti |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EUCAST 2023 | MIC | | F_ASPRG_MGTS | Aspergillus fumigatus | 2 | AMB | Amphotericin B | Aspergillus | | 1 | 1 | | FALSE |
EUCAST 2023 | MIC | | F_ASPRG_NIGR | Aspergillus niger | 2 | AMB | Amphotericin B | Aspergillus | | 1 | 1 | | FALSE |
EUCAST 2023 | MIC | | F_CANDD_ALBC | Candida albicans | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | | FALSE |
EUCAST 2023 | MIC | | F_CANDD_DBLN | Candida dubliniensis | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | | FALSE |
EUCAST 2023 | MIC | | F_CANDD_GLBR | Candida glabrata | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | | FALSE |
EUCAST 2023 | MIC | | F_CANDD_KRUS | Candida krusei | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | | FALSE |
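To show how the breakpoint_S and breakpoint_R columns could be applied to an MIC value, here is a minimal sketch. It assumes the EUCAST convention (S when the MIC is at or below breakpoint_S, R when it is above breakpoint_R, I in between); other guidelines may define the R boundary differently. It is not the implementation used by the AMR package, and `interpret_mic` is a hypothetical helper.

```python
def interpret_mic(mic, breakpoint_s, breakpoint_r):
    """Classify an MIC (mg/L) as "S", "I" or "R".

    Assumed EUCAST-style rule: S if mic <= breakpoint_s,
    R if mic > breakpoint_r, otherwise I.
    """
    if mic <= breakpoint_s:
        return "S"
    if mic > breakpoint_r:
        return "R"
    return "I"


# Amphotericin B against Candida albicans in the rows above:
# breakpoint_S = 1 and breakpoint_R = 1, so no "I" range exists.
for mic in (0.5, 1, 2):
    print(mic, interpret_mic(mic, breakpoint_s=1, breakpoint_r=1))
```

With breakpoints that differ, for instance the made-up pair S = 2 and R = 8, an MIC of 4 would fall in the I category.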
intrinsic_resistant: Intrinsic Bacterial Resistance
A data set with 134 634 rows and 2 columns, containing the following column names:
mo and ab.
This data set is available in R as intrinsic_resistant, after you load the AMR package.
It was last updated on 16 December 2022 15:10:43 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (78 kB)
- Download as tab-separated text file (5.1 MB)
- Download as Microsoft Excel workbook (1.3 MB)
- Download as Apache Feather file (1.2 MB)
- Download as Apache Parquet file (0.2 MB)
- Download as SAS data (SAS) file (9.8 MB)
- Download as SAS transport (XPT) file (9.5 MB)
- Download as IBM SPSS Statistics data file (7.4 MB)
- Download as Stata DTA file (9.5 MB)
Source
This data set contains all intrinsic resistance defined by EUCAST for all bug-drug combinations, and is based on ‘EUCAST Expert Rules’ and ‘EUCAST Intrinsic Resistance and Unusual Phenotypes’ v3.3 (2021).
Example content
Example rows when filtering on Enterobacter cloacae:
microorganism | antibiotic |
---|---|
Enterobacter cloacae | Acetylmidecamycin |
Enterobacter cloacae | Acetylspiramycin |
Enterobacter cloacae | Amoxicillin |
Enterobacter cloacae | Amoxicillin/clavulanic acid |
Enterobacter cloacae | Ampicillin |
Enterobacter cloacae | Ampicillin/sulbactam |
Enterobacter cloacae | Avoparcin |
Enterobacter cloacae | Azithromycin |
Enterobacter cloacae | Benzylpenicillin |
Enterobacter cloacae | Cadazolid |
Enterobacter cloacae | Cefadroxil |
Enterobacter cloacae | Cefalexin |
Enterobacter cloacae | Cefalotin |
Enterobacter cloacae | Cefazolin |
Enterobacter cloacae | Cefoxitin |
Enterobacter cloacae | Clarithromycin |
Enterobacter cloacae | Clindamycin |
Enterobacter cloacae | Cycloserine |
Enterobacter cloacae | Dalbavancin |
Enterobacter cloacae | Dirithromycin |
Enterobacter cloacae | Erythromycin |
Enterobacter cloacae | Flurithromycin |
Enterobacter cloacae | Fusidic acid |
Enterobacter cloacae | Gamithromycin |
Enterobacter cloacae | Josamycin |
Enterobacter cloacae | Kitasamycin |
Enterobacter cloacae | Lincomycin |
Enterobacter cloacae | Linezolid |
Enterobacter cloacae | Meleumycin |
Enterobacter cloacae | Midecamycin |
Enterobacter cloacae | Miocamycin |
Enterobacter cloacae | Nafithromycin |
Enterobacter cloacae | Norvancomycin |
Enterobacter cloacae | Oleandomycin |
Enterobacter cloacae | Oritavancin |
Enterobacter cloacae | Pirlimycin |
Enterobacter cloacae | Primycin |
Enterobacter cloacae | Pristinamycin |
Enterobacter cloacae | Quinupristin/dalfopristin |
Enterobacter cloacae | Ramoplanin |
Enterobacter cloacae | Rifampicin |
Enterobacter cloacae | Rokitamycin |
Enterobacter cloacae | Roxithromycin |
Enterobacter cloacae | Solithromycin |
Enterobacter cloacae | Spiramycin |
Enterobacter cloacae | Tedizolid |
Enterobacter cloacae | Teicoplanin |
Enterobacter cloacae | Telavancin |
Enterobacter cloacae | Telithromycin |
Enterobacter cloacae | Thiacetazone |
Enterobacter cloacae | Tildipirosin |
Enterobacter cloacae | Tilmicosin |
Enterobacter cloacae | Troleandomycin |
Enterobacter cloacae | Tulathromycin |
Enterobacter cloacae | Tylosin |
Enterobacter cloacae | Tylvalosin |
Enterobacter cloacae | Vancomycin |
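A two-column table like this lends itself to a simple membership test. The sketch below stores a few of the pairs listed above in a set and checks whether a given combination is present. It uses the displayed names rather than the mo and ab codes of the actual data set, and `is_intrinsic_resistant` is a hypothetical helper.

```python
# A few of the (microorganism, antibiotic) pairs from the rows above.
INTRINSIC_PAIRS = {
    ("Enterobacter cloacae", "Amoxicillin"),
    ("Enterobacter cloacae", "Cefazolin"),
    ("Enterobacter cloacae", "Vancomycin"),
}


def is_intrinsic_resistant(microorganism, antibiotic, pairs=INTRINSIC_PAIRS):
    """Return True if the combination is listed in `pairs`.

    False only means the pair is absent from the loaded pairs; with this
    small sample it says nothing about combinations that were not copied.
    """
    return (microorganism, antibiotic) in pairs


print(is_intrinsic_resistant("Enterobacter cloacae", "Vancomycin"))
print(is_intrinsic_resistant("Enterobacter cloacae", "Meropenem"))
```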
dosage: Dosage Guidelines from EUCAST
A data set with 503 rows and 9 columns, containing the following column names:
ab, name, type, dose, dose_times, administration, notes, original_txt, and eucast_version.
This data set is available in R as dosage, after you load the AMR package.
It was last updated on 22 June 2023 13:10:59 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (3 kB)
- Download as tab-separated text file (43 kB)
- Download as Microsoft Excel workbook (25 kB)
- Download as Apache Feather file (21 kB)
- Download as Apache Parquet file (9 kB)
- Download as SAS data (SAS) file (92 kB)
- Download as SAS transport (XPT) file (0.1 MB)
- Download as IBM SPSS Statistics data file (64 kB)
- Download as Stata DTA file (0.1 MB)
Source
EUCAST breakpoints used in this package are based on the dosages in this data set.
The dosages currently included in the data set apply to: ‘EUCAST Clinical Breakpoint Tables’ v11.0 (2021), ‘EUCAST Clinical Breakpoint Tables’ v12.0 (2022), and ‘EUCAST Clinical Breakpoint Tables’ v13.0 (2023).
Example content
ab | name | type | dose | dose_times | administration | notes | original_txt | eucast_version |
---|---|---|---|---|---|---|---|---|
AMK | Amikacin | standard_dosage | 25-30 mg/kg | 1 | iv | | 25-30 mg/kg x 1 iv | 13 |
AMX | Amoxicillin | high_dosage | 2 g | 6 | iv | | 2 g x 6 iv | 13 |
AMX | Amoxicillin | standard_dosage | 1 g | 3 | iv | | 1 g x 3-4 iv | 13 |
AMX | Amoxicillin | high_dosage | 0.75-1 g | 3 | oral | | 0.75-1 g x 3 oral | 13 |
AMX | Amoxicillin | standard_dosage | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 13 |
AMX | Amoxicillin | uncomplicated_uti | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 13 |
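The relation between original_txt and the dose, dose_times and administration columns in the rows above can be sketched with a small parser. The pattern below was written only to reproduce these six example rows (including taking the lower bound of a range such as "3-4"); it is an assumption about the text format, not the parsing logic of the AMR package, and `parse_dosage` is a hypothetical helper.

```python
import re

# "<dose> x <times>[-<upper>] <route>", e.g. "1 g x 3-4 iv".
DOSAGE_PATTERN = re.compile(
    r"^(?P<dose>.+?) x (?P<times>\d+)(?:-\d+)? (?P<route>iv|oral)$"
)


def parse_dosage(text):
    """Split a dosage text into dose, dose_times and administration.

    Returns None when the text does not follow the assumed pattern.
    """
    match = DOSAGE_PATTERN.match(text)
    if match is None:
        return None
    return {
        "dose": match.group("dose"),
        "dose_times": int(match.group("times")),  # lower bound of a range
        "administration": match.group("route"),
    }


for text in ("25-30 mg/kg x 1 iv", "1 g x 3-4 iv", "0.75-1 g x 3 oral"):
    print(text, "->", parse_dosage(text))
```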
example_isolates: Example Data for Practice
A data set with 2 000 rows and 46 columns, containing the following column names:
date, patient, age, gender, ward, mo, PEN, OXA, FLC, AMX, AMC, AMP, TZP, CZO, FEP, CXM, FOX, CTX, CAZ, CRO, GEN, TOB, AMK, KAN, TMP, SXT, NIT, FOS, LNZ, CIP, MFX, VAN, TEC, TCY, TGC, DOX, ERY, CLI, AZM, IPM, MEM, MTR, CHL, COL, MUP, and RIF.
This data set is available in R as example_isolates, after you load the AMR package.
It was last updated on 21 January 2023 22:47:20 UTC. Find more info about the structure of this data set here.
Source
This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.
Example content
date | patient | age | gender | ward | mo | PEN | OXA | FLC | AMX | AMC | AMP | TZP | CZO | FEP | CXM | FOX | CTX | CAZ | CRO | GEN | TOB | AMK | KAN | TMP | SXT | NIT | FOS | LNZ | CIP | MFX | VAN | TEC | TCY | TGC | DOX | ERY | CLI | AZM | IPM | MEM | MTR | CHL | COL | MUP | RIF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2002-01-02 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | I | I | R | R | R | R | R | R | R | R | R | R | |||||||||||||||||||||||||||
2002-01-03 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | I | I | R | R | R | R | R | R | R | R | R | R | |||||||||||||||||||||||||||
2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | S | S | S | S | S | S | R | R | R | |||||||||||||||||||||||||||
2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | S | S | S | S | S | S | R | R | R | |||||||||||||||||||||||||||
2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | R | S | S | S | S | R | R | R | ||||||||||||||||||||||||||||
2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | R | S | S | S | S | R | R | R | R |
example_isolates_unclean: Example Data for Practice
A data set with 3 000 rows and 8 columns, containing the following column names:
patient_id, hospital, date, bacteria, AMX, AMC, CIP, and GEN.
This data set is available in R as example_isolates_unclean, after you load the AMR package.
It was last updated on 27 August 2022 18:49:37 UTC. Find more info about the structure of this data set here.