All reference data (about microorganisms, antibiotics, R/SI interpretation, EUCAST rules, etc.) in this AMR package are reliable, up-to-date and freely available. We continually export our data sets to formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. We also provide tab-separated text files that are machine-readable and suitable for input in any software program, such as laboratory information systems.
On this page, we explain how to download them and what the structure of the data sets looks like.
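As a minimal sketch of consuming a tab-separated export outside R, the Python snippet below parses tab-separated text with the standard library only. The sample rows are a subset of the columns of the antivirals example table on this page. The helper name `read_tsv` and the assumption that the file starts with a header row are ours for illustration and are not taken from the package.

```python
import csv
import io

# Sample content mimicking a tab-separated export. Values are copied from the
# antivirals example table on this page; only a subset of columns is used.
# ASSUMPTION: the export starts with a header row of column names.
SAMPLE = (
    "atc\tcid\tname\toral_ddd\toral_units\n"
    "J05AF06\t441300\tAbacavir\t0.6\tg\n"
    "J05AE05\t65016\tAmprenavir\t1.2\tg\n"
)


def read_tsv(handle):
    """Return a list of dicts, one per data row, keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tsv(io.StringIO(SAMPLE))
for row in rows:
    # All values arrive as strings, so numeric columns are converted explicitly.
    print(row["atc"], row["name"], float(row["oral_ddd"]), row["oral_units"])
```

For a downloaded file, the same helper can be given a file handle opened in text mode with `newline=""`. The encoding of the exported files is not stated on this page, so it would need to be checked.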
microorganisms: Full Microbial Taxonomy
A data set with 48,787 rows and 22 columns, containing the following column names:
mo, fullname, status, kingdom, phylum, class, order, family, genus, species, subspecies, rank, ref, source, lpsn, lpsn_parent, lpsn_renamed_to, gbif, gbif_parent, gbif_renamed_to, prevalence and snomed.
This data set is available in R as microorganisms, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (1.1 MB)
 
- Download as tab-separated text file (0.4 kB)
 
- Download as Microsoft Excel workbook (4.8 MB)
 
- Download as Apache Feather file (5.1 MB)
 
- Download as Apache Parquet file (2.5 MB)
 
- Download as SAS data file (47.7 MB)
 
- Download as IBM SPSS Statistics data file (15.8 MB)
 
- Download as Stata DTA file (44.4 MB)
NOTE: The exported files for Excel, SAS, SPSS and Stata contain only the first 50 SNOMED codes per record, as their file size would otherwise exceed 100 MB, which is the file size limit of GitHub. Our advice: use R instead.
Source
This data set contains the full microbial taxonomy of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF):
- Parte, AC et al. (2020). List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612. Accessed from https://lpsn.dsmz.de on 12 September, 2022.
- GBIF Secretariat (November 26, 2021). GBIF Backbone Taxonomy. Checklist dataset. Accessed from https://www.gbif.org on 12 September, 2022.
- Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL: https://phinvads.cdc.gov
Example content
Included (sub)species per taxonomic kingdom:
| Kingdom | Number of (sub)species | 
|---|---|
| (unknown kingdom) | 3 | 
| Animalia | 1,523 | 
| Archaea | 1,237 | 
| Bacteria | 33,713 | 
| Fungi | 7,365 | 
| Protozoa | 4,946 | 
Example rows when filtering on genus Escherichia:
| mo | fullname | status | kingdom | phylum | class | order | family | genus | species | subspecies | rank | ref | source | lpsn | lpsn_parent | lpsn_renamed_to | gbif | gbif_parent | gbif_renamed_to | prevalence | snomed | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B_ESCHR | Escherichia | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | | | genus | Castellani et al., 1919 | LPSN | 515602 | 482 | | 3221780 | 4899 | | 1 | 407310004, 407251000, 407281008, … |
| B_ESCHR_ADCR | Escherichia adecarboxylata | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | adecarboxylata | | species | Leclerc, 1962 | LPSN | 776052 | 515602 | 777447 | | | | 1 | |
| B_ESCHR_ALBR | Escherichia albertii | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | albertii | | species | Huys et al., 2003 | LPSN | 776053 | 515602 | | 5427575 | 3221780 | | 1 | 419388003 |
| B_ESCHR_BLTT | Escherichia blattae | synonym | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | blattae | | species | Burgess et al., 1973 | LPSN | 776056 | 515602 | 788468 | | | | 1 | |
| B_ESCHR_COLI | Escherichia coli | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | coli | | species | Castellani et al., 1919 | LPSN | 776057 | 515602 | | 6110934 | 3221780 | | 1 | 1095001000112106, 715307006, 737528008, … |
| B_ESCHR_DYSN | Escherichia dysenteriae | accepted | Bacteria | Pseudomonadota | Gammaproteobacteria | Enterobacterales | Enterobacteriaceae | Escherichia | dysenteriae | | species | | GBIF | | | | 10862979 | 3221780 | | 1 | |
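The filter behind these example rows can be sketched in a few lines of Python. The records below use only five of the columns (mo, fullname, status, kingdom, genus) with values copied from the table above. The function name `filter_genus` is a hypothetical helper and not part of the AMR package.

```python
from collections import Counter

# Five columns of the example rows above; all values are copied from the table.
COLUMNS = ("mo", "fullname", "status", "kingdom", "genus")
ROWS = [
    ("B_ESCHR", "Escherichia", "accepted", "Bacteria", "Escherichia"),
    ("B_ESCHR_ADCR", "Escherichia adecarboxylata", "synonym", "Bacteria", "Escherichia"),
    ("B_ESCHR_ALBR", "Escherichia albertii", "accepted", "Bacteria", "Escherichia"),
    ("B_ESCHR_BLTT", "Escherichia blattae", "synonym", "Bacteria", "Escherichia"),
    ("B_ESCHR_COLI", "Escherichia coli", "accepted", "Bacteria", "Escherichia"),
    ("B_ESCHR_DYSN", "Escherichia dysenteriae", "accepted", "Bacteria", "Escherichia"),
]
records = [dict(zip(COLUMNS, row)) for row in ROWS]


def filter_genus(records, genus):
    """Keep only the records whose genus column equals the given genus."""
    return [r for r in records if r["genus"] == genus]


escherichia = filter_genus(records, "Escherichia")

# Codes of the currently accepted names, and a tally of the status column.
accepted = [r["mo"] for r in escherichia if r["status"] == "accepted"]
status_counts = Counter(r["status"] for r in escherichia)

print(accepted)
print(dict(status_counts))
```

The same `Counter` approach applied to the kingdom column would produce a tally like the per-kingdom table above.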
antibiotics: Antibiotic Agents
A data set with 464 rows and 14 columns, containing the following column names:
ab, cid, name, group, atc, atc_group1, atc_group2, abbreviations, synonyms, oral_ddd, oral_units, iv_ddd, iv_units and loinc.
This data set is available in R as antibiotics, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (36 kB)
 
- Download as tab-separated text file (0.2 kB)
 
- Download as Microsoft Excel workbook (66 kB)
 
- Download as Apache Feather file (97 kB)
 
- Download as Apache Parquet file (74 kB)
 
- Download as SAS data file (1.8 MB)
 
- Download as IBM SPSS Statistics data file (0.3 MB)
 
- Download as Stata DTA file (0.3 MB)
Source
This data set contains all EARS-Net and ATC codes gathered from WHO and WHONET, and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.
- ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use)
- PubChem by the US National Library of Medicine
- WHONET software 2019
Example content
| ab | cid | name | group | atc | atc_group1 | atc_group2 | abbreviations | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | loinc | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AMK | 37768 | Amikacin | Aminoglycosides | D06AX12, J01GB06, S01AA21 | Aminoglycoside antibacterials | Other aminoglycosides | ak, ami, amik, … | amicacin, amikacillin, amikacin, … | | | 1.0 | g | 13546-7, 15098-7, 17798-0, … |
| AMX | 33613 | Amoxicillin | Beta-lactams/penicillins | J01CA04 | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | ac, amox, amx | actimoxi, amoclen, amolin, … | 1.5 | g | 3.0 | g | 16365-9, 25274-2, 3344-9, … | 
| AMC | 23665637 | Amoxicillin/clavulanic acid | Beta-lactams/penicillins | J01CR02 | Beta-lactam antibacterials, penicillins | Combinations of penicillins, incl. beta-lactamase inhibitors | a/c, amcl, aml, … | amocla, amoclan, amoclav, … | 1.5 | g | 3.0 | g | |
| AMP | 6249 | Ampicillin | Beta-lactams/penicillins | J01CA01, S01AA19 | Beta-lactam antibacterials, penicillins | Penicillins with extended spectrum | am, amp, ampi | acillin, adobacillin, amblosin, … | 2.0 | g | 6.0 | g | 21066-6, 3355-5, 33562-0, … | 
| AZM | 447043 | Azithromycin | Macrolides/lincosamides | J01FA10, S01AA26 | Macrolides, lincosamides and streptogramins | Macrolides | az, azi, azit, … | aritromicina, azasite, azenil, … | 0.3 | g | 0.5 | g | 16420-2, 25233-8 | 
| PEN | 5904 | Benzylpenicillin | Beta-lactams/penicillins | J01CE01, S01AA14 | Combinations of antibacterials | Combinations of antibacterials | bepe, pen, peni, … | abbocillin, ayercillin, bencilpenicilina, … | | | 3.6 | g | 3913-1 |
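One way to use the abbreviations column outside R is to build a lookup from free-text abbreviations to the ab code. The sketch below does this for the two example rows whose abbreviations are shown in full above. It assumes that the multi-valued column is exported as a comma-separated string, which this page does not confirm, and the helper names `build_abbreviation_index` and `lookup` are hypothetical.

```python
# Two of the example rows above whose abbreviations are listed in full.
# Rows with elided abbreviations ("…") are left out rather than guessed.
# ASSUMPTION: multi-valued columns are exported as comma-separated strings.
ANTIBIOTICS = [
    {"ab": "AMX", "name": "Amoxicillin", "abbreviations": "ac, amox, amx"},
    {"ab": "AMP", "name": "Ampicillin", "abbreviations": "am, amp, ampi"},
]


def build_abbreviation_index(rows):
    """Map every lower-cased abbreviation (and the code itself) to its ab code."""
    index = {}
    for row in rows:
        for abbreviation in row["abbreviations"].split(","):
            index[abbreviation.strip().lower()] = row["ab"]
        index[row["ab"].lower()] = row["ab"]
    return index


def lookup(index, text):
    """Return the ab code for a free-text abbreviation, or None if unknown."""
    return index.get(text.strip().lower())


index = build_abbreviation_index(ANTIBIOTICS)
print(lookup(index, "Amox"))   # AMX
print(lookup(index, " ampi"))  # AMP
print(lookup(index, "xyz"))    # None
```

A plain dictionary keeps the lookup exact and case-insensitive. Fuzzy matching on names or synonyms is outside the scope of this sketch.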
antivirals: Antiviral Agents
A data set with 102 rows and 9 columns, containing the following column names:
atc, cid, name, atc_group, synonyms, oral_ddd, oral_units, iv_ddd and iv_units.
This data set is available in R as antivirals, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (4 kB)
 
- Download as tab-separated text file (16 kB)
 
- Download as Microsoft Excel workbook (14 kB)
 
- Download as Apache Feather file (12 kB)
 
- Download as Apache Parquet file (10 kB)
 
- Download as SAS data file (80 kB)
 
- Download as IBM SPSS Statistics data file (27 kB)
 
- Download as Stata DTA file (67 kB)
Source
This data set contains all ATC codes gathered from WHO and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.
- ATC/DDD index from WHO Collaborating Centre for Drug Statistics Methodology (note: this may not be used for commercial purposes, but is freely available from the WHO CC website for personal use)
- PubChem by the US National Library of Medicine
Example content
| atc | cid | name | atc_group | synonyms | oral_ddd | oral_units | iv_ddd | iv_units | 
|---|---|---|---|---|---|---|---|---|
| J05AF06 | 441300 | Abacavir | Nucleoside and nucleotide reverse transcriptase inhibitors | Abacavir, Abacavir sulfate, Ziagen | 0.6 | g | ||
| J05AB01 | 135398513 | Aciclovir | Nucleosides and nucleotides excl. reverse transcriptase inhibitors | Acicloftal, Aciclovier, Aciclovir, … | 4.0 | g | 4 | g | 
| J05AF08 | 60871 | Adefovir dipivoxil | Nucleoside and nucleotide reverse transcriptase inhibitors | Adefovir di ester, Adefovir dipivoxil, Adefovir Dipivoxil, … | 10.0 | mg | ||
| J05AE05 | 65016 | Amprenavir | Protease inhibitors | Agenerase, Amprenavir, Amprenavirum, … | 1.2 | g | ||
| J05AP06 | 16076883 | Asunaprevir | Antivirals for treatment of HCV infections | Asunaprevir, Sunvepra | ||||
| J05AE08 | 148192 | Atazanavir | Protease inhibitors | Atazanavir, Atazanavir Base, Latazanavir, … | 0.3 | g | 
rsi_translation: Interpretation from MIC values / disk diameters to R/SI
A data set with 20,369 rows and 11 columns, containing the following column names:
guideline, method, site, mo, rank_index, ab, ref_tbl, disk_dose, breakpoint_S, breakpoint_R and uti.
This data set is available in R as rsi_translation, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (49 kB)
 
- Download as tab-separated text file (2 MB)
 
- Download as Microsoft Excel workbook (0.9 MB)
 
- Download as Apache Feather file (0.8 MB)
 
- Download as Apache Parquet file (99 kB)
 
- Download as SAS data file (4 MB)
 
- Download as IBM SPSS Statistics data file (2.6 MB)
 
- Download as Stata DTA file (3.8 MB)
Source
This data set contains interpretation rules for MIC values and disk diffusion diameters. Included guidelines are CLSI (2011-2022) and EUCAST (2011-2022).
Example content
| guideline | method | site | mo | mo_name | rank_index | ab | ab_name | ref_tbl | disk_dose | breakpoint_S | breakpoint_R | uti | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EUCAST 2022 | MIC | | F_ASPRG_MGTS | Aspergillus fumigatus | 2 | AMB | Amphotericin B | Aspergillus | | 1 | 1 | FALSE |
| EUCAST 2022 | MIC | | F_ASPRG_NIGR | Aspergillus niger | 2 | AMB | Amphotericin B | Aspergillus | | 1 | 1 | FALSE |
| EUCAST 2022 | MIC | | F_CANDD | Candida | 3 | AMB | Amphotericin B | Candida | | 1 | 1 | FALSE |
| EUCAST 2022 | MIC | | F_CANDD_ALBC | Candida albicans | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | FALSE |
| EUCAST 2022 | MIC | | F_CANDD_DBLN | Candida dubliniensis | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | FALSE |
| EUCAST 2022 | MIC | | F_CANDD_KRUS | Candida krusei | 2 | AMB | Amphotericin B | Candida | | 1 | 1 | FALSE |
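To illustrate how the breakpoint_S and breakpoint_R columns could be applied to an MIC value, the Python sketch below interprets a measurement against one of the example rows above. The comparison rule (S if MIC is at most breakpoint_S, R if MIC is above breakpoint_R, otherwise I) is an assumption based on the usual EUCAST notation. It is not taken from this page, other guidelines may define the R boundary differently, and the package's own logic may differ. The `Breakpoint` class, `interpret_mic` and the second breakpoint are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Breakpoint:
    guideline: str
    method: str
    mo: str
    ab: str
    breakpoint_S: float
    breakpoint_R: float


# Row copied from the example table above (Candida albicans, amphotericin B).
CANDIDA_AMB = Breakpoint("EUCAST 2022", "MIC", "F_CANDD_ALBC", "AMB", 1.0, 1.0)

# HYPOTHETICAL breakpoint, not from the data set, only to show the "I" range.
HYPOTHETICAL = Breakpoint("EUCAST 2022", "MIC", "X_EXAMPLE", "XXX", 2.0, 8.0)


def interpret_mic(mic, bp):
    """ASSUMED rule: S if MIC <= breakpoint_S, R if MIC > breakpoint_R, else I."""
    if mic <= bp.breakpoint_S:
        return "S"
    if mic > bp.breakpoint_R:
        return "R"
    return "I"


print(interpret_mic(0.5, CANDIDA_AMB))   # S
print(interpret_mic(2.0, CANDIDA_AMB))   # R
print(interpret_mic(4.0, HYPOTHETICAL))  # I
```

When breakpoint_S equals breakpoint_R, as in the example rows, no MIC value can fall into the I category under this rule.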
intrinsic_resistant: Intrinsic Bacterial Resistance
A data set with 134,659 rows and 2 columns, containing the following column names:
mo and ab.
This data set is available in R as intrinsic_resistant, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (78 kB)
 
- Download as tab-separated text file (5.1 MB)
 
- Download as Microsoft Excel workbook (1.3 MB)
 
- Download as Apache Feather file (1.2 MB)
 
- Download as Apache Parquet file (0.2 MB)
 
- Download as SAS data file (9.8 MB)
 
- Download as IBM SPSS Statistics data file (7.4 MB)
 
- Download as Stata DTA file (9.6 MB)
Source
This data set contains all bug-drug combinations for which EUCAST defines intrinsic resistance, and is based on ‘EUCAST Expert Rules’ and ‘EUCAST Intrinsic Resistance and Unusual Phenotypes’ v3.3 (2021).
Example content
Example rows when filtering on Enterobacter cloacae:
| microorganism | antibiotic | 
|---|---|
| Enterobacter cloacae | Acetylmidecamycin | 
| Enterobacter cloacae | Acetylspiramycin | 
| Enterobacter cloacae | Amoxicillin | 
| Enterobacter cloacae | Amoxicillin/clavulanic acid | 
| Enterobacter cloacae | Ampicillin | 
| Enterobacter cloacae | Ampicillin/sulbactam | 
| Enterobacter cloacae | Avoparcin | 
| Enterobacter cloacae | Azithromycin | 
| Enterobacter cloacae | Benzylpenicillin | 
| Enterobacter cloacae | Cadazolid | 
| Enterobacter cloacae | Cefadroxil | 
| Enterobacter cloacae | Cefazolin | 
| Enterobacter cloacae | Cefoxitin | 
| Enterobacter cloacae | Cephalexin | 
| Enterobacter cloacae | Cephalothin | 
| Enterobacter cloacae | Clarithromycin | 
| Enterobacter cloacae | Clindamycin | 
| Enterobacter cloacae | Cycloserine | 
| Enterobacter cloacae | Dalbavancin | 
| Enterobacter cloacae | Dirithromycin | 
| Enterobacter cloacae | Erythromycin | 
| Enterobacter cloacae | Flurithromycin | 
| Enterobacter cloacae | Fusidic acid | 
| Enterobacter cloacae | Gamithromycin | 
| Enterobacter cloacae | Josamycin | 
| Enterobacter cloacae | Kitasamycin | 
| Enterobacter cloacae | Lincomycin | 
| Enterobacter cloacae | Linezolid | 
| Enterobacter cloacae | Meleumycin | 
| Enterobacter cloacae | Midecamycin | 
| Enterobacter cloacae | Miocamycin | 
| Enterobacter cloacae | Nafithromycin | 
| Enterobacter cloacae | Norvancomycin | 
| Enterobacter cloacae | Oleandomycin | 
| Enterobacter cloacae | Oritavancin | 
| Enterobacter cloacae | Pirlimycin | 
| Enterobacter cloacae | Primycin | 
| Enterobacter cloacae | Pristinamycin | 
| Enterobacter cloacae | Quinupristin/dalfopristin | 
| Enterobacter cloacae | Ramoplanin | 
| Enterobacter cloacae | Rifampicin | 
| Enterobacter cloacae | Rokitamycin | 
| Enterobacter cloacae | Roxithromycin | 
| Enterobacter cloacae | Solithromycin | 
| Enterobacter cloacae | Spiramycin | 
| Enterobacter cloacae | Tedizolid | 
| Enterobacter cloacae | Teicoplanin | 
| Enterobacter cloacae | Telavancin | 
| Enterobacter cloacae | Telithromycin | 
| Enterobacter cloacae | Thiacetazone | 
| Enterobacter cloacae | Tildipirosin | 
| Enterobacter cloacae | Tilmicosin | 
| Enterobacter cloacae | Troleandomycin | 
| Enterobacter cloacae | Tulathromycin | 
| Enterobacter cloacae | Tylosin | 
| Enterobacter cloacae | Tylvalosin | 
| Enterobacter cloacae | Vancomycin | 
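Because the data set is a two-column list of pairs, checking a bug-drug combination amounts to a set membership test. The sketch below uses a few pairs from the example rows above, written with the names shown in the table rather than the mo and ab codes stored in the data set. The function name `is_intrinsic_resistant` is a hypothetical helper.

```python
# A few (microorganism, antibiotic) pairs copied from the example rows above.
# The real data set stores codes in the mo and ab columns; names are used here
# only because the example table shows names.
INTRINSIC = {
    ("Enterobacter cloacae", "Amoxicillin"),
    ("Enterobacter cloacae", "Cefazolin"),
    ("Enterobacter cloacae", "Vancomycin"),
}


def is_intrinsic_resistant(microorganism, antibiotic, pairs=INTRINSIC):
    """Return True if the combination is listed in the given set of pairs."""
    return (microorganism, antibiotic) in pairs


print(is_intrinsic_resistant("Enterobacter cloacae", "Vancomycin"))  # True
# Meropenem does not appear among the example rows above, so this is False.
print(is_intrinsic_resistant("Enterobacter cloacae", "Meropenem"))   # False
```

A result of False only means that the pair is absent from the set that was loaded, so the full data set should be loaded before drawing conclusions.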
dosage: Dosage Guidelines from EUCAST
A data set with 169 rows and 9 columns, containing the following column names:
ab, name, type, dose, dose_times, administration, notes, original_txt and eucast_version.
This data set is available in R as dosage, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Direct download links:
- Download as original R Data Structure (RDS) file (3 kB)
 
- Download as tab-separated text file (15 kB)
 
- Download as Microsoft Excel workbook (14 kB)
 
- Download as Apache Feather file (11 kB)
 
- Download as Apache Parquet file (7 kB)
 
- Download as SAS data file (52 kB)
 
- Download as IBM SPSS Statistics data file (23 kB)
 
- Download as Stata DTA file (44 kB)
Source
EUCAST breakpoints used in this package are based on the dosages in this data set.
The dosages currently included in this data set apply to ‘EUCAST Clinical Breakpoint Tables’ v11.0 (2021).
Example content
| ab | name | type | dose | dose_times | administration | notes | original_txt | eucast_version | 
|---|---|---|---|---|---|---|---|---|
| AMK | Amikacin | standard_dosage | 25-30 mg/kg | 1 | iv | | 25-30 mg/kg x 1 iv | 11 |
| AMX | Amoxicillin | high_dosage | 2 g | 6 | iv | | 2 g x 6 iv | 11 |
| AMX | Amoxicillin | standard_dosage | 1 g | 3 | iv | | 1 g x 3-4 iv | 11 |
| AMX | Amoxicillin | high_dosage | 0.75-1 g | 3 | oral | | 0.75-1 g x 3 oral | 11 |
| AMX | Amoxicillin | standard_dosage | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 11 |
| AMX | Amoxicillin | uncomplicated_uti | 0.5 g | 3 | oral | | 0.5 g x 3 oral | 11 |
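As a worked example of combining the dose and dose_times columns, the sketch below computes a total per day for a few of the rows above. It assumes that dose_times is the number of administrations per day, which matches the "2 g x 6 iv" notation in original_txt but is not stated explicitly on this page. The regular expression and the function name `daily_dose` are our own illustrative choices and only cover the dose formats shown in the table.

```python
import re

# (name, type, dose, dose_times, administration) copied from the rows above.
DOSAGES = [
    ("Amikacin", "standard_dosage", "25-30 mg/kg", 1, "iv"),
    ("Amoxicillin", "high_dosage", "2 g", 6, "iv"),
    ("Amoxicillin", "standard_dosage", "1 g", 3, "iv"),
    ("Amoxicillin", "high_dosage", "0.75-1 g", 3, "oral"),
]

# Matches "2 g", "0.75-1 g" or "25-30 mg/kg": a number, an optional upper
# bound after a hyphen, whitespace, and a unit without spaces.
DOSE_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)(?:-(\d+(?:\.\d+)?))?\s+(\S+)$")


def daily_dose(dose, dose_times):
    """Return (low, high, unit) per day.

    ASSUMPTION: dose_times is the number of administrations per day.
    """
    match = DOSE_PATTERN.match(dose)
    if match is None:
        raise ValueError(f"unrecognised dose format: {dose!r}")
    low = float(match.group(1))
    high = float(match.group(2)) if match.group(2) else low
    return low * dose_times, high * dose_times, match.group(3)


for name, dosage_type, dose, times, route in DOSAGES:
    low, high, unit = daily_dose(dose, times)
    print(f"{name} ({dosage_type}, {route}): {low}-{high} {unit} per day")
```

For the high intravenous amoxicillin dosage this gives 2 g x 6 = 12 g per day, and for the high oral dosage a range of 2.25 to 3 g per day.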
example_isolates: Example Data for Practice
A data set with 2,000 rows and 46 columns, containing the following column names:
date, patient, age, gender, ward, mo, PEN, OXA, FLC, AMX, AMC, AMP, TZP, CZO, FEP, CXM, FOX, CTX, CAZ, CRO, GEN, TOB, AMK, KAN, TMP, SXT, NIT, FOS, LNZ, CIP, MFX, VAN, TEC, TCY, TGC, DOX, ERY, CLI, AZM, IPM, MEM, MTR, CHL, COL, MUP and RIF.
This data set is available in R as example_isolates, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.
Source
This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.
Example content
| date | patient | age | gender | ward | mo | PEN | OXA | FLC | AMX | AMC | AMP | TZP | CZO | FEP | CXM | FOX | CTX | CAZ | CRO | GEN | TOB | AMK | KAN | TMP | SXT | NIT | FOS | LNZ | CIP | MFX | VAN | TEC | TCY | TGC | DOX | ERY | CLI | AZM | IPM | MEM | MTR | CHL | COL | MUP | RIF | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2002-01-02 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | I | I | R | R | R | R | R | R | R | R | R | R | |||||||||||||||||||||||||||
| 2002-01-03 | A77334 | 65 | F | Clinical | B_ESCHR_COLI | R | I | I | R | R | R | R | R | R | R | R | R | R | |||||||||||||||||||||||||||
| 2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | S | S | S | S | S | S | R | R | R | |||||||||||||||||||||||||||
| 2002-01-07 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | S | S | S | S | S | S | R | R | R | |||||||||||||||||||||||||||
| 2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | R | S | S | S | S | R | R | R | ||||||||||||||||||||||||||||
| 2002-01-13 | 067927 | 45 | F | ICU | B_STPHY_EPDR | R | R | R | R | R | S | S | S | S | R | R | R | R | 
example_isolates_unclean: Example Data for Practice
A data set with 3,000 rows and 8 columns, containing the following column names:
patient_id, hospital, date, bacteria, AMX, AMC, CIP and GEN.
This data set is available in R as example_isolates_unclean, after you load the AMR package.
It was last updated on 19 October 2022 09:53:44 UTC. Find more info about the structure of this data set here.