
All reference data (about microorganisms, antibiotics, R/SI interpretation, EUCAST rules, etc.) in this AMR package are reliable, up-to-date and freely available. We continually export our data sets to formats for use in R, MS Excel, Apache Feather, Apache Parquet, SPSS, SAS, and Stata. We also provide tab-separated text files that are machine-readable and suitable for input in any software program, such as laboratory information systems.

On this page, we explain how to download them and what the structure of the data sets looks like.

microorganisms: Full Microbial Taxonomy

A data set with 48,883 rows and 22 columns, containing the following column names:
mo, fullname, status, kingdom, phylum, class, order, family, genus, species, subspecies, rank, ref, source, lpsn, lpsn_parent, lpsn_renamed_to, gbif, gbif_parent, gbif_renamed_to, prevalence and snomed.

This data set is available in R as microorganisms, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

NOTE: The exported files for SAS, SPSS and Stata contain only the first 50 SNOMED codes per record, as their file size would otherwise exceed 100 MB, the file size limit of GitHub. Our advice: use R instead.

The tab-separated text file and Microsoft Excel workbook both contain all SNOMED codes as comma-separated values.
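As a minimal sketch of what reading such a file could look like outside R, the Python snippet below parses tab-separated text and splits the comma-separated snomed cell into a list. The in-memory sample keeps only three of the 22 columns, the E. coli code list is cut to the three codes visible in the example table on this page, and the function names are hypothetical.

```python
import csv
import io

# Abbreviated sample in the shape of the tab-separated export (assumption:
# a header row followed by one record per line). Only three columns are kept.
SAMPLE_TSV = (
    "mo\tfullname\tsnomed\n"
    "B_ESCHR_ALBR\tEscherichia albertii\t419388003\n"
    "B_ESCHR_COLI\tEscherichia coli\t1095001000112106,715307006,737528008\n"
    "B_ESCHR_BLTT\tEscherichia blattae\t\n"
)


def split_codes(cell):
    """Split a comma-separated cell into a list of trimmed, non-empty codes."""
    return [code.strip() for code in cell.split(",") if code.strip()]


def read_snomed(handle):
    """Map each mo code to its list of SNOMED codes."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["mo"]: split_codes(row["snomed"]) for row in reader}


snomed_by_mo = read_snomed(io.StringIO(SAMPLE_TSV))
print(snomed_by_mo["B_ESCHR_COLI"])
```

The codes are kept as strings on purpose: some SNOMED identifiers are long enough that spreadsheet software may round them when they are read as numbers.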

Source

This data set contains the full microbial taxonomy of five kingdoms from the List of Prokaryotic names with Standing in Nomenclature (LPSN) and the Global Biodiversity Information Facility (GBIF):

  • Parte, AC et al. (2020). List of Prokaryotic names with Standing in Nomenclature (LPSN) moves to the DSMZ. International Journal of Systematic and Evolutionary Microbiology, 70, 5607-5612. Accessed from https://lpsn.dsmz.de on 12 September, 2022.
  • GBIF Secretariat (November 26, 2021). GBIF Backbone Taxonomy. Checklist dataset. Accessed from https://www.gbif.org on 12 September, 2022.
  • Public Health Information Network Vocabulary Access and Distribution System (PHIN VADS). US Edition of SNOMED CT from 1 September 2020. Value Set Name ‘Microorganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL: https://phinvads.cdc.gov

Example content

Included (sub)species per taxonomic kingdom:

Kingdom            Number of (sub)species
(unknown kingdom)  5
Animalia           1,524
Archaea            1,237
Bacteria           33,716
Fungi              7,450
Protozoa           4,951

Example rows when filtering on genus Escherichia:

mo fullname status kingdom phylum class order family genus species subspecies rank ref source lpsn lpsn_parent lpsn_renamed_to gbif gbif_parent gbif_renamed_to prevalence snomed
B_ESCHR Escherichia accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia genus Castellani et al., 1919 LPSN 515602 482 3221780 4899 1 407310004, 407251000, 407281008, …
B_ESCHR_ADCR Escherichia adecarboxylata synonym Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia adecarboxylata species Leclerc, 1962 LPSN 776052 515602 777447 1
B_ESCHR_ALBR Escherichia albertii accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia albertii species Huys et al., 2003 LPSN 776053 515602 5427575 3221780 1 419388003
B_ESCHR_BLTT Escherichia blattae synonym Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia blattae species Burgess et al., 1973 LPSN 776056 515602 788468 1
B_ESCHR_COLI Escherichia coli accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia coli species Castellani et al., 1919 LPSN 776057 515602 6110934 3221780 1 1095001000112106, 715307006, 737528008, …
B_ESCHR_DYSN Escherichia dysenteriae accepted Bacteria Pseudomonadota Gammaproteobacteria Enterobacterales Enterobacteriaceae Escherichia dysenteriae species GBIF 10862979 3221780 1
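The kind of filtering that produces these example rows can be sketched in Python as follows. Only five of the 22 columns are reproduced, taken from the six rows above; the helper name filter_by is hypothetical and not part of the AMR package.

```python
from collections import Counter

# Five of the 22 columns, copied from the six example rows.
FIELDS = ("mo", "fullname", "status", "genus", "rank")
ROWS = [
    ("B_ESCHR", "Escherichia", "accepted", "Escherichia", "genus"),
    ("B_ESCHR_ADCR", "Escherichia adecarboxylata", "synonym", "Escherichia", "species"),
    ("B_ESCHR_ALBR", "Escherichia albertii", "accepted", "Escherichia", "species"),
    ("B_ESCHR_BLTT", "Escherichia blattae", "synonym", "Escherichia", "species"),
    ("B_ESCHR_COLI", "Escherichia coli", "accepted", "Escherichia", "species"),
    ("B_ESCHR_DYSN", "Escherichia dysenteriae", "accepted", "Escherichia", "species"),
]
records = [dict(zip(FIELDS, row)) for row in ROWS]


def filter_by(records, **criteria):
    """Keep the records whose fields equal every given criterion."""
    return [
        record
        for record in records
        if all(record[key] == value for key, value in criteria.items())
    ]


# Accepted species within the genus, and a tally of taxonomic status.
accepted_species = filter_by(
    records, genus="Escherichia", status="accepted", rank="species"
)
status_counts = Counter(
    record["status"] for record in filter_by(records, genus="Escherichia")
)

print([record["fullname"] for record in accepted_species])
print(dict(status_counts))
```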

antibiotics: Antibiotic Agents

A data set with 484 rows and 14 columns, containing the following column names:
ab, cid, name, group, atc, atc_group1, atc_group2, abbreviations, synonyms, oral_ddd, oral_units, iv_ddd, iv_units and loinc.

This data set is available in R as antibiotics, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the ATC codes, common abbreviations, trade names and LOINC codes as comma-separated values.

Source

This data set contains all EARS-Net and ATC codes gathered from WHO and WHONET, and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.

Example content

ab cid name group atc atc_group1 atc_group2 abbreviations synonyms oral_ddd oral_units iv_ddd iv_units loinc
AMK 37768 Amikacin Aminoglycosides D06AX12, J01GB06, S01AA21 Aminoglycoside antibacterials Other aminoglycosides ak, ami, amik, … amicacin, amikacillin, amikacin, … 1.0 g 13546-7, 15098-7, 17798-0, …
AMX 33613 Amoxicillin Beta-lactams/penicillins J01CA04 Beta-lactam antibacterials, penicillins Penicillins with extended spectrum ac, amox, amx actimoxi, amoclen, amolin, … 1.5 g 3.0 g 16365-9, 25274-2, 3344-9, …
AMC 23665637 Amoxicillin/clavulanic acid Beta-lactams/penicillins J01CR02 Beta-lactam antibacterials, penicillins Combinations of penicillins, incl. beta-lactamase inhibitors a/c, amcl, aml, … amocla, amoclan, amoclav, … 1.5 g 3.0 g
AMP 6249 Ampicillin Beta-lactams/penicillins J01CA01, S01AA19 Beta-lactam antibacterials, penicillins Penicillins with extended spectrum am, amp, ampi acillin, adobacillin, amblosin, … 2.0 g 6.0 g 21066-6, 3355-5, 33562-0, …
AZM 447043 Azithromycin Macrolides/lincosamides J01FA10, S01AA26 Macrolides, lincosamides and streptogramins Macrolides az, azi, azit, … aritromicina, aruzilina, azasite, … 0.3 g 0.5 g 16420-2, 25233-8
PEN 5904 Benzylpenicillin Beta-lactams/penicillins J01CE01, S01AA14 Combinations of antibacterials Combinations of antibacterials bepe, pen, peni, … abbocillin, ayercillin, bencilpenicilina, … 3.6 g
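Because the abbreviations column is stored as comma-separated text, one possible use is a reverse lookup from a laboratory abbreviation to the ab code. The Python sketch below uses only the two rows above whose abbreviation lists are shown in full (AMX and AMP); the function names are hypothetical, and keeping the first match when two agents share an abbreviation is an assumption of this sketch, not documented behaviour of the package.

```python
# ab code -> raw "abbreviations" cell, copied from the AMX and AMP rows.
ABBREVIATIONS_BY_AB = {
    "AMX": "ac, amox, amx",
    "AMP": "am, amp, ampi",
}


def build_abbreviation_index(abbreviations_by_ab):
    """Return a lower-cased abbreviation -> ab code mapping."""
    index = {}
    for ab, cell in abbreviations_by_ab.items():
        for abbreviation in cell.split(","):
            key = abbreviation.strip().lower()
            if key:
                # Assumption: on a collision, the first agent seen wins.
                index.setdefault(key, ab)
    return index


def lookup(term, index):
    """Case-insensitive lookup; returns None for unknown abbreviations."""
    return index.get(term.strip().lower())


index = build_abbreviation_index(ABBREVIATIONS_BY_AB)
print(lookup("Amox", index))
print(lookup("AMPI", index))
```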

antivirals: Antiviral Agents

A data set with 102 rows and 9 columns, containing the following column names:
atc, cid, name, atc_group, synonyms, oral_ddd, oral_units, iv_ddd and iv_units.

This data set is available in R as antivirals, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

The tab-separated text file, the Microsoft Excel workbook, and the SAS, SPSS and Stata files all contain the trade names as comma-separated values.

Source

This data set contains all ATC codes gathered from WHO and all compound IDs from PubChem. It also contains all brand names (synonyms) as found on PubChem and Defined Daily Doses (DDDs) for oral and parenteral administration.

Example content

atc cid name atc_group synonyms oral_ddd oral_units iv_ddd iv_units
J05AF06 441300 Abacavir Nucleoside and nucleotide reverse transcriptase inhibitors Abacavir, Abacavir sulfate, Ziagen 0.6 g
J05AB01 135398513 Aciclovir Nucleosides and nucleotides excl. reverse transcriptase inhibitors Acicloftal, Aciclovier, Aciclovir, … 4.0 g 4 g
J05AF08 60871 Adefovir dipivoxil Nucleoside and nucleotide reverse transcriptase inhibitors Adefovir di ester, Adefovir dipivoxil, Adefovir Dipivoxil, … 10.0 mg
J05AE05 65016 Amprenavir Protease inhibitors Agenerase, Amprenavir, Amprenavirum, … 1.2 g
J05AP06 16076883 Asunaprevir Antivirals for treatment of HCV infections Asunaprevir, Sunvepra
J05AE08 148192 Atazanavir Protease inhibitors Atazanavir, Atazanavir Base, Latazanavir, … 0.3 g

rsi_translation: Interpretation from MIC values / disk diameters to R/SI

A data set with 18,308 rows and 11 columns, containing the following column names:
guideline, method, site, mo, rank_index, ab, ref_tbl, disk_dose, breakpoint_S, breakpoint_R and uti.

This data set is available in R as rsi_translation, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains interpretation rules for MIC values and disk diffusion diameters. Included guidelines are CLSI (2013-2022) and EUCAST (2013-2022).

Example content

guideline method site mo mo_name rank_index ab ab_name ref_tbl disk_dose breakpoint_S breakpoint_R uti
EUCAST 2022 MIC F_ASPRG_MGTS Aspergillus fumigatus 2 AMB Amphotericin B Aspergillus 1 1 FALSE
EUCAST 2022 MIC F_ASPRG_NIGR Aspergillus niger 2 AMB Amphotericin B Aspergillus 1 1 FALSE
EUCAST 2022 MIC F_CANDD_ALBC Candida albicans 2 AMB Amphotericin B Candida 1 1 FALSE
EUCAST 2022 MIC F_CANDD_DBLN Candida dubliniensis 2 AMB Amphotericin B Candida 1 1 FALSE
EUCAST 2022 MIC F_CANDD_GLBR Candida glabrata 2 AMB Amphotericin B Candida 1 1 FALSE
EUCAST 2022 MIC F_CANDD_KRUS Candida krusei 2 AMB Amphotericin B Candida 1 1 FALSE
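To illustrate how the breakpoint_S and breakpoint_R columns can be applied, the Python sketch below follows the usual EUCAST convention: for MIC values, S means MIC ≤ breakpoint_S and R means MIC > breakpoint_R; for disk diffusion, S means diameter ≥ breakpoint_S and R means diameter < breakpoint_R. This is an assumption about how the columns are meant to be read, not a copy of the package's own interpretation code, and other guidelines may define the boundaries differently. The function names are hypothetical.

```python
def interpret_mic(mic, breakpoint_s, breakpoint_r):
    """Classify an MIC value (mg/L) as S, I or R (EUCAST-style boundaries)."""
    if mic <= breakpoint_s:
        return "S"
    if mic > breakpoint_r:
        return "R"
    return "I"


def interpret_disk(diameter, breakpoint_s, breakpoint_r):
    """Classify a disk diffusion zone diameter (mm) as S, I or R."""
    if diameter >= breakpoint_s:
        return "S"
    if diameter < breakpoint_r:
        return "R"
    return "I"


# Amphotericin B against Candida albicans, EUCAST 2022 (row above):
# breakpoint_S = 1 and breakpoint_R = 1, so there is no "I" range.
for mic in (0.5, 1, 2):
    print(mic, interpret_mic(mic, breakpoint_s=1, breakpoint_r=1))
```

When both breakpoints are equal, as in the rows above, every MIC value falls into either S or R.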

intrinsic_resistant: Intrinsic Bacterial Resistance

A data set with 134,659 rows and 2 columns, containing the following column names:
mo and ab.

This data set is available in R as intrinsic_resistant, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

This data set contains all intrinsic resistance defined by EUCAST for all bug-drug combinations, and is based on ‘EUCAST Expert Rules’ and ‘EUCAST Intrinsic Resistance and Unusual Phenotypes’ v3.3 (2021).

Example content

Example rows when filtering on Enterobacter cloacae:

microorganism antibiotic
Enterobacter cloacae Acetylmidecamycin
Enterobacter cloacae Acetylspiramycin
Enterobacter cloacae Amoxicillin
Enterobacter cloacae Amoxicillin/clavulanic acid
Enterobacter cloacae Ampicillin
Enterobacter cloacae Ampicillin/sulbactam
Enterobacter cloacae Avoparcin
Enterobacter cloacae Azithromycin
Enterobacter cloacae Benzylpenicillin
Enterobacter cloacae Cadazolid
Enterobacter cloacae Cefadroxil
Enterobacter cloacae Cefalexin
Enterobacter cloacae Cefalotin
Enterobacter cloacae Cefazolin
Enterobacter cloacae Cefoxitin
Enterobacter cloacae Clarithromycin
Enterobacter cloacae Clindamycin
Enterobacter cloacae Cycloserine
Enterobacter cloacae Dalbavancin
Enterobacter cloacae Dirithromycin
Enterobacter cloacae Erythromycin
Enterobacter cloacae Flurithromycin
Enterobacter cloacae Fusidic acid
Enterobacter cloacae Gamithromycin
Enterobacter cloacae Josamycin
Enterobacter cloacae Kitasamycin
Enterobacter cloacae Lincomycin
Enterobacter cloacae Linezolid
Enterobacter cloacae Meleumycin
Enterobacter cloacae Midecamycin
Enterobacter cloacae Miocamycin
Enterobacter cloacae Nafithromycin
Enterobacter cloacae Norvancomycin
Enterobacter cloacae Oleandomycin
Enterobacter cloacae Oritavancin
Enterobacter cloacae Pirlimycin
Enterobacter cloacae Primycin
Enterobacter cloacae Pristinamycin
Enterobacter cloacae Quinupristin/dalfopristin
Enterobacter cloacae Ramoplanin
Enterobacter cloacae Rifampicin
Enterobacter cloacae Rokitamycin
Enterobacter cloacae Roxithromycin
Enterobacter cloacae Solithromycin
Enterobacter cloacae Spiramycin
Enterobacter cloacae Tedizolid
Enterobacter cloacae Teicoplanin
Enterobacter cloacae Telavancin
Enterobacter cloacae Telithromycin
Enterobacter cloacae Thiacetazone
Enterobacter cloacae Tildipirosin
Enterobacter cloacae Tilmicosin
Enterobacter cloacae Troleandomycin
Enterobacter cloacae Tulathromycin
Enterobacter cloacae Tylosin
Enterobacter cloacae Tylvalosin
Enterobacter cloacae Vancomycin
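Since the data set is a plain list of bug-drug pairs, checking a combination amounts to a membership test. The Python sketch below uses a handful of the pairs listed above and stores them by name for readability; the real data set has the columns mo and ab, which presumably hold codes rather than names. The function name is hypothetical.

```python
# A small subset of the Enterobacter cloacae rows listed above.
INTRINSIC_PAIRS = {
    ("Enterobacter cloacae", "Amoxicillin"),
    ("Enterobacter cloacae", "Amoxicillin/clavulanic acid"),
    ("Enterobacter cloacae", "Ampicillin"),
    ("Enterobacter cloacae", "Cefazolin"),
    ("Enterobacter cloacae", "Cefoxitin"),
    ("Enterobacter cloacae", "Vancomycin"),
}


def is_intrinsically_resistant(microorganism, antibiotic, pairs=INTRINSIC_PAIRS):
    """Return True if the bug-drug pair is present in the given set of pairs."""
    return (microorganism, antibiotic) in pairs


print(is_intrinsically_resistant("Enterobacter cloacae", "Vancomycin"))
print(is_intrinsically_resistant("Enterobacter cloacae", "Ciprofloxacin"))
```

A False result here only means that the pair is absent from this small sample, not that the combination is clinically susceptible.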

dosage: Dosage Guidelines from EUCAST

A data set with 169 rows and 9 columns, containing the following column names:
ab, name, type, dose, dose_times, administration, notes, original_txt and eucast_version.

This data set is available in R as dosage, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Direct download links:

Source

EUCAST breakpoints used in this package are based on the dosages in this data set.

The dosages currently included in the data set apply to ‘EUCAST Clinical Breakpoint Tables’ v11.0 (2021).

Example content

ab   name         type               dose         dose_times  administration  notes  original_txt        eucast_version
AMK  Amikacin     standard_dosage    25-30 mg/kg  1           iv                     25-30 mg/kg x 1 iv  11
AMX  Amoxicillin  high_dosage        2 g          6           iv                     2 g x 6 iv          11
AMX  Amoxicillin  standard_dosage    1 g          3           iv                     1 g x 3-4 iv        11
AMX  Amoxicillin  high_dosage        0.75-1 g     3           oral                   0.75-1 g x 3 oral   11
AMX  Amoxicillin  standard_dosage    0.5 g        3           oral                   0.5 g x 3 oral      11
AMX  Amoxicillin  uncomplicated_uti  0.5 g        3           oral                   0.5 g x 3 oral      11
example_isolates: Example Data for Practice

A data set with 2,000 rows and 46 columns, containing the following column names:
date, patient, age, gender, ward, mo, PEN, OXA, FLC, AMX, AMC, AMP, TZP, CZO, FEP, CXM, FOX, CTX, CAZ, CRO, GEN, TOB, AMK, KAN, TMP, SXT, NIT, FOS, LNZ, CIP, MFX, VAN, TEC, TCY, TGC, DOX, ERY, CLI, AZM, IPM, MEM, MTR, CHL, COL, MUP and RIF.

This data set is available in R as example_isolates, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Source

This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.

Example content

date patient age gender ward mo PEN OXA FLC AMX AMC AMP TZP CZO FEP CXM FOX CTX CAZ CRO GEN TOB AMK KAN TMP SXT NIT FOS LNZ CIP MFX VAN TEC TCY TGC DOX ERY CLI AZM IPM MEM MTR CHL COL MUP RIF
2002-01-02 A77334 65 F Clinical B_ESCHR_COLI R I I R R R R R R R R R R
2002-01-03 A77334 65 F Clinical B_ESCHR_COLI R I I R R R R R R R R R R
2002-01-07 067927 45 F ICU B_STPHY_EPDR R R R R S S S S S S R R R
2002-01-07 067927 45 F ICU B_STPHY_EPDR R R R R S S S S S S R R R
2002-01-13 067927 45 F ICU B_STPHY_EPDR R R R R R S S S S R R R
2002-01-13 067927 45 F ICU B_STPHY_EPDR R R R R R S S S S R R R R

example_isolates_unclean: Example Data for Practice

A data set with 3,000 rows and 8 columns, containing the following column names:
patient_id, hospital, date, bacteria, AMX, AMC, CIP and GEN.

This data set is available in R as example_isolates_unclean, after you load the AMR package.

It was last updated on 30 October 2022 13:33:29 UTC. Find more info about the structure of this data set here.

Source

This data set contains randomised fictitious data, but reflects reality and can be used to practise AMR data analysis.

Example content

patient_id  hospital  date        bacteria       AMX  AMC  CIP  GEN
J3          A         2012-11-21  E. coli        R    I    S    S
R7          A         2018-04-03  K. pneumoniae  R    I    S    S
P3          A         2014-09-19  E. coli        R    S    S    S
P10         A         2015-12-10  E. coli        S    I    S    S
B7          A         2015-03-02  E. coli        S    S    S    S
W3          A         2018-03-31  S. aureus      R    S    R    S
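As a small illustration of the kind of practice analysis this data set is meant for, the Python sketch below computes the share of resistant (R) results per antibiotic over the six example rows. The function name resistance_share is hypothetical, and counting only cells that contain S, I or R is a choice of this sketch.

```python
FIELDS = ("patient_id", "hospital", "date", "bacteria", "AMX", "AMC", "CIP", "GEN")
ROWS = [
    ("J3", "A", "2012-11-21", "E. coli", "R", "I", "S", "S"),
    ("R7", "A", "2018-04-03", "K. pneumoniae", "R", "I", "S", "S"),
    ("P3", "A", "2014-09-19", "E. coli", "R", "S", "S", "S"),
    ("P10", "A", "2015-12-10", "E. coli", "S", "I", "S", "S"),
    ("B7", "A", "2015-03-02", "E. coli", "S", "S", "S", "S"),
    ("W3", "A", "2018-03-31", "S. aureus", "R", "S", "R", "S"),
]
records = [dict(zip(FIELDS, row)) for row in ROWS]


def resistance_share(records, antibiotic):
    """Fraction of R among valid S/I/R results, or None if there are none."""
    results = [r[antibiotic] for r in records if r[antibiotic] in {"S", "I", "R"}]
    if not results:
        return None
    return results.count("R") / len(results)


for antibiotic in ("AMX", "AMC", "CIP", "GEN"):
    share = resistance_share(records, antibiotic)
    print(f"{antibiotic}: {share:.0%} resistant")
```

Six rows are far too few for a meaningful estimate; the point is only to show the shape of the calculation.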