Data sets for download / own use
-08 May 2023
+12 May 2023
Source: vignettes/datasets.Rmd
microorganisms: Full Microbial Taxonomy
-A data set with 52 151 rows and 22 columns, containing the following
+A data set with 52 151 rows and 23 columns, containing the following
column names:
mo, fullname, status, kingdom,
phylum, class, order, family,
genus, species, subspecies, rank,
-ref, source, lpsn, lpsn_parent,
-lpsn_renamed_to, gbif, gbif_parent,
-gbif_renamed_to, prevalence, and snomed.
+ref, oxygen_tolerance, source, lpsn,
+lpsn_parent, lpsn_renamed_to, gbif,
+gbif_parent, gbif_renamed_to, prevalence, and
+snomed.
This data set is available in R as microorganisms, after you load the AMR package.
-It was last updated on 20 April 2023 13:20:41 UTC. Find more info
-about the structure of this data set here.
+It was last updated on 11 May 2023 19:56:27 UTC. Find more info about
+the structure of this data set here.
Direct download links:
- Download as original
R Data Structure (RDS) file (1.2 MB)
- Download as tab-separated
-text file (11.3 MB)
+text file (11.7 MB)
- Download as Microsoft
-Excel workbook (5 MB)
+Excel workbook (5.2 MB)
- Download as Apache
-Feather file (5.4 MB)
+Feather file (5.5 MB)
- Download as Apache
Parquet file (2.6 MB)
- Download as SAS
-data file (50.9 MB)
+data (SAS) file (50.9 MB)
+
+ - Download as SAS
+transport (XPT) file (48.4 MB)
- Download as IBM
-SPSS Statistics data file (16.9 MB)
+SPSS Statistics data file (17.7 MB)
- Download as Stata
-DTA file (47.1 MB)
+DTA file (48.5 MB)
NOTE: The exported files for SAS, SPSS and Stata contain only the first
50 SNOMED codes per record, as their file size would otherwise
-exceed 100 MB; the file size limit of GitHub. Advice? Use R
-instead.
+exceed 100 MB; the file size limit of GitHub. Their file
+structures and compression techniques are very inefficient. Advice? Use
+R instead. It’s free and much better in many ways.
The tab-separated text file and Microsoft Excel workbook both contain all SNOMED codes as comma separated values.
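For readers who use the tab-separated download outside R, the following is a minimal sketch of filtering on a genus and splitting the comma-separated SNOMED codes into a list. It is not taken from the AMR package. The column names (mo, fullname, genus, snomed) come from the list above; the helper name `rows_for_genus`, the three sample rows, and the SNOMED values are hypothetical placeholders. It also assumes that a record without codes has an empty snomed cell, and the real file may encode missing values differently (for example as NA).

```python
import csv
import io

# Hypothetical three-row excerpt standing in for the downloaded
# tab-separated file; the SNOMED values are placeholders, not real codes.
SAMPLE_TSV = (
    "mo\tfullname\tgenus\tsnomed\n"
    "ROW_1\tEscherichia\tEscherichia\t111,222\n"
    "ROW_2\tEscherichia coli\tEscherichia\t333, 444,555\n"
    "ROW_3\tKlebsiella\tKlebsiella\t\n"
)


def rows_for_genus(handle, genus):
    """Yield rows of a tab-separated file whose 'genus' column equals `genus`.

    The 'snomed' value of each yielded row is replaced by a list of codes.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if row["genus"] == genus:
            # All SNOMED codes of a record sit in one comma-separated string.
            # Assumption: an empty cell means "no codes".
            codes = [code.strip() for code in row["snomed"].split(",")]
            row["snomed"] = [code for code in codes if code]
            yield row


# With the real download, replace io.StringIO(SAMPLE_TSV) by
# open(path, newline="", encoding="utf-8") for the saved file.
escherichia = list(rows_for_genus(io.StringIO(SAMPLE_TSV), "Escherichia"))
for row in escherichia:
    print(row["fullname"], row["snomed"])
```

Reading the file row by row keeps memory use low, which may matter for a text file of this size; a data-frame library would work just as well.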
Example rows when filtering on genus Escherichia:
mo |
@@ -335,6 +341,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
subspecies | rank | ref |
+oxygen_tolerance |
source | lpsn | lpsn_parent |
@@ -360,6 +367,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
genus | Castellani et al., 1919 |
+facultative anaerobe |
LPSN | 515602 | 482 |
@@ -384,6 +392,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
species | Leclerc, 1962 |
+aerobe |
LPSN | 776052 | 515602 |
@@ -408,6 +417,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
species | Huys et al., 2003 |
+aerobe |
LPSN | 776053 | 515602 |
@@ -432,6 +442,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
species | Burgess et al., 1973 |
+likely facultative anaerobe |
LPSN | 776056 | 515602 |
@@ -456,6 +467,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
species | Castellani et al., 1919 |
+facultative anaerobe |
LPSN | 776057 | 515602 |
@@ -480,6 +492,7 @@ Set Name ‘Microoganism’, OID 2.16.840.1.114222.4.11.1009 (v12). URL:
species | |
+likely facultative anaerobe |
GBIF |
@@ -525,7 +538,10 @@ Feather file (0.1 MB)