<div class = "col-md-9 contents" >
<div class = "page-header toc-ignore" >
How to create frequency tables
Matthijs S. Berends
08 February 2019
freq.Rmd
< / div >

<div id = "introduction" class = "section level2" >
<h2 class = "hasAnchor" >
Introduction
< / h2 >
< p > Frequency tables (or frequency distributions) are summaries of the distribution of values in a sample. With the < code > freq< / code > function, you can create univariate frequency tables. If you pass multiple variables, they are pasted into one variable, which keeps the distribution univariate. We use the < code > septic_patients< / code > dataset (included in this AMR package) as an example.< / p >
< / div >
<div id = "frequencies-of-one-variable" class = "section level2" >
<h2 class = "hasAnchor" >
Frequencies of one variable
< / h2 >
< p > To quickly review the content of one variable, you can select it in various ways. Let’s say we want to get the frequencies of the < code > gender< / code > variable of the < code > septic_patients< / code > dataset:< / p >
<div class = "sourceCode" id = "cb1" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>% freq(gender)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > gender< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > character< / code > (< code > character< / code > )< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 2< / p >
< p > Shortest: 1< br >
Longest: 1< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > M< / td >
< td align = "right" > 1,031< / td >
< td align = "right" > 51.6%< / td >
< td align = "right" > 1,031< / td >
< td align = "right" > 51.6%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > F< / td >
< td align = "right" > 969< / td >
< td align = "right" > 48.5%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
< p > This immediately shows the class of the variable, its length and availability (i.e. the number of < code > NA< / code > values), the number of unique values and (most importantly) that among septic patients men are more prevalent than women.< / p >
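< p > The columns of such a table can be reproduced with base R. The sketch below is an illustration only, not the implementation of < code > freq< / code > ; it rebuilds the < code > gender< / code > counts shown above as a toy vector, and the object names < code > counts< / code > and < code > freq_tbl< / code > are made up for this example:< / p >

```r
# Toy vector reproducing the counts in the table above (1,031 M and 969 F)
gender <- rep(c("M", "F"), times = c(1031, 969))

# Count each unique value and sort by count, highest first
counts <- sort(table(gender), decreasing = TRUE)

# Build the count, percent and cumulative columns (percentages as fractions)
freq_tbl <- data.frame(
  item        = names(counts),
  count       = as.integer(counts),
  percent     = as.integer(counts) / length(gender),
  cum_count   = cumsum(as.integer(counts)),
  cum_percent = cumsum(as.integer(counts)) / length(gender)
)
print(freq_tbl)
```

< p > The cumulative columns are running totals of the count and percent columns, which is why the last row always ends at the full length and 100%.< / p >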
< / div >
<div id = "frequencies-of-more-than-one-variable" class = "section level2" >
<h2 class = "hasAnchor" >
Frequencies of more than one variable
< / h2 >
< p > When you pass multiple variables, their values are pasted together into one variable so that individual cases can be reviewed. The result is still a univariate frequency table.< / p >
< p > For illustration, we could add some more variables to the < code > septic_patients< / code > dataset to learn about bacterial properties:< / p >
<div class = "sourceCode" id = "cb2" > < pre class = "sourceCode r" > < code class = "sourceCode r" >my_patients <- septic_patients %>% left_join_microorganisms()
# Joining, by = "mo"< / code > < / pre > < / div >
< p > Now all variables of the < code > microorganisms< / code > dataset have been joined to the < code > septic_patients< / code > dataset. The < code > microorganisms< / code > dataset consists of the following variables:< / p >
<div class = "sourceCode" id = "cb3" > < pre class = "sourceCode r" > < code class = "sourceCode r" >colnames(microorganisms)
# [1] "mo" "tsn" "genus" "species" "subspecies"
# [6] "fullname" "family" "order" "class" "phylum"
# [11] "subkingdom" "kingdom" "gramstain" "prevalence" "ref"< / code > < / pre > < / div >
< p > If we compare the dimensions between the old and new dataset, we can see that these 14 variables were added:< / p >
<div class = "sourceCode" id = "cb4" > < pre class = "sourceCode r" > < code class = "sourceCode r" >dim(septic_patients)
# [1] 2000 49
dim(my_patients)
# [1] 2000 63< / code > < / pre > < / div >
< p > So now the < code > genus< / code > and < code > species< / code > variables are available. A frequency table of these combined variables can be created like this:< / p >
<div class = "sourceCode" id = "cb5" > < pre class = "sourceCode r" > < code class = "sourceCode r" >my_patients %>%
  freq(genus, species, nmax = 15)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > genus< / code > and < code > species< / code > from a < code > data.frame< / code > (2,000 x 63)< / strong > < br >
Columns: 2< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 96< / p >
< p > Shortest: 12< br >
Longest: 34< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > Escherichia coli< / td >
< td align = "right" > 467< / td >
< td align = "right" > 23.4%< / td >
< td align = "right" > 467< / td >
< td align = "right" > 23.4%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > Staphylococcus coagulase negative< / td >
< td align = "right" > 313< / td >
< td align = "right" > 15.7%< / td >
< td align = "right" > 780< / td >
< td align = "right" > 39.0%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > Staphylococcus aureus< / td >
< td align = "right" > 235< / td >
< td align = "right" > 11.8%< / td >
< td align = "right" > 1,015< / td >
< td align = "right" > 50.7%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > Staphylococcus epidermidis< / td >
< td align = "right" > 174< / td >
< td align = "right" > 8.7%< / td >
< td align = "right" > 1,189< / td >
< td align = "right" > 59.5%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 5< / td >
< td align = "left" > Streptococcus pneumoniae< / td >
< td align = "right" > 117< / td >
< td align = "right" > 5.9%< / td >
< td align = "right" > 1,306< / td >
< td align = "right" > 65.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 6< / td >
< td align = "left" > Staphylococcus hominis< / td >
< td align = "right" > 81< / td >
< td align = "right" > 4.1%< / td >
< td align = "right" > 1,387< / td >
< td align = "right" > 69.4%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 7< / td >
< td align = "left" > Klebsiella pneumoniae< / td >
< td align = "right" > 58< / td >
< td align = "right" > 2.9%< / td >
< td align = "right" > 1,445< / td >
< td align = "right" > 72.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 8< / td >
< td align = "left" > Enterococcus faecalis< / td >
< td align = "right" > 39< / td >
< td align = "right" > 2.0%< / td >
< td align = "right" > 1,484< / td >
< td align = "right" > 74.2%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 9< / td >
< td align = "left" > Proteus mirabilis< / td >
< td align = "right" > 36< / td >
< td align = "right" > 1.8%< / td >
< td align = "right" > 1,520< / td >
< td align = "right" > 76.0%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 10< / td >
< td align = "left" > Pseudomonas aeruginosa< / td >
< td align = "right" > 30< / td >
< td align = "right" > 1.5%< / td >
< td align = "right" > 1,550< / td >
< td align = "right" > 77.5%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 11< / td >
< td align = "left" > Serratia marcescens< / td >
< td align = "right" > 25< / td >
< td align = "right" > 1.3%< / td >
< td align = "right" > 1,575< / td >
< td align = "right" > 78.8%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 12< / td >
< td align = "left" > Enterobacter cloacae< / td >
< td align = "right" > 23< / td >
< td align = "right" > 1.2%< / td >
< td align = "right" > 1,598< / td >
< td align = "right" > 79.9%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 13< / td >
< td align = "left" > Enterococcus faecium< / td >
< td align = "right" > 21< / td >
< td align = "right" > 1.1%< / td >
< td align = "right" > 1,619< / td >
< td align = "right" > 81.0%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 14< / td >
< td align = "left" > Staphylococcus capitis< / td >
< td align = "right" > 21< / td >
< td align = "right" > 1.1%< / td >
< td align = "right" > 1,640< / td >
< td align = "right" > 82.0%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 15< / td >
< td align = "left" > Bacteroides fragilis< / td >
< td align = "right" > 20< / td >
< td align = "right" > 1.0%< / td >
< td align = "right" > 1,660< / td >
< td align = "right" > 83.0%< / td >
< / tr >
< / tbody >
< / table >
< p > (omitted 81 entries, n = 340 [17.0%])< / p >
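< p > Pasting several variables into one, as described in this section, can be sketched in base R. This is only an illustration under stated assumptions: the toy values are made up, and using < code > paste()< / code > with a space as separator is a guess based on the items shown in the table, not a statement about how < code > freq< / code > works internally:< / p >

```r
# Hypothetical toy data; these values are for illustration only
genus   <- c("Escherichia", "Staphylococcus", "Escherichia", "Staphylococcus")
species <- c("coli", "aureus", "coli", "epidermidis")

# Paste both variables into a single variable, one value per row
combined <- paste(genus, species)

# The combined variable yields an ordinary univariate frequency table
counts <- sort(table(combined), decreasing = TRUE)
print(counts)
```

< p > Because every row ends up with a single pasted value, each unique combination of genus and species becomes one item in the table.< / p >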
< / div >
<div id = "frequencies-of-numeric-values" class = "section level2" >
<h2 class = "hasAnchor" >
Frequencies of numeric values
< / h2 >
< p > Frequency tables can be created from any input.< / p >
< p > For numeric values (like integers, doubles, etc.), additional descriptive statistics are calculated and shown in the header:< / p >
<div class = "sourceCode" id = "cb6" > < pre class = "sourceCode r" > < code class = "sourceCode r" ># get age distribution of unique patients
septic_patients %>%
  distinct(patient_id, .keep_all = TRUE) %>%
  freq(age, nmax = 5, header = TRUE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > age< / code > from a < code > data.frame< / code > (981 x 49)< / strong > < br >
Class: < code > numeric< / code > (< code > numeric< / code > )< br >
Length: 981 (of which NA: 0 = 0.00%)< br >
Unique: 73< / p >
< p > Mean: 71.08< br >
SD: 14.05 (CV: 0.20, MAD: 13.34)< br >
Five-Num: 14 | 63 | 74 | 82 | 97 (IQR: 19, CQV: 0.13)< br >
Outliers: 15 (unique count: 12)< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "right" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "right" > 83< / td >
< td align = "right" > 44< / td >
< td align = "right" > 4.5%< / td >
< td align = "right" > 44< / td >
< td align = "right" > 4.5%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "right" > 76< / td >
< td align = "right" > 43< / td >
< td align = "right" > 4.4%< / td >
< td align = "right" > 87< / td >
< td align = "right" > 8.9%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "right" > 75< / td >
< td align = "right" > 37< / td >
< td align = "right" > 3.8%< / td >
< td align = "right" > 124< / td >
< td align = "right" > 12.6%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "right" > 82< / td >
< td align = "right" > 33< / td >
< td align = "right" > 3.4%< / td >
< td align = "right" > 157< / td >
< td align = "right" > 16.0%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 5< / td >
< td align = "right" > 78< / td >
< td align = "right" > 32< / td >
< td align = "right" > 3.3%< / td >
< td align = "right" > 189< / td >
< td align = "right" > 19.3%< / td >
< / tr >
< / tbody >
< / table >
< p > (omitted 68 entries, n = 792 [80.7%])< / p >
< p > So the following properties are determined, where < code > NA< / code > values are always ignored:< / p >
< ul >
< li > < p > < strong > Mean< / strong > < / p > < / li >
< li > < p > < strong > Standard deviation< / strong > < / p > < / li >
< li > < p > < strong > Coefficient of variation< / strong > (CV), the standard deviation divided by the mean< / p > < / li >
< li > < p > < strong > Five numbers of Tukey< / strong > (min, Q1, median, Q3, max)< / p > < / li >
< li > < p > < strong > Coefficient of quartile variation< / strong > (CQV, sometimes called coefficient of dispersion), calculated as (Q3 - Q1) / (Q3 + Q1), where the quartiles are determined using < code > quantile()< / code > with < code > type = 6< / code > as the quantile algorithm to comply with SPSS standards< / p > < / li >
< li > < p > < strong > Outliers< / strong > (total count and unique count)< / p > < / li >
< / ul >
< p > So for example, the above frequency table quickly shows the median age of patients being 74.< / p >
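< p > The statistics listed above can be computed with base R functions. The following is a minimal sketch on made-up values, not the code used by < code > freq< / code > ; note that < code > fivenum()< / code > returns Tukey's hinges, which can differ slightly from the < code > type = 6< / code > quartiles used for the CQV:< / p >

```r
# Hypothetical toy values; NA is included to show that it is ignored
x <- c(2, 4, 4, 4, 5, 5, 7, 9, NA)

m  <- mean(x, na.rm = TRUE)
s  <- sd(x, na.rm = TRUE)
cv <- s / m  # coefficient of variation: standard deviation divided by mean

# Tukey's five numbers: min, lower hinge, median, upper hinge, max
five <- fivenum(x, na.rm = TRUE)

# Quartiles with type = 6, then the coefficient of quartile variation
q   <- quantile(x, probs = c(0.25, 0.75), type = 6, na.rm = TRUE)
cqv <- unname((q[2] - q[1]) / (q[2] + q[1]))

print(c(mean = m, sd = s, cv = cv, cqv = cqv))
print(five)
```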
< / div >
<div id = "frequencies-of-factors" class = "section level2" >
<h2 class = "hasAnchor" >
Frequencies of factors
< / h2 >
< p > To sort frequencies of factors by factor level instead of item count, set the < code > sort.count< / code > parameter to < code > FALSE< / code > .< / p >
< p > < code > sort.count< / code > is < code > TRUE< / code > by default. Compare this default behaviour…< / p >
<div class = "sourceCode" id = "cb7" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(hospital_id)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > hospital_id< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > (< code > numeric< / code > )< br >
Levels: A, B, C, D< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 4< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > D< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > B< / td >
< td align = "right" > 663< / td >
< td align = "right" > 33.2%< / td >
< td align = "right" > 1,425< / td >
< td align = "right" > 71.3%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > A< / td >
< td align = "right" > 321< / td >
< td align = "right" > 16.1%< / td >
< td align = "right" > 1,746< / td >
< td align = "right" > 87.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > C< / td >
< td align = "right" > 254< / td >
< td align = "right" > 12.7%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
< p > … with this, where items are now sorted on factor level:< / p >
<div class = "sourceCode" id = "cb8" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(hospital_id, sort.count = FALSE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > hospital_id< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > (< code > numeric< / code > )< br >
Levels: A, B, C, D< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 4< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > A< / td >
< td align = "right" > 321< / td >
< td align = "right" > 16.1%< / td >
< td align = "right" > 321< / td >
< td align = "right" > 16.1%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > B< / td >
< td align = "right" > 663< / td >
< td align = "right" > 33.2%< / td >
< td align = "right" > 984< / td >
< td align = "right" > 49.2%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > C< / td >
< td align = "right" > 254< / td >
< td align = "right" > 12.7%< / td >
< td align = "right" > 1,238< / td >
< td align = "right" > 61.9%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > D< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
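< p > The difference between the two sort orders can be sketched with base R. This is an illustration only; the factor below is a toy reconstruction of the < code > hospital_id< / code > counts from the tables above, and < code > by_level< / code > and < code > by_count< / code > are made-up names:< / p >

```r
# Factor reproducing the counts of the tables above
hospital_id <- factor(rep(c("A", "B", "C", "D"), times = c(321, 663, 254, 762)))

by_level <- table(hospital_id)                 # follows the factor levels
by_count <- sort(by_level, decreasing = TRUE)  # follows the item counts

print(names(by_level))
print(names(by_count))
```

< p > The counts per item are identical in both cases; only the row order, and with it the cumulative columns, changes.< / p >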
< p > All classes of a variable are printed in the header (the < code > header< / code > parameter defaults to < code > FALSE< / code > when using markdown, as in this document). Variables with the new < code > rsi< / code > class of this AMR package are actually ordered factors and have three classes (look at < code > Class< / code > in the header):< / p >
<div class = "sourceCode" id = "cb9" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(amox, header = TRUE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > amox< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > > < code > ordered< / code > > < code > rsi< / code > (< code > numeric< / code > )< br >
Levels: S < I < R< br >
Length: 2,000 (of which NA: 828 = 41.40%)< br >
Unique: 3< / p >
< p > %IR: 58.53% (ratio S : IR = 1.0 : 1.4)< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > R< / td >
< td align = "right" > 683< / td >
< td align = "right" > 58.3%< / td >
< td align = "right" > 683< / td >
< td align = "right" > 58.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > S< / td >
< td align = "right" > 486< / td >
< td align = "right" > 41.5%< / td >
< td align = "right" > 1,169< / td >
< td align = "right" > 99.7%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > I< / td >
< td align = "right" > 3< / td >
< td align = "right" > 0.3%< / td >
< td align = "right" > 1,172< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
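< p > The < code > %IR< / code > value in the header appears to be the share of I and R among the non-missing results. The sketch below checks this with base R on a toy vector rebuilt from the counts above; it uses a plain ordered factor, since the < code > rsi< / code > class itself comes from the AMR package:< / p >

```r
# Toy vector reproducing the counts above: 683 R, 486 S, 3 I and 828 NA
amox <- rep(c("R", "S", "I", NA), times = c(683, 486, 3, 828))

# An ordered factor carries two classes in base R; the rsi class of the
# AMR package adds a third one on top of these
amox_fct <- factor(amox, levels = c("S", "I", "R"), ordered = TRUE)
print(class(amox_fct))

# Share of I and R among the non-missing results
tested <- amox_fct[!is.na(amox_fct)]
pct_ir <- mean(tested %in% c("I", "R"))
print(round(100 * pct_ir, 2))
```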
< / div >
<div id = "frequencies-of-dates" class = "section level2" >
<h2 class = "hasAnchor" >
Frequencies of dates
< / h2 >
< p > Frequencies of dates will show the oldest and newest date in the data, and the number of days between them:< / p >
<div class = "sourceCode" id = "cb10" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(date, nmax = 5, header = TRUE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > date< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > Date< / code > (< code > numeric< / code > )< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 1,140< / p >
< p > Oldest: 2 January 2002< br >
Newest: 28 December 2017 (+5,839)< br >
Median: 31 July 2009 (47.39%)< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > 2016-05-21< / td >
< td align = "right" > 10< / td >
< td align = "right" > 0.5%< / td >
< td align = "right" > 10< / td >
< td align = "right" > 0.5%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > 2004-11-15< / td >
< td align = "right" > 8< / td >
< td align = "right" > 0.4%< / td >
< td align = "right" > 18< / td >
< td align = "right" > 0.9%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > 2013-07-29< / td >
< td align = "right" > 8< / td >
< td align = "right" > 0.4%< / td >
< td align = "right" > 26< / td >
< td align = "right" > 1.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > 2017-06-12< / td >
< td align = "right" > 8< / td >
< td align = "right" > 0.4%< / td >
< td align = "right" > 34< / td >
< td align = "right" > 1.7%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 5< / td >
< td align = "left" > 2015-11-19< / td >
< td align = "right" > 7< / td >
< td align = "right" > 0.4%< / td >
< td align = "right" > 41< / td >
< td align = "right" > 2.1%< / td >
< / tr >
< / tbody >
< / table >
< p > (omitted 1,135 entries, n = 1,959 [98.0%])< / p >
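< p > The oldest date, the newest date and the number of days between them can be derived with base R date arithmetic. A minimal sketch, using three of the dates shown above as a toy vector (the object names are made up for this example):< / p >

```r
# Toy vector holding a few of the dates shown above
dates <- as.Date(c("2016-05-21", "2002-01-02", "2017-12-28"))

rng    <- range(dates)   # oldest and newest date
oldest <- rng[1]
newest <- rng[2]

# Subtracting two Date objects gives a difference in days
days_between <- as.integer(newest - oldest)
print(days_between)
```

< p > The result corresponds to the value shown in parentheses after the newest date in the header.< / p >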
< / div >
<div id = "assigning-a-frequency-table-to-an-object" class = "section level2" >
<h2 class = "hasAnchor" >
Assigning a frequency table to an object
< / h2 >
< p > A frequency table is actually a regular < code > data.frame< / code > , with the exception that it contains an additional class.< / p >
<div class = "sourceCode" id = "cb11" > < pre class = "sourceCode r" > < code class = "sourceCode r" >my_df <- septic_patients %>% freq(age)
class(my_df)< / code > < / pre > < / div >
< p > [1] "frequency_tbl" "data.frame"< / p >
< p > Because of this additional class, a frequency table prints like the examples above. But the object itself contains the complete table without a row limitation:< / p >
<div class = "sourceCode" id = "cb12" > < pre class = "sourceCode r" > < code class = "sourceCode r" >dim(my_df)< / code > < / pre > < / div >
< p > [1] 74 5< / p >
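< p > How an additional class can change printing while the data stay complete can be sketched with R's S3 system. Everything in this example is hypothetical: the class name < code > my_freq_tbl< / code > , the print method and its < code > nmax< / code > argument are made up for illustration and are not the internals of < code > freq< / code > :< / p >

```r
# Hypothetical stand-in for a frequency table; 'my_freq_tbl' is a made-up class
tbl <- data.frame(item  = c("D", "B", "A", "C"),
                  count = c(762, 663, 321, 254))
class(tbl) <- c("my_freq_tbl", class(tbl))

# A print method for the extra class can limit the rows that are shown
print.my_freq_tbl <- function(x, nmax = 2, ...) {
  print.data.frame(head(x, nmax), ...)
  invisible(x)
}

print(tbl)       # shows the first 2 rows only
print(dim(tbl))  # the object itself still holds all 4 rows
```

< p > Since the object still inherits from < code > data.frame< / code > , it can be used like any other data frame in further processing.< / p >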
< / div >
<div id = "additional-parameters" class = "section level2" >
<h2 class = "hasAnchor" >
Additional parameters
< / h2 >
<div id = "parameter-na-rm" class = "section level3" >
<h3 class = "hasAnchor" >
Parameter `na.rm`
< / h3 >
< p > With the < code > na.rm< / code > parameter, you can include < code > NA< / code > values in the frequency table. It defaults to < code > TRUE< / code > ; the number of < code > NA< / code > values is always shown in the header, whatever its setting:< / p >
<div class = "sourceCode" id = "cb13" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(amox, na.rm = FALSE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > amox< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > > < code > ordered< / code > > < code > rsi< / code > (< code > numeric< / code > )< br >
Levels: S < I < R< br >
Length: 2,828 (of which NA: 828 = 29.28%)< br >
Unique: 4< / p >
< p > %IR: 34.30% (ratio S : IR = 1.0 : 1.4)< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > (NA)< / td >
< td align = "right" > 828< / td >
< td align = "right" > 41.4%< / td >
< td align = "right" > 828< / td >
< td align = "right" > 41.4%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > R< / td >
< td align = "right" > 683< / td >
< td align = "right" > 34.2%< / td >
< td align = "right" > 1,511< / td >
< td align = "right" > 75.6%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > S< / td >
< td align = "right" > 486< / td >
< td align = "right" > 24.3%< / td >
< td align = "right" > 1,997< / td >
< td align = "right" > 99.9%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > I< / td >
< td align = "right" > 3< / td >
< td align = "right" > 0.2%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
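< p > The effect of keeping or dropping < code > NA< / code > values can be sketched with base R's < code > table()< / code > and its < code > useNA< / code > argument. This is an analogy on a toy vector rebuilt from the counts above, not the implementation of < code > na.rm< / code > in < code > freq< / code > :< / p >

```r
# Toy vector reproducing the counts above: 683 R, 486 S, 3 I and 828 NA
amox <- rep(c("R", "S", "I", NA), times = c(683, 486, 3, 828))

without_na <- table(amox)                   # NA values are dropped
with_na    <- table(amox, useNA = "ifany")  # NA gets its own row

print(sum(without_na))
print(sum(with_na))
```

< p > Including the < code > NA< / code > values enlarges the denominator, so the percentages of the other items shrink: R drops from 58.3% to 34.2% in the tables on this page.< / p >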
< / div >
<div id = "parameter-row-names" class = "section level3" >
<h3 class = "hasAnchor" >
Parameter `row.names`
< / h3 >
< p > By default, frequency tables show row indices. To remove them, use < code > row.names = FALSE< / code > :< / p >
<div class = "sourceCode" id = "cb14" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(hospital_id, row.names = FALSE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > hospital_id< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > (< code > numeric< / code > )< br >
Levels: A, B, C, D< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 4< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > D< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > B< / td >
< td align = "right" > 663< / td >
< td align = "right" > 33.2%< / td >
< td align = "right" > 1,425< / td >
< td align = "right" > 71.3%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > A< / td >
< td align = "right" > 321< / td >
< td align = "right" > 16.1%< / td >
< td align = "right" > 1,746< / td >
< td align = "right" > 87.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > C< / td >
< td align = "right" > 254< / td >
< td align = "right" > 12.7%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
< / div >
<div id = "parameter-markdown" class = "section level3" >
<h3 class = "hasAnchor" >
Parameter `markdown`
< / h3 >
< p > The < code > markdown< / code > parameter is < code > TRUE< / code > by default in non-interactive sessions, such as reports created with R Markdown. With it, all rows are always printed unless < code > nmax< / code > is set.< / p >
<div class = "sourceCode" id = "cb15" > < pre class = "sourceCode r" > < code class = "sourceCode r" >septic_patients %>%
  freq(hospital_id, markdown = TRUE)< / code > < / pre > < / div >
< p > < strong > Frequency table of < code > hospital_id< / code > from a < code > data.frame< / code > (2,000 x 49)< / strong > < br >
Class: < code > factor< / code > (< code > numeric< / code > )< br >
Levels: A, B, C, D< br >
Length: 2,000 (of which NA: 0 = 0.00%)< br >
Unique: 4< / p >
< table class = "table" >
< thead > < tr class = "header" >
< th align = "left" > < / th >
< th align = "left" > Item< / th >
< th align = "right" > Count< / th >
< th align = "right" > Percent< / th >
< th align = "right" > Cum. Count< / th >
< th align = "right" > Cum. Percent< / th >
< / tr > < / thead >
< tbody >
< tr class = "odd" >
< td align = "left" > 1< / td >
< td align = "left" > D< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< td align = "right" > 762< / td >
< td align = "right" > 38.1%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 2< / td >
< td align = "left" > B< / td >
< td align = "right" > 663< / td >
< td align = "right" > 33.2%< / td >
< td align = "right" > 1,425< / td >
< td align = "right" > 71.3%< / td >
< / tr >
< tr class = "odd" >
< td align = "left" > 3< / td >
< td align = "left" > A< / td >
< td align = "right" > 321< / td >
< td align = "right" > 16.1%< / td >
< td align = "right" > 1,746< / td >
< td align = "right" > 87.3%< / td >
< / tr >
< tr class = "even" >
< td align = "left" > 4< / td >
< td align = "left" > C< / td >
< td align = "right" > 254< / td >
< td align = "right" > 12.7%< / td >
< td align = "right" > 2,000< / td >
< td align = "right" > 100.0%< / td >
< / tr >
< / tbody >
< / table >
< / div >
< / div >
< / div >
