<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>How to create frequency tables • AMR (for R)</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png">
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png">
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png">
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.3.7/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script><!-- sticky kit --><script src="https://cdnjs.cloudflare.com/ajax/libs/sticky-kit/1.1.3/sticky-kit.min.js" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><!-- docsearch --><script src="../docsearch.js"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/docsearch.js/2.6.1/docsearch.min.css" integrity="sha256-QOSRU/ra9ActyXkIBbiIB144aDBdtvXBcNc3OTNuX/Q=" crossorigin="anonymous">
<link href="../docsearch.css" rel="stylesheet">
<script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/jquery.mark.min.js" integrity="sha256-4HLtjeVgH0eIB3aZ9mLYF6E8oU5chNdjU6p6rrXpl9U=" crossorigin="anonymous"></script><link href="../extra.css" rel="stylesheet">
<script src="../extra.js"></script><meta property="og:title" content="How to create frequency tables">
<meta property="og:description" content="">
<meta property="og:image" content="https://msberends.gitlab.io/AMR/logo.png">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Latest development version">0.7.0.9013</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fa fa-home"></span>
Home
</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
<span class="fa fa-question-circle"></span>
How to
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/AMR.html">
<span class="fa fa-directions"></span>
Conduct AMR analysis
</a>
</li>
<li>
<a href="../articles/resistance_predict.html">
<span class="fa fa-dice"></span>
Predict antimicrobial resistance
</a>
</li>
<li>
<a href="../articles/MDR.html">
<span class="fa fa-skull-crossbones"></span>
Determine multi-drug resistance (MDR)
</a>
</li>
<li>
<a href="../articles/WHONET.html">
<span class="fa fa-globe-americas"></span>
Work with WHONET data
</a>
</li>
<li>
<a href="../articles/SPSS.html">
<span class="fa fa-file-upload"></span>
Import data from SPSS/SAS/Stata
</a>
</li>
<li>
<a href="../articles/EUCAST.html">
<span class="fa fa-exchange-alt"></span>
Apply EUCAST rules
</a>
</li>
<li>
<a href="../reference/mo_property.html">
<span class="fa fa-bug"></span>
Get properties of a microorganism
</a>
</li>
<li>
<a href="../reference/ab_property.html">
<span class="fa fa-capsules"></span>
Get properties of an antibiotic
</a>
</li>
<li>
<a href="../articles/freq.html">
<span class="fa fa-sort-amount-down"></span>
Create frequency tables
</a>
</li>
<li>
<a href="../articles/benchmarks.html">
<span class="fa fa-shipping-fast"></span>
Other: benchmarks
</a>
</li>
</ul>
</li>
<li>
<a href="../reference/">
<span class="fa fa-book-open"></span>
Manual
</a>
</li>
<li>
<a href="../authors.html">
<span class="fa fa-users"></span>
Authors
</a>
</li>
<li>
<a href="../news/">
<span class="far fa far fa-newspaper"></span>
Changelog
</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://gitlab.com/msberends/AMR">
<span class="fab fa fab fa-gitlab"></span>
Source Code
</a>
</li>
<li>
<a href="../LICENSE-text.html">
<span class="fa fa-book"></span>
Licence
</a>
</li>
</ul>
<form class="navbar-form navbar-right" role="search">
<div class="form-group">
<input type="search" class="form-control" name="search-input" id="search-input" placeholder="Search..." aria-label="Search for..." autocomplete="off">
</div>
</form>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>How to create frequency tables</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">22 June 2019</h4>
<div class="hidden name"><code>freq.Rmd</code></div>
</div>
<div id="introduction" class="section level2">
<h2 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h2>
<p>Frequency tables (or frequency distributions) are summaries of the distribution of values in a sample. With the <code><a href="../reference/freq.html">freq()</a></code> function, you can create univariate frequency tables. When multiple variables are given, they are pasted into one variable, which forces a univariate distribution. We take the <code>septic_patients</code> dataset (included in this AMR package) as an example.</p>
</div>
<div id="frequencies-of-one-variable" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-one-variable" class="anchor"></a>Frequencies of one variable</h2>
<p>To quickly review the content of a single variable, you can select it in various ways. Let's say we want to get the frequencies of the <code>gender</code> variable of the <code>septic_patients</code> dataset:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1"><span class="co"># Any of these will work:</span></a>
<a class="sourceLine" id="cb1-2" title="2"><span class="co"># freq(septic_patients$gender)</span></a>
<a class="sourceLine" id="cb1-3" title="3"><span class="co"># freq(septic_patients[, "gender"])</span></a>
<a class="sourceLine" id="cb1-4" title="4"></a>
<a class="sourceLine" id="cb1-5" title="5"><span class="co"># Using tidyverse:</span></a>
<a class="sourceLine" id="cb1-6" title="6"><span class="co"># septic_patients$gender %&gt;% freq()</span></a>
<a class="sourceLine" id="cb1-7" title="7"><span class="co"># septic_patients[, "gender"] %&gt;% freq()</span></a>
<a class="sourceLine" id="cb1-8" title="8"><span class="co"># septic_patients %&gt;% freq("gender")</span></a>
<a class="sourceLine" id="cb1-9" title="9"></a>
<a class="sourceLine" id="cb1-10" title="10"><span class="co"># Probably the fastest and easiest:</span></a>
<a class="sourceLine" id="cb1-11" title="11">septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(gender) </a></code></pre></div>
<p><strong>Frequency table of <code>gender</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: character<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 2</p>
<p>Shortest: 1<br>
Longest: 1</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">M</td>
<td align="right">1,031</td>
<td align="right">51.6%</td>
<td align="right">1,031</td>
<td align="right">51.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">F</td>
<td align="right">969</td>
<td align="right">48.4%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
<p>This immediately shows the class of the variable, its length and availability (i.e. the number of <code>NA</code> values), the number of unique values and (most importantly) that among septic patients men are more prevalent than women.</p>
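<p>To make the columns of the table concrete, here is a minimal base R sketch of how counts, percentages and their cumulative versions relate to each other. It uses a small made-up vector (not the <code>septic_patients</code> data), the column names are hypothetical, and it is only an illustration, not necessarily how <code>freq()</code> works internally:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data; not taken from septic_patients
gender &lt;- c("M", "F", "M", "M", "F")

# Count each value and put the most frequent item first
counts &lt;- sort(table(gender), decreasing = TRUE)

freq_tbl &lt;- data.frame(
  item    = names(counts),
  count   = as.integer(counts),
  percent = as.integer(counts) / sum(counts),
  stringsAsFactors = FALSE
)

# Cumulative columns are running totals of the two columns above
freq_tbl$cum_count   &lt;- cumsum(freq_tbl$count)
freq_tbl$cum_percent &lt;- cumsum(freq_tbl$percent)

freq_tbl</code></pre></div>
<p>The last row of the cumulative columns always equals the total number of observations and 100%, which is a quick sanity check for any frequency table.</p>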
</div>
<div id="frequencies-of-more-than-one-variable" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-more-than-one-variable" class="anchor"></a>Frequencies of more than one variable</h2>
<p>Multiple variables will be pasted into one variable to review individual cases, keeping a univariate frequency table.</p>
<p>For illustration, we could add some more variables to the <code>septic_patients</code> dataset to learn about bacterial properties:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" title="1">my_patients &lt;-<span class="st"> </span>septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/join.html">left_join_microorganisms</a></span>()</a>
<a class="sourceLine" id="cb2-2" title="2"><span class="co"># Joining, by = "mo"</span></a></code></pre></div>
<p>Now all variables of the <code>microorganisms</code> dataset have been joined to the <code>septic_patients</code> dataset. The <code>microorganisms</code> dataset consists of the following variables:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/colnames">colnames</a></span>(microorganisms)</a>
<a class="sourceLine" id="cb3-2" title="2"><span class="co"># [1] "mo" "col_id" "fullname" "kingdom" "phylum" </span></a>
<a class="sourceLine" id="cb3-3" title="3"><span class="co"># [6] "class" "order" "family" "genus" "species" </span></a>
<a class="sourceLine" id="cb3-4" title="4"><span class="co"># [11] "subspecies" "rank" "ref" "species_id" "source" </span></a>
<a class="sourceLine" id="cb3-5" title="5"><span class="co"># [16] "prevalence"</span></a></code></pre></div>
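<p>If we compare the dimensions of the old and new datasets, we can see that 15 variables were added (all 16 variables of <code>microorganisms</code> except the join key <code>mo</code>, which was already present):</p>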
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(septic_patients)</a>
<a class="sourceLine" id="cb4-2" title="2"><span class="co"># [1] 2000 49</span></a>
<a class="sourceLine" id="cb4-3" title="3"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(my_patients)</a>
<a class="sourceLine" id="cb4-4" title="4"><span class="co"># [1] 2000 64</span></a></code></pre></div>
<p>So now the <code>genus</code> and <code>species</code> variables are available. A frequency table of these combined variables can be created like this:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb5-1" title="1">my_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb5-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(genus, species, <span class="dt">nmax =</span> <span class="dv">15</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from <code>my_patients</code> (2,000 x 64)</strong></p>
<p>Columns: 2<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 95</p>
<p>Shortest: 8<br>
Longest: 34</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">Escherichia coli</td>
<td align="right">467</td>
<td align="right">23.4%</td>
<td align="right">467</td>
<td align="right">23.4%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">Staphylococcus coagulase-negative</td>
<td align="right">313</td>
<td align="right">15.6%</td>
<td align="right">780</td>
<td align="right">39.0%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">Staphylococcus aureus</td>
<td align="right">235</td>
<td align="right">11.7%</td>
<td align="right">1,015</td>
<td align="right">50.7%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">Staphylococcus epidermidis</td>
<td align="right">174</td>
<td align="right">8.7%</td>
<td align="right">1,189</td>
<td align="right">59.4%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="left">Streptococcus pneumoniae</td>
<td align="right">117</td>
<td align="right">5.8%</td>
<td align="right">1,306</td>
<td align="right">65.3%</td>
</tr>
<tr class="even">
<td align="left">6</td>
<td align="left">Staphylococcus hominis</td>
<td align="right">81</td>
<td align="right">4.0%</td>
<td align="right">1,387</td>
<td align="right">69.4%</td>
</tr>
<tr class="odd">
<td align="left">7</td>
<td align="left">Klebsiella pneumoniae</td>
<td align="right">58</td>
<td align="right">2.9%</td>
<td align="right">1,445</td>
<td align="right">72.2%</td>
</tr>
<tr class="even">
<td align="left">8</td>
<td align="left">Enterococcus faecalis</td>
<td align="right">39</td>
<td align="right">2.0%</td>
<td align="right">1,484</td>
<td align="right">74.2%</td>
</tr>
<tr class="odd">
<td align="left">9</td>
<td align="left">Proteus mirabilis</td>
<td align="right">36</td>
<td align="right">1.8%</td>
<td align="right">1,520</td>
<td align="right">76.0%</td>
</tr>
<tr class="even">
<td align="left">10</td>
<td align="left">Pseudomonas aeruginosa</td>
<td align="right">30</td>
<td align="right">1.5%</td>
<td align="right">1,550</td>
<td align="right">77.5%</td>
</tr>
<tr class="odd">
<td align="left">11</td>
<td align="left">Serratia marcescens</td>
<td align="right">25</td>
<td align="right">1.2%</td>
<td align="right">1,575</td>
<td align="right">78.8%</td>
</tr>
<tr class="even">
<td align="left">12</td>
<td align="left">Enterobacter cloacae</td>
<td align="right">23</td>
<td align="right">1.2%</td>
<td align="right">1,598</td>
<td align="right">79.9%</td>
</tr>
<tr class="odd">
<td align="left">13</td>
<td align="left">Enterococcus faecium</td>
<td align="right">21</td>
<td align="right">1.0%</td>
<td align="right">1,619</td>
<td align="right">81.0%</td>
</tr>
<tr class="even">
<td align="left">14</td>
<td align="left">Staphylococcus capitis</td>
<td align="right">21</td>
<td align="right">1.0%</td>
<td align="right">1,640</td>
<td align="right">82.0%</td>
</tr>
<tr class="odd">
<td align="left">15</td>
<td align="left">Bacteroides fragilis</td>
<td align="right">20</td>
<td align="right">1.0%</td>
<td align="right">1,660</td>
<td align="right">83.0%</td>
</tr>
</tbody>
</table>
<p>(omitted 80 entries, n = 340 [17.0%])</p>
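<p>The idea of pasting several variables into one can be sketched in base R as follows. The toy data frame is made up, and the single space used as separator is an assumption based on how the items are printed above; <code>freq()</code> itself may combine the columns differently:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data using the same column names as in the text
df &lt;- data.frame(
  genus   = c("Escherichia", "Staphylococcus", "Escherichia"),
  species = c("coli", "aureus", "coli"),
  stringsAsFactors = FALSE
)

# Paste both columns into one character vector (separator is an assumption)
combined &lt;- paste(df$genus, df$species)

# The result is an ordinary univariate frequency table of the combined values
sort(table(combined), decreasing = TRUE)</code></pre></div>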
</div>
<div id="frequencies-of-numeric-values" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-numeric-values" class="anchor"></a>Frequencies of numeric values</h2>
<p>Frequency tables can be created from any input.</p>
<p>For numeric values (like integers, doubles, etc.), additional descriptive statistics are calculated and shown in the header:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" title="1"><span class="co"># get age distribution of unique patients</span></a>
<a class="sourceLine" id="cb6-2" title="2">septic_patients <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb6-3" title="3"><span class="st"> </span><span class="kw">distinct</span>(patient_id, <span class="dt">.keep_all =</span> <span class="ot">TRUE</span>) <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb6-4" title="4"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(age, <span class="dt">nmax =</span> <span class="dv">5</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>age</code> from a <code>data.frame</code> (981 x 49)</strong></p>
<p>Class: numeric<br>
Length: 981 (of which NA: 0 = 0.00%)<br>
Unique: 73</p>
<p>Mean: 71.08<br>
SD: 14.05 (CV: 0.20, MAD: 13.34)<br>
Five-Num: 14 | 63 | 74 | 82 | 97 (IQR: 19, CQV: 0.13)<br>
Outliers: 15 (unique count: 12)</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="right">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="right">83</td>
<td align="right">44</td>
<td align="right">4.5%</td>
<td align="right">44</td>
<td align="right">4.5%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="right">76</td>
<td align="right">43</td>
<td align="right">4.4%</td>
<td align="right">87</td>
<td align="right">8.9%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="right">75</td>
<td align="right">37</td>
<td align="right">3.8%</td>
<td align="right">124</td>
<td align="right">12.6%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="right">82</td>
<td align="right">33</td>
<td align="right">3.4%</td>
<td align="right">157</td>
<td align="right">16.0%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="right">78</td>
<td align="right">32</td>
<td align="right">3.3%</td>
<td align="right">189</td>
<td align="right">19.3%</td>
</tr>
</tbody>
</table>
<p>(omitted 68 entries, n = 792 [80.7%])</p>
<p>The following properties are determined; <code>NA</code> values are always ignored:</p>
<ul>
<li><p><strong>Mean</strong></p></li>
<li><p><strong>Standard deviation</strong></p></li>
<li><p><strong>Coefficient of variation</strong> (CV), the standard deviation divided by the mean</p></li>
<li><p><strong>Median absolute deviation</strong> (MAD), the median of the absolute deviations from the median - a more robust statistic than the standard deviation</p></li>
<li><p><strong>Five numbers of Tukey</strong>, namely: the minimum, Q1, median, Q3 and maximum</p></li>
<li><p><strong>Interquartile range</strong> (IQR), the distance between Q1 and Q3</p></li>
<li><p><strong>Coefficient of quartile variation</strong> (CQV, sometimes called <em>coefficient of dispersion</em>), calculated as (Q3 - Q1) / (Q3 + Q1) using <code><a href="https://www.rdocumentation.org/packages/stats/topics/quantile">quantile()</a></code> with <code>type = 6</code> as quantile algorithm to comply with SPSS standards</p></li>
<li><p><strong>Outliers</strong> (total count and unique count)</p></li>
</ul>
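<p>As a rough illustration of the definitions listed above, the same statistics can be computed with base R on a small made-up vector. The variable names are hypothetical, and the outlier count is left out because the text does not state which outlier rule is used:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical values; not the ages from septic_patients
age &lt;- c(14, 63, 70, 74, 78, 82, 97)

m  &lt;- mean(age, na.rm = TRUE)
s  &lt;- sd(age, na.rm = TRUE)
cv &lt;- s / m                                  # coefficient of variation

# stats::mad() multiplies the median absolute deviation by 1.4826 by default;
# use constant = 1 to get the unscaled median of absolute deviations
mad_scaled &lt;- mad(age, na.rm = TRUE)
mad_raw    &lt;- mad(age, constant = 1, na.rm = TRUE)

five &lt;- fivenum(age, na.rm = TRUE)           # Tukey's five numbers

# Quartiles with type = 6, as mentioned in the text for the CQV
q   &lt;- quantile(age, probs = c(0.25, 0.75), type = 6, na.rm = TRUE)
iqr &lt;- unname(q[2] - q[1])                   # interquartile range
cqv &lt;- unname((q[2] - q[1]) / (q[2] + q[1])) # coefficient of quartile variation</code></pre></div>
<p>Note that <code>fivenum()</code> and <code>quantile(type = 6)</code> use different rules for the quartiles, so on small samples the hinges reported by one need not equal the quartiles reported by the other.</p>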
<p>For example, the header of the frequency table above quickly shows that the median age of the patients is 74.</p>
</div>
<div id="frequencies-of-factors" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-factors" class="anchor"></a>Frequencies of factors</h2>
<p>To sort the frequencies of factors by their levels instead of by item count, set the <code>sort.count</code> parameter to <code>FALSE</code>.</p>
<p><code>sort.count</code> is <code>TRUE</code> by default. Compare this default behaviour…</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb7-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: factor (numeric)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: A, B, C, D<br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">762</td>
<td align="right">38.1%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">1,425</td>
<td align="right">71.2%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.0%</td>
<td align="right">1,746</td>
<td align="right">87.3%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
<p>… to this, where items are now sorted on factor levels:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb8-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">sort.count =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: factor (numeric)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: A, B, C, D<br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.0%</td>
<td align="right">321</td>
<td align="right">16.0%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">984</td>
<td align="right">49.2%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">1,238</td>
<td align="right">61.9%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
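<p>The difference between the two orderings can be sketched in base R with a made-up factor. This only illustrates the concept of sorting by count versus by level; it does not use <code>freq()</code> or the <code>septic_patients</code> data:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical factor with levels A to D and unequal counts
hospital &lt;- factor(
  rep(c("D", "B", "A", "C"), times = c(4, 3, 2, 1)),
  levels = c("A", "B", "C", "D")
)

by_level &lt;- table(hospital)                    # follows the order of the factor levels
by_count &lt;- sort(by_level, decreasing = TRUE)  # most frequent item first

names(by_level)
names(by_count)</code></pre></div>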
<p>All classes are printed in the header. Variables with the <code>rsi</code> class of this AMR package are actually ordered factors and have three classes (look at <code>Class</code> in the header):</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb9-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb9-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(AMX, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>AMX</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: factor &gt; ordered &gt; rsi (numeric)<br>
Length: 2,000 (of which NA: 771 = 38.55%)<br>
Levels: 3: S &lt; I &lt; R<br>
Unique: 3</p>
<p>Drug: Amoxicillin (AMX, J01CA04)<br>
Group: Beta-lactams/penicillins<br>
%SI: 44.43%</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">R</td>
<td align="right">683</td>
<td align="right">55.6%</td>
<td align="right">683</td>
<td align="right">55.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">S</td>
<td align="right">543</td>
<td align="right">44.2%</td>
<td align="right">1,226</td>
<td align="right">99.8%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">I</td>
<td align="right">3</td>
<td align="right">0.2%</td>
<td align="right">1,229</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
</div>
<div id="frequencies-of-dates" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-dates" class="anchor"></a>Frequencies of dates</h2>
<p>Frequencies of dates will show the oldest and newest date in the data, and the number of days between them:</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb10-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb10-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(date, <span class="dt">nmax =</span> <span class="dv">5</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>date</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: Date (numeric)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 1,140</p>
<p>Oldest: 2 January 2002<br>
Newest: 28 December 2017 (+5,839)<br>
Median: 31 July 2009 (47.39%)</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">2016-05-21</td>
<td align="right">10</td>
<td align="right">0.5%</td>
<td align="right">10</td>
<td align="right">0.5%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">2004-11-15</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">18</td>
<td align="right">0.9%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">2013-07-29</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">26</td>
<td align="right">1.3%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">2017-06-12</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">34</td>
<td align="right">1.7%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="left">2015-11-19</td>
<td align="right">7</td>
<td align="right">0.4%</td>
<td align="right">41</td>
<td align="right">2.0%</td>
</tr>
</tbody>
</table>
<p>(omitted 1,135 entries, n = 1,959 [98.0%])</p>
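<p>The oldest and newest dates and the span in days shown in the header can be reproduced with base R date arithmetic. The sketch below uses only the two boundary dates printed above plus the median date, so it is an illustration of the calculation rather than a rerun on the full dataset:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Dates copied from the header above
dates &lt;- as.Date(c("2002-01-02", "2009-07-31", "2017-12-28"))

oldest &lt;- min(dates)
newest &lt;- max(dates)

# Subtracting two Date objects gives a difference in days
span_days &lt;- as.integer(newest - oldest)
span_days</code></pre></div>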
</div>
<div id="assigning-a-frequency-table-to-an-object" class="section level2">
<h2 class="hasAnchor">
<a href="#assigning-a-frequency-table-to-an-object" class="anchor"></a>Assigning a frequency table to an object</h2>
<p>A frequency table is actually a regular <code>data.frame</code>, with the exception that it contains an additional class.</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb11-1" title="1">my_df &lt;-<span class="st"> </span>septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(age)</a>
<a class="sourceLine" id="cb11-2" title="2"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/class">class</a></span>(my_df)</a></code></pre></div>
<p>[1] "freq" "data.frame"</p>
<p>Because of this additional class, a frequency table prints like the examples above. But the object itself contains the complete table without a row limitation:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(my_df)</a></code></pre></div>
<p>[1] 74 5</p>
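<p>How an additional class can change printing while the data stay intact can be sketched with a small S3 example. The class name <code>my_freq</code>, the <code>nmax</code> argument of the print method and the toy data are all hypothetical; this is not the implementation used by the AMR package:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># A regular data.frame with one extra (hypothetical) class in front
tbl &lt;- data.frame(item = c("a", "b"), count = c(3L, 1L), stringsAsFactors = FALSE)
class(tbl) &lt;- c("my_freq", class(tbl))

# Custom print method: shows a short header and only the first nmax rows
print.my_freq &lt;- function(x, nmax = 1, ...) {
  cat("Frequency table (showing", min(nmax, nrow(x)), "of", nrow(x), "rows)\n")
  print.data.frame(head(x, nmax), ...)
  invisible(x)
}

tbl       # printed through print.my_freq(), truncated view
dim(tbl)  # the object itself still holds all rows
class(tbl)</code></pre></div>
<p>Because the object still inherits from <code>data.frame</code>, functions that work on data frames keep working on it.</p>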
</div>
<div id="additional-parameters" class="section level2">
<h2 class="hasAnchor">
<a href="#additional-parameters" class="anchor"></a>Additional parameters</h2>
<div id="parameter-na-rm" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-na-rm" class="anchor"></a>Parameter <code>na.rm</code>
</h3>
<p>With the <code>na.rm</code> parameter you can control whether <code>NA</code> values are removed from the frequency table (defaults to <code>TRUE</code>; the number of <code>NA</code> values is always shown in the header):</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb13-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb13-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(AMX, <span class="dt">na.rm =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>AMX</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: factor &gt; ordered &gt; rsi (numeric)<br>
Length: 2,000 (of which NA: 771 = 38.55%)<br>
Levels: 3: S &lt; I &lt; R<br>
Unique: 4</p>
<p>Drug: Amoxicillin (AMX, J01CA04)<br>
Group: Beta-lactams/penicillins<br>
%SI: 44.43%</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">(NA)</td>
<td align="right">771</td>
<td align="right">38.6%</td>
<td align="right">771</td>
<td align="right">38.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">R</td>
<td align="right">683</td>
<td align="right">34.2%</td>
<td align="right">1,454</td>
<td align="right">72.7%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">S</td>
<td align="right">543</td>
<td align="right">27.2%</td>
<td align="right">1,997</td>
<td align="right">99.8%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">I</td>
<td align="right">3</td>
<td align="right">0.2%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
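<p>The effect of keeping or dropping <code>NA</code> values on the percentages can be reproduced in base R. This sketch rebuilds a vector from the counts in the table above and uses <code>table()</code> with its <code>useNA</code> argument. It is an analogue for illustration, under the assumption that the counts above are exact, and not a description of how <code>freq()</code> works internally.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Rebuild the AMX results from the counts shown above (illustrative data)
amx &lt;- rep(c(NA, "R", "S", "I"), times = c(771, 683, 543, 3))

with_na    &lt;- table(amx, useNA = "always")  # NA counted as its own item
without_na &lt;- table(amx)                    # default: NA values dropped

# Percentages are relative to a different total in each case
round(100 * prop.table(with_na), 1)
round(100 * prop.table(without_na), 1)

sum(with_na)     # all observations
sum(without_na)  # non-missing observations only</code></pre></div>
<p>Dropping the <code>NA</code> values shrinks the denominator, so the share of every remaining item goes up.</p>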
</div>
<div id="parameter-row-names" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-row-names" class="anchor"></a>Parameter <code>row.names</code>
</h3>
<p>By default, a frequency table shows row indices. To remove them, use <code>row.names = FALSE</code>:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb14-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb14-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">row.names =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from <code>septic_patients</code> (2,000 x 49)</strong></p>
<p>Class: factor (numeric)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: A, B, C, D<br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">762</td>
<td align="right">38.1%</td>
</tr>
<tr class="even">
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">1,425</td>
<td align="right">71.2%</td>
</tr>
<tr class="odd">
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.0%</td>
<td align="right">1,746</td>
<td align="right">87.3%</td>
</tr>
<tr class="even">
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
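<p>The columns of such a table, and the effect of hiding row indices, can be sketched with base R alone. The example below rebuilds <code>hospital_id</code> from the counts shown above and relies on the <code>row.names</code> argument of <code>print.data.frame()</code>. The column names are chosen for illustration, and this is not the package's own code.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Rebuild the factor from the counts shown above (illustrative data)
hospital_id &lt;- factor(rep(c("A", "B", "C", "D"), times = c(321, 663, 254, 762)))

# Count each item and sort by descending frequency
counts &lt;- sort(table(hospital_id), decreasing = TRUE)

tbl &lt;- data.frame(
  item = names(counts),
  count = as.integer(counts),
  percent = as.numeric(counts) / sum(counts),
  stringsAsFactors = FALSE
)
tbl$cum_count   &lt;- cumsum(tbl$count)    # running total of counts
tbl$cum_percent &lt;- cumsum(tbl$percent)  # running total of proportions

print(tbl)                     # with row indices
print(tbl, row.names = FALSE)  # without row indices</code></pre></div>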
</div>
<div id="parameter-markdown" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-markdown" class="anchor"></a>Parameter <code>markdown</code>
</h3>
<p>The <code>markdown</code> parameter defaults to <code>TRUE</code> in non-interactive sessions, such as reports created with R Markdown. In markdown mode all rows are printed, unless <code>nmax</code> is set. Without markdown (as in a regular R console), a frequency table prints like this:</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb15-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb15-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">markdown =</span> <span class="ot">FALSE</span>)</a>
<a class="sourceLine" id="cb15-3" title="3"><span class="co"># Frequency table of `hospital_id` from `septic_patients` (2,000 x 49) </span></a>
<a class="sourceLine" id="cb15-4" title="4"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-5" title="5"><span class="co"># Class: factor (numeric)</span></a>
<a class="sourceLine" id="cb15-6" title="6"><span class="co"># Length: 2,000 (of which NA: 0 = 0.00%)</span></a>
<a class="sourceLine" id="cb15-7" title="7"><span class="co"># Levels: 4: A, B, C, D</span></a>
<a class="sourceLine" id="cb15-8" title="8"><span class="co"># Unique: 4</span></a>
<a class="sourceLine" id="cb15-9" title="9"><span class="co"># </span></a>
<a class="sourceLine" id="cb15-10" title="10"><span class="co">#      Item    Count   Percent   Cum. Count   Cum. Percent</span></a>
<a class="sourceLine" id="cb15-11" title="11"><span class="co"># ---  -----  ------  --------  -----------  -------------</span></a>
<a class="sourceLine" id="cb15-12" title="12"><span class="co"># 1    D         762     38.1%          762          38.1%</span></a>
<a class="sourceLine" id="cb15-13" title="13"><span class="co"># 2    B         663     33.2%        1,425          71.2%</span></a>
<a class="sourceLine" id="cb15-14" title="14"><span class="co"># 3    A         321     16.0%        1,746          87.3%</span></a>
<a class="sourceLine" id="cb15-15" title="15"><span class="co"># 4    C         254     12.7%        2,000         100.0%</span></a></code></pre></div>
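<p>A default that depends on the session type can be expressed with base R's <code>interactive()</code>. The sketch below is hypothetical: <code>to_markdown()</code> and <code>show_table()</code> are names invented for this illustration, and the pipe-table layout is an assumption rather than the exact output of <code>freq()</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper: render a data.frame as markdown pipe-table lines
to_markdown &lt;- function(df) {
  cells  &lt;- unname(lapply(df, as.character))
  header &lt;- paste0("| ", paste(names(df), collapse = " | "), " |")
  rule   &lt;- paste0("|", paste(rep("---", ncol(df)), collapse = "|"), "|")
  rows   &lt;- paste0("| ", do.call(paste, c(cells, sep = " | ")), " |")
  c(header, rule, rows)
}

# Hypothetical wrapper: markdown by default outside interactive sessions
show_table &lt;- function(df, markdown = !interactive()) {
  if (markdown) {
    cat(to_markdown(df), sep = "\n")
  } else {
    print(df)
  }
  invisible(df)
}

demo &lt;- data.frame(item = c("D", "B"), count = c(762L, 663L))
show_table(demo)                    # format depends on interactive()
show_table(demo, markdown = FALSE)  # always the plain console format</code></pre></div>
<p>Passing <code>markdown</code> explicitly overrides the session-based default, which is useful when the same script runs both in the console and in a report.</p>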
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#introduction">Introduction</a></li>
<li><a href="#frequencies-of-one-variable">Frequencies of one variable</a></li>
<li><a href="#frequencies-of-more-than-one-variable">Frequencies of more than one variable</a></li>
<li><a href="#frequencies-of-numeric-values">Frequencies of numeric values</a></li>
<li><a href="#frequencies-of-factors">Frequencies of factors</a></li>
<li><a href="#frequencies-of-dates">Frequencies of dates</a></li>
<li><a href="#assigning-a-frequency-table-to-an-object">Assigning a frequency table to an object</a></li>
<li><a href="#additional-parameters">Additional parameters</a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p>Developed by <a href="https://www.rug.nl/staff/m.s.berends/">Matthijs S. Berends</a>, <a href="https://www.rug.nl/staff/c.f.luz/">Christian F. Luz</a>, <a href="https://www.rug.nl/staff/c.glasner/">Corinna Glasner</a>, <a href="https://www.rug.nl/staff/a.w.friedrich/">Alex W. Friedrich</a>, <a href="https://www.rug.nl/staff/b.sinha/">Bhanu N. M. Sinha</a>.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.3.0.</p>
</div>
</footer>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/docsearch.js/2.6.1/docsearch.min.js" integrity="sha256-GKvGqXDznoRYHCwKXGnuchvKSwmx9SRMrZOTh2g4Sb0=" crossorigin="anonymous"></script><script>
docsearch({
apiKey: 'f737050abfd4d726c63938e18f8c496e',
indexName: 'amr',
inputSelector: 'input#search-input.form-control',
transformData: function(hits) {
return hits.map(function (hit) {
hit.url = updateHitURL(hit);
return hit;
});
}
});
</script>
</body>
</html>