<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>How to create frequency tables • AMR (for R)</title>
<!-- favicons --><link rel="icon" type="image/png" sizes="16x16" href="../favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="../favicon-32x32.png">
<link rel="apple-touch-icon" type="image/png" sizes="180x180" href="../apple-touch-icon.png">
<link rel="apple-touch-icon" type="image/png" sizes="120x120" href="../apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" type="image/png" sizes="76x76" href="../apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" type="image/png" sizes="60x60" href="../apple-touch-icon-60x60.png">
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.3.7/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script><!-- sticky kit --><script src="https://cdnjs.cloudflare.com/ajax/libs/sticky-kit/1.1.3/sticky-kit.min.js" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><!-- docsearch --><script src="../docsearch.js"></script><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/docsearch.js/2.6.1/docsearch.min.css" integrity="sha256-QOSRU/ra9ActyXkIBbiIB144aDBdtvXBcNc3OTNuX/Q=" crossorigin="anonymous">
<link href="../docsearch.css" rel="stylesheet">
<script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/jquery.mark.min.js" integrity="sha256-4HLtjeVgH0eIB3aZ9mLYF6E8oU5chNdjU6p6rrXpl9U=" crossorigin="anonymous"></script><link href="../extra.css" rel="stylesheet">
<script src="../extra.js"></script><meta property="og:title" content="How to create frequency tables">
<meta property="og:description" content="">
<meta property="og:image" content="https://msberends.gitlab.io/AMR/logo.png">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">AMR (for R)</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.6.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fa fa-home"></span>
Home
</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
<span class="fa fa-question-circle"></span>
How to
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/AMR.html">
<span class="fa fa-directions"></span>
Conduct AMR analysis
</a>
</li>
<li>
<a href="../articles/resistance_predict.html">
<span class="fa fa-dice"></span>
Predict antimicrobial resistance
</a>
</li>
<li>
<a href="../articles/WHONET.html">
<span class="fa fa-globe-americas"></span>
Work with WHONET data
</a>
</li>
<li>
<a href="../articles/SPSS.html">
<span class="fa fa-file-upload"></span>
Import data from SPSS/SAS/Stata
</a>
</li>
<li>
<a href="../articles/EUCAST.html">
<span class="fa fa-exchange-alt"></span>
Apply EUCAST rules
</a>
</li>
<li>
<a href="../reference/mo_property.html">
<span class="fa fa-bug"></span>
Get properties of a microorganism
</a>
</li>
<li>
<a href="../reference/atc_property.html">
<span class="fa fa-capsules"></span>
Get properties of an antibiotic
</a>
</li>
<li>
<a href="../articles/freq.html">
<span class="fa fa-sort-amount-down"></span>
Create frequency tables
</a>
</li>
<li>
<a href="../articles/G_test.html">
<span class="fa fa-clipboard-check"></span>
Use the G-test
</a>
</li>
<li>
<a href="../articles/benchmarks.html">
<span class="fa fa-shipping-fast"></span>
Other: benchmarks
</a>
</li>
</ul>
</li>
<li>
<a href="../reference/">
<span class="fa fa-book-open"></span>
Manual
</a>
</li>
<li>
<a href="../authors.html">
<span class="fa fa-users"></span>
Authors
</a>
</li>
<li>
<a href="../news/">
<span class="far fa far fa-newspaper"></span>
Changelog
</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
<li>
<a href="https://gitlab.com/msberends/AMR">
<span class="fab fa fab fa-gitlab"></span>
Source Code
</a>
</li>
<li>
<a href="../LICENSE-text.html">
<span class="fa fa-book"></span>
Licence
</a>
</li>
</ul>
<form class="navbar-form navbar-right" role="search">
<div class="form-group">
<input type="search" class="form-control" name="search-input" id="search-input" placeholder="Search..." aria-label="Search for..." autocomplete="off">
</div>
</form>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>How to create frequency tables</h1>
<h4 class="author">Matthijs S. Berends</h4>
<h4 class="date">27 March 2019</h4>
<div class="hidden name"><code>freq.Rmd</code></div>
</div>
<div id="introduction" class="section level2">
<h2 class="hasAnchor">
<a href="#introduction" class="anchor"></a>Introduction</h2>
<p>Frequency tables (or frequency distributions) are summaries of the distribution of values in a sample. With the <code>freq</code> function, you can create univariate frequency tables. Multiple variables will be pasted into one variable, which forces a univariate distribution. We take the <code>septic_patients</code> dataset (included in this AMR package) as an example.</p>
</div>
<div id="frequencies-of-one-variable" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-one-variable" class="anchor"></a>Frequencies of one variable</h2>
<p>To quickly review the content of a single variable, you can select it in various ways. Let's say we want to get the frequencies of the <code>gender</code> variable of the <code>septic_patients</code> dataset:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1">septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(gender)</a></code></pre></div>
<p><strong>Frequency table of <code>gender</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>character</code> (<code>character</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 2</p>
<p>Shortest: 1<br>
Longest: 1</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">M</td>
<td align="right">1,031</td>
<td align="right">51.6%</td>
<td align="right">1,031</td>
<td align="right">51.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">F</td>
<td align="right">969</td>
<td align="right">48.5%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
<p>This immediately shows the class of the variable, its length and availability (i.e. the number of <code>NA</code> values), the number of unique values and (most importantly) that among septic patients men are more prevalent than women.</p>
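<p>As a rough illustration of what the columns of such a table contain, the counts, percentages and cumulative values can be rebuilt with base R alone. This is a minimal sketch on a hypothetical toy vector, not the implementation of <code>freq</code>; the object name <code>freq_tbl</code> and its column names are made up for the example.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy vector, not the septic_patients data
gender &lt;- c("M", "F", "M", "M", "F")

# Count each value and sort on count, highest first
counts &lt;- sort(table(gender), decreasing = TRUE)

freq_tbl &lt;- data.frame(
  item    = names(counts),
  count   = as.integer(counts),
  percent = as.integer(counts) / length(gender),
  stringsAsFactors = FALSE
)

# Running totals give the cumulative columns
freq_tbl$cum_count   &lt;- cumsum(freq_tbl$count)
freq_tbl$cum_percent &lt;- cumsum(freq_tbl$percent)
freq_tbl</code></pre></div>
<p>The last row of the cumulative columns always equals the total length and 100%, which is a quick sanity check for any frequency table.</p>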
</div>
<div id="frequencies-of-more-than-one-variable" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-more-than-one-variable" class="anchor"></a>Frequencies of more than one variable</h2>
<p>Multiple variables will be pasted into one variable to review individual cases, keeping a univariate frequency table.</p>
<p>For illustration, we could add some more variables to the <code>septic_patients</code> dataset to learn about bacterial properties:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" title="1">my_patients &lt;-<span class="st"> </span>septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/join.html">left_join_microorganisms</a></span>()</a>
<a class="sourceLine" id="cb2-2" title="2"><span class="co"># Joining, by = "mo"</span></a></code></pre></div>
<p>Now all variables of the <code>microorganisms</code> dataset have been joined to the <code>septic_patients</code> dataset. The <code>microorganisms</code> dataset consists of the following variables:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/colnames">colnames</a></span>(microorganisms)</a>
<a class="sourceLine" id="cb3-2" title="2"><span class="co"># [1] "mo" "col_id" "fullname" "kingdom" "phylum" </span></a>
<a class="sourceLine" id="cb3-3" title="3"><span class="co"># [6] "class" "order" "family" "genus" "species" </span></a>
<a class="sourceLine" id="cb3-4" title="4"><span class="co"># [11] "subspecies" "rank" "ref" "species_id" "source" </span></a>
<a class="sourceLine" id="cb3-5" title="5"><span class="co"># [16] "prevalence"</span></a></code></pre></div>
<p>If we compare the dimensions of the old and new datasets, we can see that 15 variables were added (all variables of <code>microorganisms</code> except the join key <code>mo</code>):</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(septic_patients)</a>
<a class="sourceLine" id="cb4-2" title="2"><span class="co"># [1] 2000 49</span></a>
<a class="sourceLine" id="cb4-3" title="3"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(my_patients)</a>
<a class="sourceLine" id="cb4-4" title="4"><span class="co"># [1] 2000 64</span></a></code></pre></div>
<p>So now the <code>genus</code> and <code>species</code> variables are available. A frequency table of these combined variables can be created like this:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb5-1" title="1">my_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb5-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(genus, species, <span class="dt">nmax =</span> <span class="dv">15</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>genus</code> and <code>species</code> from a <code>data.frame</code> (2,000 x 64)</strong></p>
<p>Columns: 2<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 95</p>
<p>Shortest: 8<br>
Longest: 34</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">Escherichia coli</td>
<td align="right">467</td>
<td align="right">23.4%</td>
<td align="right">467</td>
<td align="right">23.4%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">Staphylococcus coagulase-negative</td>
<td align="right">313</td>
<td align="right">15.7%</td>
<td align="right">780</td>
<td align="right">39.0%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">Staphylococcus aureus</td>
<td align="right">235</td>
<td align="right">11.8%</td>
<td align="right">1,015</td>
<td align="right">50.7%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">Staphylococcus epidermidis</td>
<td align="right">174</td>
<td align="right">8.7%</td>
<td align="right">1,189</td>
<td align="right">59.5%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="left">Streptococcus pneumoniae</td>
<td align="right">117</td>
<td align="right">5.9%</td>
<td align="right">1,306</td>
<td align="right">65.3%</td>
</tr>
<tr class="even">
<td align="left">6</td>
<td align="left">Staphylococcus hominis</td>
<td align="right">81</td>
<td align="right">4.1%</td>
<td align="right">1,387</td>
<td align="right">69.4%</td>
</tr>
<tr class="odd">
<td align="left">7</td>
<td align="left">Klebsiella pneumoniae</td>
<td align="right">58</td>
<td align="right">2.9%</td>
<td align="right">1,445</td>
<td align="right">72.3%</td>
</tr>
<tr class="even">
<td align="left">8</td>
<td align="left">Enterococcus faecalis</td>
<td align="right">39</td>
<td align="right">2.0%</td>
<td align="right">1,484</td>
<td align="right">74.2%</td>
</tr>
<tr class="odd">
<td align="left">9</td>
<td align="left">Proteus mirabilis</td>
<td align="right">36</td>
<td align="right">1.8%</td>
<td align="right">1,520</td>
<td align="right">76.0%</td>
</tr>
<tr class="even">
<td align="left">10</td>
<td align="left">Pseudomonas aeruginosa</td>
<td align="right">30</td>
<td align="right">1.5%</td>
<td align="right">1,550</td>
<td align="right">77.5%</td>
</tr>
<tr class="odd">
<td align="left">11</td>
<td align="left">Serratia marcescens</td>
<td align="right">25</td>
<td align="right">1.3%</td>
<td align="right">1,575</td>
<td align="right">78.8%</td>
</tr>
<tr class="even">
<td align="left">12</td>
<td align="left">Enterobacter cloacae</td>
<td align="right">23</td>
<td align="right">1.2%</td>
<td align="right">1,598</td>
<td align="right">79.9%</td>
</tr>
<tr class="odd">
<td align="left">13</td>
<td align="left">Enterococcus faecium</td>
<td align="right">21</td>
<td align="right">1.1%</td>
<td align="right">1,619</td>
<td align="right">81.0%</td>
</tr>
<tr class="even">
<td align="left">14</td>
<td align="left">Staphylococcus capitis</td>
<td align="right">21</td>
<td align="right">1.1%</td>
<td align="right">1,640</td>
<td align="right">82.0%</td>
</tr>
<tr class="odd">
<td align="left">15</td>
<td align="left">Bacteroides fragilis</td>
<td align="right">20</td>
<td align="right">1.0%</td>
<td align="right">1,660</td>
<td align="right">83.0%</td>
</tr>
</tbody>
</table>
<p>(omitted 80 entries, n = 340 [17.0%])</p>
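<p>The pasting of several variables into one, as described at the start of this section, can be imitated in base R. This is only a sketch on hypothetical toy data; it assumes the values are joined with a single space, which is what the items in the table above suggest, and it is not the code that <code>freq</code> runs internally.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data; the column names mirror genus and species
toy &lt;- data.frame(
  genus   = c("Escherichia", "Staphylococcus", "Escherichia"),
  species = c("coli", "aureus", "coli"),
  stringsAsFactors = FALSE
)

# Paste both columns into one variable, so one row is one item
combined &lt;- paste(toy$genus, toy$species)

# A univariate frequency count of the combined variable
counts &lt;- sort(table(combined), decreasing = TRUE)
counts</code></pre></div>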
</div>
<div id="frequencies-of-numeric-values" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-numeric-values" class="anchor"></a>Frequencies of numeric values</h2>
<p>Frequency tables can be created from any input.</p>
<p>For numeric values (like integers, doubles, etc.), additional descriptive statistics will be calculated and shown in the header:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" title="1"><span class="co"># get age distribution of unique patients</span></a>
<a class="sourceLine" id="cb6-2" title="2">septic_patients <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb6-3" title="3"><span class="st"> </span><span class="kw">distinct</span>(patient_id, <span class="dt">.keep_all =</span> <span class="ot">TRUE</span>) <span class="op">%&gt;%</span><span class="st"> </span></a>
<a class="sourceLine" id="cb6-4" title="4"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(age, <span class="dt">nmax =</span> <span class="dv">5</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>age</code> from a <code>data.frame</code> (981 x 49)</strong></p>
<p>Class: <code>numeric</code> (<code>numeric</code>)<br>
Length: 981 (of which NA: 0 = 0.00%)<br>
Unique: 73</p>
<p>Mean: 71.08<br>
SD: 14.05 (CV: 0.20, MAD: 13.34)<br>
Five-Num: 14 | 63 | 74 | 82 | 97 (IQR: 19, CQV: 0.13)<br>
Outliers: 15 (unique count: 12)</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="right">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="right">83</td>
<td align="right">44</td>
<td align="right">4.5%</td>
<td align="right">44</td>
<td align="right">4.5%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="right">76</td>
<td align="right">43</td>
<td align="right">4.4%</td>
<td align="right">87</td>
<td align="right">8.9%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="right">75</td>
<td align="right">37</td>
<td align="right">3.8%</td>
<td align="right">124</td>
<td align="right">12.6%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="right">82</td>
<td align="right">33</td>
<td align="right">3.4%</td>
<td align="right">157</td>
<td align="right">16.0%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="right">78</td>
<td align="right">32</td>
<td align="right">3.3%</td>
<td align="right">189</td>
<td align="right">19.3%</td>
</tr>
</tbody>
</table>
<p>(omitted 68 entries, n = 792 [80.7%])</p>
<p>So the following properties are determined, where <code>NA</code> values are always ignored:</p>
<ul>
<li><p><strong>Mean</strong></p></li>
<li><p><strong>Standard deviation</strong></p></li>
<li><p><strong>Coefficient of variation</strong> (CV), the standard deviation divided by the mean</p></li>
<li><p><strong>Five numbers of Tukey</strong> (min, Q1, median, Q3, max)</p></li>
<li><p><strong>Coefficient of quartile variation</strong> (CQV, sometimes called coefficient of dispersion), calculated as (Q3 - Q1) / (Q3 + Q1), using <code>quantile()</code> with <code>type = 6</code> as the quantile algorithm to comply with SPSS standards</p></li>
<li><p><strong>Outliers</strong> (total count and unique count)</p></li>
</ul>
<p>So for example, the above frequency table quickly shows the median age of patients being 74.</p>
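<p>Most of the statistics listed above can be reproduced with base R. The following is a minimal sketch on a hypothetical toy vector, not the package's own code. Note that <code>fivenum()</code> returns Tukey's hinges, which can differ slightly from the quartiles that <code>quantile()</code> returns with <code>type = 6</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy ages; NA values are dropped first
age &lt;- c(14, 63, 70, 74, 74, 82, 90, 97, NA)
x &lt;- age[!is.na(age)]

# Coefficient of variation: standard deviation divided by the mean
cv &lt;- sd(x) / mean(x)

# Quartiles with quantile algorithm type 6
q &lt;- quantile(x, probs = c(0.25, 0.75), type = 6, names = FALSE)

# Coefficient of quartile variation: (Q3 - Q1) / (Q3 + Q1)
cqv &lt;- (q[2] - q[1]) / (q[2] + q[1])

# Tukey's five numbers: min, lower hinge, median, upper hinge, max
five &lt;- fivenum(x)</code></pre></div>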
</div>
<div id="frequencies-of-factors" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-factors" class="anchor"></a>Frequencies of factors</h2>
<p>To sort frequencies of factors on factor level instead of item count, set the <code>sort.count</code> parameter to <code>FALSE</code>.</p>
<p><code>sort.count</code> is <code>TRUE</code> by default. Compare this default behaviour…</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb7-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code><br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">762</td>
<td align="right">38.1%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">1,425</td>
<td align="right">71.3%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.1%</td>
<td align="right">1,746</td>
<td align="right">87.3%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
<p>… with this, where items are now sorted on factor level:</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb8-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">sort.count =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code><br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.1%</td>
<td align="right">321</td>
<td align="right">16.1%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">984</td>
<td align="right">49.2%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">1,238</td>
<td align="right">61.9%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
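<p>The two orderings compared above can be imitated with base R on a small factor. This is a sketch with hypothetical toy data and says nothing about how <code>sort.count</code> is implemented: <code>table()</code> keeps the order of the factor levels, while <code>sort()</code> reorders on count.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy factor with four levels, one of them unused
hospital &lt;- factor(c("D", "B", "D", "A", "D", "B"),
                   levels = c("A", "B", "C", "D"))

# Order of the factor levels: A, B, C, D
by_level &lt;- table(hospital)

# Order of the counts, highest first: D, B, A, C
by_count &lt;- sort(by_level, decreasing = TRUE)</code></pre></div>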
<p>All classes of a variable will be printed in the header (the default for <code>header</code> is <code>FALSE</code> when using markdown like this document). Variables with the <code>rsi</code> class of this AMR package are actually ordered factors and have three classes (look at <code>Class</code> in the header):</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb9-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb9-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(amox, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>amox</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> &gt; <code>ordered</code> &gt; <code>rsi</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 771 = 38.55%)<br>
Levels: 3: <code>S</code> &lt; <code>I</code> &lt; <code>R</code><br>
Unique: 3</p>
<p>Drug: Amoxicillin<br>
%IR: 55.82%</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">R</td>
<td align="right">683</td>
<td align="right">55.6%</td>
<td align="right">683</td>
<td align="right">55.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">S</td>
<td align="right">543</td>
<td align="right">44.2%</td>
<td align="right">1,226</td>
<td align="right">99.8%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">I</td>
<td align="right">3</td>
<td align="right">0.2%</td>
<td align="right">1,229</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
</div>
<div id="frequencies-of-dates" class="section level2">
<h2 class="hasAnchor">
<a href="#frequencies-of-dates" class="anchor"></a>Frequencies of dates</h2>
<p>Frequencies of dates will show the oldest and newest date in the data, and the number of days between them:</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb10-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb10-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(date, <span class="dt">nmax =</span> <span class="dv">5</span>, <span class="dt">header =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>date</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>Date</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Unique: 1,140</p>
<p>Oldest: 2 January 2002<br>
Newest: 28 December 2017 (+5,839)<br>
Median: 31 July 2009 (47.39%)</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">2016-05-21</td>
<td align="right">10</td>
<td align="right">0.5%</td>
<td align="right">10</td>
<td align="right">0.5%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">2004-11-15</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">18</td>
<td align="right">0.9%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">2013-07-29</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">26</td>
<td align="right">1.3%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">2017-06-12</td>
<td align="right">8</td>
<td align="right">0.4%</td>
<td align="right">34</td>
<td align="right">1.7%</td>
</tr>
<tr class="odd">
<td align="left">5</td>
<td align="left">2015-11-19</td>
<td align="right">7</td>
<td align="right">0.4%</td>
<td align="right">41</td>
<td align="right">2.1%</td>
</tr>
</tbody>
</table>
<p>(omitted 1,135 entries, n = 1,959 [98.0%])</p>
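<p>The date range in the header can be checked with base R date arithmetic. This sketch uses the oldest and newest dates from the header above plus the median date as a hypothetical three-value sample; it is not the package's own code.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Three dates taken from the header above, used here as a toy sample
dates &lt;- as.Date(c("2002-01-02", "2009-07-31", "2017-12-28"))

oldest &lt;- min(dates)
newest &lt;- max(dates)

# Subtracting two Date objects gives the difference in days
span_days &lt;- as.integer(newest - oldest)
span_days
# [1] 5839</code></pre></div>
<p>The result matches the <code>(+5,839)</code> printed after the newest date.</p>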
</div>
<div id="assigning-a-frequency-table-to-an-object" class="section level2">
<h2 class="hasAnchor">
<a href="#assigning-a-frequency-table-to-an-object" class="anchor"></a>Assigning a frequency table to an object</h2>
<p>A frequency table is actually a regular <code>data.frame</code>, with the exception that it contains an additional class.</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb11-1" title="1">my_df &lt;-<span class="st"> </span>septic_patients <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(age)</a>
<a class="sourceLine" id="cb11-2" title="2"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/class">class</a></span>(my_df)</a></code></pre></div>
<p>[1] "frequency_tbl" "data.frame"</p>
<p>Because of this additional class, a frequency table prints like the examples above. But the object itself contains the complete table without a row limitation:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb12-1" title="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/dim">dim</a></span>(my_df)</a></code></pre></div>
<p>[1] 74 5</p>
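<p>How an additional class can change printing without changing the stored data can be sketched with a small S3 example. The class name <code>my_freq_tbl</code>, the <code>nmax</code> argument of the print method and the toy data are all hypothetical; this illustrates the general S3 mechanism, not the actual print method of this package.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy table with an extra class in front of data.frame
tbl &lt;- data.frame(item = c("a", "b", "c"), count = c(3L, 2L, 1L))
class(tbl) &lt;- c("my_freq_tbl", class(tbl))

# Hypothetical print method: show only the first nmax rows
print.my_freq_tbl &lt;- function(x, nmax = 2, ...) {
  print.data.frame(utils::head(x, nmax), ...)
  invisible(x)
}

print(tbl)   # prints two rows
dim(tbl)     # the object itself still holds all three rows</code></pre></div>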
</div>
<div id="additional-parameters" class="section level2">
<h2 class="hasAnchor">
<a href="#additional-parameters" class="anchor"></a>Additional parameters</h2>
<div id="parameter-na-rm" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-na-rm" class="anchor"></a>Parameter <code>na.rm</code>
</h3>
<p>With the <code>na.rm</code> parameter, you can include <code>NA</code> values in the frequency table. It defaults to <code>TRUE</code>, which leaves them out of the table, but their count is always shown in the header:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb13-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb13-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(amox, <span class="dt">na.rm =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>amox</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> &gt; <code>ordered</code> &gt; <code>rsi</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 771 = 38.55%)<br>
Levels: 3: <code>S</code> &lt; <code>I</code> &lt; <code>R</code><br>
Unique: 4</p>
<p>Drug: Amoxicillin<br>
%IR: 55.82%</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">(NA)</td>
<td align="right">771</td>
<td align="right">38.6%</td>
<td align="right">771</td>
<td align="right">38.6%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">R</td>
<td align="right">683</td>
<td align="right">34.2%</td>
<td align="right">1,454</td>
<td align="right">72.7%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">S</td>
<td align="right">543</td>
<td align="right">27.2%</td>
<td align="right">1,997</td>
<td align="right">99.9%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">I</td>
<td align="right">3</td>
<td align="right">0.2%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
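<p>The header figures can be checked against the table with a little arithmetic. The sketch below uses plain base R with the counts copied from the output above; the variable names are only illustrative. It suggests that the <code>NA</code> share is taken over all 2,000 observations, whereas <code>%IR</code> is taken over the non-missing values only:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Counts copied from the frequency table above
counts &lt;- c(R = 683, S = 543, I = 3)
n_na   &lt;- 771

n_present &lt;- sum(counts)        # 1229 non-missing observations
n_all     &lt;- n_present + n_na   # 2000 observations in total

# Share of missing values, as reported in the header
100 * n_na / n_all              # 38.55

# Share of I and R among the non-missing values
round(100 * (counts[["I"]] + counts[["R"]]) / n_present, 2)  # 55.82</code></pre></div>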
</div>
<div id="parameter-row-names" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-row-names" class="anchor"></a>Parameter <code>row.names</code>
</h3>
<p>By default, frequency tables show row indices. To remove them, use <code>row.names = FALSE</code>:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb14-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb14-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">row.names =</span> <span class="ot">FALSE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code><br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">762</td>
<td align="right">38.1%</td>
</tr>
<tr class="even">
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">1,425</td>
<td align="right">71.3%</td>
</tr>
<tr class="odd">
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.1%</td>
<td align="right">1,746</td>
<td align="right">87.3%</td>
</tr>
<tr class="even">
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
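<p>The <code>Cum. Count</code> and <code>Cum. Percent</code> columns are running totals over the sorted counts. As a rough base R check using the counts from the table above (this is not how <code>freq()</code> itself is implemented, and the variable names are only illustrative):</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Counts copied from the table above, already sorted from high to low
counts &lt;- c(D = 762, B = 663, A = 321, C = 254)

cum_count   &lt;- cumsum(counts)                 # 762 1425 1746 2000
cum_percent &lt;- 100 * cum_count / sum(counts)  # running share of all 2,000 rows

# Combine into a plain data.frame; row.names = NULL drops the item names
# as row labels and falls back to the default numeric row indices
data.frame(item = names(counts), count = counts,
           cum_count, cum_percent, row.names = NULL)</code></pre></div>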
</div>
<div id="parameter-markdown" class="section level3">
<h3 class="hasAnchor">
<a href="#parameter-markdown" class="anchor"></a>Parameter <code>markdown</code>
</h3>
<p>The <code>markdown</code> parameter is <code>TRUE</code> by default in non-interactive sessions, such as reports created with R Markdown. In that mode all rows are printed, unless <code>nmax</code> is set.</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb15-1" title="1">septic_patients <span class="op">%&gt;%</span></a>
<a class="sourceLine" id="cb15-2" title="2"><span class="st"> </span><span class="kw"><a href="../reference/freq.html">freq</a></span>(hospital_id, <span class="dt">markdown =</span> <span class="ot">TRUE</span>)</a></code></pre></div>
<p><strong>Frequency table of <code>hospital_id</code> from a <code>data.frame</code> (2,000 x 49)</strong></p>
<p>Class: <code>factor</code> (<code>numeric</code>)<br>
Length: 2,000 (of which NA: 0 = 0.00%)<br>
Levels: 4: <code>A</code>, <code>B</code>, <code>C</code>, <code>D</code><br>
Unique: 4</p>
<table class="table">
<thead><tr class="header">
<th align="left"></th>
<th align="left">Item</th>
<th align="right">Count</th>
<th align="right">Percent</th>
<th align="right">Cum. Count</th>
<th align="right">Cum. Percent</th>
</tr></thead>
<tbody>
<tr class="odd">
<td align="left">1</td>
<td align="left">D</td>
<td align="right">762</td>
<td align="right">38.1%</td>
<td align="right">762</td>
<td align="right">38.1%</td>
</tr>
<tr class="even">
<td align="left">2</td>
<td align="left">B</td>
<td align="right">663</td>
<td align="right">33.2%</td>
<td align="right">1,425</td>
<td align="right">71.3%</td>
</tr>
<tr class="odd">
<td align="left">3</td>
<td align="left">A</td>
<td align="right">321</td>
<td align="right">16.1%</td>
<td align="right">1,746</td>
<td align="right">87.3%</td>
</tr>
<tr class="even">
<td align="left">4</td>
<td align="left">C</td>
<td align="right">254</td>
<td align="right">12.7%</td>
<td align="right">2,000</td>
<td align="right">100.0%</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#introduction">Introduction</a></li>
<li><a href="#frequencies-of-one-variable">Frequencies of one variable</a></li>
<li><a href="#frequencies-of-more-than-one-variable">Frequencies of more than one variable</a></li>
<li><a href="#frequencies-of-numeric-values">Frequencies of numeric values</a></li>
<li><a href="#frequencies-of-factors">Frequencies of factors</a></li>
<li><a href="#frequencies-of-dates">Frequencies of dates</a></li>
<li><a href="#assigning-a-frequency-table-to-an-object">Assigning a frequency table to an object</a></li>
<li><a href="#additional-parameters">Additional parameters</a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p>Developed by <a href="https://www.rug.nl/staff/m.s.berends/">Matthijs S. Berends</a>, <a href="https://www.rug.nl/staff/c.f.luz/">Christian F. Luz</a>, <a href="https://www.rug.nl/staff/c.glasner/">Corinna Glasner</a>, <a href="https://www.rug.nl/staff/a.w.friedrich/">Alex W. Friedrich</a>, <a href="https://www.rug.nl/staff/b.sinha/">Bhanu N. M. Sinha</a>.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.3.0.</p>
</div>
</footer>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/docsearch.js/2.6.1/docsearch.min.js" integrity="sha256-GKvGqXDznoRYHCwKXGnuchvKSwmx9SRMrZOTh2g4Sb0=" crossorigin="anonymous"></script><script>
docsearch({
apiKey: 'f737050abfd4d726c63938e18f8c496e',
indexName: 'amr',
inputSelector: 'input#search-input.form-control',
transformData: function(hits) {
return hits.map(function (hit) {
hit.url = updateHitURL(hit);
return hit;
});
}
});
</script>
</body>
</html>