AMR/man/freq.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/freq.R
\name{freq}
\alias{freq}
\alias{frequency_tbl}
\alias{top_freq}
\title{Frequency table}
\usage{
freq(x, sort.count = TRUE, nmax = getOption("max.print.freq"),
  na.rm = TRUE, row.names = TRUE, markdown = FALSE,
  as.data.frame = FALSE, digits = 2, sep = " ")

frequency_tbl(x, sort.count = TRUE, nmax = getOption("max.print.freq"),
  na.rm = TRUE, row.names = TRUE, markdown = FALSE,
  as.data.frame = FALSE, digits = 2, sep = " ")

top_freq(f, n)
}
\arguments{
\item{x}{a vector of data, or a data frame with a maximum of 9 columns}

\item{sort.count}{sort on count. Use \code{FALSE} to sort alphabetically on item.}

\item{nmax}{number of rows to print. The default, \code{15}, uses \code{\link{getOption}("max.print.freq")}. Use \code{nmax = 0} or \code{nmax = NA} to print all rows.}

\item{na.rm}{a logical value indicating whether \code{NA} values should be removed from the frequency table. The header always prints the number of \code{NA}s.}

\item{row.names}{a logical value indicating whether row indices should be printed as \code{1:nrow(x)}}

\item{markdown}{print table in markdown format (this forces \code{nmax = NA})}

\item{as.data.frame}{return frequency table without header as a \code{data.frame} (e.g. to assign the table to an object)}

\item{digits}{how many significant digits are to be used for numeric values (not for the items themselves, that depends on \code{\link{getOption}("digits")})}

\item{sep}{a character string to separate the terms when selecting multiple columns}

\item{f}{a frequency table as \code{data.frame}, used as \code{freq(..., as.data.frame = TRUE)}}

\item{n}{number of top \emph{n} items to return; use \code{-n} for the bottom \emph{n} items. More than \code{n} rows will be returned if there are ties.}
}
\value{
\itemize{
  \item{When using \code{as.data.frame = FALSE} (default): only printed text}
  \item{When using \code{as.data.frame = TRUE}: a \code{data.frame} object with an additional class \code{"frequency_tbl"}}
}
}
\description{
Create a frequency table of a vector of data, a single column or a maximum of 9 columns of a data frame. Supports markdown for reports. \code{top_freq} can be used to get the top/bottom \emph{n} items of a frequency table, with counts as names.
}
\details{
This package also includes a vignette about this function; run \code{browseVignettes("AMR")} to read it.

For numeric values of any class, these additional values will be calculated and shown in the header:
\itemize{
  \item{Mean, using \code{\link[base]{mean}}}
  \item{Standard deviation, using \code{\link[stats]{sd}}}
  \item{Tukey's five-number summary (min, Q1, median, Q3, max), using \code{\link[stats]{fivenum}}}
  \item{Outliers (total count and unique count), using \code{\link{boxplot.stats}}}
  \item{Coefficient of variation (CV), the standard deviation divided by the mean}
  \item{Coefficient of quartile variation (CQV, sometimes called coefficient of dispersion), calculated as \code{(Q3 - Q1) / (Q3 + Q1)} using \code{\link{quantile}} with \code{type = 6} as quantile algorithm to comply with SPSS standards}
}
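
As a rough illustration of the last two items, the CV and CQV could be computed by hand along the following lines. This is a sketch under stated assumptions, not the package's internal code: the vector \code{x} and the names \code{q}, \code{cv} and \code{cqv} are made up for this example.
\preformatted{
# hypothetical numeric data, for illustration only
x <- c(2, 4, 4, 5, 7, 9, 10, 12)

# coefficient of variation: standard deviation divided by the mean
cv <- sd(x) / mean(x)

# first and third quartile, with type = 6 as quantile algorithm
q <- quantile(x, probs = c(0.25, 0.75), type = 6)

# coefficient of quartile variation: (Q3 - Q1) / (Q3 + Q1)
cqv <- (q[[2]] - q[[1]]) / (q[[2]] + q[[1]])
}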

For dates and times of any class, these additional values will be calculated and shown in the header:
\itemize{
  \item{Oldest, using \code{\link[base]{min}}}
  \item{Newest, using \code{\link[base]{max}}, with difference between newest and oldest}
  \item{Median, using \code{\link[stats]{median}}, with percentage since oldest}
}

The function \code{top_freq} uses \code{\link[dplyr]{top_n}} internally and will include more than \code{n} rows if there are ties.
}
\examples{
library(dplyr)

freq(septic_patients$hospital_id)

septic_patients \%>\%
  filter(hospital_id == "A") \%>\%
  select(bactid) \%>\%
  freq()

# select multiple columns; they will be pasted together
septic_patients \%>\%
  left_join_microorganisms() \%>\%
  filter(hospital_id == "A") \%>\%
  select(genus, species) \%>\%
  freq()

# save frequency table to an object
years <- septic_patients \%>\%
  mutate(year = format(date, "\%Y")) \%>\%
  select(year) \%>\%
  freq(as.data.frame = TRUE)

# get top 10 bugs of hospital A as a vector
septic_patients \%>\%
  filter(hospital_id == "A") \%>\%
  select(bactid) \%>\%
  freq(as.data.frame = TRUE) \%>\%
  top_freq(10)
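
# a sketch based on the arguments documented above: sort alphabetically
# on item instead of on count, and print all rows
freq(septic_patients$hospital_id, sort.count = FALSE, nmax = NA)

# print the table in markdown format, e.g. for use in a report
freq(septic_patients$hospital_id, markdown = TRUE)

# get the bottom 3 bugs of hospital A by passing a negative n;
# more than 3 rows may be returned if there are ties
septic_patients \%>\%
  filter(hospital_id == "A") \%>\%
  select(bactid) \%>\%
  freq(as.data.frame = TRUE) \%>\%
  top_freq(-3)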
}
\keyword{freq}
\keyword{frequency}
\keyword{summarise}
\keyword{summary}