mirror of https://github.com/msberends/AMR.git synced 2025-07-08 16:42:10 +02:00

(v1.4.0.9025) is_new_episode()

2020-11-23 21:50:27 +01:00
parent 363218da7e
commit b045b571a6
29 changed files with 706 additions and 366 deletions

@ -4,7 +4,6 @@
\alias{first_isolate}
\alias{filter_first_isolate}
\alias{filter_first_weighted_isolate}
\alias{is_new_episode}
\title{Determine first (weighted) isolates}
\source{
Methodology of this function is strictly based on:
@ -49,13 +48,6 @@ filter_first_weighted_isolate(
col_keyantibiotics = NULL,
...
)
is_new_episode(
.data,
episode_days = 365,
col_date = NULL,
col_patient_id = NULL
)
}
\arguments{
\item{x, .data}{a \link{data.frame} containing isolates.}
@ -98,10 +90,10 @@ is_new_episode(
A \code{\link{logical}} vector
}
\description{
Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use \code{\link[=is_new_episode]{is_new_episode()}} that also supports grouping with the \code{dplyr} package, see \emph{Examples}.
Determine first (weighted) isolates of all microorganisms of every patient per episode and (if needed) per specimen type. To determine patient episodes not necessarily based on microorganisms, use \code{\link[=is_new_episode]{is_new_episode()}} that also supports grouping with the \code{dplyr} package.
}
\details{
The \code{\link[=is_new_episode]{is_new_episode()}} function is a wrapper around the \code{\link[=first_isolate]{first_isolate()}} function and can be used for data sets without isolates to just determine patient episodes based on any combination of grouping variables (using \code{dplyr}), please see \emph{Examples}. Since it runs \code{\link[=first_isolate]{first_isolate()}} for every group, it is quite slow.
The \code{\link[=first_isolate]{first_isolate()}} function is a wrapper around the \code{\link[=is_new_episode]{is_new_episode()}} function, but more efficient for data sets containing microorganism codes or names.
All isolates with a microbial ID of \code{NA} will be excluded as first isolate.
\subsection{Why this is so important}{
@ -191,42 +183,6 @@ if (require("dplyr")) {
# Gentamicin resistance in hospital D appears to be 3.7\% higher than
# when you (erroneously) would have used all isolates for analysis.
}
# filtering based on any other condition -----------------------------------
if (require("dplyr")) {
# is_new_episode() can be used in dplyr verbs to determine patient
# episodes based on any (combination of) grouping variables:
example_isolates \%>\%
mutate(condition = sample(x = c("A", "B", "C"),
size = 2000,
replace = TRUE)) \%>\%
group_by(condition) \%>\%
mutate(new_episode = is_new_episode())
example_isolates \%>\%
group_by(hospital_id) \%>\%
summarise(patients = n_distinct(patient_id),
n_episodes_365 = sum(is_new_episode(episode_days = 365)),
n_episodes_60 = sum(is_new_episode(episode_days = 60)),
n_episodes_30 = sum(is_new_episode(episode_days = 30)))
# grouping on microorganisms leads to the same results as first_isolate():
x <- example_isolates \%>\%
filter_first_isolate(include_unknown = TRUE)
y <- example_isolates \%>\%
group_by(mo) \%>\%
filter(is_new_episode())
identical(x$patient_id, y$patient_id)
# but now you can group on isolates and many more:
example_isolates \%>\%
group_by(mo, hospital_id, ward_icu) \%>\%
mutate(flag_episode = is_new_episode())
}
}
}
\seealso{

man/is_new_episode.Rd

@ -0,0 +1,80 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/is_new_episode.R
\name{is_new_episode}
\alias{is_new_episode}
\title{Determine (new) episodes for patients}
\usage{
is_new_episode(x, episode_days = 365, ...)
}
\arguments{
\item{x}{vector of dates (class \code{Date} or \code{POSIXt})}
\item{episode_days}{length of the required episode in days, defaults to 365. Every element in the input will return \code{TRUE} after this number of days has passed since the last included date, independent of calendar years. Please see \emph{Details}.}
\item{...}{arguments passed on to \code{\link[=as.Date]{as.Date()}}}
}
\value{
a \link{logical} vector
}
\description{
This function determines which items in a vector can be considered (the start of) a new episode, based on the parameter \code{episode_days}. This can be used to determine clinical episodes for any epidemiological analysis.
}
\details{
Dates are first sorted from old to new. The oldest date marks the start of the first episode. The next date that is at least \code{episode_days} days later than the start of the first episode marks the start of the second episode. The next date that is at least \code{episode_days} days later than the start of the second episode marks the start of the third episode, and so on. Before the vector is returned, the original order is restored.
The \code{dplyr} package is not required for this function to work, but this function works conveniently inside \code{dplyr} verbs such as \code{\link[=filter]{filter()}}, \code{\link[=mutate]{mutate()}} and \code{\link[=summarise]{summarise()}}.
}
\section{Experimental lifecycle}{
\if{html}{\figure{lifecycle_experimental.svg}{options: style=margin-bottom:5px} \cr}
The \link[=lifecycle]{lifecycle} of this function is \strong{experimental}. An experimental function is in early stages of development. The underlying code might be changing frequently. Experimental functions might be removed without deprecation, so you are generally best off waiting until a function is more mature before you use it in production code. Experimental functions are only available in development versions of this \code{AMR} package and will thus not be included in releases that are submitted to CRAN, since such functions have not yet matured enough.
}
\section{Read more on our website!}{
On our website \url{https://msberends.github.io/AMR/} you can find \href{https://msberends.github.io/AMR/articles/AMR.html}{a comprehensive tutorial} about how to conduct AMR analysis, the \href{https://msberends.github.io/AMR/reference/}{complete documentation of all functions} and \href{https://msberends.github.io/AMR/articles/WHONET.html}{an example analysis using WHONET data}. As we would like to better understand the backgrounds and needs of our users, please \href{https://msberends.github.io/AMR/survey.html}{participate in our survey}!
}
\examples{
# `example_isolates` is a dataset available in the AMR package.
# See ?example_isolates.
is_new_episode(example_isolates$date)
is_new_episode(example_isolates$date, episode_days = 60)
if (require("dplyr")) {
# is_new_episode() can also be used in dplyr verbs to determine patient
# episodes based on any (combination of) grouping variables:
example_isolates \%>\%
mutate(condition = sample(x = c("A", "B", "C"),
size = 2000,
replace = TRUE)) \%>\%
group_by(condition) \%>\%
mutate(new_episode = is_new_episode(date))
example_isolates \%>\%
group_by(hospital_id) \%>\%
summarise(patients = n_distinct(patient_id),
n_episodes_365 = sum(is_new_episode(date, episode_days = 365)),
n_episodes_60 = sum(is_new_episode(date, episode_days = 60)),
n_episodes_30 = sum(is_new_episode(date, episode_days = 30)))
# grouping on patients and microorganisms leads to the same results
# as first_isolate():
x <- example_isolates \%>\%
filter(first_isolate(., include_unknown = TRUE))
y <- example_isolates \%>\%
group_by(patient_id, mo) \%>\%
filter(is_new_episode(date))
identical(x$patient_id, y$patient_id)
# but is_new_episode() has a lot more flexibility than first_isolate(),
# since you can now group on anything that seems relevant:
example_isolates \%>\%
group_by(patient_id, mo, hospital_id, ward_icu) \%>\%
mutate(flag_episode = is_new_episode(date))
}
}
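
The episode rule described in the \details section of man/is_new_episode.Rd can be sketched in base R roughly as follows. This is only an illustration of the documented rule, not the code in R/is_new_episode.R: the name `sketch_new_episode()` is hypothetical, the sketch assumes the input contains no missing dates, and it reads "at least episode_days days later" as a `>=` comparison.

```r
# Hypothetical helper (not part of the AMR package): flags the start of each
# episode following the rule described in the \details section.
# Assumption: `x` contains no NA values.
sketch_new_episode <- function(x, episode_days = 365) {
  x <- as.Date(x)
  ord <- order(x)                      # sort from old to new
  sorted <- x[ord]
  flags <- logical(length(sorted))     # all FALSE to begin with
  if (length(sorted) == 0) {
    return(flags)
  }
  flags[1] <- TRUE                     # oldest date starts the first episode
  episode_start <- sorted[1]
  for (i in seq_along(sorted)[-1]) {
    days_since_start <- as.numeric(difftime(sorted[i], episode_start, units = "days"))
    if (days_since_start >= episode_days) {
      flags[i] <- TRUE                 # this date starts the next episode
      episode_start <- sorted[i]
    }
  }
  out <- logical(length(x))
  out[ord] <- flags                    # restore the original order
  out
}

dates <- as.Date(c("2020-03-15", "2020-01-01", "2020-01-20", "2021-02-01"))
sketch_new_episode(dates, episode_days = 60)
sketch_new_episode(dates, episode_days = 365)
```

With `episode_days = 60`, the sketch flags 2020-01-01, 2020-03-15 and 2021-02-01. With `episode_days = 365`, it flags only 2020-01-01 and 2021-02-01, because 2020-03-15 falls within 365 days of the first episode start.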

@ -3,9 +3,7 @@
\name{like}
\alias{like}
\alias{\%like\%}
\alias{\%not_like\%}
\alias{\%like_case\%}
\alias{\%not_like_case\%}
\title{Pattern matching with keyboard shortcut}
\source{
Idea from the \href{https://github.com/Rdatatable/data.table/blob/master/R/like.R}{\code{like} function from the \code{data.table} package}
@ -15,11 +13,7 @@ like(x, pattern, ignore.case = TRUE)
x \%like\% pattern
x \%not_like\% pattern
x \%like_case\% pattern
x \%not_like_case\% pattern
}
\arguments{
\item{x}{a character vector where matches are sought, or an object which can be coerced by \code{\link[=as.character]{as.character()}} to a character vector.}
@ -43,9 +37,7 @@ The \verb{\%like\%} function:
\item Tries again with \code{perl = TRUE} if regex fails
}
Using RStudio? This function can also be inserted in your code from the Addins menu and can have its own Keyboard Shortcut like \code{Ctrl+Shift+L} or \code{Cmd+Shift+L} (see \code{Tools} > \verb{Modify Keyboard Shortcuts...}). This addin iterates over all 'like' variants. So if you have defined the keyboard shortcut Ctrl/Cmd + L to this addin, it will first insert \verb{\%like\%} and by pressing it again it will be replaced with \verb{\%not_like\%}, then \verb{\%like_case\%}, then \verb{\%not_like_case\%} and then back to \verb{\%like\%}.
The \code{"\%not_like\%"} and \code{"\%not_like_case\%"} functions are wrappers around \code{"\%like\%"} and \code{"\%like_case\%"}.
Using RStudio? The text \verb{\%like\%} can also be directly inserted in your code from the Addins menu and can have its own Keyboard Shortcut like \code{Ctrl+Shift+L} or \code{Cmd+Shift+L} (see \code{Tools} > \verb{Modify Keyboard Shortcuts...}).
}
\section{Stable lifecycle}{
@ -80,11 +72,6 @@ a \%like\% b
if (require("dplyr")) {
example_isolates \%>\%
filter(mo_name(mo) \%like\% "^ent")
example_isolates \%>\%
mutate(group = case_when(hospital_id \%like\% "A|D" ~ "Group 1",
mo_name(mo) \%not_like\% "^Staph" ~ "Group 2a",
TRUE ~ "Group 2b"))
}
}
}