mirror of https://github.com/msberends/AMR.git synced 2025-07-09 14:21:51 +02:00

(v1.0.1.9002) PCA unit tests

2020-03-08 11:18:59 +01:00
parent 9fc858f208
commit 77656a676c
20 changed files with 182 additions and 135 deletions


@@ -11,7 +11,7 @@ As per their GPL-2 licence that demands documentation of code changes, the chang
\item Rewritten code to remove the dependency on packages \code{plyr}, \code{scales} and \code{grid}
\item Parametrised more options, like arrow and ellipse settings
\item Added total amount of explained variance as a caption in the plot
-\item Cleaned all syntax based on the \code{lintr} package
+\item Cleaned all syntax based on the \code{lintr} package and added integrity checks
\item Updated documentation
}
}
@@ -20,14 +20,15 @@ ggplot_pca(
x,
choices = 1:2,
scale = TRUE,
+pc.biplot = TRUE,
labels = NULL,
labels_textsize = 3,
labels_text_placement = 1.5,
groups = NULL,
-ellipse = FALSE,
+ellipse = TRUE,
ellipse_prob = 0.68,
ellipse_size = 0.5,
-ellipse_alpha = 0.25,
+ellipse_alpha = 0.5,
points_size = 2,
points_alpha = 0.25,
arrows = TRUE,
@@ -55,6 +56,14 @@ ggplot_pca(
will be issued if the specified \code{scale} is outside this range.
}
+\item{pc.biplot}{
+If true, use what Gabriel (1971) refers to as a "principal component
+biplot", with \code{lambda = 1} and observations scaled up by sqrt(n) and
+variables scaled down by sqrt(n). Then inner products between
+variables approximate covariances and distances between observations
+approximate Mahalanobis distance.
+}
\item{labels}{an optional vector of labels for the observations. If set, the labels will be placed below their respective points. When using the \code{\link[=pca]{pca()}} function as input for \code{x}, this will be determined automatically based on the attribute \code{non_numeric_cols}, see \code{\link[=pca]{pca()}}.}
\item{labels_textsize}{the size of the text used for the labels}
@@ -93,7 +102,7 @@ ggplot_pca(
This function is to produce a \code{ggplot2} variant of a so-called \href{https://en.wikipedia.org/wiki/Biplot}{biplot} for PCA (principal component analysis), but is more flexible and more appealing than the base \R \code{\link[=biplot]{biplot()}} function.
}
\details{
-The default colours for labels and points is set with \code{\link[=scale_colour_viridis_d]{scale_colour_viridis_d()}}, but these can be changed by adding another scale for colour, like \code{\link[=scale_colour_brewer]{scale_colour_brewer()}}.
+The colours for labels and points can be changed by adding another scale layer for colour, like \code{\link[=scale_colour_viridis_d]{scale_colour_viridis_d()}} or \code{\link[=scale_colour_brewer]{scale_colour_brewer()}}.
}
\section{Maturing lifecycle}{
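
Note on the \code{pc.biplot} argument documented in this file: the two approximations it describes can be checked numerically with base R alone. The block below is a minimal sketch and not code from this commit. It assumes the built-in \code{iris} data set, and it assumes the per-component factor for \code{scale = 1} with \code{pc.biplot = TRUE} reduces to \code{sdev}, which is my reading of base R \code{biplot.prcomp()}. All variable names are illustrative.

```r
# Illustrative check of the pc.biplot scaling (lambda = 1), base R only.
X <- as.matrix(iris[, 1:4])
p <- prcomp(X, center = TRUE, scale. = FALSE)

lam  <- p$sdev                          # assumed per-component scaling factor
obs  <- sweep(p$x, 2, lam, "/")         # observations: scores divided by sdev
vars <- sweep(p$rotation, 2, lam, "*")  # variables: loadings multiplied by sdev

# With all components kept, inner products between variables reproduce the
# covariance matrix. A 2-D biplot keeps only two components, which is why the
# documentation says "approximate".
cov_ok <- isTRUE(all.equal(vars %*% t(vars), cov(X), check.attributes = FALSE))

# Squared Euclidean distances from observation 1 in the scaled scores equal
# the squared Mahalanobis distances computed from the raw data.
d_scores <- rowSums(sweep(obs, 2, obs[1, ], "-")^2)
d_mahal  <- mahalanobis(X, center = X[1, ], cov = cov(X))
dist_ok  <- isTRUE(all.equal(unname(d_scores), unname(d_mahal)))
```

Both flags come out \code{TRUE} only because every component is retained; restricting \code{obs} and \code{vars} to the two plotted components turns the equalities into the approximations the argument text mentions.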


@@ -1,11 +1,10 @@
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/pca.R
-\name{prcomp.data.frame}
-\alias{prcomp.data.frame}
+\name{pca}
\alias{pca}
\title{Principal Component Analysis (for AMR)}
\usage{
-\method{prcomp}{data.frame}(
+pca(
x,
...,
retx = TRUE,
@@ -14,8 +13,6 @@
tol = NULL,
rank. = NULL
)
-pca(x, ...)
}
\arguments{
\item{x}{a \link{data.frame} containing numeric columns}
@@ -51,13 +48,16 @@ pca(x, ...)
alternative or in addition to \code{tol}, useful notably when the
desired rank is considerably smaller than the dimensions of the matrix.}
}
+\value{
+An object of classes \link{pca} and \link{prcomp}
+}
\description{
-Performs a principal component analysis (PCA) based on a data set with automatic determination for afterwards plotting the groups and labels.
+Performs a principal component analysis (PCA) based on a data set with automatic determination for afterwards plotting the groups and labels, and automatic filtering on only suitable (i.e. non-empty and numeric) variables.
}
\details{
-The \code{\link[=pca]{pca()}} function takes a \link{data.frame} as input and performs the actual PCA with the R function \code{\link[=prcomp]{prcomp()}}.
+The \code{\link[=pca]{pca()}} function takes a \link{data.frame} as input and performs the actual PCA with the \R function \code{\link[=prcomp]{prcomp()}}.
-The result of the \code{\link[=pca]{pca()}} function is a \code{\link{prcomp}} object, with an additional attribute \code{non_numeric_cols} which is a vector with the column names of all columns that do not contain numeric values. These are probably the groups and labels, and will be used by \code{\link[=ggplot_pca]{ggplot_pca()}}.
+The result of the \code{\link[=pca]{pca()}} function is a \link{prcomp} object, with an additional attribute \code{non_numeric_cols} which is a vector with the column names of all columns that do not contain numeric values. These are probably the groups and labels, and will be used by \code{\link[=ggplot_pca]{ggplot_pca()}}.
}
\section{Experimental lifecycle}{
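
Note on the behaviour described in \code{\\description} and \code{\\details} above: the filtering and the \code{non_numeric_cols} attribute can be illustrated with base R. This is a sketch under stated assumptions, not the AMR implementation: \code{pca_sketch()} is a hypothetical helper, "non-empty" is assumed to mean "has at least one non-missing value", and the attribute is stored as a character vector of column names, following the wording of the documentation.

```r
# Hypothetical helper mimicking the documented behaviour of pca();
# it is NOT the function shipped in the AMR package.
pca_sketch <- function(x, ...) {
  stopifnot(is.data.frame(x))

  # Keep only suitable variables: numeric and (assumption) not entirely NA.
  is_num    <- vapply(x, is.numeric, logical(1))
  non_empty <- vapply(x, function(col) any(!is.na(col)), logical(1))
  keep      <- is_num & non_empty

  # The actual PCA is delegated to stats::prcomp().
  res <- prcomp(x[, keep, drop = FALSE], ...)

  # Remember the non-numeric columns; these are the likely groups and labels.
  attr(res, "non_numeric_cols") <- names(x)[!is_num]
  class(res) <- c("pca", class(res))
  res
}

res <- pca_sketch(iris, scale. = TRUE)
attr(res, "non_numeric_cols")
class(res)
```

A plotting function can then read the attribute to pick default groups or labels, which is how the text says \code{ggplot_pca()} uses it.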