Performs a principal component analysis (PCA) on a data set. The groups and labels for later plotting are determined automatically, and only suitable (i.e. non-empty and numeric) variables are included in the analysis.
pca( x, ..., retx = TRUE, center = TRUE, scale. = TRUE, tol = NULL, rank. = NULL )
Argument | Description |
---|---|
x | a data.frame containing numeric columns |
... | columns of x to be selected for the PCA |
retx | a logical value indicating whether the rotated variables should be returned. |
center | a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of x can be supplied. |
scale. | a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is TRUE. |
tol | a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) |
rank. | optionally, a number specifying the maximal rank, i.e., maximal number of principal components to be used. Can be set as alternative or in addition to tol. |
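The effect of center, scale., tol and rank. can be seen by calling prcomp() directly, since these arguments are passed on to it. The sketch below is only an illustration: it uses the built-in USArrests data set, which is an assumption made here and is not part of the AMR package, and the object names are made up for the example.

```r
# Illustration with base R only; USArrests is a built-in data set
# with four numeric columns (not part of the AMR package).
fit_all <- prcomp(USArrests, retx = TRUE, center = TRUE, scale. = TRUE)

# rank. caps the number of principal components that are returned
fit_rank2 <- prcomp(USArrests, center = TRUE, scale. = TRUE, rank. = 2)

# tol omits components whose standard deviation is less than or equal to
# tol times the standard deviation of the first component
fit_tol <- prcomp(USArrests, center = TRUE, scale. = TRUE, tol = 0.5)

ncol(fit_all$x)    # number of components without any restriction
ncol(fit_rank2$x)  # number of components when rank. = 2
ncol(fit_tol$x)    # number of components that pass the tol threshold
```

Note that in current versions of R the sdev element still lists the standard deviations of all components, so the columns of x (or of rotation) are the place to check how many components were kept.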
An object of classes pca and prcomp
The pca() function takes a data.frame as input and performs the actual PCA with the R function prcomp().
The result of the pca() function is a prcomp object with an additional attribute non_numeric_cols, which is a vector with the column names of all columns that do not contain numeric values. These are probably the groups and labels, and will be used by ggplot_pca().
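The behaviour described above can be approximated in a few lines of base R. The helper pca_sketch() below is hypothetical and is not the AMR implementation; it only assumes what the text states: non-numeric and empty columns are left out, prcomp() does the actual work, and the names of the non-numeric columns are stored in a non_numeric_cols attribute. The built-in iris data set is used purely for illustration.

```r
# Hypothetical helper, NOT the AMR implementation: it mimics the
# behaviour described in the text using base R only.
pca_sketch <- function(x, ...) {
  stopifnot(is.data.frame(x))
  is_num <- vapply(x, is.numeric, logical(1))
  # keep only numeric columns that hold at least one non-missing value
  is_usable <- is_num &
    vapply(x, function(col) !all(is.na(col)), logical(1))
  result <- stats::prcomp(x[, is_usable, drop = FALSE], ...)
  # remember the non-numeric columns (probably groups and labels)
  attr(result, "non_numeric_cols") <- names(x)[!is_num]
  class(result) <- c("pca", class(result))
  result
}

res <- pca_sketch(iris, center = TRUE, scale. = TRUE)
attr(res, "non_numeric_cols")  # "Species"
class(res)                     # "pca" "prcomp"
```

Because the object still inherits from prcomp, generic functions such as summary() and biplot() keep working on it.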
The lifecycle of this function is maturing. The underlying code of a maturing function has been roughed out, but finer details might still change. This function needs wider usage and more extensive testing in order to optimise the underlying code.
# `example_isolates` is a dataset available in the AMR package.
# See ?example_isolates.

if (FALSE) {
  # calculate the resistance per group first
  library(dplyr)
  resistance_data <- example_isolates %>%
    group_by(order = mo_order(mo),       # group on anything, like order
             genus = mo_genus(mo)) %>%   # and genus as we do here
    summarise_if(is.rsi, resistance)     # then get resistance of all drugs

  # now conduct PCA for certain antimicrobial agents
  pca_result <- resistance_data %>%
    pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT)

  pca_result
  summary(pca_result)
  biplot(pca_result)
  ggplot_pca(pca_result)  # a new and convenient plot function
}