Performs a principal component analysis (PCA) on a data set. Groups and labels are determined automatically for later plotting, and the data are automatically filtered to suitable (i.e. non-empty and numeric) variables only.
pca( x, ..., retx = TRUE, center = TRUE, scale. = TRUE, tol = NULL, rank. = NULL )
Argument | Description |
---|---|
x | a data.frame containing numeric columns |
... | columns of x to be selected for PCA |
retx | a logical value indicating whether the rotated variables should be returned. |
center | a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of x can be supplied. |
scale. | a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default in this function is TRUE. |
tol | a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) |
rank. | optionally, a number specifying the maximal rank, i.e., the maximal number of principal components to be used. Can be set as an alternative or in addition to tol. |
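The effect of rank. and tol is easiest to see by calling prcomp() directly. The following is a minimal sketch in base R, assuming that pca() passes these arguments on to prcomp() unchanged; the USArrests data set and the value tol = 0.5 are chosen only for illustration.

```r
# Base R only; USArrests ships with the datasets package.
full <- stats::prcomp(USArrests, center = TRUE, scale. = TRUE)

# rank. caps the number of principal components that are returned
top2 <- stats::prcomp(USArrests, center = TRUE, scale. = TRUE, rank. = 2)
ncol(top2$x)  # 2

# tol drops components whose standard deviation is less than or equal to
# tol times the standard deviation of the first component
tol <- 0.5  # illustrative value, not a recommendation
kept <- stats::prcomp(USArrests, center = TRUE, scale. = TRUE, tol = tol)
expected <- sum(full$sdev > tol * full$sdev[1])
ncol(kept$rotation) == expected
```

Both arguments can be combined; the result then keeps the smaller of the two numbers of components.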
An object of classes pca and prcomp
The pca() function takes a data.frame as input and performs the actual PCA with the R function prcomp().
The result of the pca() function is a prcomp object with an additional attribute non_numeric_cols, which is a vector with the column names of all columns that do not contain numeric values. These are probably the groups and labels, and will be used by ggplot_pca().
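The behaviour described above can be approximated in a few lines of base R. This is a sketch only and not the AMR implementation: pca_sketch() is a hypothetical helper, it assumes there are no missing values left in the numeric columns, and it uses the built-in iris data set purely as an example.

```r
# Hypothetical helper (NOT the AMR source code): keep the numeric, non-empty
# columns, run prcomp() on them, and remember which columns were not numeric.
pca_sketch <- function(x, center = TRUE, scale. = TRUE) {
  stopifnot(is.data.frame(x))
  is_num <- vapply(x, is.numeric, logical(1))
  non_empty <- vapply(x, function(col) !all(is.na(col)), logical(1))
  # Assumption: the remaining numeric columns contain no NA values,
  # otherwise prcomp() will stop with an error.
  out <- stats::prcomp(x[, is_num & non_empty, drop = FALSE],
                       center = center, scale. = scale.)
  # Store the names of the non-numeric columns, as described in the text
  attr(out, "non_numeric_cols") <- names(x)[!is_num]
  class(out) <- c("pca", "prcomp")
  out
}

res <- pca_sketch(iris)
attr(res, "non_numeric_cols")  # "Species"
inherits(res, "prcomp")        # TRUE
```

Because the object still inherits from prcomp, the usual methods such as summary() and biplot() keep working on it.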
The lifecycle of this function is maturing. The underlying code of a maturing function has been roughed out, but finer details might still change. We will strive to maintain backward compatibility, but the function needs wider usage and more extensive testing in order to optimise the underlying code.
# `example_isolates` is a dataset available in the AMR package.
# See ?example_isolates.

if (FALSE) {
  # calculate the resistance per group first
  library(dplyr)
  resistance_data <- example_isolates %>%
    group_by(order = mo_order(mo),       # group on anything, like order
             genus = mo_genus(mo)) %>%   # and genus as we do here
    summarise_if(is.rsi, resistance)     # then get resistance of all drugs

  # now conduct PCA for certain antimicrobial agents
  pca_result <- resistance_data %>%
    pca(AMC, CXM, CTX, CAZ, GEN, TOB, TMP, SXT)

  pca_result
  summary(pca_result)
  biplot(pca_result)
  ggplot_pca(pca_result) # a new and convenient plot function
}