Title: | Set of Assumptions for Factor and Principal Component Analysis |
---|---|
Description: | Tests for the Kaiser-Meyer-Olkin (KMO) measure and for communalities in a dataset. It produces a final sample by removing variables iteratively while keeping track of the variables removed at each step. It follows the best practices and assumptions described by Hair, Black, Babin & Anderson (2018, ISBN:9781473756540). |
Authors: | Jose Storopoli [aut, cre] |
Maintainer: | Jose Storopoli <[email protected]> |
License: | GPL-3 |
Version: | 2.0.1 |
Built: | 2025-03-09 02:55:36 UTC |
Source: | https://github.com/storopoli/factorassumptions |
communalities_optimal_solution()
Calls either the principal or fa function from the psych package to iterate over the variables of a dataframe.
communalities_optimal_solution( df, nfactors, type, rotate = "varimax", fm = "minres", squared = TRUE )
df |
a dataframe with only |
nfactors |
number of factors to extract in principal components or factor analysis |
type |
either |
rotate |
rotation to be employed (default is "varimax"). "none", "varimax", "quartimax", "bentlerT", "equamax", "varimin", "geominT" and "bifactor" are orthogonal rotations. "Promax", "promax", "oblimin", "simplimax", "bentlerQ", "geominQ", "biquartimin" and "cluster" are possible oblique transformations of the solution. In psych's fa the default is an oblimin transformation (versions prior to 2009 defaulted to varimax), but this function defaults to varimax. SPSS seems to do a Kaiser normalization before doing Promax; this is done here by the call to "promax", which does the normalization before calling Promax in GPArotation. |
fm |
Factoring method. fm="minres" (default) will do a minimum residual solution, as will fm="uls". Both of these use a first derivative. fm="ols" differs very slightly from "minres" in that it minimizes the entire residual matrix using an OLS procedure but uses the empirical first derivative. This will be slower. fm="wls" will do a weighted least squares (WLS) solution, fm="gls" does a generalized weighted least squares (GLS) solution, fm="pa" will do the principal factor solution, and fm="ml" will do a maximum likelihood factor analysis. fm="minchi" will minimize the sample size weighted chi square when treating pairwise correlations with different numbers of subjects per pair. fm="minrank" will do a minimum rank factor analysis. "old.min" will do minimal residual the way it was done prior to April 2017 (see the discussion in the psych fa documentation). fm="alpha" will do alpha factor analysis as described in Kaiser and Coffey (1965). |
squared |
TRUE if the input is a square matrix (such as an adjacency matrix), FALSE otherwise |
If any individual communality is below the optimal value of 0.5, the variable with the lowest communality is removed. This is repeated until no remaining variable has a communality below 0.5.
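The removal procedure described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `pca_communalities()` and `prune_by_communality()` are hypothetical names, the communalities come from an unrotated principal-component solution computed with `eigen()` rather than from psych's `principal` or `fa` (an orthogonal rotation leaves communalities unchanged), and the stop at `nfactors + 1` variables is a safeguard added for this sketch.

```r
# Hypothetical helper (not part of the package): communalities of an
# unrotated principal-component solution, using base R only.
pca_communalities <- function(df, nfactors) {
  eig <- eigen(cor(df), symmetric = TRUE)
  k <- seq_len(nfactors)
  # Loadings are eigenvectors scaled by the square root of their eigenvalues
  loadings <- eig$vectors[, k, drop = FALSE] %*%
    diag(sqrt(eig$values[k]), nrow = nfactors)
  h2 <- rowSums(loadings^2)  # communality = sum of squared loadings per variable
  names(h2) <- colnames(df)
  h2
}

# Hypothetical sketch of the pruning loop: drop the variable with the
# lowest communality until every communality reaches the cutoff.
prune_by_communality <- function(df, nfactors, cutoff = 0.5) {
  removed <- character(0)
  repeat {
    h2 <- pca_communalities(df, nfactors)
    # Assumption of this sketch: never go below nfactors + 1 variables
    if (all(h2 >= cutoff) || ncol(df) <= nfactors + 1) break
    worst <- names(h2)[which.min(h2)]
    removed <- c(removed, worst)  # keeps the order of removal
    df <- df[, setdiff(colnames(df), worst), drop = FALSE]
  }
  list(df = df, removed = removed, communalities = h2)
}

set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
res <- prune_by_communality(df, nfactors = 2)
res$removed
```

Recomputing the solution after every removal matters because dropping one variable changes the communalities of all the others.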
A list with
df
- A dataframe that has reached its optimal solution in terms of communality values
removed
- A list of removed variables, ordered from first to last removed during the procedure
loadings
- A table with the communalities of the variables in the final iteration
results
- Results of the final iteration of either the principal or fa function from the psych package
principal, the PCA function from psych, and fa, the factor analysis function from psych.
set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
communalities_optimal_solution(df, nfactors = 2, type = "principal", squared = FALSE)
kmo()
Handles both positive definite and non-positive definite matrices by employing the Moore-Penrose inverse (pseudoinverse).
kmo(x, squared = TRUE)
x |
a matrix or dataframe |
squared |
TRUE if the input is a square matrix (such as an adjacency matrix), FALSE otherwise |
A list with
overall
- Overall KMO value
individual
- A dataframe of individual KMO values
AIS
- Anti-image Covariance Matrix
AIR
- Anti-image Correlation Matrix
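As a rough illustration of how a KMO value can be obtained through a pseudoinverse, the sketch below uses base R only. It is not the package's code: `pinv()` and `kmo_sketch()` are hypothetical names, the pseudoinverse is built from `svd()` with an assumed tolerance, the input is assumed to be raw data (not an already square matrix), and only the overall and individual values are returned, without the anti-image matrices.

```r
# Hypothetical helper: Moore-Penrose pseudoinverse via the singular
# value decomposition, dropping near-zero singular values.
pinv <- function(m, tol = sqrt(.Machine$double.eps)) {
  s <- svd(m)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*%
    diag(1 / s$d[keep], nrow = sum(keep)) %*%
    t(s$u[, keep, drop = FALSE])
}

# Hypothetical sketch of a KMO computation on raw data.
kmo_sketch <- function(x) {
  r <- cor(x)
  r_inv <- pinv(r)
  # Scale the inverse to unit diagonal and flip the sign to get the
  # partial correlations between each pair of variables
  d <- diag(1 / sqrt(diag(r_inv)), nrow = ncol(r))
  q <- -(d %*% r_inv %*% d)
  diag(r) <- 0  # only off-diagonal elements enter the sums
  diag(q) <- 0
  individual <- colSums(r^2) / (colSums(r^2) + colSums(q^2))
  names(individual) <- colnames(x)
  list(
    overall = sum(r^2) / (sum(r^2) + sum(q^2)),
    individual = individual
  )
}

set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
res <- kmo_sketch(df)
res$overall
```

For an invertible correlation matrix the pseudoinverse coincides with the ordinary inverse, so the same code path covers both cases.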
set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
kmo(df, squared = FALSE)
kmo_optimal_solution()
Calls the kmo function to iterate over the variables of a dataframe.
kmo_optimal_solution(df, squared = TRUE)
df |
a dataframe with only |
squared |
TRUE if the input is a square matrix (such as an adjacency matrix), FALSE otherwise |
If any individual KMO value is below the optimal value of 0.5, the variable with the lowest KMO value is removed. This is repeated until no remaining variable has a KMO value below 0.5.
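A minimal sketch of this loop follows, under assumptions that differ from the package: `individual_kmo()` and `prune_by_kmo()` are hypothetical names, the helper uses `solve()` and therefore requires an invertible correlation matrix (the package's kmo uses a pseudoinverse instead), and the stop at two variables is a safeguard added here.

```r
# Hypothetical helper: individual KMO values, assuming the correlation
# matrix is invertible so that solve() can be used.
individual_kmo <- function(df) {
  r <- cor(df)
  r_inv <- solve(r)
  # Partial correlations from the scaled, sign-flipped inverse
  q <- -r_inv / sqrt(outer(diag(r_inv), diag(r_inv)))
  diag(r) <- 0
  diag(q) <- 0
  colSums(r^2) / (colSums(r^2) + colSums(q^2))
}

# Hypothetical sketch of the loop: drop the variable with the lowest
# individual KMO until all remaining values reach the cutoff.
prune_by_kmo <- function(df, cutoff = 0.5) {
  removed <- character(0)
  repeat {
    msa <- individual_kmo(df)
    # Assumption of this sketch: stop once only two variables remain
    if (all(msa >= cutoff) || ncol(df) <= 2) break
    worst <- names(msa)[which.min(msa)]
    removed <- c(removed, worst)  # keeps the order of removal
    df <- df[, setdiff(colnames(df), worst), drop = FALSE]
  }
  list(df = df, removed = removed, individual = msa)
}

set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
res <- prune_by_kmo(df)
res$removed
```

Only one variable is removed per pass because each individual KMO depends on all the other variables, so the values have to be recomputed before deciding on the next removal.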
A list with
df
- A dataframe that has reached its optimal solution in terms of KMO values
removed
- A list of removed variables, ordered from first to last removed during the procedure
kmo_results
- Results of the final iteration of the kmo function
kmo, the KMO computation function.
set.seed(123)
df <- as.data.frame(matrix(rnorm(100 * 10, 1, .5), ncol = 10))
kmo_optimal_solution(df, squared = FALSE)