Questions tagged [pca]
Principal component analysis (PCA) is a statistical technique for dimension reduction, often used in clustering or factor analysis. Given any number of explanatory variables, PCA builds uncorrelated linear combinations of them (the principal components) and ranks these by how much of the variation in the data each one explains. It is this property that allows PCA to be used for dimension reduction, i.e. to keep the few components that capture most of the variation from amongst a large set of possible influences.
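As a rough illustration of that ranking by explained variation, here is a minimal NumPy sketch. The function name `pca_explained_variance` and the synthetic data are assumptions made for this example and do not come from any of the questions below; the sketch centers the data, takes an SVD, and reports the share of variance along each leading direction.

```python
import numpy as np

def pca_explained_variance(X, n_components):
    """Project X onto its leading principal directions.

    Returns (scores, explained_variance_ratio) for the first
    n_components directions. Hypothetical helper for illustration.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each column
    # SVD of the centered data: rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S**2 / (X.shape[0] - 1)  # variance along each direction
    ratio = var / var.sum()        # share of total variance
    scores = Xc @ Vt[:n_components].T
    return scores, ratio[:n_components]

# Synthetic data (an assumption): 200 samples, 3 features,
# the first two features strongly correlated.
rng = np.random.default_rng(0)
z = rng.normal(size=200)
X = np.column_stack([z, 2 * z + 0.1 * rng.normal(size=200), rng.normal(size=200)])

scores, ratio = pca_explained_variance(X, n_components=2)
print(scores.shape)  # reduced data: one row per sample, two columns
print(ratio)         # variance share of each kept direction, largest first
```

Because the first two features move together, the first direction should carry most of the variance, which is what makes dropping the trailing directions a reasonable form of dimension reduction here.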
2,739
questions
-3
votes
0
answers
13
views
How to properly run a PCA-EFA-CFA [closed]
I am analysing the results of a psychology questionnaire in which 105 people answered 71 questions on a scale from 1 to 7. There are no NAs in the data set. My aims are to 1) reduce the number of ...
1
vote
1
answer
44
views
Handling missing values using missMDA? [closed]
I am working on an SNP dataset for fish. The SNP data has a lot of NA values due to rare alleles in each population. I want to handle the NAs using the missMDA package. However, when I run my code, I ...
0
votes
0
answers
12
views
I got an error when using PCA with an LSTM model
I have a time series dataset with 20 classes, but they are imbalanced. When I tried a method like "RandomOverSampler", I got an error because our data is 3D, so could you suggest a ...
0
votes
1
answer
13
views
When to use PCA(n_components=0.95) and when to use PCA(n_components=2), what is the difference between them?
When training a Principal Component Analysis (PCA) model,
when should I pass a variance fraction, as in PCA(n_components=0.95), and when should I use PCA(n_components=2), in a pipeline that has a StandardScaler to standardize the ...
0
votes
0
answers
12
views
Factor Analysis with Multiple Imputation
I have a dataset with 49 items of a questionnaire (ordinal; 0,1,2,3,4) from 5 diagnostic groups (n1: 50, n2: 25, n3: 30, n4: 23, n5: 60). However, the dataset has missing values, e.g. for 10-12 ...
2
votes
1
answer
85
views
How to weight principal components by their variance?
I'm following the paper New ECOSTRESS and MODIS Land Surface Temperature Data Reveal Fine-Scale Heat Vulnerability in Cities: A Case Study for Los Angeles County, California, and I quote:
In ...
0
votes
0
answers
18
views
Supplementary qualitative variable labels in FactoMineR?
Does anyone know how to work the quali.sup labels on a biplot in FactoMineR / factoextra?
The manual says it is possible to specify these variables on the labels with label = "quali" but it ...
0
votes
0
answers
39
views
Clustering multi-dimensional dataframe
I am trying to perform a cluster analysis to address the true nature of investment strategies using Python.
To do so, I performed some rolling regressions using different indices as regressors and ...
0
votes
0
answers
15
views
In `factoextra` PCA, for a given variable, how do I find which individuals contribute most?
In the factoextra package for Principal Component Analysis (PCA), facto_summarize can return summary information for either variables or individuals. How can I get the variable and individual summaries in one step?
I mean, I want ...
0
votes
0
answers
27
views
ggPlotly PCA hover row names
I would really appreciate some help solving this issue with ggplot/ggplotly in R.
I'm trying to feed into ggplotly() a ggplot made with autoplot(), specifically, a PCA.
The goal I have is for plotly to ...
1
vote
1
answer
42
views
Which PCA results are correct?
I aim to find the directions along which my data varies greatly. As I understand it, a method called PCA does this using the eigenvectors of the covariance matrix.
I used ...
0
votes
1
answer
40
views
PCA in Python: Reproducing pca.fit_transform() results using pca.fit()?
I have a data frame called data_principal_components with dimensions (306x21154), so 306 observations and 21154 features. Using PCA, I want to project the data into 10 dimensions.
As far as I ...
0
votes
0
answers
32
views
How come the PC1 loadings in my RDA are all zero?
I have a dataset of some 130 ponds that were sampled for species presence and environmental variables. I have tried to run an RDA (based on the Vegan package) to find which habitat variables have a ...
0
votes
1
answer
42
views
Why do we need to standardize data before PCA?
I tried to understand what we should do before PCA: standardization (x-m)/s or normalization (scaling into the [0, 1] interval). In the sklearn tutorial they use standardization and show that PCA with ...
0
votes
0
answers
33
views
Transformation of original data after PCA
Beginner here. I'm trying to calculate a state's infrastructure index using different variables, and I applied PCA.
At first I took the dot product of the original data and the principal components.
pca = PCA(...