PCA x must be numeric in R

Time:07-24

I have a dataset called df that looks like this:

head(df, 3)
ratio  P  T  H   S p1 p2 PM10 CO2 B  G Month Year
  0.5 89 -7 98 133  0 40   50  30 3 20     1 2019
  0.5 55  4 43  43 30 30   40  32 1 15     1 2019
 0.85 75  4 63  43 30 30   42  32 1 18     1 2019

I would like to run a principal component analysis to reduce the number of variables for a regression analysis. I used this code:

library(factoextra)
df.pca <- prcomp(df, scale = TRUE)

But I got this error message and for that reason I was not able to continue

Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

What am I doing wrong?

CodePudding user response:

prcomp() will assume that every column in the object you are passing to it should be used in the analysis. You'll need to drop any non-numeric columns, as well as any numeric columns that should not be used in the PCA.

library(factoextra)

# Example data
df <- data.frame(
  x = letters,
  y1 = rbinom(26,1,0.5),
  y2 = rnorm(26),
  y3 = 1:26,
  id = 1:26
)

# Reproduce your error
prcomp(df)
#> Error in colMeans(x, na.rm = TRUE): 'x' must be numeric

# Remove all non-numeric columns
df_nums <- df[sapply(df, is.numeric)]

# Conduct PCA - works but ID column is in there!
prcomp(df_nums, scale. = TRUE)
#> Standard deviations (1, .., p=4):
#> [1] 1.445005e+00 1.039765e+00 9.115092e-01 1.333315e-16
#> 
#> Rotation (n x k) = (4 x 4):
#>            PC1        PC2        PC3           PC4
#> y1  0.27215111 -0.5512026 -0.7887391  0.000000e+00
#> y2  0.07384194 -0.8052981  0.5882536  4.715914e-16
#> y3 -0.67841033 -0.1543868 -0.1261909 -7.071068e-01
#> id -0.67841033 -0.1543868 -0.1261909  7.071068e-01

# Remove ID
df_nums$id <- NULL

# Conduct PCA without ID - success!
prcomp(df_nums, scale. = TRUE)
#> Standard deviations (1, .., p=3):
#> [1] 1.1253120 0.9854030 0.8733006
#> 
#> Rotation (n x k) = (3 x 3):
#>           PC1         PC2        PC3
#> y1 -0.6856024  0.05340108 -0.7260149
#> y2 -0.4219813 -0.84181344  0.3365738
#> y3  0.5931957 -0.53712052 -0.5996836
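
If every column in df is supposed to be a number, as the head() output in the question suggests, then at least one column was probably imported as character or factor. Here is a minimal sketch of how to find and convert such a column. The data is made up, and P being stored as text is only an assumption for illustration; the actual cause in the question is not known.

```r
# Hypothetical data: P was read in as text, which can happen when a
# file contains a stray non-numeric entry or uses a decimal comma
df <- data.frame(
  ratio = c(0.5, 0.5, 0.85, 0.7),
  P     = c("89", "55", "75", "60"),
  T     = c(-7, 4, 4, 10),
  stringsAsFactors = FALSE
)

# Show the class of every column; anything that is not numeric or
# integer will trigger the colMeans() error
print(sapply(df, class))

# Names of the offending columns
non_numeric <- names(df)[!sapply(df, is.numeric)]
print(non_numeric)
#> [1] "P"

# Convert only if the values really are numbers stored as text.
# as.numeric() gives NA (with a warning) for entries it cannot parse.
# For a factor column, use as.numeric(as.character(x)) instead.
df$P <- as.numeric(df$P)

# Every column is numeric now, so prcomp() runs
df.pca <- prcomp(df, scale. = TRUE)
summary(df.pca)
```

If a converted column ends up containing NA, it is worth looking at the raw values before running the PCA, since dropping the column would silently discard a variable you may want to keep.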