Find which 5 variables are most correlated with response

Time:11-04

I have a dataset, "insurance", which contains 22 variables of medical expenditure data. I need to find which 5 variables are most correlated with the variable "totexp". I tried cor(insurance$totexp, insurance), but it just gives me the correlations without sorting them. Then I tried sort(cor(insurance$totexp, insurance)), which shows the correlations sorted but drops the variable names.

Do you know what's the best way to do this?

Thanks in advance

CodePudding user response:

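# cor() returns a 1 x 11 matrix whose column names are the variable names
cors <- cor(mtcars$mpg, mtcars)
# Reorder the columns by value; indexing (unlike sort()) keeps the names
cors[, order(cors[1, ])]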

Returns:

        wt        cyl       disp         hp       carb       qsec       gear         am         vs       drat        mpg 
-0.8676594 -0.8521620 -0.8475514 -0.7761684 -0.5509251  0.4186840  0.4802848  0.5998324  0.6640389  0.6811719  1.0000000

For decreasing order, use:

cors[, order(cors[1, ], decreasing = TRUE)]
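
Neither ordering above directly answers "which 5 are most correlated": the response itself appears with correlation 1, and strong negative correlations (such as wt) sort to the opposite end. A minimal sketch of one way to handle both, assuming "most correlated" means largest absolute correlation and using mtcars as a stand-in for the insurance data (the names response, predictors and top5 are illustrative only):

```r
# Stand-in for the question's data: response "mpg" in mtcars.
response   <- "mpg"
predictors <- setdiff(names(mtcars), response)   # drop the response itself

# cor(x, y) with a data frame x and a vector y gives a one-column matrix;
# taking column 1 yields a numeric vector named by variable.
cors <- cor(mtcars[predictors], mtcars[[response]])[, 1]

# Rank by absolute value so strong negative correlations count too,
# then keep the first five.
top5 <- cors[order(abs(cors), decreasing = TRUE)][1:5]
top5
```

For the original data this would presumably become response <- "totexp" with insurance in place of mtcars. That assumes every column is numeric; if there are missing values, cor() also accepts use = "complete.obs".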
