Sort columns in order of decreasing variance in R

Time:11-27

I wish to order the columns of a dataset in order of decreasing column variance but I have had no luck in doing so. This is what I have so far:

og_data <- og_data[, sort(apply(og_data, 2, var), decreasing=TRUE)]

Now, I know this doesn't work, since sort(apply(og_data, 2, var), decreasing=TRUE) returns the variance values themselves in decreasing order rather than the positions of the columns. I have no idea how to extract the column indexes from this, which is what I would need to use. Any help would be much appreciated.

CodePudding user response:

Since you did not provide reproducible data for me to work with, you can try the method below:

# sorting examples using the mtcars dataset
attach(mtcars)

# sort by mpg
new_data <- mtcars[order(mpg),]

# sort by mpg and cyl
new_data <- mtcars[order(mpg, cyl),]

# sort by mpg (ascending) and cyl (descending)
new_data <- mtcars[order(mpg, -cyl),]

Hope this answers your question.

CodePudding user response:

Since the goal is to order the columns of the data frame by descending variance, we calculate the variances and use order() to sort by descending variance.
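To make the difference between sort() and order() concrete, here is a minimal sketch. The vector v and its names are hypothetical stand-ins for the per-column variances, not data from the question:

```r
# Hypothetical named vector standing in for per-column variances
v <- c(a = 2, b = 9, c = 5)

# sort() returns the values themselves, largest first
sorted_values <- sort(v, decreasing = TRUE)

# order() returns the positions of those values in the original vector
positions <- order(v, decreasing = TRUE)

# Indexing with the positions reproduces the sorted vector,
# which is why order() is the one to use inside [ , ]
reordered <- v[positions]
```

Because positions holds indexes rather than values, it can be passed straight to the column slot of a data frame subset.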

We'll use mtcars to illustrate, given the absence of a minimal reproducible example:

mtcars[, order(apply(mtcars, 2, var), decreasing = TRUE)]

...and the output:

                     disp  hp  mpg  qsec cyl carb    wt gear drat vs am
Mazda RX4           160.0 110 21.0 16.46   6    4 2.620    4 3.90  0  1
Mazda RX4 Wag       160.0 110 21.0 17.02   6    4 2.875    4 3.90  0  1
Datsun 710          108.0  93 22.8 18.61   4    1 2.320    4 3.85  1  1
Hornet 4 Drive      258.0 110 21.4 19.44   6    1 3.215    3 3.08  1  0
Hornet Sportabout   360.0 175 18.7 17.02   8    2 3.440    3 3.15  0  0
Valiant             225.0 105 18.1 20.22   6    1 3.460    3 2.76  1  0
Duster 360          360.0 245 14.3 15.84   8    4 3.570    3 3.21  0  0
Merc 240D           146.7  62 24.4 20.00   4    2 3.190    4 3.69  1  0
Merc 230            140.8  95 22.8 22.90   4    2 3.150    4 3.92  1  0
Merc 280            167.6 123 19.2 18.30   6    4 3.440    4 3.92  1  0
Merc 280C           167.6 123 17.8 18.90   6    4 3.440    4 3.92  1  0
Merc 450SE          275.8 180 16.4 17.40   8    3 4.070    3 3.07  0  0
Merc 450SL          275.8 180 17.3 17.60   8    3 3.730    3 3.07  0  0
Merc 450SLC         275.8 180 15.2 18.00   8    3 3.780    3 3.07  0  0
Cadillac Fleetwood  472.0 205 10.4 17.98   8    4 5.250    3 2.93  0  0
Lincoln Continental 460.0 215 10.4 17.82   8    4 5.424    3 3.00  0  0
Chrysler Imperial   440.0 230 14.7 17.42   8    4 5.345    3 3.23  0  0
Fiat 128             78.7  66 32.4 19.47   4    1 2.200    4 4.08  1  1
Honda Civic          75.7  52 30.4 18.52   4    2 1.615    4 4.93  1  1
Toyota Corolla       71.1  65 33.9 19.90   4    1 1.835    4 4.22  1  1
Toyota Corona       120.1  97 21.5 20.01   4    1 2.465    3 3.70  1  0
Dodge Challenger    318.0 150 15.5 16.87   8    2 3.520    3 2.76  0  0
AMC Javelin         304.0 150 15.2 17.30   8    2 3.435    3 3.15  0  0
Camaro Z28          350.0 245 13.3 15.41   8    4 3.840    3 3.73  0  0
Pontiac Firebird    400.0 175 19.2 17.05   8    2 3.845    3 3.08  0  0
Fiat X1-9            79.0  66 27.3 18.90   4    1 1.935    4 4.08  1  1
Porsche 914-2       120.3  91 26.0 16.70   4    2 2.140    5 4.43  0  1
Lotus Europa         95.1 113 30.4 16.90   4    2 1.513    5 3.77  1  1
Ford Pantera L      351.0 264 15.8 14.50   8    4 3.170    5 4.22  0  1
Ferrari Dino        145.0 175 19.7 15.50   6    6 2.770    5 3.62  0  1
Maserati Bora       301.0 335 15.0 14.60   8    8 3.570    5 3.54  0  1
Volvo 142E          121.0 109 21.4 18.60   4    2 2.780    4 4.11  1  1

To cross-check the results, we'll sort and print the vector of variances:

#cross-check variances

variances <- apply(mtcars,2,var)
variances[order(variances,decreasing = TRUE)]

Notice that the ordering of the vector matches the ordering of the columns from the prior operation.

        disp           hp          mpg         qsec          cyl         carb 
1.536080e+04 4.700867e+03 3.632410e+01 3.193166e+00 3.189516e+00 2.608871e+00 
          wt         gear         drat           vs           am 
9.573790e-01 5.443548e-01 2.858814e-01 2.540323e-01 2.489919e-01 
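
One caveat: the apply() approach works for mtcars because every column is numeric. apply() first converts the data frame to a matrix, so a single character column turns the whole matrix into character data. If your og_data has mixed column types, something along these lines may be safer. This is only a sketch, and the data frame df and its column names are made up for illustration:

```r
# Hypothetical data frame with one non-numeric column
df <- data.frame(
  id    = c("a", "b", "c", "d"),
  small = c(1, 2, 3, 4),
  big   = c(10, 50, 20, 90),
  flat  = c(5, 5, 5, 6)
)

# Compute variances on the numeric columns only
num_cols  <- names(df)[sapply(df, is.numeric)]
variances <- sapply(df[num_cols], var)

# Numeric columns by decreasing variance, non-numeric columns kept at the end
ordered_names <- c(num_cols[order(variances, decreasing = TRUE)],
                   setdiff(names(df), num_cols))
df_sorted <- df[, ordered_names]
```

Placing the non-numeric columns last is an arbitrary choice here; drop the setdiff() part if you only want the numeric columns back.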