How to change all variables of a big dataset into ordinal factors that have different sets of orders

Time:10-15

I have this data frame with the structure below (imagine it is very large):

df = data.frame(x = 1:5,
                y = 2:6,
                z = letters[6:10],
                m = 10:14,
                n = 15:19,
                o = 20:24)

str(df)

I would like to convert all of these variables (x, y, z, m, n and o) into ordinal factors with a simple function, so that each one has its own order of levels, as follows:

x : 5 < 4 < 3 < 2 < 1

y : 6 < 5 < 4 < 3 < 2

z : f < g < h < i < j

m : 10 > 11 > 12 > 13 > 14

n : 15 > 16 > 17 > 18 > 19

o : 20 < 21 < 22 < 23 < 24

Answer:

You could do:

df[] <- lapply(df, function(x) {
  if(is.numeric(x)) ordered(x, rev(sort(unique(x))))
  else ordered(x)
})

This gives the following (output shown for columns x, y, z, m and n):

df
#>   x y z  m  n
#> 1 1 2 f 10 15
#> 2 2 3 g 11 16
#> 3 3 4 h 12 17
#> 4 4 5 i 13 18
#> 5 5 6 j 14 19

df$x
#> [1] 1 2 3 4 5
#> Levels: 5 < 4 < 3 < 2 < 1

df$y
#> [1] 2 3 4 5 6
#> Levels: 6 < 5 < 4 < 3 < 2

df$z
#> [1] f g h i j
#> Levels: f < g < h < i < j

df$m
#> [1] 10 11 12 13 14
#> Levels: 14 < 13 < 12 < 11 < 10

df$n
#> [1] 15 16 17 18 19
#> Levels: 19 < 18 < 17 < 16 < 15

Note that the levels of an ordered factor are always printed from smallest to largest, so the orderings you wrote with > for columns m and n cannot be displayed that way in R. The levels shown above are the direct equivalent: 10 > 11 > 12 > 13 > 14 means the same as 14 < 13 < 12 < 11 < 10.
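
To check that the two notations describe the same ordering, here is a small sketch using the values of column m from the question (the standalone vector m is only for illustration). Comparison operators on an ordered factor follow the level order, so with levels 14 < 13 < 12 < 11 < 10 the value 10 ranks highest:

```r
# Example values taken from column m in the question
m <- ordered(10:14, levels = rev(10:14))

levels(m)      # level order, lowest first: "14" "13" "12" "11" "10"
m[1] > m[2]    # is 10 ranked above 11? TRUE under this level order
max(m)         # the highest-ranked value is 10
as.integer(m)  # underlying rank codes: 5 4 3 2 1
```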


EDIT

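If you have different rules for each column, then you need to handle them separately. Start again from the original data frame, because calling ordered() on a column that is already an ordered factor keeps its existing levels: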

df[1:5] <- lapply(df[1:5], function(x) {
  if(is.numeric(x)) ordered(x, rev(sort(unique(x))))
  else ordered(x)
})

df$o <- ordered(df$o)

df$o
#> [1] 20 21 22 23 24
#> Levels: 20 < 21 < 22 < 23 < 24

Created on 2022-10-14 with reprex v2.0.2
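
For a data frame with many columns, one possible way to avoid handling each column by hand is to keep the sort direction per column in a named vector and apply it with Map(). This is only a sketch under stated assumptions: the helper name to_ordered and the lookup vector decreasing are made up for illustration and are not part of the answer above.

```r
# Fresh copy of the example data from the question
df <- data.frame(x = 1:5, y = 2:6, z = letters[6:10],
                 m = 10:14, n = 15:19, o = 20:24)

# Hypothetical lookup: TRUE means the largest value becomes the lowest level
decreasing <- c(x = TRUE, y = TRUE, z = FALSE, m = TRUE, n = TRUE, o = FALSE)

# Hypothetical helper: build an ordered factor whose levels are the sorted
# unique values, in decreasing or increasing order
to_ordered <- function(col, dec) {
  ordered(col, levels = sort(unique(col), decreasing = dec))
}

# Apply the helper column by column, matching rules by column name
df[] <- Map(to_ordered, df, decreasing[names(df)])

levels(df$x)  # "5" "4" "3" "2" "1"
levels(df$m)  # "14" "13" "12" "11" "10"
levels(df$o)  # "20" "21" "22" "23" "24"
```

If a column needs an order that is not a simple sort, the lookup could instead be a named list holding an explicit vector of levels for each column.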
