Home > Enterprise >  Convert data.frame from long to wide with rank in R
Convert data.frame from long to wide with rank in R

Time:08-07

I have a data.frame as follows:

d1 = data.frame(
  g = factor(rep(1:3, rep(4,3))),
  x = c(21,24,27,20,20,18,19,15,22,25,27,22)
)
d2 = data.frame(d1, r=rank(d1$x)) #add ranks of x-elements 
d2
   # g  x    r
# 1  1 21  6.0
# 2  1 24  9.0
# 3  1 27 11.5
# 4  1 20  4.5
# 5  2 20  4.5
# 6  2 18  2.0
# 7  2 19  3.0
# 8  2 15  1.0
# 9  3 22  7.5
# 10 3 25 10.0
# 11 3 27 11.5
# 12 3 22  7.5

I want convert d2 to a wide format table like

g  x1  r1   x2  r2   x3  r3    x4  r4
1  21  6.0  24  9.0  27  11.5  20  4.5
2  20  4.5  18  2.0  19  3.0   15  1.0
3  22  7.5  25  10.0 27  11.5  22  7.5

or even a better and more compact table like

g   xr1      xr2      xr3       xr4
1  21(6.0)  24(9.0 ) 27(11.5)  20(4.5)
2  20(4.5)  18(2.0 ) 19(3.0 )  15(1.0)
3  22(7.5)  25(10.0) 27(11.5)  22(7.5)

I search and find that library(reshape2) or library(data.table) may solve my problem; see such as stackoverflow-10589693 or stackoverflow-5890584. Then I try

library(reshape2)
dcast(melt(d2, id.vars=c("g")), g~variable value)

which shows

Aggregate function missing, defaulting to 'length'
   g x_15 x_18 x_19 x_20 x_21 x_22 x_24 x_25 x_27 r_1 r_2 r_3 r_4.5 r_6
1: 1    0    0    0    1    1    0    1    0    1   0   0   0     1   1
2: 2    1    1    1    1    0    0    0    0    0   1   1   1     1   0
3: 3    0    0    0    0    0    2    0    1    1   0   0   0     0   0
   r_7.5 r_9 r_10 r_11.5
1:     0   1    0      1
2:     0   0    0      0
3:     2   0    1      1

and the output is far from my expectations! Also, the code

library(data.table)
dcast(setDT(d2), g~g, value.var=c('x', 'r'))

does't work for me. How can I fix these issues?

CodePudding user response:

One way to solve your problem:

dcast(setDT(d2), g ~ rowid(g), sep="", value.var=c('x', 'r'))

Key: <g>
        g    x1    x2    x3    x4    r1    r2    r3    r4
   <fctr> <num> <num> <num> <num> <num> <num> <num> <num>
1:      1    21    24    27    20   6.0     9  11.5   4.5
2:      2    20    18    19    15   4.5     2   3.0   1.0
3:      3    22    25    27    22   7.5    10  11.5   7.5

if you want your second output table

dcast(data = d2[, .(g, xr = sprintf("%.f(%.1f)", x, r))], 
      formula = g ~ paste0("xr", rowid(g)), 
      value.var = "xr")

Key: <g>
        g     xr1      xr2      xr3     xr4
   <fctr>  <char>   <char>   <char>  <char>
1:      1 21(6.0)  24(9.0) 27(11.5) 20(4.5)
2:      2 20(4.5)  18(2.0)  19(3.0) 15(1.0)
3:      3 22(7.5) 25(10.0) 27(11.5) 22(7.5)

CodePudding user response:

With dplyr and tidyr, you can do:

library(dplyr)
library(tidyr)

d2 %>% 
  group_by(g) %>% 
  mutate(id = row_number(),
         xr = paste0(x, "(", r, ")")) %>% 
  pivot_wider(id_cols = -c(x, r), names_from = id, names_prefix = "xr", values_from = xr)

Which gives:

  g     xr1     xr2    xr3      xr4    
1 1     21(6)   24(9)  27(11.5) 20(4.5)
2 2     20(4.5) 18(2)  19(3)    15(1)  
3 3     22(7.5) 25(10) 27(11.5) 22(7.5)
  • Related