I have a data.frame as follows:
d1 = data.frame(
g = factor(rep(1:3, rep(4,3))),
x = c(21,24,27,20,20,18,19,15,22,25,27,22)
)
d2 = data.frame(d1, r=rank(d1$x)) #add ranks of x-elements
d2
# g x r
# 1 1 21 6.0
# 2 1 24 9.0
# 3 1 27 11.5
# 4 1 20 4.5
# 5 2 20 4.5
# 6 2 18 2.0
# 7 2 19 3.0
# 8 2 15 1.0
# 9 3 22 7.5
# 10 3 25 10.0
# 11 3 27 11.5
# 12 3 22 7.5
I want convert d2
to a wide format table like
g x1 r1 x2 r2 x3 r3 x4 r4
1 21 6.0 24 9.0 27 11.5 20 4.5
2 20 4.5 18 2.0 19 3.0 15 1.0
3 22 7.5 25 10.0 27 11.5 22 7.5
or even a better and more compact table like
g xr1 xr2 xr3 xr4
1 21(6.0) 24(9.0 ) 27(11.5) 20(4.5)
2 20(4.5) 18(2.0 ) 19(3.0 ) 15(1.0)
3 22(7.5) 25(10.0) 27(11.5) 22(7.5)
I search and find that library(reshape2)
or library(data.table)
may solve my problem; see such as stackoverflow-10589693 or stackoverflow-5890584.
Then I try
library(reshape2)
dcast(melt(d2, id.vars=c("g")), g~variable value)
which shows
Aggregate function missing, defaulting to 'length'
g x_15 x_18 x_19 x_20 x_21 x_22 x_24 x_25 x_27 r_1 r_2 r_3 r_4.5 r_6
1: 1 0 0 0 1 1 0 1 0 1 0 0 0 1 1
2: 2 1 1 1 1 0 0 0 0 0 1 1 1 1 0
3: 3 0 0 0 0 0 2 0 1 1 0 0 0 0 0
r_7.5 r_9 r_10 r_11.5
1: 0 1 0 1
2: 0 0 0 0
3: 2 0 1 1
and the output is far from my expectations! Also, the code
library(data.table)
dcast(setDT(d2), g~g, value.var=c('x', 'r'))
does't work for me. How can I fix these issues?
CodePudding user response:
One way to solve your problem:
dcast(setDT(d2), g ~ rowid(g), sep="", value.var=c('x', 'r'))
Key: <g>
g x1 x2 x3 x4 r1 r2 r3 r4
<fctr> <num> <num> <num> <num> <num> <num> <num> <num>
1: 1 21 24 27 20 6.0 9 11.5 4.5
2: 2 20 18 19 15 4.5 2 3.0 1.0
3: 3 22 25 27 22 7.5 10 11.5 7.5
if you want your second output table
dcast(data = d2[, .(g, xr = sprintf("%.f(%.1f)", x, r))],
formula = g ~ paste0("xr", rowid(g)),
value.var = "xr")
Key: <g>
g xr1 xr2 xr3 xr4
<fctr> <char> <char> <char> <char>
1: 1 21(6.0) 24(9.0) 27(11.5) 20(4.5)
2: 2 20(4.5) 18(2.0) 19(3.0) 15(1.0)
3: 3 22(7.5) 25(10.0) 27(11.5) 22(7.5)
CodePudding user response:
With dplyr
and tidyr
, you can do:
library(dplyr)
library(tidyr)
d2 %>%
group_by(g) %>%
mutate(id = row_number(),
xr = paste0(x, "(", r, ")")) %>%
pivot_wider(id_cols = -c(x, r), names_from = id, names_prefix = "xr", values_from = xr)
Which gives:
g xr1 xr2 xr3 xr4
1 1 21(6) 24(9) 27(11.5) 20(4.5)
2 2 20(4.5) 18(2) 19(3) 15(1)
3 3 22(7.5) 25(10) 27(11.5) 22(7.5)