I have the following table in R:
S <- c("A","A","A","B","B","B","C","C","C")
TS <- c(1,1,1,2,2,2,3,3,3)
f1 <- c(10,20,30,15,25,35,17,27,37)
p <- c(100,200,300,150,250,350,170,270,370)
df <- data.frame(S, TS, f1, p)
So it looks like that:
S | TS | f1 | p |
---|---|---|---|
A | 1 | 10 | 100 |
A | 2 | 20 | 200 |
A | 3 | 30 | 300 |
B | 1 | 15 | 150 |
B | 2 | 25 | 250 |
B | 3 | 35 | 350 |
C | 1 | 17 | 170 |
C | 2 | 27 | 270 |
C | 3 | 37 | 370 |
Now I want to transform my dataframe so that I have unique values for TS (timestamps) for every row and binded my variables right to it, like this:
TS | SA_f1 | pA | SB_f1 | pB | SC_f1 | pC |
---|---|---|---|---|---|---|
1 | 10 | 100 | 15 | 150 | 17 | 170 |
2 | 20 | 200 | 25 | 250 | 27 | 270 |
3 | 30 | 300 | 35 | 250 | 37 | 370 |
What is the most elegant way doing this?
CodePudding user response:
You could use:
library(dplyr)
df %>% group_by(TS) %>% summarise(SA_f1=f1[1], SB_f1=f1[2],SC_f1=f1[3], pA=p[1], pB=p[2], pC=p[3])
# A tibble: 3 x 7
TS SA_f1 SB_f1 SC_f1 pA pB pC
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 10 20 30 100 200 300
2 2 15 25 35 150 250 350
3 3 17 27 37 170 270 370
Differences in result come I think from the fact that the original input differs from what is shown
CodePudding user response:
There is an error in your R code, it should be
TS <- rep(1:3,3)
anyway, a very elegant solution
library(reshape2)
dcast(
melt(df,id.vars=c("S","TS")),
TS~S variable,
value.var="value",
fun.aggregate=mean
)
TS A_f1 A_p B_f1 B_p C_f1 C_p
1 1 10 100 15 150 17 170
2 2 20 200 25 250 27 270
3 3 30 300 35 350 37 370