Here is an example of what i need in R. I have this kind of data
df <-
Scenario Group1 Group2
S1 8 9
S2 7 8
S3 6 7
S4 4 5
S5 3 4
From this above data i want to obtain following data
df_converted <-
Scenario1 Scenario2 Group1 Group2
S1 S2 -1 -1
S1 S3 -2 -2
S1 S4 -4 -4
S1 S5 -5 -5
S2 S3 -1 -1
S2 S4 -3 -3
S2 S5 -4 -4
S3 S4 -2 -2
S3 S5 -3 -3
S4 S5 -1 -1
Each row in new data is obtained by subtracting previous or lagging row. And this subtraction is obtained for all combination of scenarios in first columns.
If the solution is in dplyr
then it will be extremely helpful thanks.
The above example is for subtraction i need another converted data that give percentage difference between rows
CodePudding user response:
You could use a full_join()
here, but there is another option (below), which is faster, and takes advantage of combn()
full_join(df,df, by=character()) %>%
filter(Scenario.x<Scenario.y) %>%
mutate(
Group1 = Group1.y-Group1.x,
Group2 = Group2.y-Group2.x,
Perc1 = (Group1.y-Group1.x)/Group1.y,
Perc2 = (Group2.y-Group2.x)/Group2.y
) %>%
select(Scenario1 = Scenario.x,
Sceanrio2 = Scenario.y,
Group1, Group2, Perc1,Perc2)
Output:
Scenario1 Sceanrio2 Group1 Group2 Perc1 Perc2
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 S1 S2 -1 -1 -0.143 -0.125
2 S1 S3 -2 -2 -0.333 -0.286
3 S1 S4 -4 -4 -1 -0.8
4 S1 S5 -5 -5 -1.67 -1.25
5 S2 S3 -1 -1 -0.167 -0.143
6 S2 S4 -3 -3 -0.75 -0.6
7 S2 S5 -4 -4 -1.33 -1
8 S3 S4 -2 -2 -0.5 -0.4
9 S3 S5 -3 -3 -1 -0.75
10 S4 S5 -1 -1 -0.333 -0.25
Updated, additional faster option:
# Make sure Scenario is sorted
df <- df %>% arrange(Scenario)
# Create a function the gets difference and percent, given all combn
f <- function(df) {
d= df[2,] - df[1,]
p = d/df[2,]
tibble(d,p)
}
# Column bind the results
do.call(cbind, list(
as_tibble(t(combn(df$Scenario,2)),.name_repair = ~c("Scenario1","Scenario2")),
df[,2:3] %>% map(~f(combn(.x,2)))
))
Output:
Scenario1 Scenario2 Group1.d Group1.p Group2.d Group2.p
1 S1 S2 -1 -0.1428571 -1 -0.1250000
2 S1 S3 -2 -0.3333333 -2 -0.2857143
3 S1 S4 -4 -1.0000000 -4 -0.8000000
4 S1 S5 -5 -1.6666667 -5 -1.2500000
5 S2 S3 -1 -0.1666667 -1 -0.1428571
6 S2 S4 -3 -0.7500000 -3 -0.6000000
7 S2 S5 -4 -1.3333333 -4 -1.0000000
8 S3 S4 -2 -0.5000000 -2 -0.4000000
9 S3 S5 -3 -1.0000000 -3 -0.7500000
10 S4 S5 -1 -0.3333333 -1 -0.2500000