Add together values of unique column combinations in data frame-CodePudding

I have a data frame that looks like this:

  iso_o iso_d     value 
1   ABW   AFG       5  
2   AFG   ABW       100    
3   NDL   ALB       17
4   NDL   ARE       4758   
5   ALB   NDL       13   
6   ABW   ARM       10

I now want to combine all occurrences of pairs in the "iso_o" and "iso_d" columns to and sum up the value they have in "value". So for example, the pair ABW - AFG should only occur one time with the value 105, NDL - ALB also only one time with the value 30. Like this:

  iso_o iso_d     value 
1   ABW   AFG       105  
2   NDL   ALB       30    
3   NDL   ARE       4758   
4   ABW   ARM       10

Any ideas on how to do this?

CodePudding user response：

base R

l <- apply(df[1:2], 1, \(x) toString(sort(x)))
agg <- aggregate(value ~ match(l, l), 
          data = df, FUN = sum)
cbind(df[agg[[1]], 1:2], agg[2])

#   iso_o iso_d value
# 1   ABW   AFG   105
# 3   NDL   ALB    30
# 4   NDL   ARE  4758
# 6   ABW   ARM    10

dplyr

You can group_by the rowwise sorted values of iso_o and iso_d:

library(dplyr)
df %>% 
  rowwise() %>% 
  mutate(sorted = list(sort(c(iso_o, iso_d)))) %>% 
  group_by(sorted) %>% 
  summarise(across(iso_o:iso_d, first),
            value = sum(value)) %>% 
  select(-sorted)

# A tibble: 4 × 3
  iso_o iso_d value
  <chr> <chr> <int>
1 ABW   AFG     105
2 NDL   ALB      30
3 NDL   ARE    4758
4 ABW   ARM      10

CodePudding user response：

Here is a tidyverse approach:

df %>% group_by(iso_o, iso_d) %>% mutate(value = sum(value)) %>% ungroup() %>% distinct()