I feel like this is an easy problem that I can't solve for some reason. Suppose I have a dataframe that lists trades between entities and I want to find the net amount for each pair. The dataframe might look like this:
sending_entity | receiving_entity | amount_traded |
---|---|---|
Tim | Bob | 100 |
Bob | Tim | 50 |
Sarah | Jane | 25 |
Jane | Sarah | 50 |
And the result I want is to know that Tim and Bob settled where Tim gave Bob a net amount of 50, and Jane gave Sarah a net amount of 25.
I thought about creating 2 sparse matrices, one for sending and one for receiving, and then subtracting the matrices. But this seems janky and would prefer a tidyverse way of doing this (or at least within the dataframe data structure), if possible.
CodePudding user response:
I would create a pair column that is built so that unique pairings have the same value, then used xtabs
:
dfrm$pair <- with( dfrm, paste( ifelse(sending_entity>receiving_entity,
paste(receiving_entity,sending_entity),
paste(sending_entity, receiving_entity) ))
)
dfrm
#-------------
sending_entity receiving_entity amount_traded pair
1 Tim Bob 100 Bob Tim
2 Bob Tim 50 Bob Tim
3 Sarah Jane 25 Jane Sarah
4 Jane Sarah 50 Jane Sarah
#--------------
xtabs(amount_traded~pair, dfrm)
#--------------
pair
Bob Tim Jane Sarah
150 75
CodePudding user response:
An alternative tidyverse approach
df |>
pivot_longer(cols = sending_entity:receiving_entity, names_to = "entity", values_to = "name") |>
count(name, wt = ifelse(entity == "receiving_entity", amount_traded, -amount_traded), name = "received") |>
arrange(order(df$receiving_entity)) |>
mutate(from = df$sending_entity)
name received from
<chr> <int> <chr>
1 Bob 50 Tim
2 Tim -50 Bob
3 Jane -25 Sarah
4 Sarah 25 Jane
Using wt
in count allows us to calculate the sum/net.