Home > Net >  R: Sumif equivalent from one dataframe to another
R: Sumif equivalent from one dataframe to another

Time:05-12

I have two dataframes that look somewhat like this (they're much bigger)

df1 <- data.frame(center = c("5012","5012","5025"), 
                  product = c("Apple","Grape","Apple"),
                  value = c(20,30,50))

df1:
            Center   Product     Value
    1       5012       Apple      20
    2       5012       Grape      30
    3       5025       Apple      50

df2 <- data.frame(center = c("5012","5012","5012","5012","5012","5025"),
                  profitCenter = c("A","B","C","D","A","A"),
                  product = c("Apple","Apple","Apple", "Apple","Grape","Apple"),
                  volume = c(20,30,50,70,60,80))

df2:
        Center  Profitcenter  Product   Volume
1       5012        A          Apple     20
2       5012        B          Apple     30
3       5012        C          Apple     50
4       5012        D          Apple     70
5       5012        A          Grape     60
6       5025        A          Apple     80

I wanted to get the sum of the "volume" column from DF2 by "center" and "product" into DF1. In excel I would do a sumif, but I'm struggling to think on how to properly do this with R:

DF1:
            Center   Product     Value  Volume
    1       5012       Apple      20      170
    2       5012       Grape      30      60
    3       5025       Apple      50      80

Currently I'm creating an aggregated version of DF2 (with dplyr's group_by) and then doing a left_join, but I have to do this a few more times and I'm sure there's a better way.

CodePudding user response:

Similar to @MrFlick commented but I would prefer to first summarise in df2 and then add the resulting column to df1. But outcome and performance are the same. As @r2evans pointed out:

library(tidyverse)
df2 %>% 
  group_by(center, product) %>% 
  summarise(Volume=sum(volume)) %>% 
  right_join(df1,by=c("center","product"))

Or:

library(tidyverse)
df1 %>% 
  left_join(
    df2 %>% 
      group_by(center,product) %>% 
      summarise(sum(volume)),
  by=c("center","product")
  )

Output:

  center product value sum(volume)
1   5012   Apple    20         170
2   5012   Grape    30          60
3   5025   Apple    50          80
  • Related