R Dataframe: Combine rows / values when two other values match

Time:11-11

I have a dataframe that looks like this:

Name  Fruit Cost
Adam  Orange   2
Adam  Apple    3
Bob   Orange   3
Cathy Orange   4
Cathy Orange   5

Dataframe creation:

df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

I would like to combine rows so that, when Name and Fruit both match, the Cost values are added together and the duplicate row is removed. For the example, the result would look like this, with the two Cathy costs combined because their Name and Fruit are the same:

Name  Fruit Cost
Adam  Orange   2
Adam  Apple    3
Bob   Orange   3
Cathy Orange   9

I was thinking of writing a for loop that compares the rows line by line and value by value, adds the costs, and then deletes the duplicate. But I have to imagine there's a faster/cleaner way.

CodePudding user response:

We may use data.table to sum Cost within each Name/Fruit group:

library(data.table)
setDT(df)[, .(Cost = sum(Cost)), .(Name, Fruit)]

CodePudding user response:

What you are trying to do is sum Cost within a group.

In base R:

aggregate(Cost ~ Name + Fruit, df, sum)
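
As a quick check, here is a minimal sketch that runs the formula on the data frame from the question. The names `totals` and `cathy` are only illustrative and not part of the original post. Note that `aggregate()` sorts the groups, so the rows may come back in a different order than in the input.

```r
# Data frame from the question
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# Sum Cost for every distinct Name/Fruit combination
totals <- aggregate(Cost ~ Name + Fruit, data = df, FUN = sum)

# Look up the combined Cathy/Orange row (illustrative check)
cathy <- totals$Cost[totals$Name == "Cathy" & totals$Fruit == "Orange"]

print(totals)
print(cathy)
```

The two Cathy/Orange rows (4 and 5) collapse into a single row, while the other rows keep their original costs.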

Or using dplyr:

library(dplyr)

df %>% 
  group_by(Name, Fruit) %>% 
  summarize(Cost = sum(Cost), .groups = "drop")
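
If the original row order matters (the expected output in the question keeps Adam/Orange before Adam/Apple), one possible base R alternative, not taken from the answers above, is to do literally what the question describes: add up the costs, then drop the duplicate rows. This is a sketch under that assumption; `combined` is a hypothetical name.

```r
# Data frame from the question
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# Replace each Cost with the total for its Name/Fruit group
df$Cost <- ave(df$Cost, df$Name, df$Fruit, FUN = sum)

# Keep only the first row of each Name/Fruit pair, in input order
combined <- df[!duplicated(df[c("Name", "Fruit")]), ]

print(combined)
```

Because `duplicated()` flags only the later occurrences, the surviving rows stay in the order in which each Name/Fruit pair first appears.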