I have a dataframe that looks like this:
Name Fruit Cost
Adam Orange 2
Adam Apple 3
Bob Orange 3
Cathy Orange 4
Cathy Orange 5
Dataframe creation:
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)
I would like to combine rows so that when Name and Fruit both match, the Cost values are added together and the duplicate row is removed. For the example, the result would look like this, with the two Cathy costs combined because the Name and Fruit are the same:
Name Fruit Cost
Adam Orange 2
Adam Apple 3
Bob Orange 3
Cathy Orange 9
I was thinking of writing a for loop that compares the rows line by line and value by value, adds the matching costs, and then deletes the extra row. But I have to imagine there's a faster/cleaner way.
CodePudding user response:
We may use data.table, grouping by Name and Fruit and summing Cost:
library(data.table)
setDT(df)[, .(Cost = sum(Cost)), .(Name, Fruit)]
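One thing to be aware of: setDT() converts df to a data.table by reference, so df itself changes class. A minimal sketch, assuming you would rather leave the original data frame untouched, is to work on a copy made with as.data.table(). The names dt and result are only illustrative.

```r
library(data.table)

df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# as.data.table() returns a copy, so df stays a plain data.frame
dt <- as.data.table(df)

# Sum Cost within each Name/Fruit combination
result <- dt[, .(Cost = sum(Cost)), by = .(Name, Fruit)]
print(result)
```

With by =, the groups come back in order of first appearance, so the rows should line up with the desired output in the question.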
CodePudding user response:
What you are trying to do is sum Cost within each Name/Fruit group.
In base R:
aggregate(Cost ~ Name + Fruit, df, sum)
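As a self-contained sketch of the base R approach on the question's data: the row order of aggregate() output may differ from the input order, so the sanity check below looks the value up by key rather than by row position. The variable names result and cathy are only illustrative.

```r
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  Cost  = c(2, 3, 3, 4, 5)
)

# Group by both Name and Fruit (joined with +) and sum Cost in each group
result <- aggregate(Cost ~ Name + Fruit, data = df, FUN = sum)
print(result)

# Sanity check: the two Cathy/Orange rows collapse into one with Cost 4 + 5
cathy <- result$Cost[result$Name == "Cathy" & result$Fruit == "Orange"]
stopifnot(length(cathy) == 1, cathy == 9)
```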
Or using dplyr:
library(dplyr)
df %>%
group_by(Name, Fruit) %>%
summarize(Cost = sum(Cost), .groups = "drop")