I have a dataframe that looks like this:
Name   Fruit   CostA  CostB
Adam   Orange  2      5
Adam   Apple   3      6
Bob    Orange  3      5
Cathy  Orange  4      3
Cathy  Orange  5      7
Dataframe creation:
df <- data.frame(Name = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
                 Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
                 CostA = c(2, 3, 3, 4, 5),
                 CostB = c(7, 8, 9, 3, 4))
This dataframe will grow by an unspecified number of columns (CostC, CostD, etc.). I need to sum all of the cost columns (CostA, CostB, etc.) across rows where both the Name and Fruit values match.
I have done something similar when the cost columns to be summed are known in advance, using this:
aggregate(cbind(CostA, CostB) ~ Name + Fruit, df, sum)
However, I now need to do this without naming every column to be summed.
CodePudding user response:
If these are the only columns, use . on the left-hand side of the formula to stand for all the remaining columns:
aggregate(. ~ Name + Fruit, df, sum)
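If there are other columns as well and you want to sum only the cost columns, with 'Name' and 'Fruit' as the grouping columns, first subset the data with the select option. Here which() turns the logical vector from startsWith() into column positions, which select can combine with the column names:
aggregate(. ~ Name + Fruit,
          subset(df, select = c(Name, Fruit, which(startsWith(names(df), "Cost")))), sum)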
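As an alternative sketch in base R that avoids the formula interface, the cost columns can be picked by name pattern and passed to aggregate() directly. It assumes every column to be summed starts with "Cost"; the Note column and the names cost_cols and result are hypothetical, added only to show that non-cost columns are ignored.

```r
# Example data from the question, plus a hypothetical non-cost column (Note)
df <- data.frame(
  Name  = c("Adam", "Adam", "Bob", "Cathy", "Cathy"),
  Fruit = c("Orange", "Apple", "Orange", "Orange", "Orange"),
  CostA = c(2, 3, 3, 4, 5),
  CostB = c(7, 8, 9, 3, 4),
  Note  = c("a", "b", "c", "d", "e")
)

# Find every column whose name starts with "Cost", however many there are
cost_cols <- grep("^Cost", names(df), value = TRUE)

# Sum the cost columns within each Name/Fruit combination
result <- aggregate(df[cost_cols], by = df[c("Name", "Fruit")], FUN = sum)
print(result)
```

Because cost_cols is computed from names(df), the same call should keep working when CostC, CostD, and so on are added later, as long as they follow the same naming pattern.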