I have a data frame (df) that is a larger version of this:
txnID date product sold repID lastName
1001 8/5/2020 Clobromizen 600 203 Kappoorthy
1002 6/28/2020 Alaraphosol 276 887 da Silva
1003 6/28/2020 Alaraphosol 184 887 da Silva
1004 4/16/2020 Diaprogenix 36 887 da Silva
1005 6/14/2020 Diaprogenix 40 887 da Silva
1006 5/19/2020 Xinoprozen 5640 332 McRowe
1007 8/23/2020 Diaprogenix 60 332 McRowe
1008 11/14/2020 Clobromizen 2880 332 McRowe
1009 9/26/2020 Colophrazen 738 203 Kappoorthy
1010 2/5/2020 Diaprogenix 20 332 McRowe
1011 9/23/2020 Gerantrazeophem 3740 100 Schwab
1012 12/4/2020 Clobromizen 1584 221 Sixt
I want to create a new data frame that holds the total of sold for each employee (all of the employees appear in the data above). It would look something like this:
View(df1)
lastName totalSold
1 Kappoorthy sum(df$sold)
2 da Silva sum(df$sold)
3 McRowe sum(df$sold)
4 Schwab sum(df$sold)
5 Sixt sum(df$sold)
CodePudding user response:
Here is a way to do it with dplyr:
library(dplyr)
df %>%
  group_by(lastName) %>%
  summarize(totalSold = sum(sold, na.rm = TRUE))
CodePudding user response:
Using aggregate() from base R:
aggregate(sold ~ lastName, data = df, FUN = sum, na.rm = TRUE)
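Note that aggregate() keeps the original column name (sold) rather than the totalSold name asked for in the question. A minimal sketch of the full round trip, assuming a df rebuilt by hand from the twelve sample rows in the question (only the two columns needed here; the real df has more columns and rows):

```r
# Sample rows copied from the question; only lastName and sold are kept.
df <- data.frame(
  lastName = c("Kappoorthy", "da Silva", "da Silva", "da Silva", "da Silva",
               "McRowe", "McRowe", "McRowe", "Kappoorthy", "McRowe",
               "Schwab", "Sixt"),
  sold = c(600, 276, 184, 36, 40, 5640, 60, 2880, 738, 20, 3740, 1584)
)

# Sum sold per last name, one row per employee.
df1 <- aggregate(sold ~ lastName, data = df, FUN = sum, na.rm = TRUE)

# Rename the summed column to match the desired output.
names(df1)[names(df1) == "sold"] <- "totalSold"

df1
```

If two reps could share a last name, grouping on repID as well (sold ~ repID + lastName) would keep them separate.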