Home > OS >  How to sum data from multiple rows and columns according to unique variable in R?
How to sum data from multiple rows and columns according to unique variable in R?

Time:06-28

I am trying to sum the average total sales of each employee in this data frame (sorry for format, i'm not sure how else to get a .csv uploaded here). For example, Tim sold ten vehicles day 1, nine day 2, and eleven day 3, for an average of 10 per day. Is there a way to grab all of the columns each time the employee's name shows up, then sum and average the total sales?

CodePudding user response:

How about this:

library(dplyr)
dat <- tibble::tribble(
  ~Employee,    ~Day,   ~Car_Sales, ~Van_Sales, ~Truck_Sales,
"Tim",  "1/1",  5,  2,  3,
"Tim",  "1/2",  4,  2,  3,
"Tim",  "1/3",  7,  1,  3,
"Craig", "1/1", 2,  6,  1,
"Craig", "1/2", 5,  7,  2,
"Samantha", "1/1",  4,  3,  2)


dat %>% 
  rowwise() %>% 
  mutate(total_sales = sum(c_across(-c(1,2)))) %>% 
  group_by(Employee) %>% 
  summarise(average = mean(total_sales))
#> # A tibble: 3 × 2
#>   Employee average
#>   <chr>      <dbl>
#> 1 Craig       11.5
#> 2 Samantha     9  
#> 3 Tim         10

Created on 2022-06-27 by the reprex package (v2.0.1)

CodePudding user response:

dat <- tibble::tribble(
  ~Employee,    ~Day,   ~Car_Sales, ~Van_Sales, ~Truck_Sales,
"Tim",  "1/1",  5,  2,  3,
"Tim",  "1/2",  4,  2,  3,
"Tim",  "1/3",  7,  1,  3,
"Craig", "1/1", 2,  6,  1,
"Craig", "1/2", 5,  7,  2,
"Samantha", "1/1",  4,  3,  2)

dat %>% mutate(sales = Car_Sales   Van_Sales   Truck_Sales) %>% group_by(Employee) %>% summarise(Avg = mean(sales))

Output should look like this:

Employee Avg
Craig 11.5
Samantha 9.0
Tim 10.0

I hope this is what you mean.

CodePudding user response:

One option would be to put into a long format, then get the mean by summing all the data and divide by the number of days (i.e., using n_distinct).

library(tidyverse)

df %>%
  pivot_longer(ends_with("Sales")) %>%
  group_by(Employee) %>%
  summarise(Avg = sum(value) / n_distinct(Day))

Output

  Employee   Avg
  <chr>    <dbl>
1 Craig     11.5
2 Samantha   9  
3 Tim       10  

Data

df <- structure(list(Employee = c("Tim", "Tim", "Tim", "Craig", "Craig", 
"Samantha"), Day = c("1/1", "1/2", "1/3", "1/1", "1/2", "1/1"
), Car_Sales = c(5, 4, 7, 2, 5, 4), Van_Sales = c(2, 2, 1, 6, 
7, 3), Truck_Sales = c(3, 3, 3, 1, 2, 2)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -6L))
  • Related