R insert row with mean after group of values-CodePudding

How to insert a new row after a group of rows with the mean of the values grouped by date ?

id   date        price
1    2022-01-01  4
2    2022-01-01  2
3    2022-01-01  2
1    2022-01-02  5
2    2022-01-02  3
3    2022-01-02  1

Desired output

id   date        price
1    2022-01-01  4
2    2022-01-01  2
3    2022-01-01  2
mean 2022-01-01  2.66
1    2022-01-02  5
2    2022-01-02  3
3    2022-01-02  1
mean 2022-01-02  3

CodePudding user response：

You can do this (although I can't really understand why you would want your data in this format):

bind_rows(
  dat %>% mutate(id = as.character(id)),
  dat %>% group_by(date) %>% 
  summarize(price=mean(price)) %>% 
  mutate(id = "mean")
) %>% 
  arrange(date,id)

Output:

       id       date    price
   <char>     <IDat>    <num>
1:      1 2022-01-01 4.000000
2:      2 2022-01-01 2.000000
3:      3 2022-01-01 2.000000
4:   mean 2022-01-01 2.666667
5:      1 2022-01-02 5.000000
6:      2 2022-01-02 3.000000
7:      3 2022-01-02 1.000000
8:   mean 2022-01-02 3.000000

Perhaps better just to do this:

dat %>% group_by(date) %>% mutate(mean = mean(price))

Output:

     id date       price  mean
  <int> <date>     <int> <dbl>
1     1 2022-01-01     4  2.67
2     2 2022-01-01     2  2.67
3     3 2022-01-01     2  2.67
4     1 2022-01-02     5  3   
5     2 2022-01-02     3  3   
6     3 2022-01-02     1  3

CodePudding user response：

We can use add_row

library(dplyr)
library(tibble)
df1 %>%
  mutate(id = as.character(id)) %>% 
  group_by(date) %>%
  group_modify(~ .x %>% 
     add_row(id = 'mean', price = mean(.x$price, na.rm = TRUE))) %>% 
  ungroup %>% 
  select(names(df1))

-output

# A tibble: 8 × 3
  id    date       price
  <chr> <chr>      <dbl>
1 1     2022-01-01  4   
2 2     2022-01-01  2   
3 3     2022-01-01  2   
4 mean  2022-01-01  2.67
5 1     2022-01-02  5   
6 2     2022-01-02  3   
7 3     2022-01-02  1   
8 mean  2022-01-02  3

data

df1 <- structure(list(id = c(1L, 2L, 3L, 1L, 2L, 3L), date = c("2022-01-01", 
"2022-01-01", "2022-01-01", "2022-01-02", "2022-01-02", "2022-01-02"
), price = c(4L, 2L, 2L, 5L, 3L, 1L)), class = "data.frame", row.names = c(NA, 
-6L))