How can I have repetitive means across rows by groups

Time:10-17

I would like to get the mean value by group when the groups are defined by two identifiers. Say I have the following dataset:

set.seed(123)
df <- data.frame(id = 1:2,
                 id2 = c("a","b", "c","c", "b","c", "a","b", "c","b"),
                 var1 = runif(10))
                 

I am trying to get the average value for each combination of the two grouping columns with data.table. I would like to create another column (avg) holding these averages, so that the same average repeats across every row that shares the same id and id2. This is what I am trying to do:

setDT(df)[, avg := mean(var1), by=list(id,id2)]

So, just to clarify: there are two values identified with id = 1 and id2 = a. Their average would be (0.2875775 + 0.5281055)/2 = 0.4078415. I would like this value to repeat next to row 1 and row 7, which correspond to id = 1 and id2 = a, and likewise for all the other averages. How can I do this?

CodePudding user response:

library(tidyverse)

df %>% 
  group_by(id, id2) %>%
  mutate(avg = mean(var1))

      id id2     var1   avg
   <int> <chr>  <dbl> <dbl>
 1     1 a     0.288  0.408
 2     2 b     0.788  0.712
 3     1 c     0.409  0.480
 4     2 c     0.883  0.464
 5     1 b     0.940  0.940
 6     2 c     0.0456 0.464
 7     1 a     0.528  0.408
 8     2 b     0.892  0.712
 9     1 c     0.551  0.480
10     2 b     0.457  0.712

The data.table code you presented does the same task. Note that setDT() converts df itself to a data.table and := adds the column by reference, so nothing needs to be assigned back: if you print df afterwards, you will see that the additional avg column was created.
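To check this yourself, here is a minimal sketch that reuses the data and the data.table call from the question. It assumes the data.table package is installed; the `.(id, id2)` shorthand is just data.table's alias for `list(id, id2)`.

```r
library(data.table)

set.seed(123)
df <- data.frame(id = 1:2,
                 id2 = c("a","b", "c","c", "b","c", "a","b", "c","b"),
                 var1 = runif(10))

# setDT() converts df to a data.table in place; := then adds avg by reference,
# so the result does not have to be assigned back to df.
setDT(df)[, avg := mean(var1), by = .(id, id2)]

# df now carries the avg column. Rows 1 and 7 (id = 1, id2 = "a")
# share the same group mean.
print(df[id == 1 & id2 == "a"])
```

If you would rather keep the original data frame untouched, one option is to work on a copy, e.g. `dt <- as.data.table(df)`, before adding the column.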
