R - Calculate means including all factor levels but one

Time:11-01

Let's say we use the data frame mtcars, and I would like to add a column qsec_control, calculated as the mean(qsec) of all rows that don't have the same cyl as the current row (e.g. if cyl == 6, it would take mean(qsec[cyl != 6])). The question feels somewhat dumb, but I can't figure out how to do this.

Thanks in advance

CodePudding user response:

This solution groups by cyl, then uses dplyr::cur_group_rows() as a negative index to drop the current group's rows from mtcars$qsec before taking the mean:

library(dplyr)

mtcars %>%
  group_by(cyl) %>%
  mutate(
    qsec_control = mean(mtcars$qsec[-cur_group_rows()])
  ) %>%
  ungroup()
# A tibble: 32 × 12
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb qsec_cont…¹
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>       <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4        17.8
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4        17.8
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1        17.2
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1        17.8
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2        18.7
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1        17.8
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4        18.7
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2        17.2
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2        17.2
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4        17.8
# … with 22 more rows, and abbreviated variable name ¹​qsec_control
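
As a cross-check that needs no packages, the same column can be sketched in base R from totals: the mean of the other groups equals (sum over all rows minus the group's sum) divided by (number of all rows minus the group's size). This is only a sketch and not part of the answer above; the copy named df is an assumption made here so that mtcars itself stays untouched.

```r
# Work on a copy so the built-in dataset is not modified
# (df is a name chosen for this sketch)
df <- datasets::mtcars

# Per-row group sum and group size, repeated for every row of the group
group_sum <- ave(df$qsec, df$cyl, FUN = sum)
group_n   <- ave(df$qsec, df$cyl, FUN = length)

# Mean of qsec over all rows that belong to a different cyl group
df$qsec_control <- (sum(df$qsec) - group_sum) / (nrow(df) - group_n)

head(df[, c("cyl", "qsec", "qsec_control")])
```

Because it works from sums and counts, this avoids re-subsetting the data for every group, which may matter on larger tables.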

CodePudding user response:

Using data.table's :=, apply the function you had in mind to each unique value of mtcars$cyl using lapply().

library(data.table)
data(mtcars)
setDT(mtcars)

# For each cyl value, set qsec_control to the mean qsec of all other groups
lapply(unique(mtcars$cyl), function(x) {
  mtcars[cyl == x, qsec_control := mtcars[cyl != x, mean(qsec)]]
})

mtcars
     mpg cyl  disp  hp drat    wt  qsec vs am gear carb qsec_control
 1: 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4     17.81280
 2: 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4     17.81280
 3: 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1     17.17381
 4: 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1     17.81280
 5: 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2     18.68611

28: 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2     17.17381
29: 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4     18.68611
30: 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6     17.81280
31: 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8     18.68611
32: 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2     17.17381
     mpg cyl  disp  hp drat    wt  qsec vs am gear carb qsec_control
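
A possible variant without lapply(), sketched here as an assumption rather than as part of the answer above: group with by = cyl and use data.table's .I (the row numbers of the current group) as a negative index, similar in spirit to cur_group_rows() in dplyr. The name dt is chosen for this sketch only.

```r
library(data.table)

# Fresh data.table copy of the built-in dataset
# (dt is a name chosen for this sketch)
dt <- as.data.table(datasets::mtcars)

# Within each cyl group, .I holds that group's row numbers in dt,
# so dt$qsec[-.I] is qsec for every row outside the group
dt[, qsec_control := mean(dt$qsec[-.I]), by = cyl]

# One row per cyl value with its control mean
unique(dt[, .(cyl, qsec_control)])
```

The control means should match the qsec_control values printed above for cyl 6, 4 and 8.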