Home > database >  Compute mean excluding current value
Compute mean excluding current value

Time:10-01

I have the following table

   a b avg 
1: 1 7   3    
2: 1 0   3    
3: 1 2   3    
4: 2 1   2    
5: 2 3   2    

where 'a' and 'b' are data and 'avg' calculates the average of 'b' grouped by 'a'.

Now I want to calculate the average ('avg2') of 'b' grouped by 'a' excluding the current value:

   a b avg    avg2
1: 1 7   3    1.00
2: 1 0   3    4.50
3: 1 2   3    3.50
4: 2 1   2    3.00
5: 2 3   2    1.00

I have tried a manual calculation,

dt[ , (sum(b) - ?? )/(.N -1), by = a]

but I don't know how to fill the gap in the numerator. I guess a related question I have is if there is a way to refer to the current row while performing a summary calculation.

CodePudding user response:

I am not sure if your calculation is correct for group 1 but you can do -

library(data.table)

setDT(df)[, avg2 := (sum(b) - b)/(.N -1), a]
df

#   a b avg avg2
#1: 1 7   3  1.0
#2: 1 0   3  4.5
#3: 1 2   3  3.5
#4: 2 1   2  3.0
#5: 2 3   2  1.0
  • Related