Applying weighted.mean to specific values in a column

Time:12-10

I have a data frame named df with five columns:

age <- c(10,11,12,12,10,11,11,12,10,11,12)
time <- c(20,26,41,60,29,28,54,24,59,70,25)
weight <- c(123,330,445,145,67,167,190,104,209,146,201)
gender <- c(1,1,2,2,2,2,1,2,2,2,1)
Q2 <- c(112,119,114,120,121,117,116,114,121,122,124)
df <- data.frame(age, weight, time, gender, Q2)

I want to compute the weighted mean of time (weighted by weight) for each age, using only the rows that meet two conditions: 1) gender == 2 and 2) Q2 >= 114 & Q2 <= 121.

With the code below I can compute the weighted mean per age, but I do not know how to apply the two conditions.

library(dplyr)

# weighted mean of time per age, without any conditions yet
df1 <- df %>%
  group_by(age) %>%
  summarise(weighted_time = weighted.mean(time, weight))

CodePudding user response:

Is the following what you are looking for?

library(tidyverse)

age <- c(10,11,12,12,10,11,11,12,10,11,12)
time <- c(20,26,41,60,29,28,54,24,59,70,25)
weight <- c(123,330,445,145,67,167,190,104,209,146,201)
gender <- c(1,1,2,2,2,2,1,2,2,2,1)
Q2 <- c(112,119,114,120,121,117,116,114,121,122,124)
df <- data.frame(age, weight, time, gender, Q2)

df %>% 
  group_by(age) %>% 
  filter(gender == 2 & Q2 >= 114 & Q2 <= 121) %>% 
  summarise(weighted_time = weighted.mean(time, weight), .groups = "drop")

#> # A tibble: 3 × 2
#>     age weighted_time
#>   <dbl>         <dbl>
#> 1    10          51.7
#> 2    11          28  
#> 3    12          42.4

CodePudding user response:

You can add a filter() step for the two conditions (three comparisons, since the Q2 range needs two of them):

df %>%
  filter(gender == 2 & Q2 >= 114 & Q2 <= 121) %>%
  group_by(age) %>%
  summarise(weighted_time = weighted.mean(time, weight))

This gives

# A tibble: 3 x 2
    age weighted_time
  <dbl>         <dbl>
1    10          51.7
2    11          28  
3    12          42.4
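
As a quick sanity check on the output above, the age 10 value can be reproduced by hand. This is only a sketch in base R: the variable names time_10, weight_10, by_hand and built_in are made up for illustration, and the two rows are the ones in the question's data with age 10 that pass both conditions.

```r
# Rows with age == 10 that pass both conditions
# (gender == 2 and 114 <= Q2 <= 121), copied from the question's data
time_10   <- c(29, 59)
weight_10 <- c(67, 209)

# weighted.mean(x, w) computes sum(x * w) / sum(w)
by_hand  <- sum(time_10 * weight_10) / sum(weight_10)
built_in <- weighted.mean(time_10, weight_10)

print(by_hand)
#> [1] 51.71739
print(all.equal(by_hand, built_in))
#> [1] TRUE
```

The result matches the 51.7 shown for age 10, which suggests the filter kept the intended rows.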

CodePudding user response:

A data.table alternative:

age <- c(10,11,12,12,10,11,11,12,10,11,12)
time <- c(20,26,41,60,29,28,54,24,59,70,25)
weight <- c(123,330,445,145,67,167,190,104,209,146,201)
gender <- c(1,1,2,2,2,2,1,2,2,2,1)
Q2 <- c(112,119,114,120,121,117,116,114,121,122,124)
df <- data.frame(age, weight, time, gender, Q2)

library(data.table)
setDT(df)[gender == 2 & Q2 >= 114 & Q2 <= 121,
          list(res = weighted.mean(time, weight)),
          by = age][order(age)]
#>    age      res
#> 1:  10 51.71739
#> 2:  11 28.00000
#> 3:  12 42.42219

Created on 2021-12-10 by the reprex package (v2.0.1)
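
If neither dplyr nor data.table is available, the same filter-then-group idea can probably be written in base R as well. The following is a minimal sketch rather than part of any answer above; the names keep, sub and res are chosen here for illustration only.

```r
age <- c(10,11,12,12,10,11,11,12,10,11,12)
time <- c(20,26,41,60,29,28,54,24,59,70,25)
weight <- c(123,330,445,145,67,167,190,104,209,146,201)
gender <- c(1,1,2,2,2,2,1,2,2,2,1)
Q2 <- c(112,119,114,120,121,117,116,114,121,122,124)
df <- data.frame(age, weight, time, gender, Q2)

# Logical index of the rows that meet both conditions
keep <- df$gender == 2 & df$Q2 >= 114 & df$Q2 <= 121
sub  <- df[keep, ]

# Split the remaining rows by age and take the weighted mean in each piece
res <- sapply(split(sub, sub$age),
              function(g) weighted.mean(g$time, g$weight))
print(res)
```

Here res is a named numeric vector whose names are the ages, so res[["10"]] gives the value for age 10.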
