R loop to do a calculation based on a condition from another column

I have a data frame which looks like this

The assignment is based on column V1: if V1 = 1, the row belongs to the treatment group and I take its value from column Yi(1); if V1 = 0, the row belongs to the control group and I take its value from column Yi(0).

Ultimately, I need to do this for all 6 columns, V1 to V6. I have done it for V1 but have trouble extending the code to the other columns. Could a for-loop do the job? My aim is to compute the difference for each column and store all of them in a data frame.

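# Row indices of the treatment (V1 > 0) and control (V1 <= 0) groups
t_group <- which(combined_df$V1>0)
c_group <- which(combined_df$V1<=0)

# Mean of Yi(1) among the treated minus mean of Yi(0) among the controls
diff <- mean(combined_df$`Yi(1)`[t_group]) - mean(combined_df$`Yi(0)`[c_group])
diff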

CodePudding user response:

Untested, but you could do something like this:

Apply your logic as a function to work on one column:

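# x is the column number (or name) of one assignment column, V1 to V6.
# Returns Yi(0) where that column is 0, Yi(1) where it is 1, and NA otherwise
# (NA rather than the string "error", so the result stays numeric).
pm <- function(x){
  ifelse(combined_df[,x] == 0, combined_df[,"Yi(0)"],
         ifelse(combined_df[,x] == 1, combined_df[,"Yi(1)"], NA_real_))
}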

Apply that function to all columns of interest:

sapply(X = 5:10, FUN = pm)

where X is a vector of column numbers for V1 to V6, and FUN is the function where your logic is defined.
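The question asks for one difference in means per column, stored in a data frame, rather than the per-row outcomes returned above. A minimal for-loop sketch of that follows. The toy `combined_df` is a hypothetical stand-in, since the real data are not shown. The helper name `diff_in_means` is also an assumption. Group membership is tested with `== 1` and `== 0`, which matches the `> 0` and `<= 0` tests in the question only when the columns hold nothing but 0 and 1.

```r
# Toy stand-in for combined_df (hypothetical values; the real data are not shown)
combined_df <- data.frame(
  `Yi(0)` = 1:10,
  `Yi(1)` = 21:30,
  V1 = c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  V2 = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  check.names = FALSE  # keep the non-syntactic names Yi(0) and Yi(1)
)

# Difference in means for one assignment column (hypothetical helper)
diff_in_means <- function(data, col) {
  treated <- data[[col]] == 1
  control <- data[[col]] == 0
  mean(data[["Yi(1)"]][treated]) - mean(data[["Yi(0)"]][control])
}

# Columns to loop over; for the real data this would be paste0("V", 1:6)
assign_cols <- c("V1", "V2")

# Pre-allocate the result, then fill one row per assignment column
results <- data.frame(column = assign_cols, diff = NA_real_)
for (i in seq_along(assign_cols)) {
  results$diff[i] <- diff_in_means(combined_df, assign_cols[i])
}
results
```

Pre-allocating `results` and filling it by index avoids growing the data frame inside the loop.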

CodePudding user response:

With dplyr you can try this:

Data

df <- data.frame(Yi0 =  1:10,
                 Yi1 = 21:30,
                 V1 = c(1,1,1,1,1,0,0,0,0,0),
                 V2 = c(0,1,0,1,0,1,0,1,0,1))

Code

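library(dplyr)

# For each assignment column, take the mean of Yi1 among the treated (.x == 1)
# minus the mean of Yi0 among the controls (.x == 0)
df %>%
  summarise(across(V1:V2,
                ~ mean(df %>%
                         filter(.x == 1) %>%
                         pull(Yi1), na.rm = TRUE) -
                  mean(df %>%
                         filter(.x == 0) %>%
                         pull(Yi0), na.rm = TRUE)))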

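For your data, change V1:V2 to V1:V6. Because your outcome columns are named Yi(0) and Yi(1), refer to them with backticks (`Yi(0)` and `Yi(1)`) in place of Yi0 and Yi1.

Output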

  V1 V2
1 15 21
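If dplyr is not available, the same one-row result can be sketched in base R with `sapply`. This assumes the example `df` defined in this answer. The names `cols`, `diffs` and `out` are introduced here for illustration only.

```r
# Same example data as in this answer
df <- data.frame(Yi0 =  1:10,
                 Yi1 = 21:30,
                 V1 = c(1,1,1,1,1,0,0,0,0,0),
                 V2 = c(0,1,0,1,0,1,0,1,0,1))

cols <- c("V1", "V2")  # for the real data: paste0("V", 1:6)

# Named numeric vector: one difference in means per assignment column
diffs <- sapply(cols, function(col) {
  mean(df$Yi1[df[[col]] == 1], na.rm = TRUE) -
    mean(df$Yi0[df[[col]] == 0], na.rm = TRUE)
})

# Reshape into a one-row data frame with one column per assignment column
out <- as.data.frame(t(diffs))
out
```

Here `sapply` names each element after the column it was computed from, so `t(diffs)` becomes a single row whose column names are V1 and V2.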