Loop for creating multiple new 3 level variables from another 5 level variable

Time:09-30

I'm looking for a way to generate multiple 3-level variables from an older 5-level variable, while keeping the old variables.

This is how it is now:

structure(list(Question1 = c("I", "5", "4", "4"), Question2 = c("I", 
"5", "4", "4"), Question3 = c("I", "3", "2", "4")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -4L))

I would like this:

structure(list(Question1 = c("I", "5", "4", "4"), Question2 = c("I", 
"5", "4", "4"), Question3 = c("I", "3", "2", "4"), Question1_3l = c("NA", 
"3", "3", "3"), Question2_3l = c("NA", "3", "3", "3"), Question3_3l = c("NA", 
"2", "1", "3")), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"))

I have this code to recode the 5-level variables in place:

df2 %>% 
  mutate_at(vars(Question1, Question2, Question3), recode,
            '1' = 1, '2' = 1, '3' = 3, '4' = 5, '5' = 5, 'I' = NA_real_)

But what I want is to keep the old variables and write the 3-level versions to new columns named something like Question1_3l, Question2_3l, Question3_3l.

It shouldn't be too difficult. In Stata it looks something like this:

foreach i of varlist ovsat-not_type_number {
    local lbl : variable label `i' 
    recode `i' (1/2=1)(3=2)(4/5=3), gen(`i'_3l)
    }

Thank you.

CodePudding user response:

Not the most elegant, not the fastest (but still pretty fast), not the most idiomatic, but this does what you want (I think) and should be easy to read and customize.

library(data.table)

dt <- structure(list(Question1 = c("I", "5", "4", "4"), 
                     Question2 = c("I", "5", "4", "4"), 
                     Question3 = c("I", "3", "2", "4")), 
  class = c("tbl_df", "tbl", "data.frame"), 
  row.names = c(NA, -4L))

#transform your data into a data.table (by reference, no copy)
setDT(dt)

#define the names of the columns that you want to recode
vartoconv <- names(dt)

#define the names of the recoded columns
newnames <- paste0(vartoconv, "_3l")

#define an index along the vector of the names of the columns to recode
for(varname_loopid in seq_along(vartoconv)){
  
  #identify the name of the column to recode for each iteration
  varname_loop <- vartoconv[varname_loopid]

  #identify the name of the recoded column for each iteration
  newname_loop <- newnames[varname_loopid]

  #create the recoded variable by using conditionals on the variable to recode
  dt[get(varname_loop) %in% c(1, 2), (newname_loop) := 1]
  dt[get(varname_loop) == 3, (newname_loop) := 2]
  dt[get(varname_loop) %in% c(4, 5), (newname_loop) := 3]
  
}
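
The same loop can also be written without data.table. What follows is only a minimal base R sketch, assuming the question's four-row sample with the first column named Question1; `lookup` and `vars_to_recode` are names made up for this illustration. Indexing a named vector by the old values does the recode, and anything not in the lookup (such as "I") comes back as NA.

```r
# Sample data from the question (first column assumed to be named Question1)
df <- data.frame(
  Question1 = c("I", "5", "4", "4"),
  Question2 = c("I", "5", "4", "4"),
  Question3 = c("I", "3", "2", "4"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector: old level (as a string) -> new level
lookup <- c("1" = 1L, "2" = 1L, "3" = 2L, "4" = 3L, "5" = 3L)

# Fix the list of source columns before the loop starts adding new ones
vars_to_recode <- names(df)

for (v in vars_to_recode) {
  # Values absent from the lookup (such as "I") are returned as NA
  df[[paste0(v, "_3l")]] <- unname(lookup[df[[v]]])
}

print(df)
```

The original columns stay untouched, and changing the mapping only means editing `lookup`.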

CodePudding user response:

Try:

library(tidyverse)
library(stringr)

df2 <- replicate(6, sample(as.character(1:5), 50, replace = TRUE), simplify = "matrix") %>%
  as_tibble(.name_repair = ~str_c("Question", 1:6))

df2 %>%
  mutate_at(vars(Question1:Question3), 
            # a named list appends "_3l" to each new column and keeps the originals
            list(`3l` = ~case_when(.x %in% c('1', '2') ~ 1L, # 1L means integer 1
                                   .x %in% c('3') ~ 2L,
                                   .x %in% c('4', '5') ~ 3L,
                                   TRUE ~ NA_integer_)))
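
To get the `_3l` columns from the question while leaving the originals in place, one option is `across()` with its `.names` argument (dplyr 1.0.0 or later). This is a rough sketch rather than the answerer's own code: it uses the question's four-row sample, assumes the first column is named Question1, and codes the levels as 1, 2, 3 to match the desired output shown in the question.

```r
library(dplyr)

# Sample data from the question (first column assumed to be named Question1)
df2 <- tibble(
  Question1 = c("I", "5", "4", "4"),
  Question2 = c("I", "5", "4", "4"),
  Question3 = c("I", "3", "2", "4")
)

out <- df2 %>%
  mutate(across(
    Question1:Question3,
    ~ case_when(
      .x %in% c("1", "2") ~ 1L,
      .x == "3"           ~ 2L,
      .x %in% c("4", "5") ~ 3L,
      TRUE                ~ NA_integer_  # anything else, e.g. "I"
    ),
    .names = "{.col}_3l"  # new columns get the suffix; originals are kept
  ))

print(out)
```

Because `.names` gives the results new names, `mutate()` adds them next to the existing columns instead of overwriting them.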