Create a function to iterate over columns and create a new column each iteration in R

Time:05-25

Occasionally I get survey data with Likert-scale string items that I need to convert to numeric values in order to calculate basic descriptive statistics. To do this, I usually use the case_when function to create a new column for each item and assign each response a numeric value. I am trying to write a function that does this for many columns at once, so that I don't have to keep copying and pasting code. I am relatively new to this, so any help would be appreciated :)

Here is what I have done previously in R:

library(dplyr)  # provides case_when()

#create data frame
df <- data.frame(v1 = c("Definitely True", "Somewhat True","Somewhat False","Definitely False"),
                 v2 = c("Definitely False","Somewhat False","Somewhat True","Definitely True"))

#Use case_when to add numeric columns to dataframe
df$v1n <- case_when((df$v1 == "Definitely True")==TRUE ~ "1",
                         (df$v1 == "Somewhat True")==TRUE ~ "2",
                         (df$v1 == "Somewhat False")==TRUE ~ "3",
                         (df$v1 == "Definitely False")==TRUE ~ "4")
df$v2n <- case_when((df$v2 == "Definitely True")==TRUE ~ "1",
                         (df$v2 == "Somewhat True")==TRUE ~ "2",
                         (df$v2 == "Somewhat False")==TRUE ~ "3",
                         (df$v2 == "Definitely False")==TRUE ~ "4")

This works if I want to replace each string value with a numeric value and overwrite data in the existing columns:

for(i in colnames(data_x)) {
  data_x[[i]] <- case_when((data_x[,i] == "Definitely True")==TRUE ~ "1",
                         (data_x[,i] == "Somewhat True")==TRUE ~ "2",
                         (data_x[,i] == "Somewhat False")==TRUE ~ "3",
                         (data_x[,i] == "Definitely False")==TRUE ~ "4")
}

But I would like to find a way to create a new column for each iteration as I did with the copy and paste version. Here is something I have tried but I haven't had any success. Any help on this would be appreciated.

for(i in colnames(df)) {
  df[[var[i]]] <- case_when((df[,i] == "Definitely True")==TRUE ~ "1",
                         (df[,i] == "Somewhat True")==TRUE ~ "2",
                         (df[,i] == "Somewhat False")==TRUE ~ "3",
                         (df[,i] == "Definitely False")==TRUE ~ "4")
}

Answer:

dplyr

library(dplyr)

df %>%
  mutate(across(v1:v2, ~ case_when(
    . == "Definitely True" ~ "1",
    . == "Somewhat True" ~ "2",
    . == "Somewhat False" ~ "3",
    . == "Definitely False" ~ "4"  # explicit match, so unexpected or missing values become NA
    ), .names = "{.col}n")
  )
#                 v1               v2 v1n v2n
# 1  Definitely True Definitely False   1   4
# 2    Somewhat True   Somewhat False   2   3
# 3   Somewhat False    Somewhat True   3   2
# 4 Definitely False  Definitely True   4   1
  • across applies one operation to several columns. You can select the columns with a range such as v1:v2, or with one of the other dplyr selector functions like matches, starts_with, etc.
  • the second argument to across here is a tilde-function (rlang-style), inside which . stands for the data of the column currently being processed. For instance, the first time this tilde-function is evaluated, . refers to the vector df$v1.
  • because mutate(across(...)) replaces the selected columns by default, I add .names= to control the names of the resulting columns. This notation uses glue-syntax, where {.col} is replaced by the name of the column being evaluated in each iteration.
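
The new columns above hold character strings ("1", "2", ...), so they would still need converting before computing means or other descriptive statistics. As a minimal sketch of the same across pattern producing numeric columns directly, the block below moves the recoding into a named function. The name likert_to_num is a hypothetical helper introduced here for illustration, not part of dplyr.

```r
library(dplyr)

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True")
)

# Hypothetical helper (not part of dplyr): recodes one Likert column to numbers.
# Responses that match none of the four labels, including NA, become NA.
likert_to_num <- function(x) {
  case_when(
    x == "Definitely True"  ~ 1,
    x == "Somewhat True"    ~ 2,
    x == "Somewhat False"   ~ 3,
    x == "Definitely False" ~ 4
  )
}

# Apply the helper to v1 and v2, writing the results to v1n and v2n
df_num <- df %>%
  mutate(across(v1:v2, likert_to_num, .names = "{.col}n"))

mean(df_num$v1n)
```

Because the right-hand sides are numbers rather than strings, v1n and v2n can be passed straight to functions such as mean or sd.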

base R

I'll add the optional use of a lookup map.

lookup <- c("Definitely True" = "1", "Somewhat True" = "2", "Somewhat False" = "3", "Definitely False" = "4")
df <- cbind(df, setNames(lapply(df[,1:2], function(z) lookup[z]), paste0(names(df[,1:2]), "n")))
rownames(df) <- NULL
df
#                 v1               v2 v1n v2n
# 1  Definitely True Definitely False   1   4
# 2    Somewhat True   Somewhat False   2   3
# 3   Somewhat False    Somewhat True   3   2
# 4 Definitely False  Definitely True   4   1
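
For something closer to the loop in the question, the lookup idea can be wrapped in a small function that builds each new column name with paste0. This is only a sketch under stated assumptions: add_numeric_cols and lookup_num are hypothetical names introduced here, and the lookup values are numeric rather than strings so that the new columns can be summarized directly.

```r
# Hypothetical helper: adds a numeric "<name><suffix>" column for each column in `cols`
add_numeric_cols <- function(data, cols, lookup, suffix = "n") {
  for (col in cols) {
    # as.character() guards against factor columns, which would otherwise
    # index `lookup` by integer code instead of by label
    data[[paste0(col, suffix)]] <- unname(lookup[as.character(data[[col]])])
  }
  data
}

# Numeric version of the lookup map (an assumption; the original maps to strings)
lookup_num <- c("Definitely True" = 1, "Somewhat True" = 2,
                "Somewhat False" = 3, "Definitely False" = 4)

df <- data.frame(
  v1 = c("Definitely True", "Somewhat True", "Somewhat False", "Definitely False"),
  v2 = c("Definitely False", "Somewhat False", "Somewhat True", "Definitely True")
)

df <- add_numeric_cols(df, c("v1", "v2"), lookup_num)
colMeans(df[c("v1n", "v2n")])
```

Responses that are missing or not present in the lookup come back as NA, which makes unexpected labels easy to spot.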