How to evaluate variable names in a vector of pasted strings

I have an example df:

df <- data.frame(
  Date1 = as.Date(c("2010-09-10", "2015-12-03")),
  Date2 = as.Date(c("2012-09-10", "2016-12-03")),
  Selected = c("Date1", "Date1, Date2"),
  Value = c("2010-09-10", "2015-12-03, 2016-12-03"))

  Date1      Date2      Selected     Value                 
1 2010-09-10 2012-09-10 Date1        2010-09-10            
2 2015-12-03 2016-12-03 Date1, Date2 2015-12-03, 2016-12-03

I want to create (mutate) the column 'Value' by evaluating, for each row, whatever column names appear in the 'Selected' column. If 'Selected' always held a single column name I could use get(), but it can hold several variable names separated by commas, and that is where I'm stuck on how to evaluate them.

CodePudding user response:

You can do this in base R:

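# Split each 'Selected' string into a vector of column names
columns <- strsplit(df$Selected, ", ")
# For each row, format the named columns and paste the values back together
df$Value <- sapply(seq_len(nrow(df)), function(n) paste(format(df[n, columns[[n]]]), collapse = ", "))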
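
The split-and-look-up idea above can also be wrapped in a small helper. This is only a sketch: `lookup_selected` and its arguments `selected` and `sep` are hypothetical names that do not come from the answer, and the check for unknown column names is an added assumption about what you would want when 'Selected' contains a typo.

```r
df <- data.frame(
  Date1 = as.Date(c("2010-09-10", "2015-12-03")),
  Date2 = as.Date(c("2012-09-10", "2016-12-03")),
  Selected = c("Date1", "Date1, Date2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row, look up the columns named in `selected`
# and paste their formatted values together.
lookup_selected <- function(data, selected = "Selected", sep = ",\\s*") {
  # One character vector of column names per row
  cols <- strsplit(data[[selected]], sep)
  vapply(seq_along(cols), function(i) {
    # Fail early if a row names a column that does not exist
    unknown <- setdiff(cols[[i]], names(data))
    if (length(unknown) > 0) {
      stop("row ", i, " names unknown column(s): ",
           paste(unknown, collapse = ", "))
    }
    # Take the value of each named column in row i, formatted as text
    vals <- vapply(cols[[i]], function(nm) format(data[[nm]][i]), character(1))
    paste(vals, collapse = ", ")
  }, character(1))
}

df$Value <- lookup_selected(df)
```

Using vapply instead of sapply guarantees a character vector with one element per row, even for a zero-row data frame.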

CodePudding user response:

1) For each word in Selected, look up the column with that name and replace the word in the string with that column's value in the same row, using gsubfn. gsubfn is like gsub except that the replacement can be a function: the match to the capture group (the parenthesized part of the pattern) is passed to the function, and the function's output is substituted for it. The function can be specified in formula notation, as is done here.

library(dplyr)
library(gsubfn)

df %>%
  rowwise %>%
  mutate(Value = gsubfn("(\\w+)", ~ format(get(x)), Selected)) %>%
  ungroup
## # A tibble: 2 x 4
##   Date1      Date2      Selected     Value                 
##   <date>     <date>     <chr>        <chr>                 
## 1 2010-09-10 2012-09-10 Date1        2010-09-10            
## 2 2015-12-03 2016-12-03 Date1, Date2 2015-12-03, 2016-12-03

2) This can also be done with gsubfn alone, without dplyr.

library(gsubfn)

nr <- nrow(df)
df |>
  by(1:nr, transform, Value = gsubfn("(\\w+)", ~ format(get(x)), Selected)) |>
  do.call(what = "rbind")
##        Date1      Date2     Selected                  Value
## 1 2010-09-10 2012-09-10        Date1             2010-09-10
## 2 2015-12-03 2016-12-03 Date1, Date2 2015-12-03, 2016-12-03
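
For comparison, the match-and-replace idea behind gsubfn can be sketched in base R with gregexpr() and the replacement form of regmatches(). This is not part of the answer above, only an illustration under the assumption that every word in 'Selected' is a valid column name. The intermediate names `m`, `names_per_row` and `values_per_row` are made up for the example.

```r
df <- data.frame(
  Date1 = as.Date(c("2010-09-10", "2015-12-03")),
  Date2 = as.Date(c("2012-09-10", "2016-12-03")),
  Selected = c("Date1", "Date1, Date2"),
  stringsAsFactors = FALSE
)

# Locate every run of word characters (the column names) in each string
m <- gregexpr("\\w+", df$Selected)
names_per_row <- regmatches(df$Selected, m)

# For each row, fetch the value of every named column in that same row
values_per_row <- lapply(seq_len(nrow(df)), function(i) {
  vapply(names_per_row[[i]], function(nm) format(df[[nm]][i]),
         character(1), USE.NAMES = FALSE)
})

# Substitute the values for the names, leaving the ", " separators in place
Value <- df$Selected
regmatches(Value, m) <- values_per_row
df$Value <- Value
```

Because only the matched words are replaced, whatever separator text appears in 'Selected' is carried over unchanged into 'Value', just as with gsubfn.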

CodePudding user response:

A possible solution, based on tidyverse:

library(tidyverse)

df %>% 
  separate_rows(Selected, sep = ",\\s*") %>% 
  rowwise %>% 
  mutate(Value = cur_data()[[Selected]]) %>% 
  group_by(Date1, Date2) %>% 
  summarise(Selected = str_c(Selected, collapse = ", "), 
            Value = str_c(Value, collapse = ", "), .groups = "drop")

#> # A tibble: 2 × 4
#>   Date1      Date2      Selected     Value                 
#>   <date>     <date>     <chr>        <chr>                 
#> 1 2010-09-10 2012-09-10 Date1        2010-09-10            
#> 2 2015-12-03 2016-12-03 Date1, Date2 2015-12-03, 2016-12-03

A more succinct version:

library(tidyverse)

df %>%
  rowwise %>%
  mutate(Value = map(str_split(Selected, ", "), ~ cur_data()[.x]),
         Value = reduce(Value, c) %>% str_c(collapse=", ")) %>%
  ungroup

#> # A tibble: 2 × 4
#>   Date1      Date2      Selected     Value                 
#>   <date>     <date>     <chr>        <chr>                 
#> 1 2010-09-10 2012-09-10 Date1        2010-09-10            
#> 2 2015-12-03 2016-12-03 Date1, Date2 2015-12-03, 2016-12-03