Replace particular text strings with colname in dataframe in R

Time:05-10

I have two columns called Apple and Pear. If the text in either column is "Yes" or "No", I want it replaced with the column name; other values, including NA, should stay as they are. The desired result is shown in the columns Apple2 and Pear2 in the example.

Apple  <- c("Yes", "No", "Yes", "Other", NA)
Pear   <- c("Yes", NA, "No", "Other", "Yes")
Apple2 <- c("Apple", "Apple", "Apple", "Other", NA)
Pear2  <- c("Pear", NA, "Pear", "Other", "Pear")

data <- data.frame(Apple, Pear, Apple2, Pear2, stringsAsFactors = FALSE)

Can anyone suggest a way of achieving this?

CodePudding user response:

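With dplyr, this can be done with across(): cur_column() returns the name of the column currently being processed, and it can be used inside a conditional expression such as ifelse or case_when. Note that this overwrites Apple and Pear in place.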

library(dplyr)
data <- data %>%
    mutate(across(c(Apple, Pear),  
       ~ case_when(.x %in% c("Yes", "No") ~ cur_column(), TRUE ~ .x)))

Output

data
  Apple  Pear Apple2 Pear2
1 Apple  Pear  Apple  Pear
2 Apple  <NA>  Apple  <NA>
3 Apple  Pear  Apple  Pear
4 Other Other  Other Other
5  <NA>  Pear   <NA>  Pear
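
The answer mentions ifelse as an alternative to case_when but only shows the latter. A minimal sketch of the ifelse variant, assuming the same two input columns as in the question (the name result is only illustrative):

```r
library(dplyr)

data <- data.frame(
  Apple = c("Yes", "No", "Yes", "Other", NA),
  Pear  = c("Yes", NA, "No", "Other", "Yes"),
  stringsAsFactors = FALSE
)

# %in% returns FALSE for NA, so NA values fall through to the
# "else" branch and are kept unchanged.
result <- data %>%
  mutate(across(c(Apple, Pear),
                ~ ifelse(.x %in% c("Yes", "No"), cur_column(), .x)))

print(result)
```

Using %in% rather than == matters here: .x == "Yes" would yield NA for missing values, whereas %in% always yields TRUE or FALSE.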

CodePudding user response:

My answer is conceptually the same as @akrun's. The difference is that I assume your input data has only the two columns Apple and Pear, so across needs the .names argument to name the newly created columns instead of overwriting the originals. Because the fallback branch here returns NA, case_when also needs an explicit branch for .x == "Other".

library(dplyr)

Apple <- c("Yes", "No", "Yes", "Other", NA)
Pear  <- c("Yes", NA, "No", "Other", "Yes")
data  <- data.frame(Apple, Pear)

data %>% 
  mutate(across(everything(), 
                ~case_when(.x == "Other" ~ "Other",
                           .x %in% c("Yes", "No") ~ cur_column(),
                           TRUE ~ NA_character_), 
                .names = "{.col}2"))

  Apple  Pear Apple2 Pear2
1   Yes   Yes  Apple  Pear
2    No  <NA>  Apple  <NA>
3   Yes    No  Apple  Pear
4 Other Other  Other Other
5  <NA>   Yes   <NA>  Pear
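
One consequence of the TRUE ~ NA_character_ fallback is that any value other than "Yes", "No" or "Other" becomes NA. If unexpected values should be kept instead, a possible sketch is to fall back to .x. The value "Maybe" below is a hypothetical addition for illustration and is not part of the original data:

```r
library(dplyr)

# "Maybe" is a hypothetical extra value, added only to show the fallback.
data <- data.frame(
  Apple = c("Yes", "No", "Maybe", "Other", NA),
  Pear  = c("Yes", NA, "No", "Other", "Yes"),
  stringsAsFactors = FALSE
)

result <- data %>%
  mutate(across(everything(),
                ~ case_when(.x %in% c("Yes", "No") ~ cur_column(),
                            TRUE ~ .x),   # keep every other value, including NA
                .names = "{.col}2"))      # new columns: Apple2, Pear2

print(result)
```

With this fallback the separate "Other" branch is no longer needed, at the cost of letting any unforeseen string pass through unchanged.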

CodePudding user response:

Here's another option using imap_dfr from purrr:

library(tidyverse)

imap_dfr(data, ~ replace(.x, .x %in% c("Yes", "No"), .y))

Output

  Apple Pear  Apple2 Pear2
  <chr> <chr> <chr>  <chr>
1 Apple Pear  Apple  Pear 
2 Apple NA    Apple  NA   
3 Apple Pear  Apple  Pear 
4 Other Other Other  Other
5 NA    Pear  NA     Pear

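Or another option using transmute. Note that this relies on deparse(substitute(.)) resolving to the column name inside across, which depends on how dplyr evaluates the lambda; cur_column() is the documented way to get the name.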

data %>%
  transmute(across(everything(), ~ ifelse(
    .x %in% c("Yes", "No"), deparse(substitute(.)), .
  )))

Data

data <- structure(list(Apple = c("Yes", "No", "Yes", "Other", NA), Pear = c("Yes", 
NA, "No", "Other", "Yes"), Apple2 = c("Apple", "Apple", "Apple", 
"Other", NA), Pear2 = c("Pear", NA, "Pear", "Other", "Pear")), class = "data.frame", row.names = c(NA, 
-5L))
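
For readers who prefer to avoid extra packages, the same replacement can be sketched in base R with Map and replace. This is an illustrative alternative rather than part of the answers above; the names cols and result are assumptions made for the example:

```r
data <- data.frame(
  Apple = c("Yes", "No", "Yes", "Other", NA),
  Pear  = c("Yes", NA, "No", "Other", "Yes"),
  stringsAsFactors = FALSE
)

cols <- c("Apple", "Pear")   # columns to process

result <- data
# Map walks over each column together with its name and swaps
# "Yes"/"No" for that name; other values and NA are left alone.
result[cols] <- Map(
  function(x, nm) replace(x, x %in% c("Yes", "No"), nm),
  data[cols], cols
)

print(result)
```

To write to new columns instead of overwriting, the left-hand side can be changed to result[paste0(cols, "2")].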