Home > Software engineering >  Using R - If columns A and B do not match, replace entries in columns C and D with 'NA'
Using R - If columns A and B do not match, replace entries in columns C and D with 'NA'

Time:02-24

I have 4 columns in a dataframe:

dd <- data.frame(
  A= c(102,101,100,107,55),
  B= c(102,101,30,77,55),
  C= c("cycle", "walk", "walk", "walk", "drive"),
  D= c("English", "French", "English", "English", "Japanese")
)

Using R, I wish to compare if entries in columns A and B matches. If entries in A and B do not match, then I wish to replace their corresponding entries in column C and D with 'NA'.

desired <- data.frame(
  A= c(102,101,100,107, 55),
  B= c(102,101,30,77, 55),
  C= c("cycle", "walk", NA, NA, "drive"),
  D= c("English", "French", NA, NA, "Japanese")
)

Can anyone advise? The qn is not new but the proposed solutions are usually in Excel. I hope to do this in R. Thank you in advance!

CodePudding user response:

Here is a dplyr approach:

Across column C and column D, if A is not equal to B, set NA, if they are equal, use their original values.

UPDATE: since your column A and B are factors of different levels, you might want to change them into character first, do the operation, then change them back to factor.

library(dplyr)

dd %>% mutate(across(A:B, ~as.character(.x)), 
              across(C:D, ~ ifelse(A != B, NA, as.character(.x))), 
              across(A:B, ~as.factor(.x)))

   A   B     C        D
1 102 102 cycle  English
2 101 101  walk   French
3 100  30  <NA>     <NA>
4 107  77  <NA>     <NA>
5  55  55 drive Japanese

CodePudding user response:

A simple option

df[df$A!=df$B,c("C","D")]=NA

if A and B are factors you could try

df[as.character(df$A)!=as.character(df$B),c("C","D")]=NA
  •  Tags:  
  • r
  • Related