I have a dataframe with two categorical columns, each taking the values T1, T2, or T3. I want to create a new column based on the values in these two columns: if col1 is T1 and col2 is T1, return T1; if col1 is T3 and col2 is T3, return T3; any other combination, such as T1 and T3, returns NA. Any thoughts on how to do this would be appreciated. Thanks!
CodePudding user response:
This dataframe has enough combinations of values from col1 and col2 to represent your question.
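For instance, a small data frame covering both matching and non-matching pairs might be built as below. The values are assumptions chosen for illustration, not the data from the question, and the columns are stored as plain character vectors.

```r
# Hypothetical example data (values are assumed, not taken from the question)
df <- data.frame(
  col1 = c("T1", "T1", "T2", "T3", "T3", "T2"),
  col2 = c("T1", "T3", "T2", "T3", "T1", "T1"),
  stringsAsFactors = FALSE  # keep the columns as character, not factor
)
```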
My approach is to use dplyr::mutate() to create a third column, combined with dplyr::case_when(), which lets you define the conditions.
# Install dplyr first if you do not already have it
# install.packages("dplyr")
library(dplyr)
# Add the third column based on your conditions
df <- df %>%
  mutate(col3 = case_when(
    col1 == col2 ~ col1,      # same value in both columns: keep that value
    TRUE ~ NA_character_      # any other combination: NA
  ))
The dataframe df will look like this:
Where col1 and col2 have the same value, col3 takes that value. Otherwise, col3 is NA.
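Since the question describes the columns as categorical, they may be stored as factors rather than character vectors. Comparing factors whose level sets differ raises an error in R, and older dplyr versions may refuse to mix factor and character outputs in case_when(). A minimal sketch of one way around this, assuming factor columns and using a made-up data frame named df_fct, is to convert both columns with as.character() first:

```r
library(dplyr)

# Hypothetical data, stored as factors to mimic "categorical" columns
df_fct <- data.frame(
  col1 = factor(c("T1", "T1", "T2", "T3", "T3", "T2")),
  col2 = factor(c("T1", "T3", "T2", "T3", "T1", "T1"))
)

df_fct <- df_fct %>%
  mutate(
    # Convert to character so the comparison and the output share one type
    col1 = as.character(col1),
    col2 = as.character(col2),
    col3 = case_when(
      col1 == col2 ~ col1,      # same value in both columns: keep that value
      TRUE ~ NA_character_      # any other combination: NA
    )
  )

df_fct$col3
# "T1" NA "T2" "T3" NA NA
```

Note that a matching pair of T2 values also yields T2 under this rule. If only T1 and T3 should be kept, the condition could be narrowed to col1 == col2 & col1 %in% c("T1", "T3").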
I hope this helps.