Using R, create a new column in a dataframe based on the values in two existing columns

Time:10-01

I have a dataframe with two categorical columns, each taking the values T1, T2, or T3. I want to create a new column whose value depends on both columns:

If col1 is T1 and col2 is T1, return T1.
If col1 is T3 and col2 is T3, return T3.
For any other combination, such as T1 and T3, return NA.

Any thoughts on how to do this would be appreciated. Thanks!

CodePudding user response:

I suggest working with a small example dataframe, df, that has the two columns col1 and col2.

This dataframe has enough combinations of values from col1 and col2 to represent your question.
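
The example data itself is not shown here, so the following is only a sketch of what such a dataframe could look like. The values are made up for illustration; they are not the asker's data.

```r
# Hypothetical example data: matching pairs for each level plus a few mismatches.
df <- data.frame(
  col1 = c("T1", "T1", "T2", "T3", "T3", "T2"),
  col2 = c("T1", "T3", "T2", "T3", "T1", "T1"),
  stringsAsFactors = FALSE  # keep the columns as character, not factor
)
```

Keeping the columns as character vectors avoids factor-level mismatches when the two columns are compared.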

My approach to your question is to use dplyr::mutate() to create a third column, combined with dplyr::case_when() which allows you to define conditions.

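# Install dplyr first if it is not already installed
# install.packages("dplyr")
library(dplyr)

# Add the third column based on your conditions
# (assumes col1 and col2 are character columns)
df <- df %>%
  mutate(col3 = case_when(
    col1 == col2 ~ col1,   # both columns agree: keep that value
    TRUE ~ NA_character_   # any other combination: NA
  ))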

After this step, the dataframe df has a third column, col3.

If col1 and col2 have the same value, col3 takes that value; otherwise it is NA. Note that this also keeps T2 when both columns are T2, a case the question does not mention explicitly.
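
For comparison, here is a rough base R sketch of the same rule using ifelse(), in case dplyr is not available. It swaps case_when() for ifelse(), so it is an alternative rather than the approach above. The data and the column name col3_strict are hypothetical, and the second rule shows one possible stricter reading of the question in which only matching T1 or T3 pairs are kept.

```r
# Hypothetical example data; the values are made up for illustration.
df <- data.frame(
  col1 = c("T1", "T1", "T2", "T3", "T3", "T2"),
  col2 = c("T1", "T3", "T2", "T3", "T1", "T1"),
  stringsAsFactors = FALSE
)

# If the real columns are factors, convert them first; ifelse() would
# otherwise return the underlying integer codes instead of the labels.
df$col1 <- as.character(df$col1)
df$col2 <- as.character(df$col2)

# Same rule as the dplyr version: keep the value when both columns agree.
df$col3 <- ifelse(df$col1 == df$col2, df$col1, NA_character_)
# col3 is: "T1" NA "T2" "T3" NA NA

# Stricter reading: only matching T1 or T3 pairs are kept,
# so a T2/T2 row also becomes NA.
keep <- df$col1 == df$col2 & df$col1 %in% c("T1", "T3")
df$col3_strict <- ifelse(keep, df$col1, NA_character_)
# col3_strict is: "T1" NA NA "T3" NA NA

print(df)
```

Which of the two rules is right depends on whether a T2/T2 row should count as a match.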

I hope this helps.
