Check conditions in two columns in R and print letter into another column

Time:08-13

In R, I would like to check two columns of my data frame: if, for example, the value in column 2 is 1 and the value in column 3 is also 1, then write "TP" into a new column 4.

col1 <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4)
col2 <- c(2,5,7,9,1,4,2,1,8,3,4,1,2,5,7,1,5,4,8,1,2,6,4,8,9,1,2,4,3,5,7,2,8,6,1,2)
col3 <- c(1,0,1,1,0,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1,1,1,0,1,0)

data <- data.frame(col1, col2, col3)

data$col4 <- NA

For this data frame, the desired result would look like this:

col1 <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4)
col2 <- c(2,5,7,9,1,4,2,1,8,3,4,1,2,5,7,1,5,4,8,1,2,6,4,8,9,1,2,4,3,5,7,2,8,6,1,2)
col3 <- c(1,0,1,1,0,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1,1,1,0,1,0)

data <- data.frame(col1, col2, col3)
data$col4 <- NA
data[8,4] <- "TP"
data[16,4] <- "TP"
data[20,4] <- "TP"
data[35,4] <- "TP"

Thanks in advance!

CodePudding user response:

You can use ifelse() like this:

data$col4 <- ifelse(data$col2 == 1 & data$col3 == 1, "TP", NA)
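
As a quick sanity check, the sketch below rebuilds the data from the question and lists the rows that ifelse() marks as "TP", so they can be compared with the rows filled in by hand in the question. The name tp_rows is only illustrative.

```r
# Data from the question
col1 <- rep(c(1, 2, 3, 4), each = 9)
col2 <- c(2,5,7,9,1,4,2,1,8,3,4,1,2,5,7,1,5,4,8,1,2,6,4,8,9,1,2,4,3,5,7,2,8,6,1,2)
col3 <- c(1,0,1,1,0,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1,1,1,0,1,0)
data <- data.frame(col1, col2, col3)

# "TP" where both columns equal 1, NA everywhere else
data$col4 <- ifelse(data$col2 == 1 & data$col3 == 1, "TP", NA)

# Row numbers where both conditions hold (which() skips NA)
tp_rows <- which(data$col4 == "TP")
tp_rows
# [1]  8 16 20 35
```

These are the same four rows (8, 16, 20, 35) that the question sets manually.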

If you have multiple conditions, move to case_when() from dplyr, or fcase() from data.table. An example of the former is here:

library(dplyr)
data %>%
  mutate(col4 = case_when(
    col2 == 1 & col3 == 1 ~ "TP",
    between(col2, 2, 9) & col3 == 1 ~ "FP"
  ))
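
For the data.table route mentioned above, a rough equivalent with fcase() might look like the sketch below. It assumes the data.table package is installed; dt is just an illustrative name, and the range test is written with plain comparisons instead of between().

```r
library(data.table)

# Same data as in the question, built as a data.table
dt <- data.table(
  col1 = rep(c(1, 2, 3, 4), each = 9),
  col2 = c(2,5,7,9,1,4,2,1,8,3,4,1,2,5,7,1,5,4,8,1,2,6,4,8,9,1,2,4,3,5,7,2,8,6,1,2),
  col3 = c(1,0,1,1,0,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1,1,1,0,1,0)
)

# fcase() takes condition/value pairs; rows matching no condition get `default`
dt[, col4 := fcase(
  col2 == 1 & col3 == 1, "TP",
  col2 >= 2 & col2 <= 9 & col3 == 1, "FP",
  default = NA_character_
)]
```

Like case_when(), fcase() evaluates the conditions in order and uses the first one that is TRUE for each row.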

If you want to count FP and TP by col1 values, this is one way to extend the pipeline:

data %>%
  mutate(col4 = case_when(
    col2 == 1 & col3 == 1 ~ "TP",
    between(col2, 2, 9) & col3 == 1 ~ "FP"
  )) %>%
  group_by(col1) %>%
  # comparisons against NA yield NA, so na.rm = TRUE is needed when summing
  summarize(FP = sum(col4 == "FP", na.rm = TRUE),
            TP = sum(col4 == "TP", na.rm = TRUE)
  )

Output:

   col1    FP    TP
  <dbl> <int> <int>
1     1     4     1
2     2     5     1
3     3     4     1
4     4     4     1
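
If dplyr is not available, the same per-group counts can be cross-checked in base R. This is only a sketch: the nested ifelse() is a stand-in for case_when(), and counts is an illustrative name.

```r
# Data from the question
col1 <- rep(c(1, 2, 3, 4), each = 9)
col2 <- c(2,5,7,9,1,4,2,1,8,3,4,1,2,5,7,1,5,4,8,1,2,6,4,8,9,1,2,4,3,5,7,2,8,6,1,2)
col3 <- c(1,0,1,1,0,1,0,1,0,1,0,0,1,1,1,1,0,1,1,1,0,0,1,0,1,0,1,0,1,0,1,1,1,0,1,0)
data <- data.frame(col1, col2, col3)

# Label rows: "TP" if col2 is 1, "FP" if col2 is in 2..9, both only when col3 is 1
data$col4 <- ifelse(data$col3 != 1, NA,
             ifelse(data$col2 == 1, "TP",
             ifelse(data$col2 >= 2 & data$col2 <= 9, "FP", NA)))

# Cross-tabulate group against label; table() drops NA labels by default
counts <- table(col1 = data$col1, col4 = data$col4)
counts
```

Each row of counts corresponds to one value of col1, with one column per label, so it should agree with the summarize() output above.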

CodePudding user response:

  • We can use
library(dplyr)

data |>
  mutate(col4 = case_when(
    col2 == 1 & col3 == 1 ~ "TP",
    col2 %in% 2:9 & col3 == 1 ~ "FP",
    TRUE ~ NA_character_
  ))

  • output
  col1 col2 col3 col4
1     1    2    1   FP
2     1    5    0 <NA>
3     1    7    1   FP
4     1    9    1   FP
5     1    1    0 <NA>
6     1    4    1   FP
7     1    2    0 <NA>
8     1    1    1   TP
9     1    8    0 <NA>
10    2    3    1   FP
11    2    4    0 <NA>
12    2    1    0 <NA>
13    2    2    1   FP
14    2    5    1   FP
15    2    7    1   FP
16    2    1    1   TP
17    2    5    0 <NA>
18    2    4    1   FP
19    3    8    1   FP
20    3    1    1   TP
21    3    2    0 <NA>
22    3    6    0 <NA>
23    3    4    1   FP
24    3    8    0 <NA>
25    3    9    1   FP
26    3    1    0 <NA>
27    3    2    1   FP
28    4    4    0 <NA>
29    4    3    1   FP
30    4    5    0 <NA>
31    4    7    1   FP
32    4    2    1   FP
33    4    8    1   FP
34    4    6    0 <NA>
35    4    1    1   TP
36    4    2    0 <NA>