Replace row within a df if other row from a group is equal to 1 [R]

Time:11-15

I have a dataframe such as:

Names COL1 COL2 COL3 COL4
SP1   0    1    NaN  LA
SP1   0    1    NaN  LE
SP1   0    1    1    LI
SP2   1    0    0    LO
SP2   0    0    0    LU
SP3   1    1    NaN  LY
SP3   1    1    NaN  LZ

For each group of Names, I would like to replace the 0 or NaN values in COL1, COL2 and COL3 with 1 whenever at least one other row of that group has a 1 in the same column.

For example, in group SP1:

Names COL1 COL2 COL3 COL4
SP1   0    1    NaN  LA
SP1   0    1    NaN  LE
SP1   0    1    1    LI

As you can see, COL3 contains two NaN values and one 1. Since there is at least one 1, I transform the two NaN into 1:

Names COL1 COL2 COL3 COL4
SP1   0    1    1    LA
SP1   0    1    1    LE
SP1   0    1    1    LI

Same for the group SP2:

Names COL1 COL2 COL3 COL4
SP2   1    0    0    LO
SP2   0    0    0    LU

As you can see, COL1 contains one 0 and one 1. Since there is at least one 1, I transform the 0 into a 1:

Names COL1 COL2 COL3 COL4
SP2   1    0    0    LO
SP2   1    0    0    LU
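
The two worked examples above can be reproduced on plain vectors. This is only a sketch of the rule as I understand it; the helper name fill_if_any_one is hypothetical and not part of any package.

```r
# Hypothetical helper: if the vector contains a 1, turn its 0 and NaN
# entries into 1; otherwise return the vector unchanged
fill_if_any_one <- function(x) {
  if (1 %in% x) replace(x, is.nan(x) | x %in% 0, 1) else x
}

fill_if_any_one(c(NaN, NaN, 1))  # COL3 of group SP1
fill_if_any_one(c(1, 0))         # COL1 of group SP2
fill_if_any_one(c(NaN, NaN))     # COL3 of group SP3: no 1, so nothing changes
```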

At the end I should then get:

Names COL1 COL2 COL3 COL4
SP1   0    1    1    LA
SP1   0    1    1    LE
SP1   0    1    1    LI
SP2   1    0    0    LO
SP2   1    0    0    LU
SP3   1    1    NaN  LY
SP3   1    1    NaN  LZ

Here is the dput format of the example df:

structure(list(Names = c("SP1", "SP1", "SP1", "SP2", "SP2", "SP3", 
"SP3"), COL1 = c(0L, 0L, 0L, 1L, 0L, 1L, 1L), COL2 = c(1L, 1L, 
1L, 0L, 0L, 1L, 1L), COL3 = c(NaN, NaN, 1, 0, 0, NaN, NaN), COL4 = c("LA", 
"LE", "LI", "LO", "LU", "LY", "LZ")), class = "data.frame", row.names = c(NA, 
-7L))
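
To try the example, the dput output can be assigned to a variable (the name df is an assumption here). As a quick sanity check, the sketch below asks, for each group and column, whether at least one row equals 1, which is where replacements would be expected; the name has_one is only illustrative.

```r
# Assign the dput output to a variable; the name `df` is an assumption
df <- structure(list(Names = c("SP1", "SP1", "SP1", "SP2", "SP2", "SP3",
"SP3"), COL1 = c(0L, 0L, 0L, 1L, 0L, 1L, 1L), COL2 = c(1L, 1L,
1L, 0L, 0L, 1L, 1L), COL3 = c(NaN, NaN, 1, 0, 0, NaN, NaN), COL4 = c("LA",
"LE", "LI", "LO", "LU", "LY", "LZ")), class = "data.frame", row.names = c(NA,
-7L))

# For each column, check per group of Names whether any row equals 1.
# `%in%` is used instead of `==` so that NaN counts as "not 1" rather than NA.
has_one <- sapply(df[c("COL1", "COL2", "COL3")], function(col) {
  tapply(col, df$Names, function(x) any(x %in% 1))
})
has_one  # logical matrix: one row per group, one column per COL
```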

CodePudding user response:

After grouping by 'Names', use an if/else condition to do the replacement across COL1 to COL3: if 1 is present (%in%) in the group's column, replace the NaN or 0 values with 1; otherwise return the column unchanged.

library(dplyr)
df <- df %>%
   group_by(Names) %>%
   # within each group: if the column contains a 1, set its NaN and 0 values to 1
   mutate(across(COL1:COL3, ~ if(1 %in% .x) replace(.x, is.nan(.x) |
          .x %in% 0, 1) else .x)) %>%
   ungroup()

-output

# A tibble: 7 × 5
  Names  COL1  COL2  COL3 COL4 
  <chr> <dbl> <dbl> <dbl> <chr>
1 SP1       0     1     1 LA   
2 SP1       0     1     1 LE   
3 SP1       0     1     1 LI   
4 SP2       1     0     0 LO   
5 SP2       1     0     0 LU   
6 SP3       1     1   NaN LY   
7 SP3       1     1   NaN LZ   
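
If dplyr is not available, the same idea could probably be written in base R with ave(), which applies a function to each group of a vector and puts the results back in the original row order. This is a sketch under that assumption, not the approach from the answer above; the variable name cols is illustrative, and the data frame is rebuilt here only to keep the example self-contained.

```r
# Example data, rebuilt from the question
df <- data.frame(
  Names = c("SP1", "SP1", "SP1", "SP2", "SP2", "SP3", "SP3"),
  COL1  = c(0L, 0L, 0L, 1L, 0L, 1L, 1L),
  COL2  = c(1L, 1L, 1L, 0L, 0L, 1L, 1L),
  COL3  = c(NaN, NaN, 1, 0, 0, NaN, NaN),
  COL4  = c("LA", "LE", "LI", "LO", "LU", "LY", "LZ"),
  stringsAsFactors = FALSE
)

cols <- c("COL1", "COL2", "COL3")  # columns to process (illustrative name)

# ave() splits each column by Names, applies the function to every group,
# and returns a vector in the original row order
df[cols] <- lapply(df[cols], function(col) {
  ave(col, df$Names, FUN = function(x) {
    if (1 %in% x) replace(x, is.nan(x) | x %in% 0, 1) else x
  })
})
df
```

Integer columns may come back as double after the replacement, much as the tibble above shows <dbl> for COL1 and COL2.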