how to change row values based on information from another dataframe in R

Time:10-15

I have the original df:

A <- c("A1", "A2", "A3", "A4")
B <- c(1,0,1,NA)
C <- c(0,1,0,NA)
D <- c(NA, 1, 0, NA)
              
df <- data.frame(A, B, C, D)

And my second df2:

A <- c("A2", "A3")
df2 <- data.frame(A)

I would like to create df_modified, in which every row of df whose A value appears in df2 has B, C and D set to NA:

A    B   C   D
A1   1   0   NA
A2   NA  NA  NA
A3   NA  NA  NA 
A4   NA  NA  NA

My current code, which fills every row with NA, is:

df_modified <- df %>% mutate(B = case_when(df$A == df2$A ~ NA),
                             C = case_when(df$A == df2$A ~ NA),
                             D = case_when(df$A == df2$A ~ NA))

Any assistance with this problem would be greatly appreciated!

Thanks in advance!

CodePudding user response:

In base R this is simpler: use the logical vector df$A %in% df2$A as the row index, exclude the first column with -1 as the column index, and assign NA to the selected cells.

df[df$A %in% df2$A, -1] <- NA

-output

> df
   A  B  C  D
1 A1  1  0 NA
2 A2 NA NA NA
3 A3 NA NA NA
4 A4 NA NA NA
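
For context, here is a minimal sketch of why the attempt in the question fills every row with NA, and of the same base R assignment applied to a copy. It rebuilds df and df2 from the question; the names eq_match and in_match are only illustrative, and stringsAsFactors = FALSE is added so the sketch behaves the same on R versions before 4.0. Because case_when() returns NA wherever no condition is TRUE, an all-FALSE comparison yields NA in every row.

```r
df  <- data.frame(A = c("A1", "A2", "A3", "A4"),
                  B = c(1, 0, 1, NA),
                  C = c(0, 1, 0, NA),
                  D = c(NA, 1, 0, NA),
                  stringsAsFactors = FALSE)
df2 <- data.frame(A = c("A2", "A3"), stringsAsFactors = FALSE)

# `==` compares position by position and recycles the shorter vector,
# so df2$A is treated as c("A2", "A3", "A2", "A3") and nothing matches
eq_match <- df$A == df2$A
eq_match
# [1] FALSE FALSE FALSE FALSE

# `%in%` asks, for each element of df$A, whether it occurs anywhere in df2$A
in_match <- df$A %in% df2$A
in_match
# [1] FALSE  TRUE  TRUE FALSE

# same assignment as above, applied to a copy so that df itself is left unchanged
df_modified <- df
df_modified[in_match, -1] <- NA
```

Working on a copy keeps the original df available, which matches the df_modified naming used in the question.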

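Or, with the tidyverse, use across() inside mutate(). case_when() returns NA for rows where no condition is TRUE, so only the rows whose A is not in df2$A keep their values: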

library(dplyr)
df %>%
   mutate(across(where(is.numeric), ~ case_when(!A %in% df2$A ~ .)))

-output

   A  B  C  D
1 A1  1  0 NA
2 A2 NA NA NA
3 A3 NA NA NA
4 A4 NA NA NA
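
A possible variation on the same idea, sketched here rather than taken from the answer above, uses base R's replace() inside across() and stores the result in df_modified. It assumes dplyr 1.0.0 or later (for across()) and rebuilds df and df2 from the question; selecting the columns with -A instead of where(is.numeric) is a choice made for this sketch.

```r
library(dplyr)

df  <- data.frame(A = c("A1", "A2", "A3", "A4"),
                  B = c(1, 0, 1, NA),
                  C = c(0, 1, 0, NA),
                  D = c(NA, 1, 0, NA),
                  stringsAsFactors = FALSE)
df2 <- data.frame(A = c("A2", "A3"), stringsAsFactors = FALSE)

df_modified <- df %>%
  # for every column except A, set the rows whose A appears in df2$A to NA
  mutate(across(-A, ~ replace(.x, A %in% df2$A, NA)))

df_modified
```

Here replace() states the intent directly (overwrite the matching positions), whereas the case_when() version relies on unmatched rows defaulting to NA.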

CodePudding user response:

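Here is an alternative dplyr approach. Binding df2 onto df gives every A value that appears in df2 a group with more than one row, and those groups are set to NA (this assumes the A values in df are unique):

library(dplyr)

bind_rows(df, df2) %>% 
  group_by(A) %>% 
  # n() > 1 means this A value also occurs in df2
  mutate(across(c(B, C, D), ~ if (n() > 1) NA_real_ else .)) %>% 
  distinct()

-output

  A         B     C     D
  <chr> <dbl> <dbl> <dbl>
1 A1        1     0    NA
2 A2       NA    NA    NA
3 A3       NA    NA    NA
4 A4       NA    NA    NA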