Replace NA in row with value in adjacent row (not only one row)

Time:11-18

input data

V1 V2 V3 V4 V5 #header
a b c d e #full data 1
NA f g NA NA
NA NA NA i NA
NA j NA NA k
a1 b1 c1 d1 e1 #full data 2
NA NA f1 g1 NA

Expected output

V1 V2 V3 V4 V5
a bfj cg di ek
a1 b1 c1f1 d1g1 e1

A useful related question: Replace NA in row with value in adjacent row "ROW" not column

I used a lot of for loops, and my code is very messy.

CodePudding user response:

Here's an approach using dplyr. First, I count the NAs in each row to identify the rows with no NAs. Then I use the cumulative count of those complete rows to define groups. Within each group, I paste each column's values (excluding NAs) together.

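library(dplyr)

df1 %>%
  # count the NAs in each row; a count of 0 marks a complete row
  rowwise() %>% mutate(full = sum(is.na(c_across(everything())))) %>% ungroup() %>%
  # each complete row starts a new group
  group_by(group = cumsum(full == 0)) %>%
  # paste the non-NA values of every column together within each group
  summarize(across(everything(), ~paste0(na.omit(.x), collapse = ""))) %>%
  select(-group, -full)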

# A tibble: 2 × 5
  V1    V2    V3    V4    V5   
  <chr> <chr> <chr> <chr> <chr>
1 a     bfj   cg    di    e    
2 a1    b1    c1f1  d1g1  e1  
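
For anyone who wants to try the grouping idea without dplyr, here is a minimal base R sketch of the same steps. The `df1` below is typed in by hand from the question and assumes that "k" belongs in V5 (the expected output shows "ek" there). The names `is_full`, `group`, `collapse_column`, and `result` are made up for illustration.

```r
# Input from the question; "k" is placed in V5 (an assumption based on the expected output).
df1 <- data.frame(
  V1 = c("a", NA, NA, NA, "a1", NA),
  V2 = c("b", "f", NA, "j", "b1", NA),
  V3 = c("c", "g", NA, NA, "c1", "f1"),
  V4 = c("d", NA, "i", NA, "d1", "g1"),
  V5 = c("e", NA, NA, "k", "e1", NA),
  stringsAsFactors = FALSE
)

# A row with zero NAs starts a new group; cumsum() turns those starts into group ids.
is_full <- rowSums(is.na(df1)) == 0
group <- cumsum(is_full)  # 1 1 1 1 2 2

# For one column: split it by group and paste the non-NA values of each piece together.
collapse_column <- function(col) {
  vapply(
    split(col, group),
    function(x) paste0(x[!is.na(x)], collapse = ""),
    character(1),
    USE.NAMES = FALSE
  )
}

# Apply the helper to every column and reassemble a data frame (one row per group).
result <- data.frame(lapply(df1, collapse_column), stringsAsFactors = FALSE)
print(result)
```

With this input, `result` should have two rows, one per complete row in the original data. The key step is the same as in the dplyr version: `cumsum()` over the "row is complete" flag gives every partial row the id of the complete row above it.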