Coalesce multiple columns at once-CodePudding

My question is similar to existing questions about coalesce, but I want to coalesce several columns by row such that NAs are pushed to the last column.

Here's an example:

If I have

a <- data.frame(A=c(2,NA,4,3,2), B=c(NA,3,4,NA,5), C= c(1,3,6,7,NA), D=c(5,6,NA,4,3), E=c(2,NA,1,3,NA))

   A  B  C  D  E
1  2 NA  1  5  2
2 NA  3  3  6 NA
3  4  4  6 NA  1
4  3 NA  7  4  3
5  2  5 NA  3 NA

I would like to get

b <- data.frame(A=c(2,3,4,3,2), B=c(1,3,4,7,5), C=c(5,6,6,4,3), D=c(2,NA,1,3,NA))

  A B C  D
1 2 1 5  2
2 3 3 6 NA
3 4 4 6  1
4 3 7 4  3
5 2 5 3 NA

Does anyone have any ideas for how I could do this? I would be so grateful for any tips, as my searches have come up dry.

CodePudding user response：

You can use unite and separate:

library(tidyverse)

a %>% 
  unite(newcol, everything(), na.rm = TRUE) %>% 
  separate(newcol, into = LETTERS[1:4])

  A B C    D
1 2 1 5    2
2 3 3 6 <NA>
3 4 4 6    1
4 3 7 4    3
5 2 5 3 <NA>

Since you have an unknown number of new columns in separate, one can use splitstackshape's function cSplit:

library(splitstackshape)

a %>% 
  unite(newcol, na.rm = TRUE) %>% 
  cSplit("newcol", "_", type.convert = F) %>% 
  rename_with(~ LETTERS)

CodePudding user response：

This could be another solution. From what I understand you basically just want to shift the values in each row after the first NA to the left replacing the NA.

library(dplyr)
library(purrr)

a %>%
  pmap_dfr(~ {x <- c(...)[-which(is.na(c(...)))[1]]
  setNames(x, LETTERS[seq_along(x)])})

# A tibble: 5 x 4
      A     B     C     D
  <dbl> <dbl> <dbl> <dbl>
1     2     1     5     2
2     3     3     6    NA
3     4     4     6     1
4     3     7     4     3
5     2     5     3    NA

CodePudding user response：

We may use base R - loop over the rows, order based on the NA elements and remove the columns that have all NAs

a[] <- t(apply(a, 1, \(x) x[order(is.na(x))]))
a[colSums(!is.na(a)) > 0]
  A B C  D
1 2 1 5  2
2 3 3 6 NA
3 4 4 6  1
4 3 7 4  3
5 2 5 3 NA