My question is similar to existing questions about coalesce
, but I want to coalesce several columns by row such that NAs are pushed to the last column.
Here's an example:
If I have
a <- data.frame(A=c(2,NA,4,3,2), B=c(NA,3,4,NA,5), C= c(1,3,6,7,NA), D=c(5,6,NA,4,3), E=c(2,NA,1,3,NA))
A B C D E
1 2 NA 1 5 2
2 NA 3 3 6 NA
3 4 4 6 NA 1
4 3 NA 7 4 3
5 2 5 NA 3 NA
I would like to get
b <- data.frame(A=c(2,3,4,3,2), B=c(1,3,4,7,5), C=c(5,6,6,4,3), D=c(2,NA,1,3,NA))
A B C D
1 2 1 5 2
2 3 3 6 NA
3 4 4 6 1
4 3 7 4 3
5 2 5 3 NA
Does anyone have any ideas for how I could do this? I would be so grateful for any tips, as my searches have come up dry.
CodePudding user response:
You can use unite
and separate
:
library(tidyverse)
a %>%
unite(newcol, everything(), na.rm = TRUE) %>%
separate(newcol, into = LETTERS[1:4])
A B C D
1 2 1 5 2
2 3 3 6 <NA>
3 4 4 6 1
4 3 7 4 3
5 2 5 3 <NA>
Since you have an unknown number of new columns in separate
, one can use splitstackshape
's function cSplit
:
library(splitstackshape)
a %>%
unite(newcol, na.rm = TRUE) %>%
cSplit("newcol", "_", type.convert = F) %>%
rename_with(~ LETTERS)
CodePudding user response:
This could be another solution. From what I understand you basically just want to shift the values in each row after the first NA
to the left replacing the NA
.
library(dplyr)
library(purrr)
a %>%
pmap_dfr(~ {x <- c(...)[-which(is.na(c(...)))[1]]
setNames(x, LETTERS[seq_along(x)])})
# A tibble: 5 x 4
A B C D
<dbl> <dbl> <dbl> <dbl>
1 2 1 5 2
2 3 3 6 NA
3 4 4 6 1
4 3 7 4 3
5 2 5 3 NA
CodePudding user response:
We may use base R
- loop over the rows, order
based on the NA
elements and remove the columns that have all NAs
a[] <- t(apply(a, 1, \(x) x[order(is.na(x))]))
a[colSums(!is.na(a)) > 0]
A B C D
1 2 1 5 2
2 3 3 6 NA
3 4 4 6 1
4 3 7 4 3
5 2 5 3 NA