problem with using coalesce to merge 2 columns into 1-CodePudding

I would like to merge values from 2 columns into 1. For example, here is sample data:

id   x    y
1   12  
1         14  
2   13      
3   15 
3         18
4         19

I want

id   x    y     z
1   12          12  
1         14    14  
2   13          13 
3   15          15 
3         18    18
4         19    19

I tried using coalesce to create a new variable.

coalesce <- function(...) {
  apply(cbind(...), 1, function(x) {
    x[which(!is.na(x))[1]]
  })
}

df$z <- coalesce(df$x, df$y)

However, the variable doesn't reflect the columns joined. Am I using this function incorrectly?

CodePudding user response：

You could use the dplyr::coalesce function:

> df$z <- dplyr::coalesce(ifelse(df$x == "", NA, df$x), df$y)
> df
  id  x  y  z
1  1 12    12
2  1    14 14
3  2 13    13
4  3 15    15
5  3    18 18
6  4    19 19
>

To implement my own mycoalesce:

mycoalesce <- function(...) {apply(cbind(...), 1, max)}

And:

> df$z <- mycoalesce(df$x, df$y)
> df
  id  x  y  z
1  1 12    12
2  1    14 14
3  2 13    13
4  3 15    15
5  3    18 18
6  4    19 19
>

CodePudding user response：

This might be a more crude and inefficient way than the other methods posted above, but still worth a try:

df1<-df
df1[is.na(df1)]=0
z=df1$x df1$y
df<-cbind(df,z)
df
#  ID  x  y  z
#1  1 12 NA 12
#2  2 NA 14 14
#3  3 13 NA 13
#4  4 15 NA 15
#5  5 NA 18 18
#6  6 NA 19 19

I mainly copied the original dataframe to a new dataframe so as to preserve the NA values in the original dataframe. Also, I assumed that none of the ID's are missing along with @Park's assumption.

Data: df<-data.frame(ID=1:6,x=c(12,NA,13,15,NA,NA),y=c(NA,14,NA,NA,18,19))

CodePudding user response：

If one of x and y is always NA and one has value,

for a custom function % % defined like

`% %` <- function(x, y)  mapply(sum, x, y, MoreArgs = list(na.rm = TRUE))

df$z <- df$x % % df$y
df
  id  x  y  z
1  1 12 NA 12
2  1 NA 14 14
3  2 13 NA 13
4  3 15 NA 15
5  3 NA 18 18
6  4 NA 19 19