I would like to merge values from 2 columns into 1. For example, here is sample data:
id x y
1 12
1 14
2 13
3 15
3 18
4 19
I want
id x y z
1 12 12
1 14 14
2 13 13
3 15 15
3 18 18
4 19 19
I tried using coalesce to create a new variable.
coalesce <- function(...) {
apply(cbind(...), 1, function(x) {
x[which(!is.na(x))[1]]
})
}
df$z <- coalesce(df$x, df$y)
However, the variable doesn't reflect the columns joined. Am I using this function incorrectly?
CodePudding user response:
You could use the dplyr::coalesce
function:
> df$z <- dplyr::coalesce(ifelse(df$x == "", NA, df$x), df$y)
> df
id x y z
1 1 12 12
2 1 14 14
3 2 13 13
4 3 15 15
5 3 18 18
6 4 19 19
>
To implement my own mycoalesce
:
mycoalesce <- function(...) {apply(cbind(...), 1, max)}
And:
> df$z <- mycoalesce(df$x, df$y)
> df
id x y z
1 1 12 12
2 1 14 14
3 2 13 13
4 3 15 15
5 3 18 18
6 4 19 19
>
CodePudding user response:
This might be a more crude and inefficient way than the other methods posted above, but still worth a try:
df1<-df
df1[is.na(df1)]=0
z=df1$x df1$y
df<-cbind(df,z)
df
# ID x y z
#1 1 12 NA 12
#2 2 NA 14 14
#3 3 13 NA 13
#4 4 15 NA 15
#5 5 NA 18 18
#6 6 NA 19 19
I mainly copied the original dataframe to a new dataframe so as to preserve the NA
values in the original dataframe. Also, I assumed that none of the ID
's are missing along with @Park's assumption.
Data: df<-data.frame(ID=1:6,x=c(12,NA,13,15,NA,NA),y=c(NA,14,NA,NA,18,19))
CodePudding user response:
If one of x
and y
is always NA
and one has value,
for a custom function % %
defined like
`% %` <- function(x, y) mapply(sum, x, y, MoreArgs = list(na.rm = TRUE))
df$z <- df$x % % df$y
df
id x y z
1 1 12 NA 12
2 1 NA 14 14
3 2 13 NA 13
4 3 15 NA 15
5 3 NA 18 18
6 4 NA 19 19