I have the following two similar dataframes:
d1<-data.frame(TNR=c(1,2,3,4,5,6,7),EP=c(0,3,1,2,NA,NA,NA))
d2<-data.frame(TNR=c("E",1,3,5,100), EP=c(NA,NA,NA,NA,NA))
In d2 the vector "EP" is empty and i want to fill it with the data from d1$EP, where the vectors TNR overlap with each other. Where they don't overlap I want to keep d2 and not d1.
The result should look something like that:
d3<-data.frame(TNR=c("E",1,3,5,100),EP=c(NA,0,1,NA,NA))
I tried the following approach with ifelse and %in%:
d3<-data.frame(TNR=ifelse(test=d1$TNR %in% d2$TNR, yes=d2$EP, no=NA),
EP=ifelse(test=d1$TNR %in% d2$TNR, yes=d1$EP, no=""))
#remove rows where TNR is NA
d4=d3[!is.na(d1$baumnummer),]
However, the result is that EP is inserted in the wrong order and it seems to delete more rows than I anticipated because the resulting dataframe is shorter than d2
CodePudding user response:
Is that OK?
d2$EP <- d1[match(x = d2$TNR, table = d1$TNR), ]$EP
CodePudding user response:
Is this what you mean?
> d1 <- data.frame(TNR=as.character(c(1,2,3,4,5,6,7)),EP=c(0,3,1,2,NA,NA,NA))
> d2 <- data.frame(TNR=c('E',1,3,5,100))
> d3 <- merge(d1, d2, by = 'TNR', all = TRUE)
> d3
TNR EP
1 1 0
2 100 NA
3 2 3
4 3 1
5 4 2
6 5 NA
7 6 NA
8 7 NA
9 E NA
To sort you can use order()
.
There are alternatives to base merge perhaps with slightly different behavior: data.table::merge, dplyr:join_*
CodePudding user response:
We can do it with dplyr
. First convert the TNR column to character, then right_join
the two dataframes. Finally, we can use coalesce
to merge the EP
column, eliminating NAs.
library(dplyr)
d1 %>%
mutate(TNR = as.character(TNR)) %>%
right_join(d2, by = "TNR") %>%
mutate(EP = coalesce(EP.x, EP.y), .keep = 'unused')
TNR EP
1 1 0
2 3 1
3 5 NA
4 E NA
5 100 NA