I'd like to know how to replace a whole block of rows in one data frame with rows from another.
Here are the two data frames:
df1 <- data.frame(x = c(1,2,3,4,5,6), y = c("a", "b", "c", "d", 'e', "f"), z = c("YES", "YES", "YES", "YES", "NO", "NO"))
df2 <- data.frame(x = c(8,9), y = c("l", "g"), z = c("YES","YES"))
I want to replace the rows of df1 where z is "YES" with the rows from df2, so that the result is:
x y z
8 l YES
9 g YES
5 e NO
6 f NO
How can I do it?
Thanks in advance.
CodePudding user response:
Do you want to bind the "NO" rows of df1 with df2?
rbind(df2, df1[df1$z == "NO", ])
# x y z
#1 8 l YES
#2 9 g YES
#5 5 e NO
#6 6 f NO
Using dplyr:
library(dplyr)
df2 %>% bind_rows(df1 %>% filter(z == "NO"))
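Note that the rbind() result above keeps the original row names of df1 (5 and 6). If sequential row names are preferred, they can be reset afterwards. A minimal sketch, assuming the sample data from the question (stringsAsFactors = FALSE is added here only so that it behaves the same on R versions before 4.0; res is just an illustrative name):

```r
# Sample data from the question
df1 <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c("a", "b", "c", "d", "e", "f"),
                  z = c("YES", "YES", "YES", "YES", "NO", "NO"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c(8, 9), y = c("l", "g"), z = c("YES", "YES"),
                  stringsAsFactors = FALSE)

# Put df2 first, then the "NO" rows of df1
res <- rbind(df2, df1[df1$z == "NO", ])

# Drop the carried-over row names so they run 1..n again
rownames(res) <- NULL
res
```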
CodePudding user response:
I'm still not entirely sure about this, and there are a number of assumptions, but I thought this might be helpful.
You can quickly make a counter in df1 for the rows with "YES". These rows are marked for replacement. This could also be combined with the next step, but it is kept separate for transparency.
df1$ctr <- cumsum(df1$z == "YES")
Then, you can create a logical vector ind to indicate which rows in df1 are "YES" and have a counter less than or equal to the number of rows in df2 (this assumes all df2 rows will be used). This addresses a mismatch between the number of rows to replace in df1 and the number of replacement rows in df2.
ind <- df1$z == "YES" & df1$ctr <= nrow(df2)
Then, to replace the rows marked for replacement, use:
df1[ind,] <- df2
Finally, remove the "YES" rows in df1 that did not get replaced. This is the same as keeping the rows that were replaced or that had "NO" in column z (the -4 at the end drops the temporary counter column):
df1[ind | df1$z == "NO", -4]
Output
x y z
1 8 l YES
2 9 g YES
5 5 e NO
6 6 f NO
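The steps above can also be wrapped into one function that avoids the temporary counter column. This is only a sketch under the same assumptions: replace_rows and its arguments are hypothetical names, df1 is assumed to have at least as many "YES" rows as df2 has rows, and only the columns named in df2 are overwritten.

```r
# Sample data from the question
df1 <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c("a", "b", "c", "d", "e", "f"),
                  z = c("YES", "YES", "YES", "YES", "NO", "NO"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c(8, 9), y = c("l", "g"), z = c("YES", "YES"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: overwrite the first nrow(new) matching rows of `old`
# with `new`, then drop any matching rows that were left over.
replace_rows <- function(old, new, col = "z", value = "YES") {
  is_match <- old[[col]] == value
  # TRUE only for the first nrow(new) matching rows
  ind <- is_match & cumsum(is_match) <= nrow(new)
  # Assumption: `old` has at least nrow(new) matching rows
  stopifnot(sum(ind) == nrow(new))
  # Overwrite only the columns that exist in `new`
  old[ind, names(new)] <- new
  # Keep replaced rows plus all non-matching rows
  old[ind | !is_match, , drop = FALSE]
}

res <- replace_rows(df1, df2)
res
```

Because the counter is computed inside the function, there is no extra column to remove at the end, and the stopifnot() call fails early if df2 has more rows than there are "YES" rows to replace.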