Change rows in a df with rows from another one by a condition in R

I'd like to know how to replace a whole block of rows in one data frame with rows from another one.

Here are the two data frames:

df1 <- data.frame(x = c(1,2,3,4,5,6), y = c("a", "b", "c", "d", 'e', "f"), z = c("YES", "YES", "YES", "YES", "NO", "NO"))

df2 <- data.frame(x = c(8,9), y = c("l", "g"), z = c("YES","YES"))

I want to replace the rows of df1 where the z column is "YES" with the rows from df2, so that the result looks like this:

x y   z
8 l YES
9 g YES
5 e  NO
6 f  NO

How can I do it?

Thanks in advance.

CodePudding user response：

Do you want to bind the "NO" rows of df1 to df2?

rbind(df2, df1[df1$z == "NO", ])

#  x y   z
#1 8 l YES
#2 9 g YES
#5 5 e  NO
#6 6 f  NO

Using dplyr:

library(dplyr)

df2 %>% bind_rows(df1 %>% filter(z == "NO"))
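If the column and the value to replace should not be hard-coded, the same idea could be wrapped in a small base R helper. This is only a sketch: the function name replace_rows and its arguments are made up for illustration, and it assumes both data frames have the same column names and that the condition column contains no NA values.

```r
# Hypothetical helper (not from any package): drop the rows of `old`
# where column `col` equals `value`, then put all rows of `new`
# in front of the rows that are left.
replace_rows <- function(old, new, col, value) {
  keep <- old[old[[col]] != value, , drop = FALSE]
  rbind(new, keep)
}

df1 <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c("a", "b", "c", "d", "e", "f"),
                  z = c("YES", "YES", "YES", "YES", "NO", "NO"))
df2 <- data.frame(x = c(8, 9), y = c("l", "g"), z = c("YES", "YES"))

res <- replace_rows(df1, df2, "z", "YES")
res
```

Note that this always puts the df2 rows first, regardless of where the "YES" rows were in df1.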

CodePudding user response：

I'm still not entirely sure what you're after, and this rests on a number of assumptions, but I thought it might be helpful.

You can quickly make a running counter in df1 over the rows with "YES". These rows are the candidates for replacement. This could also be combined with the next step, but it is kept separate here for transparency.

df1$ctr <- cumsum(df1$z == "YES")

Then, you can create a logical vector ind to indicate which rows are "YES" in df1 and have a counter less than or equal to the number of rows in df2 (this assumes all df2 rows will be used). This handles the mismatch between the number of rows to replace in df1 and the number of replacement rows in df2.

ind <- df1$z == "YES" & df1$ctr <= nrow(df2)
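To make the counter and the index concrete, here is what they look like for the example data. In this sketch the counter is kept as a standalone vector ctr rather than as a column of df1, purely for illustration.

```r
df1 <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c("a", "b", "c", "d", "e", "f"),
                  z = c("YES", "YES", "YES", "YES", "NO", "NO"))
df2 <- data.frame(x = c(8, 9), y = c("l", "g"), z = c("YES", "YES"))

# Running count of "YES" rows seen so far
ctr <- cumsum(df1$z == "YES")
ctr
# [1] 1 2 3 4 4 4

# TRUE only for the first nrow(df2) "YES" rows
ind <- df1$z == "YES" & ctr <= nrow(df2)
ind
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE
```

Only the first two "YES" rows are marked, because df2 has two rows to offer.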

Then, to replace the rows marked for replacement, use:

df1[ind,] <- df2

Finally, remove rows with "YES" in df1 that did not get replaced. This is the same as keeping those that were replaced or had "NO" for column z (the -4 at the end removes the temporary counter):

df1[ind | df1$z == "NO", -4]

Output

  x y   z
1 8 l YES
2 9 g YES
5 5 e  NO
6 6 f  NO
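The steps above can also be sketched in one go without adding a temporary column to df1. This is a variation, not the exact code from the answer: the names yes and out are introduced here for illustration, the assignment names the target columns explicitly, and it assumes R 4.0 or later (character columns rather than factors).

```r
df1 <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c("a", "b", "c", "d", "e", "f"),
                  z = c("YES", "YES", "YES", "YES", "NO", "NO"))
df2 <- data.frame(x = c(8, 9), y = c("l", "g"), z = c("YES", "YES"))

# Rows that are candidates for replacement
yes <- df1$z == "YES"

# First nrow(df2) "YES" rows are the ones actually replaced
ind <- yes & cumsum(yes) <= nrow(df2)

out <- df1
# Overwrite the marked rows in place, column by name;
# only as many df2 rows are used as there are marked rows
out[ind, names(df2)] <- df2[seq_len(sum(ind)), ]

# Keep the replaced rows and the original "NO" rows
out <- out[ind | !yes, ]
out
```

Unlike binding df2 on top of the "NO" rows, this keeps the replacement rows at the positions of the original "YES" rows, which only matters when the "YES" and "NO" rows are interleaved.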