I have two dataframes named TableA and TableB (reproducible example below).
Within TableA$D, whenever the value is NA, I would like to do two things:
1.) Add/duplicate the relevant rows "n" times, where "n" is defined by how many rows are in TableB.
2.) With the rows duplicated, replace the "NA"'s in TableA$D with the values from TableB.
Sometimes TableB will have 7 values, sometimes 100, and there is no prior knowledge of what values will appear in TableB.
A <- c("foo", "foo bar", "foo bar", "bar")
B <- c("bar", "foo", "bar bar", "foo")
C <- c("foo", "foo", "foo", "foo")
D <- c("X", NA, "X", NA)
TableA <- data.frame(A,B,C,D)
TableB <- data.frame(c(1,2,3,4,5,6,7))
I am trying to make like the result to look like this:
A | B | C | D |
---|---|---|---|
foo | bar | foo | 1 |
foo | bar | foo | 2 |
foo | bar | foo | 3 |
foo | bar | foo | 4 |
foo | bar | foo | 5 |
foo | bar | foo | 6 |
foo | bar | foo | 7 |
foo bar | foo | foo | X |
foo bar | bar bar | foo | 1 |
foo bar | bar bar | foo | 2 |
foo bar | bar bar | foo | 3 |
foo bar | bar bar | foo | 4 |
foo bar | bar bar | foo | 5 |
foo bar | bar bar | foo | 6 |
foo bar | bar bar | foo | 7 |
bar | foo | bar | X |
I have tried following this example Add values to dataframe when condition met in R however the logic used doesn't quite seem to fit my scenario.
Any help or tips would be greatly appreciated
CodePudding user response:
This seems to be doing what you want. Due to the for-loops, it will be slow but this is the easiest way I found.
A <- c("foo", "foo bar", "foo bar", "bar")
B <- c("bar", "foo", "bar bar", "foo")
C <- c("foo", "foo", "foo", "foo")
D <- c("X", NA, "X", NA)
TableA <- data.frame(A,B,C,D)
TableB <- data.frame(c(1,2,3,4,5,6,7))
collector<-data.frame()
lTableB<-dim(TableB)[1]
for (i in 1:dim(TableA)[1])
{
print(i)
tmp<-TableA[i,]
repet<-ifelse(is.na(TableA$D[i]), lTableB, 1 )
for (j in 1:repet) {
tmp$D<-ifelse(repet>1, j, tmp$D)
collector<-rbind(collector, tmp)
}
}
collector