I have a loop in R that always has a matrix as output. Two examples (my data is more complex and has more columns):
A | B |
---|---|
1 | -1 |
1 | -1 |

A | D |
---|---|
-1 | -1 |
-1 | -1 |
I want to get a combination like this as my output:
A | B | D |
---|---|---|
1 | -1 | 0 |
1 | -1 | 0 |
-1 | 0 | -1 |
-1 | 0 | -1 |
So the goal is to bind the rows when a column already exists (A), add a new column when it does not exist yet (D), and fill the remaining entries with zeros.
I think a matrix or data frame as the output would be fine. Any ideas? Thanks in advance!
CodePudding user response:
Maybe you want to use rbind.fill from plyr:
m1 <- read.table(text="A B
1 -1
1 -1", header = TRUE)
m2 <- read.table(text = "A D
-1 -1
-1 -1", header = TRUE)
m3 <- plyr::rbind.fill(m1, m2)
m3
#>    A  B  D
#> 1  1 -1 NA
#> 2  1 -1 NA
#> 3 -1 NA -1
#> 4 -1 NA -1
m3[is.na(m3)] <- 0
m3
#>    A  B  D
#> 1  1 -1  0
#> 2  1 -1  0
#> 3 -1  0 -1
#> 4 -1  0 -1
Created on 2022-07-12 by the reprex package (v2.0.1)
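The NA replacement in the last step is plain logical indexing, so it should work the same way if your loop output stays a matrix instead of a data frame. A minimal base R sketch (the matrix m is made up for illustration and is not from the question):

```r
# Made-up 2 x 2 matrix with two missing entries.
m <- matrix(c(1, NA, -1, NA), nrow = 2,
            dimnames = list(NULL, c("A", "B")))

# is.na(m) is a logical matrix of the same shape; assigning through it
# overwrites only the NA cells and leaves everything else untouched.
m[is.na(m)] <- 0
m
```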
CodePudding user response:
I propose:
merge(df1, df2, by = "A", all = TRUE)
Note that this gives NA instead of zero for the missing entries.
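One caveat worth checking against your real data: merge() is a join on the key column, not a row bind, so it only reproduces the desired output when the two tables share no values of A (as in the example from the question). A small sketch with made-up data frames a and b, which are not from the question, shows what happens when a key value occurs in both:

```r
# Made-up data: the value A = 1 occurs in both tables.
a <- data.frame(A = c(1, 1), B = c(-1, -1))
b <- data.frame(A = c(1, -1), D = c(-1, -1))

merged <- merge(a, b, by = "A", all = TRUE)

# A row bind would give 2 + 2 = 4 rows. The join pairs up the matching
# A = 1 rows instead, so only 3 rows come back and none has D missing.
nrow(merged)
#> [1] 3
```

If your matrices can share values in the common columns, a row-binding approach is probably the safer choice.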
CodePudding user response:
Since merge also works with "data.frame"s, we may use the usual Reduce / merge approach where we put all the matrices into a list. (It might be advantageous if your loop already returns such a list.)
Reduce(\(...) merge(..., all=TRUE), list(m1, m2, m3))
#    D  A  B  F
# 1 -1 -1 NA -1
# 2 -1 -1 NA -1
# 3 -1 -1 NA -1
# 4 -1 -1 NA -1
# 5 NA  1 -1 NA
# 6 NA  1 -1 NA
We can refine the result,
res <- Reduce(\(...) merge(..., all=TRUE), list(m1, m2, m3)) |>
  {\(.) subset(., select=order(colnames(.)))}() |>  ## order columns
  {\(.) replace(., is.na(.), 0)}()  ## turn NA to 0
res
#    A  B  D  F
# 1 -1  0 -1 -1
# 2 -1  0 -1 -1
# 3 -1  0 -1 -1
# 4 -1  0 -1 -1
# 5  1 -1  0  0
# 6  1 -1  0  0
where:
class(res)
# [1] "data.frame"
Data:
m1 <- structure(c(1L, 1L, -1L, -1L), dim = c(2L, 2L), dimnames = list(
  NULL, c("A", "B")))
m2 <- structure(c(-1L, -1L, -1L, -1L), dim = c(2L, 2L), dimnames = list(
  NULL, c("A", "D")))
m3 <- structure(c(-1L, -1L, -1L, -1L), dim = c(2L, 2L), dimnames = list(
  NULL, c("D", "F")))
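If the matrices should simply be stacked row by row (as in the question) and the result should stay a matrix, a small base R helper is another option. This is only a sketch: rbind_fill_zero is a hypothetical name, not a function from any package, and it assumes every matrix has unique, non-missing column names.

```r
# Hypothetical helper: row-bind a list of matrices by column name,
# filling columns that a matrix does not have with 0.
rbind_fill_zero <- function(mats) {
  # Union of all column names, in order of first appearance.
  all_cols <- unique(unlist(lapply(mats, colnames)))
  padded <- lapply(mats, function(m) {
    # Start from an all-zero matrix with the full set of columns ...
    out <- matrix(0, nrow = nrow(m), ncol = length(all_cols),
                  dimnames = list(NULL, all_cols))
    # ... and copy the existing columns into place by name.
    out[, colnames(m)] <- m
    out
  })
  do.call(rbind, padded)
}

# Same values as the matrices in the Data section above.
m1 <- matrix(c(1, 1, -1, -1), nrow = 2, dimnames = list(NULL, c("A", "B")))
m2 <- matrix(c(-1, -1, -1, -1), nrow = 2, dimnames = list(NULL, c("A", "D")))
m3 <- matrix(c(-1, -1, -1, -1), nrow = 2, dimnames = list(NULL, c("D", "F")))

res <- rbind_fill_zero(list(m1, m2, m3))
res
#       A  B  D  F
# [1,]  1 -1  0  0
# [2,]  1 -1  0  0
# [3,] -1  0 -1  0
# [4,] -1  0 -1  0
# [5,]  0  0 -1 -1
# [6,]  0  0 -1 -1
```

Unlike the merge result, the rows of m3 stay separate rows here, with 0 in A and B, because nothing is joined on shared columns.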
CodePudding user response:
A possible solution (the matrices need to be converted to data.frame before using bind_rows):
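library(tidyverse)
df1 <- as.data.frame(m1) # m1, m2: the matrices from the loop
df2 <- as.data.frame(m2)
bind_rows(df1, df2) %>%
  mutate(across(everything(), ~ replace_na(.x, 0))) %>%
  as.matrix() # back to a matrix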
#>       A  B  D
#> [1,]  1 -1  0
#> [2,]  1 -1  0
#> [3,] -1  0 -1
#> [4,] -1  0 -1