Home > Net >  Concatenate/merge dataframes in R into vector type cells
Concatenate/merge dataframes in R into vector type cells

Time:11-05

I would like to merge two dataframe into one, each cell becoming a vector or a list. Columns have the same name in both dataframes. Some columns are made of numerical values that I want to keep as numerical values in the merged dataframe. Some columns are made of characters.

For example I would like from these two dataframes:

DF1 <- data.frame(
    xx = c(1:5),
    yy = c(2:6),
    zz = c("a","b","c","d","e"))
DF2 <- data.frame(
    xx = c(3:7),
    yy = c(5:9),
    zz = c("a","i","h","g","f"))

Which look like this:

DF1

xx yy zz
1 2 a
2 3 b
3 4 c
4 5 d
5 6 e

DF2

xx yy zz
3 5 a
4 6 i
5 7 h
6 8 g
7 9 f

To get a dataframe looking like this:

xx yy zz
c(1,3) c(2,5) c(a,a)
c(2,4) c(3,6) c(b,i)
c(3,5) c(4,7) c(c,h)
c(4,6) c(5,8) c(d,g)
c(5,7) c(6,9) c(e,f)

I have tried with paste() or str_c() but it always transforms my numerical values into char and it does not create a list or a vector like I want.

Do you know of any functions that coule help me do that?

CodePudding user response:

As your data consists of different types, There is no straight forward answer. However I produced some solution, that might do the trick by creating a nested list. Let me know, if this is what you need:

library(BBmisc)
library(dplyr)

colvec <- c("xx2","yy2","zz2")
colnames(DF2) <- colvec

DF <- bind_cols(DF1,DF2)
cols.num <- c("xx","xx2","yy","yy2")
DF[cols.num] <- sapply(DF[cols.num],as.character)
DF <- DF[,c(1,4,2,5,3,6)] 

xx <- convertRowsToList(DF[,1:2])
yy <- convertRowsToList(DF[,3:4])
zz <- convertRowsToList(DF[,5:6])

final_list <- list(xx,yy,zz)

CodePudding user response:

Try the following base R option

> data.frame(Map(function(x, y) asplit(cbind(x, y), 1), DF1, DF2))
    xx   yy   zz
1 1, 3 2, 5 a, a
2 2, 4 3, 6 b, i
3 3, 5 4, 7 c, h
4 4, 6 5, 8 d, g
5 5, 7 6, 9 e, f

CodePudding user response:

Using some tidyverse, you can invert the lists and then build it all back together.

library(purrr)
library(dplyr)

as_tibble(map2(DF1, DF2, ~ map(transpose(list(.x, .y)), unlist)))

This gets you your data frame of vectors.

# A tibble: 5 x 3
  xx        yy        zz       
  <list>    <list>    <list>   
1 <int [2]> <int [2]> <chr [2]>
2 <int [2]> <int [2]> <chr [2]>
3 <int [2]> <int [2]> <chr [2]>
4 <int [2]> <int [2]> <chr [2]>
5 <int [2]> <int [2]> <chr [2]>

Breaking this down...

  1. transpose(list(.x, .y)) will flip a paired list of columns inside-out from a list of two vectors to a list of 5 elements (one for each row, each with two list elements in it).
  2. map(transpose(list(.x, .y)), unlist)) will iterate over each of the 5 lists and unlist them back from a list of 2 to a vector of 2.
  3. map2(DF1, DF2, ~ map(transpose(list(.x, .y)), unlist)) will iterate over each column pair from DF1 and DF2 (e.g., xx, yy, zz) doing steps 1 and 2.
  4. as_tibble(map2(DF1, DF2, ~ map(transpose(list(.x, .y)), unlist))) converts the list to a tibble (basically a data.frame).

Another thing you can do is stack the data and then nest() it. You again need a few steps to do it. This would scale better because you could do this with more than 2 data frames.

library(dplyr)
library(tibble)
library(tidyr)

bind_rows(rowid_to_column(DF1),
          rowid_to_column(DF2)) %>% 
  group_by(rowid) %>% 
  nest(nest_data = -rowid) %>% 
  unnest_wider(nest_data) %>% 
  ungroup() %>% 
  select(-rowid)

This also gets you your data frame of vectors.

# A tibble: 5 x 3
  xx        yy        zz       
  <list>    <list>    <list>   
1 <int [2]> <int [2]> <chr [2]>
2 <int [2]> <int [2]> <chr [2]>
3 <int [2]> <int [2]> <chr [2]>
4 <int [2]> <int [2]> <chr [2]>
5 <int [2]> <int [2]> <chr [2]>

CodePudding user response:

This gives you matrices in a list:

res <- setNames(
 lapply( colnames(DF1), function(x) cbind(DF1[[x]], DF2[[x]]) ), 
colnames(DF1) )

To convert the result into a data frame you can use this:

data.frame( sapply(
 names(res), function(x){ sapply(
  1:nrow(res$xx), function(y){ list(res[[x]][y,1:2]) } 
  ) }
 ) )

    xx   yy   zz
1 1, 3 2, 5 a, a
2 2, 4 3, 6 b, i
3 3, 5 4, 7 c, h
4 4, 6 5, 8 d, g
5 5, 7 6, 9 e, f

Put together in a function:

morph <- function(a, b){
  res <- setNames(
    lapply( colnames(a), function(x) cbind(a[[x]], b[[x]]) ), 
  colnames(a) )
 
  data.frame( sapply(
    names(res), function(x){ sapply(
      1:nrow(res$xx), function(y){ list(res[[x]][y,1:2]) } 
    ) }
  ) )
}

morph(DF1,DF2)
    xx   yy   zz
1 1, 3 2, 5 a, a
2 2, 4 3, 6 b, i
3 3, 5 4, 7 c, h
4 4, 6 5, 8 d, g
5 5, 7 6, 9 e, f
  • Related