Home > Mobile >  Comparing two dataset
Comparing two dataset

Time:06-02

I made minimal reproducible example, but my real data is huge so I cannot do manually

iq<-c(100,103,33,150)
id<-1:4

data<-data.frame(iq,id)


iq<-c(180,183,113,100)
id<-1:4

data2<-data.frame(iq,id)

iq<-c(190,103,113,140)
id<-1:4

data100<-data.frame(iq,id)


Let's say I have these two dataset. I want to merge these two data into one data, such that the final one dataset have many columns such as :"id_data1", "id_data2", "iq_data1","iq_data2" ..."id_data100", "iq_data100"

This data is small so I can do it manually, but my real data is really huge. How can I do this?

CodePudding user response:

You can try,

d1 <- Reduce(cbind.data.frame, mget(ls(pattern = 'data')))
names(d1) <- make.unique(names(d1))

d1
   iq id iq.1 id.1 iq.2 id.2
1 100  1  190    1  180    1
2 103  2  103    2  183    2
3  33  3  113    3  113    3
4 150  4  140    4  100    4

CodePudding user response:

And an approach where you can keep the names of the dataframes as part of the variable names in the new dataframe could be:

library(tidyverse)

mget(ls(pattern = "data")) |> # Change prefix here
map_df(c, .id = "dataset") |>
pivot_wider(names_from = dataset,
            names_sep = "_",
            values_from = c(iq, id),
            values_fn = list) |> # ID's non unique, keep as list
unnest(everything()) # Get rid of lists

# A tibble: 4 × 6
#   iq_data iq_data100 iq_data2 id_data id_data100 id_data2
#     <dbl>      <dbl>    <dbl>   <int>      <int>    <int>
# 1     100        190      180       1          1        1
# 2     103        103      183       2          2        2
# 3      33        113      113       3          3        3
# 4     150        140      100       4          4        4

Note though that unless you have a very specific reason to do so, I'd suggest keeping the combined data in a long format instead. In such case skip everything efter map_df.

CodePudding user response:

Another possible solution, based on purrr::imap_dfr and tidyr::pivot_wider:

library(tidyverse)

imap_dfr(list(data, data2, data100), ~ data.frame(.x, df_id =.y)) %>% 
  pivot_wider(values_from = c(iq, id), names_from = df_id, values_fn = list) %>% 
  unchop(everything()) %>% 
  select(order(parse_number(names(.))))

#> # A tibble: 4 × 6
#>    iq_1  id_1  iq_2  id_2  iq_3  id_3
#>   <dbl> <int> <dbl> <int> <dbl> <int>
#> 1   100     1   180     1   190     1
#> 2   103     2   183     2   103     2
#> 3    33     3   113     3   113     3
#> 4   150     4   100     4   140     4
  • Related