Home > Software design >  Join columns, duplicating existing row for each variable in new column
Join columns, duplicating existing row for each variable in new column

Time:05-25

Ok so say we have two different columns of data. One is just a running date column:

day
2017-11-08              
2017-11-09              
2017-11-10              
2017-11-13              
2017-11-14              
2017-11-15  

The other is just an ID column

ID
asdflauih3298b43f9n
akjwn3ibfun9834n93n
nv43879n4vliuhs87ba

I would like to join these columns in a way so that each ID occurs for every day. Like this:

day            ID
2017-11-08     asdflauih3298b43f9n  
2017-11-08     akjwn3ibfun9834n93n
2017-11-08     nv43879n4vliuhs87ba  
2017-11-09     asdflauih3298b43f9n  
2017-11-09     akjwn3ibfun9834n93n      
2017-11-09     nv43879n4vliuhs87ba              
2017-11-10     asdflauih3298b43f9n  
2017-11-10     akjwn3ibfun9834n93n
2017-11-10     nv43879n4vliuhs87ba

Looking to do this in tidyverse. I know this should be really easy, but I'm stumped. Thanks!

CodePudding user response:

You can use the regular joining functions from dplyr to perform a cross-join like this by setting the by argument to character():

library(dplyr)

left_join(A, B, by = character())

#> # A tibble: 18 × 2
#>    day        ID                 
#>    <chr>      <chr>              
#>  1 2017-11-08 asdflauih3298b43f9n
#>  2 2017-11-08 akjwn3ibfun9834n93n
#>  3 2017-11-08 nv43879n4vliuhs87ba
#>  4 2017-11-09 asdflauih3298b43f9n
#>  5 2017-11-09 akjwn3ibfun9834n93n
#>  6 2017-11-09 nv43879n4vliuhs87ba
#>  7 2017-11-10 asdflauih3298b43f9n
#>  8 2017-11-10 akjwn3ibfun9834n93n
#>  9 2017-11-10 nv43879n4vliuhs87ba
#> 10 2017-11-13 asdflauih3298b43f9n
#> 11 2017-11-13 akjwn3ibfun9834n93n
#> 12 2017-11-13 nv43879n4vliuhs87ba
#> 13 2017-11-14 asdflauih3298b43f9n
#> 14 2017-11-14 akjwn3ibfun9834n93n
#> 15 2017-11-14 nv43879n4vliuhs87ba
#> 16 2017-11-15 asdflauih3298b43f9n
#> 17 2017-11-15 akjwn3ibfun9834n93n
#> 18 2017-11-15 nv43879n4vliuhs87ba

Created on 2022-05-24 by the reprex package (v2.0.1)

Data

A <- tibble::tribble(
  ~day,
  "2017-11-08",              
  "2017-11-09",              
  "2017-11-10",              
  "2017-11-13",              
  "2017-11-14",              
  "2017-11-15"  
)

B <- tibble::tribble(
  ~ID,
  "asdflauih3298b43f9n",
  "akjwn3ibfun9834n93n",
  "nv43879n4vliuhs87ba"
)

CodePudding user response:

It may be a case for crossing

tidyr::crossing(df1, df2)

-output

# A tibble: 18 × 2
   day        ID                 
   <chr>      <chr>              
 1 2017-11-08 akjwn3ibfun9834n93n
 2 2017-11-08 asdflauih3298b43f9n
 3 2017-11-08 nv43879n4vliuhs87ba
 4 2017-11-09 akjwn3ibfun9834n93n
 5 2017-11-09 asdflauih3298b43f9n
 6 2017-11-09 nv43879n4vliuhs87ba
 7 2017-11-10 akjwn3ibfun9834n93n
 8 2017-11-10 asdflauih3298b43f9n
 9 2017-11-10 nv43879n4vliuhs87ba
10 2017-11-13 akjwn3ibfun9834n93n
11 2017-11-13 asdflauih3298b43f9n
12 2017-11-13 nv43879n4vliuhs87ba
13 2017-11-14 akjwn3ibfun9834n93n
14 2017-11-14 asdflauih3298b43f9n
15 2017-11-14 nv43879n4vliuhs87ba
16 2017-11-15 akjwn3ibfun9834n93n
17 2017-11-15 asdflauih3298b43f9n
18 2017-11-15 nv43879n4vliuhs87ba

data

df1 <- structure(list(day = c("2017-11-08", "2017-11-09", "2017-11-10", 
"2017-11-13", "2017-11-14", "2017-11-15")), 
class = "data.frame", row.names = c(NA, 
-6L))

df2 <- structure(list(ID = c("asdflauih3298b43f9n", "akjwn3ibfun9834n93n", 
"nv43879n4vliuhs87ba")), class = "data.frame",
 row.names = c(NA, 
-3L))
  • Related