Repeat dataframe with new column in R

Time:04-14

I have a dataframe:

my_df <- data.frame(var1 = c(1,2,3,4,5), var2 = c(6,7,8,9,10))
my_df
  var1 var2
1    1    6
2    2    7
3    3    8
4    4    9
5    5   10

I also have a vector:

my_vec <- c("a", "b", "c")

I want to repeat the dataframe length(my_vec) times, filling a new variable with the corresponding vector value for each copy. Is there a simple way to do this? If possible, I'd like to do this in a dplyr chain. Desired output:

  var1 var2 var3
1    1    6    a
2    2    7    a
3    3    8    a
4    4    9    a
5    5   10    a
6    1    6    b
7    2    7    b
8    3    8    b
9    4    9    b
10   5   10    b
11   1    6    c
12   2    7    c
13   3    8    c
14   4    9    c
15   5   10    c

CodePudding user response:

We can use crossing or expand_grid from tidyr. Both return every combination of the rows of my_df with the values of my_vec.

library(tidyr)
crossing(my_df, var3 = my_vec)
#expand_grid(my_df, var3 = my_vec)

If the row order is important (all "a" rows first, then "b", then "c"), add arrange

library(dplyr)
crossing(my_df, var3 = my_vec) %>% 
    arrange(var3)

-output

# A tibble: 15 × 3
    var1  var2 var3 
   <dbl> <dbl> <chr>
 1     1     6 a    
 2     2     7 a    
 3     3     8 a    
 4     4     9 a    
 5     5    10 a    
 6     1     6 b    
 7     2     7 b    
 8     3     8 b    
 9     4     9 b    
10     5    10 b    
11     1     6 c    
12     2     7 c    
13     3     8 c    
14     4     9 c    
15     5    10 c   
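
The row order differs between the two functions. As I understand the tidyr documentation, crossing() de-duplicates and sorts its inputs, while expand_grid() keeps them as given and lets the first input vary slowest. Under that assumption, a minimal sketch that gets the desired order without arrange is to pass var3 first and then move it to the end:

```r
library(tidyr)
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# var3 is listed first, so it varies slowest: all rows of my_df are
# emitted for "a", then for "b", then for "c".
result <- expand_grid(var3 = my_vec, my_df) %>%
  # Move var3 behind the original columns to match the desired layout.
  relocate(var3, .after = last_col())

result
```

Unlike crossing(), this does not drop duplicate rows, so it may be the safer choice if my_df can contain repeated rows that should be kept.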

CodePudding user response:

This is probably not the simplest answer in practice, but you asked specifically for a dplyr chain, so I tried to solve it without the existing functions that do the whole job for you.
For your example specifically, you could use this chain with the tibble package functions add_column and add_row:

my_df %>%
  tibble::add_column(var3 = my_vec[1]) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[2])) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[3]))

which directly yields

   var1 var2 var3
1     1    6    a
2     2    7    a
3     3    8    a
4     4    9    a
5     5   10    a
6     1    6    b
7     2    7    b
8     3    8    b
9     4    9    b
10    5   10    b
11    1    6    c
12    2    7    c
13    3    8    c
14    4    9    c
15    5   10    c

The chain above needs one add_row call per vector element, so it does not adapt well to vectors of other lengths. To make it more general, I wrapped the same idea in a function:

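my_fxn <-
  function(frame, yourVector, new.col.name = paste0("var", NCOL(frame) + 1)) {
    # Only tibble is needed here; its functions are called via tibble::
    origcols <- colnames(frame)
    # seq_along() is safe even when yourVector is empty
    for (i in seq_along(yourVector)) {
      # Copy of the frame with the i-th vector value in a temporary column
      intermediateFrame <- tibble::add_column(
        frame,
        temp.name = rep_len(yourVector[[i]], nrow(frame))
      )
      # Rename the temporary column to the requested name
      colnames(intermediateFrame) <- append(origcols, new.col.name)
      if (i == 1) {
        Frame3 <- intermediateFrame
      } else {
        # Stack this copy below the copies built so far
        Frame3 <- tibble::add_row(Frame3, intermediateFrame)
      }
    }
    return(Frame3)
  }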

Running my_fxn(my_df, my_vec) should give you the same data frame as above. I also experimented with a standalone for loop outside a function, but decided that was getting to be overkill. That approach is definitely also possible, though.
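
For comparison, the same loop-and-stack idea can be written more compactly by building one copy of the frame per vector element and binding the copies at the end. This is only a sketch, not a replacement for the function above: repeat_with_column is a hypothetical helper name, and it assumes dplyr is installed for bind_rows.

```r
my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# Hypothetical helper: one copy of `frame` per element of `values`,
# each copy tagged with that element in the column `new_col`.
repeat_with_column <- function(frame, values, new_col = "var3") {
  pieces <- lapply(values, function(value) {
    piece <- frame
    piece[[new_col]] <- value  # a single value is recycled to every row
    piece
  })
  # Stack the copies in the order of `values`
  dplyr::bind_rows(pieces)
}

result <- repeat_with_column(my_df, my_vec)
result
```

Binding once at the end avoids growing the result row by row inside the loop, which tends to matter only for long vectors or large frames.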
