I have a dataframe:
my_df <- data.frame(var1 = c(1,2,3,4,5), var2 = c(6,7,8,9,10))
my_df
  var1 var2
1    1    6
2    2    7
3    3    8
4    4    9
5    5   10
I also have a vector:
my_vec <- c("a", "b", "c")
I want to repeat the dataframe length(my_vec) times, filling in the values of a new variable with the vector values. Is there a simple way to do this? If possible, I'd like to do this in a dplyr chain. Desired output:
   var1 var2 var3
1     1    6    a
2     2    7    a
3     3    8    a
4     4    9    a
5     5   10    a
6     1    6    b
7     2    7    b
8     3    8    b
9     4    9    b
10    5   10    b
11    1    6    c
12    2    7    c
13    3    8    c
14    4    9    c
15    5   10    c
CodePudding user response:
We can use crossing, or alternatively expand_grid, from tidyr:
library(tidyr)
crossing(my_df, var3 = my_vec)
#expand_grid(my_df, var3 = my_vec)
If the row order is important, sort the result with arrange:
library(dplyr)
crossing(my_df, var3 = my_vec) %>%
  arrange(var3)
Output:
# A tibble: 15 × 3
    var1  var2 var3
   <dbl> <dbl> <chr>
 1     1     6 a
 2     2     7 a
 3     3     8 a
 4     4     9 a
 5     5    10 a
 6     1     6 b
 7     2     7 b
 8     3     8 b
 9     4     9 b
10     5    10 b
11     1     6 c
12     2     7 c
13     3     8 c
14     4     9 c
15     5    10 c
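A possible variation, offered as a sketch rather than as part of the answer above: crossing de-duplicates and sorts its inputs, so duplicate rows or a custom row order in my_df would not survive it. Putting the vector first in expand_grid, which varies its first input slowest, keeps the rows of my_df in their original order within each block, and relocate then moves var3 to the end. The name result is only illustrative.

```r
library(tidyr)
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

# expand_grid() varies its first input slowest, so each value of my_vec
# is paired with a full copy of my_df before moving on to the next value
result <- expand_grid(var3 = my_vec, my_df) %>%
  relocate(var3, .after = last_col())  # move var3 to the last position
```

Because nothing is sorted afterwards, no arrange step is needed to get the order shown in the question.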
CodePudding user response:
This is probably not the simplest answer in practice, but since you specifically asked for a dplyr chain, I tried to solve it without the pre-existing functions that do this for you.
For your example specifically, you could use this chain with the add_column and add_row functions from the tibble package:
my_df %>%
  tibble::add_column(var3 = my_vec[1]) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[2])) %>%
  tibble::add_row(tibble::add_column(my_df, var3 = my_vec[3]))
which directly yields
   var1 var2 var3
1     1    6    a
2     2    7    a
3     3    8    a
4     4    9    a
5     5   10    a
6     1    6    b
7     2    7    b
8     3    8    b
9     4    9    b
10    5   10    b
11    1    6    c
12    2    7    c
13    3    8    c
14    4    9    c
15    5   10    c
Though the principle can be extended a bit, it could be more adaptable to whatever you want to apply it to, so I decided to make a function to do it for you.
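my_fxn <- function(frame, yourVector,
                   new.col.name = paste0("var", NCOL(frame) + 1)) {
  # Only the tibble package is needed; it is called via tibble:: below
  origcols <- colnames(frame)
  # seq_along() avoids the 1:0 pitfall of 1:length() on an empty vector
  for (i in seq_along(yourVector)) {
    # Copy the frame and attach the i-th vector value as a temporary column
    intermediateFrame <- tibble::add_column(
      frame,
      temp.name = rep_len(yourVector[[i]], nrow(frame))
    )
    # Rename the temporary column to the requested name
    colnames(intermediateFrame) <- append(origcols, new.col.name)
    if (i == 1) {
      Frame3 <- intermediateFrame
    } else {
      # Stack this copy below the copies built so far
      Frame3 <- tibble::add_row(Frame3, intermediateFrame)
    }
  }
  return(Frame3)
}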
Running my_fxn(my_df, my_vec) should get you the same data frame/table that we got above.
I also experimented with using a for loop on its own, outside a function, to do this, but decided that it was getting to be overkill. That approach is definitely also possible, though.
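The same repeat-and-stack idea can also be written without an explicit for loop. The following is a minimal sketch, not the code I experimented with: it builds one copy of my_df per vector value with lapply and stacks the copies with dplyr's bind_rows. The names value and result are only illustrative.

```r
library(dplyr)

my_df <- data.frame(var1 = c(1, 2, 3, 4, 5), var2 = c(6, 7, 8, 9, 10))
my_vec <- c("a", "b", "c")

result <- my_vec %>%
  # One copy of my_df per vector value, each with var3 set to that value
  lapply(function(value) mutate(my_df, var3 = value)) %>%
  # Stack the copies in the order of my_vec
  bind_rows()
```

This keeps everything in a single pipe and scales to a vector of any length, at the cost of building a temporary list of data frames.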