I want to create a synthetic dataframe with two variables with repeat observations

Time:04-06

I need to create a synthetic dataset with multiple variables and more than 50 observations. I have chosen to create synthetic data for an oil field with 10 wells and five producing reservoirs, so my dataframe would have 3 variables: "Well ID", "Reservoir Name", and "Reservoir Quality".

So, I want a dataframe in which each well has 5 reservoirs, and each reservoir has 3 rock qualities: "Sand", "Shale", and "Cement".

I tried this for 2 variables in a crude way:

well1 <- data.frame(Wells = rep(1, 5), Reservoirs = c("A", "B", "C", "D","E"))
well2 <- data.frame(Wells = rep(2, 5), Reservoirs = c("A", "B", "C", "D","E"))
.
.
static_data <- rbind(well1,well2,...)

Now I am struggling to add the 3rd variable. Is there a smarter way of doing this?

I am looking for something like this -

Well  Reservoir  Rock Quality
1     A          Sand
1     A          Shale
1     A          Cement
1     B          Sand
1     B          Shale
1     B          Cement

CodePudding user response:

The data.table package has a cross-join function, CJ(), that I think gives what you need.

library(data.table)
CJ(a=c(1,2,3), b=c('a', 'b'), c=c('Y', 'Z'))
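
Applied to the oil-field example from the question, the same cross-join idea can be sketched in base R with expand.grid(), swapped in here for CJ() so that no extra package is needed. The column names (Well, Reservoir, Quality) and the well numbering 1 to 10 are assumptions based on the question, not a fixed schema.

```r
# Assumed labels: wells numbered 1..10, reservoirs A..E, three rock qualities.
# expand.grid() varies its FIRST argument fastest, so the factors are listed
# in reverse to get rows ordered by well, then reservoir, then quality.
static_data <- expand.grid(
  Quality   = c("Sand", "Shale", "Cement"),
  Reservoir = c("A", "B", "C", "D", "E"),
  Well      = 1:10,
  stringsAsFactors = FALSE
)

# Put the columns in the order shown in the question.
static_data <- static_data[, c("Well", "Reservoir", "Quality")]

nrow(static_data)  # 10 wells * 5 reservoirs * 3 qualities = 150 rows
head(static_data)
```

With data.table, passing the same three vectors to CJ() in Well, Reservoir, Quality order should give an equivalent table. Note that CJ() sorts each column by default, so the qualities would come out alphabetically unless sorted = FALSE is passed.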