I have a simple data generation question and would appreciate any help with code in R or Python. I am pasting the table first.
Total | Num1_betw_1_to_4 | Num2_betw_1_to_3 | Num3_betw_1_to_3 |
---|---|---|---|
9 | 3 | 3 | 3 |
7 | 1 | 3 | 3 |
9 | 4 | 3 | 2 |
9 | 3 | 3 | 3 |
5 | 2 | 2 | 1 |
7 | 3 | 2 | 2 |
9 | 3 | 3 | 3 |
7 | 2 | 3 | 2 |
5 | | | |
6 | | | |
2 | | | |
4 | | | |
9 | | | |
In the table above, the values in the first column are given. For each row, I want to generate three values, in columns 2, 3 and 4, that sum to the value in column 1. Each of these columns has a predefined range: the column 2 value must lie between 1 and 4, and the column 3 and column 4 values must each lie between 1 and 3.
I have filled in the first 8 rows as an illustration. In the real case, only the "Total" column is given; the remaining 3 columns are blank and their values have to be generated.
Any help with the code would be appreciated.
CodePudding user response:
This is straightforward in R.
First make a data frame of all possible allowed values of each column:
df <- expand.grid(Num1_1_to_4 = 1:4,
                  Num2_1_to_3 = 1:3,
                  Num3_1_to_3 = 1:3)
Now throw away any rows that don't sum to the required total (7 in this example):
df <- df[rowSums(df) == 7,]
Finally, sample this data frame:
df[sample(nrow(df), 1),]
#>    Num1_1_to_4 Num2_1_to_3 Num3_1_to_3
#> 19           3           2           2
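The same enumerate, filter and sample idea can be applied to every total in the column at once. Below is a rough Python sketch of that, not a translation of the R code above: the ranges are the ones from the question, and the names RANGES, combos_by_total and sample_rows are made up for illustration. Totals that no combination can reach (for example 2, since the smallest possible sum is 1 + 1 + 1 = 3) come back as None.

```python
import itertools
import random

# Inclusive column ranges taken from the question; the name is illustrative.
RANGES = [(1, 4), (1, 3), (1, 3)]

def combos_by_total(ranges):
    """Group every allowed combination of column values by its sum."""
    table = {}
    values = [range(lo, hi + 1) for lo, hi in ranges]
    for combo in itertools.product(*values):
        table.setdefault(sum(combo), []).append(combo)
    return table

def sample_rows(totals, ranges=RANGES, rng=random):
    """Pick one combination uniformly at random for each total.

    A total that no combination can reach yields None.
    """
    table = combos_by_total(ranges)
    return [rng.choice(table[t]) if t in table else None for t in totals]

# "Total" column from the question, including the rows that were left blank.
totals = [9, 7, 9, 9, 5, 7, 9, 7, 5, 6, 2, 4, 9]
for total, row in zip(totals, sample_rows(totals)):
    print(total, row)
```

Building the lookup table once keeps the per-row work to a single random choice, which should be fine for small ranges like these; with many columns or wide ranges the full enumeration would grow quickly.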
CodePudding user response:
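Here is an algorithm that generates random non-negative integers that sum to a given total, for example 20:
import random

num = 20   # amount still left to distribute
temp = 0   # running sum of the values drawn so far
res = []
while temp != 20:
    # never draw more than what is still left
    res.append(random.randint(0, num))
    temp += res[-1]
    num -= res[-1]
print(res)
print(temp)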
Hope this helps you a bit; try to optimize the idea further. Sorry, it's late, gotta go.
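The loop above does not use the per-column ranges from the question. One possible way to take the idea further is to draw the columns from left to right and, at each step, narrow the allowed interval so that the columns still to come can always make up the rest of the total. This is only a sketch under that assumption: the function name bounded_split is hypothetical, and the result is not guaranteed to be uniformly distributed over all valid rows.

```python
import random

def bounded_split(total, ranges, rng=random):
    """Split `total` into one integer per inclusive (lo, hi) range, left to right."""
    lows = [lo for lo, _ in ranges]
    highs = [hi for _, hi in ranges]
    if not sum(lows) <= total <= sum(highs):
        raise ValueError(f"total {total} is not reachable with ranges {ranges}")

    remaining = total
    result = []
    for i, (lo, hi) in enumerate(ranges):
        # Smallest and largest sums the later columns can still contribute.
        rest_lo = sum(lows[i + 1:])
        rest_hi = sum(highs[i + 1:])
        # Narrow this column's interval so the later columns can finish the total.
        lo_i = max(lo, remaining - rest_hi)
        hi_i = min(hi, remaining - rest_lo)
        value = rng.randint(lo_i, hi_i)
        result.append(value)
        remaining -= value
    return result

ranges = [(1, 4), (1, 3), (1, 3)]
for total in [9, 7, 5, 6, 4]:
    print(total, bounded_split(total, ranges))
```

Because every draw is constrained up front, no retries are needed, and an unreachable total such as 2 raises an error instead of looping forever.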
CodePudding user response:
Here is a base R solution. The input ranges and totals must be in the formats below:
ranges is a list of integer vectors of length 2;
sums is a vector of sums.
The output is a matrix with as many rows as the length of the sums vector and with as many columns as the length of ranges.
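rintsum <- function(ranges, sums) {
  # f draws one row: one value per range, with the values summing to s
  f <- function(r, s) {
    n <- length(r)
    x <- integer(n)
    # retry until the last value falls inside its own range
    while(x[n] < r[[n]][1] || x[n] > r[[n]][2]) {
      # draw every value except the last uniformly from its range
      for(i in seq_along(x)[-n]) {
        x[i] <- sample(r[[i]][1]:r[[i]][2], 1L)
      }
      # the last value is whatever is left of the total
      x[n] <- s - sum(x[-n])
    }
    x
  }
  # one row per total
  t(sapply(sums, \(s) f(ranges, s)))
}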
Total <- c(9, 7, 9, 9, 5, 7, 9, 7)
ranges <- list(c(1, 4), c(1, 3), c(1, 3))
set.seed(2022)
rintsum(ranges, Total)
#>      [,1] [,2] [,3]
#> [1,]    4    3    2
#> [2,]    2    3    2
#> [3,]    4    3    2
#> [4,]    4    3    2
#> [5,]    1    2    2
#> [6,]    3    2    2
#> [7,]    4    3    2
#> [8,]    4    2    1
Created on 2022-10-23 with reprex v2.0.2
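For anyone who prefers Python, the retry idea from the R function can be sketched roughly as below. This is an approximate port, not the answer's own code: the name rint_sum is made up, and the up-front feasibility check is an addition, so that a total outside the reachable range (such as 2 in the question's table, where the minimum is 3) gives None instead of retrying forever.

```python
import random

def rint_sum(ranges, sums, rng=random):
    """For each total, draw one integer per inclusive (lo, hi) range summing to it."""
    lo_total = sum(lo for lo, _ in ranges)
    hi_total = sum(hi for _, hi in ranges)
    last_lo, last_hi = ranges[-1]
    rows = []
    for s in sums:
        # Added guard (not in the R code): skip totals no combination can reach.
        if not lo_total <= s <= hi_total:
            rows.append(None)
            continue
        while True:
            # Draw every column except the last uniformly from its range.
            head = [rng.randint(lo, hi) for lo, hi in ranges[:-1]]
            # The last column is whatever is left of the total.
            last = s - sum(head)
            if last_lo <= last <= last_hi:
                rows.append(head + [last])
                break
    return rows

ranges = [(1, 4), (1, 3), (1, 3)]
totals = [9, 7, 9, 9, 5, 7, 9, 7, 5, 6, 2, 4, 9]
for total, row in zip(totals, rint_sum(ranges, totals)):
    print(total, row)
```

Since each valid row corresponds to exactly one accepted draw of the leading columns, every valid row for a given total is equally likely with this scheme.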