I have this data set in R:
first_variable = rexp(100,100)
second_variable = rexp(100,100)
n_obs = 1:100
question_data = data.frame(n_obs, first_variable, second_variable)
I want to add an id column to this dataset so that:
- Rows 1-10 have id: 1,2,3,4,5,6,7,8,9,10
- Rows 11-20 have id: 1,2,3,4,5,6,7,8,9,10
- Rows 21-30 have id: 1,2,3,4,5,6,7,8,9,10, etc.
In other words, the ids 1-10 repeat for each set of 10 rows.
I found this code that I thought would work:
# here, n = 10 (a set of n = 10 rows)
bloc_len <- 10
question_data$id <-
rep(seq(1, 1 + nrow(question_data) %/% bloc_len), each = bloc_len, length.out = nrow(question_data))
But this is not working: it gives every row within a set of 10 rows the same id:
n_obs first_variable second_variable id
1 1 0.006223412 0.0258968583 1
2 2 0.004473815 0.0065543554 1
3 3 0.011745754 0.0005061101 1
4 4 0.005620351 0.0033549525 1
5 5 0.045860202 0.0132625822 1
6 6 0.002477348 0.0068517981 1
I would have wanted something like this:
n_obs first_variable second_variable id
1 1 0.0062234115 0.0258968583 1
2 2 0.0044738150 0.0065543554 2
3 3 0.0117457544 0.0005061101 3
4 4 0.0056203508 0.0033549525 4
5 5 0.0458602019 0.0132625822 5
6 6 0.0024773478 0.0068517981 6
7 7 0.0049527013 0.0047461094 7
8 8 0.0058581805 0.0108604478 8
9 9 0.0041171801 0.0002445268 9
10 10 0.0090667287 0.0019289691 10
11 11 0.0039002449 0.0135441919 1
12 12 0.0064558661 0.0230979415 2
13 13 0.0104993267 0.0005609776 3
14 14 0.0153162705 0.0038364012 4
15 15 0.0107109676 0.0183818539 5
16 16 0.0131620151 0.0029710189 6
17 17 0.0244441763 0.0095645480 7
18 18 0.0058112355 0.0125754349 8
19 19 0.0005022588 0.0156614272 9
20 20 0.0007572985 0.0049964333 10
21 21 0.0276024376 0.0024303513 1
Is this possible?
Thank you!
CodePudding user response:
Instead of each, try using times:
question_data$id <-
rep(seq(bloc_len), times = nrow(question_data) %/% bloc_len, length.out = nrow(question_data))
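To see why this changes the result, here is a small sketch comparing the two arguments. The block length of 3 and the row count of 7 are made-up values chosen only to keep the output short; they are not from the question.

```r
bloc_len <- 3  # illustrative block length (assumption, not the asker's 10)
n_rows <- 7    # illustrative row count, deliberately not a multiple of bloc_len

# each: every element is repeated bloc_len times before moving to the next one
by_each <- rep(seq_len(bloc_len), each = bloc_len, length.out = n_rows)

# times: the whole sequence is repeated, so the ids cycle 1, 2, 3, 1, 2, 3, ...
# (when both are given, length.out takes priority over times)
by_times <- rep(seq_len(bloc_len), times = n_rows %/% bloc_len, length.out = n_rows)

print(by_each)   # 1 1 1 2 2 2 3
print(by_times)  # 1 2 3 1 2 3 1
```

Because length.out fixes the final length, the result is truncated or extended to exactly n_rows in both cases.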
CodePudding user response:
As in the example shared, if the number of rows in the data (100) is exactly divisible by the number of ids (10), we can rely on R's recycling to repeat the ids.
bloc_len <- 10
question_data$id <- seq_len(bloc_len)
If they are not completely divisible, we can use rep:
question_data$id <- rep(seq_len(bloc_len), length.out = nrow(question_data))
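As a quick check of the non-divisible case, the sketch below uses a hypothetical 25-row data frame (the row count and the seed are assumptions for illustration) and compares the rep result with ids computed from modular arithmetic on the row index, which should give the same cycling pattern.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
# hypothetical data: 25 rows, which is not a multiple of 10
question_data <- data.frame(n_obs = 1:25, first_variable = rexp(25, 100))
bloc_len <- 10

# rep() cycles through 1..bloc_len and stops at exactly nrow(question_data)
question_data$id <- rep(seq_len(bloc_len), length.out = nrow(question_data))

# alternative: derive the id from the row index with the modulo operator
id_mod <- (seq_len(nrow(question_data)) - 1) %% bloc_len + 1

stopifnot(all(question_data$id == id_mod))
print(tail(question_data$id, 5))  # 1 2 3 4 5
```

The last block is simply cut short: rows 21-25 get ids 1-5.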