How to give a consecutive id number for each distinct study in r

Time:09-13

I am trying to create consecutive ID numbers within each distinct study. I found an example dataset where such an ID is stored in the esid variable:

Browse[1]> dat <- dat.assink2016
Browse[1]> head(dat, 9)
  study esid id      yi     vi pubstatus year deltype
1     1    1  1  0.9066 0.0740         1  4.5 general
2     1    2  2  0.4295 0.0398         1  4.5 general
3     1    3  3  0.2679 0.0481         1  4.5 general
4     1    4  4  0.2078 0.0239         1  4.5 general
5     1    5  5  0.0526 0.0331         1  4.5 general
6     1    6  6 -0.0507 0.0886         1  4.5 general
7     2    1  7  0.5117 0.0115         1  1.5 general
8     2    2  8  0.4738 0.0076         1  1.5 general
9     2    3  9  0.3544 0.0065         1  1.5 general

I would like to create the same variable for my own data. Can anyone show me how to do it?

CodePudding user response:

If the id column is consecutive (i.e. no jumps or repeated values) and the rows of each study sit together, you could subtract the minimum value of id for each study and add one:

# Example data
df = data.frame(study=c(1,1,1,2,2,2,2,3,3),
                id=1:9)

# Calculate minima
min.id = tapply(X=df$id,
                INDEX=df$study,
                FUN=min)

# Look up each row's study minimum by name, so study labels need not be 1, 2, 3, ...
df$min.id = min.id[as.character(df$study)]

# Calculate consecutive id as required
df$esid = df$id - df$min.id + 1
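
If id has gaps or the rows of a study are not next to each other, the subtraction above no longer gives 1, 2, 3, ... within each study. A minimal base R sketch that only counts rows per study, assuming the same example df as above, could use ave() with seq_along():

```r
# Example data (same as above)
df <- data.frame(study = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                 id = 1:9)

# ave() splits the row positions by study, applies seq_along() to each
# piece and puts the results back in the original row order
df$esid <- ave(seq_along(df$study), df$study, FUN = seq_along)

print(df)
```

This numbers the rows of each study in the order they appear, so sort the data first if a particular order matters.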

CodePudding user response:

The key is to group_by study, then use row_number:

library(dplyr)

df %>% 
    group_by(study) %>%
    mutate(esid = row_number())

With the example data from @njp, this gives:

# A tibble: 9 × 3
# Groups:   study [3]
  study    id  esid
  <dbl> <int> <int>
1     1     1     1
2     1     2     2
3     1     3     3
4     2     4     1
5     2     5     2
6     2     6     3
7     2     7     4
8     3     8     1
9     3     9     2
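
Note the "Groups: study [3]" line in the output: the result is still grouped by study, so later mutate or summarise calls would keep working per study. If that is not wanted, a small variation (a sketch using the same example data; the name res is just for illustration) drops the grouping with ungroup():

```r
library(dplyr)

# Example data from the first answer
df <- data.frame(study = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
                 id = 1:9)

res <- df %>%
    group_by(study) %>%              # number rows separately within each study
    mutate(esid = row_number()) %>%
    ungroup()                        # remove the grouping from the result

print(res)
```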