So basically I have a data frame that looks like this:
BX | BY |
---|---|
1 | 12 |
1 | 12 |
1 | 12 |
2 | 14 |
2 | 14 |
3 | 5 |
I want to create another colum ID
, which will have the same number for the same values in BX
and BY
. So the table would look like this then:
BX | BY | ID |
---|---|---|
1 | 12 | 1 |
1 | 12 | 1 |
1 | 12 | 1 |
2 | 14 | 2 |
2 | 14 | 2 |
3 | 5 | 3 |
CodePudding user response:
Here is a base R way.
Subset the data.frame by the grouping columns, find the duplicated rows and use a standard cumsum
trick.
df1<-'BX BY
1 12
1 12
1 12
2 14
2 14
3 5'
df1 <- read.table(textConnection(df1), header = TRUE)
cumsum(!duplicated(df1[c("BX", "BY")]))
#> [1] 1 1 1 2 2 3
df1$ID <- cumsum(!duplicated(df1[c("BX", "BY")]))
df1
#> BX BY ID
#> 1 1 12 1
#> 2 1 12 1
#> 3 1 12 1
#> 4 2 14 2
#> 5 2 14 2
#> 6 3 5 3
Created on 2022-10-12 with reprex v2.0.2
CodePudding user response:
You can do:
transform(dat, ID = as.numeric(interaction(dat, drop = TRUE, lex.order = TRUE)))
BX BY ID
1 1 12 1
2 1 12 1
3 1 12 1
4 2 14 2
5 2 14 2
6 3 5 3
Or if you prefer dplyr
:
library(dplyr)
dat %>%
group_by(across()) %>%
mutate(ID = cur_group_id()) %>%
ungroup()
# A tibble: 6 × 3
BX BY ID
<dbl> <dbl> <int>
1 1 12 1
2 1 12 1
3 1 12 1
4 2 14 2
5 2 14 2
6 3 5 3