Numbering duplicate rows with the same value

I have this data

df <- data.frame(
  id = c(1L,1L,1L,2L,2L,2L,3L,3L),
  groupA = c("A","A","B","B","B","B","A","A"),
  groupB = c("red", "red", "red", "blue", "red", "blue", "blue", "red"))


      id groupA groupB
    1  1      A    red
    2  1      A    red
    3  1      B    red
    4  2      B   blue
    5  2      B    red
    6  2      B   blue
    7  3      A   blue
    8  3      A    red

I would like to number the distinct combinations of groupA and groupB within each id, so that duplicate rows share the same number, like this:

      id groupA groupB nr.group
    1  1      A    red        1
    2  1      A    red        1
    3  1      B    red        2
    4  2      B   blue        1
    5  2      B    red        2
    6  2      B   blue        1
    7  3      A   blue        1
    8  3      A    red        2

My solution

  df %>%
    group_by(id, groupA, groupB)%>%
    mutate(nr.group = 1:n())

But this counts the rows within each group instead of numbering the groups. I would like to see both a dplyr and a base R solution so I can compare them.

CodePudding user response:

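You could try either of the following. The first numbers each groupA/groupB combination by its first appearance within id:

library(dplyr)

df %>%
  group_by(id) %>%
  mutate(grp = paste(groupA, groupB),
         nr.group = match(grp, unique(grp))) %>%
  ungroup() %>%
  select(-grp)

The second numbers the distinct rows within each id and joins the result back onto the original data:

df %>%
  distinct() %>%
  group_by(id) %>%
  mutate(nr.group = row_number()) %>%
  ungroup() %>%
  left_join(df, ., by = c("id", "groupA", "groupB"))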

Output

# A tibble: 8 × 4
     id groupA groupB nr.group
  <int> <chr>  <chr>     <int>
1     1 A      red           1
2     1 A      red           1
3     1 B      red           2
4     2 B      blue          1
5     2 B      red           2
6     2 B      blue          1
7     3 A      blue          1
8     3 A      red           2

CodePudding user response:

We could build a combined key, convert it to a factor whose levels follow the order of first appearance, and coerce that factor to integer:

library(dplyr)
library(stringr)
df %>%
   group_by(id) %>%
   mutate(grp = str_c(groupA, groupB),
     nr.group = as.integer(factor(grp, levels = unique(grp)))) %>%
   ungroup() %>%
   select(-grp)

Output

# A tibble: 8 × 4
     id groupA groupB nr.group
  <int> <chr>  <chr>     <int>
1     1 A      red           1
2     1 A      red           1
3     1 B      red           2
4     2 B      blue          1
5     2 B      red           2
6     2 B      blue          1
7     3 A      blue          1
8     3 A      red           2
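
For the base R comparison the question asks for, here is a minimal sketch of the same match()/unique() idea using ave(). The helper variable key is introduced only for illustration, and the sketch assumes the values in groupA and groupB contain no spaces that could make two different pairs paste to the same string.

```r
# Same example data as in the question
df <- data.frame(
  id = c(1L,1L,1L,2L,2L,2L,3L,3L),
  groupA = c("A","A","B","B","B","B","A","A"),
  groupB = c("red", "red", "red", "blue", "red", "blue", "blue", "red"))

# Combined key per row (illustrative helper, not part of the original answers)
key <- paste(df$groupA, df$groupB)

# Within each id, number the keys in order of first appearance.
# ave() passes the row indices of each id group to FUN.
df$nr.group <- ave(seq_along(key), df$id,
                   FUN = function(i) match(key[i], unique(key[i])))

df
```

Passing row indices to ave() rather than the key itself keeps the result an integer vector; passing the character key directly would coerce the numbers back to character.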