How to copy multiple row values at once in an increasing order into a new column?-CodePudding

Although I have searched a lot on stackoverflow I could not find an appropriate answer for my problem. It is quite tricky because I want to extract which players where matched in a group together and store the value in a new column. I have 6 Groups, each consisting of 5 Players, over 20 Periods. Every period Players are randomly shuffled into a new Group. I want to copy 5 row values within the same column (Player ID) in an increasing order into a new column "Unique_GroupID" if and only if they they are in the same Group and Period. The idea behind this player id sorting is to check whether people were in the same Group before or not. In total, I have three columns: PlayerID; Group; Period

Ideally, the output "Unique_GroupID" should look like this:

    df
  PlayerID    Group      Period    Unique_GroupID
    A1         1           1          A1-A2-A3-A4-A5
    A2         1           1          A1-A2-A3-A4-A5
    A3         1           1          A1-A2-A3-A4-A5
    A4         1           1          A1-A2-A3-A4-A5
    A5         1           1          A1-A2-A3-A4-A5

    B1         2           1          B1-B2-B3-B4-B5
    B2         2           1          B1-B2-B3-B4-B5
    B3         2           1          B1-B2-B3-B4-B5
    B4         2           1          B1-B2-B3-B4-B5
    B5         2           1          B1-B2-B3-B4-B5
  .
  .
  .
  .
  .
  .
    A1         1           2          A1-B2-B3-A4-A5
    A5         1           2          A1-B2-B3-A4-A5
    B3         1           2          A1-B2-B3-A4-A5
    A4         1           2          A1-B2-B3-A4-A5
    B2         1           2          A1-B2-B3-A4-A5

    B1         2           2          B1-A2-A3-B4-B5
    B4         2           2          B1-A2-A3-B4-B5
    B5         2           2          B1-A2-A3-B4-B5
    A3         2           2          B1-A2-A3-B4-B5
    A2         2           2          B1-A2-A3-B4-B5

Any ideas? Help would be very much appreciated. Thanks a lot in advance!

CodePudding user response：

First of all always give us a reproducible example such as the one I made below.

What you're after can tersely be done using the dplyr syntax

library(dplyr)
playerid <- c(paste0("A",1:5), paste0("B",1:5))
group <- c(rep(1,5), rep(2,5))
period <- rep(1,10)
df <- data.frame(playerid, group, period)
df <- df %>% group_by(group, period) %>% mutate(uniquegroupid = paste(playerid, collapse = "-"))
# A tibble: 10 x 4
# Groups:   group, period [2]
   playerid group period uniquegroupid 
   <chr>    <dbl>  <dbl> <chr>         
 1 A1           1      1 A1-A2-A3-A4-A5
 2 A2           1      1 A1-A2-A3-A4-A5
 3 A3           1      1 A1-A2-A3-A4-A5
 4 A4           1      1 A1-A2-A3-A4-A5
 5 A5           1      1 A1-A2-A3-A4-A5
 6 B1           2      1 B1-B2-B3-B4-B5
 7 B2           2      1 B1-B2-B3-B4-B5
 8 B3           2      1 B1-B2-B3-B4-B5
 9 B4           2      1 B1-B2-B3-B4-B5
10 B5           2      1 B1-B2-B3-B4-B5

CodePudding user response：

Try this

library(dplyr)

df |> group_by(Group , Period) |> mutate(Unique_GroupID =
paste0(PlayerID[order(gsub("\\D" , "" , PlayerID))] , collapse = "-"))

output

# A tibble: 20 × 4
# Groups:   Group, Period [4]
   PlayerID Group Period Unique_GroupID
   <chr>    <int>  <int> <chr>         
 1 A1           1      1 A1-A2-A3-A4-A5
 2 A2           1      1 A1-A2-A3-A4-A5
 3 A3           1      1 A1-A2-A3-A4-A5
 4 A4           1      1 A1-A2-A3-A4-A5
 5 A5           1      1 A1-A2-A3-A4-A5
 6 B1           2      1 B1-B2-B3-B4-B5
 7 B2           2      1 B1-B2-B3-B4-B5
 8 B3           2      1 B1-B2-B3-B4-B5
 9 B4           2      1 B1-B2-B3-B4-B5
10 B5           2      1 B1-B2-B3-B4-B5
11 A1           1      2 A1-B2-B3-A4-A5
12 A5           1      2 A1-B2-B3-A4-A5
13 B3           1      2 A1-B2-B3-A4-A5
14 A4           1      2 A1-B2-B3-A4-A5
15 B2           1      2 A1-B2-B3-A4-A5
16 B1           2      2 B1-A2-A3-B4-B5
17 B4           2      2 B1-A2-A3-B4-B5
18 B5           2      2 B1-A2-A3-B4-B5
19 A3           2      2 B1-A2-A3-B4-B5
20 A2           2      2 B1-A2-A3-B4-B5

Data

df <- structure(list(PlayerID = c("A1", "A2", "A3", "A4", "A5", "B1", 
"B2", "B3", "B4", "B5", "A1", "A5", "B3", "A4", "B2", "B1", "B4", 
"B5", "A3", "A2"), Group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), Period = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L)), row.names = c(NA, -20L), class = "data.frame")