Although I have searched a lot on stackoverflow I could not find an appropriate answer for my problem. It is quite tricky because I want to extract which players where matched in a group together and store the value in a new column. I have 6 Groups, each consisting of 5 Players, over 20 Periods. Every period Players are randomly shuffled into a new Group. I want to copy 5 row values within the same column (Player ID) in an increasing order into a new column "Unique_GroupID" if and only if they they are in the same Group and Period. The idea behind this player id sorting is to check whether people were in the same Group before or not. In total, I have three columns: PlayerID; Group; Period
Ideally, the output "Unique_GroupID" should look like this:
df
PlayerID Group Period Unique_GroupID
A1 1 1 A1-A2-A3-A4-A5
A2 1 1 A1-A2-A3-A4-A5
A3 1 1 A1-A2-A3-A4-A5
A4 1 1 A1-A2-A3-A4-A5
A5 1 1 A1-A2-A3-A4-A5
B1 2 1 B1-B2-B3-B4-B5
B2 2 1 B1-B2-B3-B4-B5
B3 2 1 B1-B2-B3-B4-B5
B4 2 1 B1-B2-B3-B4-B5
B5 2 1 B1-B2-B3-B4-B5
.
.
.
.
.
.
A1 1 2 A1-B2-B3-A4-A5
A5 1 2 A1-B2-B3-A4-A5
B3 1 2 A1-B2-B3-A4-A5
A4 1 2 A1-B2-B3-A4-A5
B2 1 2 A1-B2-B3-A4-A5
B1 2 2 B1-A2-A3-B4-B5
B4 2 2 B1-A2-A3-B4-B5
B5 2 2 B1-A2-A3-B4-B5
A3 2 2 B1-A2-A3-B4-B5
A2 2 2 B1-A2-A3-B4-B5
Any ideas? Help would be very much appreciated. Thanks a lot in advance!
CodePudding user response:
First of all always give us a reproducible example such as the one I made below.
What you're after can tersely be done using the dplyr syntax
library(dplyr)
playerid <- c(paste0("A",1:5), paste0("B",1:5))
group <- c(rep(1,5), rep(2,5))
period <- rep(1,10)
df <- data.frame(playerid, group, period)
df <- df %>% group_by(group, period) %>% mutate(uniquegroupid = paste(playerid, collapse = "-"))
# A tibble: 10 x 4
# Groups: group, period [2]
playerid group period uniquegroupid
<chr> <dbl> <dbl> <chr>
1 A1 1 1 A1-A2-A3-A4-A5
2 A2 1 1 A1-A2-A3-A4-A5
3 A3 1 1 A1-A2-A3-A4-A5
4 A4 1 1 A1-A2-A3-A4-A5
5 A5 1 1 A1-A2-A3-A4-A5
6 B1 2 1 B1-B2-B3-B4-B5
7 B2 2 1 B1-B2-B3-B4-B5
8 B3 2 1 B1-B2-B3-B4-B5
9 B4 2 1 B1-B2-B3-B4-B5
10 B5 2 1 B1-B2-B3-B4-B5
CodePudding user response:
Try this
library(dplyr)
df |> group_by(Group , Period) |> mutate(Unique_GroupID =
paste0(PlayerID[order(gsub("\\D" , "" , PlayerID))] , collapse = "-"))
- output
# A tibble: 20 × 4
# Groups: Group, Period [4]
PlayerID Group Period Unique_GroupID
<chr> <int> <int> <chr>
1 A1 1 1 A1-A2-A3-A4-A5
2 A2 1 1 A1-A2-A3-A4-A5
3 A3 1 1 A1-A2-A3-A4-A5
4 A4 1 1 A1-A2-A3-A4-A5
5 A5 1 1 A1-A2-A3-A4-A5
6 B1 2 1 B1-B2-B3-B4-B5
7 B2 2 1 B1-B2-B3-B4-B5
8 B3 2 1 B1-B2-B3-B4-B5
9 B4 2 1 B1-B2-B3-B4-B5
10 B5 2 1 B1-B2-B3-B4-B5
11 A1 1 2 A1-B2-B3-A4-A5
12 A5 1 2 A1-B2-B3-A4-A5
13 B3 1 2 A1-B2-B3-A4-A5
14 A4 1 2 A1-B2-B3-A4-A5
15 B2 1 2 A1-B2-B3-A4-A5
16 B1 2 2 B1-A2-A3-B4-B5
17 B4 2 2 B1-A2-A3-B4-B5
18 B5 2 2 B1-A2-A3-B4-B5
19 A3 2 2 B1-A2-A3-B4-B5
20 A2 2 2 B1-A2-A3-B4-B5
- Data
df <- structure(list(PlayerID = c("A1", "A2", "A3", "A4", "A5", "B1",
"B2", "B3", "B4", "B5", "A1", "A5", "B3", "A4", "B2", "B1", "B4",
"B5", "A3", "A2"), Group = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), Period = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L)), row.names = c(NA, -20L), class = "data.frame")