I have a data set which lists a range of teams (e.g. "A", "B") and the id of the users in each team.
Example data set below (first 2 columns).Actual data runs in to the 500k plus, so I have been trying to implement a loop in R, without success.
I want to be able assess each row against the next,
i.e. if row 1 = team A & row 2 = team A, "pass",
if row 1 = team A & row 2 = team B, "fail"
if row 1 = team B & row 2 = team B, "pass"
if row 1 = team B & row 2 = team A, "fail",
with each iteration comparing the next set of rows (e.g. 1 compared to 2, then 2 to 3, and so on).
The output for each being in a new column, detailing the outcome i.e pass/fail.
If the outcome is "pass", I would like 2 new columns the first having the user id in the first row and the second the user in the row to which it was compared - like an edge list -e.g. in the desired output format below, row 1 and 2 are compared, each having team, and "Pass", therefore user 1 is inserted into user_id_out, and user 2 into the user_id_in column.
If the outcome was fail, the content for user_id_out (and in), will be NA.
Any help on how to do this will be greatly appreciated. :)
CodePudding user response:
Welcome to stackoverflow. Remember to always include a reproducible example of your data, do not share screenshots.
You can do it using dplyr functions.
library(dplyr)
# The data
data <- data.frame(
team = c('A', 'A', 'A', 'A', 'B', 'B', 'B', 'A', 'B', 'A', 'A', 'B', 'B'),
user_id = c(1, 2, 3, 4, 5, 6, 7, 1, 5, 3, 4, 6, 7)
)
data %>%
mutate(
outcome = case_when(
team == lead(team) ~ 'pass',
TRUE ~ 'fail'
),
user_id_out = ifelse(outcome == 'pass', user_id, NA),
user_id_in = ifelse(outcome == 'pass', lead(user_id), NA)
)
#> team user_id outcome user_id_out user_id_in
#> 1 A 1 pass 1 2
#> 2 A 2 pass 2 3
#> 3 A 3 pass 3 4
#> 4 A 4 fail NA NA
#> 5 B 5 pass 5 6
#> 6 B 6 pass 6 7
#> 7 B 7 fail NA NA
#> 8 A 1 fail NA NA
#> 9 B 5 fail NA NA
#> 10 A 3 pass 3 4
#> 11 A 4 fail NA NA
#> 12 B 6 pass 6 7
#> 13 B 7 fail NA NA
Created on 2022-03-29 by the reprex package (v2.0.1)