I have a data frame that is supposed to show the winners of a tournament and their opponents. Currently the winner and loser of each match are on alternating rows: row 1 is a winner, row 2 is the loser of that match, row 3 is the next winner, row 4 is the next loser, and so on.
I want each winner and their opponent to be next to each other so that it's easier to see who competed against whom. The tricky part is keeping the gym, name, and competitor number for each person together in the same row.
How do I move every other row into new columns so that the winner and their opponent are in the same row?
y = read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/y.csv')
# FAILED ATTEMPT
library(data.table)
z=dcast(setDT(y)[, grp := gl(.N, 2, .N)], grp ~ rowid(grp),
value.var = setdiff(names(y), 'grp'))[, grp := NULL][]
Note that the two photos show different data sets.
What my df currently looks like:
Similar to what I want it to look like:
CodePudding user response:
Using dplyr, you could do:
library(dplyr)
read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/y.csv') %>%
  group_by(fight, date) %>%
  summarise(division = first(division),
            competitor_1 = first(competitor),
            name_1 = first(name),
            competitor_2 = last(competitor),
            name_2 = last(name))
#> `summarise()` has grouped output by 'fight'. You can override using the
#> `.groups` argument.
#> # A tibble: 61 x 7
#> # Groups: fight [26]
#> fight date division competitor_1 name_1 compe~1 name_2
#> <chr> <chr> <chr> <int> <chr> <int> <chr>
#> 1 BYE BYE Master 2 1 Rafael M~ 1 Rafae~
#> 2 FIGHT 19 Thu 09/01 at 12:14 PM Master 2 2 Piter Fr~ 63 Alan ~
#> 3 FIGHT 20 Thu 09/01 at 01:01 PM Master 2 16 Marques ~ 55 Diego~
#> 4 FIGHT 20 Thu 09/01 at 12:13 PM Master 2 28 Kenned D~ 44 Verge~
#> 5 FIGHT 22 Thu 09/01 at 12:27 PM Master 2 4 Marcus V~ 52 Kian ~
#> 6 FIGHT 23 Thu 09/01 at 12:33 PM Master 2 30 Adam Col~ 46 Steph~
#> 7 FIGHT 23 Thu 09/01 at 12:54 PM Master 2 31 Namrod B~ 47 Stefa~
#> 8 FIGHT 23 Thu 09/01 at 12:58 PM Master 2 13 David Ch~ 53 Joshu~
#> 9 FIGHT 24 Thu 09/01 at 01:08 PM Master 2 3 Sandro G~ 56 Carlo~
#> 10 FIGHT 24 Thu 09/01 at 12:35 PM Master 2 8 Rafael R~ 60 Andre~
#> # ... with 51 more rows, and abbreviated variable name 1: competitor_2
Created on 2022-09-16 with reprex v2.0.2
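The message in the output above says the result is still grouped by `fight`. A minimal sketch of how that could be avoided, assuming you want a plain ungrouped tibble, is to pass `.groups = "drop"` to `summarise()`. The data frame `toy` below is made up for illustration and only mimics a few columns of the real file. Note that `first()` and `last()` only pair people correctly when each group holds exactly two rows.

```r
library(dplyr)

# Hypothetical toy data: two fights, winner listed first, loser second
toy <- data.frame(
  fight      = c("FIGHT 1", "FIGHT 1", "FIGHT 2", "FIGHT 2"),
  date       = c("d1", "d1", "d2", "d2"),
  competitor = c(1, 2, 3, 4),
  name       = c("A", "B", "C", "D")
)

wide <- toy %>%
  group_by(fight, date) %>%
  summarise(competitor_1 = first(competitor),  # first row of the group
            name_1       = first(name),
            competitor_2 = last(competitor),   # last row of the group
            name_2       = last(name),
            .groups = "drop")                  # return an ungrouped tibble

wide
```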
CodePudding user response:
There are some problems with your dataset, e.g. for "FIGHT 22" there are four entries (from your description I expected two entries).
  division gender belt  weight fight    date                  competitor name                   gym
  <chr>    <chr>  <chr> <chr>  <chr>    <chr>                      <dbl> <chr>                  <chr>
1 Master 2 Male   BLACK Middle FIGHT 22 Thu 09/01 at 12:27 PM          4 Marcus V. C. Antelante Ares BJJ
2 Master 2 Male   BLACK Middle FIGHT 22 Thu 09/01 at 12:27 PM         62 Andrew E. Ganthier     Renzo Gracie Academy
3 Master 2 Male   BLACK Middle FIGHT 22 Thu 09/01 at 12:27 PM         11 Jimmy Dang Khoa Tat    CheckMat
4 Master 2 Male   BLACK Middle FIGHT 22 Thu 09/01 at 12:27 PM         52 Kian Takumi Kadota     Brasa CTA
The same problem exists for fights 26 and 35. Assuming these are corrected, and assuming odd rows contain winners and even rows contain losers, the following code should work (using tidyverse):
library(tidyverse)

y %>%
  # odd rows are assumed to be winners, even rows losers
  mutate(outcome = if_else(row_number() %% 2 == 1, "winner", "loser")) %>%
  pivot_wider(names_from = outcome, values_from = c(competitor, name, gym))
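If the duplicated fights cannot be corrected, one possible workaround is to add an explicit pair index so that every two consecutive rows get their own identifier. This is only a sketch: it assumes the rows strictly alternate winner, loser, and both the `pair` column and the `y_toy` data frame are hypothetical names introduced here for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: four rows sharing the same fight label,
# mimicking the duplicated "FIGHT 22" entries
y_toy <- data.frame(
  fight      = c("FIGHT 1", "FIGHT 1", "FIGHT 1", "FIGHT 1"),
  competitor = c(4, 62, 11, 52),
  name       = c("A", "B", "C", "D"),
  gym        = c("G1", "G2", "G3", "G4")
)

paired <- y_toy %>%
  mutate(
    # rows 1-2 get pair 1, rows 3-4 get pair 2, and so on
    pair    = (row_number() + 1) %/% 2,
    outcome = if_else(row_number() %% 2 == 1, "winner", "loser")
  ) %>%
  # `fight` and `pair` together identify one output row
  pivot_wider(names_from = outcome, values_from = c(competitor, name, gym))

paired
```

Because `pair` is part of the row identifier, two matches that share the same fight label and date stay on separate rows instead of colliding into list-columns.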