I am working with the R programming language. I have the following data set:
set.seed(123)
Game = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16,17,17,18,18,19,19,20,20)
id = c(3,4,3,4,3,4,3,4,3,4,3,4, 3,4,3,4,3,4,3,4)
c <- c("1", "2")
coin <- sample(c, 20, replace=TRUE, prob=c(0.5,0.5))
winner <- c("win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win", "win", "win", "lose", "lose", "tie", "tie", "lose", "lose", "win", "win")
my_data = data.frame(Game, id, coin, winner)
The data ("my_data") looks something like this:
Game id coin winner
1 1 3 2 win
2 1 4 1 win
3 2 3 2 lose
4 2 4 1 lose
5 3 3 1 tie
6 3 4 2 tie
For this dataset ("my_data"), I want to perform the following operations:
For each unique value of the "Game" variable (e.g. Game = 1, Game = 2, etc.), find out the frequency for each unique "Coin" combinations. For example, perhaps Coin = (2,1) happens 5 times , Coin = (2,2) happens 11 times, etc.
Next, for each of these unique "Coin" combinations - find the breakdowns of "win", "lose" and "tie". For example, for Coin = (2,2) there might be 5/11 wins, 3/11 loss and 3/11 tie.
I tried to accomplish this with the following code :
Part 1: (Manually) Find out unique coin combinations per Game (e.g, 1,1 OR 1,2 OR 2,1 OR 2,2)
for (i in 1:19) {
for (j in 2:20) {
my_data$comb = ifelse(my_data[i,3] == "1" & my_data[j,3] == "1", "one,one", ifelse(my_data[i,3] == "2" & my_data[j,3] == "1", "two, one", ifelse(my_data[i,3] == "1" & my_data[j,3] == "2", "one,two", "two,two)))
}
}
Part 2: (Had this worked) Find out Win/Tie/Loss Breakdown for each unique combination from Part 1:
library(dplyr)
my_data %>% group_by(comb) %>% summarise(percent = n() )
The desired output should look something like this (note: 1,2 = 2,1):
Currently, I am importing "my_data" into Microsoft Excel - but can someone please show me how to do this in R?
Can someone please show me how to obtain the above table?
Thanks!
CodePudding user response:
You can use the following codes:
library(tidyverse)
ff <- my_data %>% group_by(Game) %>% arrange(Game, coin) %>%
do(as.data.frame(t(combn(.[["coin"]], 2)))) %>% mutate(coin = paste(V1, V2, sep = ",")) %>% select(Game, coin)
my_data <- my_data %>% select(Game, winner) %>% distinct() %>% left_join(ff)
Then your desired output could be obtained by:
my_data %>% group_by(coin, winner) %>% summarise(n = n()) %>% mutate(p = 100 * n / sum(n, na.rm = T))
# A tibble: 5 x 4
# Groups: coin [3]
coin winner n p
<chr> <chr> <int> <dbl>
1 1,1 lose 4 100
2 1,2 lose 2 14.3
3 1,2 tie 4 28.6
4 1,2 win 8 57.1
5 2,2 lose 2 100
CodePudding user response:
Here is my approach. I am not sure if this is the intended way:
library(dplyr)
my_data %>%
group_by(Game) %>%
mutate(combinations = toString(coin)) %>%
distinct(combinations, .keep_all = TRUE) %>%
ungroup() %>%
group_by(combinations, winner) %>%
summarise(n = n()) %>%
mutate(freq = n/sum(n))
combinations winner n freq
<chr> <chr> <int> <dbl>
1 1, 1 lose 4 1
2 1, 2 tie 2 0.333
3 1, 2 win 4 0.667
4 2, 1 lose 2 0.25
5 2, 1 tie 2 0.25
6 2, 1 win 4 0.5
7 2, 2 lose 2 1