I have csv file containing 3 columns: name, position and plays (plays has value 1 if player plays at that position, otherwise has value 0). One player can play more than one position. There are no two players with the same name. Sample data:
Brian,goalkeeper,0
Brian,fullback,1
Brian,playmaker,1
Brian,centerback,0
Brian,striker,0
Teddy,goalkeeper,0
Teddy,fullback,0
Teddy,playmaker,1
Teddy,centerback,1
Teddy,striker,1
Alan,goalkeeper,1
Alan,fullback,0
Alan,playmaker,0
Alan,centerback,1
Alan,striker,0
I want to transform this data to data frame, to have names in rows, positions as column names, and values 0/1 as data in cells.
players <-read.csv("players.csv", sep=",", stringsAsFactors = FALSE)
command
table1 <- table(players$name,players$position,players$plays)
gives me two tables
centerback fullback goalkeeper playmaker striker
Alan 0 1 0 1 1
Brian 1 0 1 0 1
Teddy 0 1 1 0 0
and
centerback fullback goalkeeper playmaker striker
Alan 1 0 1 0 0
Brian 0 1 0 1 0
Teddy 1 0 0 1 1
(Why is table function doing this?, second one is correct for me). How is it possible to transform table1 to data frame, to have player names in rows, positions as column names, and values 0/1 as data in cells?
CodePudding user response:
table is giving you an 3-D array because you have 3-variables to take into consideration, You need to drop one of them, ie only subset to get where plays == 1
then do a table on that.
as.data.frame.matrix(table(subset(players, plays == 1, -plays)))
centerback fullback goalkeeper playmaker striker
Alan 1 0 1 0 0
Brian 0 1 0 1 0
Teddy 1 0 0 1 1
Or you could also use xtabs
as below:
as.data.frame.matrix(xtabs(plays~name position, players))
centerback fullback goalkeeper playmaker striker
Alan 1 0 1 0 0
Brian 0 1 0 1 0
Teddy 1 0 0 1 1
If you need the names as a column, use reshape:
reshape(df, timevar = 'position', idvar = 'name', direction = 'wide', sep='_')
name plays_goalkeeper plays_fullback plays_playmaker plays_centerback plays_striker
1 Brian 0 1 1 0 0
6 Teddy 0 0 1 1 1
11 Alan 1 0 0 1 0
CodePudding user response:
I would use pivot_wider
from tidyr
for this.
library(tidyr)
players %>%
pivot_wider(names_from = position, values_from = plays)
# A tibble: 3 x 6
name goalkeeper fullback playmaker centerback striker
<chr> <int> <int> <int> <int> <int>
1 Brian 0 1 1 0 0
2 Teddy 0 0 1 1 1
3 Alan 1 0 0 1 0