Home > Back-end >  How to transform table data to dataframe in R
How to transform table data to dataframe in R

Time:02-11

I have csv file containing 3 columns: name, position and plays (plays has value 1 if player plays at that position, otherwise has value 0). One player can play more than one position. There are no two players with the same name. Sample data:

Brian,goalkeeper,0
Brian,fullback,1
Brian,playmaker,1
Brian,centerback,0
Brian,striker,0
Teddy,goalkeeper,0
Teddy,fullback,0
Teddy,playmaker,1
Teddy,centerback,1
Teddy,striker,1
Alan,goalkeeper,1
Alan,fullback,0
Alan,playmaker,0
Alan,centerback,1
Alan,striker,0

I want to transform this data to data frame, to have names in rows, positions as column names, and values 0/1 as data in cells.

players <-read.csv("players.csv", sep=",", stringsAsFactors = FALSE)

command

table1 <-    table(players$name,players$position,players$plays)

gives me two tables

        centerback fullback goalkeeper playmaker striker
  Alan           0        1          0         1       1
  Brian          1        0          1         0       1
  Teddy          0        1          1         0       0

and 
        centerback fullback goalkeeper playmaker striker
  Alan           1        0          1         0       0
  Brian          0        1          0         1       0
  Teddy          1        0          0         1       1

(Why is table function doing this?, second one is correct for me). How is it possible to transform table1 to data frame, to have player names in rows, positions as column names, and values 0/1 as data in cells?

CodePudding user response:

table is giving you an 3-D array because you have 3-variables to take into consideration, You need to drop one of them, ie only subset to get where plays == 1 then do a table on that.

as.data.frame.matrix(table(subset(players, plays == 1, -plays)))
      centerback fullback goalkeeper playmaker striker
Alan           1        0          1         0       0
Brian          0        1          0         1       0
Teddy          1        0          0         1       1

Or you could also use xtabs as below:

as.data.frame.matrix(xtabs(plays~name   position, players))
      centerback fullback goalkeeper playmaker striker
Alan           1        0          1         0       0
Brian          0        1          0         1       0
Teddy          1        0          0         1       1

If you need the names as a column, use reshape:

reshape(df, timevar = 'position', idvar = 'name', direction = 'wide', sep='_')
    name plays_goalkeeper plays_fullback plays_playmaker plays_centerback plays_striker
1  Brian                0              1               1                0             0
6  Teddy                0              0               1                1             1
11  Alan                1              0               0                1             0

CodePudding user response:

I would use pivot_wider from tidyr for this.

library(tidyr)
players %>%
  pivot_wider(names_from = position, values_from = plays)

# A tibble: 3 x 6
  name  goalkeeper fullback playmaker centerback striker
  <chr>      <int>    <int>     <int>      <int>   <int>
1 Brian          0        1         1          0       0
2 Teddy          0        0         1          1       1
3 Alan           1        0         0          1       0
  •  Tags:  
  • r
  • Related