Rowsums in R by unique values in multiple columns

Time:12-15

Suppose I have a table of chess player pairings (just an imaginary example). The table shows who played White, who played Black, and the number of games between the two players.

| White| Black| Games|              
|:---- |:------:| -----:|               
| Anand| Caruana| 13 |         
| Carlsen| Naka| 12 |             
| Caruana| Giri| 14 |          
| Giri| Anand| 10 |           
| Grischuk| Carlsen| 7|    

What I want is the total number of games per player, counting games played as both White and Black, i.e. all the games he played against any other grandmaster.

| Player | Games_total|
|:---- |:------:|
| Anand| 23|
| Caruana| 27|
| Carlsen| 19|
| Naka| 12|
| Giri| 24|
| Grischuk| 7|
   

CodePudding user response:

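Try this solution using aggregate(). stack() puts the White and Black columns into a single column named values, and Games is recycled to match: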

setNames( aggregate( Games ~ values, 
  cbind( stack(df1, c("White","Black")), Games=df1$Games), sum ), 
  c("Players","Games_total") )

   Players Games_total
1    Anand          23
2  Carlsen          19
3  Caruana          27
4     Giri          24
5 Grischuk           7
6     Naka          12
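To make the recycling step explicit, here is a minimal base R sketch of the same idea, split into steps. It is only an illustration, not part of the original answer: the names `long` and `totals` are made up, and `tapply()` is swapped in for `aggregate()`, so the result is a named vector rather than a data frame.

```r
# Same data as in the question, redefined so the snippet runs on its own
df1 <- data.frame(
  White = c("Anand", "Carlsen", "Caruana", "Giri", "Grischuk"),
  Black = c("Caruana", "Naka", "Giri", "Anand", "Carlsen"),
  Games = c(13L, 12L, 14L, 10L, 7L),
  stringsAsFactors = FALSE
)

# stack() puts White and Black into one column called `values`
# and records the source column in `ind`
long <- stack(df1, select = c("White", "Black"))

# Repeat Games explicitly: once for the White rows, once for the Black rows
long$Games <- rep(df1$Games, times = 2)

# Sum the games per player (result is a named vector, sorted by name)
totals <- tapply(long$Games, long$values, sum)
totals
```

Writing `rep(df1$Games, times = 2)` does by hand what `cbind()` does implicitly in the one-liner above, which may be easier to follow if there are more than two player columns.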

Data

df1 <- structure(list(White = c("Anand", "Carlsen", "Caruana", "Giri",
"Grischuk"), Black = c("Caruana", "Naka", "Giri", "Anand", "Carlsen"
), Games = c(13L, 12L, 14L, 10L, 7L)), class = "data.frame", row.names = c(NA,
-5L))

CodePudding user response:

You can reshape the data to long format, so that White and Black end up in a single Player column, and then sum Games for each Player.

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(cols = c(White, Black), values_to = 'Player') %>%
  group_by(Player) %>%
  summarise(Games_total = sum(Games))

# Player   Games_total
#  <chr>          <int>
#1 Anand             23
#2 Carlsen           19
#3 Caruana           27
#4 Giri              24
#5 Grischuk           7
#6 Naka              12
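A quick sanity check on these totals: each game is counted once for the White player and once for the Black player, so Games_total should add up to twice the sum of Games. A minimal base R sketch of that check using `rowsum()` (not part of the original answer; the name `check` is made up for illustration):

```r
# Same data as in the question, redefined so the snippet runs on its own
df <- data.frame(
  White = c("Anand", "Carlsen", "Caruana", "Giri", "Grischuk"),
  Black = c("Caruana", "Naka", "Giri", "Anand", "Carlsen"),
  Games = c(13L, 12L, 14L, 10L, 7L),
  stringsAsFactors = FALSE
)

# rowsum() sums a vector by group; the groups are the White names
# followed by the Black names, with Games repeated to match
check <- rowsum(rep(df$Games, times = 2), group = c(df$White, df$Black))
check

# Every game involves two players, so the totals are twice the games played
sum(check) == 2 * sum(df$Games)
```

`rowsum()` returns a one-column matrix with the player names as row names, so it would need converting if you want a data frame like the one above.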

Data

df <- structure(list(White = c("Anand", "Carlsen", "Caruana", "Giri", 
"Grischuk"), Black = c("Caruana", "Naka", "Giri", "Anand", "Carlsen"
), Games = c(13L, 12L, 14L, 10L, 7L)), row.names = c(NA, -5L), class = "data.frame")