Home > Blockchain >  How can I make columns out of winner and loser data based on data from another dataframe?
How can I make columns out of winner and loser data based on data from another dataframe?

Time:05-20

I have two datasets. One (C.1) contains winner and loser data. The other one (C.1sex) contains data on the sex of each individual. I want to combine these datasets together so that I would have a winnersex column and a losersex column. I want to convert the winner and loser data in the columns from their name to their sex. How can I combine these datasets with different numbers of rows, and how can I create these new columns in r?

C.1

    Winner  Loser
1   George   Paul
2   George   Paul
3   George   Paul
4    Horst   Paul
5      Tom Louise
6   George  Horst
7   George Louise
8      Tom  Nobel
9   George  Adele
10   Rufus   Paul

There are 591 rows

Individual Sex
1       Adele   f
2      George   m
3       Horst   m
4      Laggie   m
5      Louise   f
6         Max   m
7       Nobel   f
8        Paul   m
9       Rufus   m
10        Tom   m

There only 10 rows

CodePudding user response:

You could use two left_join()'s and bind the columns via cbind() together, e.g.

library(dplyr)
cbind(
df1 %>% left_join(df2, by = c("Winner" = "Individual")) %>% rename(Winner_sex = Sex),
df1 %>% left_join(df2, by = c("Loser" = "Individual")) %>% select(Loser_sex = Sex)
)

Output:

  Winner  Loser Winner_sex Loser_sex
1  George   Paul          m         m
2  George   Paul          m         m
3  George   Paul          m         m
4   Horst   Paul          m         m
5     Tom Louise          m         f
6  George  Horst          m         m
7  George Louise          m         f
8     Tom  Nobel          m         f
9  George  Adele          m         f
10  Rufus   Paul          m         m

Data:

df1 <- read.table(text = "    Winner  Loser
1   George   Paul
2   George   Paul
3   George   Paul
4    Horst   Paul
5      Tom Louise
6   George  Horst
7   George Louise
8      Tom  Nobel
9   George  Adele
10   Rufus   Paul", h =T)

df2 <- read.table(text = "Individual Sex
1       Adele   f
2      George   m
3       Horst   m
4      Laggie   m
5      Louise   f
6         Max   m
7       Nobel   f
8        Paul   m
9       Rufus   m
10        Tom   m", h = T)

CodePudding user response:

Some updates to your data to make it reproducible.

df_games <- data.frame(
  winner = c("G", "G", "G", "H", "T", "G", "G", "T", "G", "R"),
  loser = c("P", "P", "P", "P", "L", "H", "L", "N", "A", "P")
)

df_sex <- data.frame(
  individual = c("A", "G", "H", "L", "N", "P", "R", "T"),
  sex = c("f", "m", "m", "f", "f", "m", "m", "m")
)

And the code. You'll want to pivot your game results into a long format so you can more easily join it with the table of sexes. After the tables are joined, you can pivot it back into a wide format.

library(tidyr)
library(dplyr)

df_results <- df_games |> 
  mutate(game = row_number()) |> # keep track of the games
  pivot_longer(
    cols = c(winner, loser), names_to = "result", values_to = "individual"
  ) |> 
  left_join(df_sex, by = c("individual"))

df_games_sex <- df_results |> 
  select(-individual) |> # remove this column so it doesn't uniquely ID each row
  pivot_wider(names_from = result, values_from = sex)
  •  Tags:  
  • r
  • Related