I have two datasets. One (C.1) contains winner and loser data. The other one (C.1sex) contains data on the sex of each individual. I want to combine these datasets together so that I would have a winnersex column and a losersex column. I want to convert the winner and loser data in the columns from their name to their sex. How can I combine these datasets with different numbers of rows, and how can I create these new columns in r?
C.1
Winner Loser
1 George Paul
2 George Paul
3 George Paul
4 Horst Paul
5 Tom Louise
6 George Horst
7 George Louise
8 Tom Nobel
9 George Adele
10 Rufus Paul
There are 591 rows
Individual Sex
1 Adele f
2 George m
3 Horst m
4 Laggie m
5 Louise f
6 Max m
7 Nobel f
8 Paul m
9 Rufus m
10 Tom m
There only 10 rows
CodePudding user response:
You could use two left_join()
's and bind the columns via cbind()
together, e.g.
library(dplyr)
cbind(
df1 %>% left_join(df2, by = c("Winner" = "Individual")) %>% rename(Winner_sex = Sex),
df1 %>% left_join(df2, by = c("Loser" = "Individual")) %>% select(Loser_sex = Sex)
)
Output:
Winner Loser Winner_sex Loser_sex
1 George Paul m m
2 George Paul m m
3 George Paul m m
4 Horst Paul m m
5 Tom Louise m f
6 George Horst m m
7 George Louise m f
8 Tom Nobel m f
9 George Adele m f
10 Rufus Paul m m
Data:
df1 <- read.table(text = " Winner Loser
1 George Paul
2 George Paul
3 George Paul
4 Horst Paul
5 Tom Louise
6 George Horst
7 George Louise
8 Tom Nobel
9 George Adele
10 Rufus Paul", h =T)
df2 <- read.table(text = "Individual Sex
1 Adele f
2 George m
3 Horst m
4 Laggie m
5 Louise f
6 Max m
7 Nobel f
8 Paul m
9 Rufus m
10 Tom m", h = T)
CodePudding user response:
Some updates to your data to make it reproducible.
df_games <- data.frame(
winner = c("G", "G", "G", "H", "T", "G", "G", "T", "G", "R"),
loser = c("P", "P", "P", "P", "L", "H", "L", "N", "A", "P")
)
df_sex <- data.frame(
individual = c("A", "G", "H", "L", "N", "P", "R", "T"),
sex = c("f", "m", "m", "f", "f", "m", "m", "m")
)
And the code. You'll want to pivot your game results into a long format so you can more easily join it with the table of sexes. After the tables are joined, you can pivot it back into a wide format.
library(tidyr)
library(dplyr)
df_results <- df_games |>
mutate(game = row_number()) |> # keep track of the games
pivot_longer(
cols = c(winner, loser), names_to = "result", values_to = "individual"
) |>
left_join(df_sex, by = c("individual"))
df_games_sex <- df_results |>
select(-individual) |> # remove this column so it doesn't uniquely ID each row
pivot_wider(names_from = result, values_from = sex)