How to pivot_wider the n unique values of variable A grouped_by variable B?

Time:01-24

I am trying to pivot_wider() a column X of a data frame that contains persons' names. Within each group of another variable Y (as in group_by(Y)) there are always exactly 2 of these names. I would like R to take the 2 unique X values within each unique value of Y and put them in 2 new columns, X_1 and X_2.

My data frame looks like this:

df <- data.frame(Student = rep(c(17383, 16487, 17646, 2648, 3785), each = 2),
                 Referee = c("Paul Severe", "Cathy Nice", "Jean Exigeant", "Hilda Ehrlich", "John Rates",
                             "Eva Luates", "Fred Notebien", "Aldous Grading", "Hans Streng", "Anna Filaktic"),
                 Rating = format(round(x = sqrt(sample(15:95, 10, replace = TRUE)), digits = 3), nsmall = 3)
)

df

I would like to turn the Referee column into 2 new columns, Referee_1 and Referee_2, holding the 2 unique referees assigned to each student (with their ratings alongside), and end up with this result:

# TRUE for rows 1, 3, 5, ... (the first referee of each student)
odd_row <- as.logical(seq_len(nrow(df)) %% 2)

# data.frame() replaces the deprecated data_frame()
df_wanted <- data.frame(
  Student = unique(df$Student),
  Referee_1 = df$Referee[odd_row],
  Rating_Ref_1 = df$Rating[odd_row],
  Referee_2 = df$Referee[!odd_row],
  Rating_Ref_2 = df$Rating[!odd_row]
)

df_wanted

I guess I could achieve this by subsetting the unique student/referee combinations and joining them, but is there a way to handle this in one call to pivot_wider()?

CodePudding user response:

pivot_wider() needs a column to take the new column names from, so you should create a row id per group first:

library(dplyr)
library(tidyr)
df %>% 
  group_by(Student) %>% 
  mutate(row_n = row_number()) %>% 
  ungroup() %>% 
  pivot_wider(names_from = "row_n", values_from = c("Referee", "Rating"))

# A tibble: 5 × 5
  Student Referee_1     Referee_2      Rating_1 Rating_2
    <dbl> <chr>         <chr>          <chr>    <chr>   
1   17383 Paul Severe   Cathy Nice     9.165    7.810   
2   16487 Jean Exigeant Hilda Ehrlich  5.196    6.557   
3   17646 John Rates    Eva Luates     7.211    5.568   
4    2648 Fred Notebien Aldous Grading 4.000    8.124   
5    3785 Hans Streng   Anna Filaktic  7.937    6.325   
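
The result above has the right values but not quite the column names and order asked for in the question (Referee_1, Rating_Ref_1, Referee_2, Rating_Ref_2). A minimal sketch of one way to get there, assuming tidyr 1.2.0 or later for the names_vary argument; the set.seed() call and the name wide are additions for illustration and are not part of the original post:

```r
library(dplyr)
library(tidyr)

# Same structure as the question's df. set.seed() is an addition here,
# only so that the randomly drawn ratings are reproducible.
set.seed(1)
df <- data.frame(
  Student = rep(c(17383, 16487, 17646, 2648, 3785), each = 2),
  Referee = c("Paul Severe", "Cathy Nice", "Jean Exigeant", "Hilda Ehrlich",
              "John Rates", "Eva Luates", "Fred Notebien", "Aldous Grading",
              "Hans Streng", "Anna Filaktic"),
  Rating = format(round(sqrt(sample(15:95, 10, replace = TRUE)), 3), nsmall = 3)
)

wide <- df %>%
  group_by(Student) %>%
  mutate(row_n = row_number()) %>%   # 1 or 2 within each student
  ungroup() %>%
  pivot_wider(
    names_from = row_n,
    values_from = c(Referee, Rating),
    names_vary = "slowest"           # keep Referee_k and Rating_k side by side
  ) %>%
  # Rename Rating_1 / Rating_2 to the names used in df_wanted
  rename_with(~ sub("^Rating_", "Rating_Ref_", .x), starts_with("Rating_"))

names(wide)
```

Renaming after the pivot keeps the pivot_wider() call itself simple; a names_glue specification could probably do the same in one step, but it is harder to read when two value columns need different prefixes.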

CodePudding user response:

Using data.table, take the first and second row of each student and merge them:

library(data.table)

setDT(df)

merge(df[, .SD[1], Student], df[, .SD[2], Student], by = "Student", suffixes = c("_1", "_2"))

#    Student     Referee_1 Rating_1      Referee_2 Rating_2
# 1:    2648 Fred Notebien    6.708 Aldous Grading    9.747
# 2:    3785   Hans Streng    6.245  Anna Filaktic    8.775
# 3:   16487 Jean Exigeant    7.681  Hilda Ehrlich    4.359
# 4:   17383   Paul Severe    4.583     Cathy Nice    7.616
# 5:   17646    John Rates    6.708     Eva Luates    8.246
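
The merge above only works for exactly 2 rows per student. As a hedged alternative sketch, data.table's dcast() with rowid() should reshape any number of rows per group in one call. The names dt and wide_dt and the set.seed() call are assumptions added for illustration, not part of the original answer:

```r
library(data.table)

# Same structure as the question's df, built directly as a data.table.
# set.seed() is an addition so the random ratings are reproducible.
set.seed(1)
dt <- data.table(
  Student = rep(c(17383, 16487, 17646, 2648, 3785), each = 2),
  Referee = c("Paul Severe", "Cathy Nice", "Jean Exigeant", "Hilda Ehrlich",
              "John Rates", "Eva Luates", "Fred Notebien", "Aldous Grading",
              "Hans Streng", "Anna Filaktic"),
  Rating = format(round(sqrt(sample(15:95, 10, replace = TRUE)), 3), nsmall = 3)
)

# rowid(Student) numbers the rows within each student (1, 2, ...);
# those numbers become the suffixes of the new columns.
wide_dt <- dcast(dt, Student ~ rowid(Student), value.var = c("Referee", "Rating"))

names(wide_dt)
```

As with merge(), the result comes back sorted by Student rather than in the original row order.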