Home > OS >  Pivot wider with variable number of rows por ID
Pivot wider with variable number of rows por ID

Time:11-18

Here a very simple problem that is driving me crazy. I have a data frame with duplicate IDs associated with different names. What I want is to create categorial columns "Names_1", "Names_2", "Names_3"... (actually "Names_3" is enough as each ID has no more than 3 names) in order to remove duplicated IDs.

Here what I have :

ID <- c(1,2,3,3,4,4,5,6,6,6)
Names <- c("a","b","c","d","e","f","j","h","i","j")
df <- data.frame(ID,Names)
   ID Names
1   1     a
2   2     b
3   3     c
4   3     d
5   4     e
6   4     f
7   5     j
8   6     h
9   6     i
10  6     j

Here what I want :

ID <- c(1,2,3,4,5,6)
Names_1 <- c("a","b","c","e","j","h")
Names_2 <- c("","","d","f","","i")
Names_3 <- c("","","","","","j")
df <- data.frame(ID,Names_1,Names_2,Names_3)
  ID Names_1 Names_2 Names_3
1  1       a                
2  2       b                
3  3       c       d        
4  4       e       f        
5  5       j                
6  6       h       i       j

Thanks ! Valentin

CodePudding user response:

We can use dplyr's group_by(ID) to create an index, then tidyr's pivot_wider to pivot with the argument names_from = index

library(dplyr)
library(tidyr)

df %>% group_by(ID)%>%
  mutate(index = row_number())%>%
  pivot_wider(names_from=index, values_from = Names, names_prefix = 'Names_')

# A tibble: 6 × 4
# Groups:   ID [6]
     ID Names_1 Names_2 Names_3
  <dbl> <chr>   <chr>   <chr>  
1     1 a       NA      NA     
2     2 b       NA      NA     
3     3 c       d       NA     
4     4 e       f       NA     
5     5 j       NA      NA     
6     6 h       i       j   

If we really want these NAs to be empty characters instead, just add the argument values_fill = '' to the call to pivot_wider, as in:

df %>% group_by(ID)%>%
  mutate(index = row_number())%>%
  pivot_wider(names_from=index, values_from = Names, names_prefix = 'Names_', values_fill = '')

# A tibble: 6 × 4
# Groups:   ID [6]
     ID Names_1 Names_2 Names_3
  <dbl> <chr>   <chr>   <chr>  
1     1 a       ""      ""     
2     2 b       ""      ""     
3     3 c       "d"     ""     
4     4 e       "f"     ""     
5     5 j       ""      ""     
6     6 h       "i"     "j"    
  •  Tags:  
  • r
  • Related