Split column values into separate columns in R

Time:12-13

I would like to split a data frame column into multiple columns, one per unique value, without knowing the values in advance. For example:

example data:

df <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7),
                  var1 = c('a', 'a', 'b', 'b', 'b','c','d'))

> df
  ID  var1
  1   'a'
  2   'a'
  3   'b'
  4   'b'
  5   'b'
  6   'c'
  7   'd'

I would like to return:

df
  ID   a    b    c    d
  1   'a'   NA   NA   NA
  2   'a'   NA   NA   NA
  3    NA  'b'   NA   NA
  4    NA  'b'   NA   NA
  5    NA  'b'   NA   NA
  6    NA   NA  'c'   NA
  7    NA   NA   NA  'd'

This should probably be really simple, but for the life of me I cannot figure it out. My searches on here turned up similar questions, but not quite the answer I needed.

Thanks!

CodePudding user response:

You can use sapply over the unique values of df$var1, keeping each value where it matches and NA elsewhere:

cbind.data.frame(df[1], 
                 sapply(unique(df$var1), function(x) ifelse(x == df$var1, x, NA)))

output

     ID a     b     c     d    
1     1 a     NA    NA    NA   
2     2 a     NA    NA    NA   
3     3 NA    b     NA    NA   
4     4 NA    b     NA    NA   
5     5 NA    b     NA    NA   
6     6 NA    NA    c     NA   
7     7 NA    NA    NA    d   
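
If you need this for more than one data frame, the same idea can be wrapped in a small function. This is only a sketch: spread_values is a made-up name (not a base R function and not part of the answer above), and it assumes the column can be treated as character. It uses lapply instead of sapply so the result is always a list of columns, even when the data has a single row.

```r
# Hypothetical helper: one new column per unique value of `value_col`,
# holding the value where it matches and NA everywhere else.
spread_values <- function(data, id_col, value_col) {
  values <- as.character(data[[value_col]])
  found <- unique(values[!is.na(values)])  # skip NA so it gets no column
  cols <- lapply(found, function(x) ifelse(values == x, x, NA_character_))
  names(cols) <- found
  # check.names = FALSE keeps the values as column names unchanged
  cbind(data[id_col],
        data.frame(cols, check.names = FALSE, stringsAsFactors = FALSE))
}

# Same example data as in the question
df <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7),
                 var1 = c('a', 'a', 'b', 'b', 'b', 'c', 'd'))
res <- spread_values(df, "ID", "var1")
res
```

The result keeps one row per input row, with columns ID, a, b, c and d.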

CodePudding user response:

Another option is pivot_wider from tidyr. Pass your var1 column to both names_from (which supplies the new column names) and values_from (which supplies the cell values), like this:

library(tidyr)
df %>%
  pivot_wider(names_from = var1, values_from = var1)
#> # A tibble: 7 × 5
#>      ID a     b     c     d    
#>   <dbl> <chr> <chr> <chr> <chr>
#> 1     1 a     <NA>  <NA>  <NA> 
#> 2     2 a     <NA>  <NA>  <NA> 
#> 3     3 <NA>  b     <NA>  <NA> 
#> 4     4 <NA>  b     <NA>  <NA> 
#> 5     5 <NA>  b     <NA>  <NA> 
#> 6     6 <NA>  <NA>  c     <NA> 
#> 7     7 <NA>  <NA>  <NA>  d

Created on 2022-12-12 with reprex v2.0.2
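
One thing to watch with this approach: pivot_wider treats the remaining columns (here ID) as row identifiers, so rows that share an ID are merged into one. The example data has unique IDs, so it does not matter there. The sketch below uses made-up data (df_dup and the row column are assumptions, not from the question) to show the difference and one possible workaround.

```r
library(tidyr)

# Hypothetical data: ID 1 appears twice with different values
df_dup <- data.frame(ID = c(1, 1, 2), var1 = c("a", "b", "a"))

# Rows sharing an ID are merged, so this has 2 rows rather than 3
wide <- pivot_wider(df_dup, names_from = var1, values_from = var1)
nrow(wide)

# Adding a per-row identifier keeps one output row per input row
df_dup$row <- seq_len(nrow(df_dup))
wide_rows <- pivot_wider(df_dup, names_from = var1, values_from = var1)
nrow(wide_rows)
```

The sapply answer does not have this behaviour, since it always returns one row per input row.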
