I would like to split one column of a data frame, whose set of values is not known in advance, into multiple columns, one per unique value. For example:
Example data:
df <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7),
var1 = c('a', 'a', 'b', 'b', 'b','c','d'))
> df
ID var1
1 'a'
2 'a'
3 'b'
4 'b'
5 'b'
6 'c'
7 'd'
I would like to return:
df
ID a b c d
1 'a' NA NA NA
2 'a' NA NA NA
3 NA 'b' NA NA
4 NA 'b' NA NA
5 NA 'b' NA NA
6 NA NA 'c' NA
7 NA NA NA 'd'
This should probably be really simple, but for the life of me I cannot figure it out, and my searches on here turned up similar questions but not quite the answer I needed.
Thanks!
CodePudding user response:
You can use sapply over the unique values of df$var1 (spell out the full column name rather than relying on partial matching of df$var):
cbind.data.frame(df[1],
                 sapply(unique(df$var1), function(x) ifelse(x == df$var1, x, NA)))
output
ID a b c d
1 1 a NA NA NA
2 2 a NA NA NA
3 3 NA b NA NA
4 4 NA b NA NA
5 5 NA b NA NA
6 6 NA NA c NA
7 7 NA NA NA d
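The same idea can be wrapped in a small reusable function. The sketch below is only an illustration: the helper name spread_values and its arguments are hypothetical, not part of the answer above. It assumes the values are (or can be treated as) character strings, and it uses lapply over a named vector so that each new column is named after its value even when the data frame has a single row.

```r
# Hypothetical helper (name and arguments are illustrative assumptions):
# spread one column into one column per unique value, keeping the other columns.
spread_values <- function(data, col) {
  values <- as.character(data[[col]])
  lev <- unique(values[!is.na(values)])
  # One list element per unique value, named after that value.
  # Rows that do not match (or are NA) get NA_character_.
  wide <- lapply(setNames(lev, lev),
                 function(v) ifelse(values == v, v, NA_character_))
  cbind(data[setdiff(names(data), col)],
        data.frame(wide, check.names = FALSE, stringsAsFactors = FALSE))
}

df <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7),
                 var1 = c('a', 'a', 'b', 'b', 'b', 'c', 'd'))
res <- spread_values(df, "var1")
```

Calling spread_values(df, "var1") on the example data should give the same layout as the sapply output: an ID column followed by columns a, b, c and d.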
CodePudding user response:
Another option is pivot_wider from tidyr, using your var1 column for both the new column names (names_from) and the values (values_from), like this:
library(tidyr)
df %>%
pivot_wider(names_from = var1, values_from = var1)
#> # A tibble: 7 × 5
#> ID a b c d
#> <dbl> <chr> <chr> <chr> <chr>
#> 1 1 a <NA> <NA> <NA>
#> 2 2 a <NA> <NA> <NA>
#> 3 3 <NA> b <NA> <NA>
#> 4 4 <NA> b <NA> <NA>
#> 5 5 <NA> b <NA> <NA>
#> 6 6 <NA> <NA> c <NA>
#> 7 7 <NA> <NA> <NA> d
Created on 2022-12-12 with reprex v2.0.2
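One thing to keep in mind with this approach: pivot_wider treats the remaining columns (here ID) as row identifiers, so it returns one row per ID rather than one row per input row. With the example data this makes no difference because every ID is unique. The sketch below uses a made-up data frame, df2, which is not from the question, to show what happens when an ID repeats with different values.

```r
library(tidyr)

# Made-up data (not from the question): ID 1 appears twice with different values
df2 <- data.frame(ID = c(1, 1, 2),
                  var1 = c('a', 'b', 'a'))

# Both rows for ID 1 are merged into a single output row
wide <- pivot_wider(df2, names_from = var1, values_from = var1)
```

Here wide has two rows instead of three: ID 1 gets both a and b filled in, and ID 2 gets a with NA in b. If you need one output row per input row even with repeated IDs, the sapply approach above keeps the original row count.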