the data.frame is like:
|col_1|
|------|
|A |
|B |
|B |
|A |
|A |
The expected output should be:
|col_A |col_B |
|---------|-------|
|A |NA |
|NA |B |
|A |NA |
|NA |B |
|A |NA |
|A |NA |
I can do this manually in excel but is there a concise way in R to get this done?
CodePudding user response:
Using tidyverse
you could try the following. Add row numbers to make each row unique, and then pivot_wider
to put data into wide format.
library(tidyverse)
df %>%
mutate(rn = row_number()) %>%
pivot_wider(values_from = col_1, names_from = col_1, names_prefix = "col_") %>%
select(-rn)
Output
col_A col_B
<chr> <chr>
1 A NA
2 NA B
3 NA B
4 A NA
5 A NA
CodePudding user response:
We can use dcast
library(data.table)
dcast(setDT(df1), seq_along(col_1) ~ paste0('col_', col_1),
value.var = 'col_1')[, .(col_A, col_B)]
-output
col_A col_B
<char> <char>
1: A <NA>
2: <NA> B
3: <NA> B
4: A <NA>
5: A <NA>
Or with base R
out <- model.matrix(~ col_1-1, df1)
out[] <- unique(df1$col_1)[(NA^!(out)) * col(out)]
out <- as.data.frame(out)
data
df1 <- structure(list(col_1 = c("A", "B", "B", "A", "A")),
class = "data.frame", row.names = c(NA,
-5L))