I have a dataframe with one column and n rows, like this one:
df <- data.frame(x = rep(c("c","a","c","b","c","d"), times = c(1,4,1,4,1,4)))
Now I want to split this column so that a new column starts at every c. The aim is to transform the one-column data frame into this form:
c | c | c |
---|---|---|
a | b | d |
a | b | d |
a | b | d |
a | b | d |
CodePudding user response:
With tidyverse, we could create a new group every time c appears in the x column, then pivot the data wide. Duplicate column names are generally discouraged, so I created sequential column names (c_1, c_2, c_3) instead.
library(tidyverse)
results <- df %>%
  group_by(idx = cumsum(x == "c")) %>%
  filter(x != "c") %>%
  mutate(rn = row_number()) %>%
  pivot_wider(names_from = idx, values_from = x, names_prefix = "c_") %>%
  select(-rn)
Output
c_1 c_2 c_3
<chr> <chr> <chr>
1 a b d
2 a b d
3 a b d
4 a b d
However, if you really want duplicate names, we could add set_names:
purrr::set_names(results, "c")
c c c
<chr> <chr> <chr>
1 a b d
2 a b d
3 a b d
4 a b d
Or in base R, we could create the grouping with cumsum, split on those groups, and bind the pieces back together with cbind. Then we remove the first row, which contains the c values.
names(df) <- "c"
do.call(cbind, split(df, cumsum(df$c == "c")))[-1,]
# c c c
#2 a b d
#3 a b d
#4 a b d
#5 a b d
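For reference, here is a self-contained sketch of the same base R idea that can be pasted into a fresh session. It rebuilds the sample data from the question (the column name x is an assumption, taken from how the answers refer to it) and assigns unique column names at the end instead of keeping duplicates.

```r
# Sample data from the question; the column name `x` is assumed
df <- data.frame(x = rep(c("c", "a", "c", "b", "c", "d"),
                         times = c(1, 4, 1, 4, 1, 4)))

# Group id that increases by one at every "c"
grp <- cumsum(df$x == "c")

# One single-column data frame per group, bound side by side
wide <- do.call(cbind, split(df, grp))

# Drop the first row (the "c" markers) and assign unique names
wide <- wide[-1, ]
names(wide) <- paste0("c_", seq_along(wide))
wide
```

Note that this relies on every group having the same number of rows, as in the example.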
CodePudding user response:
If your groups all have the same number of values, as in the example given, you can use unstack:
unstack(df, x ~ cumsum(x=="c"))
X1 X2 X3
1 c c c
2 a b d
3 a b d
4 a b d
5 a b d
You can then remove the first row, for example by indexing the result with [-1, ].
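If the groups do not all have the same length, unstack returns a list rather than a data frame. In that case, one possible workaround is to pad the shorter groups with NA before binding them. The following is only a sketch under that assumption; the input df2 and the c_ name prefix are made up for illustration and are not part of the question.

```r
# Hypothetical input with groups of unequal length (not from the question)
df2 <- data.frame(x = c("c", "a", "a", "c", "b", "c", "d", "d", "d"))

# Start a new group at every "c", then leave out the "c" markers themselves
grp  <- cumsum(df2$x == "c")
keep <- df2$x != "c"
cols <- split(df2$x[keep], grp[keep])

# Extend every group to the length of the longest one; the gaps become NA
n <- max(lengths(cols))
padded <- lapply(cols, function(v) {
  length(v) <- n
  v
})

# Bind the padded groups as columns and give them unique names
out <- as.data.frame(padded, stringsAsFactors = FALSE)
names(out) <- paste0("c_", names(cols))
out
```

Because the c markers are filtered out before splitting, there is no first row to remove afterwards.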