Home > database >  How to convert rows into columns by group?
How to convert rows into columns by group?

Time:07-21

I'd like to do association analysis using apriori algorithm.

To do that, I have to make a dataset.

What I have data is like this.

data.frame("order_number"=c("100145",  "100155", "100155", "100155", 
"500002", "500002", "500002", "500007"),
"order_item"=c("27684535","15755576", 
"1357954","124776249","12478324","15755576","13577","27684535"))
 order_number order_item

1       100145   27684535

2       100155   15755576

3       100155    1357954

4       100155  124776249

5       500002   12478324

6       500002   15755576

7       500002      13577

8       500007   27684535

and I want to transfer the data like this

data.frame("order_number"=c("100145","100155","500002","500007"), 
"col1"=c("27684535","15755576","12478324","27684535"),
"col2"=c(NA,"1357954","15755576",NA),
"col3"=c(NA,"124776249","13577",NA))
 order_number     col1     col2      col3

1       100145 27684535     <NA>      <NA>

2       100155 15755576  1357954 124776249

3       500002 12478324 15755576     13577

4       500007 27684535     <NA>      <NA>

Thank you for your help.

CodePudding user response:

This would be a case of pivot_wider (or other functions for changing column layout). First step would be creating a row id variable to note whether each is 1, 2 or 3, then shaping this into the dataframe you want:

df <- data.frame("order_number"=c("100145",  "100155", "100155", "100155", 
                            "500002", "500002", "500002", "500007"),
           "order_item"=c("27684535","15755576", 
                          "1357954","124776249","12478324","15755576","13577","27684535"))

library(tidyr)
library(dplyr)


df |>
  group_by(order_number) |> 
  mutate(rank = row_number()) |> 
  pivot_wider(names_from = rank, values_from = order_item,
              names_prefix = "col")
#> # A tibble: 4 × 4
#> # Groups:   order_number [4]
#>   order_number col1     col2     col3     
#>   <chr>        <chr>    <chr>    <chr>    
#> 1 100145       27684535 <NA>     <NA>     
#> 2 100155       15755576 1357954  124776249
#> 3 500002       12478324 15755576 13577    
#> 4 500007       27684535 <NA>     <NA>
  • Related