Home > Back-end >  How can I groupby a dataframe and then transpose and rbind groups in R?
How can I groupby a dataframe and then transpose and rbind groups in R?

Time:01-11

library(dplyr)
sample.df <- data.frame(group.column = c("first_a","first_b","first_c","first_d",
                       "second_a","second_b","second_c","second_d",
                       "third_a","third_b","third_c","third_d"),
           value.column.first = 1:12,
           value.column.second = 13:24)

I created a buffer column and I tried the code below but it was no use.

sample.df %>% mutate(buffer = `group.column` %>% strsplit(split="_") %>% lapply(function(x) {x[1]}) %>% unlist) %>% group_by(buffer) %>% t %>% rbind

I want this table

A header a.1 a.2 b.1 b.2 c.1 c.2 d.1 d.2
first 1 13 2 14 3 15 4 16
second 5 17 6 18 7 19 8 20
third 9 21 10 22 11 23 12 24

What should I do ?

CodePudding user response:

separate your group column, and use pivot_wider:

library(tidyr)
library(dplyr)
sample.df %>% 
  separate(group.column, into = c("A header", "name")) %>% 
  pivot_wider(values_from = c(value.column.first, value.column.second),
              names_glue = "{name}.{match(.value, unique(.value))}",
              names_vary = "slowest")

# A tibble: 3 × 9
  `A header`   a.1   a.2   b.1   b.2   c.1   c.2   d.1   d.2
  <chr>      <int> <int> <int> <int> <int> <int> <int> <int>
1 first          1    13     2    14     3    15     4    16
2 second         5    17     6    18     7    19     8    20
3 third          9    21    10    22    11    23    12    24

Explanation:

  • separate divides the group.column string into two columns, one called A header with the first part (first, second, third), and one called name with a, b, c.... I call it name because that's the default value for the names_from parameter in pivot_wider.
  • in pivot_wider, names are taken from name, and values from the value columns. The names of the column are composed of the name value (a, b, c) and match(.value, unique(.value)) is a way to get the position of each of the 2 .value columns. names_vary = "slowest" is used to get the column ordering correct.
  • Related