Pivot_longer column groups

Time:04-28

In this dataframe I need to pivot_longer the two column groups starting with f, with each group pivoted separately:

df <- structure(list(bigr_1_2 = c("i_PNP 'm_VBB",NA, NA), 
                     bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA), 
                     bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"), 
                     f_bigr_1_2 = c(14010L, NA, NA), 
                     f_bigr_2_3 = c(31831L, 10089L, NA), 
                     f_bigr_3_4 = c(14157L, 31831L, 10089L),
                     f1 = c(1,2,3),
                     f2 = c(4,5,6),
                     f3 = c(7,8,9)), 
                class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))

EDIT: This two-step pivoting does not work:

library(tidyr)
df %>%
  pivot_longer(cols = matches("^f_bigr"), 
                      names_to = "bigr", 
                      values_to = "f_bigr") %>%
  pivot_longer(cols = matches("^f\\d"), 
                      names_to = "w", 
                      values_to = "f_w")
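
A quick way to see what goes wrong is to count rows after each step. The sketch below uses a reduced copy of the data (the name df_f is made up for illustration, and only the columns being pivoted are kept). The second pivot expands every row produced by the first one, so the two groups end up crossed rather than paired:

```r
library(tidyr)

# Reduced, hypothetical copy of the question's data: pivoted columns only.
df_f <- tibble::tibble(
  f_bigr_1_2 = c(14010L, NA, NA),
  f_bigr_2_3 = c(31831L, 10089L, NA),
  f_bigr_3_4 = c(14157L, 31831L, 10089L),
  f1 = c(1, 2, 3),
  f2 = c(4, 5, 6),
  f3 = c(7, 8, 9)
)

# First pivot: one row per original row and f_bigr column.
step1 <- pivot_longer(df_f, cols = matches("^f_bigr"),
                      names_to = "bigr", values_to = "f_bigr")

# Second pivot: every row of step1 is expanded again, once per f column.
step2 <- pivot_longer(step1, cols = matches("^f\\d"),
                      names_to = "w", values_to = "f_w")

nrow(step1)  # 3 rows x 3 f_bigr columns = 9
nrow(step2)  # 9 rows x 3 f columns = 27, not the 9 rows wanted
```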

How can this be done properly, perhaps (but not necessarily) in one step? I've experimented with names_pattern but have not got it to work. Any help?

The expected outcome:

bigr_1_2     bigr_2_3        bigr_3_4           w      f_w     bigr   f_bigr
  <chr>        <chr>           <chr>            <chr>  <dbl>   <chr>   <int>
1 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB   f1   1   f_bigr_1_2  14010
2 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB   f2   4   f_bigr_2_3  31831
3 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB   f3   7   f_bigr_3_4  14157
4 NA           've_VHB got_VVN it_PNP 's_VBZ      f1   2   f_bigr_1_2     NA
5 NA           've_VHB got_VVN it_PNP 's_VBZ      f2   5   f_bigr_2_3  10089
6 NA           've_VHB got_VVN it_PNP 's_VBZ      f3   8   f_bigr_3_4  31831
7 NA           NA              've_VHB got_VVN    f1   3   f_bigr_1_2     NA
8 NA           NA              've_VHB got_VVN    f2   6   f_bigr_2_3     NA
9 NA           NA              've_VHB got_VVN    f3   9   f_bigr_3_4  10089

CodePudding user response:

This cannot be done with a single pivot_longer call, since each output row would need two different name values (one for w and one for bigr). The closest you can get:

df %>%
  pivot_longer(starts_with('f'), names_to = c('.value', 'grp'), 
               names_pattern = '([^_]+)_?(\\d+)')

# A tibble: 9 x 6
  bigr_1_2     bigr_2_3        bigr_3_4         grp    bigr     f
  <chr>        <chr>           <chr>            <chr> <int> <dbl>
1 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB 1     14010     1
2 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB 2     31831     4
3 i_PNP 'm_VBB it_PNP 's_VBZ   you_PNP know_VVB 3     14157     7
4 NA           've_VHB got_VVN it_PNP 's_VBZ    1        NA     2
5 NA           've_VHB got_VVN it_PNP 's_VBZ    2     10089     5
6 NA           've_VHB got_VVN it_PNP 's_VBZ    3     31831     8
7 NA           NA              've_VHB got_VVN  1        NA     3
8 NA           NA              've_VHB got_VVN  2        NA     6
9 NA           NA              've_VHB got_VVN  3     10089     9

This captures the first number in each column name (1, 2, 3) and puts it in the group variable grp.

There might be other packages that can tackle your problem. Note that your data frame has no common grouping variable, so it cannot be transformed into long format in one step.
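
If the original column names need to be kept, as in the expected outcome of the question, one possible workaround is to pivot each group on its own and then bind the two results side by side. This is only a sketch, not necessarily the best way: it relies on pivot_longer returning rows in original-row order in both calls, and the names long_w, long_bigr and result are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- tibble(
  bigr_1_2 = c("i_PNP 'm_VBB", NA, NA),
  bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA),
  bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"),
  f_bigr_1_2 = c(14010L, NA, NA),
  f_bigr_2_3 = c(31831L, 10089L, NA),
  f_bigr_3_4 = c(14157L, 31831L, 10089L),
  f1 = c(1, 2, 3),
  f2 = c(4, 5, 6),
  f3 = c(7, 8, 9)
)

# Pivot f1..f3 and drop the wide f_bigr_* columns.
long_w <- df %>%
  pivot_longer(matches("^f\\d"), names_to = "w", values_to = "f_w") %>%
  select(-matches("^f_bigr"))

# Pivot f_bigr_* and keep only the two new columns.
long_bigr <- df %>%
  pivot_longer(matches("^f_bigr"), names_to = "bigr", values_to = "f_bigr") %>%
  select(bigr, f_bigr)

# Both pieces have 9 rows in the same order, so they can be bound by position.
result <- bind_cols(long_w, long_bigr)
result
```

Because the binding is positional, this only holds as long as both groups have the same number of columns and are listed in matching order.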

CodePudding user response:

Regexes can be combined e.g. with (foo|bar):

df <- structure(list(bigr_1_2 = c("i_PNP 'm_VBB",NA, NA), 
                     bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA), 
                     bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"), 
                     f_bigr_1_2 = c(14010L, NA, NA), 
                     f_bigr_2_3 = c(31831L, 10089L, NA), 
                     f_bigr_3_4 = c(14157L, 31831L, 10089L),
                     f1 = c(1,2,3),
                     f2 = c(4,5,6),
                     f3 = c(7,8,9)), 
                class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))

df %>% pivot_longer(cols = matches("^f(\\d|_bigr)"), names_to = "w") 
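
Note that this stacks all six columns into a single names column and a single value column, so the two groups are no longer separate. As a rough sketch (using a reduced copy of the data called df_small and a helper column called group, both made up for illustration), the groups can be told apart afterwards from the column name:

```r
library(dplyr)
library(tidyr)

# Reduced, hypothetical copy of the data: one id column plus the f columns.
df_small <- tibble(
  bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"),
  f_bigr_1_2 = c(14010L, NA, NA),
  f_bigr_2_3 = c(31831L, 10089L, NA),
  f_bigr_3_4 = c(14157L, 31831L, 10089L),
  f1 = c(1, 2, 3),
  f2 = c(4, 5, 6),
  f3 = c(7, 8, 9)
)

stacked <- df_small %>%
  pivot_longer(cols = matches("^f(\\d|_bigr)"), names_to = "w") %>%
  # Label each row by the group its source column belongs to.
  mutate(group = if_else(grepl("^f_bigr", w), "bigr", "w"))

nrow(stacked)         # 3 rows x 6 pivoted columns = 18
table(stacked$group)  # 9 rows in each group
```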