In this dataframe I need to pivot_longer the two column groups starting with f, with each group pivoted separately:
df <- structure(list(bigr_1_2 = c("i_PNP 'm_VBB",NA, NA),
bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA),
bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"),
f_bigr_1_2 = c(14010L, NA, NA),
f_bigr_2_3 = c(31831L, 10089L, NA),
f_bigr_3_4 = c(14157L, 31831L, 10089L),
f1 = c(1,2,3),
f2 = c(4,5,6),
f3 = c(7,8,9)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
EDIT: This two-step pivoting does not work:
library(tidyr)
df %>%
pivot_longer(cols = matches("^f_bigr"),
names_to = "bigr",
values_to = "f_bigr") %>%
pivot_longer(cols = matches("^f\\d"),
names_to = "w",
values_to = "f_w")
How can this be done properly, perhaps (but not necessarily) in one step? I've experimented with names_pattern but have not gotten it to work. Any help?
The expected outcome:
bigr_1_2 bigr_2_3 bigr_3_4 w f_w bigr f_bigr
<chr> <chr> <chr> <chr> <dbl> <chr> <int>
1 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB f1 1 f_bigr_1_2 14010
2 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB f2 4 f_bigr_2_3 31831
3 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB f3 7 f_bigr_3_4 14157
4 NA 've_VHB got_VVN it_PNP 's_VBZ f1 2 f_bigr_1_2 NA
5 NA 've_VHB got_VVN it_PNP 's_VBZ f2 5 f_bigr_2_3 10089
6 NA 've_VHB got_VVN it_PNP 's_VBZ f3 8 f_bigr_3_4 31831
7 NA NA 've_VHB got_VVN f1 3 f_bigr_1_2 NA
8 NA NA 've_VHB got_VVN f2 6 f_bigr_2_3 NA
9 NA NA 've_VHB got_VVN f3 9 f_bigr_3_4 10089
CodePudding user response:
This cannot be done with a single pivot_longer call, since the name variables per row differ between the two groups. The closest you can get:
df %>%
pivot_longer(starts_with('f'), names_to = c('.value', 'grp'),
names_pattern = '([^_]+)_?(\\d+)')
# A tibble: 9 x 6
bigr_1_2 bigr_2_3 bigr_3_4 grp bigr f
<chr> <chr> <chr> <chr> <int> <dbl>
1 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB 1 14010 1
2 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB 2 31831 4
3 i_PNP 'm_VBB it_PNP 's_VBZ you_PNP know_VVB 3 14157 7
4 NA 've_VHB got_VVN it_PNP 's_VBZ 1 NA 2
5 NA 've_VHB got_VVN it_PNP 's_VBZ 2 10089 5
6 NA 've_VHB got_VVN it_PNP 's_VBZ 3 31831 8
7 NA NA 've_VHB got_VVN 1 NA 3
8 NA NA 've_VHB got_VVN 2 NA 6
9 NA NA 've_VHB got_VVN 3 10089 9
This captures the 1, 2, 3 part of each name and uses it as the group variable.
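To see what the pattern picks up, the two capture groups can be inspected in base R. This is only an illustrative sketch: it uses regexec() with perl = TRUE rather than whatever tidyr does internally, so treat it as an approximation of how names_pattern splits the names. The object names nms, m and parts are made up for the example.

```r
# Column names selected by starts_with('f') in the data above
nms <- c("f_bigr_1_2", "f_bigr_2_3", "f_bigr_3_4", "f1", "f2", "f3")

# Find the first match of the pattern in each name, including capture groups
m <- regmatches(nms, regexec("([^_]+)_?(\\d+)", nms, perl = TRUE))

# One row per name: full match, group 1 (.value), group 2 (grp)
parts <- do.call(rbind, m)
colnames(parts) <- c("match", ".value", "grp")
parts
```

Since the leading f_ is not followed by a digit, the match for the f_bigr_* names only starts at bigr, which is why the value columns end up being called bigr and f.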
There might be other packages that tackle your problem. Note that your data frame has no common grouping variable, hence it cannot be pivoted to long format in a single step.
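One way to reach the expected outcome from the question is to pivot each group on its own and put the results side by side. The sketch below is not part of the answer above. It assumes dplyr is available, and it relies on both pivots returning rows in the same order (three rows per original row), which holds here because each group has exactly three columns. The names w_long, bigr_long and result are illustrative.

```r
library(tidyr)
library(dplyr)

df <- tibble(
  bigr_1_2 = c("i_PNP 'm_VBB", NA, NA),
  bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA),
  bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"),
  f_bigr_1_2 = c(14010L, NA, NA),
  f_bigr_2_3 = c(31831L, 10089L, NA),
  f_bigr_3_4 = c(14157L, 31831L, 10089L),
  f1 = c(1, 2, 3),
  f2 = c(4, 5, 6),
  f3 = c(7, 8, 9)
)

# Pivot the f1..f3 group; drop the still-wide f_bigr_* columns afterwards
w_long <- df %>%
  pivot_longer(matches("^f\\d"), names_to = "w", values_to = "f_w") %>%
  select(-starts_with("f_bigr"))

# Pivot the f_bigr_* group; keep only the two new columns
bigr_long <- df %>%
  pivot_longer(matches("^f_bigr"), names_to = "bigr", values_to = "f_bigr") %>%
  select(bigr, f_bigr)

# Both pieces have 9 rows in the same order, so they can be bound column-wise
result <- bind_cols(w_long, bigr_long)
result
```

If the two groups ever have different numbers of columns, the row alignment no longer holds, and an explicit join on a row id plus group index would be the safer route.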
CodePudding user response:
Regexes can be combined, e.g. with (foo|bar):
df <- structure(list(bigr_1_2 = c("i_PNP 'm_VBB",NA, NA),
bigr_2_3 = c("it_PNP 's_VBZ", "'ve_VHB got_VVN", NA),
bigr_3_4 = c("you_PNP know_VVB", "it_PNP 's_VBZ", "'ve_VHB got_VVN"),
f_bigr_1_2 = c(14010L, NA, NA),
f_bigr_2_3 = c(31831L, 10089L, NA),
f_bigr_3_4 = c(14157L, 31831L, 10089L),
f1 = c(1,2,3),
f2 = c(4,5,6),
f3 = c(7,8,9)),
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L))
df %>% pivot_longer(cols = matches("^f(\\d|_bigr)"), names_to = "w")
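A quick way to check which columns a combined pattern like this selects is to test it against the column names directly. The sketch below uses base R's grepl() with perl = TRUE as a stand-in. It is assumed, not guaranteed, that matches() picks the same names for this pattern. The names cols and selected are made up for the example.

```r
# Column names of the data frame above
cols <- c("bigr_1_2", "bigr_2_3", "bigr_3_4",
          "f_bigr_1_2", "f_bigr_2_3", "f_bigr_3_4",
          "f1", "f2", "f3")

# "^f" anchors at the start; the alternation accepts either a digit or "_bigr"
selected <- cols[grepl("^f(\\d|_bigr)", cols, perl = TRUE)]
selected
```

The bigr_* text columns start with b, so the anchored pattern leaves them out and only the six frequency columns are pivoted.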