I have a question about tidyr::pivot_longer().
Lets say I start with an imaginary dataframe. Imagine people were asked two Questions about two Genres. SQ1_2 would be the second question about the first genre.
set.seed(1234)
genres <- c("Crime", "Horror", "Love", "Sci-Fi", NA)
wide <- data.frame(
ID = 1:10,
Genre_1 = sample(genres, 10, replace = TRUE),
Genre_2 = sample(genres, 10, replace = TRUE),
SQ1_1 = sample(1:5, 10, replace = TRUE),
SQ1_2 = sample(1:5, 10, replace = TRUE),
SQ2_1 = sample(1:5, 10, replace = TRUE),
SQ2_2 = sample(1:5, 10, replace = TRUE)
)
ID | Genre_1 | Genre_2 | SQ1_1 | SQ1_2 | SQ2_1 | SQ2_2 |
---|---|---|---|---|---|---|
1 | Sci-Fi | Sci-Fi | 3 | 5 | 2 | 2 |
2 | Horror | Sci-Fi | 4 | 1 | 3 | 5 |
I strive to achieve an end result like this:
ID | time | Genre | SQ1 | SQ2 |
---|---|---|---|---|
1 | Genre_1 | Sci-Fi | 3 | 5 |
1 | Genre_2 | Sci-Fi | 2 | 2 |
2 | Genre_1 | Horror | 4 | 1 |
2 | Genre_2 | Sci-Fi | 3 | 5 |
I used to do this task the following way:
long1 <- wide |> tidyr::pivot_longer(col = starts_with("Genre"),
names_to = "time",
values_to = "genre")
long2 <- wide |> tidyr::pivot_longer(col = c(SQ1_1, SQ2_1),
names_to = "time",
values_to = "SQ1")
long3 <- wide |> tidyr::pivot_longer(col = c(SQ1_2, SQ2_2),
names_to = "time",
values_to = "SQ2")
long <- long1 |>
dplyr::select(ID, time:genre) |>
dplyr::mutate(SQ1 = long2$SQ1,
SQ2 = long3$SQ2)
I was just curious if there is any way to create multiple new columns from different variables with one call of pivot_longer() (or any other function for that matter).
Any help is much appreciated! :)
CodePudding user response:
You need to make the naming rule consistent for those columns you want to pivot into longer format:
- The j-th genre =
Genre_j
- The i-th question about the j-th genre =
SQi_j
Then stack Genre
, SQ1
, SQ2
respectively:
library(dplyr)
library(tidyr)
wide %>%
rename_with(~ sub('(\\d)_(\\d)', '\\2_\\1', .x), starts_with('SQ')) %>%
pivot_longer(-1,
names_to = c(".value", "Time"),
names_sep = '_')
# # A tibble: 20 × 5
# ID Time Genre SQ1 SQ2
# <int> <chr> <chr> <int> <int>
# 1 1 1 Sci-Fi 3 5
# 2 1 2 Sci-Fi 2 2
# 3 2 1 Horror 4 1
# 4 2 2 Sci-Fi 3 5