library(data.table)
dat1 <- data.table(id = c(1, 2, 34, 99),
class = c("sports", "", "music, sports", ""),
hobby = c("knitting, music, sports", "", "", "music"))
> dat1
id class hobby
1 1 sports knitting, music, sports
2 2
3 34 music, sports
4 99 music
I have the above dataset, dat1
, where each row corresponds to a unique id
. For each id
, multiple inputs for either class
or hobby
are separated by a comma.
I would like to exchange the row and column of this dataset so that I get the following:
input class hobby
1 sports 1, 34 1
2 knitting 1
3 music 34 1, 99
In this dataset, each row corresponds to a unique input
from dat1
. Now the class
and hobby
columns are storing the corresponding id
s from dat1
, each separated by a comma.
Is there a quick way to swap the row and columns like this in R?
CodePudding user response:
Here is an (not that long) tidyverse
option. The issue is that values for the future 'input` column are scattered into multiple columns and can contain several value into the same string.
First a pivot_longer
to have all the future input
into the same column:
dat2 = dat1 %>% pivot_longer(!id)
dat2
# A tibble: 8 x 3
id name value
<dbl> <chr> <chr>
1 1 class "sports"
2 1 hobby "knitting, music, sports"
3 2 class ""
4 2 hobby ""
5 34 class "music, sports"
6 34 hobby ""
7 99 class ""
8 99 hobby "music"
Then the approach is to use the stringr
package to split all the potential input
values to then pivoting again to have all combination. Then filtering for the row having an input
(nchar(input)>0
).
Finally grouping by the input
and name
to paste the multiple id
together id = paste0(id, collapse = ",")
. And a last pivot_wider
to have the data frame as desired:
cbind(dat2, str_split(str_remove_all(dat2$value, ","), " ", simplify = T)) %>%
select(!value) %>%
pivot_longer(!c("id", "name"), names_to = "name_2", values_to = "input") %>%
select(!name_2) %>%
filter(nchar(input)>0) %>%
group_by(input, name) %>%
summarise(id = paste0(id, collapse = ",")) %>%
pivot_wider(names_from = name, values_from = id)
# A tibble: 3 x 3
# Groups: input [3]
input hobby class
<chr> <chr> <chr>
1 knitting 1 NA
2 music 1,99 34
3 sports 1 1,34
CodePudding user response:
Alternative way, using dplyr, listing the values in the rows by calling strsplit
then unnesting it, pivoting longer, back into pivot wider, to reverse the values.
library(dplyr)
dat1 |>
rowwise() |>
mutate(across(-id, ~ ifelse(nchar(.x), strsplit(.x, ", "), list("")))) |>
unnest(cols = -id) |>
pivot_longer(cols = class:hobby) |>
filter(value != "") |>
pivot_wider(id_cols = value, names_from = name, values_from = id,
values_fn = \(x) paste(unique(x), collapse = " ,"))
value class hobby
<chr> <chr> <chr>
1 sports 1 ,34 1
2 knitting NA 1
3 music 34 1 ,99