I'm reshaping my data and am basically doing two things:
- All x cases with the same value should be nested/summarized into a list column.
- For the remaining x values I have a character value in y separated by one or more commas. I want to split these character values into separate values and put them into the existing y list column as a character vector. E.g. for x == "second" I have the character vector "C, D" (so just one value), and I want to create the character vector c("C", "D"), i.e. with length 2.
The code below seems to do what I want, but I'm getting a warning message. Although it is only a warning message, I want to make sure I'm doing the right thing.
library(tidyverse)
df <- data.frame(x = c("first", "first", "second", "third"),
                 y = c("A", "B", "C, D", "E, F, G"))
df
       x       y
1  first       A
2  first       B
3 second    C, D
4  third E, F, G
df |>
  group_by(x) |>
  summarise(y = list(y)) |>
  rowwise() |>
  mutate(y = list(as.vector(y, mode = "character"))) |>
  ungroup() |>
  mutate(across(y, ~if_else(!str_detect(x, "first"), str_split(., ", "), y)))
which (correctly) gives:
# A tibble: 3 x 2
  x      y
  <chr>  <list>
1 first  <chr [2]>
2 second <chr [2]>
3 third  <chr [2]>
But with a warning:
Warning message:
Problem with `mutate()` input `..1`.
i `..1 = across(...)`.
i argument is not an atomic vector; coercing
What can/should I do?
CodePudding user response:
You can get this result more easily with tidyr::separate_rows:
df |>
  separate_rows(y) |>
  group_by(x) |>
  summarise(y = list(y))
#   x      y
#   <chr>  <list>
# 1 first  <chr [2]>
# 2 second <chr [2]>
# 3 third  <chr [3]>
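As for the warning itself: it most likely appears because str_split() in the question is called on y after y has already become a list column, so the list has to be coerced to character first. A minimal sketch of an alternative, assuming the same df as in the question, is to split the strings while y is still a plain character column and only then nest the pieces per group. The name res is just a placeholder for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(x = c("first", "first", "second", "third"),
                 y = c("A", "B", "C, D", "E, F, G"))

res <- df |>
  group_by(x) |>
  # str_split() gets a character vector here, so nothing needs coercing;
  # unlist() flattens the pieces of each group into one character vector
  summarise(y = list(unlist(str_split(y, ",\\s*"))))

lengths(res$y)
# [1] 2 2 3
```

This gives the same list column as the separate_rows() version without lengthening the data first. If you are on tidyr 1.3.0 or later, separate_longer_delim(y, delim = ", ") is the newer replacement for separate_rows() and should work the same way here.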