My data is a more complex but simplified it basically looks like this. Each row has only one Yes in one of the multiple Color:* columns. The "Color" before the colon needs to be the new column name, with the named color "Blue", "Red", or "Green" after the colon pivoted under "Color" to the correct "IDENTITY" I forgot to mention there are variable columns before and after these multiple columns I want to pivot into one column. the columns I want to pivot all have the same name before the colon.
IDENTITY | Color:Blue | Color:Red | Color:Green |
---|---|---|---|
1 | Yes | ||
2 | Yes | ||
3 | Yes |
and what I would like is to pivot to this
IDENTITY | Color |
---|---|
1 | Green |
2 | Blue |
3 | Red |
I am not sure if this is a pivot problem. I have read through the tidyr pivot documentation at
https://tidyr.tidyverse.org/articles/pivot.html
I do not see a similar example or able to identify one of the solutions that might work with my supplied data.
Can anyone help me with a code chunk I can follow to solve what seems on the face of it a simple problem, but eludes my limited proficiency with R. Thank you and Seasons Blessings
CodePudding user response:
We may convert the blanks (""
) to NA
, in character
columns by looping across
the character columns, then reshape to 'long' by selecting the cols
that have column names starts_with
"Color:"
library(tidyr)
library(dplyr)
df1 %>%
mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
pivot_longer(cols = starts_with("Color:"), names_to = c(".value", "Color2"),
names_sep = ":", values_drop=TRUE) %>%
select(IDENTITY, Color = Color2)
-output
# A tibble: 3 × 2
IDENTITY Color
<int> <chr>
1 1 Green
2 2 Blue
3 3 Red
CodePudding user response:
You would need to use the pivot_longer
function. This will gather columns and turn them into rows.
Note: I had to put ``
in front of the column names since :
is one of those pesky reserved characters in R. You want to avoid having that as a column separator ( _
could be a useful substitution). You can't always control this, but it's just something to look out for.
library(tidyverse)
x <- tribble( ~IDENTITY, ~`Color:Blue`, ~`Color:Red`, ~`Color:Green`,
1L, "", "", "Yes",
2L, "Yes", "", "",
3L, "" , "Yes", "")
x %>%
rename_with(stringr::str_replace,
pattern = "Color:",
replacement = "") %>%
pivot_longer(!IDENTITY,
names_to = "Color") %>%
filter(value != "") %>%
select(!value)