I would like to apply pivot_wider
to a tidyverse tibble by combining multiple columns without combining the names. My data contains information about patients' medication. There are various drugs that a given patient may or may not take. The columns contain the drug names and the daily doses in arbitrary order.
The problem is that my current approach produces far too many columns because the drug names are combined. See this reproducible example:
library(tidyverse)
# Let's have data of patients who may or may not take some drugs with a
# corresponding dose:
(
medication <- tribble(
~dob, ~drug_1, ~dose_1, ~drug_2, ~dose_2, ~drug_3, ~dose_3,
"1970-01-01", "Drug A", 100, NA, NA, NA, NA,
"1980-01-01", "Drug B", 150, "Drug A", 200, NA, NA,
"1990-01-01", NA, NA, "Drug C", 500, "Drug B", 100
)
)
# The desired arrangement is as follows:
#
# dob | 'Drug A' | 'Drug B' | 'Drug C'
# -----------|-------------------------------
# 1970-01-01 | 100 | NA | NA
# 1980-01-01 | 200 | 150 | NA
# 1980-01-01 | NA | 100 | 500
# The following attempt to pivot wider creates too many columns by combining all the drug names:
medication %>%
pivot_wider(names_from = starts_with("drug_"),
values_from = starts_with("dose_"))
# # A tibble: 3 × 10
# dob `dose_1_Drug A_NA_NA` `dose_1_Drug B_Drug A_NA` `dose_1_NA_Drug C_Drug B` `dose_2_Drug A_NA_NA` `dose_2_Drug B_Drug A_NA` `dose_2_NA_Drug C_Drug B` `dose_3_Drug A_NA_NA` `dose_3_Drug B_Drug A_NA` `dose_3_NA_Drug C_Drug B`
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1970-01-01 100 NA NA NA NA NA NA NA NA
# 2 1980-01-01 NA 150 NA NA 200 NA NA NA NA
# 3 1990-01-01 NA NA NA NA NA 500 NA NA 100
I even tried to apply the pivot_wider
function multiple times, each time explicitly stating names_from = "drug_1"
and so on, but this causes a "columns already exist" error.
Is there a way to achieve the desired arrangement as shown above? Thanks.
CodePudding user response:
Two steps, pivoting longer first:
medication %>%
pivot_longer(
cols = -dob,
names_to = c(".value", NA),
names_sep = "_",
values_drop_na = TRUE
) %>%
pivot_wider(
names_from = drug,
values_from = dose
)
# A tibble: 3 x 4
dob `Drug A` `Drug B` `Drug C`
<chr> <dbl> <dbl> <dbl>
1 1970-01-01 100 NA NA
2 1980-01-01 200 150 NA
3 1990-01-01 NA 100 500