Home > Mobile >  How to `pivot_wider` multiple columns without combining the names?
How to `pivot_wider` multiple columns without combining the names?

Time:04-29

I would like to apply pivot_wider to a tidyverse tibble by combining multiple columns without combining the names. My data contains information about patients' medication. There are various drugs that a given patient may or may not take. The columns contain the drug names and the daily doses in arbitrary order.

The problem is that my current approach produces far too many columns because the drug names are combined. See this reproducible example:

library(tidyverse)

# Let's have data of patients who may or may not take some drugs with a
# corresponding dose:
(
  medication <- tribble(
    ~dob,           ~drug_1,  ~dose_1, ~drug_2, ~dose_2, ~drug_3,  ~dose_3,
    "1970-01-01",   "Drug A", 100,     NA,      NA,      NA,       NA,
    "1980-01-01",   "Drug B", 150,    "Drug A", 200,     NA,       NA,
    "1990-01-01",   NA,       NA,     "Drug C", 500,     "Drug B", 100
  )
)

# The desired arrangement is as follows:
#
# dob        | 'Drug A' | 'Drug B' | 'Drug C'
# -----------|-------------------------------
# 1970-01-01 | 100      | NA       | NA
# 1980-01-01 | 200      | 150      | NA
# 1980-01-01 | NA       | 100      | 500

# The following attempt to pivot wider creates too many columns by combining all the drug names:

medication %>%
  pivot_wider(names_from = starts_with("drug_"),
              values_from = starts_with("dose_"))

# # A tibble: 3 × 10
#   dob        `dose_1_Drug A_NA_NA` `dose_1_Drug B_Drug A_NA` `dose_1_NA_Drug C_Drug B` `dose_2_Drug A_NA_NA` `dose_2_Drug B_Drug A_NA` `dose_2_NA_Drug C_Drug B` `dose_3_Drug A_NA_NA` `dose_3_Drug B_Drug A_NA` `dose_3_NA_Drug C_Drug B`
#   <chr>                      <dbl>                     <dbl>                     <dbl>                 <dbl>                     <dbl>                     <dbl>                 <dbl>                     <dbl>                     <dbl>
# 1 1970-01-01                   100                        NA                        NA                    NA                        NA                        NA                    NA                        NA                        NA
# 2 1980-01-01                    NA                       150                        NA                    NA                       200                        NA                    NA                        NA                        NA
# 3 1990-01-01                    NA                        NA                        NA                    NA                        NA                       500                    NA                        NA                       100

I even tried to apply the pivot_wider function multiple times, each time explicitly stating names_from = "drug_1" and so on, but this causes a "columns already exist" error.

Is there a way to achieve the desired arrangement as shown above? Thanks.

CodePudding user response:

Two steps, pivoting longer first:

medication %>% 
  pivot_longer(
    cols = -dob,
    names_to = c(".value", NA),
    names_sep = "_",
    values_drop_na = TRUE
  ) %>% 
  pivot_wider(
    names_from = drug,
    values_from = dose
  )
# A tibble: 3 x 4
  dob        `Drug A` `Drug B` `Drug C`
  <chr>         <dbl>    <dbl>    <dbl>
1 1970-01-01      100       NA       NA
2 1980-01-01      200      150       NA
3 1990-01-01       NA      100      500
  • Related