Stack data (maybe pivot_longer) but complicated, R-CodePudding

I have data like this:

df<-structure(list(record_id = c(1, 2, 4), alcohol = c(1, 2, 1), 
ethnicity = c(1, 1, 1), bilateral_vs_unilateral = c(1, 2, 
2), fat_grafting = c(1, 1, 0), number_of_adm_sheets_used = c(1, 
NA, NA), number_of_adm_sheets_used_2 = c(1, 1, 1), number_of_fills = c(7, 
NA, NA), number_of_fills_2 = c(7, NA, 2), total_fill_volume_ml_left = c(240, 
NA, NA), total_volume_ml = c(240, 300, 550), implant_size_l = c(NA_real_, 
NA_real_, NA_real_), implant_size_l_2 = c(NA_real_, NA_real_, 
NA_real_)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", 
"data.frame"))

It is info about patients with each row representing a patient that underwent breast surgery.

I'd like to change it into each row representing a particular breast (of the two). There are several variables, everything from 'number_of_adm_sheets_used' to 'implant_size_l_2' that have a column for each side. I'd like to change those to represent either. An example is 'number_of_adm_sheets_used' stood for on the left side, and 'number_of_adm_sheets_used_2' was on the right side. I'd like to combine them to become one column of sheets used that was for either side.

My expected output would look like:

Pre-

Post-

I figure its some variant of pivot_longer but I'm having trouble with a few aspects:

the real data has 68 columns
I only need a duplicate row if the column "bilateral_vs_unilateral" is a "1" (meaning bilateral)
The way I've used pivot_longer before, you'd say "cols" and pick a big range, I'm not sure how to stack pairs of columns, if that makes sense.
Luckily, despite having 68 other columns, all of the "trouble" columns are shown below. Pairing 'number_of_adm_sheets_used' with 'number_of_adm_sheets_used_2' 'number_of_fills' with 'number_of_fills_2' 'total_fill_volume_ml_left' with 'total_volume_ml' and 'implant_size_1' with 'implant_size_1_2'

Thank you

CodePudding user response：

Here is one possibility, if I'm understanding the issue correctly.

# Make long format
df.long <- df %>%
  pivot_longer(cols = -record_id) %>%
  mutate(subject = ifelse(str_sub(name, -2, -1) == "_2", "breast 2", NA),
         name = str_remove(name, "_2")) %>%
  group_by(record_id, name) %>%
  mutate(subject = case_when(
    subject == "breast 2" ~ subject,
    n() == 2 ~ "breast 1",
    n() == 1 ~ "patient"
  )) %>%
  ungroup()

# statistics regarding the patient
patient <- df.long %>%
  filter(subject == "patient") %>%
  pivot_wider(names_from = name, values_from = value) %>%
  select(-subject)
# statistics regarding each breast
breasts <- df.long %>%
  filter(str_detect(subject, "breast")) %>%
  pivot_wider(names_from = name, values_from = value)

# merge the two data.frames
patient %>%
  inner_join(breasts) %>%
  select(record_id, subject, everything())