So I have this code that separates delimited values from each of the right 3 columns and makes them into separate rows. However, I want there to be only one observation per the 3 rows while keeping the rest of the variables the same.
data %>%
select(Name, A, B, C) %>%
separate_rows(A, B, C, sep = ";") %>%
filter(!is.na(A) | !is.na(B) | !is.na(C))
As an example, here's a table that demonstrates what I want the function to do.
Input (after above code is run):
Name | A | B | C | |
---|---|---|---|---|
x | 1 | 2 | ||
y | 3 | 4 |
Desired output:
Name | A | B | C | |
---|---|---|---|---|
x | 1 | |||
x | 2 | |||
y | 3 | |||
y | 4 |
I have been trying to find a function that can do this to no avail. Does anyone know what I can use to do separate values from multiple columns to new rows?
CodePudding user response:
Something like the following could work:
df <- data.frame(Name = c("x", "y"),
A = c(1, NA),
B = c(NA, 3),
C = c(2, 4))
library(tidyverse)
df |>
pivot_longer(-Name) |>
filter(!is.na(value)) |>
group_by(Name) |>
mutate(id = 1:n()) |>
ungroup() |>
arrange(name) |>
pivot_wider(values_from = value,
names_from = name) |>
arrange(Name, id) |>
select(-id)
which gives:
# A tibble: 4 x 4
Name A B C
<chr> <dbl> <dbl> <dbl>
1 x 1 NA NA
2 x NA NA 2
3 y NA 3 NA
4 y NA NA 4
CodePudding user response:
This can be done using bind_rows
from dplyr. Something like the following:
df_A = df %>% select(name, A) %>% filter(!is.na(value))
df_B = df %>% select(name, B) %>% filter(!is.na(value))
df_C = df %>% select(name, C) %>% filter(!is.na(value))
output_df = bind_rows(df_A, df_B, df_C)