Is it possible to merge columns with the same identifier and stack these on top of one another from the same dataframe. Also to then remove NA values as seen in the Desired DataFrame output?
Current DataFrame
id Col1 Col2 Col3
1 A NA C
1 NA B NA
2 NA B NA
2 A NA C
3 A NA C
Desired DataFrame
id NewCol
1 A
1 B
1 C
2 A
2 B
2 C
3 A
3 C
CodePudding user response:
Reshape from 'wide' to 'long' with pivot_longer
while removing the NA
s with values_drop_na
and then do the arrange
on all the columns
library(tidyr)
library(dplyr)
pivot_longer(df1, cols = starts_with('Col'), values_to = 'NewCol',
names_to = NULL, values_drop_na = TRUE) %>%
arrange(across(everything()))
-output
# A tibble: 8 × 2
id NewCol
<int> <chr>
1 1 A
2 1 B
3 1 C
4 2 A
5 2 B
6 2 C
7 3 A
8 3 C
data
df1 <- structure(list(id = c(1L, 1L, 2L, 2L, 3L), Col1 = c("A", NA,
NA, "A", "A"), Col2 = c(NA, "B", "B", NA, NA), Col3 = c("C",
NA, NA, "C", "C")), class = "data.frame", row.names = c(NA, -5L
))
CodePudding user response:
As an alternative we could use splitshakeshape
packackge:
library(splitstackshape)
library(dplyr)
df %>%
merged.stack(df, id.vars="id",
var.stubs="Col", sep = ".") %>%
na.omit() %>%
select(-.time_1)
id Col
1: 1 A
2: 1 B
3: 1 C
4: 2 A
5: 2 B
6: 2 C
7: 3 A
8: 3 C