So, I have a dataframe that looks something like this:
A B C
a1 NA NA
a2 NA NA
NA b1 NA
NA NA c1
NA NA c2
I want to make it look like this:
A B C
a1 b1 c1
a2 NA c2
In other words, right now the dataframe is perfectly diagonal. I want to end up with 3 orderly columns and it doesn't matter to me if there are NAs after the last element in each column. How can I do this in R? Tidyverse solution would be best but not exclusively looking for one.
CodePudding user response:
We could loop across
the columns, order
based on the NA
elements and then filter
only rows having at least one non-NA
library(dplyr)
df1 %>%
mutate(across(everything(), ~ .x[order(is.na(.x))])) %>%
filter(if_any(everything(), complete.cases))
-output
A B C
1 a1 b1 c1
2 a2 <NA> c2
Or using base R
df1[] <- lapply(df1, \(x) x[order(is.na(x))])
df1[rowSums(!is.na(df1)) > 0,]
A B C
1 a1 b1 c1
2 a2 <NA> c2
data
df1 <- structure(list(A = c("a1", "a2", NA, NA, NA), B = c(NA, NA, "b1",
NA, NA), C = c(NA, NA, NA, "c1", "c2")), class = "data.frame",
row.names = c(NA,
-5L))
CodePudding user response:
Update: Code a little shorter:
df %>%
pivot_longer(everything()) %>%
arrange(name, value) %>%
mutate(x = max(parse_number(value), na.rm = TRUE)) %>%
group_by(name) %>%
slice(1:x[1]) %>%
pivot_wider(names_from = name, values_from = value, values_fn = list) %>%
unnest(cols = c(A, B, C))
Here is an alternative (longer :-)) way:
library(tidyverse)
df %>%
pivot_longer(
everything()
) %>%
drop_na() %>%
group_by(name) %>%
summarise(cur_data()[seq(max(count(df, df$name)$n)),]) %>%
pivot_wider(values_fn = list) %>%
unnest(cols = c(A, B, C)) %>%
filter(!if_all(everything(), ~ is.na(.)))
A B C
<chr> <chr> <chr>
1 a1 b1 c1
2 a2 NA c2
CodePudding user response:
Another possible solution, based on purrr::map_dfc
:
library(purrr)
map_dfc(df, ~ {y <- .x[!is.na(.x)]; length(y) <- max(colSums(!is.na(df))); y})
#> # A tibble: 2 × 3
#> A B C
#> <chr> <chr> <chr>
#> 1 a1 b1 c1
#> 2 a2 <NA> c2