How to move elements of a column up to top of dataframe in R-CodePudding

So, I have a dataframe that looks something like this:

A    B    C

a1   NA   NA

a2   NA   NA

NA   b1   NA

NA   NA   c1

NA   NA   c2

I want to make it look like this:

A    B    C

a1   b1   c1

a2   NA   c2

In other words, right now the dataframe is perfectly diagonal. I want to end up with 3 orderly columns and it doesn't matter to me if there are NAs after the last element in each column. How can I do this in R? Tidyverse solution would be best but not exclusively looking for one.

CodePudding user response：

We could loop across the columns, order based on the NA elements and then filter only rows having at least one non-NA

library(dplyr)
df1 %>%
  mutate(across(everything(), ~ .x[order(is.na(.x))])) %>%
  filter(if_any(everything(), complete.cases))

-output

   A    B  C
1 a1   b1 c1
2 a2 <NA> c2

Or using base R

df1[] <- lapply(df1, \(x)  x[order(is.na(x))])
df1[rowSums(!is.na(df1)) > 0,]
   A    B  C
1 a1   b1 c1
2 a2 <NA> c2

data

df1 <- structure(list(A = c("a1", "a2", NA, NA, NA), B = c(NA, NA, "b1", 
NA, NA), C = c(NA, NA, NA, "c1", "c2")), class = "data.frame", 
row.names = c(NA, 
-5L))

CodePudding user response：

Update: Code a little shorter:

df %>% 
  pivot_longer(everything()) %>% 
  arrange(name, value) %>% 
  mutate(x = max(parse_number(value), na.rm = TRUE)) %>% 
  group_by(name) %>% 
  slice(1:x[1]) %>% 
  pivot_wider(names_from = name, values_from = value, values_fn = list) %>% 
  unnest(cols = c(A, B, C))

Here is an alternative (longer :-)) way:

library(tidyverse)

df %>% 
  pivot_longer(
    everything()
  ) %>% 
  drop_na() %>% 
  group_by(name) %>% 
  summarise(cur_data()[seq(max(count(df, df$name)$n)),]) %>% 
  pivot_wider(values_fn = list) %>% 
  unnest(cols = c(A, B, C)) %>% 
  filter(!if_all(everything(), ~ is.na(.)))

 A     B     C    
  <chr> <chr> <chr>
1 a1    b1    c1   
2 a2    NA    c2

CodePudding user response：

Another possible solution, based on purrr::map_dfc:

library(purrr)

map_dfc(df, ~ {y <- .x[!is.na(.x)]; length(y) <- max(colSums(!is.na(df))); y})

#> # A tibble: 2 × 3
#>   A     B     C    
#>   <chr> <chr> <chr>
#> 1 a1    b1    c1   
#> 2 a2    <NA>  c2