How to reposition a row in a data frame that meets certain conditions?-CodePudding

I'm using R and the tidyverse and need to be able to systematically search a data frame for "problem rows", and then replace them with a row from above.

For example, let's say I have the following data:

df <- data.frame(
  name = c("blank", "A", "B", "blank", "C", "D", "blank", "E"),
  problem = c(0, 0, 0, 0, 0, 1, 0, 1)
)

df
<name>   <problem>
blank           0
    A           0
    B           0
blank           0
    C           0
    D           1
blank           0
    E           1

In this example both names D and E have a "problem", so I need to search for the nearest "blank" row above and reposition it where the problem row currently sits. So the solution would look like the following:

<name>   <problem>
blank           0
    A           0
    B           0
    C           0
    D           1
blank           0
    E           1
blank           0

Any help would be tremendously appreciated, as I have a much larger data frame in real life and can't make the adjustments manually. Thank you!

CodePudding user response：

To keep excess blanks in place:

library(tidyverse)

df <- data.frame(
  name = c("blank", "A", "B", "blank", "C", "D", "blank", "E"),
  problem = c(0, 0, 0, 0, 0, 1, 0, 1)
)

df |> 
  mutate(temp = if_else(problem == 1, name, NA_character_)) |> 
  fill(temp, .direction = "up") |> 
  mutate(temp = if_else(name == "blank" | problem == 1, temp, name)) |> 
  add_count(temp) |> 
  group_by(temp) |> 
  mutate(temp = if_else(n > 2 & row_number() == n - 2, 
                        NA_character_, temp)) |> 
  ungroup() |> 
  fill(temp, .direction = "up") |> 
  arrange(temp, desc(problem)) |> 
  select(-temp, -n)
#> # A tibble: 8 × 2
#>   name  problem
#>   <chr>   <dbl>
#> 1 blank       0
#> 2 A           0
#> 3 B           0
#> 4 C           0
#> 5 D           1
#> 6 blank       0
#> 7 E           1
#> 8 blank       0

^{Created on 2022-06-02 by the reprex package (v2.0.1)}

CodePudding user response：

Carl, I've built off your approach to create something I think is more generalizable. It's not super elegant, but I do think it works. If anyone else wants to have a go, that would be fantastic.

# add more data to top and bottom to ensure only "problems" are being addressed
df <- data.frame(
  name = c("blank", "blank", "A", "B", "blank", "C", 
           "D", "blank", "E", "blank", "F", "blank"),
  problem = c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0)
)

# find position of the first non-problem name
no_prob_name <- df %>% 
  filter(name != "blank") %>% 
  slice_head(n = 1) %>%
  pull(name)

row_np_name <- which(df$name == no_prob_name)

# rearrange
df |> 
  mutate(temp = if_else(problem == 1, name, NA_character_)) |> 
  fill(temp, .direction = "up") |>
  mutate(temp = ifelse(is.na(temp) & name != "blank", name, temp)) %>% 
  fill(temp, .direction = "up") %>% 
  fill(temp, .direction = "down") %>% 
  mutate(temp = if_else(name == "blank" | problem == 1, temp, name)
         ,temp = as.character(str_glue("{temp}{problem}"))
         ) |> 
  add_count(temp) |> 
  mutate(temp = if_else(n > 1 & row_number() < row_np_name, 
                        NA_character_, temp)) |> 
  fill(temp, .direction = "up") |> 
  mutate(temp = str_sub(temp, 1, 1)) %>% 
  arrange(temp, desc(problem)) |> 
  select(-temp, -n)