Home > other >  How to combine the rows above middle, and below if condition is met?
How to combine the rows above middle, and below if condition is met?

Time:01-16

I have a data frame which is one column of parts of names. It's formatted so that some rows are first name, some are middle initial and some are last name. Last name is always the row below first name, and middle initial is always above first name. Like this

df <- 
  tibble(
    name = c( 
      "Q",
      "Name: John",
      "Smith",
      "X",
      "J",
      "Name: Homer",
      "Simpson",
      "X",
      "X",
      "Q",
      "Name: John",
      "Citizen"
    )
  )

Which gives us

> df
# A tibble: 12 × 1
   name       
   <chr>      
 1 Q          
 2 Name: John 
 3 Smith      
 4 X          
 5 J          
 6 Name: Homer
 7 Simpson    
 8 X          
 9 X          
10 Q          
11 Name: John 
12 Citizen

What I want to do is collapse the rows so that if a row starts with "Name:", then it pastes the middle row, row above and row below to get something like:

> df_fixed
# A tibble: 6 × 1
  name                 
  <chr>                
1 Name: John Q Smith   
2 X                    
3 Name: Homer J Simpson
4 X                    
5 X                    
6 Name: John Q Citizen 

CodePudding user response:

we could do it this way:

According to your provided rules using lag and lead will solve it. The rest is some tweaking:

library(dplyr)
library(stringr)

df %>% 
  mutate(x = lag(name),
         y = lead(name)) %>% 
  mutate(across(-name, ~ifelse(str_detect(name, "Name"),., ""))) %>% 
  transmute(name = str_trim(paste(name, x, y))) %>% 
  filter(str_detect(name, "Name")  | str_detect(name, "X"))
  name                 
  <chr>                
1 Name: John Q Smith   
2 X                    
3 Name: Homer J Simpson
4 X                    
5 X                    
6 Name: John Q Citizen 
  • Related