How to fill in blanks based off another column's values in R-CodePudding

Here is the code and resulting dataframe

category <- c("East","BLANK","NorthEast","BLANK","BLANK")
subcat <- c("East","North","SW","NE","SE")
data1 <- as.data.frame(category)
data1$subcat <- subcat


**Category**  **Subcat**
East        East            
BLANK       North           
NorthEast   SW          
BLANK       NE          
BLANK       SE

The East category contains subcat East and North. The NorthEast category contains subcat SW,NE,SE. As you can see there are blanks for each category. How would I make so the 2nd value in Category is East and 4th and 5th row is North East? I have many more rows in the actual data so a way to do this would be helpful.

The result should be

**Category**  **Subcat**
East           East         
*East*         North            
NorthEast      SW           
*NorthEast*    NE           
*NorthEast*    SE

CodePudding user response：

Here is a dplyr only solution: First we group by every string that is not BLANK, then replace all group members with first value:

data1 %>% 
  group_by(x = cumsum(category != "BLANK")) %>% 
  mutate(category = first(category)) %>% 
  ungroup() %>% 
  select(-x)

 category  subcat
  <chr>     <chr> 
1 East      East  
2 East      North 
3 NorthEast SW    
4 NorthEast NE    
5 NorthEast SE

CodePudding user response：

We could convert the 'BLANK' to NA and use fill

library(tidyr)
library(dplyr)
data1 %>%
   na_if( "BLANK") %>%
   fill(category)

-output

    category subcat
1      East   East
2      East  North
3 NorthEast     SW
4 NorthEast     NE
5 NorthEast     SE