Check if there is a row with a pattern, otherwise insert one

I have a df like the following one:

ID          Comment1    Comment2
X9999       text        text
X9999.000   text        text
Y8888       text        text
Y8888.111   text        text
Z7777.555   text        text

In the first column there are IDs and sub-IDs. IDs look like X9999, sub-IDs like X9999.999. How can I make R check whether any sub-ID row lacks its corresponding ID row and, if so, insert one?

CodePudding user response：

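You can use dplyr to do a full_join between df and the unique base IDs, i.e. the IDs with the .xxx suffix stripped. Any base ID that has no row of its own is added, with NA in the other columns.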

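library(dplyr)
# Base IDs: strip everything from the first dot onward, then de-duplicate
base_ids <- data.frame(ID = unique(sub("\\..*", "", df$ID)))
df2 <- full_join(df, base_ids, by = "ID")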
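Note that full_join() appends the added rows at the end of the result. Below is a minimal sketch of one way to report which base IDs were missing and to place each new row above its sub-IDs. It assumes that sorting the whole table by ID is acceptable; the names `missing_ids` and `df2` are only illustrative.

```r
library(dplyr)

df <- data.frame(
  ID = c("X9999", "X9999.000", "Y8888", "Y8888.111", "Z7777.555"),
  Comment1 = "text",
  Comment2 = "text"
)

# Base IDs implied by the data (everything before the first dot)
base_ids <- unique(sub("\\..*", "", df$ID))

# Base IDs that have no row of their own
missing_ids <- setdiff(base_ids, df$ID)

# Add the missing rows (their comments become NA), then sort by ID
# so that each base ID sits directly above its sub-IDs
df2 <- df %>%
  full_join(data.frame(ID = missing_ids), by = "ID") %>%
  arrange(ID)
```

If the original row order matters and is not alphabetical, the arrange() step would need to be replaced by something that preserves it.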

CodePudding user response：

We can group by the ID (minus the sub-ID component) and flag any group that has no main-ID row. For such a group, uncount duplicates the first row only (so a group with several sub-IDs gets exactly one extra row), and the sub-ID component is then removed from the first copy.

library(tidyverse)

df %>%
  group_by(grp = str_replace_all(ID, "\\..*", "")) %>%
  # duplicate_row: the group has no main-ID row; copies: 2 for its first row only
  mutate(duplicate_row = !any(ID == grp),
         copies = if_else(duplicate_row & row_number() == 1, 2, 1)) %>%
  uncount(copies) %>%
  # the first copy becomes the main-ID row
  mutate(ID = ifelse(row_number() == 1 & duplicate_row, grp, ID)) %>%
  ungroup() %>%
  select(names(df))

Output

  ID        Comment1 Comment2
  <chr>     <chr>    <chr>   
1 X9999     text     text    
2 X9999.000 text     text    
3 Y8888     text     text    
4 Y8888.111 text     text    
5 Z7777     text     text    
6 Z7777.555 text     text

Data

df <- structure(list(ID = c("X9999", "X9999.000", "Y8888", "Y8888.111", 
"Z7777.555"), Comment1 = c("text", "text", "text", "text", "text"
), Comment2 = c("text", "text", "text", "text", "text")), class = "data.frame", row.names = c(NA, 
-5L))
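
For comparison, the same idea (copy the first sub-ID row of a group that lacks a main ID and strip the sub-ID part) can be sketched in base R without any packages. This is only one possible approach; the helper names `base`, `needs_parent`, `parents` and `result` are illustrative and not part of either answer.

```r
df <- data.frame(
  ID = c("X9999", "X9999.000", "Y8888", "Y8888.111", "Z7777.555"),
  Comment1 = "text",
  Comment2 = "text"
)

# Base ID of every row (everything before the first dot)
base <- sub("\\..*", "", df$ID)

# First row of each group whose base ID does not occur in df$ID
needs_parent <- !(base %in% df$ID) & !duplicated(base)

# Copy those rows and replace the sub-ID with the base ID
parents <- df[needs_parent, , drop = FALSE]
parents$ID <- base[needs_parent]

# Place each new row immediately before the row it was copied from
combined <- rbind(df, parents)
pos <- c(seq_len(nrow(df)), which(needs_parent) - 0.5)
result <- combined[order(pos), ]
rownames(result) <- NULL
```

Unlike the join-based answer, the inserted row here inherits the comments of the sub-ID row it was copied from, and the original row order is preserved.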