Home > Software engineering >  Deleting nth delimiter in R
Deleting nth delimiter in R

Time:10-14

I am trying to delete the 5th delimiter in this string:

"Bacteria_Firmicutes_Clostridia_Clostridiales_Rumino_coccaceae_Ruminococcus_Ruminococcus_albus"

so it becomes:

"Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"

This seems to work, but I feel like there should be a more elegant solution possibly with regex and str_replace

library(stringr)
name <- "Bacteria_Firmicutes_Clostridia_Clostridiales_Rumino_coccaceae_Ruminococcus_Ruminococcus_albus"
index <- str_locate_all(name, "_")[[1]]
str_sub(name, index[5, "start"], index[5, "end"]) <- ""
name

CodePudding user response:

Try gsub:

> gsub("((?:[^_] _){4}[^_] )_", "\\1", name)
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 

Or a less "pretty" way:

> gsub("([^_]*_[^_]*_[^_]*_[^_]*_[^_]*)_", "\\1", name)
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 

Or with the strex library:

> library(strex)
> paste(str_before_nth(name, "_", 5), str_after_nth(name, "_", 5), sep="")
[1] "Bacteria_Firmicutes_Clostridia_Clostridiales_Ruminococcaceae_Ruminococcus_Ruminococcus_albus"
> 
  • Related