Remove characters including and after third hyphen

Time:09-29

I have a dataframe df and want to remove everything including and after the third '-' in the column 'case_id':

df

case_id             unit

TCGA-3A-01-03-9441  27  
TCGA-9C-01-04-9641  15
TCGA-1E-01-05-9471  6

This is the desired output:

df

case_id     unit

TCGA-3A-01  27  
TCGA-9C-01  15
TCGA-1E-01  6

CodePudding user response:

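We could use str_replace from stringr: capture the first three hyphen-separated fields and replace the whole string with that capture.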

library(stringr)
library(dplyr)
df1 %>% 
   mutate(case_id = str_replace(case_id, "^(([^-]+-){2}[^-]+)-.*", "\\1"))

-output

      case_id unit
1 TCGA-3A-01   27
2 TCGA-9C-01   15
3 TCGA-1E-01    6

data

df1 <- structure(list(case_id = c("TCGA-3A-01-03-9441", "TCGA-9C-01-04-9641", 
"TCGA-1E-01-05-9471"), unit = c(27L, 15L, 6L)),
 class = "data.frame", row.names = c(NA, 
-3L))
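
For comparison, the same idea can probably be written in base R without loading any packages. This is a minimal sketch, assuming the same sample data as above (rebuilt here with data.frame() so the snippet runs on its own) and using sub() in place of str_replace():

```r
# Same sample data as in the answer above
df1 <- data.frame(
  case_id = c("TCGA-3A-01-03-9441", "TCGA-9C-01-04-9641", "TCGA-1E-01-05-9471"),
  unit = c(27L, 15L, 6L)
)

# ([^-]+-){2} matches the first two fields with their trailing hyphens,
# [^-]+ matches the third field, and -.* matches the third hyphen plus the rest.
# Replacing the whole match with capture group 1 keeps only the first three fields.
df1$case_id <- sub("^(([^-]+-){2}[^-]+)-.*", "\\1", df1$case_id)

df1
```

Values with fewer than three hyphens do not match the pattern, so sub() leaves them unchanged.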