Extract values between 3 underscores

Time:09-03

I'm trying to extract the part of a character string that sits between underscores, keeping the underscore inside it:

20220801_NM7_Chrom_2399_A12_CCIH.CSV

The output I want is

Chrom_2399

Here is my code:

x = "20220801_NM7_Chrom_2399_A12_CCIH.CSV"
gsub("^(?:[^_]+_){2}([^_]+).*", "\\1", x)

It gave me

[1] "Chrom"

How do I correct it?

CodePudding user response:

Like this?

x <- "20220801_NM7_Chrom_2399_A12_CCIH.CSV"
sub("^([^_]+_){2}([^_]+_[^_]+)_.*", "\\2", x)
#> [1] "Chrom_2399"

Created on 2022-09-03 by the reprex package (v2.0.1)
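
For what it's worth, the pattern in the question returns only "Chrom" because its capture group covers a single underscore-free field. Below is a minimal sketch of that difference, assuming the wanted value is always the third and fourth underscore-separated fields. The names one_field and two_fields are only illustrative, and perl = TRUE is set here simply so the non-capturing group is handled by the PCRE engine.

```r
x <- "20220801_NM7_Chrom_2399_A12_CCIH.CSV"

# One field in the capture group: the match stops at the next underscore.
one_field <- sub("^(?:[^_]+_){2}([^_]+).*", "\\1", x, perl = TRUE)

# Two fields joined by a literal underscore inside the capture group.
two_fields <- sub("^(?:[^_]+_){2}([^_]+_[^_]+).*", "\\1", x, perl = TRUE)

print(one_field)
print(two_fields)
```

If the string has fewer than four fields, the pattern does not match and sub() returns the input unchanged, so it may be worth checking for that case.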

CodePudding user response:

You can try the following code using the stringr package.

library(stringr)

x <- "20220801_NM7_Chrom_2399_A12_CCIH.CSV"
# Split once on underscores, then rejoin the third and fourth pieces
parts <- str_split(x, pattern = "_")[[1]]
paste(parts[3:4], collapse = "_")
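
The same split-and-rejoin idea can probably be done without stringr. This is a rough base R sketch, assuming the value always occupies the third and fourth underscore-separated fields; parts and result are just illustrative names.

```r
x <- "20220801_NM7_Chrom_2399_A12_CCIH.CSV"

# Split once on underscores; fixed = TRUE treats "_" as a literal string.
parts <- strsplit(x, "_", fixed = TRUE)[[1]]

# Rejoin the third and fourth fields with an underscore.
result <- paste(parts[3:4], collapse = "_")
print(result)
```

Splitting once and indexing the result avoids running the split twice on the same string.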