How can I get the first and last part of one word combination using regex?

Time:11-04

How can I get only the middle part of a combined name with PCRE regex?

name: 211103_TV_storyname_TYPE

result: storyname

I have used this single line: .(\d) .(_TV_) to remove the first part: 211103_TV_

Another idea is to use (_TYPE)$, but the problem is that not all variations of the names contain a space that would separate a second word, so I can't use ^ for the first word and $ for the second.

The _TYPE and the TV parts of the combined name are fixed. The numbers change according to the date, and the storyname is variable. Any ideas?

Thanks

CodePudding user response:

You could match as few characters as possible after _TV_ until you reach _TYPE:

\d_TV_\K.*?(?=_TYPE)
  • \d_TV_ Match a digit and _TV_
  • \K Forget what is matched until now
  • .*? Match as few characters as possible
  • (?=_TYPE) Assert _TYPE to the right
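
As a rough illustration of how this pattern might be used from code, here is a minimal Python sketch. Python's built-in re module does not support \K, so the sketch swaps in a capturing group instead; that substitution and the function name extract_story are assumptions for illustration, not part of the answer's PCRE pattern.

```python
import re

# \K is not available in Python's built-in re module, so a capturing
# group stands in for it here (an adaptation of the PCRE pattern).
PATTERN = re.compile(r"\d_TV_(.*?)(?=_TYPE)")

def extract_story(name):
    """Return the text between _TV_ and _TYPE, or None if nothing matches."""
    match = PATTERN.search(name)
    return match.group(1) if match else None

print(extract_story("211103_TV_storyname_TYPE"))  # storyname
```

In a PCRE tool the original pattern returns the same text as the whole match, so no group is needed there.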


Another option without a non-greedy quantifier, and leaving out the digit at the start:

_TV_\K[^_]*(?>_(?!TYPE)[^_]*)*(?=_TYPE)
  • _TV_ Match literally
  • \K[^_]* Forget what is matched until now, then match zero or more characters other than _
  • (?>_(?!TYPE)[^_]*)* Atomic group, repeated zero or more times: only allow matching _ when it is not directly followed by TYPE
  • (?=_TYPE) Assert _TYPE to the right
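
The practical benefit of this variant is that the story name itself may contain underscores. A minimal Python sketch of the same idea follows, with two labeled adaptations: a capturing group replaces \K, and a plain non-capturing group replaces the atomic group, since Python's re module only supports atomic groups from version 3.11 onward. The function name extract_story_unrolled is hypothetical.

```python
import re

# Adaptation for Python's re module: a capturing group replaces \K, and a
# non-capturing group (?:...) replaces the atomic group (?>...).
UNROLLED = re.compile(r"_TV_([^_]*(?:_(?!TYPE)[^_]*)*)(?=_TYPE)")

def extract_story_unrolled(name):
    """Return the text between _TV_ and _TYPE, allowing inner underscores."""
    match = UNROLLED.search(name)
    return match.group(1) if match else None

print(extract_story_unrolled("211103_TV_storyname_TYPE"))   # storyname
print(extract_story_unrolled("211103_TV_story_name_TYPE"))  # story_name
```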



Edit

If you want to replace the 2 parts, you can use an alternation and replace with an empty string.

If it should be at the start and the end of the string, you can prepend ^ and append $ to the pattern.

\b\d{6}_TV_|_TYPE\b
  • \b\d{6}_TV_ A word boundary, match 6 digits and _TV_
  • | Or
  • _TYPE\b Match _TYPE followed by a word boundary
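
To make the replacement concrete, here is a small Python sketch that applies the alternation with re.sub and an empty replacement string. The anchored variant uses ^ and $ as described above; the function name strip_prefix_and_suffix and its anchored parameter are hypothetical.

```python
import re

# Word-boundary version: removes the two parts anywhere in the text.
STRIP_PARTS = re.compile(r"\b\d{6}_TV_|_TYPE\b")
# Anchored version: only at the very start and end of the string.
STRIP_ANCHORED = re.compile(r"^\d{6}_TV_|_TYPE$")

def strip_prefix_and_suffix(name, anchored=False):
    """Replace the date/_TV_ prefix and the _TYPE suffix with nothing."""
    pattern = STRIP_ANCHORED if anchored else STRIP_PARTS
    return pattern.sub("", name)

print(strip_prefix_and_suffix("211103_TV_storyname_TYPE"))                 # storyname
print(strip_prefix_and_suffix("211103_TV_storyname_TYPE", anchored=True))  # storyname
```

The word-boundary version also cleans names embedded in longer text, while the anchored version leaves such text untouched.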


CodePudding user response:

With your shown samples, you could try the following regex, which creates one capturing group containing the matched value.

.*?_TV_([^_]*)(?=_TYPE)

Or, as a small variation of the above solution based on fourth bird's suggestion, the following works without the lazy match .*?:

_TV_([^_]*)(?=_TYPE)


Explanation: a detailed breakdown of the first regex above.

.*?_      ##Lazy match up to the _ that starts _TV_.
TV_       ##Match TV_ literally.
([^_]*)   ##1st capturing group: everything up to the next _.
(?=_TYPE) ##Lookahead making sure the captured value is followed by _TYPE.
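
Because this approach relies on a capturing group rather than \K, it can be tried as-is in Python's re module. The sketch below reads the result from group 1 and also shows one consequence of [^_]*: a story name that itself contains an underscore is not matched. The function name story_from_group is hypothetical.

```python
import re

# The shorter variant from this answer, unchanged.
CAPTURE = re.compile(r"_TV_([^_]*)(?=_TYPE)")

def story_from_group(name):
    """Return capturing group 1, or None if the pattern does not match."""
    match = CAPTURE.search(name)
    return match.group(1) if match else None

print(story_from_group("211103_TV_storyname_TYPE"))   # storyname
print(story_from_group("211103_TV_story_name_TYPE"))  # None
```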