golang regex get the string including the search character

Time:06-04

I am extracting part of a string (a link): https://arteptweb-vh.akamaihd.net/i/am/ptweb/100000/100000/100095-000-A_0_VO-STE[ANG]_AMM-PTWEB_XQ.1V7rLEYkPH.smil/master.m3u8 and the desired output is 100000/100000/100095-000-A_. I am using the regex ^.*?(/[i,na,fm,d]([,/]?)(/am/ptweb/|.+=.+,))([^_]*).*?$ in the Golang flavor, and group 4 only gives me 100000/100000/100095-000-A, but I also want the underscore after the A. I am a bit stuck on this; any help is appreciated.

CodePudding user response:

You can use

(/(i|na|fm|d)(/am/ptweb/|.+=.+,))([^_]*_?)

Details:

  • (/(i|na|fm|d)(/am/ptweb/|.+=.+,)) - Group 1:
    • / - a / char
    • (i|na|fm|d) - Group 2: i, na, fm or d
    • (/am/ptweb/|.+=.+,) - Group 3: /am/ptweb/, or one or more chars other than line break chars (as many as possible), then =, then again one or more chars other than line break chars (as many as possible), then a , char
  • ([^_]*_?) - Group 4: zero or more chars other than _, followed by an optional _.
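
To make the group numbering concrete, here is a minimal Go sketch of how group 4 could be read with the standard regexp package. The names pathRe and extractPath are made up for this illustration, and the alternative in group 3 is written as .+=.+, as described in the details above.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathRe is the pattern from the answer. The name is hypothetical.
// Group 4 is ([^_]*_?): everything up to the first underscore, plus that underscore if present.
var pathRe = regexp.MustCompile(`(/(i|na|fm|d)(/am/ptweb/|.+=.+,))([^_]*_?)`)

// extractPath returns capture group 4, or "" when the pattern does not match.
func extractPath(link string) string {
	m := pathRe.FindStringSubmatch(link)
	if m == nil {
		return ""
	}
	return m[4]
}

func main() {
	link := "https://arteptweb-vh.akamaihd.net/i/am/ptweb/100000/100000/100095-000-A_0_VO-STE[ANG]_AMM-PTWEB_XQ.1V7rLEYkPH.smil/master.m3u8"
	fmt.Println(extractPath(link)) // 100000/100000/100095-000-A_
}
```

Because the trailing _ is optional in this pattern, a path without any underscore still matches and group 4 then holds the whole remainder.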

CodePudding user response:

You can match the underscore after the A like:

^.*?(/(?:[id]|na|fm)([,/]?)(/am/ptweb/|.+=.+,))([^_]*_).*$

A few notes about the pattern that you tried:

  • The notation [i,na,fm,d] is a character class, so it matches a single character (i, a comma, n, a, f, m or d); it should be a grouping such as (?:[id]|na|fm)
  • In the group ([,/]?) you optionally capture either , or /, so in theory the pattern could also match a string that contains /i//am/ptweb/
  • The last part .*?$ does not have to be non-greedy, as it is the last part of the pattern
  • The part [^_]* can also match spaces and newlines, because a negated character class matches any character that is not listed
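
The notes above can be checked with a short Go sketch. It is only an illustration: the names charClass, grouping, fullRe and extractStrict are invented here, and the two small patterns are anchored with ^ and $ purely to make the difference between a character class and a grouping visible.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// charClass matches "/" plus exactly ONE character out of: i , n a f m d
	charClass = regexp.MustCompile(`^/[i,na,fm,d]$`)
	// grouping matches "/" plus one of the alternatives i, d, na or fm
	grouping = regexp.MustCompile(`^/(?:[id]|na|fm)$`)
	// fullRe is the anchored pattern from this answer; group 4 requires the underscore
	fullRe = regexp.MustCompile(`^.*?(/(?:[id]|na|fm)([,/]?)(/am/ptweb/|.+=.+,))([^_]*_).*$`)
)

// extractStrict returns capture group 4, or "" when there is no match.
func extractStrict(link string) string {
	m := fullRe.FindStringSubmatch(link)
	if m == nil {
		return ""
	}
	return m[4]
}

func main() {
	// Compare how the character class and the grouping treat the same inputs.
	for _, s := range []string{"/i", "/na", "/,", "/m"} {
		fmt.Printf("%-4s class=%v group=%v\n", s, charClass.MatchString(s), grouping.MatchString(s))
	}

	link := "https://arteptweb-vh.akamaihd.net/i/am/ptweb/100000/100000/100095-000-A_0_VO-STE[ANG]_AMM-PTWEB_XQ.1V7rLEYkPH.smil/master.m3u8"
	fmt.Println(extractStrict(link)) // 100000/100000/100095-000-A_

	// The optional ([,/]?) group also lets a doubled slash through.
	fmt.Println(fullRe.MatchString("/i//am/ptweb/x_")) // true
}
```

Unlike the pattern with the optional _?, this one returns no match at all when the path contains no underscore, which may or may not be what you want.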