Regex to extract specific text

Time:09-13

I have the following HTML string, and I'm trying to extract specific text from it (BASEBALL, FOOTBALL).

I've tried several regexes, but I can only get the first match. I could use a lookbehind, but that is not supported by Mobile Safari. Is there a better way?

This text will ALWAYS be preceded by style='font-weight:bold;'>, so that prefix can be used to locate the text reliably.

<div><span > <b>19:43:08 pm</b></span> <strong><span style="cursor:pointer;">Gello:</span></strong> <span><strong>These are my favorite sports -- <div><button  class='btn' type='button'  style='font-weight:bold;'>BASEBALL</span></button></div> gets <div class='dropdown' style='display:inline-block;'><button  class='btn' type='button' data-toggle='dropdown' style='font-weight:bold;'>FOOTBALL</span></button></div> oijd;osijf osidj osd jfsoij fosj f.</strong></span></div>

CodePudding user response:

You can use a group with a lazy match in your regex.

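const rx = /style='font-weight:bold;'>(.*?)<\/span>/g

const found = []
let m = rx.exec(input)
while (m) {
  found.push(m[1]) // capture group 1: the text between > and </span>
  m = rx.exec(input) // with the g flag, exec() resumes after the previous match
}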
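
As a sketch of the same idea without the manual loop, `String.prototype.matchAll` returns an iterator over every match, and each match carries its capture groups. The `input` value below is an abbreviated copy of the HTML from the question, and `extractBold` is a hypothetical helper name, not something from the original answer.

```javascript
// Abbreviated copy of the HTML from the question (assumption: same markup shape).
const input =
  "<div><button class='btn' type='button' style='font-weight:bold;'>BASEBALL</span></button></div> gets " +
  "<div class='dropdown'><button class='btn' type='button' data-toggle='dropdown' style='font-weight:bold;'>FOOTBALL</span></button></div>";

// Hypothetical helper: collect capture group 1 from every match.
function extractBold(html) {
  // matchAll throws a TypeError if the regex lacks the g flag.
  const rx = /style='font-weight:bold;'>(.*?)<\/span>/g;
  return Array.from(html.matchAll(rx), (m) => m[1]);
}

console.log(extractBold(input)); // [ 'BASEBALL', 'FOOTBALL' ]
```

Neither this nor the `exec` loop needs a lookbehind, because the prefix is matched normally and only the group is kept.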

If you have false positives, you might want to limit the characters in the group. In this case, you do not even have to match the end tag following your text.

const rx = /style='font-weight:bold;'>([A-Z]+)/g
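
A minimal sketch of the character-class variant, assuming the wanted text consists only of uppercase ASCII letters. The `sample` string is a made-up fragment in which the second label is not followed by `</span>`, to show that the end tag no longer matters.

```javascript
// Hypothetical sample: the second label is followed by a different end tag.
const sample =
  "<button style='font-weight:bold;'>BASEBALL</span></button> and " +
  "<button style='font-weight:bold;'>FOOTBALL</button>";

const upperOnly = /style='font-weight:bold;'>([A-Z]+)/g;
const names = [];
let match;
// exec() advances upperOnly.lastIndex on each call because of the g flag.
while ((match = upperOnly.exec(sample)) !== null) {
  names.push(match[1]);
}

console.log(names); // [ 'BASEBALL', 'FOOTBALL' ]
```

The group stops at the first character outside `A-Z`, so a label such as "Ice Hockey" would come back as just "I". Widen the class if the labels can contain lowercase letters, digits, or spaces.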

CodePudding user response:

style=(?:'|")font-weight:bold;(?:'|")>(\w+)

This regex accepts either ' or " around the style value. The full match still includes everything from style to >, so read capture group 1 (or strip that prefix) to get BASEBALL and FOOTBALL.
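
To illustrate the difference between the full match and the group, here is a small sketch. The `html` string is a made-up fragment that mixes both quote styles, and the variable names are hypothetical.

```javascript
// Hypothetical fragment: one attribute uses single quotes, the other double quotes.
const html =
  "<button style='font-weight:bold;'>BASEBALL</button> " +
  '<button style="font-weight:bold;">FOOTBALL</button>';

const eitherQuote = /style=(?:'|")font-weight:bold;(?:'|")>(\w+)/g;

const fullMatches = [];
const labels = [];
for (const m of html.matchAll(eitherQuote)) {
  fullMatches.push(m[0]); // whole match, including the style prefix
  labels.push(m[1]); // capture group 1, the text only
}

console.log(fullMatches[0]); // style='font-weight:bold;'>BASEBALL
console.log(labels); // [ 'BASEBALL', 'FOOTBALL' ]
```

Because `\w` matches only letters, digits, and underscores, a label containing a space or a hyphen would be cut off at that character.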
