Matching words & partial colon-delimited words within parentheses (excluding parentheses)

Time:04-19

I am trying to extract stock symbols from a body of text. These matches usually come in the following forms:

(<symbol>) => (VOO)
(<market>:<symbol>) => (NASDAQ:C)

In the sample cases shown above, I'd like to match VOO and C, skipping everything else. This regex gets me halfway there:

(?<=\()(.*?)(?=\))

With this, I match what's included within the parentheses, but the logic that ignores "noise" like NASDAQ: eludes me. I'd love to learn how to conditionally specify this pattern/logic.

Any ideas? Thanks!

CodePudding user response:

You can use

[A-Z]+(?=\))

See the regex demo.

Details:

  • [A-Z]+ - one or more uppercase ASCII letters
  • (?=\)) - a positive lookahead that matches a location that is immediately followed with a ) char.
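As a quick check of the lookahead pattern, here is a minimal Python sketch. The question does not name a language, so Python and the sample text are assumptions made for illustration only.

```python
import re

# Sample text invented for illustration; it is not from the question.
text = "Buy (VOO) and (NASDAQ:C), but not (see note 3)."

# One or more uppercase ASCII letters that are immediately followed by ")".
# The lookahead checks for ")" without making it part of the match.
symbols = re.findall(r"[A-Z]+(?=\))", text)
print(symbols)

# Caveat: this pattern never checks for an opening "(", so any uppercase
# run that ends right before ")" matches, e.g. BAR in "(foo BAR)".
loose = re.findall(r"[A-Z]+(?=\))", "(foo BAR)")
print(loose)
```

Because the pattern only looks at what follows the letters, it is the shorter option but also the looser one; it is probably best suited to text where parentheses are known to hold only symbols.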

Alternatively, you can use the following to capture the values into Group 1:

\((?:[^():]*:)?([A-Z]+)\)

See this regex demo. Details:

  • \( - a ( char
  • (?:[^():]*:)? - an optional sequence of any zero or more chars other than (, ) and : and then a : char
  • ([A-Z]+) - Group 1: one or more uppercase ASCII letters
  • \) - a ) char.
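The capturing-group variant could be used along these lines. This is a sketch in Python (an assumed language, since the question does not specify one), and the names `SYMBOL_RE` and `extract_symbols` are hypothetical helpers introduced here, not part of the answer.

```python
import re

# "(" + optional "<market>:" prefix + uppercase symbol (Group 1) + ")"
SYMBOL_RE = re.compile(r"\((?:[^():]*:)?([A-Z]+)\)")

def extract_symbols(text):
    """Return the Group 1 captures (the symbols) found in text."""
    # With exactly one capturing group, findall returns that group's text.
    return SYMBOL_RE.findall(text)

# Sample text invented for illustration; it is not from the question.
sample = "Funds: (VOO), banks: (NASDAQ:C), noise: (foo BAR) and ABC)."
print(extract_symbols(sample))
```

Since this pattern requires both parentheses and allows nothing but an optional colon-terminated prefix before the symbol, it skips inputs such as `(foo BAR)` or a stray `ABC)` that the lookahead-only pattern would accept.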