I am trying to extract stock symbols from a body of text. These matches usually come in the following forms:
(<symbol>) => (VOO)
(<market>:<symbol>) => (NASDAQ:C)
In the sample cases shown above, I'd like to match VOO and C, skipping everything else. This regex gets me halfway there:
(?<=\()(.*?)(?=\))
With this, I match what's included within the parentheses, but the logic that ignores "noise" like NASDAQ: eludes me. I'd love to learn how to conditionally specify this pattern/logic.
Any ideas? Thanks!
CodePudding user response:
You can use
[A-Z]+(?=\))
See the regex demo.
Details:
[A-Z]+ - one or more uppercase ASCII letters
(?=\)) - a positive lookahead that matches a location that is immediately followed by a ) char.
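As a quick check of the lookahead approach, here is a minimal Python sketch. The question does not say which language or regex engine is in use, so Python's `re` module is an assumption, and the `extract_symbols` helper and the sample sentence are made up for illustration.

```python
import re

# One or more uppercase ASCII letters that sit directly before a ")".
SYMBOL_BEFORE_PAREN = re.compile(r"[A-Z]+(?=\))")


def extract_symbols(text):
    """Return every run of uppercase letters immediately followed by ')'."""
    return SYMBOL_BEFORE_PAREN.findall(text)


# Hypothetical sample text, not taken from the question.
sample = "I bought Vanguard (VOO) and Citigroup (NASDAQ:C) last week."
print(extract_symbols(sample))
```

Note that this pattern only looks at what follows the letters, so it would also pick up uppercase text before a stray closing parenthesis that has no matching opening one.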
Alternatively, you can use the following to capture the values into Group 1:
\((?:[^():]*:)?([A-Z]+)\)
See this regex demo. Details:
\( - a ( char
(?:[^():]*:)? - an optional sequence of any zero or more chars other than (, ) and :, and then a : char
([A-Z]+) - Group 1: one or more uppercase ASCII letters
\) - a ) char.
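The capturing-group variant can be tried the same way. This is again only a sketch that assumes Python's `re` module; `extract_symbols_strict` and the sample strings are hypothetical names and data, not something from the question.

```python
import re

# "(" + optional "<market>:" prefix + captured symbol + ")".
SYMBOL_IN_PARENS = re.compile(r"\((?:[^():]*:)?([A-Z]+)\)")


def extract_symbols_strict(text):
    """Return the Group 1 value (the symbol) for every parenthesized match."""
    # With exactly one capturing group, findall returns that group's text.
    return SYMBOL_IN_PARENS.findall(text)


# Hypothetical sample text, not taken from the question.
sample = "Compare (VOO) with (NASDAQ:C), ignoring ABC) and (see note)."
print(extract_symbols_strict(sample))
```

Because this version requires both the opening and the closing parenthesis, it skips unbalanced fragments such as `ABC)` and parenthesized text that is not an uppercase symbol.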