Regex Extract First all CAPS word from lines containing Dollar Signs

Time:11-25

I have printouts with hundreds of lines, some containing stock symbols in CAPS that I'd like to extract, e.g.

STOCKS OPTIONS SYMBOL GROUPS WORKING
$14,489.60
$14,489.60 Mark WMT D
72%
($24.00)
$45.00 ($153.00) T
2 opt
$500.00 MSFT
100 Sha

I'd like to extract: WMT, T, MSFT
using an online regex tester such as https://regexr.com/
I spent hours trying expressions such as the following, but have had no luck extracting just the symbols and none of the other text:
$. [A-Z]\w\s

CodePudding user response:

You didn't specify a programming language so I'll assume PCRE:

regex

^.*\d+.*?\K\b[A-Z]+\b

data

STOCKS OPTIONS SYMBOL GROUPS WORKING
$14,489.60
$14,489.60 Mark WMT D
72%
($24.00)
$45.00 ($153.00) T
2 opt
$500.00 MSFT
100 Sha

The extracted data is WMT, T, and MSFT

https://regex101.com/r/N2shwC/1

In English:

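With the multiline flag enabled (so that ^ matches at the start of every line), find each line that contains digits and match the first run of capital letters, bounded by word boundaries, that follows the digits. The \K discards everything matched before it, so only the symbol itself is returned. Note that the pattern keys on digits rather than on the dollar sign.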
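For readers who are not using a PCRE engine, here is a minimal sketch of the same idea in Python. Python's built-in re module does not support \K, so this version swaps in a capture group instead; that substitution, the variable names, and the dollar-sign variant (dollar_pattern, which is not part of the answer above) are assumptions made for illustration.

```python
import re

# Sample data copied from the question.
text = """STOCKS OPTIONS SYMBOL GROUPS WORKING
$14,489.60
$14,489.60 Mark WMT D
72%
($24.00)
$45.00 ($153.00) T
2 opt
$500.00 MSFT
100 Sha
"""

# Same logic as the PCRE pattern, but Python's re has no \K,
# so the symbol is captured in group 1 instead.
# re.MULTILINE makes ^ match at the start of every line.
pattern = re.compile(r"^.*\d+.*?\b([A-Z]+)\b", re.MULTILINE)
symbols = pattern.findall(text)

# Hypothetical variant: additionally require a dollar sign on the line,
# using a lookahead anchored at the start of the line.
dollar_pattern = re.compile(r"^(?=.*\$).*\d+.*?\b([A-Z]+)\b", re.MULTILINE)
dollar_symbols = dollar_pattern.findall(text)

print(symbols)         # ['WMT', 'T', 'MSFT']
print(dollar_symbols)  # ['WMT', 'T', 'MSFT']
```

With a single capture group, findall returns only the captured symbols, which plays the same role here as \K does in the PCRE version. The header line is skipped because it contains no digits, and "Mark" and "Sha" are skipped because they are not entirely uppercase.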
