I have a txt file with some text like this:
000001 - AAAAAA - BBBBBBBB - CCCC
000002 - BAAAAA - BBBBBBBB - CCCC
000003 - DAAAAA - BBBBBBBB - CCCC
...
I need a regex that ignores everything except the second column (AAAAAA, BAAAAA, DAAAAA, ...).
Thank you
CodePudding user response:
Try this:
[^-]* - ([^ -]*).*
[^-]*
matches the characters before the first dash, and ([^ -]*)
captures everything up to the next character that is a space or a dash, which is the second column.
With a tool like sed the command would be
sed -E 's/[^-]* - ([^ -]*).*/\1/' input.txt
An even better tool for the job, however, is awk, which gives the same result with the command
awk -F ' - ' '{ print $2 }' input.txt
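To check that both approaches agree, here is a minimal sketch that runs them on the three sample lines from the question instead of input.txt. The variable sample and the helper functions with_sed and with_awk are names made up for this illustration, and the data is assumed to look exactly like the lines shown above.

```shell
# Sample lines copied from the question (assumed to be representative of the file)
sample='000001 - AAAAAA - BBBBBBBB - CCCC
000002 - BAAAAA - BBBBBBBB - CCCC
000003 - DAAAAA - BBBBBBBB - CCCC'

# Hypothetical helper: replace each line with the captured second column
with_sed() {
  sed -E 's/[^-]* - ([^ -]*).*/\1/'
}

# Hypothetical helper: split on the literal " - " separator and print field 2
with_awk() {
  awk -F ' - ' '{ print $2 }'
}

printf '%s\n' "$sample" | with_sed
printf '%s\n' "$sample" | with_awk
```

Each pipeline should print AAAAAA, BAAAAA and DAAAAA on separate lines. One difference to keep in mind: awk only splits on the full " - " separator, so a column that itself contains a space or a dash still comes out whole, whereas the sed pattern stops capturing at the first space or dash.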