Perform a regular expression after a specified character

Time:07-17

I'm trying to ignore a specific character and everything before it, then perform a regex after it.

Example data:

Stack103Overflow!Yes:Overflow0204Stack02

So let's say I would like the output to be:

Stack103Overflow!Yes:OverflowStack

I know the regex to match a digit is [0-9]. However, if it is applied to the whole line, the "103" before the delimiter would be matched as well.

Conclusion:
Ignore a delimiter (in this case, ":") and everything before it.
Perform the regex after the specified delimiter.

What I have attempted:
I've tried reading up on "lookahead" and "lookbehind" but have been unable to grasp how to apply them. I can also split my data into 2 files at the delimiter, giving left.txt and right.txt, run the regex [0-9] on right.txt only, then rejoin the two with the "paste" command. However, this is extremely inconvenient and I've been trying to find a better, permanent solution.

My makeshift method is:

awk -F':' '{print $1}' file.txt > l.txt
awk -F':' '{print $2}' file.txt > r.txt
# I then open my r.txt in a text editor and remove: [0-9] 
# I then rejoin the file with:
paste -d':' l.txt r.txt > fileresults.txt

This method is very inefficient and time-consuming. I think it should be possible to match all occurrences of [0-9] after a ":" with a single regular expression.

CodePudding user response:

Using sed

$ sed -E ':a;s/(:[^0-9]*)[0-9]+/\1/;ta' input_file
Stack103Overflow!Yes:OverflowStack
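
The answer gives the command without explanation, so here is a commented sketch of the same loop. The function name strip_digits_after_colon is hypothetical and not part of the original answer. Splitting the script across several -e arguments is an assumption made for portability, since some non-GNU sed implementations do not accept a label and further commands on one ";"-separated line.

```shell
# Hypothetical helper (name is an assumption, not from the answer):
# reads lines on stdin and removes every digit after the first ":".
strip_digits_after_colon() {
  # :a  defines a label to jump back to.
  # s   keeps the ":" plus the non-digits after it (group 1) and drops
  #     the run of digits that follows; one run is removed per pass.
  # ta  jumps back to :a if the substitution changed anything, so the
  #     loop ends once no digits remain after the colon.
  sed -E \
    -e ':a' \
    -e 's/(:[^0-9]*)[0-9]+/\1/' \
    -e 'ta'
}

printf '%s\n' 'Stack103Overflow!Yes:Overflow0204Stack02' | strip_digits_after_colon
```

Because the pattern has to start at a ":", the digits in "Stack103" are never candidates for removal, and lines without a ":" pass through unchanged.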

CodePudding user response:

Setup:

$ cat file.txt
Stack103Overflow!Yes:Overflow0204Stack02
Some0therL1ne|I'm here 100%:Now where'd I put those 4 cases of s0up
the 3rd line | 123456 some stuff : 1a2b3c4d5e

Since OP is already using awk, here is one idea that uses a single awk call:

$ awk 'BEGIN {FS=OFS=":"} {gsub(/[0-9]/,"",$2)} 1' file.txt
Stack103Overflow!Yes:OverflowStack
Some0therL1ne|I'm here 100%:Now where'd I put those  cases of sup
the 3rd line | 123456 some stuff : abcde
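
One thing to keep in mind: the command above only edits $2, so on a line with more than one ":" the digits in later fields would survive. A possible variation, assuming everything after the first ":" should be cleaned, loops over fields 2 through NF. The function name strip_digits_after_first_colon and the sample line are hypothetical and only there for illustration.

```shell
# Hypothetical helper (name is an assumption): strips digits from every
# ":"-separated field except the first, reading lines from stdin.
strip_digits_after_first_colon() {
  awk 'BEGIN { FS = OFS = ":" }
       # Fields 2..NF hold everything after the first ":".
       { for (i = 2; i <= NF; i++) gsub(/[0-9]/, "", $i) }
       # The bare 1 is an always-true pattern, so every line is printed.
       1'
}

printf '%s\n' 'id42:time 12:30:45 end9' | strip_digits_after_first_colon
```

For the made-up sample line this prints "id42:time :: end", whereas the single-field version would leave "30" and "45 end9" untouched.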