I'm trying to ignore a specific character and everything before it, then apply a regex only to the text after it.
Example data:
Stack103Overflow!Yes:Overflow0204Stack02
So let's say I would like the output to be:
Stack103Overflow!Yes:OverflowStack
I know the regex to match all digits is [0-9]. However, if applied to the whole line, "103" would be matched too.
Conclusion:
Ignore a delimiter (in this case, ":") and everything before it.
Perform the regex after the specified delimiter.
What I have attempted:
I've tried reading up on "lookahead" and "lookbehind" but was unable to grasp how to apply them. I can also split my data into two files at the delimiter, left.txt and right.txt, perform the regex [0-9] on right.txt, then realign them with the "paste" command. However, this is extremely inconvenient and I've been trying to find a better permanent solution.
My makeshift method is:
awk -F':' '{print $1}' file.txt > l.txt
awk -F':' '{print $2}' file.txt > r.txt
# I then open my r.txt in a text editor and remove: [0-9]
# I then rejoin the file with:
paste l.txt r.txt > fileresults.txt
This method is very inefficient and time-consuming. I think it should be possible to match all occurrences of [0-9] after a ":" with a regular expression.
CodePudding user response:
Using sed
$ sed -E ':a;s/(:[^0-9]*)[0-9]+/\1/;ta' input_file
Stack103Overflow!Yes:OverflowStack
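To make the loop in that one-liner easier to follow, here is a commented sketch of the same substitution wrapped in a shell function. The function name strip_digits_after_colon is hypothetical, and the script is split into separate -e expressions on the assumption that this is more portable across sed implementations than the semicolon form; the regex itself is unchanged.

```shell
# Hypothetical helper (name is an assumption, not from the answer above):
# removes every digit that appears after the first ':' on each input line.
strip_digits_after_colon() {
  # :a          define a label named "a"
  # s/.../\1/   match ':' plus any non-digits (kept as group 1), then a run
  #             of digits; replace the whole match with group 1 only
  # ta          if the substitution succeeded, branch back to "a" and retry;
  #             the loop ends when no digits remain after the ':'
  sed -E -e ':a' -e 's/(:[^0-9]*)[0-9]+/\1/' -e 'ta' "$@"
}

printf '%s\n' 'Stack103Overflow!Yes:Overflow0204Stack02' | strip_digits_after_colon
```

Because the pattern must start at a ":", the "103" before the delimiter is never touched; each pass of the loop deletes the next run of digits to the right of it.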
CodePudding user response:
Setup:
$ cat file.txt
Stack103Overflow!Yes:Overflow0204Stack02
Some0therL1ne|I'm here 100%:Now where'd I put those 4 cases of s0up
the 3rd line | 123456 some stuff : 1a2b3c4d5e
Since OP is already using awk, we'll look at one idea using a single awk call:
$ awk 'BEGIN {FS=OFS=":"} {gsub(/[0-9]/,"",$2)} 1' file.txt
Stack103Overflow!Yes:OverflowStack
Some0therL1ne|I'm here 100%:Now where'd I put those  cases of sup
the 3rd line | 123456 some stuff : abcde
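One caveat: with ":" as the field separator, the gsub() call only touches $2, so digits after a second ":" on the same line would be left alone. If the data might contain more than one delimiter, a possible variation is to split each line at the first ":" by hand. This is a sketch under that assumption, and the function name strip_after_first_colon is hypothetical.

```shell
# Hypothetical helper (name is an assumption): removes digits from everything
# after the FIRST ':' on each line, even if later ':' characters follow.
strip_after_first_colon() {
  awk '{
    i = index($0, ":")             # position of the first ":" (0 if none)
    if (i > 0) {
      left  = substr($0, 1, i)     # text up to and including the delimiter
      right = substr($0, i + 1)    # everything after the delimiter
      gsub(/[0-9]/, "", right)     # strip digits on the right-hand side only
      $0 = left right              # reassemble the line
    }
    print                          # lines without ":" pass through unchanged
  }' "$@"
}

printf '%s\n' 'a1:b2:c3' | strip_after_first_colon
```

For the sample input a1:b2:c3 this prints a1:b:c, whereas the FS=OFS=":" version would leave the trailing c3 intact.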