I am grepping a file which occasionally has words with a space inserted between their letters.
For instance:
h e l l o this is an e x a m p le
I would like this to become:
hello this is an example
I am open to any command-line tool that solves this problem. I would accept the risk of single-character words getting squashed, since they occur very rarely in my files.
E. g.: h e l l o this is a r i s k I would take.
becoming hello this is ariskI would take.
CodePudding user response:
Something like this would work, replacing each match with the first capture group:
(?:(?<=^)|(?<= ))([^ ]) (?=[^ ] )
https://regex101.com/r/yLccGg/1
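As a rough sketch of how that pattern behaves, here it is applied from JavaScript. The `g` flag, the `$1` replacement and the helper name `joinSpacedLetters` are assumptions added for illustration; the answer itself only gives the pattern. The function expects a single line of text, because `^` without the `m` flag only matches at the start of the string.

```javascript
// Pattern from the answer above. The `g` flag is an assumption added here
// so that every occurrence in the line is replaced.
const pattern = /(?:(?<=^)|(?<= ))([^ ]) (?=[^ ] )/g;

// Hypothetical helper: keeps the captured character and drops the space after it.
function joinSpacedLetters(line) {
  return line.replace(pattern, '$1');
}

console.log(joinSpacedLetters('h e l l o this is an e x a m p le'));
// hello this is an examp le
console.log(joinSpacedLetters('h e l l o this is a r i s k I would take.'));
// hello this is ariskI would take.
```

Because the lookahead requires another single character followed by a space, the final gap of a spaced-out run is only closed when a one-letter word follows it, as "I" does in the second line. That is why "examp le" keeps its last space in the first line.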
CodePudding user response:
An example using node.js:
$ node -e "fs=require('fs'),fn='input.txt';fs.writeFileSync(fn,fs.readFileSync(fn,{encoding:'utf8'}).replace(/(?<=\b[a-z]) (?![A-Z]|\w\w\w)/g, ''));"
A space is removed if it follows a single lower-case letter at the start of a word and is not followed by a capital letter or by three consecutive word characters.
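For experimenting without touching a file, the same regex can be pulled out into a small function. This is only a sketch: the names `spacedLetterGap` and `squashSpacedWords` are made up here, and the two inputs are the examples from the question.

```javascript
// Same regex as in the one-liner above, split out for readability.
// (?<=\b[a-z])       the space must directly follow a single lower-case
//                    letter that starts a word
// (?![A-Z]|\w\w\w)   and must not be followed by a capital letter
//                    or by three consecutive word characters
const spacedLetterGap = /(?<=\b[a-z]) (?![A-Z]|\w\w\w)/g;

// Hypothetical helper name; not part of the original answer.
function squashSpacedWords(text) {
  return text.replace(spacedLetterGap, '');
}

console.log(squashSpacedWords('h e l l o this is an e x a m p le'));
// hello this is an example
console.log(squashSpacedWords('h e l l o this is a r i s k I would take.'));
// hello this is arisk I would take.
```

Unlike the output anticipated in the question, the capital "I" stays separate here, because the lookahead refuses to remove a space in front of a capital letter.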