I want to replace every run of whitespace characters with a single space. I tried this:
import re
fin = open("toutput_des.txt", "r")
fout = open("toutput2_des.txt", "w")
for line in fin:
    fout.write(re.sub('\s+', ' ', line))
fin.close()
fout.close()
It worked, but it also replaced the newline character at the end of each line with a single space. How can I modify the regex to exclude the newline? I also tried '\s+\b', but that deleted all of the contents of the file.
CodePudding user response:
You may simply append the newline back after replacing.
However, you do not need a regex here, you can use
for line in fin:
    fout.write(' '.join(line.split()) + '\n')
Note that line.split() splits the string on any run of whitespace while removing leading and trailing whitespace, and ' '.join(...) joins the items back with a single space.
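As a quick check of the split/join approach, here is a minimal sketch. The helper name collapse_whitespace and the sample strings are only illustrative and are not part of the original code:

```python
def collapse_whitespace(line):
    # split() with no arguments splits on runs of any whitespace
    # (spaces, tabs, newlines) and discards leading/trailing whitespace,
    # so the newline has to be appended back explicitly.
    return ' '.join(line.split()) + '\n'


print(repr(collapse_whitespace("  foo \t bar   baz\n")))
print(repr(collapse_whitespace("\n")))
```

With this approach a blank line stays a blank line, because joining an empty list gives an empty string and the newline is then appended.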
If you need to use a regex solution, then you can subtract \n from \s:
re.sub(r'[^\S\n]+', ' ', line)
The [^\S\n]+ regex matches one or more chars that are neither non-whitespace nor a line feed, i.e. it matches any run of whitespace chars except line feed chars.
You probably also want to .lstrip() the result.
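A minimal sketch of the regex variant, for comparison. The names HORIZONTAL_WS and collapse_keep_newline are hypothetical. It uses .lstrip(' ') rather than a bare .lstrip(), on the assumption that blank lines should be kept: a bare .lstrip() would also strip the newline from a line that contains only whitespace.

```python
import re

# Any whitespace except the line feed: "not (non-whitespace or \n)".
HORIZONTAL_WS = re.compile(r'[^\S\n]+')


def collapse_keep_newline(line):
    # Replace each run of non-newline whitespace with one space,
    # then drop the single leading space that may remain.
    return HORIZONTAL_WS.sub(' ', line).lstrip(' ')


print(repr(collapse_keep_newline("  foo \t bar   baz\n")))
print(repr(collapse_keep_newline("a  b  \n")))
```

Unlike the split/join version, this keeps one trailing space before the newline when the original line ended in whitespace, so the two approaches are not exactly equivalent.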