Removing all of empty characters in Python

Time:04-15

I want to replace every run of whitespace characters with a single space. I tried this:

import re

fin = open("toutput_des.txt", "r")
fout = open("toutput2_des.txt", "w")

for line in fin:
    fout.write(re.sub(r'\s+', ' ', line))

fin.close()
fout.close()

It worked, but it also replaced the newline character at the end of each line with a single space. How can I modify the regex so that it excludes the newline? I also tried '\s+\b', but that deleted all of the contents of the file.

CodePudding user response:

You may simply append the newline back after replacing.
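A minimal sketch of that first option, assuming the question's pattern with the `+` quantifier. The helper name `collapse_keep_newline` is made up for illustration, and the line's own newline is removed first so that it is not turned into a space:

```python
import re

def collapse_keep_newline(line):
    # Hypothetical helper, not part of the original question.
    # Drop the trailing newline first so the regex cannot turn it into a space.
    body = line.rstrip('\n')
    # Collapse every remaining whitespace run to a single space,
    # then append the newline back explicitly.
    return re.sub(r'\s+', ' ', body) + '\n'
```

Note that this always ends the result with a newline, even if the last line of the input file had none.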

However, you do not need a regex here; you can use:

for line in fin:
    fout.write(' '.join(line.split()) + '\n')

Note that line.split() splits the string on any run of whitespace and discards leading and trailing whitespace, and ' '.join(...) joins the items back with a single space.
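The split/join behaviour can be checked on a single string. This is a small sketch; the function name `normalize_line` and the sample text are assumptions chosen for illustration:

```python
def normalize_line(line):
    # split() with no arguments splits on any run of whitespace and
    # drops leading/trailing whitespace, so the newline is discarded too.
    words = line.split()
    # Rejoin with single spaces and add the newline back explicitly.
    return ' '.join(words) + '\n'

sample = "  hello \t  world  \n"
print(repr(normalize_line(sample)))
```

A line that contains only whitespace comes out as a bare newline, because split() returns an empty list for it.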

If you need to use a regex solution, then you can subtract \n from \s:

re.sub(r'[^\S\n]+', ' ', line)

The [^\S\n]+ regex matches one or more characters that are neither non-whitespace characters nor the line feed character, i.e. it matches any whitespace except line feeds. You will probably also want to call .lstrip() on the result.
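A short sketch of the regex variant on one line, assuming the pattern uses the `+` quantifier. The names `SPACES_NOT_NEWLINE` and `collapse_spaces` are hypothetical. It strips only leading spaces with `.lstrip(' ')`, which is a deliberate departure from a bare `.lstrip()`:

```python
import re

# Whitespace other than the line feed: "not (non-whitespace or \n)".
SPACES_NOT_NEWLINE = re.compile(r'[^\S\n]+')

def collapse_spaces(line):
    # Replace each run of non-newline whitespace with one space;
    # any '\n' in the line is left untouched.
    collapsed = SPACES_NOT_NEWLINE.sub(' ', line)
    # Strip only spaces on the left: a bare .lstrip() would also remove
    # the newline of a line that contains nothing but whitespace.
    return collapsed.lstrip(' ')
```

Unlike the split/join version, this keeps a single trailing space before the newline when the original line ended in whitespace.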
