How to format a text file (remove unwanted line spacing and paragraph spacing) in Python? - CodePudding

I have extracted an email and saved it to a text file, but the text is not properly formatted.

The file looks like this:

                Hi Kim,
               



                     Hope you are fine.
                



                  Your Code is:
                 

                    42483423



                 Thanks and Regards,
                        

                    Bolt

I want to open and edit this file so that the whitespace before each line of text and the blank lines between the lines are removed, like this:

Hi Kim,
Hope you are fine.
Your Code is:
42483423
Thanks and Regards,
Bolt

What I have so far:

file = open('email.txt', 'r+')  # 'rw' is not a valid mode; 'r+' opens for reading and writing

CodePudding user response：

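You can use re.sub to replace every run of two or more whitespace characters with a single newline (s is the file content as a string):

import re
# strip() removes the newline that the leading whitespace run turns into
formatted = re.sub(r'\s\s+', '\n', s).strip()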
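As a quick check, here is a minimal sketch that applies this idea to a shortened copy of the sample email, assuming the intended pattern is `\s\s+` (two or more whitespace characters). The names `collapse_whitespace` and `sample` are made up for illustration. One thing to watch: a double space inside a sentence would also be turned into a line break by this pattern.

```python
import re

def collapse_whitespace(text):
    """Replace each run of 2+ whitespace characters with a single newline."""
    # strip() removes the newline that a leading or trailing run turns into.
    return re.sub(r"\s\s+", "\n", text).strip()

# Shortened, hypothetical copy of the email from the question.
sample = "        Hi Kim,\n       \n\n\n          Hope you are fine.\n     \n\n      Bolt\n"
print(collapse_whitespace(sample))
```

Single spaces between words are left alone because the pattern needs at least two whitespace characters in a row.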

CodePudding user response：

If you have the entire text in a single string (s), you could do something like this:

formatted = "\n".join(filter(None, (x.strip() for x in s.split("\n"))))

That:

splits the string into separate lines
strips any leading and trailing whitespace
filters out empty strings
rejoins into a multi-line string

Result:

Hi Kim,
Hope you are fine.
Your Code is:
42483423
Thanks and Regards,
Bolt
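The four steps listed above can also be written out one at a time, which may be easier to read and debug than the one-liner. This is only a sketch: the function name `tidy` and the intermediate variable names are invented here, and the input is a shortened version of the email.

```python
def tidy(text):
    """Remove blank lines and per-line leading/trailing whitespace."""
    lines = text.split("\n")                          # 1. split into separate lines
    stripped = [line.strip() for line in lines]       # 2. strip leading/trailing whitespace
    non_empty = [line for line in stripped if line]   # 3. filter out empty strings
    return "\n".join(non_empty)                       # 4. rejoin into one string

# Shortened, hypothetical copy of the email from the question.
s = "   Hi Kim,\n   \n\n     Your Code is:\n  \n     42483423\n"
print(tidy(s))
```

Unlike a regex that collapses whitespace runs, this version never touches spaces inside a line, so double spaces within a sentence are preserved.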

CodePudding user response：

We can read the input file line by line and skip the lines that contain nothing but spaces and newlines. Each remaining line is stripped of surrounding whitespace and written to the output file, followed by a newline.

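# Read text_file.txt line by line and write the cleaned lines to output_file.txt.
with open("text_file.txt") as fr, open("output_file.txt", "w") as fw:
    for row in fr:
        r_s = row.strip()  # drop leading/trailing whitespace, including the newline
        if r_s:  # skip lines that contained only whitespace
            fw.write(r_s + "\n")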

The output_file.txt is as follows:

Hi Kim,
Hope you are fine.
Your Code is:
42483423
Thanks and Regards,
Bolt
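Since the question asks to edit the same file rather than produce a second one, a possible variant reads the whole file first and then overwrites it. The sketch below assumes the file is small enough to fit in memory. The function name `clean_file_in_place` is hypothetical, and the demo runs on a temporary file so that nothing real is modified.

```python
import tempfile
from pathlib import Path

def clean_file_in_place(path):
    """Rewrite the file at `path` without blank lines or surrounding whitespace."""
    p = Path(path)
    # Read everything before writing, so the original content is not lost
    # when the file is reopened in write mode.
    lines = p.read_text().splitlines()
    kept = [line.strip() for line in lines if line.strip()]
    p.write_text("".join(line + "\n" for line in kept))

# Demo on a throwaway file; in practice, pass the path of the saved email.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "email.txt"
    demo.write_text("      Hi Kim,\n     \n\n        Bolt\n")
    clean_file_in_place(demo)
    result = demo.read_text()

print(result, end="")
```

For large files, writing to a separate output file line by line (as in the answer above) avoids holding the whole content in memory.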