Removing an empty line in a file

Time:11-28

I have been trying to delete lines from a file without loading the whole file into memory, because it is too large (~1 GB). How can I do this without leaving a blank line in the file?

For example:

I want to go from this:

foo bar
this is the line to be removed
foo bar
foo bar

To this:

foo bar
foo bar
foo bar

But I get this:

foo bar

foo bar
foo bar

So I have managed to delete the line, but I also want to remove the blank line it leaves behind. So far, I move the file pointer (cursor) to the start of the line I want to remove and then overwrite that line with spaces (' ').

a = f.tell() 
f.readline()
b = f.tell()
f.seek(a)
l2 = b-a-1
blank = " "*l2
f.write(blank)
f.seek(a)

CodePudding user response:

A much simpler approach to filtering a file in place is to open the same file twice, once for reading and once for writing, write out only the lines that need to be kept, and truncate the output at the end. This way, no tell or seek calls or other file position calculations are needed:

with open('file.txt') as file, open('file.txt', 'r+') as output:
    for line in file:
        if line != 'this is the line to be removed\n':
            output.write(line)
    output.truncate()

Demo: https://replit.com/@blhsing/SeagreenSlushyAutoresponder
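To try the two-handle idea end to end, here is a minimal sketch that wraps it in a function and runs it on a throwaway copy of the example from the question. The function name `drop_line_in_place`, the temporary file, and the use of binary mode are my own assumptions for illustration, not part of the answer above; binary mode is chosen only so that no newline translation can change the file's length.

```python
import os
import tempfile


def drop_line_in_place(path, unwanted):
    """Rewrite `path` without lines equal to `unwanted` (bytes, newline included).

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Two independent handles on the same file: one reads ahead, one writes behind.
    with open(path, 'rb') as src, open(path, 'r+b') as dst:
        for line in src:
            if line != unwanted:
                dst.write(line)
        # Cut off the stale tail left over after the last kept line.
        dst.truncate()


# Demo on a throwaway file that mirrors the example in the question.
demo_path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with open(demo_path, 'wb') as f:
    f.write(b'foo bar\nthis is the line to be removed\nfoo bar\nfoo bar\n')

drop_line_in_place(demo_path, b'this is the line to be removed\n')

with open(demo_path, 'rb') as f:
    result = f.read()
print(result)
```

This works because the writer can never get ahead of the reader: every kept line is written at or before the position it was read from, so unread data is never overwritten. If the process is interrupted midway, though, the file is left half rewritten, so try it on a copy first.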

CodePudding user response:

If you do need to remove the lines in place, which can be fraught with danger, then you could try the following. Basically, it keeps track of the latest line read and the latest line written, and truncates from the end of the last line written once the input is exhausted. Please test before use!

with open('file.txt', 'r+') as f:
    r_pos = w_pos = f.tell()
    while True:
        f.seek(r_pos)
        line = f.readline()
        if not line:
            break
        r_pos = f.tell()
        if 'remove' not in line: # or your criteria
            f.seek(w_pos)
            f.write(line)
            w_pos = f.tell()
    f.seek(w_pos)
    f.truncate()
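The same single-handle bookkeeping can be sketched as a reusable function. This is only an illustration under a few assumptions: the name `filter_in_place` and the `keep` predicate are hypothetical, and the file is opened in binary mode so that `tell()` and `seek()` work with plain byte offsets (in text mode, `tell()` returns an opaque value).

```python
import os
import tempfile


def filter_in_place(path, keep):
    """Keep only the lines for which keep(line) is true; `line` is bytes.

    Hypothetical helper: the name and the predicate argument are illustrative.
    """
    with open(path, 'r+b') as f:
        r_pos = w_pos = 0              # next offset to read from / write to
        while True:
            f.seek(r_pos)
            line = f.readline()
            if not line:               # empty bytes means end of file
                break
            r_pos = f.tell()
            if keep(line):
                f.seek(w_pos)
                f.write(line)          # w_pos <= start of line, so nothing unread is lost
                w_pos = f.tell()
        f.truncate(w_pos)              # drop everything after the last kept line


# Demo on a throwaway file; the contents are made up for the example.
demo_path = os.path.join(tempfile.mkdtemp(), 'file.txt')
with open(demo_path, 'wb') as f:
    f.write(b'foo bar\nplease remove me\nfoo bar\nfoo bar\n')

filter_in_place(demo_path, lambda line: b'remove' not in line)

with open(demo_path, 'rb') as f:
    result = f.read()
print(result)
```

Passing the criterion as a function keeps the position handling separate from the decision about which lines to drop, which makes it easier to test on small files before running it on the real one.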