Python: how to delete a specific number of lines before or after a specific string in a text file

Time:10-20

All I can find is how to delete all lines after a specific word, but I only want to delete a specific number of lines.

For example I have a file that contains:

FCT
Paris
105,4
35
2,161 million
LZQ
London
1572
11
8,982 million
PRI
Paris
105,4
35
2,161 million
Rome
1285
11
2,873 million
PRI
Paris
105,4
35
2,161 million

Now I want to delete the line containing Paris, the line before it, and the three lines after it.

Expected output would be:

LZQ
London
1572
11
8,982 million

This works, but it deletes only the line containing Paris:

bad_words = ['Paris',]

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    for line in oldfile:
        if not any(bad_word in line for bad_word in bad_words):
            newfile.write(line)

CodePudding user response:

This is pretty inelegant but it works, assuming you want to remove exactly one previous line and exactly three following lines if a "bad word" is encountered. It will not work as intended if there are sometimes more lines or fewer lines following a "bad word":

bad_words = {"Paris"}  # membership tests with sets are O(1)


with open('DataSystem.txt') as oldfile:
    data = oldfile.read().split("\n")


i = 0
new_data = []
while i < len(data):
    item = data[i]
    if item in bad_words:
        del new_data[-1]  # drop the line before the bad word
        i += 4            # skip the bad word and the three lines after it
        continue
    new_data.append(item)
    i += 1

Output:

['LZQ',
 'London',
 '1572',
 '11',
 '8,982 million',
 'Rome',
 '1285',
 '11',
 '2,873 million']

You can then write this to your newfile:

with open('newfile.txt', 'w') as newfile:
    newfile.write("\n".join(new_data))
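If the number of lines to remove before and after a match needs to vary, the same idea can be written as a small function. This is only a sketch: the name `drop_around` and its `before` and `after` parameters are made up for illustration, and it assumes a line matches when its stripped text equals one of the bad words.

```python
def drop_around(lines, bad_words, before=1, after=3):
    """Return lines without each matching line, the `before` lines
    above it, and the `after` lines below it (hypothetical helper)."""
    drop = set()
    for i, line in enumerate(lines):
        if line.strip() in bad_words:
            # Clamp the range so matches near either end do not fail.
            start = max(0, i - before)
            stop = min(len(lines), i + after + 1)
            drop.update(range(start, stop))
    return [line for i, line in enumerate(lines) if i not in drop]


sample = """FCT
Paris
105,4
35
2,161 million
LZQ
London
1572
11
8,982 million
PRI
Paris
105,4
35
2,161 million""".splitlines()

print(drop_around(sample, {"Paris"}))
# ['LZQ', 'London', '1572', '11', '8,982 million']
```

Collecting indices first, instead of deleting while iterating, means a match on the first line or close to the end of the file is handled without an IndexError.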

CodePudding user response:

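Read the file five lines at a time. If the second line of a group is not a "bad word", write those five lines out. This assumes every record is exactly five lines long with the city on the second line; a shorter record, such as one without a leading code line, would shift every group after it.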

bad_words = ['Paris']

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    while True:
        lines = [oldfile.readline() for _ in range(5)]
        if not lines[0]:
            break
        if lines[1].rstrip() not in bad_words:
            newfile.write(''.join(lines))
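The same fixed-size grouping can be tried on a list of lines without touching any files, which makes it easier to experiment with. This is a sketch under the same assumption of fixed-length records; `filter_records`, `size`, and `key_index` are illustrative names, not part of the answer above.

```python
from itertools import islice


def filter_records(lines, bad_words, size=5, key_index=1):
    """Split lines into groups of `size` and keep only the groups whose
    line at `key_index` is not a bad word (hypothetical helper)."""
    it = iter(lines)
    kept = []
    while True:
        record = list(islice(it, size))  # next group; shorter at the end
        if not record:
            break
        # A short final group may not have a line at key_index at all.
        if len(record) > key_index and record[key_index].strip() in bad_words:
            continue
        kept.extend(record)
    return kept


records = ["FCT", "Paris", "105,4", "35", "2,161 million",
           "LZQ", "London", "1572", "11", "8,982 million"]
print(filter_records(records, {"Paris"}))
# ['LZQ', 'London', '1572', '11', '8,982 million']
```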

CodePudding user response:

Since every record ends with a line containing "million", you can buffer lines until you reach that line and write the buffer out only if it contains no bad word.

Example code:

bad_words = ['Paris',]

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    lines = oldfile.readlines()
    temp = []
    is_bad = False
    for line in lines:
        temp.append(line)
        for bad_word in bad_words:
            if bad_word in line:
                is_bad = True
                break
        if "million" in line:
            if not is_bad:
                for new_data in temp:
                    newfile.write(new_data)
            is_bad = False
            temp = []

Result:

LZQ
London
1572
11
8,982 million
Rome
1285
11
2,873 million
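
The buffering step can also be separated from the filtering step, which keeps each part short. The following is a sketch only: `split_records` and `keep_clean` are hypothetical names, and it assumes, as above, that a line containing "million" closes each record. Unlike the code above, leftover lines after the last "million" line are kept as a final record.

```python
def split_records(lines, terminator="million"):
    """Yield lists of lines, each ending at a line containing `terminator`."""
    record = []
    for line in lines:
        record.append(line)
        if terminator in line:
            yield record
            record = []
    if record:  # trailing lines with no terminator
        yield record


def keep_clean(lines, bad_words, terminator="million"):
    """Return the lines of every record that contains no bad word."""
    kept = []
    for record in split_records(lines, terminator):
        if not any(word in line for line in record for word in bad_words):
            kept.extend(record)
    return kept


data = ["PRI", "Paris", "105,4", "35", "2,161 million",
        "Rome", "1285", "11", "2,873 million",
        "PRI", "Paris", "105,4", "35", "2,161 million"]
print(keep_clean(data, ["Paris"]))
# ['Rome', '1285', '11', '2,873 million']
```

Because records are delimited by their last line, this version copes with the four-line Rome record that has no code line above it.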