I have tried several approaches, but none of them worked, and they were too messy to post here, so I will just describe the problem. I have a .txt file that looks like this:
Field1:
Something
Field2:
Something
Field3:
Field4:
Field1:
Something
Field2:
Field3:
Something
Field4:
Something
...
The file contains 4 fields that repeat an unspecified number of times, and it always ends with Field4. Each field may or may not have a string written under it, with no predictable pattern. When a field has nothing under it, I have to insert a line underneath that says "Empty". So in the end it should look something like this:
Field1:
Something
Field2:
Something
Field3:
Empty
Field4:
Empty
Field1:
Something
Field2:
Empty
Field3:
Something
Field4:
Something
...
My thought process was to open the original text file for reading and another text file for writing, iterate through the lines of the original, and write each line to the output file. If a line contains Field1 and the next line contains Field2, then add the string Empty underneath Field1, and continue doing this for each line.
CodePudding user response:
Since a text file cannot have lines inserted into the middle of it in place, the program reads every line of readable.txt and writes it to writable.txt, adding the missing "Empty" lines along the way.
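# Read all lines of the input file into a list
with open("readable.txt", "r") as infile:
    lines = infile.readlines()

# "w" starts from an empty output file, so re-running does not duplicate results
with open("writable.txt", "w") as f:
    n = 0
    while n < len(lines):
        if "Field" in lines[n]:
            # Normalise the line ending in case the last line has no newline
            f.write(lines[n].rstrip("\n") + "\n")
            # The field is empty if it is the last line or is followed by another field
            if n + 1 >= len(lines) or "Field" in lines[n + 1]:
                f.write("Empty\n")
                n = n + 1
            else:
                f.write(lines[n + 1].rstrip("\n") + "\n")
                n = n + 2  # Skip the value line that was just written
        else:
            n = n + 1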
CodePudding user response:
If you have a large file, you don't want to read it all into memory before processing it, so you can do it line by line.
First, we can define a regex pattern to match the word "Field", followed by one or more digits, followed by a colon.
Each time you read a line, if the previous line matches this pattern and the current line also matches the pattern, you write an "Empty" before writing this line. If not, you just write this line:
import re
pattern = re.compile(r"Field\d+:") # Field, followed by one or more digits (\d+), and a colon
with open("in.txt") as infile, open("out.txt", "w") as outfile:
    prev_line = ""
    for line in infile:
        if pattern.match(line) and pattern.match(prev_line):
            outfile.write("Empty\n") # Write an Empty line if both lines match the pattern
        outfile.write(line) # This is outside an if because we always write the current line
        prev_line = line
    if pattern.match(prev_line): # The file ended on a field with nothing under it
        if not prev_line.endswith("\n"):
            outfile.write("\n") # Finish the last line if it had no trailing newline
        outfile.write("Empty\n")
With your input file, this gives:
Field1:
Something
Field2:
Something
Field3:
Empty
Field4:
Empty
Field1:
Something
Field2:
Empty
Field3:
Something
Field4:
Something
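If you want to check the logic without touching any files, the same idea can be wrapped in a generator that takes any iterable of lines. This is only a sketch: `fill_empty`, `placeholder`, and `FIELD` are names made up for this example, and it assumes a value line never itself starts with something like "Field1:". It also handles a file whose last field has nothing under it.

```python
import re

# "Field", one or more digits, then a colon
FIELD = re.compile(r"Field\d+:")

def fill_empty(lines, placeholder="Empty"):
    """Yield each line (without its newline), adding a placeholder
    after every field header that has no value under it."""
    pending_header = False  # True when the previous line was a header
    for line in lines:
        line = line.rstrip("\n")
        is_header = FIELD.match(line) is not None
        if is_header and pending_header:
            yield placeholder  # previous header had nothing under it
        yield line
        pending_header = is_header
    if pending_header:
        yield placeholder  # input ended on a header with no value

# Small in-memory example; the last header has no value and no newline
sample = ["Field1:\n", "Something\n", "Field2:\n", "Field3:\n", "Something\n", "Field4:"]
print("\n".join(fill_empty(sample)))
```

Since an open file is also an iterable of lines, you could pass `infile` straight to `fill_empty` and write each yielded line followed by "\n" to the output file.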