I have searched for a long time but could not find the right answer anywhere. I need to retrieve data from a text file and then delete the retrieved data from that file. In short, I need a "cut" operation. I couldn't find a question and solution on Stack Overflow that matched what I really need.
First, here are the contents of file.txt, to make the problem clear:
Start
General : Video
Format : Matroska at 3 961 kb/s
Length : 2.50 GiB for 1 h 30 min 12 s 928 ms
Video #1 : AVC at 3 320 kb/s
Aspect : 1920 x 1080 (1.778) at 24.000 fps
Audio #2 : AC-3 at 640 kb/s
Infos : 6 channel(s), 48.0 kHz
Language : tr
Text #3 : UTF-8
Language : tr
End
--- Passing Data ---
Start
General : Video
Format : AVI at 1 113 kb/s
Length : 718 MiB for 1 h 30 min 12 s 552 ms
Video #0 : MPEG-4 Visual at 976 kb/s
Aspect : 720 x 404 (1.782) at 24.000 fps
Audio #1 : MPEG Audio at 128 kb/s
Infos : 2 channel(s), 48.0 kHz
End
As you can see in the file, Start and End specifiers come at certain intervals. I use these specifiers to get the data between them. My code is like this:
f = open('file.txt', 'r+', encoding='utf-8')
s = f.read()
start = s.find("Start") + len("Start")
end = s.find("End")
substring = s[start:end]
f.close()
print(substring)
But this code only retrieves the data; it does not remove it from the file. As a result, I can't move on to the next block of data, because s.find("Start") and s.find("End") always find only the first block.
How can I solve this problem? Thanks
CodePudding user response:
I'm not sure what you mean by the code preventing you from moving on to the next data, but I would use s.split("End") (note the capital E, since split is case-sensitive) and do the other string operations from there, because everything up to each "End" ends up in its own element of the resulting list. You could then call splitlines on each element to get a list of the lines in each Start/End block.
f = open('file.txt', 'r', encoding='utf-8')
s = f.read()
blocksOfData = s.split("End")
f.close()
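To make the split/splitlines idea concrete, here is a minimal sketch. The helper name split_blocks and the sample string are made up for illustration, and it assumes the words "Start" and "End" never appear inside the data itself.

```python
def split_blocks(text):
    """Return one list of lines per Start/End block (hypothetical helper)."""
    blocks = []
    for chunk in text.split("End"):
        # Keep only what follows "Start"; chunks without a "Start"
        # (separator lines, trailing newline) are skipped.
        _, found, body = chunk.partition("Start")
        if found:
            blocks.append(body.strip().splitlines())
    return blocks


# Shortened stand-in for file.txt
sample = (
    "Start\nGeneral : Video\nFormat : AVI\nEnd\n"
    "--- Passing Data ---\n"
    "Start\nGeneral : Audio\nEnd\n"
)
print(split_blocks(sample))
```

If a field value could ever contain "End" or "Start", this approach would cut in the wrong place, so matching the markers only on their own lines would be safer.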
CodePudding user response:
Apologies if this is poorly formatted, this is my first time on Stack Overflow.
Adding f.truncate(0) before you close the file will erase all of the contents of file.txt.
f = open('file.txt', 'r+', encoding='utf-8')  # 'r+' so the file can also be modified
s = f.read()
start = s.find("Start") + len("Start")
end = s.find("End")
substring = s[start:end]
f.truncate(0)  # erases the whole file, not just the extracted block
f.close()
print(substring)
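Since truncate(0) wipes the whole file, a variation that removes only the first block might look like the sketch below. It is only an illustration: the function name cut_first_block_in_place is made up, and dropping the Start/End markers together with the data is my assumption about what "cut" should do.

```python
import os
import tempfile


def cut_first_block_in_place(path):
    """Remove the first Start...End block from the file and return its body."""
    with open(path, "r+", encoding="utf-8") as f:
        s = f.read()
        start = s.find("Start")
        end = s.find("End", start + len("Start")) if start != -1 else -1
        if start == -1 or end == -1:
            return None  # no complete block, leave the file untouched
        body = s[start + len("Start"):end]
        f.seek(0)  # go back to the beginning before rewriting
        f.write(s[:start] + s[end + len("End"):])
        f.truncate()  # cut off the old tail left behind the shorter content
        return body.strip()


# Demo on a throwaway file
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "file.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Start\nA\nEnd\nrest\n")
    cut = cut_first_block_in_place(demo_path)
    with open(demo_path, encoding="utf-8") as f:
        leftover = f.read()
    print(repr(cut), repr(leftover))
```

Calling truncate() with no argument cuts the file at the current position, which is right after the rewritten text, so only the stale tail is removed.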
CodePudding user response:
Are you looking for something like:
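import re

# Lazy match from a "Start" line to the next "End" line; DOTALL lets "." span newlines
re_blocks = re.compile(r"^\s*Start.+?End\s*$", re.MULTILINE | re.DOTALL)
with open("file.txt", "r") as file:
    content = file.read()
blocks = re_blocks.findall(content)    # the extracted blocks
new_file = re_blocks.sub("", content)  # the text with those blocks removed
with open("file.txt", "w") as file:
    file.write(new_file)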
blocks is a list with the extracted data packages. After extracting them, the file is rewritten without those parts.
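If you only want the text between the markers, a capturing group is one option. This is a sketch on an in-memory string, not a drop-in replacement: the names BLOCK_RE and cut_blocks are hypothetical, and it assumes each marker sits alone at the start of its own line.

```python
import re

# Capture only what lies between a "Start" line and the next "End" line.
BLOCK_RE = re.compile(r"^Start\n(.*?)^End\n?", re.MULTILINE | re.DOTALL)


def cut_blocks(text):
    """Return (list of block bodies, text with the blocks removed)."""
    blocks = [body.strip() for body in BLOCK_RE.findall(text)]
    remainder = BLOCK_RE.sub("", text)
    return blocks, remainder


sample = (
    "Start\nFormat : AVI\nEnd\n"
    "--- Passing Data ---\n"
    "Start\nFormat : Matroska\nEnd\n"
)
blocks, remainder = cut_blocks(sample)
print(blocks)
print(repr(remainder))
```

With a single capturing group, findall returns the group contents instead of the whole match, so the markers themselves are not part of the result.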
CodePudding user response:
Files don't work like strings. If you want to remove a part from the beginning or middle of a file, you have to read all the text into memory, edit it in memory, and write everything back to the file. So you have to open the file for writing and write s[:start] and s[end:].
f = open('file.txt', 'r', encoding='utf-8')
s = f.read()
start = s.find("Start") + len("Start")
end = s.find("End")
substring = s[start:end]
f.close()
print(substring)
# reopen for writing and write back everything except the extracted part
f = open('file.txt', 'w', encoding='utf-8')
f.write(s[:start])
f.write(s[end:])
f.close()
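Note that with those slices the words "Start" and "End" themselves stay in the file. If you would rather drop the markers as well, the slicing could look like this sketch. It works on a plain string, the helper name cut_first_block is made up, and removing the markers is my assumption about the desired result.

```python
def cut_first_block(s, start_marker="Start", end_marker="End"):
    """Return (body, remainder); body is None when no complete block exists."""
    start = s.find(start_marker)
    if start == -1:
        return None, s
    body_start = start + len(start_marker)
    end = s.find(end_marker, body_start)
    if end == -1:
        return None, s
    body = s[body_start:end]
    # Everything before "Start" plus everything after "End"
    remainder = s[:start] + s[end + len(end_marker):]
    return body, remainder


body, remainder = cut_first_block("header\nStart\nA\nEnd\nfooter\n")
print(repr(body), repr(remainder))
```

The remainder is what you would then write back to the file opened in 'w' mode.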
But if you want to work with all Start...End blocks, then you don't have to crop the string. You can pass a start position as the second argument of find() to get the next elements:
start = s.find("Start", end) + len("Start")
end = s.find("End", start)
For example:
end = 0
while True:
    start = s.find("Start", end)  # search from where the previous block ended
    if start == -1:
        break  # no more blocks
    start += len("Start")
    end = s.find("End", start)
    substring = s[start:end]
    print(substring)
    end += len("End")
Or you can repeat the same code on the remaining substring s[end:]:
s = s[end:]
For example:
while True:
    start = s.find("Start")
    if start == -1:
        break  # no more blocks
    start += len("Start")
    end = s.find("End", start)
    substring = s[start:end]
    print(substring)
    s = s[end:]  # continue with the rest of the text
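Both loops assume every "Start" has a matching "End". If the last block is cut off, find() returns -1 for the end and the slice s[start:-1] silently drops the last character. A slightly more defensive version of the first loop, written as a generator, could look like this sketch (the name iter_blocks is made up for illustration):

```python
def iter_blocks(s, start_marker="Start", end_marker="End"):
    """Yield the stripped text between each Start/End pair, in order."""
    pos = 0
    while True:
        start = s.find(start_marker, pos)
        if start == -1:
            return  # no more blocks
        start += len(start_marker)
        end = s.find(end_marker, start)
        if end == -1:
            return  # unterminated block: stop instead of slicing with -1
        yield s[start:end].strip()
        pos = end + len(end_marker)  # continue searching after this "End"


sample = "Start\nA\nEnd\n--- Passing Data ---\nStart\nB\nEnd\nStart\nC"
print(list(iter_blocks(sample)))
```

Here the third block has no "End", so it is skipped instead of being returned half-finished.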