I have an input file (File A) as shown below:
1015090 3919032
2115090 3919032 3919032
3215090 3919032 3919032
4315090 3919032 3919032
5415090 3919032 3919032
6515090 3919032 3919032
7615090 3919032 3919032
8715090 3919032 3919032
9815069 3919032 <----- This row needs to be found
2015089 3919032 <----- This row needs to be found
2115089 3919032 3919032
2215069 3919032 3919032
2315069 3919032 3919032
2415089 3919032 3919032
2515089 3919032 3919032
2615089 3919032 3919032
2715069 3919032 3919032
2815069 3919032 3919032
2915069 3919032
3015069 3919032 3919032
3115069 3919032 3919032
I need to find rows where the 3rd column is blank, but only when this happens in two consecutive rows. When two such rows are found together, I need to write those rows (column 1 and column 2) to an output file.
**Output:**
9815069 3919032
2015089 3919032
I am new to Python. Could anyone tell me how this can be done using Python? Thanks in advance.
CodePudding user response:
I prefer to use the readline method and a stack, just in case the file is very large.
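import re

# Matches a row with two numeric columns and an empty third column.
pat = re.compile(r'^\d+\s+\d+\s*$')

with open('File_A.txt', 'r') as fr:
    with open('File_A_Out.txt', 'a+') as fw:
        stack = []
        line = True
        while line:
            line = fr.readline()
            if pat.search(line):
                if stack:
                    # Second row of a consecutive pair: write both rows.
                    fw.write(stack.pop() + line)
                else:
                    # First candidate row: hold it until the next row is read.
                    stack.append(line)
            else:
                # Row has a third column (or is empty): discard any held row.
                stack = []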
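The same idea can also be sketched without a regex by splitting each row on whitespace and remembering the previous two-column row. This is only a minimal sketch: the function name `consecutive_blank_rows` is made up for illustration, and it assumes the real file holds only the numeric columns (the "<-----" notes in the question are not part of the data).

```python
def consecutive_blank_rows(lines):
    """Yield 'col1 col2' for each pair of adjacent rows that have only two columns."""
    prev = None  # previous two-column row still waiting for a partner
    for line in lines:
        fields = line.split()
        if len(fields) == 2:
            if prev is not None:
                # Two two-column rows in a row: emit both, then start over.
                yield " ".join(prev)
                yield " ".join(fields)
                prev = None
            else:
                prev = fields
        else:
            # A full row (or an empty line) breaks the run.
            prev = None


sample = [
    "1015090 3919032\n",
    "2115090 3919032 3919032\n",
    "9815069 3919032\n",
    "2015089 3919032\n",
    "2115089 3919032 3919032\n",
]
result = list(consecutive_blank_rows(sample))
print(result)
```

Because iterating over an open file yields one line at a time, a file object could be passed in place of `sample`, so the whole file would not need to be held in memory.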
CodePudding user response:
One regex approach would be to read all lines into a string, then search for consecutive lines matching the pattern ^\d+\s+\d+\s*$:
import re

with open('input.txt') as f:
    lines = f.read()

# Each match is a block of two consecutive rows that lack a third column.
matches = re.findall(r'^\d+\s+\d+\s*?\r?\n\d+\s+\d+\s*$', lines, flags=re.M)
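To get from the matched blocks to the requested output, each block could be split back into rows and trimmed to its first two columns. The sketch below is one possible way to do that. The helper name `find_blank_pairs` and the inline sample text are assumptions for illustration, not part of the original answer.

```python
import re

# Two consecutive rows, each with two numeric columns and nothing after them.
PAIR = re.compile(r'^\d+\s+\d+\s*?\r?\n\d+\s+\d+\s*$', flags=re.M)


def find_blank_pairs(text):
    """Return [col1, col2] for every row inside a matched two-row block."""
    rows = []
    for block in PAIR.findall(text):
        for row in block.splitlines():
            rows.append(row.split()[:2])
    return rows


sample = (
    "8715090 3919032 3919032\n"
    "9815069 3919032\n"
    "2015089 3919032\n"
    "2115089 3919032 3919032\n"
    "2915069 3919032\n"
    "3015069 3919032 3919032\n"
)
pairs = find_blank_pairs(sample)
print(pairs)
```

Each entry in `pairs` could then be written to the output file with something like `fw.write(" ".join(row) + "\n")`. Note that this approach reads the whole file into memory, so it suits small to medium inputs.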