For example, I have the following text fragment in a file:
Item Value Threshold Converged?
Maximum Force 0.009497 0.000450 NO
RMS Force 0.002723 0.000300 NO
Maximum Displacement 0.247463 0.001800 NO
RMS Displacement 0.065734 0.001200 NO
SCF Done: E(RPW91-PW91) = -2381.36459172 A.U. after 1 cycles
I am trying to extract the two numbers in the Value and Threshold columns from each line. For example, consider the first line:
Maximum Force 0.009497 0.000450 NO
Here I am trying to get the value 0.009497:
#!/usr/bin/python3.6
import glob
import os
def CutMAXFValue( Line: str) -> float:
    return float(Line.split()[2])
def CutSCFValue( Line: str) -> float:
    return float(Line.split()[4])
def GrepSCF( Filename: str , StartStep = 1):
    #print(Filename)
    result = list()
    Step = StartStep
    with open(Filename, 'r') as f:
        lines = f.readlines()
    for s in lines:
        if s.startswith("SCF Done:"):
            result.append( (Step, CutSCFValue(s) ) )
            Step += 1
    return result
def GrepMAXF( Filename: str , StartStep = 1):
    #print(Filename)
    result = list()
    Step = StartStep
    with open(Filename, 'r') as f:
        lines = f.readlines()
    for s in lines:
        if s.startswith("Maximum Force"):
            result.append( (Step, CutMAXFValue(s) ) )
            Step += 1
    return result
def DraftListSteps(f: str):
    result = list()
    SubFiles = GetOutFilesBeginsWith( f ) #support function to read from file
    MaxRerun = GetMaxRerun( SubFiles )
    startstep = 1
    for rerun in range(MaxRerun + 1):
        RFiles = GetFilesForRerun( SubFiles, rerun )
        DoneFile = next( x for x in RFiles if x.endswith("out") or x.endswith("outERR") )
        MAXF = GrepMAXF(DoneFile, startstep)
        MinStep = min(MAXF, key = lambda x: x[1] )
        startstep = MinStep[0]
        result.append(MAXF)
        # SCF = GrepSCF(DoneFile, startstep)
        # MinStep = min(MAXF, key = lambda x: x[1] )
        # startstep = MinStep[0]
        # result.append(SCF)
    return result
After running the script I get the following error:
Traceback (most recent call last):
  File "forker.py", line 123, in <module>
    draft = DraftListSteps( f )
  File "forker.py", line 91, in DraftListSteps
    MinStep = min(MAXF, key = lambda x: x[1] )
ValueError: min() arg is an empty sequence
How can I fix this error and extract the required value? A regexp doesn't work in this case either, though maybe I wrote the wrong pattern for it...
If I extract the value -2381.36459172 from the SCF Done fragment, this code works perfectly, but when I use the same approach to get 0.009497, it does not work.
CodePudding user response:
I'll bet anything it's a TAB between Maximum and Force, so change
if s.startswith("Maximum Force"):
to
if s.startswith("Maximum\tForce"):
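To confirm which whitespace the file really contains before changing the test, it can help to print the line with repr(). This is only a minimal sketch: the sample line and the helper name show_whitespace are hypothetical, and the real file may use spaces, tabs, or leading whitespace.

```python
def show_whitespace(line: str) -> str:
    """Return the line with tabs and newlines made visible as escapes."""
    return repr(line)

# Hypothetical line: leading space and a tab between the two words.
sample = " Maximum\tForce 0.009497 0.000450 NO\n"

print(show_whitespace(sample))

# The original test fails for two reasons here: leading space and the tab.
starts_plain = sample.startswith("Maximum Force")

# Splitting on any whitespace ignores both problems.
starts_split = sample.split()[:2] == ["Maximum", "Force"]

print(starts_plain, starts_split)
```

If the repr() output shows leading spaces rather than a tab, the startswith test needs an lstrip() or a whitespace-tolerant match instead.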
Or you can handle either with a regular expression:
if re.match(r'Maximum\s+Force', s):
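The regular-expression idea can be extended to pull out both the Value and Threshold columns and to avoid the ValueError when nothing matches. The following is a sketch under assumptions: the names MAXF_RE, parse_max_force and grep_max_force are hypothetical, the sample lines are made up for illustration, and the pattern assumes plain decimal numbers as in the fragment from the question.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Optional leading whitespace, the two words separated by any whitespace
# (spaces or tabs), then two decimal columns: Value and Threshold.
MAXF_RE = re.compile(r'\s*Maximum\s+Force\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)')

def parse_max_force(line: str) -> Optional[Tuple[float, float]]:
    """Return (value, threshold) for a 'Maximum Force' line, else None."""
    m = MAXF_RE.match(line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

def grep_max_force(lines: Iterable[str], start_step: int = 1) -> List[Tuple[int, float]]:
    """Collect (step, value) pairs, numbering matches from start_step."""
    result = []
    step = start_step
    for line in lines:
        parsed = parse_max_force(line)
        if parsed is not None:
            result.append((step, parsed[0]))
            step += 1
    return result

# Hypothetical sample lines mixing leading spaces and tabs.
sample = [
    " Item Value Threshold Converged?\n",
    " Maximum Force 0.009497 0.000450 NO\n",
    " RMS Force 0.002723 0.000300 NO\n",
    "Maximum\tForce\t0.004000\t0.000450\tNO\n",
]
steps = grep_max_force(sample)

# default=None avoids "min() arg is an empty sequence" when nothing matched.
best = min(steps, key=lambda x: x[1], default=None)
print(steps, best)
```

With default=None the caller can test for None and report which file had no matching lines, instead of crashing inside min().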
CodePudding user response:
An example of reading this using csv (without actually bothering with a Dialect):
import csv
with open(inputfile_name) as infile:
    reader = csv.reader(infile, delimiter=' ', skipinitialspace=True)
    # filter out headers; "row and" skips blank lines, which csv returns as []
    datarows = [row for row in reader if row and row[0] != "Item"]
print(datarows)
Or, if it's tab-delimited:
with open(inputfile_name_t) as infile:
    reader = csv.reader(infile, delimiter='\t', skipinitialspace=True)
    # filter out headers; the length check skips blank or single-field lines
    datarows = [row for row in reader if len(row) > 1 and row[1] != "Item"]
print(datarows)
prints (cleaned up for clarity):
[['Maximum', 'Force', '0.009497', '0.000450', 'NO', ''],
['RMS', 'Force', '0.002723', '0.000300', 'NO', ''],
['Maximum', 'Displacement', '0.247463', '0.001800', 'NO', ''],
['RMS', 'Displacement', '0.065734', '0.001200', 'NO', '']]
That's for the space-delimited version; the tab-delimited version produces slightly different output, but still gets you where you need to go.
You might need to adjust the subscript in the filtering line, depending on whether there's a tab or space at start-of-line.
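To go from those rows to the actual numbers, one option is to drop empty fields and convert the Value and Threshold columns to float. This is a rough sketch, not a definitive parser: the function name value_and_threshold is hypothetical, the text is read from an in-memory string instead of a real file, and it assumes space-delimited rows whose second word is Force or Displacement.

```python
import csv
import io

# In-memory stand-in for the file, based on the fragment in the question.
text = (
    "Item Value Threshold Converged?\n"
    "Maximum Force 0.009497 0.000450 NO\n"
    "RMS Force 0.002723 0.000300 NO\n"
    "\n"
    "SCF Done: E(RPW91-PW91) = -2381.36459172 A.U. after 1 cycles\n"
)

def value_and_threshold(infile):
    """Map 'Maximum Force' etc. to a (value, threshold) pair of floats."""
    reader = csv.reader(infile, delimiter=' ', skipinitialspace=True)
    out = {}
    for row in reader:
        # Drop empty fields left by leading or trailing delimiters.
        fields = [f for f in row if f]
        # Keep only the four convergence rows; skips header, blanks, SCF lines.
        if len(fields) >= 4 and fields[1] in ("Force", "Displacement"):
            out[" ".join(fields[:2])] = (float(fields[2]), float(fields[3]))
    return out

table = value_and_threshold(io.StringIO(text))
print(table["Maximum Force"][0])
```

Filtering on the second field rather than on a fixed header name means the subscript does not need adjusting when a line starts with a space.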