This is not answering my own question, but I'll edit the solution later. This is a follow-up question from topic HERE.
If I use the solution from this post, I get an error:
AttributeError: 'NoneType' object has no attribute 'text'
The values are in the XML file, so I really don't know what to do...
The code:
import pandas as pd
from bs4 import BeautifulSoup
import xml.etree.ElementTree as ET
files = ["S1.xml"]
#files = glob.glob('./*.xml')
all_data = []
for file in files:
    with open(file, "r") as f_in:
        soup = BeautifulSoup(f_in.read(), "xml")
        all_data.append({"file": file, "A": soup.A.text, "Qfl": soup.Qfl.text})
df = pd.DataFrame(all_data).set_index("file")
df.index.name = None
print(df)
A sample of S1.xml is here:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<reiXmlPrenos>
<QNH>24788</QNH>
<QNC>9698</QNC>
<RefKlima>42774.8</RefKlima>
<Qf>255340</Qf>
<Qp>597451</Qp>
<CO2>126660</CO2>
<A>2362.8</A>
<Ht>0.336</Ht>
<f0>0.59</f0>
<z>0.105891</z>
<TP>3300</TP>
<Qfaux>2126</Qfaux>
<Qfh>24065</Qfh>
<Qfc>5345</Qfc>
<Qfv>18177</Qfv>
<Qfst>0</Qfst>
<Qfw>195520</Qfw>
<Qfl>10107</Qfl>
<fOVE>6.4</fOVE>
</reiXmlPrenos>
The error I get:
File "<ipython-input-163-14360bc9577e>", line 1, in <module>
runfile('C:/......py', wdir='....n')
File ".....py", line 827, in runfile
execfile(filename, namespace)
File ".....py", line 110, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File ".....py", line 25, in <module>
all_data.append({"file": file, "A": soup.A.text})
AttributeError: 'NoneType' object has no attribute 'text'
CodePudding user response:
The error does not reproduce with the provided sample, so the problem is probably in one particular file where the A or Qfl element is missing. Adding a try-except helps catch the problematic filename:
import pandas as pd
from bs4 import BeautifulSoup
import xml.etree.ElementTree as ET
files = ["S1.xml"]
#files = glob.glob('./*.xml')
all_data = []
for file in files:
    with open(file, "r") as f_in:
        try:
            soup = BeautifulSoup(f_in.read(), "xml")
            all_data.append({"file": file, "A": soup.A.text, "Qfl": soup.Qfl.text})
        except AttributeError as e:
            print(f'Error: {file}, {e}')
if all_data:
    df = pd.DataFrame(all_data).set_index("file")
    df.index.name = None
    print(df)
A simple way to reproduce the error is to comment out the A element in the provided sample:
<!--<A>2362.8</A>-->
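To see which elements a given file is actually missing, a small check along these lines may help. This is only a sketch: it uses the standard library's ElementTree rather than BeautifulSoup, and the helper name missing_tags and the shortened sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

def missing_tags(xml_text, required):
    """Return the tags from `required` that have no matching element in the document."""
    root = ET.fromstring(xml_text)
    # root.iter(tag) walks the whole tree, so nesting depth does not matter;
    # next(..., None) yields None when no element with that tag exists
    return [tag for tag in required if next(root.iter(tag), None) is None]

# Shortened sample with the A element commented out (hypothetical test input)
sample = """<reiXmlPrenos>
<!--<A>2362.8</A>-->
<Qfl>10107</Qfl>
</reiXmlPrenos>"""

print(missing_tags(sample, ["A", "Qfl"]))  # ['A']
```

Running this on the text of each file before extracting values would show which file lacks which tag, instead of only reporting that something was None.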
CodePudding user response:
Try the below. It uses ElementTree and checks for a missing element before reading its text:
import xml.etree.ElementTree as ET
xml = '''<reiXmlPrenos>
<QNH>24788</QNH>
<QNC>9698</QNC>
<RefKlima>42774.8</RefKlima>
<Qf>255340</Qf>
<Qp>597451</Qp>
<CO2>126660</CO2>
<A>2362.8</A>
<Ht>0.336</Ht>
<f0>0.59</f0>
<z>0.105891</z>
<TP>3300</TP>
<Qfaux>2126</Qfaux>
<Qfh>24065</Qfh>
<Qfc>5345</Qfc>
<Qfv>18177</Qfv>
<Qfst>0</Qfst>
<Qfw>195520</Qfw>
<Qfl>10107</Qfl>
<fOVE>6.4</fOVE>
</reiXmlPrenos>'''
elements = ['A', 'Qfl']
root = ET.fromstring(xml)
for el_str in elements:
    # find() returns None when the element is missing, so check before using .text
    el = root.find(el_str)
    if el is not None:
        print(f'{el_str} --> {el.text}')
    else:
        print(f'{el_str} --> None')
output
A --> 2362.8
Qfl --> 10107
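The same idea can be extended to several files while keeping the table shape from the question. The following is a rough sketch, not a drop-in replacement: extract_row is a hypothetical helper, the XML is passed in as strings to keep the example self-contained, and "S2.xml" is an invented file that lacks the A element.

```python
import xml.etree.ElementTree as ET

def extract_row(name, xml_text, tags):
    """Build one row dict; tags missing from the document map to None."""
    root = ET.fromstring(xml_text)
    row = {"file": name}
    for tag in tags:
        # findtext returns the text of the first matching direct child,
        # or the default when there is no such child
        row[tag] = root.findtext(tag, default=None)
    return row

docs = {
    "S1.xml": "<reiXmlPrenos><A>2362.8</A><Qfl>10107</Qfl></reiXmlPrenos>",
    # hypothetical second file without the A element
    "S2.xml": "<reiXmlPrenos><Qfl>10107</Qfl></reiXmlPrenos>",
}

rows = [extract_row(name, text, ["A", "Qfl"]) for name, text in docs.items()]
print(rows)
```

The resulting list of dicts can be handed to pd.DataFrame(rows).set_index("file") as in the question; a missing element then shows up as an empty cell instead of raising AttributeError. For files on disk, ET.parse(path).getroot() gives the same kind of root element.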