Script doesn't recognize values in XML

Time:10-21

This is not an answer to my own question; I'll edit the solution in later. It is a follow-up question to the topic HERE.

If I use the solution from that post, I get an error:

AttributeError: 'NoneType' object has no attribute 'text'

The values are in the XML file, so I really don't know what to do.

The code:

import pandas as pd
from bs4 import BeautifulSoup
import xml.etree.ElementTree as ET

files = ["S1.xml"]


#files = glob.glob('./*.xml')

all_data = []
for file in files:
    with open(file, "r") as f_in:
        soup = BeautifulSoup(f_in.read(), "xml")
        all_data.append({"file": file, "A": soup.A.text, "Qfl": soup.Qfl.text})

df = pd.DataFrame(all_data).set_index("file")
df.index.name = None
print(df)

A sample of S1.xml is here:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<reiXmlPrenos>
  <QNH>24788</QNH>
  <QNC>9698</QNC>
  <RefKlima>42774.8</RefKlima>
  <Qf>255340</Qf>
  <Qp>597451</Qp>
  <CO2>126660</CO2>
  <A>2362.8</A>
  <Ht>0.336</Ht>
  <f0>0.59</f0>
  <z>0.105891</z>
  <TP>3300</TP>
  <Qfaux>2126</Qfaux>
  <Qfh>24065</Qfh>
  <Qfc>5345</Qfc>
  <Qfv>18177</Qfv>
  <Qfst>0</Qfst>
  <Qfw>195520</Qfw>
  <Qfl>10107</Qfl>
  <fOVE>6.4</fOVE>
</reiXmlPrenos>

The error I get:

  File "<ipython-input-163-14360bc9577e>", line 1, in <module>
    runfile('C:/......py', wdir='....n')

  File ".....py", line 827, in runfile
    execfile(filename, namespace)

  File ".....py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File ".....py", line 25, in <module>
    all_data.append({"file": file, "A": soup.A.text})

AttributeError: 'NoneType' object has no attribute 'text'

CodePudding user response:

The error does not reproduce with the provided sample, so it could be a problem with one particular file. Wrapping the lookup in a try-except makes it possible to catch the exception and print the name of the problematic file:

import pandas as pd
from bs4 import BeautifulSoup
import xml.etree.ElementTree as ET

files = ["S1.xml"]


#files = glob.glob('./*.xml')

all_data = []
for file in files:
    with open(file, "r") as f_in:
        try:
            soup = BeautifulSoup(f_in.read(), "xml")
            all_data.append({"file": file, "A": soup.A.text, "Qfl": soup.Qfl.text})
        except AttributeError as e:
            print(f'Error: {file}, {e}')

if all_data:
    df = pd.DataFrame(all_data).set_index("file")
    df.index.name = None
    print(df)

A simple way to reproduce the error is to comment out the A element in the provided sample:

  <!--<A>2362.8</A>-->
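
To see why a missing element produces exactly this message, here is a minimal sketch that uses only the standard library instead of BeautifulSoup. The trimmed XML string and the variable names are assumptions made for illustration, not part of the original script. The lookup for the absent tag returns None, and reading .text on None raises the AttributeError from the question.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample with the A element commented out (illustrative only).
xml_text = """<reiXmlPrenos>
  <!--<A>2362.8</A>-->
  <Qfl>10107</Qfl>
</reiXmlPrenos>"""

root = ET.fromstring(xml_text)

a_element = root.find("A")      # None: the default parser skips XML comments
qfl_element = root.find("Qfl")  # an Element: the tag is present

try:
    a_text = a_element.text     # attribute access on None fails here
except AttributeError as e:
    a_text = None
    error_message = str(e)

print(a_element, qfl_element.text, error_message)
```

So the message usually means that the tag was not found in the file being parsed, not that the value itself could not be read.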

CodePudding user response:

Try the following:

import xml.etree.ElementTree as ET

xml = '''<reiXmlPrenos>
  <QNH>24788</QNH>
  <QNC>9698</QNC>
  <RefKlima>42774.8</RefKlima>
  <Qf>255340</Qf>
  <Qp>597451</Qp>
  <CO2>126660</CO2>
  <A>2362.8</A>
  <Ht>0.336</Ht>
  <f0>0.59</f0>
  <z>0.105891</z>
  <TP>3300</TP>
  <Qfaux>2126</Qfaux>
  <Qfh>24065</Qfh>
  <Qfc>5345</Qfc>
  <Qfv>18177</Qfv>
  <Qfst>0</Qfst>
  <Qfw>195520</Qfw>
  <Qfl>10107</Qfl>
  <fOVE>6.4</fOVE>
</reiXmlPrenos>'''
elements = ['A', 'Qfl']
root = ET.fromstring(xml)
for el_str in elements:
    el = root.find(el_str)
    if el is not None:
        print(f'{el_str} --> {el.text}')
    else:
        print(f'{el_str} --> None')

Output:

A --> 2362.8
Qfl --> 10107
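
If the goal is one row per file, as in the question, the same check can be folded into a small helper. This is only a sketch: extract_values is a hypothetical name, and it assumes the wanted elements are direct children of the root, as they are in the sample. Element.findtext returns None when a tag is missing, so no exception is raised.

```python
import xml.etree.ElementTree as ET

def extract_values(xml_text, tags):
    """Return {tag: text or None} for direct children of the root element."""
    root = ET.fromstring(xml_text)
    # findtext gives the element's text, or None when the tag is not found
    return {tag: root.findtext(tag) for tag in tags}

# Shortened sample in which Qfl is deliberately missing (illustrative only).
sample = "<reiXmlPrenos><A>2362.8</A><Ht>0.336</Ht></reiXmlPrenos>"
row = extract_values(sample, ["A", "Qfl"])
print(row)
```

Each returned dict can be given a "file" key and appended to all_data, then passed to pd.DataFrame as in the question. A None value in the resulting table points to the file where the element is missing.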