How to find specific element in parsed xml?

Time:10-04

I am new to XML and to Python's xml.etree.ElementTree library. I am trying to extract "item_desc" and "current_price" from the XML data below, under the "Items" element. To parse the XML I followed this post. The code below also returns data for "Current Volume" and "Current Cutout Value/Change". I am not sure what is wrong with it; any help is appreciated.

Thanks in advance for your time and efforts.

XML Data

b'<?xml version=\'1.0\' encoding=\'UTF-8\'?><results exportTime="2021-10-03 09:12:32 CDT"><report label="National Daily Cutter Cow Cutout and Boxed Cow Beef Cuts - Negotiated - Afternoon" slug="LM_XB405"><record report_date="09/24/2021" narrative="null"><report label="Current Cutout Value/Change"><record current_cutout_value="232.67" change_cutout_value="-.03"/></report><report label="Items"><record item_desc="90% lean  " current_price="276.93" current_value="154.64" change_value=".00"/><record item_desc="100% lean inside round  " current_price="563.93" current_value="13.08" change_value=".09"/><record item_desc="100% lean, flats and eyes  " current_price="429.52" current_value="9.96" change_value=".00"/><record item_desc="100% lean, S.P.B.  " current_price="431.54" current_value="21.58" change_value=".00"/><record item_desc="Chuck Tender  " current_price="311.27" current_value="3.11" change_value="-.02"/><record item_desc="Knuckle  4-7 lbs." current_price="323.60" current_value="8.19" change_value=".01"/><record item_desc="Tenderloin  2-3 lbs." current_price="442.55" current_value="2.35" change_value="-.06"/><record item_desc="Tenderloin  3-4 lbs." current_price="663.86" current_value="4.91" change_value=".00"/><record item_desc="Ribeye Roll  4-6 lbs." current_price="405.00" current_value="4.25" change_value=".00"/><record item_desc="Ribeye Roll  6-8 lbs." current_price="392.05" current_value="2.90" change_value="-.06"/><record item_desc="Ribeye Roll  8up lbs." 
current_price="435.00" current_value="3.18" change_value=".00"/><record item_desc="Flank Steak  " current_price="367.34" current_value="1.54" change_value=".01"/><record item_desc="Kidney, Edible  " current_price="40.00" current_value=".18" change_value=".00"/><record item_desc="Fat, inedible  " current_price="44.72" current_value="1.04" change_value=".00"/><record item_desc="Bone  " current_price="7.32" current_value="1.76" change_value=".00"/></report><report label="Current Volume"><record boner_volume_loads="17.31" cutter_volume_loads="4.11" bbcc_volume_loads="19.47" lean_volume_loads="19.09" frozen_volume_loads="4.61" boner_volume_pounds="692,580" cutter_volume_pounds="164,565" bbcc_volume_pounds="778,700" lean_volume_pounds="763,409" frozen_volume_pounds="184,527"/></report>

Current Code:

for child in root1.iter('record'):
    print(child.attrib.get('item_desc'))

CodePudding user response:

There is something wrong with the format of your XML: as posted, the first <record> element and the enclosing <report> and <results> elements are never closed. The following code should work if your file is well-formed.

First we load the XML tree:

import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
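If you are not sure whether the data is well-formed, one way to check is to catch the ParseError that ElementTree raises. This is only a minimal sketch: the truncated sample and the try_parse helper are made up for illustration and are not part of the original data.

```python
import xml.etree.ElementTree as ET

# Hypothetical truncated sample: <record>, <report> and <results> are never closed.
broken = b'<results><report label="Items"><record item_desc="Bone  " current_price="7.32">'

def try_parse(data):
    """Return the root element, or None if the data is not well-formed XML."""
    try:
        return ET.fromstring(data)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple pointing at the failure
        print("XML is not well-formed:", err, "at", err.position)
        return None

root = try_parse(broken)
```

The same exception is raised by ET.parse when reading from a file, so the try/except can be wrapped around that call instead.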

Search for the items node:

item = tree.find(".//report[@label='Items']")

Do whatever with the records in item:

for r in item:
    print(r.attrib["item_desc"].ljust(30), r.attrib["current_price"])

90% lean                       276.93
100% lean inside round         563.93
100% lean, flats and eyes      429.52
100% lean, S.P.B.              431.54
Chuck Tender                   311.27
Knuckle  4-7 lbs.              323.60
Tenderloin  2-3 lbs.           442.55
Tenderloin  3-4 lbs.           663.86
Ribeye Roll  4-6 lbs.          405.00
Ribeye Roll  6-8 lbs.          392.05
Ribeye Roll  8up lbs.          435.00
Flank Steak                    367.34
Kidney, Edible                 40.00
Fat, inedible                  44.72
Bone                           7.32
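As a variation on the loop above, the path can select the records directly, and the values can be cleaned up along the way. This is a sketch under assumptions: xml_data is a shortened, well-formed stand-in for the data in the question, and the prices dictionary is just one possible way to hold the result.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the data in the question.
xml_data = b'''<results>
  <report label="Current Cutout Value/Change">
    <record current_cutout_value="232.67" change_cutout_value="-.03"/>
  </report>
  <report label="Items">
    <record item_desc="90% lean  " current_price="276.93"/>
    <record item_desc="Bone  " current_price="7.32"/>
  </report>
</results>'''

root = ET.fromstring(xml_data)

# Select only the <record> children of the report labelled "Items".
records = root.findall(".//report[@label='Items']/record")

# Strip the padding from item_desc and convert the price to a number.
prices = {r.get("item_desc", "").strip(): float(r.get("current_price")) for r in records}
print(prices)
```

Records from the other reports never enter the loop, so there is no need to filter out missing attributes afterwards.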

CodePudding user response:

First of all, the XML you posted is not well-formed. I fixed it by closing the first <record>, its <report>, and the root <results> element:

b'<?xml version=\'1.0\' encoding=\'UTF-8\'?><results exportTime="2021-10-03 09:12:32 CDT"><report label="National Daily Cutter Cow Cutout and Boxed Cow Beef Cuts - Negotiated - Afternoon" slug="LM_XB405"><record report_date="09/24/2021" narrative="null"/></report><report label="Current Cutout Value/Change"><record current_cutout_value="232.67" change_cutout_value="-.03"/></report><report label="Items"><record item_desc="90% lean  " current_price="276.93" current_value="154.64" change_value=".00"/><record item_desc="100% lean inside round  " current_price="563.93" current_value="13.08" change_value=".09"/><record item_desc="100% lean, flats and eyes  " current_price="429.52" current_value="9.96" change_value=".00"/><record item_desc="100% lean, S.P.B.  " current_price="431.54" current_value="21.58" change_value=".00"/><record item_desc="Chuck Tender  " current_price="311.27" current_value="3.11" change_value="-.02"/><record item_desc="Knuckle  4-7 lbs." current_price="323.60" current_value="8.19" change_value=".01"/><record item_desc="Tenderloin  2-3 lbs." current_price="442.55" current_value="2.35" change_value="-.06"/><record item_desc="Tenderloin  3-4 lbs." current_price="663.86" current_value="4.91" change_value=".00"/><record item_desc="Ribeye Roll  4-6 lbs." current_price="405.00" current_value="4.25" change_value=".00"/><record item_desc="Ribeye Roll  6-8 lbs." current_price="392.05" current_value="2.90" change_value="-.06"/><record item_desc="Ribeye Roll  8up lbs." 
current_price="435.00" current_value="3.18" change_value=".00"/><record item_desc="Flank Steak  " current_price="367.34" current_value="1.54" change_value=".01"/><record item_desc="Kidney, Edible  " current_price="40.00" current_value=".18" change_value=".00"/><record item_desc="Fat, inedible  " current_price="44.72" current_value="1.04" change_value=".00"/><record item_desc="Bone  " current_price="7.32" current_value="1.76" change_value=".00"/></report><report label="Current Volume"><record boner_volume_loads="17.31" cutter_volume_loads="4.11" bbcc_volume_loads="19.47" lean_volume_loads="19.09" frozen_volume_loads="4.61" boner_volume_pounds="692,580" cutter_volume_pounds="164,565" bbcc_volume_pounds="778,700" lean_volume_pounds="763,409" frozen_volume_pounds="184,527"/></report></results>'  

To get what you expect, iterate over the children of the root and process only the one whose label attribute equals "Items".

import xml.etree.ElementTree as et
root1 = et.fromstring(b'<?xml version=\'1.0\' encoding=\'UTF-8\'?><results exportTime="2021-10-03 09:12:32 CDT"><report label="National Daily Cutter Cow Cutout and Boxed Cow Beef Cuts - Negotiated - Afternoon" slug="LM_XB405"><record report_date="09/24/2021" narrative="null"/></report><report label="Current Cutout Value/Change"><record current_cutout_value="232.67" change_cutout_value="-.03"/></report><report label="Items"><record item_desc="90% lean  " current_price="276.93" current_value="154.64" change_value=".00"/><record item_desc="100% lean inside round  " current_price="563.93" current_value="13.08" change_value=".09"/><record item_desc="100% lean, flats and eyes  " current_price="429.52" current_value="9.96" change_value=".00"/><record item_desc="100% lean, S.P.B.  " current_price="431.54" current_value="21.58" change_value=".00"/><record item_desc="Chuck Tender  " current_price="311.27" current_value="3.11" change_value="-.02"/><record item_desc="Knuckle  4-7 lbs." current_price="323.60" current_value="8.19" change_value=".01"/><record item_desc="Tenderloin  2-3 lbs." current_price="442.55" current_value="2.35" change_value="-.06"/><record item_desc="Tenderloin  3-4 lbs." current_price="663.86" current_value="4.91" change_value=".00"/><record item_desc="Ribeye Roll  4-6 lbs." current_price="405.00" current_value="4.25" change_value=".00"/><record item_desc="Ribeye Roll  6-8 lbs." current_price="392.05" current_value="2.90" change_value="-.06"/><record item_desc="Ribeye Roll  8up lbs." current_price="435.00" current_value="3.18" change_value=".00"/><record item_desc="Flank Steak  " current_price="367.34" current_value="1.54" change_value=".01"/><record item_desc="Kidney, Edible  " current_price="40.00" current_value=".18" change_value=".00"/><record item_desc="Fat, inedible  " current_price="44.72" current_value="1.04" change_value=".00"/><record item_desc="Bone  " current_price="7.32" current_value="1.76" change_value=".00"/></report><report label="Current Volume"><record boner_volume_loads="17.31" cutter_volume_loads="4.11" bbcc_volume_loads="19.47" lean_volume_loads="19.09" frozen_volume_loads="4.61" boner_volume_pounds="692,580" cutter_volume_pounds="164,565" bbcc_volume_pounds="778,700" lean_volume_pounds="763,409" frozen_volume_pounds="184,527"/></report></results>')
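# Visit each top-level <report> and skip everything except "Items"
for child in root1:
    if child.attrib.get('label') != "Items":
        continue
    # Print the description of every <record> in the Items report
    for record in child.iter('record'):
        print(record.attrib.get('item_desc'))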

The result will be:

90% lean  
100% lean inside round     
100% lean, flats and eyes  
100% lean, S.P.B.
Chuck Tender
Knuckle  4-7 lbs.
Tenderloin  2-3 lbs.       
Tenderloin  3-4 lbs.       
Ribeye Roll  4-6 lbs.      
Ribeye Roll  6-8 lbs.      
Ribeye Roll  8up lbs.      
Flank Steak
Kidney, Edible
Fat, inedible
Bone
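Note that the loop over the root's children only looks one level down. The unclosed <record> in the question suggests the real feed may nest the section reports inside that first record; that is a guess, not something the posted data confirms. A sketch that works at any depth, assuming a hypothetical nested sample and a made-up helper named item_rows:

```python
import xml.etree.ElementTree as ET

# Hypothetical nested layout: the section reports sit inside the first <record>.
nested = b'''<results>
  <report label="Daily report" slug="LM_XB405">
    <record report_date="09/24/2021">
      <report label="Items">
        <record item_desc="Bone  " current_price="7.32"/>
      </report>
      <report label="Current Volume">
        <record boner_volume_loads="17.31"/>
      </report>
    </record>
  </report>
</results>'''

root = ET.fromstring(nested)

def item_rows(root):
    """Yield (description, price) pairs from every report labelled "Items"."""
    # iter() walks the whole tree, so the nesting depth does not matter.
    for report in root.iter("report"):
        if report.get("label") != "Items":
            continue
        # findall("record") returns direct children only, so records
        # belonging to other reports are never picked up.
        for record in report.findall("record"):
            yield record.get("item_desc", "").strip(), record.get("current_price")

rows = list(item_rows(root))
print(rows)
```

The same function also handles the flattened XML above, because root.iter("report") visits top-level reports as well.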