I am working with python, I need to move my xml to csv file, right now I have an issue with the tag "customfields" can be more than one tag with that name and inside each customfields tag I need to add fieldName and Value.
My code works but for fieldName and Value variables only store the last tag captured.
My result should be like these
ID. | abbreviation | active | address. | fieldName | value. |
---|---|---|---|---|---|
2424 | X1C-02220 | true. | my address | Company | |
2424 | X1C-02220 | true. | my address | Org | Tiber |
But now is looking like these
ID. | abbreviation | active | address. | fieldName | value. |
---|---|---|---|---|---|
2424 | X1C-02220 | true. | my address | Org | Tiber |
This is my xml:
<?xml version="1.0" ?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<ns1:getListResponse xmlns:ns1="http://service.web.propertyware.realpage.com" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<getListReturn xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" soapenc:arrayType="xsd:anyType[824]" xsi:type="soapenc:Array">
<getListReturn xmlns:ns2="urn:PWServices" xsi:type="ns2:Unit">
<ID xsi:type="xsd:long">2424</ID>
<abbreviation xsi:type="xsd:string">X1C-02220</abbreviation>
<active xsi:type="xsd:boolean">true</active>
<address xsi:type="xsd:string">my address</address>
<customFields soapenc:arrayType="ns2:CustomField[129]" xsi:type="soapenc:Array">
<customFields xsi:type="ns2:CustomField">
<ID xsi:type="xsd:long">0</ID>
<clientData xsi:type="soapenc:Array" xsi:nil="true"/>
<dataType xsi:type="xsd:string">Text</dataType>
<definitionID xsi:type="xsd:long">0</definitionID>
<fieldName xsi:type="xsd:string">Company</fieldName>
<value xsi:type="xsd:string"/>
</customFields>
<customFields xsi:type="ns2:CustomField">
<fieldName xsi:type="xsd:string">Org</fieldName>
<value xsi:type="xsd:string">Tiber</value>
</customFields>
</customFields>
</getListReturn>
</getListReturn>
</ns1:getListResponse>
</soapenv:Body>
</soapenv:Envelope>
Here is my code
import xml.etree.ElementTree as ET
import pandas as pd
import csv
root = ET.parse('build.xml') # replace file name
cols = ["ID","abbreviation","active","address"
,"fieldName","value"
]
rows = []
branches = root.findall('.//getListReturn/getListReturn')
for branch in branches:
ID = branch.find('ID').text
abbreviation= branch.find('abbreviation').text
active= branch.find('active').text
address= branch.find('address').text
customs=branch.findall('.//customFields/customFields')
for custom in customs:
fieldName=custom.find('fieldName').text
value=custom.find('value').text
rows.append({"ID":ID,"abbreviation":abbreviation,
"active":active,"address":address
,"fieldName":fieldName,"value":value
})
df = pd.DataFrame(rows, columns=cols)
df.to_csv('output.csv')
How to store all the data from "customfields" tag?
Regards
CodePudding user response:
Simply indent the rows.append
inside the inner loop to capture all information under child element:
...
for custom in customs:
fieldName = custom.find('fieldName').text
value = custom.find('value').text
rows.append({
"ID":ID, "abbreviation":abbreviation,
"active":active, "address":address,
"fieldName":fieldName, "value":value
})
Consider also, list/dict comprehensions with a dictionary merge to combine both parent/child levels:
root = ET.parse('build.xml')
rows = [
{
**{
"ID": branch.find('abbreviation').text,
"abbreviation": branch.find('abbreviation').text,
"active": branch.find('active').text,
"address": branch.find('address').text,
"fieldName":fieldName,
"value":value
},
**{
"fieldName": custom.find('fieldName').text,
"value": custom.find('value').text
}
}
for branch in root.findall('.//getListReturn/getListReturn')
for custom in branch.findall('.//customFields/customFields')
]
df = pd.DataFrame(rows, columns=cols)
df.to_csv('output.csv')