Is there a way to concatenate the text from tags that share the same name?
Example XML:
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor>"Austria"</neighbor>
        <neighbor>"Switzerland"</neighbor>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor>"Malaysia"</neighbor>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor>"Costa Rica"</neighbor>
        <neighbor>"Colombia"</neighbor>
    </country>
</data>
This is what I have so far:
from xml.etree import ElementTree as ET
tree = ET.parse('sample.xml')
root = tree.getroot()
for acct_det in root.iter('neighbor'):
    print(acct_det.text)
What I want to do is build one concatenated string per country from its
neighbor tags:
Austria Switzerland
Malaysia
Costa Rica Colombia
I'm having trouble finding a solution to accomplish this.
CodePudding user response:
from lxml import etree
tree = etree.parse('tmp.xml')
slist = tree.xpath('//country')
for d in slist:
    # concat() here joins only the first two neighbor elements of each country
    print(d.xpath('concat(./neighbor[1]/text(), " ", ./neighbor[2]/text())'))
Result (the double quotes are part of the element text in the sample XML):
"Austria" "Switzerland"
"Malaysia"
"Costa Rica" "Colombia"
CodePudding user response:
from xml.etree import ElementTree as ET
tree = ET.parse("sample.xml")
root = tree.getroot()
for country in root:
    neighbors = " ".join([n.text.strip('"') for n in country.findall("neighbor")])
    print(neighbors)
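If you also want to know which country each joined string belongs to, here is a minimal sketch along the same ElementTree lines. The function name neighbors_by_country and the inlined SAMPLE_XML are my own additions for illustration (they are not from the question); the XML is a trimmed copy of the sample so the snippet runs without a file. It assumes every country element has a name attribute and treats an empty neighbor tag as "nothing to add".

```python
from xml.etree import ElementTree as ET

# Trimmed copy of the sample XML from the question, inlined so no file is needed.
SAMPLE_XML = """<data>
  <country name="Liechtenstein">
    <neighbor>"Austria"</neighbor>
    <neighbor>"Switzerland"</neighbor>
  </country>
  <country name="Singapore">
    <neighbor>"Malaysia"</neighbor>
  </country>
  <country name="Panama">
    <neighbor>"Costa Rica"</neighbor>
    <neighbor>"Colombia"</neighbor>
  </country>
</data>"""


def neighbors_by_country(xml_text):
    """Map each country's name attribute to its neighbor texts joined by spaces.

    Hypothetical helper for illustration; not part of the original thread.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for country in root.iter("country"):
        names = [
            # n.text is None for an empty tag; also drop the literal double quotes
            (n.text or "").strip().strip('"')
            for n in country.findall("neighbor")
        ]
        # Skip empty entries so they do not leave stray spaces in the output
        result[country.get("name")] = " ".join(name for name in names if name)
    return result


for name, joined in neighbors_by_country(SAMPLE_XML).items():
    print(f"{name}: {joined}")
```

To read from a file instead, something like neighbors_by_country(open("sample.xml").read()) should behave the same way, since only the parsing entry point changes.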