XML, append correct tag to another tag with Python

Time:08-18

I'm new to Python and am using BeautifulSoup 4 (beautifulsoup4). My XML is:


<?xml version="1.0" encoding="utf-8"?>

 <database name="test_testdatabase">

   <table name="products">
     <column name="product_id"> x1x </column>
   </table>

   <table name="products_en_gb">
    <column  name="product_name"> Some name 1 </column >
    <column  name="product_s_desc"> Some short description 1 </column >
  </table>
  
  <table name="products">
   <column name="product_id"> 2xx </column>
  </table>

  <table name="products_en_gb">
   <column  name="product_name"> Second product name 2 </column >
   <column  name="product_s_desc"> Second short description 2 </column >
  </table>

</database>

The XML contains more than 5000 products, all following this same pattern.

I would like to add the column tag with name="product_id" to the table with name="products_en_gb", while keeping the existing pattern.

So the first id goes into the first table, the second id into the second table, and so on.

I have tried many ways to do this. I have had the most success with this code:

#test.py

product_id = soup.findAll(attrs={"name": ["product_id"]})


for products_en_gb in soup.findAll(attrs={"name": ["products_en_gb"]}):
    products_en_gb.contents.append(product_id[0])

The problem is that if I use product_id[0], one tag is appended, but it is always the same first tag in the sequence for every table; if I use product_id, all of the tags are appended to every table. My desired result is the following:

<?xml version="1.0" encoding="utf-8"?>

 <database name="test_testdatabase">

   <table name="products">
     <column name="product_id"> x1x </column>
   </table>

   <table name="products_en_gb">
    <column name="product_id"> x1x </column>
    <column  name="product_name"> Some name 1 </column >
    <column  name="product_s_desc"> Some short description 1 </column >
  </table>
  
  <table name="products">
   <column name="product_id"> 2xx </column>
  </table>

  <table name="products_en_gb">
   <column name="product_id"> 2xx </column>
   <column  name="product_name"> Second product name 2 </column >
   <column  name="product_s_desc"> Second short description 2 </column >
  </table>

</database>

I hope someone could help.

Thank you.

Answer:

Try:

from bs4 import BeautifulSoup

xml_doc = """\
<?xml version="1.0" encoding="utf-8"?>

 <database name="test_testdatabase">

   <table name="products">
     <column name="product_id"> x1x </column>
   </table>

   <table name="products_en_gb">
    <column  name="product_name"> Some name 1 </column >
    <column  name="product_s_desc"> Some short description 1 </column >
  </table>
  
  <table name="products">
   <column name="product_id"> 2xx </column>
  </table>

  <table name="products_en_gb">
   <column  name="product_name"> Second product name 2 </column >
   <column  name="product_s_desc"> Second short description 2 </column >
  </table>

</database>"""

soup = BeautifulSoup(xml_doc, "xml")

for table in soup.select('table[name="products_en_gb"]'):
    prev_products = table.find_previous("table", attrs={"name": "products"})
    content = "\n".join(map(str, prev_products.contents)).strip()
    table.insert(0, BeautifulSoup("\n" + content, "html.parser"))

print(soup)

Prints:

<?xml version="1.0" encoding="utf-8"?>
<database name="test_testdatabase">
<table name="products">
<column name="product_id"> x1x </column>
</table>
<table name="products_en_gb">
<column name="product_id"> x1x </column>
<column name="product_name"> Some name 1 </column>
<column name="product_s_desc"> Some short description 1 </column>
</table>
<table name="products">
<column name="product_id"> 2xx </column>
</table>
<table name="products_en_gb">
<column name="product_id"> 2xx </column>
<column name="product_name"> Second product name 2 </column>
<column name="product_s_desc"> Second short description 2 </column>
</table>
</database>
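
If BeautifulSoup is not a hard requirement, the "first id to first table, second id to second table" pairing can also be sketched with the standard library's xml.etree.ElementTree. This is a different technique from the answer above, not a drop-in replacement for it. It assumes the "products" and "products_en_gb" tables strictly alternate, so both lists have the same length, and it uses a shortened copy of the sample XML without the XML declaration. The variable names (id_tables, en_tables, id_copy) are illustrative only.

```python
import copy
import xml.etree.ElementTree as ET

# Shortened sample in the same pattern as the question's XML.
xml_doc = """<database name="test_testdatabase">
  <table name="products">
    <column name="product_id"> x1x </column>
  </table>
  <table name="products_en_gb">
    <column name="product_name"> Some name 1 </column>
    <column name="product_s_desc"> Some short description 1 </column>
  </table>
  <table name="products">
    <column name="product_id"> 2xx </column>
  </table>
  <table name="products_en_gb">
    <column name="product_name"> Second product name 2 </column>
    <column name="product_s_desc"> Second short description 2 </column>
  </table>
</database>"""

root = ET.fromstring(xml_doc)

# Collect both kinds of table in document order.
id_tables = root.findall("table[@name='products']")
en_tables = root.findall("table[@name='products_en_gb']")

# Assumption: every "products" table has exactly one matching
# "products_en_gb" table, so pairing by position is safe.
if len(id_tables) != len(en_tables):
    raise ValueError("table counts differ; positional pairing is not safe")

for id_table, en_table in zip(id_tables, en_tables):
    id_column = id_table.find("column[@name='product_id']")
    # Insert a copy so the original column stays in its own table.
    id_copy = copy.deepcopy(id_column)
    # Reuse the target table's leading whitespace to keep the indentation.
    id_copy.tail = en_table.text
    en_table.insert(0, id_copy)

result = ET.tostring(root, encoding="unicode")
print(result)
```

Copying the column is the important part: inserting the same element object into several tables, as with product_id[0] in the question, would reuse one tag instead of giving each table its own id.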