Python 3.6 - Parse over GET Response

Time:09-09

I have a requests call that returns loosely formatted XML data like this:

<info>
 <stats vol="545080705" orders="718021755"/>
  <symbols timestamp="2022-09-08 19:56:37" count="11394">
   <symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
   <symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
   <symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
   <symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
   <symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />

I'm having difficulty figuring out how to convert this into a dictionary, a DataFrame, or some other object that I can loop over to extract the name, vol, last, and matched attributes.

Any help would be awesome.

CodePudding user response:

You can parse this XML as follows, although the best approach depends on what you need from it:

from io import StringIO

from bs4 import BeautifulSoup as bs
import pandas as pd

response = """<info>
 <stats vol="545080705" orders="718021755"/>
  <symbols timestamp="2022-09-08 19:56:37" count="11394">
    <symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
    <symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
    <symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
    <symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
    <symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
   </symbols></info>"""

# Parse with BeautifulSoup's XML parser (requires lxml to be installed)
content = bs(response, "lxml-xml")

# Wrap the serialized XML in StringIO: passing a literal XML string to
# read_xml is deprecated in recent pandas versions
df = pd.read_xml(StringIO(str(content)), xpath="//symbol")
print(df)

Output:

    last    matched name    vol
0   28.23   8382339 TQQQ    8700394
1   401.00  8209174 SPY     8571092
2   44.39   6734334 SQQQ    7091770
3   0.17    6469576 AVCT    6493626
4   9.42    6142800 UVXY    6158364

CodePudding user response:

There are many ways to parse and convert XML. Here is one of them, using BeautifulSoup:

doc = '''
<info>
 <stats vol="545080705" orders="718021755"/>
  <symbols timestamp="2022-09-08 19:56:37" count="11394">
   <symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
   <symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
   <symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
   <symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
   <symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
  </symbols>
</info>
'''

from bs4 import BeautifulSoup
soup = BeautifulSoup(doc, 'lxml-xml')

for sym in soup.find_all('symbol'):
    print("-"*32)
    print(sym.get("name"))
    print(sym.get("vol"))
    print(sym.get("last"))
    print(sym.get("matched"))
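The same loop can also collect the attributes into plain Python objects instead of printing them, which is closer to what the question asks for. The sketch below is one possible approach using only the standard library's xml.etree.ElementTree rather than BeautifulSoup. The names parse_symbols, xml_text, rows, and by_name are made up for illustration, and the sample is shortened to two symbols with the closing tags added so that it is well-formed.

```python
import xml.etree.ElementTree as ET

# Shortened sample based on the question's data, with closing tags added
# so that the document is well-formed XML.
xml_text = """<info>
 <stats vol="545080705" orders="718021755"/>
  <symbols timestamp="2022-09-08 19:56:37" count="11394">
   <symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
   <symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
  </symbols>
</info>"""


def parse_symbols(text):
    """Return one dict per <symbol> element, converting numeric attributes."""
    root = ET.fromstring(text)
    rows = []
    for sym in root.iter("symbol"):
        rows.append({
            "name": sym.get("name"),
            "vol": int(sym.get("vol")),
            "last": float(sym.get("last")),
            "matched": int(sym.get("matched")),
        })
    return rows


# A list of dicts to loop over, plus a lookup table keyed by symbol name.
rows = parse_symbols(xml_text)
by_name = {row["name"]: row for row in rows}

for row in rows:
    print(row["name"], row["vol"], row["last"], row["matched"])
```

With a live request, the same function should accept the response body directly, for example parse_symbols(response.content), assuming response is the object returned by requests.get and the body is complete, well-formed XML. Unlike the lxml-based parser used above, ElementTree raises an error on truncated input rather than recovering from it.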

If you want more ways you can check this link
