I have a requests call that returns XML data like this:
<info>
<stats vol="545080705" orders="718021755"/>
<symbols timestamp="2022-09-08 19:56:37" count="11394">
<symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
<symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
<symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
<symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
<symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
I'm having difficulty figuring out how to convert this into a dictionary, a DataFrame, or some other object that I can loop over to extract the name, vol, last and matched attributes of each symbol.
Any help would be awesome.
CodePudding user response:
You can parse this as follows, but it depends on what you need from it:
from io import StringIO

from bs4 import BeautifulSoup as bs
import pandas as pd

response = """<info>
<stats vol="545080705" orders="718021755"/>
<symbols timestamp="2022-09-08 19:56:37" count="11394">
<symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
<symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
<symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
<symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
<symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
</symbols></info>"""

# Parse with BeautifulSoup's XML parser, then load every <symbol> element into a DataFrame.
content = bs(response, "lxml-xml")
# Recent pandas versions expect a file-like object here rather than a literal XML string.
df = pd.read_xml(StringIO(str(content)), xpath="//symbol")
print(df)
Output:
     last  matched  name      vol
0   28.23  8382339  TQQQ  8700394
1  401.00  8209174   SPY  8571092
2   44.39  6734334  SQQQ  7091770
3    0.17  6469576  AVCT  6493626
4    9.42  6142800  UVXY  6158364
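Since the question also asks for something to loop over, here is a minimal sketch of going from the DataFrame to plain dictionaries and iterating over the rows. It skips the BeautifulSoup step and passes parser="etree" so that pandas uses the standard-library XML parser; both of those are choices made for this sketch, not requirements. The variable name xml_text is just a stand-in for your response text, and the XML is assumed to be complete and well-formed.

```python
from io import StringIO

import pandas as pd

# Stand-in for the text of the requests response (assumed complete, well-formed XML).
xml_text = """<info>
<stats vol="545080705" orders="718021755"/>
<symbols timestamp="2022-09-08 19:56:37" count="11394">
<symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
<symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
<symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
<symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
<symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
</symbols>
</info>"""

# parser="etree" uses the standard-library parser, so lxml is not needed for this sketch.
df = pd.read_xml(StringIO(xml_text), xpath=".//symbol", parser="etree")

# One dictionary per <symbol> element, keyed by attribute name.
records = df.to_dict("records")

# Or loop over the rows directly; each row exposes the columns as attributes.
for row in df.itertuples(index=False):
    print(row.name, row.vol, row.last, row.matched)
```

With a real response you would pass the response text in place of xml_text. pandas infers numeric types for vol, last and matched, so there is no manual conversion here.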
CodePudding user response:
There are many ways to parse and convert XML. Here is one way using BeautifulSoup:
doc = '''
<info>
<stats vol="545080705" orders="718021755"/>
<symbols timestamp="2022-09-08 19:56:37" count="11394">
<symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
<symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
<symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
<symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
<symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
</symbols>
</info>
'''
from bs4 import BeautifulSoup

soup = BeautifulSoup(doc, 'lxml-xml')

# Print the four attributes of every <symbol> element.
for sym in soup.find_all('symbol'):
    print("-" * 32)
    print(sym.get("name"))
    print(sym.get("vol"))
    print(sym.get("last"))
    print(sym.get("matched"))
If you want more ways to do this, you can check this link.
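As one more option, here is a rough sketch that uses only the standard library's xml.etree.ElementTree instead of BeautifulSoup and builds a dictionary keyed by symbol name. It assumes the XML is complete and well-formed (closing tags included). The names xml_text and symbols, and the choice of int for vol and matched and float for last, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Stand-in for the text of the requests response (assumed complete, well-formed XML).
xml_text = """<info>
<stats vol="545080705" orders="718021755"/>
<symbols timestamp="2022-09-08 19:56:37" count="11394">
<symbol name="TQQQ" vol="8700394" last="28.23" matched="8382339" />
<symbol name="SPY" vol="8571092" last="401.00" matched="8209174" />
<symbol name="SQQQ" vol="7091770" last="44.39" matched="6734334" />
<symbol name="AVCT" vol="6493626" last="0.17" matched="6469576" />
<symbol name="UVXY" vol="6158364" last="9.42" matched="6142800" />
</symbols>
</info>"""

root = ET.fromstring(xml_text)

# Map each symbol name to its numeric fields; attribute values arrive as strings.
symbols = {}
for sym in root.iter("symbol"):
    attrs = sym.attrib
    symbols[attrs["name"]] = {
        "vol": int(attrs["vol"]),
        "last": float(attrs["last"]),
        "matched": int(attrs["matched"]),
    }

# Loop over the result, or look up a single symbol directly.
for name, fields in symbols.items():
    print(name, fields["vol"], fields["last"], fields["matched"])
```

Keying by name makes lookups like symbols["SPY"]["last"] easy. If the same name can appear more than once in your feed, a list of dictionaries would be the safer structure.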