I have a style element in my HTML,
as shown below.
<style>
tr[data-row-company-id="27"] {
font-weight: 500;
}
tr[data-row-company-id="27"] a {
color: var(--ink-900);
}
</style>
I need to parse the company ID from it,
which is 27 here. I am able to find the style element as follows:
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
item = soup.find('style')
I am not sure how to proceed from here. Could you suggest a solution?
Thanks.
CodePudding user response:
This is one way to get that info:
from bs4 import BeautifulSoup as bs
html = '''
<style>
tr[data-row-company-id="27"] {
font-weight: 500;
}
tr[data-row-company-id="27"] a {
color: var(--ink-900);
}
</style>
'''
soup = bs(html, 'html.parser')
trs = soup.select('style')
for t in trs:
    desired_info = [x.split('[')[1].split(']')[0] for x in t.text.split('}') if len(x) > 1]
    print(desired_info)
Result:
['data-row-company-id="27"', 'data-row-company-id="27"']
To get only the number, you can drill down further and strip the attribute name and quotes from each entry. BeautifulSoup documentation: https://beautiful-soup-4.readthedocs.io/en/latest/index.html
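One possible way to drill down to the number itself is a regular expression over the stylesheet text. The sketch below is only an illustration: it swaps the split-based approach for `re.findall`, the helper name `extract_company_ids` is made up for this example, and `css_text` stands in for the text of the style tag (`t.text` in the loop above). It assumes the IDs are always digits wrapped in double quotes.

```python
import re

# Sample CSS standing in for the text of the <style> tag
# (t.text in the loop above).
css_text = '''
tr[data-row-company-id="27"] {
font-weight: 500;
}
tr[data-row-company-id="27"] a {
color: var(--ink-900);
}
'''

def extract_company_ids(css):
    """Return the distinct data-row-company-id values, in order of first appearance.

    Hypothetical helper: assumes each ID is a run of digits in double quotes.
    """
    matches = re.findall(r'data-row-company-id="(\d+)"', css)
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(matches))

company_ids = extract_company_ids(css_text)
print(company_ids)  # ['27']
```

The IDs come back as strings; wrap each one in `int()` if a numeric value is needed.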