parse tr in style component using beautifulsoup

Time:09-09

I have a style element in my HTML, as shown below.

<style>
  tr[data-row-company-id="27"] {
    font-weight: 500;
  }

  tr[data-row-company-id="27"] a {
    color: var(--ink-900);
  }
</style>

I need to extract the company ID, which is 27 here. I am able to find the style element as shown below:

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
item = soup.find('style')

I am not sure how to proceed from here. Could you suggest a solution?

Thanks.

CodePudding user response:

This is one way to get that info:

from bs4 import BeautifulSoup as bs

html = '''
<style>
  tr[data-row-company-id="27"] {
    font-weight: 500;
  }

  tr[data-row-company-id="27"] a {
    color: var(--ink-900);
  }
</style>
'''
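soup = bs(html, 'html.parser')
styles = soup.select('style')  # every <style> element in the document
for style in styles:
    # Each CSS rule ends with '}', so split on it, then keep the text
    # between '[' and ']' in each rule's selector.
    desired_info = [x.split('[')[1].split(']')[0] for x in style.text.split('}') if len(x) > 1]
    print(desired_info)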

Result:

['data-row-company-id="27"', 'data-row-company-id="27"']

You can drill down further and extract only the number if you want. BeautifulSoup documentation: https://beautiful-soup-4.readthedocs.io/en/latest/index.html
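As a minimal sketch of that last step, a regular expression can pull just the number out of the CSS text. It assumes the CSS has already been read from the style element (for example via item.string in the question's code) and that the ID is always a run of digits in double quotes. The names extract_company_ids and css_text are made up for this illustration.

```python
import re

# Assumption: css_text stands in for the text of the <style> element,
# e.g. what soup.find('style').string would return for the HTML above.
css_text = '''
  tr[data-row-company-id="27"] {
    font-weight: 500;
  }

  tr[data-row-company-id="27"] a {
    color: var(--ink-900);
  }
'''


def extract_company_ids(css):
    """Return the unique data-row-company-id values found in css, in order.

    Hypothetical helper for illustration; it only matches IDs that are
    made of digits and wrapped in double quotes.
    """
    pattern = re.compile(r'data-row-company-id="(\d+)"')
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(pattern.findall(css)))


company_ids = extract_company_ids(css_text)
print(company_ids)
```

The values come back as strings; wrap them in int() if a numeric ID is needed. Unlike splitting on '[' and ']', the regular expression skips rules that have no attribute selector instead of raising an IndexError.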
