How to get the first div child in Python using bs4 soup.select within a dynamic table

Time:02-16

With the HTML below, I have not managed to use Beautiful Soup's soup.select to get only the first child div of each inner div (i.e. -11.94M and 2.30M) as a list.

<div >
   <div >
      <div>‪−11.94M‬</div>
      <div >−119.94%</div></div></div>

<div >
   <div >
      <div>‪2.30M‬</div>
      <div >−80.17%</div></div></div>

The above are just two examples from the HTML I'm trying to scrape. They sit inside a dynamic, JavaScript-rendered table, and the page contains many more div elements, including many more divs with class "wrap-25PNPwRV" inside that table.

I currently have the code below, which scrapes all the contents of the divs with class "wrap-25PNPwRV":

data_list = [elem.get_text() for elem in soup.select("div.wrap-25PNPwRV")]

Output:

['-11.94M', '-119.94%', '2.30M', '-80.17%']

However, I would like to use soup.select to get this output instead:

['-11.94M', '2.30M']

I tried following the guide at https://www.crummy.com/software/BeautifulSoup/bs4/doc/ but could not work out how to apply it to the code above.

Please note: if this cannot be done with soup.select, I am happy to use an alternative, provided it produces the same list output.

CodePudding user response:

You can use the :nth-of-type CSS selector:

data_list = [elem.get_text() for elem in soup.select(".wrap-25PNPwRV div:nth-of-type(1)")]
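
As a self-contained sketch of this approach: the class attributes in the question's HTML are not shown, so the markup below assumes that wrap-25PNPwRV sits on the direct parent of the two value divs, and the class names outer and change-demo are made up for illustration. Under that assumption, adding the child combinator (>) keeps the selector from also matching divs nested deeper:

```python
from bs4 import BeautifulSoup

# Sample markup: the class placement is an assumption, and "outer" and
# "change-demo" are hypothetical names used only for this example.
html = """
<div class="outer">
   <div class="wrap-25PNPwRV">
      <div>−11.94M</div>
      <div class="change-demo">−119.94%</div></div></div>
<div class="outer">
   <div class="wrap-25PNPwRV">
      <div>2.30M</div>
      <div class="change-demo">−80.17%</div></div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# "> div:nth-of-type(1)" keeps only the first <div> that is a direct child
# of each wrap-25PNPwRV element.
first_values = [
    elem.get_text(strip=True)
    for elem in soup.select("div.wrap-25PNPwRV > div:nth-of-type(1)")
]
print(first_values)
```

If the class is actually on a higher ancestor, the descendant form (with a space instead of >) can match additional divs, so it is worth checking the result against the real page.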

CodePudding user response:

I'd suggest not using the .wrap-25PNPwRV class. It looks auto-generated and will almost certainly change in the future.

Instead, select the <div> element that is immediately followed by a sibling whose class starts with "change". For example:

print([t.text.strip() for t in soup.select('div:has(+ [class^="change"])')])

Prints:

['−11.94M', '2.30M']
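
A minimal runnable sketch of the sibling-based selector, assuming the percentage div carries a class beginning with "change" (the name change-demo below is hypothetical, since the real class names are not shown in the question). It relies on :has() with the adjacent-sibling combinator (+), which the soupsieve backend used by soup.select supports:

```python
from bs4 import BeautifulSoup

# Sample markup; "change-demo" is a hypothetical class name standing in
# for whatever "change..." class the real page uses.
html = """
<div>
   <div>
      <div>−11.94M</div>
      <div class="change-demo">−119.94%</div></div></div>
<div>
   <div>
      <div>2.30M</div>
      <div class="change-demo">−80.17%</div></div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Match every <div> whose next element sibling has a class attribute
# starting with "change", i.e. the value div that precedes the percentage.
values = [
    t.get_text(strip=True)
    for t in soup.select('div:has(+ [class^="change"])')
]
print(values)
```

Because this selector does not depend on the generated wrap-25PNPwRV name, it should keep working if that class is renamed, as long as the "change" prefix stays.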