Find by style (color) with requests_html

Time:11-28

I have to use requests_html because the page content is generated by JavaScript. The relevant HTML:

<td class="text-left worker-col truncated"><a href="/account/0x58e0ff2eb3addd3ce75cc3fbdac3ac3f4e21fa/38-G1x" style="color:red">38-G1</a></td>

I want to find all names (38-G1 in this case) that are shown in red, searching for them by style="color:red". Is this possible with requests_html? How can I do this?

CodePudding user response:

I tried both HTMLSession and Selenium, each with bs4. Selenium works fine, but the HTMLSession version does not render the JavaScript.

Code with Selenium (works):

from bs4 import BeautifulSoup
import time
from selenium import webdriver


driver = webdriver.Chrome('chromedriver.exe')
url = "https://eth.nanopool.org/account/0x58e0ff2eb3addd3ce75cc3fbdac3ac3f4e21fd4a"
driver.get(url)
time.sleep(8)

soup = BeautifulSoup(driver.page_source, 'html.parser')
for t in soup.select('table.table.table-bordered.table-hover.table-responsive tr'):
    txt = t.select_one('td:nth-child(2) > a')
    text = txt.text if txt else None
    print(text)

Output:

38-G15
47_G15_2   
47-G1      
49-O15     
90_GGX     
91_ASF     
105_MGPM_3 
112-GG3    
121-APRO   
188-MGPM1  
198-AP     
248_MGPM_1 
262-GUD    
265_ASF    
302-AD     
355-GUD.2  
Rig_3471855
rigEdge    
107_MGPM_3 
None
None

Code with HTMLSession (does not render JS):

from bs4 import BeautifulSoup
from requests_html import HTMLSession
session = HTMLSession()
response = session.get('https://eth.nanopool.org/account/0x58e0ff2eb3addd3ce75cc3fbdac3ac3f4e21fd4a')
soup = BeautifulSoup(response.content, 'html.parser')

for t in soup.select('table.table.table-bordered.table-hover.table-responsive tr'):
    txt = t.select_one('td:nth-child(2) > a')
    text = txt.text if txt else None
    print(text)
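
One thing worth noting about the HTMLSession version: requests_html only runs a page's JavaScript when render() is called on the response's html object, and the code above never calls it. Below is a minimal sketch of that approach, assuming the worker table is filled in client-side. The function names, the 8-second wait (borrowed from the Selenium example), and the 30-second timeout are illustrative choices, not values from the original code. The first render() call also downloads a Chromium build, so it can be slow.

```python
def red_link_names(html):
    """Return the text of every <a> whose style attribute is exactly color:red."""
    return [a.text for a in html.find('a[style="color:red"]')]


def fetch_red_link_names(url, wait_seconds=8):
    """Load url, let its JavaScript run, then collect the red link names."""
    # Imported here so the helper above stays usable without requests_html.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(url)
    # render() executes the page's scripts in headless Chromium and
    # updates r.html in place with the rendered markup.
    r.html.render(sleep=wait_seconds, timeout=30)
    return red_link_names(r.html)
```

Calling fetch_red_link_names() with the account URL should then return the names as a list, provided the rendered page really uses style="color:red" on those links.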

CodePudding user response:

You could find all <a> tags and collect the ones that have that style attribute.

Get your page:

from requests_html import HTMLSession
session = HTMLSession()

r = session.get('your_url')

Then find all anchor tags with:

anchors = r.html.find('a')

Finally, collect the text of every <a> tag whose style attribute equals color:red:

names = []
for a in anchors:
    # .get() avoids a KeyError on anchors that have no style attribute
    if a.attrs.get('style') == "color:red":
        names.append(a.text)

Note that this only works for inline styles set on the anchor tag itself, not for colors applied through a class or stylesheet. If your example is representative, it should work.
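
The comparison above matches only the exact string color:red, so a tag written as style="color: red;" or one with extra declarations would be missed. A small sketch of a more tolerant check is shown here. The helper names inline_style and is_red and the sample attrs dicts are made up for illustration. It still recognizes only the literal keyword red, not equivalents such as #ff0000 or rgb(255, 0, 0).

```python
def inline_style(style):
    """Parse an inline style string into a {property: value} dict."""
    declarations = {}
    for decl in (style or "").split(";"):
        name, sep, value = decl.partition(":")
        if sep:  # skip empty pieces, e.g. after a trailing semicolon
            declarations[name.strip().lower()] = value.strip().lower()
    return declarations


def is_red(attrs):
    """True if an element's attrs dict carries an inline color of red."""
    return inline_style(attrs.get("style")).get("color") == "red"


# Hypothetical attrs dicts, shaped like the attrs of a requests_html element.
samples = [
    {"href": "/a", "style": "color:red"},
    {"href": "/b", "style": "font-weight: bold; Color: RED;"},
    {"href": "/c", "style": "color:green"},
    {"href": "/d"},  # no style attribute at all
]
matches = [s["href"] for s in samples if is_red(s)]
print(matches)  # ['/a', '/b']
```

With requests_html, the same check could replace the equality test in the loop: append a.text whenever is_red(a.attrs) is true.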

Edit: I see another user gave you a solution with BeautifulSoup. I'd like to add that if you're new to web scraping but plan on learning more, I'd also recommend learning BeautifulSoup. It's not only more powerful, but its user base is much larger, so it's easier to find solutions to your problems.
