'NoneType' object is not iterable in scraper

I am trying to write a scraper for the "free-proxy.cz" website, but I am running into a problem.

I know the "port" part of my code is wrong, but I can't work out what the problem is or how to fix it.

Here is the code:

import requests
from bs4 import BeautifulSoup
import base64


urls = ['http://free-proxy.cz/en/proxylist/country/all/socks5/date/all',
       'http://free-proxy.cz/en/proxylist/country/all/socks5/date/all/2',
       'http://free-proxy.cz/en/proxylist/country/all/socks5/date/all/3',
       'http://free-proxy.cz/en/proxylist/country/all/socks5/date/all/4',
       'http://free-proxy.cz/en/proxylist/country/all/socks5/date/all/5',
]

for url in urls:
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    table = soup.find('table', {'id': 'proxy_list'})
    for row in table.find('tbody').find_all('tr'):
        for ip in row.find('script'):
            text=base64.b64decode(ip[29:-2:])
        for port in row.find('span', attrs='fport'):
            print(port.get_text())
    #ipadd=print(prt.decode('utf-8') ':' ports)

** I commented out the last line because the port grabber is not working correctly.

Running the code above produces:

Traceback (most recent call last):
  File "LOCATION\main.py", line 22, in <module>
    for port in row.find('span', attrs='fport'):
TypeError: 'NoneType' object is not iterable
80
45554
1080
1080

What is the issue here?

CodePudding user response：

    span_rows = row.find('span', attrs='fport')
    if span_rows is not None:
        for port in span_rows:
            print(port.get_text())
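
To spell out why the guard helps: `find()` returns `None` when nothing in the row matches, and looping over `None` raises the `TypeError` from the question. The ports printed before the traceback suggest that only some rows lack the `span`. Here is a minimal sketch on made-up HTML (the table contents and the "advertisement" row are assumptions for illustration, not the site's real markup):

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration: the second row has no <span class="fport">.
html = """
<table id="proxy_list"><tbody>
<tr><td>1.2.3.4</td><td><span class="fport">1080</span></td></tr>
<tr><td colspan="2">advertisement</td></tr>
<tr><td>5.6.7.8</td><td><span class="fport">80</span></td></tr>
</tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")
ports = []
for row in soup.find("table", {"id": "proxy_list"}).find("tbody").find_all("tr"):
    span = row.find("span", class_="fport")  # None when the row has no such span
    if span is None:
        continue  # skip rows without a port instead of crashing
    ports.append(span.get_text(strip=True))

print(ports)
```

Calling `get_text()` on the `span` itself, rather than looping over its children, also avoids the inner `for` loop entirely.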

CodePudding user response：

Try the code below. It works fine for me.

from selenium import webdriver
import time
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)
driver.maximize_window()  # only needs to be done once
url = 'http://free-proxy.cz/en/proxylist/country/all/socks5/date/all/{p}'

try:
    for p in range(1, 6):
        driver.get(url.format(p=p))
        time.sleep(3)  # crude wait for the page to finish rendering
        soup = BeautifulSoup(driver.page_source, "html.parser")

        for row in soup.select('#proxy_list tbody tr'):
            # The port is in the span of the second cell; some rows have none
            port = row.select_one('td:nth-child(2) span')
            port = port.get_text() if port else None
            print(port)
finally:
    driver.quit()  # always close the browser, even if a page fails

Output:

... so on
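
For the `ip:port` line that the question left commented out, one possible sketch is to parse each row in a small helper that returns `None` whenever a piece is missing. This assumes the row's script text looks like `document.write(Base64.decode("..."))`, which is only inferred from the `ip[29:-2]` slice in the question. The helper name `parse_row` and the sample HTML are made up for illustration.

```python
import base64
import re

from bs4 import BeautifulSoup

# Assumed script format: document.write(Base64.decode("<base64 text>"))
B64_ARG = re.compile(r'Base64\.decode\("([^"]+)"\)')


def parse_row(row):
    """Return 'ip:port' for one table row, or None if either part is missing."""
    script = row.find("script")
    span = row.find("span", class_="fport")
    if script is None or script.string is None or span is None:
        return None
    match = B64_ARG.search(script.string)
    if match is None:
        return None
    ip = base64.b64decode(match.group(1)).decode("utf-8")
    return f"{ip}:{span.get_text(strip=True)}"


# Made-up sample markup; the real page may differ.
encoded = base64.b64encode(b"1.2.3.4").decode("ascii")
html = f"""
<table id="proxy_list"><tbody>
<tr><td><script>document.write(Base64.decode("{encoded}"))</script></td>
<td><span class="fport">1080</span></td></tr>
<tr><td colspan="2">advertisement</td></tr>
</tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")
rows = soup.find("table", id="proxy_list").find_all("tr")
proxies = [p for p in map(parse_row, rows) if p is not None]
print(proxies)
```

Matching the Base64 argument with a regular expression is a swap for the fixed slice in the question; it keeps working if the length of the surrounding script text changes.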