I am trying to scrape weather data from weather.com, but it is not working. Here is the error:
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    all=soup.find("div",{"class":"locations-title ten-day-page-title"}).find("h1").text
AttributeError: 'NoneType' object has no attribute 'find'
I don't know what is going wrong here. Here is the code:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/bf217d537cc1c8074ec195ce07fb74de3c1593caa6033b7c3be4645ccc5b01de")
soup = BeautifulSoup(page.content,"html.parser")
all=soup.find("div",{"class":"locations-title ten-day-page-title"}).find("h1").text
table=soup.find_all("table",{"class":"twc-table"})
l=[]
for items in table:
    for i in range(len(items.find_all("tr"))-1):
        d = {}
        d["day"]=items.find_all("span",{"class":"date-time"})[i].text
        d["date"]=items.find_all("span",{"class":"day-detail"})[i].text
        d["desc"]=items.find_all("td",{"class":"description"})[i].text
        d["temp"]=items.find_all("td",{"class":"temp"})[i].text
        d["precip"]=items.find_all("td",{"class":"precip"})[i].text
        d["wind"]=items.find_all("td",{"class":"wind"})[i].text
        d["humidity"]=items.find_all("td",{"class":"humidity"})[i].text
        l.append(d)
I really don't understand what is going on here, as I am new to Python and to this library. I hope someone can help me with this problem.
CodePudding user response:
A couple of things:
- As noted in the comments, your first soup.find returns None, hence the error. The class names on the relevant title contain dynamic parts, so I would instead use an attribute = value CSS selector with the starts-with operator (^=) to target the class attribute by its prefix. I would also add the h1 type selector, which targets the h1 directly without chaining calls.
- Avoid variable names that shadow Python keywords or built-ins, e.g. all().
A re-write thus might look like:
import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/bf217d537cc1c8074ec195ce07fb74de3c1593caa6033b7c3be4645ccc5b01de")
soup = BeautifulSoup(page.content,"html.parser")
title = soup.select_one('h1[class^=LocationPageTitle]').text
print(title)
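Because the original error came from chaining a call onto a result that was None, it may also help to guard the lookup. The sketch below runs against a small inline HTML snippet rather than the live page: the "--abc123" class suffix, the sample text, and the helper name extract_title are made up for illustration, and the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: "--abc123" stands in for the dynamic part of the class name.
SAMPLE_HTML = """
<div>
  <h1 class="LocationPageTitle--abc123">Dublin, Ireland</h1>
</div>
"""


def extract_title(html):
    """Return the text of the first h1 whose class attribute starts with
    'LocationPageTitle', or None when no such element exists."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.select_one('h1[class^="LocationPageTitle"]')
    if heading is None:
        # Nothing matched: report that instead of raising AttributeError.
        return None
    return heading.get_text(strip=True)


print(extract_title(SAMPLE_HTML))
print(extract_title("<h1 class='other'>x</h1>"))
```

Note that ^= matches against the start of the whole class attribute, so this only works when the targeted class is the first one listed on the element.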
CodePudding user response:
Reason for the error:
There is no <div> with class ten-day-page-title. That is the reason you are getting that error. Also, there is no <table> with class twc-table.
You could instead try this way.
- The weather forecast for the next 10 days is inside <details> tags whose id starts with detailIndex. I have used re to select all such <details> tags.
- Now you can extract whatever information you need from the selected <details> tags.
Here I have printed out the forecast summary.
import re
import requests
from bs4 import BeautifulSoup
page = requests.get("https://weather.com/en-IE/weather/tenday/l/bf217d537cc1c8074ec195ce07fb74de3c1593caa6033b7c3be4645ccc5b01de")
soup = BeautifulSoup(page.content,"lxml")
d = soup.find_all('details', {'id': re.compile(r'^detailIndex')})
for i in d:
    p = i.find('summary')
    print(list(p.stripped_strings))
['Tonight', '--', '/', '9°', 'Cloudy', 'Cloudy', 'Rain', '15%', 'Wind', 'NNW', '9 km/h', 'Arrow Up']
['Wed 10', '16°', '/', '3°', 'Scattered Showers', 'AM Showers', 'Rain', '42%', 'Wind', 'NW', '18 km/h', 'Arrow Down']
['Thu 11', '12°', '/', '8°', 'Partly Cloudy', 'Partly Cloudy', 'Rain', '3%', 'Wind', 'ENE', '11 km/h', 'Arrow Down']
['Fri 12', '17°', '/', '6°', 'Rain', 'Rain', 'Rain', '89%', 'Wind', 'SSE', '32 km/h', 'Arrow Down']
['Sat 13', '14°', '/', '4°', 'Partly Cloudy', 'Partly Cloudy', 'Rain', '23%', 'Wind', 'N', '15 km/h', 'Arrow Down']
['Sun 14', '11°', '/', '2°', 'Mostly Sunny', 'Mostly Sunny', 'Rain', '8%', 'Wind', 'W', '21 km/h', 'Arrow Down']
['Mon 15', '9°', '/', '2°', 'Scattered Showers', 'Showers', 'Rain', '40%', 'Wind', 'WNW', '17 km/h', 'Arrow Down']
['Tue 16', '9°', '/', '2°', 'Mostly Sunny', 'Mostly Sunny', 'Rain', '14%', 'Wind', 'WNW', '22 km/h', 'Arrow Down']
['Wed 17', '10°', '/', '5°', 'Mostly Sunny', 'Mostly Sunny', 'Rain', '7%', 'Wind', 'W', '21 km/h', 'Arrow Down']
['Thu 18', '14°', '/', '7°', 'Mostly Sunny', 'Mostly Sunny', 'Rain', '7%', 'Wind', 'SW', '21 km/h', 'Arrow Down']
['Fri 19', '14°', '/', '4°', 'Scattered Showers', 'PM Showers', 'Rain', '32%', 'Wind', 'SW', '17 km/h', 'Arrow Down']
['Sat 20', '10°', '/', '2°', 'Partly Cloudy', 'Partly Cloudy', 'Rain', '24%', 'Wind', 'WNW', '19 km/h', 'Arrow Down']
['Sun 21', '9°', '/', '4°', 'Partly Cloudy', 'Partly Cloudy', 'Rain', '19%', 'Wind', 'NNW', '17 km/h', 'Arrow Down']
['Mon 22', '8°', '/', '3°', 'Scattered Showers', 'AM Showers', 'Rain', '32%', 'Wind', 'N', '19 km/h', 'Arrow Down']
['Tue 23', '8°', '/', '2°', 'Scattered Showers', 'Few Showers', 'Rain', '34%', 'Wind', 'NW', '19 km/h', 'Arrow Down']
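If named fields are more useful than raw lists, each list of summary strings can be mapped to a dict. This is only a sketch: it assumes every summary yields the same 12-item layout as the output above (day, high, '/', low, two descriptions, 'Rain', chance of rain, 'Wind', direction, speed, arrow label). The function name parse_summary and the field names are illustrative, and the layout may change whenever the site does.

```python
def parse_summary(strings):
    """Map one list of summary strings to a dict.

    Assumes the 12-item layout seen in the printed output; returns None
    for any other length rather than guessing.
    """
    if len(strings) != 12:
        return None
    (day, high, _slash, low, desc, _short_desc,
     _rain_label, precip, _wind_label, wind_dir, wind_speed, _arrow) = strings
    return {
        "day": day,
        "high": high,
        "low": low,
        "desc": desc,
        "precip": precip,
        "wind": f"{wind_dir} {wind_speed}",
    }


# One row copied from the output above.
row = ['Wed 10', '16°', '/', '3°', 'Scattered Showers', 'AM Showers',
       'Rain', '42%', 'Wind', 'NW', '18 km/h', 'Arrow Down']
print(parse_summary(row))
```

In the loop above, parse_summary(list(p.stripped_strings)) could replace the print call, with the None results skipped.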