I keep getting the error "'NoneType' object is not subscriptable" when I run my Scrapy code. I understand that the value is None, but how do I skip it and have Scrapy record the field as empty?
Here is the method:
def parse_country(self, response):
    try:
        item = response.meta['item']
        link_id = response.meta['link_id']
        place_data = json.loads(response.body)
        place_country = place_data[0][0][0]
        item['place_country'] = place_country
        yield item
    except Exception as e:
        print(e)
The error only shows up when there is no data to scrape.
CodePudding user response:
Try/except is useful for catching errors or bugs. I would suggest an if/else solution instead.
Something like this could work for you:
def parse_country(self, response):
    item = response.meta['item']
    link_id = response.meta['link_id']
    place_data = json.loads(response.body)
    # Note: this still raises if place_data, place_data[0] or place_data[0][0] is None.
    if place_data[0][0][0] is not None:
        place_country = place_data[0][0][0]
        item['place_country'] = place_country
    else:
        item['place_country'] = 'No Country found'
    yield item
CodePudding user response:
Note that using a try block as a control-flow statement is not good practice.
When you write place_data[0][0][0], you are indexing into a multi-level nested list. If any of those levels is None, you will get this error.
The solution is to check for None and for length at each level. You can do it in one if statement like this:
if place_data and len(place_data) > 0 \
        and place_data[0] and len(place_data[0]) > 0 \
        and place_data[0][0] and len(place_data[0][0]) > 0 \
        and place_data[0][0][0]:
    item['place_country'] = place_data[0][0][0]
else:
    item['place_country'] = None
Or you could break it down into multiple nested if statements for better readability.
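As a rough sketch of the same level-by-level check written as a loop rather than one long condition: the helper name first_nested and its depth and default parameters are made up for illustration and are not part of Scrapy. It assumes the JSON payload is a list of lists, as in the question.

```python
def first_nested(data, depth=3, default=None):
    """Follow index 0 through `depth` levels of nested lists.

    Hypothetical helper: returns `default` as soon as any level is
    None, empty, or not a list/tuple, instead of raising.
    """
    current = data
    for _ in range(depth):
        # Bail out before indexing anything that cannot safely be indexed.
        if not isinstance(current, (list, tuple)) or not current:
            return default
        current = current[0]
    # The innermost value itself may still be None.
    return default if current is None else current
```

Inside the callback this would reduce to item['place_country'] = first_nested(place_data). Unlike the single if statement, it keeps falsy but non-None values such as an empty string, which may or may not be what you want.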
Side note #1: Using meta to pass data to a callback is not recommended in newer versions of Scrapy. Use cb_kwargs instead. See the docs.
Side note #2: You can get the parsed JSON directly by calling response.json().
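Putting both side notes together, the callback might look roughly like the sketch below. FakeResponse is a stand-in invented here so the snippet runs without Scrapy installed; in a real spider the response comes from Scrapy, and parse_country would be a method with self as its first parameter.

```python
import json


class FakeResponse:
    """Stand-in for a Scrapy text response, only so this sketch runs on its own."""

    def __init__(self, body):
        self.body = body

    def json(self):
        return json.loads(self.body)


def parse_country(response, item, link_id):
    # item and link_id arrive as keyword arguments via cb_kwargs,
    # so nothing has to be read from response.meta.
    place_data = response.json()
    country = None
    # Check each level before indexing into the next one.
    if place_data and place_data[0] and place_data[0][0]:
        country = place_data[0][0][0]
    item['place_country'] = country
    yield item


# In the spider, the request that schedules this callback would be built like:
# yield scrapy.Request(url, callback=self.parse_country,
#                      cb_kwargs={'item': item, 'link_id': link_id})
```

The keys in cb_kwargs must match the callback's parameter names, because Scrapy passes them as keyword arguments.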