Home > other >  Py answer IndexError: list index out of range
Py answer IndexError: list index out of range

Time:11-26

The import requests
The import re
The from LXML import etree
The import pymysql
The import time

Conn=pymysql. Connect (host='localhost', user='test' and password='123456', the db='xs', the port=3330, charset='utf8')
Cursor=conn. Cursor ()
Headers={' the user-agent ':' Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36 '}
Def get_url_music (url) :
HTML=requests. Get (url, headers=headers)
The selector=etree. HTML (HTML. Text)
Music_hrefs=selector. Xpath ('//div/@/a/@ href ')
For music_href music_hrefs in:
# print (music_href)
Get_music_info (music_href)

Def get_music_info (url) :
HTML=requests. Get (url, headers=headers)
The selector=etree. HTML (HTML. Text)
Try:
Name=re. The.findall (' & lt; Span property="v: itemreviewed" & gt; (. *?) ', HTML, text, re S) [0]
Director=selector. Xpath ('//* [@ id="info"]/span [1]/span [2]/a/text () ') [0]
Actors=selector. Xpath ('//* [@ id="info"]/span [3]/span [2] ') [0]
The actor=actors. Xpath (' string (.) ')
Style=re. The.findall (' & lt; Span property="v: genre" & gt; (. *?) ', HTML, text, re S) [0]
Country=re. The.findall (' & lt; Span & gt; Producer countries/regions: & lt;/span> (. *?)
', HTML, text, re S) [0]
Release_time=re. The.findall (' release date: & lt;/span> . *.?> (. *?) ', HTML, text, re S) [0]
Time=re. The.findall (' running time: & lt;/span> . *? (. *?) ', HTML, text, re S) [0]
Score=selector. Xpath ('//* [@ id="interest_sectl"]/div [1]/div [2]/strong/text () ') [0]
# print (name, director, actor, style, country, release_time, time, score)
Cursor. The execute (" insert into doubanmovie (name, director, actor, style, country, release_time, time, score) values (% s, % s, % s, % s, % s, % s, % s, % s) ", (STR (name), STR (director), STR (actor), STR (style), STR (country), STR (release_time), STR (time), STR (score)))

Except ImportError:
Pass

If __name__=="__main__ ':
Urls=[' https://movie.douban.com/top250?start={} '. The format (STR) (I) for I in range (0100)]
For the url in urls:
# print (url)
Get_url_music (url)
Time. Sleep (2)
Operation result error
C: \ Users \ cdweifeng \ PycharmProjects \ untitled1 \ venv \ Scripts \ python exe C:/Users/cdweifeng/PycharmProjects/untitled1/mysqltest py
Traceback (the most recent call last) :
File "C:/Users/cdweifeng/PycharmProjects/untitled1/mysqltest py", line, 59 in & lt; module>
# print (url)
File "C:/Users/cdweifeng/PycharmProjects/untitled1/mysqltest py", 34, the line in get_url_music
# print (music_href)
File "C:/Users/cdweifeng/PycharmProjects/untitled1/mysqltest py", line 40, in get_music_info
Try:
IndexError: list index out of range


Pray god grant instruction

CodePudding user response:


 

Name=re. The.findall (' & lt; Span property="v: itemreviewed" & gt; (. *?) ', HTML, text, re S)



Should be, the name is empty, try the except did not handle the exception (ImportError here use inappropriate?) , so IndexError: list index out of range is thrown,

1, check your keyword in the.findall is correct,
2, don't directly use name=re. The.findall (' & lt; Span property="v: itemreviewed" & gt; (. *?) ', HTML, text, re S) [0] this form,
With the name=re. The.findall (' & lt; Span property="v: itemreviewed" & gt; (. *?) ', HTML, text, re S)
To judge len (name),

CodePudding user response:

This very embarrassed, I use the print time is normal, but an error when I was inserted into the database...
  • Related