Help: web crawler saves no data after replacing the while loop with a for loop


I replaced the while loop in the code below with a for loop. What did I write wrong? Why does the while version run without problems, while the for version executes but saves no data?
# Keep looping while the current page number does not exceed the total number of pages
while int(page_active) < page_num + 1:
    items = get_data()
    save_data(items)
    if page_active == page_num:
        next_page()
    page_active = wait.until(EC.presence_of_element_located(
        (By.CSS_SELECTOR, '#video-list > div.page-wrap > div > ul > li.page-item.active'))).text
# # Replaced with a for loop
# for num in range(page_num):
#     items = get_data()
#     save_data(items)
#     next_page()
browser.quit()  # the driver's variable name was lost in the post; "browser" is assumed
book.save('video information list.xlsx')
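
One visible difference between the two versions above: the while loop waits for the active-page element as part of each pass, and the commented-out for loop does not. Whether that explains the missing data cannot be told from the post alone, but a for loop that keeps the wait could be sketched as follows. `FakeSite`, `wait_for_active_page`, and `crawl` are hypothetical stand-ins so the sketch runs without a browser; in the real script the wait would be the wait call from the while version.

```python
class FakeSite:
    """Hypothetical stand-in for the browser: new data is readable only after a wait."""

    def __init__(self, pages):
        self.pages = pages
        self.loaded = 0      # index of the page whose content is currently readable
        self.requested = 0   # index of the page most recently requested

    def get_data(self):
        return list(self.pages[self.loaded])

    def next_page(self):
        # Request the next page; its content is not readable until the wait finishes
        if self.requested < len(self.pages) - 1:
            self.requested += 1

    def wait_for_active_page(self):
        # Plays the role of the explicit wait: block until the requested page is ready
        self.loaded = self.requested
        return str(self.loaded + 1)


def crawl(site, page_num):
    rows = []
    for num in range(page_num):
        rows.extend(site.get_data())
        if num < page_num - 1:            # nothing to turn to after the last page
            site.next_page()
            site.wait_for_active_page()   # same kind of wait the while loop performs
    return rows


print(crawl(FakeSite([["a", "b"], ["c"], ["d", "e"]]), 3))
```

Without the `wait_for_active_page()` call, this stand-in would keep returning the first page's rows, which is the kind of problem an explicit wait is meant to prevent.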

Custom functions:
def get_data():
    """
    Get the video name, video address, description, view count, number of bullet comments, and release time from the web page
    :return: video name, video address, description, view count, number of bullet comments, release time
    """
    print('Start getting data')
    # Get the content of the video page through the page_source attribute
    html = browser.page_source  # "browser" is an assumed name for the WebDriver
    # Regular expressions should be written as raw strings (r'...')
    # Note: the HTML tags inside this pattern were stripped when the post was published,
    # so only three capture groups survive here, while the code below expects six
    pattern = re.compile(
        r'<li.*?.*?"icon-playtime">(.*?).*?"icon-subtitle">(.*?).*?"icon-date">(.*?).*?',
        re.S)
    items = re.findall(pattern, html)
    for item in items:
        yield [
            item[0],
            item[1],
            item[2],
            item[3],
            item[4],
            item[5]
        ]
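
For readers unfamiliar with how `findall` behaves with several capture groups: it returns one tuple per match, with one entry per group, and `re.S` lets `.` match newlines so a single pattern can span several lines of HTML. A small sketch follows; the markup is invented for illustration and only borrows the class names from the post, so it is not the real page structure.

```python
import re

# Hypothetical markup, loosely modelled on the class names in the post
html = """
<li><span class="icon-playtime">1200</span>
<span class="icon-subtitle">35</span>
<span class="icon-date">2020-01-01</span></li>
<li><span class="icon-playtime">88</span>
<span class="icon-subtitle">2</span>
<span class="icon-date">2020-02-02</span></li>
"""

pattern = re.compile(
    r'<li.*?"icon-playtime">(.*?)</span>'
    r'.*?"icon-subtitle">(.*?)</span>'
    r'.*?"icon-date">(.*?)</span>',
    re.S,  # "." also matches newlines
)

rows = pattern.findall(html)
print(rows)
# [('1200', '35', '2020-01-01'), ('88', '2', '2020-02-02')]
```

With three groups each tuple has three entries, so indexing `item[3]` on such a tuple would raise `IndexError`; the number of groups has to match the indexes used afterwards.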

def save_data(items):
    """
    Save the video information
    :param items: video information generator
    :return:
    """
    print('Start saving data')
    # n is a global row counter, so rows keep accumulating across calls
    global n
    for item in items:
        for j in range(len(item)):
            sheet.write(n, j, item[j])
            print(n, j)
        n = n + 1
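
Because get_data uses yield, calling it only creates a generator: none of its body runs until save_data starts iterating, and the generator can be consumed only once. A minimal sketch of that behaviour, with a dict standing in for the worksheet (`written` and the sample rows are assumptions made for illustration, not part of the original script):

```python
def get_data():
    print('start getting data')        # runs only once iteration begins
    for row in [('a', 1), ('b', 2)]:   # sample rows, invented for this sketch
        yield list(row)


written = {}   # stand-in for the worksheet: (row, col) -> value
n = 0          # next free row, shared across calls like the global n in the post


def save_data(items):
    global n
    for item in items:
        for j, value in enumerate(item):
            written[(n, j)] = value
        n += 1


items = get_data()   # nothing is printed yet: the generator body has not started
save_data(items)     # iterating drives get_data and writes rows 0 and 1
save_data(items)     # the generator is already exhausted, so nothing is written
print(written, n)
```

If a loop ever seems to run without saving anything, checking whether the generator actually yields any rows on that pass (for example by printing inside the loop in save_data) is a cheap first step.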

CodePudding user response:

len(item)

CodePudding user response:

Quoting reply #1 from sjhbirds:
len(item)
I don't understand.