Environment: Windows 7 64-bit, Python 3.8, PyCharm 2019.3 (full set of third-party libraries installed)
Requirements: I need Python code that finds web pages containing a keyword and downloads the images on those pages (as illustrated). The specific requirements are as follows:
1. Website address:
http://test.com/piclist.aspx?Id=1
The link is fixed; only the id changes.
PS: I'd like to loop over the ids from 1 to 1000.
2. Text location:
keyword and other words
PS: If the keyword is found, download all images on that page (there may be one or more). If the page does not contain the keyword, move on to the next page.
3. Image location:
PS: If the page has the keyword but no images, skip it and continue with the next page. I'd like to match using XPath (or regular expressions).
PS: If the current page has the keyword, name each saved image automatically as keyword_date, using the (year)(month)(day) from the path, e.g. keyword_20200320.JPG (multiple images get an automatic underscore suffix: keyword_20200320_1.JPG, keyword_20200320_2.JPG).
4. Local file requirements:
(1) On the D: drive, automatically create a folder named after the keyword.
(2) Inside the main folder, the photos go into subfolders (e.g., 2019202) divided by year; the year folders are not subdivided further.
(3) Before the run starts, the keyword is entered through an input prompt.
(4) After the run completes, show the message "congratulations, you've got all the photos".
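To make the naming rule in point 3 concrete, here is a rough sketch of how the file names could be generated. The function name build_image_names and its parameters are hypothetical, and where the date actually comes from (the image path, as I understand it) is left to the caller. It only builds names; it does not create folders or download anything.

```python
from datetime import date


def build_image_names(keyword, page_date, count, ext=".JPG"):
    """Return file names for `count` images found on one page.

    One image  -> keyword_YYYYMMDD.JPG
    Several    -> keyword_YYYYMMDD_1.JPG, keyword_YYYYMMDD_2.JPG, ...
    """
    stamp = page_date.strftime("%Y%m%d")
    if count == 1:
        return [f"{keyword}_{stamp}{ext}"]
    # For several images, number them starting from 1 (zero images gives an empty list).
    return [f"{keyword}_{stamp}_{i}{ext}" for i in range(1, count + 1)]
```

For example, build_image_names("keyword", date(2020, 3, 20), 2) would give the two names keyword_20200320_1.JPG and keyword_20200320_2.JPG.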
This is my first post. If anything in the description is unclear, please leave a comment and I will reply as soon as I see it. I don't know whether this will catch the interest of the experts here, but if you are passing by and have a moment, please take a look. Thank you.
CodePudding user response:
What is the target site you want to visit?
CodePudding user response:
@Test_Box The website is just a hypothetical one.
CodePudding user response:
In http://test.com/piclist.aspx?Id=1 the id can be replaced by a number set in a loop:
for number in range(1, 1000 + 1):
    url = "http://test.com/piclist.aspx?Id=" + str(number)
Every website's content is different, so the rest depends on the real URL; please send it.
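Until the real URL is known, the loop above could be extended roughly like this. It is only a sketch under assumptions: the pages are assumed to be UTF-8, the keyword check is a plain substring test on the raw HTML, and the names BASE_URL, ImageCollector, find_images and crawl are made up for illustration. It uses the standard library's html.parser instead of XPath so that it runs without third-party packages; with lxml installed, tree.xpath("//img/@src") would do the same job as the collector class.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical site from this thread; replace with the real address.
BASE_URL = "http://test.com/piclist.aspx?Id={}"


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_images(html, keyword, page_url):
    """Return absolute image URLs if the page contains the keyword, else []."""
    if keyword not in html:
        return []
    collector = ImageCollector()
    collector.feed(html)
    return [urljoin(page_url, src) for src in collector.sources]


def crawl(keyword, first=1, last=1000):
    """Yield (page_url, image_urls) for pages that have the keyword and images."""
    for number in range(first, last + 1):
        page_url = BASE_URL.format(number)
        try:
            with urlopen(page_url, timeout=10) as response:
                # Assumption: pages are UTF-8 encoded.
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # page unreachable: go on to the next id
        images = find_images(html, keyword, page_url)
        if images:  # keyword present but no images: skip the page
            yield page_url, images
```

To use it, read the keyword with input(), iterate over crawl(keyword), download each image URL into the keyword folder, and print the completion message at the end. The downloading and folder layout are left out here because they depend on the real site.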
CodePudding user response: