How to use Selenium to crawl a magnet-link site that requires scanning a QR code

Time:10-26

I want to use Selenium to crawl magnet links from the site 'https://skrbt025.xyz/'. On the first visit the site asks you to scan a QR code; after that, magnet links can be fetched without scanning again.

My idea is to use Selenium to open the QR-code page and pause with time.sleep() while I scan the code manually. After that I save the logged-in cookies with driver.get_cookies(), so that the next time I visit 'https://skrbt025.xyz/' I can restore them with driver.add_cookie(cookie), skip the scan step, and fetch the magnet links directly.

But in practice, even after loading the saved cookies, the next Selenium run still asks me to scan the QR code. Could someone look over my code and my reasoning and point out what is wrong? Thanks in advance for any corrections. A side question: with the implicit wait driver.implicitly_wait(10), why do elements that have already appeared on the page sometimes still not become usable until the page has finished loading completely?
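The save/restore flow described above can be sketched independently of the browser. The sample cookie dict below only mimics the shape of what driver.get_cookies() returns; the values and the file name are illustrative:

```python
import json

# driver.get_cookies() returns a list of dicts shaped roughly like this
# (the values here are made up for illustration).
cookies = [
    {"name": "session", "value": "abc123", "domain": ".skrbt025.xyz",
     "path": "/", "expiry": 1700000000.5, "secure": True},
]

# First run, after scanning the QR code: persist the cookies as JSON.
with open("cookies.txt", "w") as f:
    json.dump(cookies, f)

# Later run: read them back. json.load takes a file object,
# json.loads takes a string -- hence load (not loads) here.
with open("cookies.txt", "r") as f:
    restored = json.load(f)

# Each dict in `restored` would then be passed to driver.add_cookie().
```

The JSON round trip preserves the list-of-dicts structure exactly, including the float expiry, which is why the type still needs checking before add_cookie().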
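On the implicit-wait side question: implicitly_wait(10) only controls how long find_element* calls keep polling for an element to appear in the DOM; it does not let the script proceed the moment an element is visible. Navigation calls such as driver.get() block until the page's load event fires (Selenium's default pageLoadStrategy is "normal"), which is usually why the script appears to wait for the full page load. To wait on a specific condition instead, Selenium offers WebDriverWait with expected_conditions; at its core it is just a poll loop, which can be sketched in plain Python (the names below are illustrative, not Selenium's API):

```python
import time

def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` expires.

    This mirrors what selenium's WebDriverWait.until() does internally:
    call the condition, return its result as soon as it is truthy,
    otherwise sleep briefly and try again.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)

# Example: wait for a flag that becomes true after a few checks,
# standing in for "element is clickable".
state = {"ready": False, "tries": 0}

def check():
    state["tries"] += 1
    if state["tries"] >= 3:
        state["ready"] = True
    return state["ready"]

wait_until(check, timeout=5.0, poll=0.01)
```

In Selenium itself the equivalent would be WebDriverWait(driver, 10).until(EC.element_to_be_clickable(...)), which returns as soon as the condition holds rather than sleeping for a fixed time.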
 
import time
from selenium import webdriver
import json

options = webdriver.ChromeOptions()
options.add_argument('start-maximized')
driver = webdriver.Chrome(chrome_options=options)

driver.implicitly_wait(10)
driver.get('https://skrbt025.xyz/')

# On the second run, use the following code to import the saved cookies
# driver.delete_all_cookies()
# with open('cookies.txt', 'r') as cookief:
#     # use json to read the cookies; note that reading from a file uses load, not loads
#     cookieslist = json.load(cookief)
# for cookies in cookieslist:
#     # not every cookie contains 'expiry', so use dict's get method to fetch it
#     if isinstance(cookies.get('expiry'), float):
#         cookies['expiry'] = int(cookies['expiry'])
#     driver.add_cookie(cookies)
# driver.refresh()

# On the first search, the site checks whether you are a robot
driver.find_element_by_xpath('//*[@id="search-form"]/div/input').send_keys('programming')
driver.find_element_by_xpath('//*[@id="search-form"]/div/span/button').click()

driver.find_element_by_xpath('//*[@id="search-form"]/div/input').send_keys('programming')
driver.find_element_by_xpath('//*[@id="search-form"]/div/span/button').click()
target = driver.find_element_by_xpath('/html/body/div/div[5]/div[2]/ul[1]/li[1]/a')
js4 = "arguments[0].scrollIntoView();"
driver.execute_script(js4, target)
target.click()

current_windows = driver.window_handles
driver.switch_to.window(current_windows[1])
driver.find_element_by_xpath('//*[@id="all"]').click()  # click the scan-code link
time.sleep(20)  # scan the QR code manually during this pause

driver.find_element_by_xpath('//*[@id="search-form"]/div/input').send_keys('programming')
driver.find_element_by_xpath('//*[@id="search-form"]/div/span[2]/button').click()

target = driver.find_element_by_xpath('/html/body/div/div[5]/div[2]/ul[1]/li[1]/a')
js4 = "arguments[0].scrollIntoView();"
driver.execute_script(js4, target)
target.click()
current_windows = driver.window_handles
driver.switch_to.window(current_windows[1])

# After the first run, once the QR code has been scanned, save the cookies with the code below
# with open('cookies.txt', 'w') as cookief:
#     # save the cookies in json format
#     cookief.write(json.dumps(driver.get_cookies()))
time.sleep(10)
driver.quit()
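One detail worth checking in the commented-out restore block: add_cookie() is picky about the dicts it receives (a float 'expiry' is rejected by some driver versions, and the browser must already be on the cookie's domain before add_cookie() is called). Restoring cookies also cannot help if the site keeps its login state in localStorage rather than in cookies, which would explain the scan prompt reappearing. The expiry fix can be factored into a small helper that does not mutate the original dict (the function name is illustrative):

```python
def normalize_cookie(cookie):
    """Return a copy of a cookie dict that is safe to pass to driver.add_cookie().

    Converts a float 'expiry' to int; cookies without 'expiry'
    (session cookies) pass through unchanged.
    """
    fixed = dict(cookie)
    expiry = fixed.get("expiry")
    if isinstance(expiry, float):
        fixed["expiry"] = int(expiry)
    return fixed

# Session cookie: no 'expiry' key, returned as-is.
normalize_cookie({"name": "b", "value": "2"})
# Persistent cookie: float expiry is truncated to an int.
normalize_cookie({"name": "a", "value": "1", "expiry": 1700000000.5})
```

Used in the restore loop, this would replace the in-place `cookies['expiry'] = int(...)` assignment: `driver.add_cookie(normalize_cookie(c))` for each loaded cookie.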