Selenium page_source differs between headless mode and headed (GUI) mode

Time:10-06

As the title says: without headless mode a browser window pops up, but the saved data is complete. With Chrome in headless mode the saved page contains no data. If I switch to Firefox in headless mode, the result is normal again.
What is the reason?
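One way to narrow this down is to save `page_source` once from each mode and diff the two dumps, so the first point where the headless page diverges becomes visible. The sketch below uses only the standard library. The function name `diff_page_sources` and the labels `headed.html` / `headless.html` are hypothetical, and the sample HTML is made up for illustration.

```python
import difflib


def diff_page_sources(headed_html, headless_html, context=2):
    """Return unified-diff lines between two page_source dumps."""
    return list(
        difflib.unified_diff(
            headed_html.splitlines(),
            headless_html.splitlines(),
            fromfile="headed.html",    # hypothetical label for the headed dump
            tofile="headless.html",    # hypothetical label for the headless dump
            lineterm="",
            n=context,
        )
    )


if __name__ == "__main__":
    # Made-up sample data; in practice these would be the two saved dumps.
    headed = "<table id='DISTRIBUTION_DETAIL_TABLE'>\n<tr><td>row</td></tr>\n</table>"
    headless = "<table id='DISTRIBUTION_DETAIL_TABLE'>\n</table>"
    for line in diff_page_sources(headed, headless):
        print(line)
```

Lines prefixed with `-` appear only in the headed dump and lines prefixed with `+` only in the headless one. An empty result means the two dumps are identical.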
import json
from bs4 import BeautifulSoup
import os
import sqlite3
from win32.win32crypt import CryptUnprotectData
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC



def getcookiefromchrome(host):
    # Read and decrypt the cookies that the locally installed Chrome stores for `host`.
    cookiepath = os.environ['LOCALAPPDATA'] + r"\Google\Chrome\User Data\Default\Cookies"
    sql = "select host_key, name, encrypted_value from cookies where host_key='%s'" % host
    with sqlite3.connect(cookiepath) as conn:
        cu = conn.cursor()
        cookies = {
            name: CryptUnprotectData(encrypted_value)[1].decode()
            for host_key, name, encrypted_value in cu.execute(sql).fetchall()
        }
        # print(cookies)
        return cookies


host = '192.168.205.186'
surl = r"http://192.168.205.186:50000/manufacturing/index.jsp"
starturl = r"http://192.168.205.186:50000/shineraywebwar/application/setup/SfcObtainReport"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36'
}
searchdata = {
    'P_START': '2019-12-18+0%3A00%3A00',
    'P_END': '2019-12-20+0%3A00%3A00',
    'OPERATION': 'AM160',
    'CAR_TYPE': '',
    'SITE': '2000',
    'ACTIVITY_ID': 'Z_PP_SFC_OBTAIN',
    '_FORM_NON_KEY_MODIFIED': 'true',
    'DISTINCT_BOX': 'false',
    'DISTRIBUTION_DETAIL_TABLE_SELECTED_ROW_INDEX': '1',
    'LINE': '',
    'P_START_END_OF_DAY': 'false',
    'P_END_END_OF_DAY': 'false',
    'P_END_LOCALE_CONTEXT': '',
    'P_END_SHOW_TIME': 'true',
    'P_START_LOCALE_CONTEXT': '',
    'P_START_SHOW_TIME': 'true',
    'RESOURCE': '',
    'USER_CMD': 'RetrieveCommand',
    'VIN_NUMBER': '',
}

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=chrome_options)
# driver = webdriver.Chrome()  # headed (GUI) mode

driver.get(surl)
driver.delete_all_cookies()

# Copy the cookies from the local Chrome profile into the Selenium session.
cookies = getcookiefromchrome(host)
for i in cookies:
    driver.add_cookie({'name': i, 'value': cookies[i]})

driver.get(starturl)
bs = BeautifulSoup(driver.page_source, "html.parser")
# Pick up the hidden form fields that the report request needs.
for i in bs.find_all('input', type="hidden"):
    if i.get('name') == 'APP_ID':
        searchdata['APP_ID'] = i["value"]
    elif i.get('name') == 'FORM_ID':
        searchdata['FORM_ID'] = i["value"]
        break
# Build the query string by hand, then drop the trailing '&'.
aa = ''
for ii in searchdata:
    aa = aa + ii + '=' + searchdata[ii] + '&'
starturl = starturl + '?' + aa
print(starturl[:-1])
starturl = starturl[:-1]

driver.get(starturl)
bs = BeautifulSoup(driver.page_source, "html.parser")
ss = bs.find(id="DISTRIBUTION_DETAIL_TABLE")
print(ss)
with open('contacts.html', 'w+', encoding="utf-8") as f:
    f.write(driver.page_source)

driver.quit()  # quit() also closes every window, so a separate close() is not needed
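The query string in the script above is assembled by hand with `+` and `&`. As a minimal sketch, the same step could be done with `urllib.parse.urlencode` from the standard library. The helper name `build_query_url` is hypothetical. This assumes the dictionary holds raw values (for example `2019-12-18 0:00:00`) rather than already percent-encoded ones, since `urlencode` would otherwise encode the `%` characters a second time. Whether the server accepts exactly this encoding is also an assumption.

```python
from urllib.parse import urlencode


def build_query_url(base_url, params):
    """Append params to base_url as a percent-encoded query string."""
    # urlencode joins the pairs with '&', so there is no trailing '&' to strip.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Raw (not pre-encoded) sample values taken from the search form above.
    params = {
        "P_START": "2019-12-18 0:00:00",
        "OPERATION": "AM160",
        "SITE": "2000",
    }
    base = "http://192.168.205.186:50000/shineraywebwar/application/setup/SfcObtainReport"
    print(build_query_url(base, params))
```

With raw values, spaces become `+` and colons become `%3A`, which matches the form of the `P_START` and `P_END` values used in the script.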

