I have working code that waits for an element on the page, like this:
wait = WebDriverWait(driver, 60)
try:
    imo_giris = wait.until(EC.visibility_of_element_located((By.XPATH, "//*[@id='P_ENTREE_HOME']")))
    imo_giris.send_keys(imo, "\n")
except TimeoutException:
    print("None")
    driver.close()
    continue
How can I integrate WebDriverWait into my code that searches the page source for email addresses with a regex? Here is the code that extracts the emails from each website:
results = []
for query in my_list:
    results.append(search(query, tld="com", num=3, stop=3, pause=2))
for result in results:
    url = list(result)
    print(*url, sep='\n')
    for site in url:
        driver = webdriver.Chrome()
        driver.get(site)
        doc = driver.page_source
        emails = re.findall(r'[\w\.-]+@[\w\.-]+', doc)
        for email in emails:
            print(email)
This finds emails in the page source, but sometimes the website is down, or the search takes a long time because the page source is very long. I want to limit the email regex search to 10 seconds per site. How can I do that?
Update: I solved the problem by switching to a better regex. The one I'm using now works fine:
r'\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,4}\b'
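A quick way to sanity-check this kind of pattern is to run it on a small made-up snippet before pointing it at real pages. The sketch below writes the quantifiers out explicitly; the helper name `extract_emails` and the sample HTML are only illustrative assumptions, not part of the original code.

```python
import re

# Pattern with explicit "+" quantifiers; the {2,4} limit means top-level
# domains longer than four letters will not match.
EMAIL_RE = re.compile(r'\b[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,4}\b')


def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_RE.findall(text)


# Hypothetical page source used only for this check.
sample = '<a href="mailto:info@example.com">Contact</a> sales@mail.example.org foo@bar'
print(extract_emails(sample))
```

Because the pattern has no capturing groups, `re.findall` returns the full matches, and `foo@bar` is skipped since it has no dot-separated domain.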
CodePudding user response:
You could create a custom expected condition, but that seems like overkill. Instead, you can use a simple while loop with a time limit:
...
emails = []
end_time = time.time() + 10
while time.time() < end_time and not emails:
    # re-read the source on each pass so content that loads late is picked up
    doc = driver.page_source
    emails = re.findall(r'[\w\.-]+@[\w\.-]+', doc)
print(emails)
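The same idea can be wrapped in a small reusable helper. This is only a sketch under a few assumptions: the name `find_emails_with_deadline`, the `get_source` callable, and the `poll` interval are made up for illustration, and the helper takes a function instead of the driver so it can be tried without a browser.

```python
import re
import time

EMAIL_RE = re.compile(r"[\w.-]+@[\w.-]+")


def find_emails_with_deadline(get_source, timeout=10.0, poll=0.5):
    """Poll get_source() until it contains an email or timeout seconds pass.

    get_source is any zero-argument callable returning the current page
    source as a string, e.g. lambda: driver.page_source.
    """
    deadline = time.monotonic() + timeout
    while True:
        emails = EMAIL_RE.findall(get_source())
        # Stop as soon as something is found or the time budget is used up.
        if emails or time.monotonic() >= deadline:
            return emails
        time.sleep(poll)  # avoid a busy loop between checks
```

With Selenium this would be called as `emails = find_emails_with_deadline(lambda: driver.page_source)`. Note that it only bounds the search itself. If `driver.get(site)` is the slow part, the load can be limited separately with `driver.set_page_load_timeout(10)` and a `TimeoutException` handler.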