I have a list of URLs that I screenshot and save periodically. Doing this by hand is time consuming, so I looked into how I could automate it.
I came across a Python solution using Selenium, and I have a few questions.
Is there a script available that I could amend (I'm very new to Python) that would enable me to:
Screenshot and store an image of each URL in a list
Automate this process so it runs periodically
Thanks in advance!
This is an example of code that I found and amended. I just wonder what more I can do:
from selenium import webdriver
from time import sleep
driver = webdriver.Chrome()
driver.get('URL')
sleep(1)
driver.get_screenshot_as_file("FW.png")
driver.quit()
print("end...")
I believe I may have errors, but this is my start. (Super new to coding so apologies!)
CodePudding user response:
The following is an example of how you could screenshot a number of pages at a fixed interval (every x seconds, minutes, hours, etc.) using Selenium and the schedule library. This example takes a screenshot of each page every 10 seconds:
import schedule
import time as t
from datetime import datetime
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-notifications")
webdriver_service = Service("chromedriver/chromedriver")  # path to where you saved the chromedriver binary

urls_list = ['https://yahoo.com/', 'https://eff.org/', 'https://invidious.io/']

def get_secondly_screenshots(list_of_urls):
    # Start a fresh browser for each scheduled run
    browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
    try:
        for url in list_of_urls:
            browser.get(url)
            # Wait up to 10 seconds for the <body> element to become clickable
            page = WebDriverWait(browser, 10).until(EC.element_to_be_clickable((By.TAG_NAME, "body")))
            t.sleep(5)  # extra time for late-loading content
            now = datetime.now()
            date_time = now.strftime("%Y_%m_%d_%H_%M_%S")
            # First label of the host name, e.g. 'yahoo' for 'https://yahoo.com/'
            sh_url = url.split('://')[1].split('.')[0]
            print(sh_url, date_time)
            page.screenshot(f'{sh_url}_{date_time}.png')
            print('screenshotted ', url)
            t.sleep(2)
    finally:
        # Close the browser even if one of the pages fails
        browser.quit()

# Run the job every 10 seconds
schedule.every(10).seconds.do(get_secondly_screenshots, list_of_urls=urls_list)

while True:
    schedule.run_pending()
    t.sleep(1)
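One thing to watch in the code above: `url.split('://')[1].split('.')[0]` keeps only the first label of the host name, so `https://www.example.com/` and `https://www.example.org/` would both be saved under a name starting with `www_`. A minimal sketch of a more robust naming helper, using only the standard library, could look like this. The function name `screenshot_filename` and the choice to replace unsafe characters with underscores are assumptions made for illustration, not part of the original answer.

```python
import re
from datetime import datetime
from urllib.parse import urlparse


def screenshot_filename(url, when=None):
    """Build a file name such as 'www_example_com_2024_01_31_09_05_00.png'.

    Hypothetical helper: the naming scheme is an assumption, adjust as needed.
    """
    when = when or datetime.now()
    parsed = urlparse(url)
    # Use the host plus the path so different pages on one site do not collide
    raw = parsed.netloc + parsed.path
    # Replace every run of characters that is not a letter or digit with '_'
    safe = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")
    stamp = when.strftime("%Y_%m_%d_%H_%M_%S")
    return f"{safe}_{stamp}.png"


print(screenshot_filename("https://eff.org/", datetime(2024, 1, 31, 9, 5, 0)))
# prints: eff_org_2024_01_31_09_05_00.png
```

Inside the loop, `page.screenshot(screenshot_filename(url))` could then take the place of the f-string.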
This setup is for Linux with chromedriver. You will need to set it up for your own system (there are many good tutorials for this, and the documentation is also helpful).
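For the setup step, recent Selenium releases (4.6 and later) ship with Selenium Manager, which locates or downloads a matching driver automatically, so the explicit `Service(...)` path can usually be dropped. Here is a minimal sketch, assuming Selenium 4.6+ and a locally installed Chrome. The helper names `chrome_arguments` and `make_browser`, the headless mode, and the window size are illustrative choices, not part of the answer above.

```python
def chrome_arguments(headless=True):
    """Return the Chrome command-line switches used by this sketch (assumed values)."""
    args = ["--disable-notifications", "--window-size=1280,1024"]
    if headless:
        # Run without a visible window; '--headless=new' needs a recent Chrome
        args.append("--headless=new")
    return args


def make_browser(headless=True):
    """Create a Chrome driver without a hard-coded chromedriver path."""
    # Imported here so the helper above can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for arg in chrome_arguments(headless):
        options.add_argument(arg)
    # No Service(...) argument: Selenium Manager resolves the driver (Selenium 4.6+)
    return webdriver.Chrome(options=options)
```

A call such as `browser = make_browser()` could then stand in for `webdriver.Chrome(service=webdriver_service, options=chrome_options)`.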
Documentation for schedule: https://schedule.readthedocs.io/en/stable/
Documentation for Selenium: https://www.selenium.dev/documentation/
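The schedule documentation covers longer intervals such as minutes, hours, and days. If adding a dependency is not desirable, the same run-every-N-seconds idea can be sketched with the standard library alone. The function `run_every` and its parameters (`max_runs`, `sleep`) are hypothetical names introduced for this illustration; the `sleep` parameter exists only so the loop can be exercised without really waiting.

```python
import time


def run_every(interval_seconds, job, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing interval_seconds between calls.

    max_runs=None loops forever; a number stops after that many calls.
    Returns how many times the job ran.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as exc:
            # Keep the loop alive if a single run fails (for example a page timeout)
            print(f"job failed: {exc}")
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Example: run a stand-in job three times with a short pause in between
run_every(0.1, lambda: print("taking screenshots..."), max_runs=3)
```

With the function from the answer, `run_every(600, lambda: get_secondly_screenshots(urls_list))` would take screenshots roughly every ten minutes. The pause is measured from the end of one run, so the actual period is the interval plus however long the job takes.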