I'm fairly new to Python and coding in general, so please bear with me. Overall, what I'm attempting is to create a script that opens my monthly fire department training, goes to the monthly training videos, makes a list of the videos in that data pod (the videos change monthly, and so does the number of them), and then plays them. I've used Selenium to access the webpage and log in. I'm currently trying to build the list of that month's videos that I'll be able to pull from and play. Shown in the pictures are the "Assignments" and the code layout of the inspected video elements. Below is the code I've come up with to pull the video links, but every time I run it, it comes back with email obfuscation. I'm not sure what's causing this or how to get around it. Any help would be appreciated.
Edit: added all of my code.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.options import Options #for maximize, disabling pop ups, enabling/disabling ext, etc..
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import requests
from bs4 import BeautifulSoup
import httplib2
import re
#Target Solutions Credentials
username = "#"
password = "#"
#opening web page
chrome_options = Options()
chrome_options.add_experimental_option("detach", True)
s = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=s, options=chrome_options)
#open window in maximize
driver.maximize_window()
#website
driver.get('https://www.targetsolutions.com/')
driver.implicitly_wait(10)
#login screen button
lms_login_button = driver.find_element(by=By.XPATH, value='//*[@id="riverbend-ButtonElement--oouR2NYdw9Ns5lb2VrED"]')
lms_login_button.click()
#username & password
# type the credentials defined above (send_keys returns None, so don't reassign the variables)
driver.find_element(by=By.XPATH, value='//*[@id="username"]').send_keys(username)
driver.find_element(by=By.XPATH, value='//*[@id="password"]').send_keys(password)
#Login button
login_screen_button = driver.find_element(by=By.XPATH, value='//*[@id="form-login"]/ul/li[3]/input')
login_screen_button.click()
# Assignments page
my_assignments = driver.find_element(by=By.XPATH, value ='//*[@id="navLeft"]/ul/li[2]/a')
my_assignments.click()
# EMAIL OBFUSCATION===============
def email(string):
    # the first two hex digits are the XOR key
    r = int(string[:2], 16)
    email = ''.join([chr(int(string[i:i + 2], 16) ^ r)
                     for i in range(2, len(string), 2)])
    return email
print(email('d0a3a5a0a0bfa2a490a4b1a2b7b5a4a3bfbca5a4b9bfbea3feb3bfbd'))
# WEBSCRAPER====================
url = 'https://app.targetsolutions.com/tsapp/dashboard/pl_fb/index.cfm?fuseaction=c_pro_assignments.showHome'
links = []
website = requests.get(url)
website_text = website.text
soup = BeautifulSoup(website_text, features='html.parser')
for link in soup.find_all('a'):
    links.append(link.get('href'))
for link in links:
    print(link)
Results:
====== WebDriver manager ======
Current google-chrome version is 107.0.5304
Get LATEST chromedriver version for 107.0.5304 google-chrome
Driver [C:\Users\Wrd_3.wdm\drivers\chromedriver\win32\107.0.5304.62\chromedriver.exe] found in cache
DevTools listening on ws://127.0.0.1:55154/devtools/browser/d4e0b939-a7c4-4cfb-b828-1187823a031e
[email protected]
/cdn-cgi/l/email-protection#3a494f4a4a55484e7a4e5b485d5f4e4955564f4e5355544914595557
I understand this to be some form of Cloudflare email obfuscation.
CodePudding user response:
I can't tell exactly what's going on, because you haven't given us enough information to check and verify.
Basically, you are dealing with a website that sits behind Cloudflare protection, and the result you get is an obfuscated email address ([email protected]).
You can try to decode the result with this script:
def email(string):
    # the first two hex digits are the XOR key
    r = int(string[:2], 16)
    email = ''.join([chr(int(string[i:i + 2], 16) ^ r)
                     for i in range(2, len(string), 2)])
    return email
print(email('d0a3a5a0a0bfa2a490a4b1a2b7b5a4a3bfbca5a4b9bfbea3feb3bfbd'))  # prints support@targetsolutions.com
This is Cloudflare's Email Address Obfuscation feature; you can read about it in Cloudflare's documentation.
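To see why that decoder works, it may help to look at the scheme in both directions: the first two hex digits are a one-byte key, and every following pair of digits is one character XORed with that key. Below is a minimal round-trip sketch under that assumption. The `encode` helper and its `key` parameter are hypothetical and exist only for illustration (this is not Cloudflare's actual code), and the sketch assumes plain ASCII addresses.

```python
def decode(hex_string: str) -> str:
    """Decode a hex string whose first byte is the XOR key for the rest."""
    data = bytes.fromhex(hex_string)
    key = data[0]
    return ''.join(chr(b ^ key) for b in data[1:])


def encode(address: str, key: int) -> str:
    """Hypothetical inverse, for illustration: emit the key, then each char XOR key."""
    return f'{key:02x}' + ''.join(f'{ord(ch) ^ key:02x}' for ch in address)


# Round trip with a made-up address and an arbitrary key.
sample = encode('someone@example.com', key=0x3A)
print(sample)
print(decode(sample))
```

Because XOR is its own inverse, applying the same key twice gives back the original character, which is why the decoder only needs the leading key byte.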
CodePudding user response:
Using the js2py module, you can reuse Cloudflare's own JavaScript decode routine:
import js2py

js_script = """\
function decode(email) {
    function r(e, t) {
        var r = e.substr(t, 2);
        return parseInt(r, 16);
    }
    function n(n, c) {
        for (var o = "", a = r(n, c), i = c + 2; i < n.length; i += 2) {
            var l = r(n, i) ^ a;
            o += String.fromCharCode(l);
        }
        return o;
    }
    var l = "/cdn-cgi/l/email-protection#";
    return n(email, email.indexOf(l) + l.length);
}
"""
decoder = js2py.eval_js(js_script)
email = decoder(
    "/cdn-cgi/l/email-protection#d0a3a5a0a0bfa2a490a4b1a2b7b5a4a3bfbca5a4b9bfbea3feb3bfbd"
)
print(email)
Running the script prints your email.
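If installing js2py is not an option, the same routine is short enough to port to plain Python. The sketch below assumes the links were collected the way the question does it, as a list of `href` values that may include `None`. The names `PREFIX` and `decode_cf_href`, and the `/some/other/page` entry, are made up for this example.

```python
PREFIX = '/cdn-cgi/l/email-protection#'


def decode_cf_href(href):
    """Return the decoded address for an email-protection href, else None."""
    if not href or PREFIX not in href:
        return None
    # Everything after the prefix is the hex payload (key byte + XORed chars).
    payload = href[href.index(PREFIX) + len(PREFIX):]
    try:
        data = bytes.fromhex(payload)
    except ValueError:
        return None  # not valid hex, so leave the link alone
    if not data:
        return None
    key = data[0]
    return ''.join(chr(b ^ key) for b in data[1:])


links = [
    None,                # <a> tags without an href yield None
    '/some/other/page',  # hypothetical ordinary link
    '/cdn-cgi/l/email-protection#3a494f4a4a55484e7a4e5b485d5f4e4955564f4e5355544914595557',
]
for link in links:
    decoded = decode_cf_href(link)
    print(decoded if decoded is not None else link)
```

Returning `None` for anything that is not an email-protection link keeps the helper safe to call on every entry in the list, so ordinary links pass through untouched.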