I've been having trouble extracting the phone number that appears after clicking the "afficher le numero" button, without using Selenium.
Here is the URL: https://www.mubawab.ma/fr/a/7469776/beau-terrain-à-la-vente-à-hay-izihar-superficie-68-m²-
Here's the code I tried:
import re
import requests
from bs4 import BeautifulSoup
url = "https://www.mubawab.ma/fr/a/7469776/beau-terrain-à-la-vente-à-hay-izihar-superficie-68-m²-"
phone_url = "https://www.mubawab.ma/jSpBT9/gAEhoRFWpm8vGww==', 'adPage"
ad_id = re.search(r"(\d+)\.htm", url).group(1)
html_text = requests.get(phone_url.format(ad_id)).text
soup = BeautifulSoup(html_text, "html.parser")
phone = re.search(r"getTrackingPhone\((.*?)\)", html_text).group(1)
print(soup.select_one(".texto").get_text(strip=True), phone)
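Note that the URL above has no ".htm" suffix, so a pattern anchored on it returns None and the .group(1) call raises AttributeError. A minimal sketch of pulling the numeric ad ID out of the path instead, assuming the ID is always the segment right after /a/ (the helper name extract_ad_id is made up for illustration, not something the site provides):

```python
import re
from typing import Optional
from urllib.parse import urlparse


def extract_ad_id(url: str) -> Optional[str]:
    """Return the numeric ad ID from a /fr/a/<id>/<slug> style URL, or None."""
    # Assumption: the ad ID is the run of digits directly after "/a/".
    path = urlparse(url.strip()).path
    match = re.search(r"/a/(\d+)(?:/|$)", path)
    return match.group(1) if match else None


url = "https://www.mubawab.ma/fr/a/7469776/beau-terrain-à-la-vente-à-hay-izihar-superficie-68-m²-"
print(extract_ad_id(url))
```

Returning None rather than calling .group(1) directly lets the caller handle an unexpected URL shape instead of crashing.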
CodePudding user response:
In this case, I think you need to use Selenium: it is quite difficult to work out how the payload is encoded, and doing so would take many times longer. The most likely candidate string:
YR3gCzHEBrHR63YyPD95vui5tCyoyGZZRCtdUTrrJtw=
Converted to:
ᣢ㡒䄬ീ嬤㠰℠尯〴䀶̨ۀ⪠嘡ਣ䰧〪ိ䁇䁦㗠߆ྠ㎁㠤怬Ⱡ⧓iⴠ祬ö~删ങ校屵䀠瀤槨‣㏰⏠᪠ӠѴ㢠ზ5ⵝ䯭涇䰧ࠢ⬠ӕ倠㓡Ġ༠ⲠǠË䜕₈Ф纾㾚ુ圪$㛀ŚR儨⒗Ᏼာ挥狩⬕䐠⮀㚐䈦ݕҊ冑懖咏࠳⧜性ᘂ㙻ⓔዠ佊摾妤䕖勩ᬕᣱ⋍庰䶬䟦䝱凸潹㈠䕪⠥㡃忭夠㭍㞹慳ၭ"☷ᦞ䂢䠷Р睢ୀ㌵׃ऄ〢㝒桾ᾠ☡犱ⶼᔨᔔᕢ㕒⣢ℰ䝐⒡ڹ䐫㋜㸩啒ᾼ昂纙ઽ瘲Ⲙẻˇ帠湧၍令偱夦䡮ऀᕚદ慢爼⠖䧜傉䤴夬݅䡯兰摍䨳0㢔仦摔䈤沥冠汈᠂ᕢń⠀㥰䣖ѵဠ幸栠
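To illustrate why the payload is hard to work with, here is a small sketch that treats the candidate string as Base64 (that it is Base64 at all is an assumption based on its alphabet and the trailing "=") and inspects the raw bytes:

```python
import base64

# Candidate string quoted above; assumed (not confirmed) to be Base64.
candidate = "YR3gCzHEBrHR63YyPD95vui5tCyoyGZZRCtdUTrrJtw="

raw = base64.b64decode(candidate)
print(len(raw))   # number of decoded bytes
print(raw.hex())  # hex dump of the opaque payload

# Check whether the bytes are readable text.
try:
    print(raw.decode("utf-8"))
except UnicodeDecodeError:
    print("not valid UTF-8 text")
```

The decoded value is 32 bytes long, which would be consistent with a hash or an encrypted token rather than readable text, although that is only a guess.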
Maybe the answer is obvious to someone else, but here is my version with Selenium. Don't forget to download the webdriver for your browser (Chrome in this example) and specify its path in the code:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
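path = "/path/to/chromedriver"  # placeholder: location of your downloaded webdriver
# Note: newer Selenium 4 releases expect webdriver.Chrome(service=Service(path)) instead.
driver = webdriver.Chrome(path)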
driver.get('https://www.mubawab.ma/fr/a/7469776/beau-terrain-à-la-vente-à-hay-izihar-superficie-68-m²-')
script = BeautifulSoup(driver.page_source, 'lxml').find('div', class_='hide-phone-number-box').get('onclick')
elem = driver.find_element(By.CLASS_NAME, 'hide-phone-number-box')
driver.execute_script(script, elem)
timeout = 5
try:
    # wait until the phone number element appears after the click handler runs
    element_present = EC.presence_of_element_located((By.CLASS_NAME, 'phoneText'))
    WebDriverWait(driver, timeout).until(element_present)
    phone = BeautifulSoup(driver.page_source, 'lxml').find('p', class_='phoneText').getText()
    print(phone)
except TimeoutException:
    print("Timed out waiting for page to load")
OUTPUT:
212 6 27 47 75 46
CodePudding user response:
I found a solution to my own problem without using Selenium. Plain requests can't get the phone number because the page uses JavaScript to build the part of the page that contains it. But you can use requests_html to render the JavaScript and get the phone number:
from requests_html import HTMLSession
url = "https://www.mubawab.ma/fr/a/7469776/beau-terrain-à-la-vente-à-hay-izihar-superficie-68-m²-"
session = HTMLSession()
r = session.get(url)
# get the onclick code from the button
onclick = r.html.xpath('//*[@id="stickyDiv"]/div[2]/div[1]/div')[0].attrs['onclick']
# put the onclick code in a script
script = f"() => {{{onclick}}}"
# render the script
r.html.render(sleep=1, timeout=20, script=script)
# get the phone number
phone_number = r.html.xpath('//*[@id="response"]/p')[0].text
print(phone_number)
OUTPUT:
06 27 47 75 46
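The Selenium and requests_html versions print the same number in two formats (with the 212 country code and with the national leading 0). If you need to compare or store the results, a minimal sketch of normalizing both, assuming Moroccan numbers only (the helper name normalize_ma_phone is made up for illustration, and prefixes such as 00212 are not handled):

```python
import re


def normalize_ma_phone(text: str) -> str:
    """Normalize a scraped Moroccan phone number to the form +212XXXXXXXXX."""
    digits = re.sub(r"\D", "", text)  # keep digits only
    if digits.startswith("212"):
        national = digits[3:]         # drop the country code
    elif digits.startswith("0"):
        national = digits[1:]         # drop the national trunk prefix
    else:
        national = digits
    return "+212" + national


# Both scraped outputs should normalize to the same value.
print(normalize_ma_phone("212 6 27 47 75 46") == normalize_ma_phone("06 27 47 75 46"))
```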