I want to scrape tweets in Python with Selenium, but get_attribute does not work for me. Here is my code. Can you help me fix it?
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://twitter.com/elonmusk")
time.sleep(3)
SCROLL_PAUSE_TIME = 4
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
    tweet = driver.find_elements(By.XPATH,"//div[@id='id__z5kb0qs2bgp']").get_attribute("innerHTML").splitlines()
    time.sleep(SCROLL_PAUSE_TIME)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
driver.quit()
CodePudding user response:
Something like this should work for grabbing the tweet text:
tweets = driver.find_elements(By.XPATH, '//div[@data-testid="tweetText"]')
for tweet in tweets:
    print(tweet.get_attribute('innerText'))
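For context, the line in the question fails because find_elements returns a Python list, and a list has no get_attribute method; it has to be called on each element. Combined with the scroll loop from the question, this could look roughly like the sketch below. The function name collect_tweets and the max_scrolls cap are my own additions, and the literal "xpath" is simply the value of By.XPATH, used here so the snippet carries no Selenium import of its own. It also assumes the tweet text is a good enough key for de-duplication, since the same tweet can be found again after a scroll.

```python
import time

# Locator taken from the answer above; Twitter may change this attribute.
TWEET_XPATH = '//div[@data-testid="tweetText"]'


def collect_tweets(driver, pause=4.0, max_scrolls=50):
    """Scroll to the bottom repeatedly and gather tweet texts without duplicates.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    find_elements(by, value) and execute_script(script).
    """
    seen = set()
    tweets = []
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):  # hypothetical safety cap, not in the question
        # find_elements returns a list, so read the attribute per element.
        for element in driver.find_elements("xpath", TWEET_XPATH):
            text = element.get_attribute("innerText")
            if text and text not in seen:
                seen.add(text)
                tweets.append(text)
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)  # give the page time to load more tweets
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # page height did not change, so nothing new was loaded
        last_height = new_height
    return tweets
```

With a real browser you would create the driver with webdriver.Chrome(), call driver.get(...) on the profile page, pass the driver to collect_tweets, and call driver.quit() afterwards.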
CodePudding user response:
To extract the tweets, wait for the elements with WebDriverWait and visibility_of_all_elements_located(), then collect their text in a list comprehension. You can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.get('http://twitter.com/elonmusk')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[data-testid='tweetText'] span")))])
Using XPATH:
driver.get('http://twitter.com/elonmusk')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@data-testid='tweetText']//span")))])
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
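If it helps to picture what the wait in this answer does: WebDriverWait(...).until(...) keeps calling the condition until it returns something truthy or the timeout runs out. The snippet below is only a plain-Python illustration of that polling idea, not Selenium's code; wait_until is a made-up helper name, and it raises the built-in TimeoutError where Selenium would raise its own TimeoutException.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it is truthy or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # e.g. a non-empty list of elements
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # wait a little before checking again
```

The advantage over a fixed time.sleep(3) is that the script continues as soon as the tweets are present and fails with a clear error if they never show up.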