How to get tweets from Twitter using Selenium in Python

I want to get tweets in Python with Selenium, but get_attribute didn't work for me. Here is my code. Can you help me fix it?

driver = webdriver.Chrome()
driver.get("http://twitter.com/elonmusk")
time.sleep(3)
SCROLL_PAUSE_TIME = 4
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
    tweet = driver.find_elements(By.XPATH,"//div[@id='id__z5kb0qs2bgp']").get_attribute("innerHTML").splitlines()
    time.sleep(SCROLL_PAUSE_TIME)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
driver.quit()

CodePudding user response:

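find_elements() returns a list, not a single element, so calling get_attribute() directly on its result fails. The id in your XPath also looks auto-generated, so it is unlikely to match on another page load. Something like this should work for grabbing the tweet text: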

tweets = driver.find_elements(By.XPATH, '//div[@data-testid="tweetText"]')
for i in tweets:
    print(i.get_attribute('innerText'))

CodePudding user response:

To extract the tweets, use WebDriverWait with visibility_of_all_elements_located() so the elements have time to render, then read their text in a list comprehension. Either of the following locator strategies works:

Using CSS_SELECTOR:

driver.get('http://twitter.com/elonmusk')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[data-testid='tweetText'] span")))])

Using XPATH:

driver.get('http://twitter.com/elonmusk')
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@data-testid='tweetText']//span")))])

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
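
For completeness, here is a rough sketch of how the scrolling loop from the question could be combined with the data-testid locator from the answers, collecting tweet text on each pass instead of only once. The names collect_tweets, scrape_profile, pause and max_scrolls are made up for this example and are not part of Selenium. The locator is taken from the answers above and may stop matching if the page markup changes. The sketch does not handle elements going stale between lookup and read, or pages that only show tweets after login.

```python
import time


def collect_tweets(driver, locator, pause=4.0, max_scrolls=20):
    """Scroll the page and return the distinct tweet texts seen, in order."""
    seen = {}  # a dict keeps insertion order and drops duplicates

    def harvest():
        # find_elements returns a list, so read each element individually.
        for element in driver.find_elements(*locator):
            text = element.get_attribute("innerText")
            if text:
                seen.setdefault(text, None)

    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):  # hypothetical cap so the loop always ends
        harvest()
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # the page did not grow, so nothing new was loaded
        last_height = new_height
    harvest()  # pick up whatever the last scroll loaded
    return list(seen)


def scrape_profile(url="http://twitter.com/elonmusk"):
    # Selenium is imported here so collect_tweets itself does not depend on it.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        time.sleep(3)  # same initial wait as in the question
        locator = (By.XPATH, '//div[@data-testid="tweetText"]')
        return collect_tweets(driver, locator)
    finally:
        driver.quit()  # close the browser even if scraping fails
```

Calling scrape_profile() opens Chrome and returns the collected texts as a list. Harvesting inside the loop matters on pages that drop off-screen items from the DOM while scrolling, and the try/finally ensures driver.quit() runs even when an exception is raised.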