How do I stop this for loop from repeating its previous output while still using a range?
This for loop reprints the output of every previous iteration each time it moves to the next number. Instead of going from 0 to 20 one time, it prints rows 0-1, then 0-2, 0-3, 0-4, and so on. I want it to go from 0 to 20 once and not duplicate itself.
import time
from selenium import webdriver
import selenium
from selenium.webdriver.chrome import service
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
#class scraperdata():
ser = Service(r"C:\Program Files (x86)\chromedriver.exe")
options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])
driver = webdriver.Chrome(options=options,service=ser)
driver.get('https://soundcloud.com/jujubucks')
print(driver.title)
wait = WebDriverWait(driver,30)
wait.until(EC.element_to_be_clickable((By.ID,"onetrust-accept-btn-handler"))).click()
try:
    song_list = []
    i = 1
    for _ in range(20):
        song_contents = driver.find_element(By.XPATH, "//li[@class='soundList__item'][{}]".format(i))
        driver.execute_script("arguments[0].scrollIntoView(true);", song_contents)
        search = song_contents.find_element(By.XPATH, ".//a[contains(@class,'soundTitle__username')]/span").text
        search_song = song_contents.find_element(By.XPATH, ".//a[contains(@class,'soundTitle__title')]/span").text
        search_date = song_contents.find_element(By.XPATH, ".//time[contains(@class,'relativeTime')]/span").text
        search_plays = song_contents.find_element(By.XPATH, ".//span[contains(@class,'sc-ministats-small')]/span").text
        i += 1
        if _ == Exception:
            break
        option = {
            'Artist': search,
            'Song_title': search_song,
            'Date': search_date,
            'Streams': search_plays
        }
        song_list.append(option)
        df = pd.DataFrame(song_list)
        print(df)
except Exception:
    pass
driver.quit()
Output:
Stream Juju Bucks music | Listen to songs, albums, playlists for free on SoundCloud
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago  48 plays
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago  48 plays
2  Juju Bucks              Party Ka Mngani Ft. X-Poll  Posted 1 year ago  72 plays
       Artist                              Song_title               Date    Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago   31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago   48 plays
2  Juju Bucks              Party Ka Mngani Ft. X-Poll  Posted 1 year ago   72 plays
3  Juju Bucks      Joy Ft. Black Sushi & Gavin Bowden  Posted 1 year ago  122 plays
CodePudding user response:
The for loop's range is fine. The problem is that each iteration appends a new item to song_list, which is created outside the loop and so still holds every earlier item, and then prints a DataFrame built from the whole list. Moving song_list = [] into the loop would make each print show only the current song.
However, you would then no longer be keeping track of all the songs when the loop ends. You probably don't want to print inside the loop at all. Build the DataFrame and print it once, after the loop.
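The accumulate-then-print-once pattern can be sketched without a browser. This is only an illustration under stated assumptions: fake_rows is a hypothetical stand-in for the values the real script reads from the page (the three rows are copied from the output in the question), and the loop still uses a range.

```python
import pandas as pd

# Hypothetical stand-in for the scraped values; the real script would
# read these from the page with Selenium.
fake_rows = [
    ("Juju Bucks", "Squad Too Deep Ft. Cool Prince (Outro)", "Posted 1 year ago", "31 plays"),
    ("Juju Bucks", "Tropikana ft. P-Dogg Amazing", "Posted 1 year ago", "48 plays"),
    ("Juju Bucks", "Party Ka Mngani Ft. X-Poll", "Posted 1 year ago", "72 plays"),
]

song_list = []
for i in range(len(fake_rows)):
    artist, title, date, plays = fake_rows[i]
    # Only collect inside the loop; do not build or print the DataFrame here.
    song_list.append({
        "Artist": artist,
        "Song_title": title,
        "Date": date,
        "Streams": plays,
    })

# Build and print the DataFrame a single time, after the loop has finished.
df = pd.DataFrame(song_list)
print(df)
```

Because the DataFrame is created after the loop, the table is printed once and contains every collected row.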
CodePudding user response:
You should move the dataframe allocation outside of the for loop:
for _ in range(20):
    …
    song_list.append(option)
df = pd.DataFrame(song_list)
print(df)
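A related simplification, offered as a sketch rather than as part of either answer: XPath positions are 1-based, so the range itself can supply the item index and the separate counter i is no longer needed. The helper name item_xpaths is hypothetical; the XPath template is the one from the question.

```python
# XPath template taken from the question; {} is filled with a 1-based position.
ITEM_XPATH = "//li[@class='soundList__item'][{}]"


def item_xpaths(count):
    """Return the XPath for each of the first `count` list items (hypothetical helper)."""
    # range(1, count + 1) yields 1..count, matching XPath's 1-based positions.
    return [ITEM_XPATH.format(i) for i in range(1, count + 1)]


xpaths = item_xpaths(20)
print(xpaths[0])
print(xpaths[-1])
```

In the scraper, each of these strings would be passed to driver.find_element(By.XPATH, ...) inside the loop, with the DataFrame still built once afterwards.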