Not getting any data entry with 'find_all' while scraping Spotify Charts webpage

Time:02-11

I am trying to scrape the Spotify Charts page containing the top 200 songs in India on 2022-02-01. My Python code:

import requests
from bs4 import BeautifulSoup as bs

# Reads the webpage and returns the parsed soup.
def get_webpage(link):
    page = requests.get(link)
    soup = bs(page.content, 'html.parser')
    return soup

# Collects the chart data for one day and returns it as a list of rows.
# The entries are (in order): Song, Artist, Date, Play Count, Rank
def get_data():
    rows = []
    date = '2022-02-01'  # chart date, also used in the URL
    soup = get_webpage('https://spotifycharts.com/regional/in/daily/' + date)
    entries = soup.find_all("td", class_="chart-table-track")
    streams = soup.find_all("td", class_="chart-table-streams")
    print(entries)
    for i, (entry, stream) in enumerate(zip(entries, streams)):
        song = entry.find('strong').get_text()
        artist = entry.find('span').get_text()[3:]  # drop the 3-character prefix
        play_count = stream.get_text()
        rows.append([song, artist, date, play_count, i + 1])
    return rows

I tried printing entries and streams, but both come back empty:

entries = soup.find_all("td", class_ = "chart-table-track")
streams = soup.find_all("td", class_= "chart-table-streams")

I copied this from a referenced script and tried running the full script, but that gives the error 'NoneType' object has no attribute 'find_all' in the country function. Hence I tried a smaller section, as above.

CodePudding user response:

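The NoneType error suggests that a lookup for the page elements found nothing. If you print soup, you will see that the elements targeted by the entries and streams selectors do not exist in the returned page. In the smaller snippet, find_all simply returns an empty list for the same reason.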
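One way to confirm this before parsing is to summarise what the server actually returned. The sketch below is only a rough diagnostic: the helper names describe_response and fetch_and_describe are made up for illustration, and the strings it searches for ("captcha", "cloudflare") are assumed heuristics, not an official way to detect a block page.

```python
def describe_response(status_code, html):
    """Summarise a fetched page so an empty find_all result is easier to explain."""
    lowered = html.lower()
    return {
        "status_code": status_code,
        # True only if the class name used by the selectors appears in the HTML
        "has_chart_cells": "chart-table-track" in lowered,
        # Heuristic markers (assumptions) that hint at a challenge page
        "mentions_captcha": "captcha" in lowered,
        "mentions_cloudflare": "cloudflare" in lowered,
    }


def fetch_and_describe(url):
    """Fetch url with requests and return the summary from describe_response."""
    import requests  # third-party; imported here so the helper above stays dependency-free

    page = requests.get(url, timeout=30)
    return describe_response(page.status_code, page.text)


# Example call (needs network access):
# print(fetch_and_describe('https://spotifycharts.com/regional/in/daily/2022-02-01'))
```

If has_chart_cells is False, the selectors have nothing to match, whatever the rest of the parsing code does.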

After checking your soup object, it seems that Cloudflare is blocking your request to the Spotify Charts page and expects a CAPTCHA to be completed. There is a library for bypassing Cloudflare called "cloudscraper".
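A minimal sketch of how cloudscraper could replace the requests.get call is shown here. It assumes the package is installed (pip install cloudscraper); fetch_html is a hypothetical helper name, and there is no guarantee that the library gets past the challenge on this particular site.

```python
def fetch_html(url, scraper=None):
    """Fetch url and return the response body as text.

    scraper can be any object with a requests-style get() method.
    When it is omitted, a cloudscraper session is created.
    """
    if scraper is None:
        import cloudscraper  # third-party: pip install cloudscraper

        scraper = cloudscraper.create_scraper()
    response = scraper.get(url)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


# Example call (needs network access and bs4):
# from bs4 import BeautifulSoup as bs
# html = fetch_html('https://spotifycharts.com/regional/in/daily/2022-02-01')
# soup = bs(html, 'html.parser')
# entries = soup.find_all("td", class_="chart-table-track")
```

The scraper returned by create_scraper behaves like a requests session, so the rest of the parsing code can stay as it is.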
