Scraping data from Spotify charts

Time:09-16

I want to scrape the daily top 200 songs from the Spotify Charts website. I am parsing the page's HTML to get each song's artist, name, and stream count, but the following code returns nothing. How can I get this information with this approach?

for a in soup.find("div",{"class":"Container-c1ixcy-0 krZEp encore-base-set"}):
    for b in a.findAll("main",{"class":"Main-tbtyrr-0 flXzSu"}):
        for c in b.findAll("div",{"class":"Content-sc-1n5ckz4-0 jyvkLv"}):
            for d in c.findAll("div",{"class":"TableContainer__Container-sc-86p3fa-0 fRKUEz"}):
                print(d) 

Let's say this is the song list that I want to scrape.

CodePudding user response:

A non-Selenium solution:

import requests
import pandas as pd


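# Request the chart data as JSON instead of parsing the page HTML
url = 'https://charts-spotify-com-service.spotify.com/public/v0/charts'
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error
chart = []
# Use the entries of the first chart in the response
for entry in response.json()['chartEntryViewResponses'][0]['entries']:
    chart.append({
        "Rank": entry['chartEntryData']['currentRank'],
        # A track can have several artists; join their names into one string
        "Artist": ', '.join([artist['name'] for artist in entry['trackMetadata']['artists']]),
        "TrackName": entry['trackMetadata']['trackName']
    })
df = pd.DataFrame(chart)
print(df.to_string(index=False))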

OUTPUT:

Rank                      Artist                                                 TrackName
    1            Bizarrap,Quevedo                     Quevedo: Bzrp Music Sessions, Vol. 52
    2                Harry Styles                                                 As It Was
    3  Bad Bunny,Chencho Corleone                                           Me Porto Bonito
    4                   Bad Bunny                                          Tití Me Preguntó
    5               Manuel Turizo                                                La Bachata
    6                     ROSALÍA                                                  DESPECHÁ
    7                   BLACKPINK                                                Pink Venom
    8     David Guetta,Bebe Rexha                                           I'm Good (Blue)
    9                 OneRepublic                                           I Ain't Worried
   10                   Bad Bunny                                                    Efecto
   11                 Chris Brown                                       Under The Influence
   12                  Steve Lacy                                                 Bad Habit
   13     Bad Bunny,Bomba Estéreo                                             Ojitos Lindos
   14                   Kate Bush    Running Up That Hill (A Deal With God) - 2018 Remaster
   15                        Joji                                             Glimpse of Us
   16                 Nicki Minaj                                         Super Freaky Girl
   17                   Bad Bunny                                               Moscow Mule
   18                   Rosa Linn                                                      SNAP
   19               Glass Animals                                                Heat Waves
   20                     KAROL G                                                  PROVENZA
   21  Charlie Puth,Jung Kook,BTS                   Left and Right (Feat. Jung Kook of BTS)
   22                Harry Styles                                        Late Night Talking
   23 The Kid LAROI,Justin Bieber                                 STAY (with Justin Bieber)
   24                   Tom Odell                                              Another Love
   25                 Central Cee                                                      Doja
   26             Stephen Sanchez                                         Until I Found You
   27                   Bad Bunny                                                  Neverita
   28        Post Malone,Doja Cat               I Like You (A Happier Song) (with Doja Cat)
   29                       Lizzo                                           About Damn Time
   30            Nicky Youre,dazy                                                   Sunroof
   31   Elton John,Britney Spears                                            Hold Me Closer
   32                   Luar La L                                                     Caile
   33               KAROL G,Maldy                                                  GATÚBELA
   34                  The Weeknd                                               Die For You
   35       Bad Bunny,Jhay Cortez                                                     Tarot
   36  James Hype,Miggy Dela Rosa                                                   Ferrari
   37             Imagine Dragons                                                     Bones
   38    Elton John,Dua Lipa,PNAU                                   Cold Heart - PNAU Remix
   39           The Neighbourhood                                           Sweater Weather
   40                       Ghost                                           Mary On A Cross
   41      Shakira,Rauw Alejandro                                               Te Felicito
   42               Justin Bieber                                                     Ghost
   43    Bad Bunny,Rauw Alejandro                                                     Party
   44             Drake,21 Savage                             Jimmy Cooks (feat. 21 Savage)
   45                    Doja Cat Vegas (From the Original Motion Picture Soundtrack ELVIS)
   46   Camila Cabello,Ed Sheeran                                Bam Bam (feat. Ed Sheeran)
   47 Rauw Alejandro,Lyanno,Brray                                                    LOKERA
   48                      Rels B                                            cómo dormiste?
   49                  The Weeknd                                           Blinding Lights
   50              Arctic Monkeys                                                       505
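
The same flattening step can be written as a small helper that tolerates missing keys. This is a minimal sketch, assuming the payload keeps the key names used above (`chartEntryViewResponses`, `entries`, `chartEntryData`, `trackMetadata`). `flatten_entries` is a hypothetical helper name, and `sample` is a hand-written stand-in built from the first two rows of the output above, not a real API response.

```python
def flatten_entries(payload):
    """Turn a charts payload into a list of flat row dicts."""
    rows = []
    # Fall back to a single empty view if the key is missing or empty.
    views = payload.get("chartEntryViewResponses") or [{}]
    for entry in views[0].get("entries", []):
        data = entry.get("chartEntryData", {})
        meta = entry.get("trackMetadata", {})
        artists = [a.get("name", "") for a in meta.get("artists", [])]
        rows.append({
            "Rank": data.get("currentRank"),
            "Artist": ", ".join(artists),
            "TrackName": meta.get("trackName"),
        })
    return rows


# Hand-written sample mirroring the key names above (assumption, not real API output).
sample = {
    "chartEntryViewResponses": [{
        "entries": [
            {
                "chartEntryData": {"currentRank": 1},
                "trackMetadata": {
                    "trackName": "Quevedo: Bzrp Music Sessions, Vol. 52",
                    "artists": [{"name": "Bizarrap"}, {"name": "Quevedo"}],
                },
            },
            {
                "chartEntryData": {"currentRank": 2},
                "trackMetadata": {
                    "trackName": "As It Was",
                    "artists": [{"name": "Harry Styles"}],
                },
            },
        ]
    }]
}

rows = flatten_entries(sample)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed straight to `pd.DataFrame(rows)`, as in the answer above.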

CodePudding user response:

In the example link you provided, there aren't 200 songs, but only 50. The following is one way to get those songs:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
import time as t
import pandas as pd


chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("window-size=1920,1080")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)

url = 'https://charts.spotify.com/charts/view/regional-tr-daily/2022-09-14'
browser.get(url)
wait = WebDriverWait(browser, 5)
try:
    wait.until(EC.element_to_be_clickable((By.ID, "onetrust-accept-btn-handler"))).click()
    print("accepted cookies")
except Exception as e:
    print('no cookie button')
header_to_be_removed = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'header[data-testid="charts-header"]')))
browser.execute_script("""
var element = arguments[0];
element.parentNode.removeChild(element);
""", header_to_be_removed)
while True:
    try:
        show_more_button = wait.until(EC.element_to_be_clickable((By.XPATH, '//div[@data-testid="load-more-entries"]//button')))
        show_more_button.location_once_scrolled_into_view
        t.sleep(5)
        show_more_button.click()
        print('clicked to show more')
        t.sleep(3)
    except TimeoutException:
        print('all done')
        break
songs = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'li[data-testid="charts-entry-item"]')))
print('we have', len(songs), 'songs')
song_list = []
for song in songs:
    song.location_once_scrolled_into_view
    t.sleep(1)
    title = song.find_element(By.CSS_SELECTOR, 'p[class^="Type__TypeElement-"]')
    artist = song.find_element(By.CSS_SELECTOR, 'span[data-testid="artists-names"]')
    song_list.append((artist.text, title.text))
# Column order must match the (artist, title) tuples appended above
df = pd.DataFrame(song_list, columns = ['Artist', 'Title'])
print(df)

This prints the following in the terminal:

no cookie button
clicked to show more
clicked to show more
clicked to show more
clicked to show more
all done
we have 50 songs
    Artist             Title
0   Bizarrap,          Quevedo: Bzrp Music Sessions, Vol. 52
1   Harry Styles       As It Was
2   Bad Bunny,         Me Porto Bonito
3   Bad Bunny          Tití Me Preguntó
4   Manuel Turizo      La Bachata
5   ROSALÍA            DESPECHÁ
6   BLACKPINK          Pink Venom
7   David Guetta,      I'm Good (Blue)
8   OneRepublic        I Ain't Worried
9   Bad Bunny          Efecto
10  Chris Brown        Under The Influence
11  Steve Lacy         Bad Habit
12  Bad Bunny,         Ojitos Lindos
13  Kate Bush          Running Up That Hill (A Deal With God) - 2018 Remaster
14  Joji               Glimpse of Us
15  Nicki Minaj        Super Freaky Girl
16  Bad Bunny          Moscow Mule
17  Rosa Linn          SNAP
18  Glass Animals      Heat Waves
19  KAROL G            PROVENZA
20  Charlie Puth,      Left and Right (Feat. Jung Kook of BTS)
21  Harry Styles       Late Night Talking
22  The Kid LAROI,     STAY (with Justin Bieber)
23  Tom Odell          Another Love
24  Central Cee        Doja
25  Stephen Sanchez    Until I Found You
26  Bad Bunny          Neverita
27  Post Malone,       I Like You (A Happier Song) (with Doja Cat)
28  Lizzo              About Damn Time
29  Nicky Youre,       Sunroof
30  Elton John,        Hold Me Closer
31  Luar La L          Caile
32  KAROL G,           GATÚBELA
33  The Weeknd         Die For You
34  Bad Bunny,         Tarot
35  James Hype,        Ferrari
36  Imagine Dragons    Bones
37  Elton John,        Cold Heart - PNAU Remix
38  The Neighbourhood  Sweater Weather
39  Ghost              Mary On A Cross
40  Shakira,           Te Felicito
41  Justin Bieber      Ghost
42  Bad Bunny,         Party
43  Drake,             Jimmy Cooks (feat. 21 Savage)
44  Doja Cat           Vegas (From the Original Motion Picture Soundtrack ELVIS)
45  Camila Cabello,    Bam Bam (feat. Ed Sheeran)
46  Rauw Alejandro,    LOKERA
47  Rels B             cómo dormiste?
48  The Weeknd         Blinding Lights
49  Arctic Monkeys     505

Of course, you can also extract other information, such as the chart ranking or all artists when there is more than one.
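
For the chart ranking, one option is to derive it from each row's position in the scraped list. This is a minimal sketch, assuming `song_list` holds `(artist, title)` tuples as built in the code above and that the page lists songs in chart order. `add_rank` is a hypothetical helper name, and the sample tuples are copied from the output above.

```python
def add_rank(song_list):
    """Attach a 1-based rank to (artist, title) tuples, in list order."""
    return [
        {"Rank": rank, "Artist": artist.strip(), "Title": title.strip()}
        for rank, (artist, title) in enumerate(song_list, start=1)
    ]


# Sample tuples taken from the printed output above.
song_list = [
    ("Bizarrap,", "Quevedo: Bzrp Music Sessions, Vol. 52"),
    ("Harry Styles", "As It Was"),
]

ranked = add_rank(song_list)
for row in ranked:
    print(row)
```

Passing `ranked` to `pd.DataFrame` gives a table with a `Rank` column in front of the artist and title.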

The Selenium Chrome/chromedriver setup shown here is for Linux. To adapt it to your own setup, keep the imports and the code that follows the browser definition, and change only the driver configuration.
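
One way to keep the driver path portable is to pick the binary name from the operating system. This is a minimal sketch, assuming the binary sits in a local `chromedriver/` folder as in the code above and that the Windows download is named `chromedriver.exe`. `chromedriver_path` is a hypothetical helper name.

```python
import platform
from pathlib import Path


def chromedriver_path(base_dir="chromedriver", system=None):
    """Return the expected chromedriver location for the given OS name."""
    # platform.system() returns e.g. "Linux", "Darwin", or "Windows".
    system = system or platform.system()
    name = "chromedriver.exe" if system == "Windows" else "chromedriver"
    return Path(base_dir) / name


print(chromedriver_path())
```

The result can then be passed to the driver service as `Service(str(chromedriver_path()))`.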

Pandas documentation: https://pandas.pydata.org/pandas-docs/stable/index.html

For selenium docs, visit: https://www.selenium.dev/documentation/
