Python, Web Scraping

Time:01-26

I tried scraping a website using BeautifulSoup, and no matter which method or selector I try, it always returns an empty list. This was supposed to print the top 100 songs on the Billboard chart.

from bs4 import BeautifulSoup
import requests

date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")

response = requests.get("https://www.billboard.com/charts/hot-100/" + date)

soup = BeautifulSoup(response.text, 'html.parser')
song_names_spans = soup.find_all("span", class_="chart-element__information__song")
song_names = [song.getText() for song in song_names_spans]

CodePudding user response:

It looks like your .find_all() call targets the wrong elements. Try .select() with a CSS selector instead, and copy the list of classes that the song titles have from the developer tools in your browser (I chose the first four: c-title, a-no-trucate, a-font-primary-bold-s, and u-letter-spacing-0021, and it worked). Like this:

from bs4 import BeautifulSoup
import requests

date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")

response = requests.get("https://www.billboard.com/charts/hot-100/" + date)

soup = BeautifulSoup(response.text, 'html.parser')
song_names_els = soup.select('h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021')
song_names = [song.getText().strip() for song in song_names_els]

print(song_names)

Note that the song titles are <h3> tags, not <span>s, so you should search for <h3>s instead.
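
If you want to see the difference without hitting the site, here is a minimal offline sketch. The SAMPLE_HTML markup and the song and artist names are made up for illustration; they only imitate the class list mentioned above, and the real page is much larger and may change at any time. It shows the old span lookup coming back empty while the multi-class CSS selector finds the titles:

```python
from bs4 import BeautifulSoup

# Hypothetical markup that imitates the class list from the answer above.
# It is NOT a copy of the real chart page.
SAMPLE_HTML = """
<ul>
  <li>
    <h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021">
      Song A
    </h3>
    <span class="c-label">Artist A</span>
  </li>
  <li>
    <h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021">
      Song B
    </h3>
    <span class="c-label">Artist B</span>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The selector from the question: no <span> carries this class, so nothing matches.
old_matches = soup.find_all("span", class_="chart-element__information__song")

# A CSS selector that requires an <h3> carrying all four classes.
new_matches = soup.select(
    "h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021"
)

# get_text(strip=True) drops the whitespace around each title.
song_names = [el.get_text(strip=True) for el in new_matches]

print(old_matches)  # []
print(song_names)   # ['Song A', 'Song B']
```

An empty list from find_all() or select() usually just means the selector no longer matches the page's markup, so checking the current classes in the developer tools is the first thing to try.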
