I tried scraping a website using BeautifulSoup, and no matter which method or selector I try, it always returns an empty list. This was supposed to print the top 100 songs on the Billboard chart.
from bs4 import BeautifulSoup
import requests
date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")
response = requests.get("https://www.billboard.com/charts/hot-100/" + date)
soup = BeautifulSoup(response.text, 'html.parser')
song_names_spans = soup.find_all("span", class_="chart-element__information__song")
song_names = [song.getText() for song in song_names_spans]
CodePudding user response:
It looks like your .find_all() call is targeting the wrong element. Try using .select() with a CSS selector instead, and copy the list of classes that the song titles have from the developer tools in your browser (I chose the first four: c-title, a-no-trucate, a-font-primary-bold-s, and u-letter-spacing-0021, and it worked). Like this:
from bs4 import BeautifulSoup
import requests
date = input("Which year do you want to travel to? Type the date in this format YYYY-MM-DD: ")
response = requests.get("https://www.billboard.com/charts/hot-100/" + date)
soup = BeautifulSoup(response.text, 'html.parser')
song_names_els = soup.select('h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021')
song_names = [song.getText().strip() for song in song_names_els]
print(song_names)
Note that the song titles are <h3> tags, not <span>s, so you should search for <h3>s instead.
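If you want to check the selector without hitting the network, here is a minimal sketch on a made-up HTML snippet. The markup in SAMPLE_HTML is my own assumption, modelled on the classes listed above, so the real Billboard page may be structured differently. It only illustrates why the old span lookup comes back empty while the h3 selector matches.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the classes named above; the real page may differ.
SAMPLE_HTML = """
<ul>
  <li>
    <h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021">
      Song A
    </h3>
    <span class="c-label">Artist A</span>
  </li>
  <li>
    <h3 class="c-title a-no-trucate a-font-primary-bold-s u-letter-spacing-0021">
      Song B
    </h3>
    <span class="c-label">Artist B</span>
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The lookup from the question: no <span> carries this class, so nothing matches.
old_matches = soup.find_all("span", class_="chart-element__information__song")

# The CSS selector from the answer: an <h3> that has all four classes.
new_matches = soup.select(
    "h3.c-title.a-no-trucate.a-font-primary-bold-s.u-letter-spacing-0021"
)
song_names = [el.get_text(strip=True) for el in new_matches]

print(old_matches)  # []
print(song_names)   # ['Song A', 'Song B']
```

If the selector also returns an empty list on the live page, the classes have probably changed again, so re-copy them from the developer tools.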