I'm learning Python (Flask) and BeautifulSoup. For my first project I just wanted to get a video name from YouTube and display it on the homepage of my web app.
Instead, I get this error:
AttributeError: 'NoneType' object has no attribute 'text'
import requests
from flask import Blueprint, render_template
from bs4 import BeautifulSoup

views = Blueprint('views', __name__)

def scrapper():
    url = 'https://www.youtube.com/'
    web_response = requests.get(url).text
    soup = BeautifulSoup(web_response, 'lxml')
    card = soup.find(class_='style-scope ytd-rich-grid-media').text
    return card

@views.route('/')
def home():
    return render_template('home.html', text=scrapper())
CodePudding user response:
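BeautifulSoup can only parse the static HTML source that requests downloads. YouTube builds its page with JavaScript that runs on the client, so the element you are searching for is not in that HTML, soup.find() returns None, and calling .text on None raises the AttributeError. Use a tool that can execute JavaScript, such as Selenium.
For example: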
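from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.firefox.service import Service

url = 'https://www.youtube.com/'

# Path to your Selenium driver (e.g. chromedriver for Chrome, geckodriver for Firefox).
# your_path is a placeholder for the directory that contains the driver.
webdriverFile = your_path + '/geckodriver'

# Selenium 4 takes the driver path through a Service object
# (the older executable_path argument is deprecated).
browser = webdriver.Firefox(service=Service(webdriverFile))
browser.get(url)
source = browser.page_source  # HTML as rendered by the browser
browser.quit()  # close the browser and end the driver session

soup = BeautifulSoup(source, 'html.parser')
for i in soup.find_all("yt-formatted-string", {"id": "video-title"}):
    print(i.text)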
output:
> Music for Healing Stress, Anxiety and Depression, Remove Inner Rage and Sadness
> รวมเพลงฮิตในติ๊กต๊อก ( ผีเห็นผี ไทม์แมชชิน ) เพลงมาแรงฟังกันยาวๆ2022
> เพลงเพราะในTikTok ครูหนุ่มแจง ไม่ได้เทงานแต่ง ยันเลิกกันด้วยดี
> Smooth Jazz Music & Bossa Nova For Good Mood - Positive Jazz Lounge Cafe Music, Coffee Shop BGM
> เปิดห้องทำงานใหม่ที่บ้าน — บอกหมด จัดห้อง จัดโต๊ะ จัดไฟ ใช้ของอะไรบ้าง
> Boyce Avenue Greatest Hits Full Album 2021 - Best Songs Of Boyce Avenue 2021 - Acoustic songs 2021
> .....
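Whichever way you fetch the page, it may also help to guard against find() returning None, since that is what triggers the AttributeError in the question. Below is a minimal sketch, not a definitive fix: the helper name first_text, the default message, and the STATIC_HTML string are all made up for illustration (the string is only a stand-in for a page whose content is filled in later by JavaScript, not real YouTube markup).

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a JavaScript-driven page: the static HTML
# contains an empty container and a script, but no video elements.
STATIC_HTML = (
    "<html><body><div id='content'></div>"
    "<script>/* the video grid is built here at runtime */</script>"
    "</body></html>"
)

def first_text(html, css_class, default="No title found"):
    """Return the text of the first element with css_class, or default."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(class_=css_class)
    # find() returns None when nothing matches, so check before using .text
    if element is None:
        return default
    return element.get_text(strip=True)

# The class from the question is absent from the static HTML.
print(first_text(STATIC_HTML, "style-scope ytd-rich-grid-media"))
# An element that does exist is returned normally.
print(first_text("<p class='title'> Hello </p>", "title"))
```

In the Flask view, scrapper() could return the default string the same way, so the homepage still renders when the element is missing instead of failing with an exception.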