I am using Selenium in Python to scrape the videos from YouTube channels' pages. My code is below. The line videos = driver.find_elements(By.CLASS_NAME, 'style-scope ytd-grid-video-renderer') repeatedly finds no videos (i.e. the print(videos) after it outputs an empty list). How would you modify it to find all the videos on the loaded page?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get('https://www.youtube.com/wendoverproductions/videos')
videos = driver.find_elements(By.CLASS_NAME, 'style-scope ytd-grid-video-renderer')
print(videos)
urls = []
titles = []
dates = []
for video in videos:
    video_url = video.find_element(by=By.XPATH, value='.//*[@id="video-title"]').get_attribute('href')
    urls.append(video_url)
    video_title = video.find_element(by=By.XPATH, value='.//*[@id="video-title"]').text
    titles.append(video_title)
    video_date = video.find_element(by=By.XPATH, value='.//*[@id="metadata-line"]/span[2]').text
    dates.append(video_date)
CodePudding user response:
If you don't have a YouTube Data API v3 developer key:
The following procedure requires you to have a Google account.
Go to: https://console.cloud.google.com/projectcreate
Click on the CREATE button.
Go to: https://console.cloud.google.com/marketplace/product/google/youtube.googleapis.com
Click on the ENABLE button.
Click on the CREATE CREDENTIALS button.
Choose the Public data option.
Click on the NEXT button.
Note the displayed API Key and continue reading.
If you have a YouTube Data API v3 developer key:
To get the videos of a given YouTube channel id (see this answer if you don't know how to get the channel id of a given YouTube channel), replace its second character (C) with U to obtain its uploads playlist id, and provide that as the playlistId parameter to the YouTube Data API v3 PlaylistItems: list endpoint.
Here is Python sample code listing the videos of a given channel id (don't forget to replace API_KEY with your YouTube Data API v3 developer key):
import requests, json

CHANNEL_ID = 'UC07-dOwgza1IguKA86jqxNA'
# The uploads playlist id is the channel id with its 'UC' prefix replaced by 'UU'.
PLAYLIST_ID = 'UU' + CHANNEL_ID[2:]
API_KEY = 'AIzaSy...'
URL = f'https://www.googleapis.com/youtube/v3/playlistItems?part=snippet&playlistId={PLAYLIST_ID}&maxResults=50&key={API_KEY}'

pageToken = ''
while True:
    pageUrl = URL
    if pageToken != '':
        # Append the token to the base URL to request the next page.
        pageUrl += f'&pageToken={pageToken}'
    response = json.loads(requests.get(pageUrl).text)
    print(response['items'])
    if 'nextPageToken' in response:
        pageToken = response['nextPageToken']
    else:
        break
For documentation concerning the pagination, see this webpage.
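Since the question asks for URLs, titles and dates rather than raw items, here is a minimal sketch of how one response page could be reduced to those three fields. It makes no network call: sample_page is a hypothetical, hand-written page, and the helper names uploads_playlist_id and extract_videos are mine, not part of the API. It assumes the request used part=snippet, in which case each item carries snippet.title, snippet.publishedAt (the time the item was added to the playlist, which for an uploads playlist should be close to the upload time) and snippet.resourceId.videoId.

```python
def uploads_playlist_id(channel_id):
    """Turn a channel id ('UC...') into its uploads playlist id ('UU...')."""
    if not channel_id.startswith('UC'):
        raise ValueError('expected a channel id starting with "UC"')
    return 'UU' + channel_id[2:]


def extract_videos(response):
    """Collect url, title and date from one PlaylistItems: list response page."""
    videos = []
    for item in response.get('items', []):
        snippet = item['snippet']
        video_id = snippet['resourceId']['videoId']
        videos.append({
            'url': f'https://www.youtube.com/watch?v={video_id}',
            'title': snippet['title'],
            'date': snippet['publishedAt'],
        })
    return videos


# Hypothetical, hand-written response page used only for illustration.
sample_page = {
    'items': [
        {'snippet': {
            'title': 'Example video',
            'publishedAt': '2022-01-01T00:00:00Z',
            'resourceId': {'kind': 'youtube#video', 'videoId': 'abc123'},
        }},
    ],
}

print(uploads_playlist_id('UC07-dOwgza1IguKA86jqxNA'))
print(extract_videos(sample_page))
```

In the loop above, print(response['items']) could be replaced by something like all_videos.extend(extract_videos(response)), with all_videos being a list created before the loop.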
CodePudding user response:
Below is an example of how to extract YouTube data from its API using the requests module.
import requests
import pandas as pd
# You need a YouTube API key and the id of the channel you want to scrape;
# I keep both in a separate file called main.py.
from main import API_KEY, channel_id
# API documentation: https://developers.google.com/youtube/v3

api_key = API_KEY
data = []
page_token = ''
# maxResults is capped at 50 per request, so fetch up to 4 pages (200 results)
# and move between pages with pageToken.
for _ in range(4):
    url = f'https://www.googleapis.com/youtube/v3/search?part=snippet&key={api_key}&channelId={channel_id}&order=date&type=video&maxResults=50'
    if page_token:
        url += f'&pageToken={page_token}'
    res = requests.get(url).json()
    #print(res)
    for card in res['items']:
        video_url = 'https://www.youtube.com/watch?v=' + card['id']['videoId']
        title = card['snippet']['title']
        data.append({
            'video_url': video_url,
            'title': title
        })
    page_token = res.get('nextPageToken', '')
    if not page_token:
        break
print(data)
# df = pd.DataFrame(data)
# print(df)
Output:
[{'video_url': 'https://www.youtube.com/watch?v=oL0umpPPe-8', 'title': 'Samsung’s Dangerous Dominance over South Korea'},
 {'video_url': 'https://www.youtube.com/watch?v=GBp_NgrrtPM', 'title': 'China’s Electricity Problem'},
 {'video_url': 'https://www.youtube.com/watch?v=YBNcYxHJPLE', 'title': 'How the World’s Wealthiest People Travel'},
 {'video_url': 'https://www.youtube.com/watch?v=iIpPuJ_r8Xg', 'title': 'The US Military’s Massive Global Transportation System'},
 {'video_url': 'https://www.youtube.com/watch?v=MY8AB1wYOtg', 'title': 'The Absurd Logistics of Concert Tours'},
 {'video_url': 'https://www.youtube.com/watch?v=8xzINLykprA', 'title': 'Money’s Mostly Digital, So Why Is Moving It So Hard?'},
 {'video_url': 'https://www.youtube.com/watch?v=f66GfsKPTUg', 'title': 'How This Central African City Became the World’s Most Expensive'},
 {'video_url': 'https://www.youtube.com/watch?v=IDLkOWW0_xg', 'title': 'The Simple Genius of NYC’s Water Supply System'},
 {'video_url': 'https://www.youtube.com/watch?v=U9jirFqex6g', 'title': 'Europe’s Experiment: Treating Trains Like Planes'},
 {'video_url': 'https://www.youtube.com/watch?v=eoWcQUjNM8o', 'title': 'How the YouTube Creator Economy Works'},
 {'video_url': 'https://www.youtube.com/watch?v=V0Xx0E8cs7U', 'title': 'The Incredible Logistics Behind Weather Forecasting'},
 {'video_url': 'https://www.youtube.com/watch?v=v0aGGOK4kAM', 'title': 'Australia Had a Mass-Shooting Problem. Here’s How it Stopped'},
 {'video_url': 'https://www.youtube.com/watch?v=AW3gaelBypY', 'title': 'The Carbon Offset Problem'},
 {'video_url': 'https://www.youtube.com/watch?v=xhYl7Jjefo8', 'title': 'Jet Lag: The Game - A New Channel by Wendover Productions'},
 {'video_url': 'https://www.youtube.com/watch?v=oESoI6XxZTg', 'title': 'How to Design a Theme Park (To Take Tons of Your Money)'},
 {'video_url': 'https://www.youtube.com/watch?v=AQbmpecxS2w', 'title': 'Why Gas Got So Expensive (It’s Not the War)'},
 {'video_url': 'https://www.youtube.com/watch?v=U_7CGl6VWaQ', 'title': 'How Cyberwarfare Actually Works'},
 {'video_url': 'https://www.youtube.com/watch?v=R9pxFgJwxFE', 'title': 'The Incredible Logistics Behind Corn Farming'},
 {'video_url': 'https://www.youtube.com/watch?v=SrTrpwzVt4g', 'title': 'The Sanction-Fueled Destruction of the Russian Aviation Industry'},
 {'video_url': 'https://www.youtube.com/watch?v=b4wRdoWpw0w', 'title': "The Failed Logistics of Russia's Invasion of Ukraine"},
 {'video_url': 'https://www.youtube.com/watch?v=UX4KklvCDmg', 'title': 'Why Everywhere in the US is Starting to Look the Same'},
 {'video_url': 'https://www.youtube.com/watch?v=J-M98KLgaUU', 'title': 'Drone Delivery Was Supposed to be the Future. What Went Wrong?'},
 {'video_url': 'https://www.youtube.com/watch?v=0faCad2kKeg', 'title': 'How Cell Service Actually Works'},
 {'video_url': 'https://www.youtube.com/watch?v=9dnN82DsQ2k', 'title': "Electric Vehicles' Battery Problem"},
 {'video_url': 'https://www.youtube.com/watch?v=Y413Czri6qw', 'title': 'The News You Missed in 2021, From Every Country in the World (Part 2)'},
 {'video_url': 'https://www.youtube.com/watch?v=W3qZIPiWKc4', 'title': 'The News You Missed in 2021, From Every Country in the World (Part 1)'},
 {'video_url': 'https://www.youtube.com/watch?v=ggUduBmvQ_4', 'title': 'How Airlines Quietly Became Banks'},
 {'video_url': 'https://www.youtube.com/watch?v=xhxo2oXRiio', 'title': 'How Electricity Gets to You'},
 {'video_url': 'https://www.youtube.com/watch?v=8d5d_HXGeMA', 'title': "How Ocean Shipping Works (And Why It's Broken)"},
 {'video_url': 'https://www.youtube.com/watch?v=8egszLpKMWU', 'title': 'Is Africa the Next China?'},
 {'video_url': 'https://www.youtube.com/watch?v=WNrobOYWZQE', 'title': 'When Will Space Tourism be Affordable?'},
 {'video_url': 'https://www.youtube.com/watch?v=ZZ3F3zWiEmc', 'title': 'The Art Market is a Scam (And Rich People Run It)'},
 {'video_url': 'https://www.youtube.com/watch?v=V16GdzRvhRU', 'title': "Saudi Arabia's Oil Problem"},
 {'video_url': 'https://www.youtube.com/watch?v=o4tuhWvKduU', 'title': 'The Logistics of Evacuating Afghanistan'},
 {'video_url': 'https://www.youtube.com/watch?v=1-uNMj57Y4c', 'title': "Airlines' Business Travel Problem"},
 {'video_url': 'https://www.youtube.com/watch?v=SR7BA3xEmDo', 'title': 'The Simple Genius of the Interstate Highway System'},
 {'video_url': 'https://www.youtube.com/watch?v=iO5mfbpq16A', 'title': 'Why the Southern Hemisphere is Poorer'},
 {'video_url': 'https://www.youtube.com/watch?v=B3FKtBNEBRc', 'title': 'The Incredible Logistics of the Tokyo Olympics'},
 {'video_url': 'https://www.youtube.com/watch?v=VJtFgte1GKc', 'title': 'Little Bay: Why This Island Was Abandoned on December 31st, 2019'},
 {'video_url': 'https://www.youtube.com/watch?v=J5PLyYVIEpg', 'title': 'How the Navajo Nation Works (A Country Within a Country?)'},
 {'video_url': 'https://www.youtube.com/watch?v=LHhJuAOK3CI', 'title': 'Extremities: A New Channel by Wendover Productions'},
 {'video_url': 'https://www.youtube.com/watch?v=aH4b3sAs-l8', 'title': 'Why Electric Planes are Inevitably Coming'},
 {'video_url': 'https://www.youtube.com/watch?v=b1JlYZQG3lI', 'title': "Why There are Now So Many Shortages (It's Not COVID)"},
 {'video_url': 'https://www.youtube.com/watch?v=BNpk_OGEGlA', 'title': 'The Incredible Logistics of Grocery Stores'},
 {'video_url': 'https://www.youtube.com/watch?v=4p0fRlCHYyg', 'title': 'Supersonic Planes are Coming Back (And This Time, They Might Work)'},
 {'video_url': 'https://www.youtube.com/watch?v=N4dOCfWlgBw', 'title': 'The Insane Logistics of Shutting Down the Cruise Industry'},
 {'video_url': 'https://www.youtube.com/watch?v=3CuPqeIJr3U', 'title': "China's Vaccine Diplomacy"},
 {'video_url': 'https://www.youtube.com/watch?v=DlTq8DbRs4k', 'title': "The UK's Failed Experiment in Rail Privatization"},
 {'video_url': 'https://www.youtube.com/watch?v=VjiH3mpxyrQ', 'title': 'How to Start an Airline'},
 {'video_url': 'https://www.youtube.com/watch?v=pLcqJ2DclEg', 'title': 'The Electric Vehicle Charging Problem'}]
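If you would rather keep the result than print it, a list of dictionaries like this can be written out as CSV with the standard library alone. This is only a sketch: to_csv is a helper name I made up, and the two rows are copied from the output above so that the snippet runs without an API key.

```python
import csv
import io


def to_csv(rows, fieldnames=('video_url', 'title')):
    """Serialize a list of dicts to CSV text, one row per video."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Two rows taken from the output above, standing in for the full `data` list.
data = [
    {'video_url': 'https://www.youtube.com/watch?v=VjiH3mpxyrQ', 'title': 'How to Start an Airline'},
    {'video_url': 'https://www.youtube.com/watch?v=pLcqJ2DclEg', 'title': 'The Electric Vehicle Charging Problem'},
]

csv_text = to_csv(data)
print(csv_text)
```

To save it to disk, write csv_text to a file opened with newline='' so the line endings produced by the csv module are left unchanged.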