import requests
from bs4 import BeautifulSoup
import re
import sys
if len(sys.argv) > 1:
    url = sys.argv[1]
else:
    sys.exit("Error: Enter a TED Talk URL")
r = requests.get(url)
result = ''
print("Gathering Resources...")
soup = BeautifulSoup(r.content, features="lxml")
for value in soup.findAll("script"):
    if(re.search("talkPage.init", str(value))) is not None:
        result = str(value)
res_mp4 = re.search("(?P<url>https?://[^\s]+)(mp4)", result).group()
mp4_url = res_mp4.split('"')[0]
print("Downloading video from: " + mp4_url)
file_name = mp4_url.split("/")[len(mp4_url.split("/"))-1].split('?')[0]
print("Storing the video in..." + file_name)
r = requests.get(mp4_url)
with open(file_name, 'wb') as f:
    f.write(r.content)
print("Download Completed")
Running the file with "main.py https://www.ted.com/talks/jia_jiang_what_i_learned_from_100_days_of_rejection" causes the following error. What causes this?
Traceback (most recent call last):
  File "C:\Users\MADAWA\PycharmProjects\TED_Talk_Downloader\main.py", line 23, in <module>
    res_mp4 = re.search("(?P<url>https?://[^\s]+)(mp4)", result).group()
AttributeError: 'NoneType' object has no attribute 'group'
CodePudding user response:
This is most likely caused by re.search("(?P<url>https?://[^\s]+)(mp4)", result) returning None. That means you're calling the group function on None (None.group()), which raises the AttributeError about NoneType.
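One way to avoid the crash is to check the result of re.search before calling group() on it. The sketch below reuses the pattern from the question; the function name find_mp4_url and the example.com URL are made up for illustration and are not part of the original script.

```python
import re
from typing import Optional

# Pattern copied from the question, written as a raw string.
MP4_PATTERN = re.compile(r"(?P<url>https?://[^\s]+)(mp4)")


def find_mp4_url(text: str) -> Optional[str]:
    """Return the first mp4 URL found in text, or None when nothing matches."""
    match = MP4_PATTERN.search(text)
    if match is None:
        # No match: return None instead of calling .group() on None.
        return None
    return match.group()


# Hypothetical inputs, only to show both outcomes.
found = find_mp4_url('src="https://example.com/talk.mp4?x=1"')
missing = find_mp4_url("<html>404 Not Found</html>")
```

In the script from the question, the caller could then test the return value and exit with a readable message (for example via sys.exit) when it is None.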
CodePudding user response:
The response is a 404 page, meaning the link doesn't exist, so there is nothing for the pattern to match.
def search(pattern, string, flags=0):
    """Scan through string looking for a match to the pattern, returning
    a Match object, or None if no match was found."""
    return _compile(pattern, flags).search(string)
This is the code of search in the re module. Since there is no match, it returns None, which is of NoneType and doesn't have the attribute group.
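A small sketch of the same reasoning, assuming the request really did come back as a 404: it checks the status code before parsing, and shows that searching a page without an mp4 link yields None. The helper name body_or_error and the sample HTML string are hypothetical.

```python
import re


def body_or_error(status_code: int, body: str) -> str:
    """Return body for a 200 response; otherwise raise a readable error."""
    if status_code != 200:
        raise RuntimeError(f"Request failed with HTTP status {status_code}")
    return body


# A page without an mp4 link gives re.search nothing to match, so it returns None ...
no_match = re.search(r"https?://[^\s]+mp4", "<html>404 Not Found</html>")
# ... and None has no group attribute, which is what the traceback reports.
none_has_group = hasattr(no_match, "group")
```

With requests, r.status_code and r.text could be passed to such a helper; calling r.raise_for_status() right after requests.get(url) is another option that raises on 4xx and 5xx responses.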