Pytube is giving assorted regex errors

Time:04-14

I have a program that downloads YouTube videos with Python. The program is:

from pytube import YouTube

def Download(video_list):
    for url in video_list:
        print("url: [%s]" % url)
        youtube = YouTube(url)
        youtube.streams.get_highest_resolution().download("C:/Users/user1/Downloads")
        
        print (f'{youtube.title} downloaded.')
    
video_list = []

run = True

while run == True:
    link = str(input("Enter youtube URL, or press D to download: "))
    
    if link.find("youtu") != 1:
        video_list.append(link)
    if link == 'd' or link == 'D':
        Download(video_list)
        run = False
    elif link.find("youtu") == -1:
        print ("Invalid youtube URL.")

I tried downloading a video (https://www.youtube.com/watch?v=jikcB7_gj8A), and it gave me this error:

url: [https://www.youtube.com/watch?v=jikcB7_gj8A]
Traceback (most recent call last):
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 27, in <module>
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 8, in Download
    youtube = YouTube(url)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 71, in __init__        
    self.video_id = extract.video_id(url)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 133, in video_id        
    return regex_search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url, group=1)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for (?:v=|\/)([0-9A-Za-z_-]{11}).*

So I tried the solution here (replacing the regexes in cipher.py with r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('), and now I'm getting this error:

url: [https://www.youtube.com/watch?v=jikcB7_gj8A]
Traceback (most recent call last):
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 181, in fmt_streams
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 409, in apply_signature 
    cipher = Cipher(js=js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 29, in __init__
    self.transform_plan: List[str] = get_transform_plan(js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 186, in get_transform_plan
    return regex_search(pattern, js, group=1).split(";")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for qra\[0\]=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 27, in <module>
    Download(video_list)
  File "c:\Users\user1\Desktop\Mostly Python\youtube_downloader.py", line 9, in Download
    youtube.streams.get_highest_resolution().download("C:/Users/user1/Downloads")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 296, in streams        
    return StreamQuery(self.fmt_streams)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\__main__.py", line 188, in fmt_streams    
    extract.apply_signature(stream_manifest, self.vid_info, self.js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\extract.py", line 409, in apply_signature 
    cipher = Cipher(js=js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 29, in __init__
    self.transform_plan: List[str] = get_transform_plan(js)
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\cipher.py", line 186, in get_transform_plan
    return regex_search(pattern, js, group=1).split(";")
  File "C:\Users\user1\AppData\Local\Programs\Python\Python310\lib\site-packages\pytube\helpers.py", line 129, in regex_search    
    raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for qra\[0\]=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)}

I'm at a loss. What do I do?

Edit: I am running pytube version 12.0.0, downloaded from https://github.com/nficano/pytube.

Answer:

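I reworked your code; the version below runs without errors on my system. It checks for the D command before adding anything to the list, so the command itself is never passed to YouTube(), and it validates each link by its domain name (using tldextract) instead of searching for a substring.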

import tldextract
from pytube import YouTube


def Download(video_list):
    for url in video_list:
        youtube = YouTube(url.strip())
        youtube.streams.get_highest_resolution().download()
        print(f'The video - {youtube.title} - was downloaded.')

video_list = []

run = True

while run:
    link = input("Enter a YouTube URL or press D to download the video(s): ")
    # Registered domain without subdomain or suffix, e.g. 'youtube' for www.youtube.com
    domain_name = tldextract.extract(link).domain
    if link.lower() == 'd':
        # Handle the download command first so 'd' is never added to the list
        Download(video_list)
        run = False
    elif domain_name == 'youtube':
        video_list.append(link)
    else:
        print("Invalid YouTube URL.")

Enter a YouTube URL or press D to download the video(s): https://www.youtube.com/watch?v=jikcB7_gj8A
Enter a YouTube URL or press D to download the video(s): D

The video - How Teachers Help You During Tests #Shorts - was downloaded.

Process finished with exit code 0
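
The first traceback in the question can be explored without pytube. The sketch below assumes the entry that failed was the `d` command itself rather than the video URL: in the original loop, `link.find("youtu") != 1` is true for nearly any input, because `find` returns -1 (not 1) when the substring is missing, so `d` would be appended to the list before the download starts. The video-ID pattern is copied from the traceback; `original_check` and `is_youtube_url` are hypothetical names used only for illustration, and `is_youtube_url` is a standard-library stand-in for the tldextract check, not part of pytube.

```python
import re
from urllib.parse import urlparse

# Pattern copied from the first traceback (pytube/extract.py, video_id).
VIDEO_ID_PATTERN = r"(?:v=|\/)([0-9A-Za-z_-]{11}).*"


def original_check(link):
    """The test used in the question: true unless "youtu" starts at index 1."""
    return link.find("youtu") != 1


def is_youtube_url(link):
    """Hypothetical helper: accept only youtube.com and youtu.be hosts."""
    host = urlparse(link.strip()).hostname or ""
    return (
        host == "youtu.be"
        or host == "youtube.com"
        or host.endswith(".youtube.com")
    )


url = "https://www.youtube.com/watch?v=jikcB7_gj8A"

# The URL from the question does match the video-ID pattern...
video_id = re.search(VIDEO_ID_PATTERN, url).group(1)
# ...but the bare command "d" does not, which is what makes pytube's
# regex_search raise RegexMatchError.
d_match = re.search(VIDEO_ID_PATTERN, "d")

print(video_id)                # jikcB7_gj8A
print(d_match)                 # None
print(original_check("d"))     # True: "d" would have been appended to the list
print(is_youtube_url(url))     # True
print(is_youtube_url("d"))     # False
```

Unlike a check on the domain name alone, this helper also accepts youtu.be short links; whether that is desirable depends on which links the script should handle.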