Pytube: urllib.error.HTTPError: HTTP Error 403: Forbidden

Created on 6 Jun 2019 · 62 Comments · Source: pytube/pytube

Hi all,

I'm aware that there are some older issues with the same error; however, I thought I would just give it a try. For some videos (like this one: https://www.youtube.com/watch?v=393C3pr2ioY ) I get a 403 error; others work fine, though. Do you know of any workaround for that? I saw this thread in the older issue https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping but I'm not good enough at Python to find the right spot to implement it. Any help would be appreciated.

PS: I already updated to the most current version (9.5.0).

stale

NeverAskWhy

👍3

All 62 comments

I am having the same problem, but for me, it is random whether the same youtube link will generate a 403 or not. My current testing links are:
https://www.youtube.com/watch?v=pAgnJDJN4VA&index=117&list=PL3LCt07uiILKduYr52ZMYrE_PvbSuUdZM
https://www.youtube.com/watch?v=t9psCAWrISg&index=136&list=PL3LCt07uiILKduYr52ZMYrE_PvbSuUdZM
I am only trying to download the audio stream.

Waarbubble on 7 Jun 2019

It's a hack of a fix, but this is how I've got mine working for now (I'm running in python 2.7, you may need to fiddle depending on your version):
In pytube/request.py:
At line 4 I added:
from urllib2 import Request

And above line 21 (response = urlopen(url)):
req = Request(url, headers = {"User-Agent": "Mozilla/5.0"})

And then replace the line below:
response = urlopen(url)
with:
response = urlopen(req)
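
For Python 3, the same workaround might look like the sketch below. It is not pytube's actual request.py: the helper names build_request and get are assumptions for illustration only, and sending a browser-like User-Agent is a workaround, not a confirmed fix for the 403.

```python
from urllib.request import Request, urlopen

USER_AGENT = "Mozilla/5.0"  # same value as in the hack above


def build_request(url):
    """Wrap a URL in a Request that carries a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def get(url):
    """Fetch a URL and return the decoded body (hypothetical helper)."""
    with urlopen(build_request(url)) as response:
        return response.read().decode("utf-8")
```

Only build_request is needed to see the header being set; get performs the actual network call.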

Nowbob on 7 Jun 2019

👍6

Just tested the solution, keeps randomly giving 403, and then working again.

Waarbubble on 7 Jun 2019

@Waarbubble : you mean you tested @Nowbob's solution?

NeverAskWhy on 7 Jun 2019

yes
Tested in python 3

Waarbubble on 7 Jun 2019

Yeah I was having the same "sometimes it works sometimes it doesn't" problem, but it's the best I could come up with in a short time.

Nowbob on 7 Jun 2019

It was worth a try.

Waarbubble on 7 Jun 2019

I'm doing something wrong...
In line four of request.py:
from urllib import request

then line 22/23:
req = request(url, headers = {"User-Agent": "Mozilla/5.0"})
response = urlopen(req)

now I'm getting this error:

self.watch_html = request.get(url=self.watch_url)

File "C:\Users\janvo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytube\request.py", line 22, in get
req = request(url, headers = {"User-Agent": "Mozilla/5.0"})
TypeError: 'module' object is not callable

NeverAskWhy on 7 Jun 2019

If you are using Python 3, the fourth line needs to be:
from urllib.request import Request

and then line 22 will need a capital R in Request.
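
The difference between the two imports can be seen in a few lines. This is only a small illustration of why the TypeError above appears; the URL is the one from the original post and nothing is actually downloaded.

```python
from urllib import request           # binds the module urllib.request
from urllib.request import Request   # binds the class inside that module

url = "https://www.youtube.com/watch?v=393C3pr2ioY"

try:
    request(url, headers={"User-Agent": "Mozilla/5.0"})  # calling a module
except TypeError as exc:
    error_message = str(exc)  # "'module' object is not callable"

req = Request(url, headers={"User-Agent": "Mozilla/5.0"})  # calling the class
```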

Waarbubble on 7 Jun 2019

👍3

Thanks...I changed it now. It's behaving the same way...three times I got the 403-error and then it worked.
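
Since the error comes and goes, one stopgap is to retry on a 403. The sketch below is an assumption-laden illustration, not part of pytube: call_with_retry is a hypothetical helper, and retrying only hides the symptom rather than fixing the signature problem behind it.

```python
import time
from urllib.error import HTTPError


def call_with_retry(func, attempts=4, delay=1.0):
    """Call func(); on HTTP 403 wait and try again, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except HTTPError as err:
            # Re-raise anything that is not a 403, or the final failure.
            if err.code != 403 or attempt == attempts:
                raise
            time.sleep(delay)
```

It could be wrapped around a download call, for example call_with_retry(lambda: stream.download()).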

NeverAskWhy on 7 Jun 2019

I'm getting the same error. I tried all the solutions above; some work and some don't.

hoetaek on 11 Jun 2019

👍1

This got much more frequent for me today for some reason. The above solutions do not seem to work for the new cases. Example video https://www.youtube.com/watch?v=ZYoUXaU81Gk

danielgordon10 on 11 Jun 2019

I believe I've managed to fix this by changing &signature= in line 64 of mixins.py to &sig=.

fofofofofofofofo on 11 Jun 2019

👍11

The solution works for me, no more random 403

Waarbubble on 11 Jun 2019

Neither suggested solution fixes my test video https://www.youtube.com/watch?v=ZYoUXaU81Gk

danielgordon10 on 11 Jun 2019

👍1

@danielgordon10 What error are you getting? I just tried downloading that link and it seems to be working with the &sig= fix.

pytube https://www.youtube.com/watch?v=ZYoUXaU81Gk --list
<Stream: itag="43" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp8.0" acodec="vorbis">
<Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">
<Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d400b">
<Stream: itag="244" mime_type="video/webm" res="480p" fps="30fps" vcodec="vp9">
<Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">
<Stream: itag="243" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp9">
<Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d400c">
<Stream: itag="242" mime_type="video/webm" res="240p" fps="30fps" vcodec="vp9">
<Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400b">
<Stream: itag="278" mime_type="video/webm" res="144p" fps="30fps" vcodec="vp9">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">
<Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus">

pytube https://www.youtube.com/watch?v=ZYoUXaU81Gk --itag 251
Мисс Новоуральск 2012.webm | 7446370 bytes

fofofofofofofofo on 11 Jun 2019

So I am able to get the stream but I can't download it.

import pytube
video_url = 'https://www.youtube.com/watch?v=ZYoUXaU81Gk'
yt = pytube.YouTube(video_url)
streams = yt.streams
streams = (
        streams.filter(progressive=True, custom_filter_functions=[lambda x: x.resolution is not None])
        .order_by("resolution")
        .all()
)
stream = streams[0]
video = stream.download()

Then I get the urllib.error.HTTPError

    video = stream.download()
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/streams.py", line 217, in download
    bytes_remaining = self.filesize
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/streams.py", line 164, in filesize
    headers = request.get(self.url, headers=True)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/request.py", line 22, in get
    response = urlopen(url)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

danielgordon10 on 11 Jun 2019

👍3

I've only used this in CLI before so I'm not sure, my guess would be to check if it's importing the correct one.

fofofofofofofofo on 12 Jun 2019

It works fine for other videos...

danielgordon10 on 12 Jun 2019

I think I have a working fix now for this and a related issue (https://github.com/nficano/pytube/issues/392). It's not pretty, but it seems to get the job done.
in mixins.py

        if ('signature=' in url or 
                ('s' not in stream and 
                 ('&sig=' in url or '&lsig=' in url))):
            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature
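
The decision made by the condition above can be pulled out into a standalone function to make it easier to reason about. This is a sketch only: needs_decipher and sign_url are hypothetical names, not pytube functions, and the URLs used to try it out are made up.

```python
def needs_decipher(url, stream):
    """Return True when the stream URL still needs a descrambled signature."""
    if 'signature=' in url:
        return False
    # '&sig=' or '&lsig=' alone is not enough: a stream that still carries
    # a scrambled 's' value must be deciphered.
    if 's' not in stream and ('&sig=' in url or '&lsig=' in url):
        return False
    return True


def sign_url(url, signature):
    """Append the descrambled signature using the '&sig=' parameter name."""
    return url + '&sig=' + signature
```

The key case is a URL that contains '&lsig=' while the stream still has an 's' entry: it is no longer skipped.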

danielgordon10 on 13 Jun 2019

👍20 🎉10 ❤4 🚀1

I tried the first fix from @Nowbob, but the problem persisted.
I then changed &signature= to &sig= in line 64 of mixins.py, as @fofofofofofofofo suggested, and now it's working again. Perfectly on Python 2.7.

rsistema on 17 Jun 2019

still have that error !?

TaKeO90 on 18 Jun 2019

The mixins.py fix from @danielgordon10 above works, thank you!
But I modified two files, mixins.py and cipher.py.
In mixins.py (ref #402), change
if ('signature=' in url) or ('&sig=' in url) or ('&lsig=' in url):
to
if ('signature=' in url or
('s' not in stream and
('&sig=' in url or '&lsig=' in url))):

and

stream_manifest[i]['url'] = url + '&signature=' + signature
to
stream_manifest[i]['url'] = url + '&sig=' + signature

and change pattern in cipher.py
pattern = [
r'yt.akamaized.net/)\s\|\|\s'
r'.?\sc\s&&\sd.set([^,]+\s,\s(?:encodeURIComponent'
r'\s()?(?P[a-zA-Z0-9$]+)(',
r'.sig\|\|(?P[a-zA-Z0-9$]+)(',
r'\bc\s&&\sd.set([^,]+\s,\s(?:encodeURIComponent'
r'\s()?(?P[a-zA-Z0-9$]+)(',
]
to
pattern = [
r'\b[cs]\s&&\s[adf].set([^,]+\s,\sencodeURIComponent\s(\s(?P[a-zA-Z0-9$]+)(',
r'\b[a-zA-Z0-9]+\s&&\s[a-zA-Z0-9]+.set([^,]+\s,\sencodeURIComponent\s(\s(?P[a-zA-Z0-9$]+)(',
r'(?P[a-zA-Z0-9$]+)\s=\sfunction(\sa\s)\s{\sa\s=\sa.split(\s""\s)',
r'(["\'])signature\1\s,\s(?P[a-zA-Z0-9$]+)(',
r'.sig\|\|(?P[a-zA-Z0-9$]+)(',
r'yt.akamaized.net/)\s\|\|\s.?\s[cs]\s&&\s[adf].set([^,]+\s,\s(?:encodeURIComponent\s()?\s'
r'(?P[a-zA-Z0-9$]+)(',
r'\b[cs]\s&&\s[adf].set([^,]+\s,\s(?P[a-zA-Z0-9$]+)(',
r'\b[a-zA-Z0-9]+\s&&\s[a-zA-Z0-9]+.set([^,]+\s,\s(?P[a-zA-Z0-9$]+)(',
r'\bc\s&&\sa.set([^,]+\s,\s([^)])\s(\s(?P[a-zA-Z0-9$]+)(',
r'\bc\s&&\s[a-zA-Z0-9]+.set([^,]+\s,\s([^)])\s(\s(?P[a-zA-Z0-9$]+)(',
r'\bc\s&&\s[a-zA-Z0-9]+.set([^,]+\s,\s([^)])\s(\s*(?P[a-zA-Z0-9$]+)('
]

I forget the issue number for this; can someone help me link it?
Works on Python 3.6, pytube 9.5.0, urllib3 1.25.3.

genesysvip on 27 Jun 2019

👍2

Changing the regex pattern to the value from @genesysvip's comment above returns a SyntaxError.

Traceback (most recent call last):
  File "get_file.py", line 9, in <module>
    from pytube import YouTube
  File "/usr/local/lib/python3.5/dist-packages/pytube/__init__.py", line 18, in <module>
    from pytube.contrib.playlist import Playlist
  File "/usr/local/lib/python3.5/dist-packages/pytube/contrib/playlist.py", line 11, in <module>
    from pytube.__main__ import YouTube
  File "/usr/local/lib/python3.5/dist-packages/pytube/__main__.py", line 18, in <module>
    from pytube import mixins
  File "/usr/local/lib/python3.5/dist-packages/pytube/mixins.py", line 9, in <module>
    from pytube import cipher
  File "/usr/local/lib/python3.5/dist-packages/pytube/cipher.py", line 43
    r'(["'])signature\1\s*,\s*(?P[a-zA-Z0-9$]+)(',
           ^
SyntaxError: invalid syntax
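
The SyntaxError comes from the unescaped single quote inside a single-quoted string: the string literal ends right after (["' and the rest is no longer valid Python. A minimal sketch of the escaped form follows; the sample inputs are made up for illustration and are not taken from a real player script.

```python
import re

# The backslash keeps the single quote from ending the raw string; the
# regex engine then sees a character class matching either quote character.
pattern = r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\('

compiled = re.compile(pattern)

# Illustrative inputs only.
double_quoted = compiled.search('a.set("signature", Xy$9(b))')
single_quoted = compiled.search("a.set('signature',fn(b))")
```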

ogspeace on 29 Jun 2019

Sorry, I copied all the code from GitHub but still get errors. Any help would be appreciated, please.

URL: https://www.youtube.com/watch?v=iWZmdoY1aTE
Traceback (most recent call last):
  File "pytube_mp3.py", line 9, in <module>
    stream.download("/home/fisheep/Desktop")
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/streams.py", line 202, in download
    bytes_remaining = self.filesize
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/streams.py", line 153, in filesize
    headers = request.get(self.url, headers=True)
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/request.py", line 21, in get
    response = urlopen(req)
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

Fisheep1207 on 2 Jul 2019

I am also still having difficulty after applying several of the fixes. I am not getting the 403 error. Now I am getting a regex error:

regex pattern (yt.akamaized.net/)\s\|\|\s.?\sc\s&&\sd.set([^,]+\s,\s(?:encodeURIComponent\s*()?(?P[a-zA-Z0-9$]+)() had zero matches

I've tried changing the pattern to several of the suggested versions with no luck.

On PyTube 9.5.1 on Python 3.7.3 in a virtual env on Kubuntu

General-Gouda on 3 Jul 2019

Same here. Applying the mixins.py fix (I believe I did this correctly) and the new regex listed above (replacing the faulty ' ) gives me:

  File "/home/user/.local/lib/python3.5/site-packages/pytube/mixins.py", line 51, in apply_signature
    signature = cipher.get_signature(js, stream['s'])
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 258, in get_signature
    tplan = get_transform_plan(js)
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 77, in get_transform_plan
    name = re.escape(get_initial_function_name(js))
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 53, in get_initial_function_name
    return regex_search(pattern, js, group=1)
  File "/home/user/.local/lib/python3.5/site-packages/pytube/helpers.py", line 36, in regex_search
    regex = re.compile(p, flags)
  File "/usr/lib/python3.5/re.py", line 224, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.5/re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.5/sre_compile.py", line 536, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.5/sre_parse.py", line 829, in parse
    p = _parse_sub(source, pattern, 0)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
    p = _parse_sub(source, state)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
    p = _parse_sub(source, state)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 689, in _parse
    len(char) + 2)
sre_constants.error: unknown extension ?P[ at position 60
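
The "unknown extension ?P[" message means the group name went missing: a named group must be written (?P<name>...), and in the copied pattern the <sig> part was dropped, most likely eaten as if it were an HTML tag. A small sketch of the difference, using one of the shorter patterns from this thread and a made-up input string:

```python
import re

broken = r'\.sig\|\|(?P[a-zA-Z0-9$]+)\('       # group name lost in copy/paste
intact = r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\('  # named group written in full

try:
    re.compile(broken)
    broken_error = None
except re.error as err:
    broken_error = str(err)  # starts with "unknown extension ?P["

# Illustrative input only, not taken from a real player script.
match = re.search(intact, 'c.sig||Ab$1(d)')
```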

johnberroa on 3 Jul 2019

same error here...

runa91 on 4 Jul 2019

Same regex error as @General-Gouda here, with Python 2.7.5 and CentOS 7.

nhatton96 on 12 Jul 2019

changing pattern in cipher.py to

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]

fixes it for me (regex stolen from youtube-dl)

GravelCZ on 13 Jul 2019

👍4 ❤2 🎉2

Apart from a missing single quotation mark and a comma, @GravelCZ's pattern works for me, thanks.
Here is the fix:

    pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]
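
One thing worth checking after pasting a list like this is that every entry compiles. An entry that was cut off by the editor (ending in (?P<si$) is not a valid regex on its own, and its missing tail should be copied from the original source rather than guessed. The helper below is a hypothetical sketch, not part of pytube; find_bad_patterns is a made-up name.

```python
import re


def find_bad_patterns(patterns):
    """Return (index, error message) for every pattern that fails to compile."""
    bad = []
    for index, pattern in enumerate(patterns):
        try:
            re.compile(pattern)
        except re.error as err:
            bad.append((index, str(err)))
    return bad


patterns = [
    r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
    # Cut off mid-group, as in the long entry above.
    r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$',
]
bad_patterns = find_bad_patterns(patterns)
```

Here bad_patterns lists only the second entry, because its named group is never closed.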

nhatton96 on 15 Jul 2019

👍12 ❤7 🎉3

Neither seems to work for me. I tried changing the user agent, changing &signature to &sig in the stream manifest URL, and changing the pattern in cipher.py, but I'm still getting a 403. That's under Python 3 in a venv.

jrial on 16 Jul 2019

Hi, it works for me.
I changed the cipher.py and the mixins.py as suggested.
My configuration is: mac Mojave and python 3.6, pytube 9.5.1
thanks everybody for the help.

diediaga on 23 Jul 2019

👍1

Thank you pytube hackers... works for me with cipher.py/mixins.py as listed. Mac El Capitan / Py3.6 / PyTube 9.5.1 -- ❤️ pytube

peterrenshaw on 25 Jul 2019

works for me too with cipher.py and the mixins.py as suggested.
My configuration: win10 / Py 3.7.4 /pytube 9.5.1

choten on 5 Aug 2019

Thank you! Works for me as well on Python 3.6.5, urllib3 1.25.3, and PyTube 9.5.1.

To summarize the patches needed:

mixins.py - patch by @danielgordon10 in https://github.com/nficano/pytube/issues/399#issuecomment-501814505
cipher.py - edit of @GravelCZ's patch by @nhatton96 at the bottom of https://github.com/nficano/pytube/issues/399#issuecomment-511535993

victorgregorio on 9 Aug 2019

👍4

Worked for me with @GravelCZ's pattern and @nhatton96's fix + the issue at this URL:
https://github.com/nficano/pytube/issues/399 (mentioned by @ogspeace)
On Windows10, Python 3.7.4, Pytube 9.5.1
Thanks a lot!

SofiyanIfren on 12 Aug 2019

👍1

Not working for me when applying the fixes of @victorgregorio or @nahatton96

I tried inspecting cURL requests of a browser playing a youtube video and a url returned by pytube, and it seems that some GET parameters may be missing from the fetch requests performed by pytube. I get the 403 error systematically, so I think it might be related to something like this.

MacOS 10.13, python 3.6.0, pytube 9.5.1
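
A comparison like the one described here can be done with the standard library. The sketch below is only an illustration: missing_params is a hypothetical helper, and which parameters actually matter to YouTube is not something this thread establishes.

```python
from urllib.parse import urlsplit, parse_qs


def missing_params(browser_url, pytube_url):
    """Return query parameter names present in browser_url but not pytube_url."""
    browser_keys = set(parse_qs(urlsplit(browser_url).query, keep_blank_values=True))
    pytube_keys = set(parse_qs(urlsplit(pytube_url).query, keep_blank_values=True))
    return sorted(browser_keys - pytube_keys)
```

Passing the URL captured from the browser and the URL generated by pytube gives the list of parameter names that only the browser sends.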

JeffMv on 12 Aug 2019

😕2

With 9.5.1 (the latest downloaded) I tinkered with all the fixes from this thread and finally got it working. There are two separate issues that I had to tackle:
1) KeyError:['title'] and
2) "403: Forbidden"

1) The first one to solve was the KeyError:['title'].
For some reason the 'title' key is no longer present at the root of the player_config_args dictionary, hence the error. While debugging the request/reply I found it is instead present in "player_response/videoDetails/title", so edit __main__.py:

def title(self):
        # Get the video title.
        #
        # :rtype: str
        try:
                tt1 = self.player_config_args['title']
        except KeyError:
                # Fall back to the title nested under player_response/videoDetails.
                tt1 = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        if not tt1:
                tt1 = "Unknown YTube video"
        return tt1

and streams.py:

    def default_filename(self):
        """Generate filename based on the video title.

        :rtype: str
        :returns:
            An os file system compatible filename.
        """
        try:
                title = self.player_config_args['title']
        except KeyError:
                # Fall back to the title nested under player_response/videoDetails.
                title = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        if not title:
                title = "Unknown YTube video"
        filename = safe_filename(title)
        return '{filename}.{s.subtype}'.format(filename=filename, s=self)
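
Both edits repeat the same lookup, so it could also live in one small function. This is a sketch under assumptions, not pytube code: extract_title is a hypothetical name, and the branch that decodes player_response from a JSON string is only a guess that some code paths might still hold it undecoded.

```python
import json


def extract_title(player_config_args, default="Unknown YTube video"):
    """Look up the video title in either of the two locations described above."""
    title = player_config_args.get('title')
    if title:
        return title
    player_response = player_config_args.get('player_response') or {}
    # Assumption: the response may still be a JSON string in some code paths.
    if isinstance(player_response, str):
        player_response = json.loads(player_response)
    return player_response.get('videoDetails', {}).get('title') or default
```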

I also applied the fix found in this thread to request.py about the User-Agent:
from urllib2 import Request
and

    req = Request(url, headers = {"User-Agent": "Mozilla/5.0"})
    #response = urlopen(url)
    response = urlopen(req)

2) For the "403 Forbidden" I first applied the suggestion about mixins.py in the apply_signature function:

        if ('signature=' in url or
                ('s' not in stream and
                 ('&sig=' in url or '&lsig=' in url))):

            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature

then in the cipher.py I replaced the regex pattern as suggested in another post of this thread:

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]

gittethis on 16 Aug 2019

👍5 😄4 🎉1

@gittethis
You're the best!
Thanks for the fix.

martis-git on 16 Aug 2019

😄1

I tried for 15 minutes before realising that, for the mixins.py fix, you have to replace the whole snippet, not just change signature to sig.

Now it works, thanks a lot !

snwfdhmp on 26 Aug 2019

👍2

With 9.5.1 (the latest downloaded) I tinkered with all the fixes from this thread and finally got it working. There are two separate issues that i had to tackle:

1. KeyError:['title'] and

2. "403: Forbidden"

def title(self):
        #Get the video title.

        #:rtype: str

        try:
                tt1 = self.player_config_args['title']
        except:
                tt1 = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        finally:
                if not tt1:
                        tt1 = "Unknown YTube video"
        return tt1

and streams.py:

    def default_filename(self):
        """Generate filename based on the video title.

        :rtype: str
        :returns:
            An os file system compatible filename.
        """
        try:
                title = self.player_config_args['title']
        except:
                title = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        finally:
                if not title:
                        title = "Unknown YTube video"
        filename = safe_filename(title)
        return '{filename}.{s.subtype}'.format(filename=filename, s=self)

I did also apply the fix found in this thread to request.py about the Agent-User:
from urllib2 import Request
and

req = Request(url, headers = {"User-Agent": "Mozilla/5.0"})
    #response = urlopen(url)
    response = urlopen(req)

2)For the "403 Forbidden" I applied first the suggestion about mixins.py in the apply_signature function:

         if ('signature=' in url or
                ('s' not in stream and
                ('&sig=' in url or '&lsig=' in url))):

            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature
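
The condition at the top of that patch can be read in isolation as a small predicate. The sketch below is an illustration only: is_presigned is a hypothetical name, and stream is assumed to be a plain dict such as one entry of stream_manifest.

```python
def is_presigned(url, stream):
    """Return True when the stream URL already carries a usable signature.

    Hypothetical helper mirroring the condition in the mixins.py patch.
    """
    if "signature=" in url:
        return True
    # There is no scrambled 's' value to decipher and the URL already
    # contains a sig/lsig parameter, so descrambling can be skipped.
    return "s" not in stream and ("&sig=" in url or "&lsig=" in url)
```

The second branch is the part the patch adds: a URL containing '&sig=' or '&lsig=' is treated as pre-signed only when the stream has no 's' entry left to descramble.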

Then, in cipher.py, I replaced the regex patterns as suggested in another post in this thread:

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]
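
For readers wondering how a list like this is typically used: each pattern is tried in order and the first match supplies the name of the signature function through the named group sig. The sketch below shows that idea under assumptions: find_signature_function_name and the sample JavaScript string are made up for illustration, and only two of the patterns above are included.

```python
import re

# Two of the patterns from the list above, used here only as examples.
EXAMPLE_PATTERNS = [
    r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
    r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
]


def find_signature_function_name(js, patterns=EXAMPLE_PATTERNS):
    """Return the first 'sig' group matched by any pattern, or None."""
    for pattern in patterns:
        match = re.search(pattern, js)
        if match:
            return match.group("sig")
    return None


# Made-up JavaScript fragment, not real player code.
sample_js = 'c&&a.set("signature",Xy(d));'
name = find_signature_function_name(sample_js)
```

Because the first matching pattern wins, the order of the list matters: more specific patterns should come before the broad ones.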

@gittethis's solutions work; I tried them on my scripts. However, here's a slight tweak for Python 3 users:

In request.py, instead of: from urllib2 import Request

use this: from urllib.request import Request
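
Put together, a Python 3 version of the User-Agent change could look like the sketch below. It uses only the standard library. build_request and fetch are hypothetical names for illustration; the real request.py may be structured differently.

```python
from urllib.request import Request, urlopen


def build_request(url, user_agent="Mozilla/5.0"):
    """Wrap the URL in a Request that sends a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Open the URL with the custom header and return the raw bytes."""
    with urlopen(build_request(url)) as response:
        return response.read()
```

Note that several comments in this thread report still getting a 403 after this change alone, so it should be seen as one part of the fix rather than a complete one.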

/ogs

ogspeace on 27 Aug 2019

👍5

Seeing as how this appears to be solved with the changes described in various comments, did anyone create a pull request already? I noticed one while quickly scanning this issue, but that one's CI tests failed.

jrial on 27 Aug 2019

@jrial ,

Unfortunately, I seem to have applied all the fixes so helpfully summarized by @ogspeace, and I'm still getting "urllib.error.HTTPError: HTTP Error 403: Forbidden"

Algomorph on 8 Sep 2019

Thank you! Works for me as well on Python 3.6.5, urllib3 1.25.3, and PyTube 9.5.1.

To summarize the patches needed:

mixins.py - patch by @danielgordon10 in #399 (comment)

cipher.py - edit of @GravelCZ's patch by @nhatton96 at the bottom of #399 (comment)

This also worked very well for me.

halfendt on 13 Sep 2019

I changed both mixins.py and cipher.py, but I still got HTTPError: Forbidden

xyang5987 on 16 Sep 2019

I use Python 3.7 and pytube 9.5.2 with win 10.

xyang5987 on 16 Sep 2019

Thank you! Works for me as well on Python 3.6.5, urllib3 1.25.3, and PyTube 9.5.1.

To summarize the patches needed:

mixins.py - patch by @danielgordon10 in #399 (comment)

cipher.py - edit of @GravelCZ's patch by @nhatton96 at the bottom of #399 (comment)

I tried this in Windows 10, and it did finally work. Python 3.7.4, PyTube 9.5.2 with the above 2 patches. I also had to apply the title retrieval patch in __main__.py and streams.py as outlined by @ogspeace. Did not have to mess with urllib for whatever reason, but maybe just got lucky on the ~50 videos I've tried.

Algomorph on 18 Sep 2019

😄1 👍1

I use Python 3.7 and pytube 9.5.2 with win 10.

Did you also use my patches for "title" retrieval in __main__.py and streams.py, as outlined and confirmed by @ogspeace?

gittethis on 18 Sep 2019

To keep things clear here, the 'title' retrieval patches fixed a separate issue, e.g. using you_tube_object.video after it's been created. It doesn't affect the "HTTPError: Forbidden" issue, and as such should be kept in a separate PR.

Algomorph on 18 Sep 2019

For me, the errors are not occurring randomly. They always seem to occur when I attempt to download music videos.

eyobofficial on 26 Sep 2019

Not working.
It seems like a good option for downloading YouTube content, but there is no support, so let's look for other options.

santoslopes on 19 Oct 2019

I think I have a working fix now for this and a related issue (#392). It's not pretty, but it seems to get the job done.
in mixins.py

        if ('signature=' in url or 
                ('s' not in stream and 
                 ('&sig=' in url or '&lsig=' in url))):
            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature

How do I find this mixins.py file?

MrSlagovich on 2 Nov 2019

👍1

I tried @Nowbob's fix but it doesn't work; it says Request is not defined.

YashKarthik on 3 Dec 2019

You’re missing the Request module in your python installation.

gittethis on 3 Dec 2019

@YashKarthik @gittethis fixed in pytube3: https://github.com/hbmartin/pytube3/

hbmartin on 6 Feb 2020

The solutions in this thread don't work for me.
Probably because of copyright?

Mac OS Mojave, Python 3.7

yijun-li-20 on 9 Feb 2020

@yijun-li-20 what url are you looking at? what version of pytube are you using?

hbmartin on 9 Feb 2020

Hi, with pytube3 9.6.4 I get urllib.error.HTTPError: HTTP Error 403: Forbidden on this video:
https://www.youtube.com/watch?v=yEG2VTHS9yg

makesnosense on 29 Mar 2020

Where is the mixins.py file in urllib3?

aulorbe on 5 May 2020

I am using Python3 and the following changes worked for me:

Basically, you study the structure of your URL. In my case, I'm working with the YouTube Data API, so a sample URL looked like:

https://www.googleapis.com/youtube/v3/videos?part=snippet&id=8SbUC-UaAxE&key=AIzaSyAVtcurxxUyQznBtLU5UmxqrRSENZ6gAIA

I used the urllib.parse library and then generated the above URL using the strategy below:

import urllib.parse

params = {'part': 'snippet', 'id': 'VIDEO_KEY_goes_here', 'key': 'YOUR_API_KEY_goes_here'}
querystring = urllib.parse.urlencode(params)
stats_url = 'https://www.googleapis.com/youtube/v3/videos' + '?' + querystring

amitbirajdar on 1 Aug 2020

👍1

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.