Pytube: urllib.error.HTTPError: HTTP Error 403: Forbidden

Created on 6 Jun 2019  ·  62 Comments  ·  Source: pytube/pytube

Hi all,

I'm aware that there are some older issues with the same error; however, I thought I would give it a try anyway. For some videos (like this one: https://www.youtube.com/watch?v=393C3pr2ioY) I get a 403 error; others work fine. Do you know of any workaround? I saw this thread linked from the older issue, https://stackoverflow.com/questions/16627227/http-error-403-in-python-3-web-scraping, but I'm not good enough at Python to find the right spot to implement it. Any help would be appreciated.

PS: I already updated to the most current version (9.5.0).

All 62 comments

I am having the same problem, but for me, it is random whether the same youtube link will generate a 403 or not. My current testing links are:
https://www.youtube.com/watch?v=pAgnJDJN4VA&index=117&list=PL3LCt07uiILKduYr52ZMYrE_PvbSuUdZM
https://www.youtube.com/watch?v=t9psCAWrISg&index=136&list=PL3LCt07uiILKduYr52ZMYrE_PvbSuUdZM
I am only trying to download the audio stream.

It's a hack of a fix, but this is how I've got mine working for now (I'm running Python 2.7; you may need to fiddle depending on your version).
In pytube/request.py, at line 4 I added:

    from urllib2 import Request

Above line 21 (response = urlopen(url)) I added:

    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})

Then I replaced the line below it:

    response = urlopen(url)

with:

    response = urlopen(req)

Just tested the solution, keeps randomly giving 403, and then working again.

@Waarbubble : you mean you tested @Nowbob's solution?

yes
Tested in python 3

Yeah I was having the same "sometimes it works sometimes it doesn't" problem, but it's the best I could come up with in a short time.

It was worth a try.

I'm doing something wrong...
On line four of request.py:

    from urllib import request

then on lines 22/23:

    req = request(url, headers = {"User-Agent": "Mozilla/5.0"})
    response = urlopen(req)

now I'm getting this error:

self.watch_html = request.get(url=self.watch_url)

File "C:\Users\janvo\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pytube\request.py", line 22, in get
req = request(url, headers = {"User-Agent": "Mozilla/5.0"})
TypeError: 'module' object is not callable

If you are using Python 3, the fourth line needs to be

    from urllib.request import Request

and then line 22 will need a capital R in Request.
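For Python 3 readers, the request.py change described above might look roughly like the sketch below. It is a standalone illustration, not pytube's actual request.py: build_request and get are hypothetical names, and only the User-Agent value comes from this thread.

```python
from urllib.request import Request, urlopen


def build_request(url, user_agent="Mozilla/5.0"):
    """Hypothetical helper: wrap the URL in a Request with a browser-like User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


def get(url):
    """Simplified stand-in for pytube's request.get: fetch the URL and decode the body."""
    response = urlopen(build_request(url))
    return response.read().decode("utf-8")
```

As later comments report, this header alone does not seem to remove the intermittent 403, so treat it as one ingredient rather than a complete fix.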

Thanks...I changed it now. It's behaving the same way...three times I got the 403-error and then it worked.

Getting the same error. I tried all the solutions above; some work and some don't.

This got much more frequent for me today for some reason. The above solutions do not seem to work for the new cases. Example video https://www.youtube.com/watch?v=ZYoUXaU81Gk

I believe I've managed to fix this by changing &signature= in line 64 of mixins.py to &sig=.

The solution works for me, no more random 403

Neither suggested solution fixes my test video https://www.youtube.com/watch?v=ZYoUXaU81Gk

@danielgordon10 What error are you getting? I just tried downloading that link and it seems to be working with the &sig= fix.

pytube https://www.youtube.com/watch?v=ZYoUXaU81Gk --list
<Stream: itag="43" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp8.0" acodec="vorbis">
<Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">
<Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d400b">
<Stream: itag="244" mime_type="video/webm" res="480p" fps="30fps" vcodec="vp9">
<Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">
<Stream: itag="243" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp9">
<Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d400c">
<Stream: itag="242" mime_type="video/webm" res="240p" fps="30fps" vcodec="vp9">
<Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400b">
<Stream: itag="278" mime_type="video/webm" res="144p" fps="30fps" vcodec="vp9">
<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">
<Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">
<Stream: itag="249" mime_type="audio/webm" abr="50kbps" acodec="opus">
<Stream: itag="250" mime_type="audio/webm" abr="70kbps" acodec="opus">
<Stream: itag="251" mime_type="audio/webm" abr="160kbps" acodec="opus">
pytube https://www.youtube.com/watch?v=ZYoUXaU81Gk --itag 251
Мисс Новоуральск 2012.webm | 7446370 bytes

So I am able to get the stream but I can't download it.

import pytube
video_url = 'https://www.youtube.com/watch?v=ZYoUXaU81Gk'
yt = pytube.YouTube(video_url)
streams = yt.streams
streams = (
        streams.filter(progressive=True, custom_filter_functions=[lambda x: x.resolution is not None])
        .order_by("resolution")
        .all()
)
stream = streams[0]
video = stream.download()

Then I get the urllib.error.HTTPError

    video = stream.download()
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/streams.py", line 217, in download
    bytes_remaining = self.filesize
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/streams.py", line 164, in filesize
    headers = request.get(self.url, headers=True)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/site-packages/pytube/request.py", line 22, in get
    response = urlopen(url)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/home/xkcd/miniconda3/envs/video-env/lib/python3.7/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

I've only used this in CLI before so I'm not sure, my guess would be to check if it's importing the correct one.

It works fine for other videos...

I think I have a working fix now for this and a related issue (https://github.com/nficano/pytube/issues/392). It's not pretty, but it seems to get the job done.
in mixins.py

        if ('signature=' in url or 
                ('s' not in stream and 
                 ('&sig=' in url or '&lsig=' in url))):
            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature
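To make the condition in that snippet easier to test in isolation, here is a minimal standalone sketch of the same decision. The names needs_decipher and apply_signature_sketch are hypothetical, and the decipher argument is a placeholder callable standing in for cipher.get_signature(js, stream['s']); it does not reproduce pytube's real descrambling.

```python
def needs_decipher(url, stream):
    """Return True when the stream URL still needs a descrambled signature."""
    presigned = ('signature=' in url or
                 ('s' not in stream and
                  ('&sig=' in url or '&lsig=' in url)))
    return not presigned


def apply_signature_sketch(stream_manifest, decipher):
    """Append '&sig=<signature>' to every stream URL that is not pre-signed."""
    for stream in stream_manifest:
        url = stream['url']
        if not needs_decipher(url, stream):
            # Pre-signed URL: leave it untouched.
            continue
        stream['url'] = url + '&sig=' + decipher(stream['s'])


# Made-up manifest entries, only for illustration.
manifest = [
    {'url': 'https://example.invalid/videoplayback?itag=18&signature=ABC'},
    {'url': 'https://example.invalid/videoplayback?itag=22&lsig=XYZ', 's': 'abc'},
]
apply_signature_sketch(manifest, decipher=lambda s: s[::-1])
```

The second entry shows the case the patch seems aimed at: the URL contains &lsig= but the stream still carries an 's' value, so it is deciphered instead of being skipped.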

I tried the first fix from @Nowbob, but the problem persisted.
I changed line 64 as you said and now it's working again. Works perfectly on Python 2.7.

I still have that error!?

It works, thank you!
But I modified two files, mixins.py and cipher.py.

In mixins.py (ref #402), change

    if ('signature=' in url) or ('&sig=' in url) or ('&lsig=' in url):

to

    if ('signature=' in url or
            ('s' not in stream and
             ('&sig=' in url or '&lsig=' in url))):

and

    stream_manifest[i]['url'] = url + '&signature=' + signature

to

    stream_manifest[i]['url'] = url + '&sig=' + signature

and change pattern in cipher.py from

    pattern = [
        r'yt\.akamaized\.net/\)\s*\|\|\s*'
        r'.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent'
        r'\s*\()?(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent'
        r'\s*\()?(?P<sig>[a-zA-Z0-9$]+)\(',
    ]

to

    pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*'
        r'(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]

I forget this issue number; can someone help me link it?
Works on Python 3.6, pytube 9.5.0, urllib3 1.25.3.

changing the regex pattern to that value returns a SyntaxError.

Traceback (most recent call last):
  File "get_file.py", line 9, in <module>
    from pytube import YouTube
  File "/usr/local/lib/python3.5/dist-packages/pytube/__init__.py", line 18, in <module>
    from pytube.contrib.playlist import Playlist
  File "/usr/local/lib/python3.5/dist-packages/pytube/contrib/playlist.py", line 11, in <module>
    from pytube.__main__ import YouTube
  File "/usr/local/lib/python3.5/dist-packages/pytube/__main__.py", line 18, in <module>
    from pytube import mixins
  File "/usr/local/lib/python3.5/dist-packages/pytube/mixins.py", line 9, in <module>
    from pytube import cipher
  File "/usr/local/lib/python3.5/dist-packages/pytube/cipher.py", line 43
    r'(["'])signature\1\s*,\s*(?P[a-zA-Z0-9$]+)(',
           ^
SyntaxError: invalid syntax

Sorry, I copied all the code from GitHub but still get errors. Any help would be appreciated, please.

URL: https://www.youtube.com/watch?v=iWZmdoY1aTE
Traceback (most recent call last):
  File "pytube_mp3.py", line 9, in <module>
    stream.download("/home/fisheep/Desktop")
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/streams.py", line 202, in download
    bytes_remaining = self.filesize
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/streams.py", line 153, in filesize
    headers = request.get(self.url, headers=True)
  File "/home/fisheep/.local/lib/python3.6/site-packages/pytube/request.py", line 21, in get
    response = urlopen(req)
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 532, in open
    response = meth(req, response)
  File "/usr/lib/python3.6/urllib/request.py", line 642, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.6/urllib/request.py", line 570, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 504, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.6/urllib/request.py", line 650, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

I am also still having difficulty after applying several of the fixes. I am not getting the 403 error. Now I am getting a regex error:

regex pattern (yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?(?P<sig>[a-zA-Z0-9$]+)\() had zero matches

I've tried changing the pattern to several of the suggested versions with no luck.

On PyTube 9.5.1 on Python 3.7.3 in a virtual env on Kubuntu

Same here. Applying the mixins.py fix (I believe I did this correctly) and the new regex listed above (replacing the faulty ' ) gives me:

  File "/home/user/.local/lib/python3.5/site-packages/pytube/mixins.py", line 51, in apply_signature
    signature = cipher.get_signature(js, stream['s'])
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 258, in get_signature
    tplan = get_transform_plan(js)
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 77, in get_transform_plan
    name = re.escape(get_initial_function_name(js))
  File "/home/user/.local/lib/python3.5/site-packages/pytube/cipher.py", line 53, in get_initial_function_name
    return regex_search(pattern, js, group=1)
  File "/home/user/.local/lib/python3.5/site-packages/pytube/helpers.py", line 36, in regex_search
    regex = re.compile(p, flags)
  File "/usr/lib/python3.5/re.py", line 224, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python3.5/re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.5/sre_compile.py", line 536, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.5/sre_parse.py", line 829, in parse
    p = _parse_sub(source, pattern, 0)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
    p = _parse_sub(source, state)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 778, in _parse
    p = _parse_sub(source, state)
  File "/usr/lib/python3.5/sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "/usr/lib/python3.5/sre_parse.py", line 689, in _parse
    len(char) + 2)
sre_constants.error: unknown extension ?P[ at position 60
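The "unknown extension ?P[" message is what Python's re module raises when a named group has lost its name, which is probably what happens when a pattern is pasted without code formatting and the <sig> part gets swallowed as markup (that cause is an assumption). A small sketch to check a pattern before pasting it into cipher.py; the sample JavaScript fragment is made up:

```python
import re

# Pattern as it looks once "<sig>" has been stripped from the named group.
mangled = r'\.sig\|\|(?P[a-zA-Z0-9$]+)\('
# The same pattern with the named group intact.
intact = r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\('


def compiles(pattern):
    """Return True if the pattern is a valid regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Made-up JavaScript fragment, only for illustration.
sample_js = 'a.sig||Xy$1(b)'
match = re.search(intact, sample_js)
function_name = match.group('sig') if match else None
```

If compiles() returns False for any entry in the list, the copy was most likely damaged on the way, and every (?P<sig>...) group and \s* quantifier is worth re-checking.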

same error here...

same here with python 2.7.5 and centos 7

changing pattern in cipher.py to

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]

fixes it for me (regex stolen from youtube-dl)

Apart from a missing single quotation mark and a comma, this works for me, thanks.
Here is the fix:

    pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]
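Judging by the tracebacks in this thread, pytube hands this list to a helper that tries the patterns one after another. A simplified sketch of that idea follows; first_sig_match is a hypothetical name, it returns None instead of raising as pytube's regex_search appears to do, and the JavaScript fragments are made up.

```python
import re

# Two of the patterns from the list above, enough to show the fallback order.
PATTERNS = [
    r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
    r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
]


def first_sig_match(patterns, js):
    """Return the 'sig' group of the first pattern that matches, else None."""
    for pattern in patterns:
        match = re.search(pattern, js)
        if match:
            return match.group('sig')
    return None


# Made-up player JavaScript fragments, only for illustration.
name_a = first_sig_match(PATTERNS, 'c&&d.set(b,encodeURIComponent(Kx(a)));')
name_b = first_sig_match(PATTERNS, 'a.sig||Zq(b)')
name_c = first_sig_match(PATTERNS, 'var x = 1;')
```

Because the patterns are tried in order, a broken entry further down the list can go unnoticed as long as an earlier one matches.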

Neither seems to work for me. I tried changing the user agent, changed &signature to &sig in the stream manifest URL, and changed the pattern in cipher.py, but I'm still getting a 403. That's under Python 3 in a venv.

Hi, it works for me.
I changed the cipher.py and the mixins.py as suggested.
My configuration is: mac Mojave and python 3.6, pytube 9.5.1
thanks everybody for the help.

Thank you pytube hackers... works for me with cipher.py/mixins.py as listed. Mac El Capitan / Py3.6 / PyTube 9.5.1 -- ❤️ pytube

works for me too with cipher.py and the mixins.py as suggested.
My configuration: win10 / Py 3.7.4 /pytube 9.5.1

Thank you! Works for me as well on Python 3.6.5, urllib3 1.25.3, and PyTube 9.5.1.

To summarize the patches needed:

  1. mixins.py - patch by @danielgordon10 in https://github.com/nficano/pytube/issues/399#issuecomment-501814505
  2. cipher.py - edit of @GravelCZ's patch by @nhatton96 at the bottom of https://github.com/nficano/pytube/issues/399#issuecomment-511535993

Worked for me with @GravelCZ's pattern and @nhatton96's fix + the issue at this URL:
https://github.com/nficano/pytube/issues/399 (mentioned by @ogspeace)
On Windows 10, Python 3.7.4, Pytube 9.5.1
Thanks a lot!

Not working for me when applying the fixes of @victorgregorio or @nahatton96

I tried inspecting cURL requests of a browser playing a youtube video and a url returned by pytube, and it seems that some GET parameters may be missing from the fetch requests performed by pytube. I get the 403 error systematically, so I think it might be related to something like this.

MacOS 10.13, python 3.6.0, pytube 9.5.1
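One way to make that comparison concrete is to diff the query parameters of the two URLs. The sketch below uses only urllib.parse; missing_params is a hypothetical helper and the URLs and parameter names are invented, not captured from real requests.

```python
from urllib.parse import urlsplit, parse_qs


def missing_params(browser_url, pytube_url):
    """Return the query parameter names present in browser_url but absent from pytube_url."""
    browser = parse_qs(urlsplit(browser_url).query)
    ours = parse_qs(urlsplit(pytube_url).query)
    return sorted(set(browser) - set(ours))


# Invented example URLs, only for illustration.
diff = missing_params(
    'https://example.invalid/videoplayback?itag=18&sig=A&cpn=x',
    'https://example.invalid/videoplayback?itag=18&signature=A',
)
```

Any name that shows up in the result is a candidate for what the server expects but pytube does not send.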

With 9.5.1 (the latest downloaded) I tinkered with all the fixes from this thread and finally got it working. There are two separate issues that I had to tackle:
1) KeyError:['title'] and
2) "403: Forbidden"

1) The first one to solve was the KeyError:['title'].
For some reason the 'title' key is no longer present at the root of the player_config_args dictionary, hence the error. While debugging the request/reply I found it is instead present under "player_response/videoDetails/title", so edit __main__.py:

    def title(self):
        """Get the video title.

        :rtype: str
        """
        # 'title' may be missing at the top level; fall back to
        # player_response -> videoDetails -> title.
        try:
            tt1 = self.player_config_args['title']
        except KeyError:
            tt1 = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        if not tt1:
            tt1 = "Unknown YTube video"
        return tt1

and streams.py:

    def default_filename(self):
        """Generate filename based on the video title.

        :rtype: str
        :returns:
            An os file system compatible filename.
        """
        # 'title' may be missing at the top level; fall back to
        # player_response -> videoDetails -> title.
        try:
            title = self.player_config_args['title']
        except KeyError:
            title = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        if not title:
            title = "Unknown YTube video"
        filename = safe_filename(title)
        return '{filename}.{s.subtype}'.format(filename=filename, s=self)
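The title lookup used in both edits can be tried outside pytube with plain dictionaries. In this sketch extract_title is a hypothetical standalone function, and it assumes player_response has already been parsed into a dict rather than left as a raw JSON string.

```python
def extract_title(player_config_args, default="Unknown YTube video"):
    """Return the video title, falling back to player_response/videoDetails/title."""
    try:
        title = player_config_args['title']
    except KeyError:
        # Assumption: 'player_response' is a dict here, not a JSON string.
        title = (
            player_config_args
            .get('player_response', {})
            .get('videoDetails', {})
            .get('title')
        )
    return title or default


# Made-up inputs, only for illustration.
old_style = extract_title({'title': 'Old layout'})
new_style = extract_title({'player_response': {'videoDetails': {'title': 'New layout'}}})
missing = extract_title({})
```

Keeping the fallback in one helper would also avoid repeating the same try/except in both __main__.py and streams.py.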

I also applied the fix found in this thread to request.py about the User-Agent:

    from urllib2 import Request

and

    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    #response = urlopen(url)
    response = urlopen(req)

2) For the "403 Forbidden" I first applied the suggestion about mixins.py in the apply_signature function:

        if ('signature=' in url or
                ('s' not in stream and
                 ('&sig=' in url or '&lsig=' in url))):
            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature

then in the cipher.py I replaced the regex pattern as suggested in another post of this thread:

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]

@gittethis
You're the best!
Thanks for the working solution.

I tried for 15 minutes before realising that, for the mixins.py fix, you have to replace the whole snippet, not just change signature to sig.

Now it works, thanks a lot !

With 9.5.1 (the latest downloaded) I tinkered with all the fixes from this thread and finally got it working. There are two separate issues that i had to tackle:

1. KeyError:['title'] and

2. "403: Forbidden"

1)FIrst one to solve was the KeyError:['title']
For some reason the 'title' key is not retrieved anymore in the root of the player_config_args dictionary, hence the error. Upon debugging of the req/reply I found It is instead present in "player_response/videoDetails/title", so edit *main.py*:

def title(self):
        #Get the video title.

        #:rtype: str

        try:
                tt1 = self.player_config_args['title']
        except:
                tt1 = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        finally:
                if not tt1:
                        tt1 = "Unknown YTube video"
        return tt1

and streams.py:

    def default_filename(self):
        """Generate filename based on the video title.

        :rtype: str
        :returns:
            An os file system compatible filename.
        """
        try:
                title = self.player_config_args['title']
        except:
                title = self.player_config_args.get('player_response', {}).get('videoDetails', {}).get('title')
        finally:
                if not title:
                        title = "Unknown YTube video"
        filename = safe_filename(title)
        return '{filename}.{s.subtype}'.format(filename=filename, s=self)

I did also apply the fix found in this thread to request.py about the Agent-User:
from urllib2 import Request
and

req = Request(url, headers = {"User-Agent": "Mozilla/5.0"})
    #response = urlopen(url)
    response = urlopen(req)

2)For the "403 Forbidden" I applied first the suggestion about mixins.py in the apply_signature function:

         if ('signature=' in url or
                ('s' not in stream and
                ('&sig=' in url or '&lsig=' in url))):

            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature

then in the cipher.py I replaced the regex pattern as suggested in another post of this thread:

pattern = [
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*encodeURIComponent\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
        r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
        r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<si$',
        r'\b[cs]\s*&&\s*[adf]\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\b[a-zA-Z0-9]+\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*a\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
        r'\bc\s*&&\s*[a-zA-Z0-9]+\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\('
    ]
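
For readers wondering how a list like this is used: the idea is to try each pattern in order against the player JavaScript and take the "sig" group of the first one that matches, which is the name of the descrambling function. Below is a minimal sketch of that lookup, not pytube's actual code. The function name find_signature_function_name is hypothetical, only three of the patterns above are included, and the JavaScript strings in the usage note are made-up stand-ins rather than real YouTube player code.

```python
import re

# A subset of the patterns quoted in this thread (assumption: order matters,
# earlier patterns are preferred over later ones).
PATTERNS = [
    r'(?P<sig>[a-zA-Z0-9$]+)\s*=\s*function\(\s*a\s*\)\s*{\s*a\s*=\s*a\.split\(\s*""\s*\)',
    r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
    r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
]


def find_signature_function_name(js, patterns=PATTERNS):
    """Return the 'sig' group of the first matching pattern, or None."""
    for pattern in patterns:
        match = re.search(pattern, js)
        if match:
            return match.group('sig')
    return None
```

With a made-up snippet such as 'var Xy=function(a){a=a.split("");return a.join("")};' the first pattern matches and the function returns the name Xy. If no pattern matches, it returns None, which is the situation where YouTube has changed its player code and the list needs updating again.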

@gitethis's solutions work; I tried them on my scripts. However, here's a slight tweak for Python 3 users:

in request.py instead of: from urllib2 import Request

use this: from urllib.request import Request

/ogs
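
If the same request.py has to run on both Python 2 and Python 3, the two imports can be combined with a fallback. This is only a sketch of the idea from the comments above. The helper name build_request is hypothetical and is not part of pytube.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


# Usage (performs a network call, so it is left commented out here):
# response = urlopen(build_request("https://www.youtube.com/"))
```

Note that importing Request is what avoids a "Request not defined" NameError. It is a class inside the standard library, so there is nothing extra to install.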

Seeing as how this appears to be solved with the changes described in various comments, did anyone create a pull request already? I noticed one while quickly scanning this issue, but that one's CI tests failed.

@jrial ,

Unfortunately, I seem to have applied all the fixes so helpfully summarized by @ogspeace, and I'm still getting "urllib.error.HTTPError: HTTP Error 403: Forbidden"

Thank you! Works for me as well on Python 3.6.5, urllib3 1.25.3, and PyTube 9.5.1.

To summarize the patches needed:

  1. mixins.py - patch by @danielgordon10 in #399 (comment)
  2. cipher.py - edit of @GravelCZ's patch by @nhatton96 at the bottom of #399 (comment)

This also worked very well for me.

I changed both mixins.py and cipher.py, but I still got HTTPError: Forbidden

I use Python 3.7 and pytube 9.5.2 with win 10.

I tried this in Windows 10, and it did finally work. Python 3.7.4, PyTube 9.5.2 with the above 2 patches. I also had to apply the title retrieval patch in __main__.py and streams.py as outlined by @ogspeace. Did not have to mess with urllib for whatever reason, but maybe just got lucky on the ~50 videos I've tried.

I use Python 3.7 and pytube 9.5.2 with win 10.

Did you also use my patches for "title" retrieval in __main__.py and streams.py, as outlined and confirmed by @ogspeace?

To keep things clear here, the 'title' retrieval patches fixed a separate issue, e.g. using you_tube_object.video after it's been created. It doesn't affect the "HTTPError: Forbidden" issue, and as such should be kept in a separate PR.

For me, the errors are not occurring randomly. They always seem to occur when I attempt to download music videos.

Not working.
It seems to be a good option for downloading YouTube content, but there's no support, so let's look for other options.

I think I have a working fix now for this and a related issue (#392). It's not pretty, but it seems to get the job done.
in mixins.py

        if ('signature=' in url or 
                ('s' not in stream and 
                 ('&sig=' in url or '&lsig=' in url))):
            # For certain videos, YouTube will just provide them pre-signed, in
            # which case there's no real magic to download them and we can skip
            # the whole signature descrambling entirely.
            logger.debug('signature found, skip decipher')
            continue

        if js is not None:
            signature = cipher.get_signature(js, stream['s'])
        else:
            # signature not present in url (line 33), need js to descramble
            # TypeError caught in __main__
            raise TypeError('JS is None')

        logger.debug(
            'finished descrambling signature for itag=%s\n%s',
            stream['itag'], pprint.pformat(
                {
                    's': stream['s'],
                    'signature': signature,
                }, indent=2,
            ),
        )
        stream_manifest[i]['url'] = url + '&sig=' + signature
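
To make the condition in the patch easier to reason about, here it is pulled out into a standalone function. This is only an illustration of the same boolean logic. The name is_presigned is hypothetical, and the example URLs in the usage note are placeholders.

```python
def is_presigned(url, stream):
    """Mirror the condition from the patch: True means skip deciphering.

    `stream` is the per-stream dict; the key 's' holds the scrambled
    signature when YouTube expects the client to descramble it.
    """
    if 'signature=' in url:
        return True
    has_cipher = 's' in stream
    return not has_cipher and ('&sig=' in url or '&lsig=' in url)
```

For example, a URL containing '&lsig=' with no 's' key in the stream counts as pre-signed, while the same URL with an 's' key present still goes through cipher.get_signature.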

How do I find this mixins.py file?
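
One way to answer this, assuming pytube is installed as a regular package in the active Python environment, is to ask Python where the package lives and look for the file there. The helper name locate_module_file is hypothetical, and this is a sketch rather than an official pytube tool.

```python
import importlib.util
import os


def locate_module_file(package_name, module_name):
    """Return the path of <package>/<module>.py, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = os.path.join(location, module_name + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Prints the path to pytube's mixins.py, or None if pytube is not installed.
    print(locate_module_file("pytube", "mixins"))
```

The same call with "cipher" or "request" as the second argument locates the other files mentioned in this thread.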

I tried @Nowbob's fix but it doesn't work; it says Request is not defined.

You're missing the import of Request in request.py. On Python 3, add: from urllib.request import Request

@YashKarthik @gittethis fixed in pytube3: https://github.com/hbmartin/pytube3/

I can't get it to work with the solutions in this thread.
Probably because of copyright?

Mac OS Mojave, Python 3.7

@yijun-li-20 what url are you looking at? what version of pytube are you using?

Hi, with pytube3 9.6.4 I get urllib.error.HTTPError: HTTP Error 403: Forbidden on this video:
https://www.youtube.com/watch?v=yEG2VTHS9yg

Where is the mixins.py file in urllib3?

I am using Python 3 and the following changes worked for me:

Basically, you study the structure of your URL. In my case I'm working with the YouTube Data API, so a sample URL looked like:

https://www.googleapis.com/youtube/v3/videos?part=snippet&id=8SbUC-UaAxE&key=AIzaSyAVtcurxxUyQznBtLU5UmxqrRSENZ6gAIA

I used the urllib.parse library and then generated the above URL using the strategy below:

import urllib.parse

params = {'part': 'snippet', 'id': 'VIDEO_KEY_goes_here', 'key': 'YOUR_API_KEY_goes_here'}
querystring = urllib.parse.urlencode(params)
stats_url = 'https://www.googleapis.com/youtube/v3/videos' + '?' + querystring
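
The query-string approach above can be wrapped in a small helper, which also shows why urlencode is worth using: it percent-escapes characters such as the comma. This is a sketch under assumptions. The function name build_videos_url is hypothetical, 'VIDEO_ID' and 'API_KEY' are placeholders, and requesting the extra 'statistics' part is my own illustration rather than something from the comment above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://www.googleapis.com/youtube/v3/videos'


def build_videos_url(video_id, api_key, part='snippet'):
    """Build a videos endpoint URL with a properly escaped query string."""
    params = {'part': part, 'id': video_id, 'key': api_key}
    return BASE_URL + '?' + urlencode(params)


# Placeholders only; substitute a real video ID and your own API key.
url = build_videos_url('VIDEO_ID', 'API_KEY', part='snippet,statistics')

# Parsing the query back recovers the original, unescaped values.
recovered = parse_qs(urlsplit(url).query)
```

Note that this calls the Data API for metadata only. It does not download the video stream, so it sidesteps the 403 on stream URLs rather than fixing it.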

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
