Hi Community,
What are some additional features you would like to see in PyTube?
Would love some input so we can prioritize and create a roadmap for future contributors.
Thanks
@RONNCC
Several years back, I rewrote pytube from scratch. One enhancement I planned for a future iteration was an upgrade to the stream filtering. I intended to make it feel more like an ORM (namely SQLAlchemy), albeit less elaborate.
The biggest obstacle was to figure out how a model could have a query attribute that returned a list of instances of that model, for example:
yt.streams.filter(Stream.resolution == "1080p")
After quite a bit of research, I figured it out.
import operator


class Field(object):
    def __init__(self, name):
        self.name = name

    def __eq__(self, rhs):
        op = operator.is_ if rhs is None else operator.eq
        return Expression(self, op, rhs)

    def __ne__(self, rhs):
        op = operator.is_not if rhs is None else operator.ne
        return Expression(self, op, rhs)


class Video:
    itag = Field("itag")
    resolution = Field("resolution")

    def __init__(self, itag, resolution):
        self.itag = itag
        self.resolution = resolution

    def __repr__(self):
        return f'<Video: itag={self.itag} resolution="{self.resolution}">'


class YouTube:
    def __init__(self, url):
        self.videos = [
            Video(itag="137", resolution="1080p"),
            Video(itag="246", resolution="720p"),
            Video(itag="244", resolution="480p"),
        ]

    @property
    def streams(self):
        return Query(self.videos)


class Expression:
    def __init__(self, lhs, op, rhs):
        self.lhs = lhs
        self.op = op
        self.rhs = rhs


class Query:
    def __init__(self, files):
        self.videos = files

    def filter(self, *expressions):
        resultset = []
        for video in self.videos:
            # Expressions represent == and != in class form.
            # lhs and rhs are "left hand side" and "right hand side"
            # respectively. Or put another way:
            #
            #                        left side    expr right side
            #                            ↓         ↓     ↓
            #   yt.streams.filter(Video.resolution == "1080p")
            #
            # expr.lhs.name is the key (itag or resolution) and expr.rhs
            # is the value (1080p, 720p, 137, etc.). A video is kept only
            # if it satisfies every expression, so it is never added twice.
            if all(
                expr.op(getattr(video, expr.lhs.name), expr.rhs)
                for expr in expressions
            ):
                resultset.append(video)
        return Query(resultset)

    def all(self):
        return self.videos


def main():
    yt = YouTube("https://youtube.com/watch?v=2lAe1cqCOXo")
    print(yt.streams.filter(Video.resolution == "1080p").all())
    print(yt.streams.filter(Video.resolution != "480p").all())


if __name__ == "__main__":
    main()
This would allow you to do stuff like:
yt.streams.filter(
    Video.resolution > "480p",
    Video.resolution < "1080p",
    Video.mime_type == "video/mp4"
).all()
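One caveat on that last snippet: the Field class shown earlier only defines == and !=, and strings like "1080p" and "480p" don't sort correctly as plain text ("1080p" comes before "480p"). Here's a rough sketch of one way the > and < operators could be added. The `key` parameter on Field, the `resolution_key` helper, the `matches` method and the `filter_videos` function are all my own assumptions for illustration, not anything in pytube.

```python
import operator


def resolution_key(value):
    # Hypothetical helper: "1080p" -> 1080, so comparisons are numeric.
    return int(value.rstrip("p"))


class Expression:
    def __init__(self, lhs, op, rhs):
        self.lhs, self.op, self.rhs = lhs, op, rhs

    def matches(self, obj):
        # Convert both sides with the field's key function, then compare.
        key = self.lhs.key
        return self.op(key(getattr(obj, self.lhs.name)), key(self.rhs))


class Field:
    def __init__(self, name, key=lambda value: value):
        self.name = name
        self.key = key  # assumption: converts values before comparing

    def __eq__(self, rhs):
        return Expression(self, operator.eq, rhs)

    def __ne__(self, rhs):
        return Expression(self, operator.ne, rhs)

    def __lt__(self, rhs):
        return Expression(self, operator.lt, rhs)

    def __gt__(self, rhs):
        return Expression(self, operator.gt, rhs)


class Video:
    resolution = Field("resolution", key=resolution_key)
    mime_type = Field("mime_type")

    def __init__(self, resolution, mime_type):
        self.resolution = resolution
        self.mime_type = mime_type


def filter_videos(videos, *expressions):
    # Keep only videos that satisfy every expression (logical AND).
    return [v for v in videos if all(e.matches(v) for e in expressions)]


videos = [
    Video("1080p", "video/mp4"),
    Video("720p", "video/mp4"),
    Video("720p", "video/webm"),
    Video("480p", "video/mp4"),
]
matches = filter_videos(
    videos,
    Video.resolution > "480p",
    Video.resolution < "1080p",
    Video.mime_type == "video/mp4",
)
print([(v.resolution, v.mime_type) for v in matches])
```

Keeping the conversion in a per-field `key` function means the operators themselves stay generic, and fields that compare fine as-is (like the mime type) need no extra code.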
Create a brew tap for pytube. This one doesn't seem too difficult; I can possibly find time to do it in the next couple of days.
The documentation could also use a bit of TLC for sure. Some of it is a bit outdated, especially after the weird fork/merge with pytube3, and there are some features that lack documentation altogether.
I suspect the person who expressed interest in contributing to pytube may have lost interest, but I'll give them until issue 874 hits a month old before I just take over myself, so they at least get an opportunity. I know it's easy to get overwhelmed by the holidays, so I want to give them a bit more time first.
Here are some more ideas:
Maybe a Django application with a webserver or a gRPC interface to remotely submit a request (running a pytube server on a Raspberry Pi)?
This might also solve the ORM idea of @nficano (?)
Maybe an interface to Video DownloadHelper (https://downloadhelper.net/):
A Python drop-in replacement for:
https://github.com/mi-g/vdhcoapp
that can run on a different, remote server (not localhost only).
More generic stream extractor
for also downloading non-YouTube streams (e.g. Vimeo, Arte, ...)
- high-res downloads (1080p, DASH, webm)
This is actually already supported for videos that provide these resolutions; you just have to know which itags to look for.
- downloads of playlists with asyncio / threads (maybe max. 6 at a time)
I assume you mean for the CLI, correct? This might be doable, but it should probably be a separate flag.
- download of age-restricted files
This is already supported.
Maybe a Django application with a webserver or a gRPC interface to remotely submit a request (running a pytube server on a Raspberry Pi)?
This might also solve the ORM idea of @nficano (?)
Are you suggesting that somebody host a pytube service online? As cool as that sounds, hosting costs could make this very challenging, and pytube is designed more as a library for users to incorporate into their own projects.
More generic stream extractor
for also downloading non-YouTube streams (e.g. Vimeo, Arte, ...)
While this is functionality that could be written, I think it would be better if it were extracted to a different library. I'm open to input from @RONNCC and @nficano on this, but I think that if we did implement this functionality, we'd have to refactor the codebase to keep it organized.
Much simpler, @tfdahlin: you have one small central download server (e.g. a Raspberry Pi connected directly to your router) that accepts download requests from clients at home and downloads the files in a quiet moment (e.g. at night). Very useful for people with a bad internet connection, like us ;)
Ah, so you mean write the code that will queue downloads, that end users would host on their own servers? That could be doable, but would likely need to be a different repository from the main pytube repository.
- high-res downloads (1080p, DASH , webm )
This is actually already supported for videos that provide these resolutions, you just have to know what itags to look for.
Aah, but I have an easier interface for the CLI in mind, one that doesn't require the user to query the streams first: you should just be able to specify the container format (webm or mp4), and the best matching stream/itag is then selected and downloaded automatically.
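To make the idea concrete, here's a rough sketch of the selection step only. `StreamInfo` and `pick_best_stream` are hypothetical names I made up for this example (they are not pytube's classes), the itags are just sample data, and "best" is assumed to mean highest resolution.

```python
from dataclasses import dataclass


@dataclass
class StreamInfo:
    # Hypothetical stand-in for a stream; not pytube's Stream class.
    itag: int
    container: str   # e.g. "mp4" or "webm"
    resolution: str  # e.g. "1080p"


def pick_best_stream(streams, container):
    """Return the highest-resolution stream in the given container, or None."""
    candidates = [s for s in streams if s.container == container]
    if not candidates:
        return None
    # Compare resolutions numerically: "1080p" -> 1080.
    return max(candidates, key=lambda s: int(s.resolution.rstrip("p")))


# Sample data for illustration only.
streams = [
    StreamInfo(itag=137, container="mp4", resolution="1080p"),
    StreamInfo(itag=136, container="mp4", resolution="720p"),
    StreamInfo(itag=248, container="webm", resolution="1080p"),
    StreamInfo(itag=244, container="webm", resolution="480p"),
]
best = pick_best_stream(streams, "webm")
print(best.itag)
```

A CLI flag for the container format would then only need to pass its value through to a function like this and download whatever comes back.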
- downloads of playlist with asyncio / threads (maybe max. 6 at a time)
I assume you mean for the cli, correct? This might be doable, but should probably be a separate flag
Yes, mainly for the CLI, but it would also be useful for playlist downloads (multiple downloads at a time).
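For what it's worth, the "max. 6 at a time" part could be sketched with a thread pool along these lines. `download_all` and `fake_download` are hypothetical names for this example, and `fake_download` is only a placeholder for whatever actually fetches a single video.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, download_one, max_workers=6):
    """Run download_one(url) for every URL, at most max_workers at a time.

    Returns a dict mapping each URL to its result, or to the exception raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(download_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going if one download fails
                results[url] = exc
    return results


def fake_download(url):
    # Placeholder for the real per-video download call.
    if url.endswith("bad"):
        raise ValueError(f"cannot download {url}")
    return f"{url}.mp4"


results = download_all(["video-a", "video-b", "video-bad"], fake_download)
for url, outcome in sorted(results.items()):
    print(url, "->", outcome)
```

Threads are a reasonable fit here because downloads mostly wait on the network; recording failures per URL means one broken video doesn't abort the rest of the playlist.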
- download of age restricted files
This is already supported
Great - then documentation would be very helpful :)
Ah, so you mean write the code that will queue downloads, that end users would host on their own servers? That could be doable, but would likely need to be a different repository from the main pytube repository.
Agreed, that might exceed the scope of pytube (it would be cool as a follow-up application).
Generator function for Stream to return chunks of data?
@Weathercold depending on how you would use that, that functionality already exists, sort of. You can define an on_progress function that takes three arguments and is called repeatedly as the video downloads: the first argument is the stream itself, the second is a chunk of bytes, and the third is the number of bytes remaining in the download, as an int.
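As a rough illustration of that three-argument shape, here's a small callable that collects chunks as they arrive. `ChunkCollector` is a hypothetical name, and the loop at the bottom only simulates the calls; how the callback gets registered with pytube (for example via an `on_progress_callback` argument, if your version has it) is left out here.

```python
class ChunkCollector:
    """Callable with the (stream, chunk, bytes_remaining) shape described above."""

    def __init__(self):
        self.chunks = []
        self.bytes_remaining = None

    def __call__(self, stream, chunk, bytes_remaining):
        # Store each chunk and remember how much is left to download.
        self.chunks.append(chunk)
        self.bytes_remaining = bytes_remaining

    @property
    def bytes_received(self):
        return sum(len(chunk) for chunk in self.chunks)


# Simulated download: in real use the library would make these calls itself.
collector = ChunkCollector()
payload = [b"abcd", b"efgh", b"ij"]
remaining = sum(len(c) for c in payload)
for chunk in payload:
    remaining -= len(chunk)
    collector(None, chunk, remaining)

print(collector.bytes_received, collector.bytes_remaining)
```

This isn't a true generator, since the caller is pushed chunks rather than pulling them, but it covers the common cases of progress reporting and processing data as it arrives.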
Choosing the user agent, maybe varying it from time to time?
In general I think pytube should limit itself to stay lean, stable, fast and compact - if people want the full-blown feature-monster, they would go to youtube-dl anyway...
More generic stream extractor
for also downloading non-youtube streams (e.g. vimeo, arte, ... )
While this is functionality that could be written, I think it would be better if it were extracted to a different library.
I agree, @tfdahlin. If this is something the community wants, it would make more sense to create a separate video downloading library than to shoehorn that ability into pytube.