LAST UPDATED: JUNE 15, 2023

Pytube To Download Youtube Videos With Python


    Let's start this article with some facts about YouTube:

    1. 100-120 hours of video content is added to YouTube every 10 minutes.
    2. Watching all the available YouTube videos would take more than 100,000 years.
    3. More than 1.8 billion users sign into YouTube every month.
    4. Most programmers around the world use YouTube as the main source of learning, because of the huge knowledge base (Lots of educational videos).

    The main reason why I am talking about YouTube is that with a huge amount of content available on YouTube, most of us might have been using third-party applications to download videos from YouTube to access them offline. And even though we use third-party applications or plug-ins we still end up downloading low-quality videos and have to upgrade to a so-called "Premium Account" to download the videos in high quality.

    So why not build your own application to download YouTube videos? How? By using the pytube module for Python.

    Unlike some other popular libraries, pytube doesn't have any third-party dependencies. It is a lightweight Python library with a rich set of features for downloading YouTube videos, developed by Nick Ficano. This open-source project is very easy to use but has a number of bugs (it is still a work in progress). Later in this article, I provide some code snippets to work around some of the errors that might occur while you use the library. Hopefully these errors get fixed in the next release.


    Installing pytube Library

    Open the command prompt (terminal) and type in the following command:

    pip install pytube

    pytube does not depend on any other library, so you don't need to install anything else. When this article was written it was available for Python 2.7, 3.4 and higher; recent releases support Python 3 only. To start working with pytube, create a Python file and type in the following statement:

    from pytube import YouTube

    The above statement imports the YouTube class from the pytube library. Calling this class with a video URL creates an object that provides many methods and properties such as length, title, rating, etc. Note that in the word YouTube the letters Y and T are capitalized; Python names are case-sensitive, so any other spelling fails with an ImportError.


    The pytube library provides five types of objects:

    1. YouTube Object
    2. Stream Object
    3. StreamQuery Object
    4. Caption Object
    5. CaptionQuery Object

    In this article, we will mostly work with the YouTube object while briefly covering other objects as well.


    The YouTube Object

    To fetch information about the video that you want to download, first create an instance of the YouTube class from the video's URL. Simply copy the URL of the YouTube video and pass it as an argument to YouTube().

    Here is the code:

    myVideo = YouTube("https://www.youtube.com/watch?v=OTCuykFHBeA")

    Now with the myVideo object created, let us learn how to retrieve some of the basic information about the video:


    Properties of the YouTube Object

    Following are the properties (information about the YouTube video) of the YouTube object that we can access:

    1. Title
    2. Length
    3. Thumbnail_URL
    4. Description
    5. Views*
    6. Rating*
    7. Age Restricted
    8. Video Id

    NOTE: Properties marked with * may raise errors when accessed. I explain how to fix those errors in the "Possible Errors And their Fixes" section below.

    1. Title

    print(myVideo.title)

    The property title of the YouTube object (myVideo) returns the title of the video. (Please refer to the image above)

    2. Length

    print(myVideo.length)

    The length property returns the total running time of the video. The value is always in seconds.
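
    Since the value comes back in seconds, a small helper can make it easier to read. The following is only a sketch; format_duration is a name made up for this example and is not part of pytube.

```python
def format_duration(total_seconds):
    """Convert a length in seconds (e.g. myVideo.length) to H:MM:SS."""
    # int() also accepts a numeric string, in case a release returns one.
    total_seconds = int(total_seconds)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


print(format_duration(3725))  # 1:02:05
```

    With a real video you would call format_duration(myVideo.length).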


    3. Thumbnail Url

    print(myVideo.thumbnail_url)

    The thumbnail is an image (generally picked from the video itself) that represents the video. The thumbnail_url property of the YouTube object returns the URL of that image.

    4. Description

    print(myVideo.description)

    Every video on YouTube has a description which gives information about the video and some hyperlinks to associated blogs, websites, etc. This description can be read through the description property of the YouTube object (myVideo).

    5. Views

    print(myVideo.views)

    The YouTube object (myVideo) has a property called views which returns the number of times the video has been viewed. This property might not work correctly because of recent changes on YouTube's side, in which case you have to change the code in the pytube library. All the required changes are listed below in a separate section.

    6. Rating

    print(myVideo.rating)

    The rating property gives the average rating of the YouTube video being accessed. Just like the views property, the rating property might also throw some errors.

    7. Age Restricted

    print(myVideo.age_restricted)

    Some videos on YouTube are age-restricted (they can be viewed only when signed in). To know whether a particular video is age-restricted, use the age_restricted property of the YouTube object (myVideo), which returns a Boolean value.

    If the value is False the video is not age-restricted, and if it is True the video is restricted.

    8. Video Id

    print(myVideo.video_id)

    The video_id property returns the ID of the video, which is generally present at the end of YouTube video URLs.
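
    To see where that ID sits inside a link, here is a minimal sketch that pulls it out with the standard library only. extract_video_id is a hypothetical helper written for this example; it is not how pytube itself parses URLs, and it only handles the two common link shapes.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a watch or youtu.be URL, or None."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Regular watch links carry the ID in the "v" query parameter.
    values = parse_qs(parsed.query).get("v")
    return values[0] if values else None


print(extract_video_id("https://www.youtube.com/watch?v=OTCuykFHBeA"))  # OTCuykFHBeA
```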


    Possible Errors And their Fixes

    Here are some errors that I encountered while working with the pytube library.

    The Regex Error

    pytube.exceptions.RegexMatchError: regex pattern (yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\() had zero matches

    Or,

    signature = cipher.get_signature(js, stream['s'])

    These errors are encountered when we create a YouTube object (or access its streams) and pytube cannot find the signature function in YouTube's player JavaScript. The regular expression in the message is the pattern pytube uses to search that JavaScript; when YouTube changes its player code the pattern no longer matches anything, which is why we need to change the code in the library.

    The screenshot below shows that this error is thrown in the function get_initial_function_name that is located in the cipher.py file.

    Error fixing in pytube Cipher Error

    To know where the library files are installed in your system, open command prompt and type the following command.

    pip show pytube

    You can use this command to locate any Python package installed.

    After executing that command you will get the location of the library. Open the folder, and edit the cipher.py file (Administrator permissions might be required). It's better to store a copy of the original file somewhere so that if we end up making some wrong changes we can always use the backup file.
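
    The same lookup and backup can be done from Python. This is a minimal sketch using only the standard library; package_dir and backup_file are names made up for this example.

```python
import importlib.util
import shutil
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed package, or None if it is missing."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def backup_file(path):
    """Copy path to path + '.bak' (only if no backup exists yet) and return the copy."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    if not backup.exists():
        shutil.copy2(path, backup)
    return backup


pytube_dir = package_dir("pytube")
if pytube_dir is None:
    print("pytube is not installed in this environment")
else:
    print("cipher.py lives at:", pytube_dir / "cipher.py")
    # backup_file(pytube_dir / "cipher.py")  # uncomment before editing
```

    Keeping the first backup untouched means that repeated runs never overwrite the original file with an already edited one.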

    Fixing Regular Expression Error

    After opening the cipher.py file, make the following changes.

    Original code (at line 39):

    r'yt\.akamaized\.net/\)\s*\|\|\s*'

    Fixed code (replace line 39 with the following):

    r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',

    After this fix, you should be able to use any YouTube link without getting this error (fingers crossed).
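
    To get a feel for why one pattern fails and the other succeeds, the two can be tried against a line of JavaScript. This is only an illustration: SAMPLE_JS is a made-up excerpt, not real YouTube player code, and find_signature_name is a hypothetical helper rather than the function in cipher.py.

```python
import re

# Pattern quoted in the error message above (the one that stopped matching).
OLD_PATTERN = (
    r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\('
    r'[^,]+\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\('
)
# Replacement pattern from the fix above.
NEW_PATTERN = (
    r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*'
    r'(?P<sig>[a-zA-Z0-9$]+)\('
)

# Hypothetical one-line excerpt of player JavaScript, made up for this demo.
SAMPLE_JS = 'c&&d.set("alr",(0,encodeURIComponent)(Xy(decodeURIComponent(c))));'


def find_signature_name(pattern, js):
    """Return the captured function name, or None when the pattern has no match."""
    match = re.search(pattern, js)
    return match.group("sig") if match else None


print(find_signature_name(OLD_PATTERN, SAMPLE_JS))  # None
print(find_signature_name(NEW_PATTERN, SAMPLE_JS))  # Xy
```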


    Error with the property Views

    As we mentioned before as well, due to some changes in the YouTube API you might get an error while using the views property.

    View Count error fix in pytube library

    This error occurs whenever we use the views property, because the pytube library still makes an outdated call to YouTube. It is raised in the __main__.py file of the pytube library, at line 290.

    Fixing view_count Error

    As shown earlier, use the pip show pytube command to find the library directory, open the __main__.py file and make the following change.

    Go to line 290, where the error occurred, and replace that line with the following code:

    return self.player_config_args['player_response']['videoDetails']['viewCount']

    That's it, we are done.
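
    The replacement line is a plain nested dictionary lookup. The sketch below shows the same lookup on stand-in data, with a guard for missing keys; sample_args is hypothetical and its values are made up, and get_view_count is not a pytube function.

```python
def get_view_count(player_config_args):
    """Read videoDetails.viewCount from a player_config_args-style dict.

    Returns an int, or None when any level of the nesting is missing.
    """
    details = player_config_args.get("player_response", {}).get("videoDetails", {})
    count = details.get("viewCount")
    return int(count) if count is not None else None


# Hypothetical data shaped like the lookup in the fix above; values are made up.
sample_args = {"player_response": {"videoDetails": {"viewCount": "12345"}}}

print(get_view_count(sample_args))  # 12345
print(get_view_count({}))           # None
```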


    Error with the Property rating

    Currently, there is no fix for this error, because YouTube no longer provides any information on the rating; the field is deprecated.

    Download the fixed library files from here: Library Files With errors Fixed, or update the files yourself as explained above.


    Program to get Information on YouTube Video using pytube

    After fixing all the errors, run the following program:

    # Importing YouTube Module from pytube library
    
    from pytube import YouTube
    
    # Prompting user for Youtube Video link
    
    youtube_url = input("Please enter a YouTube link:")
    
    # Creating YouTube object with the link
    
    myVideo = YouTube(youtube_url)
    
    # Title of the Video
    
    print("Title: " + myVideo.title)
    
    # Length of the Video in Seconds
    
    print("Duration: " + str(myVideo.length))
    
    # URL of the Thumbnail of the video
    
    print("Thumbnail Link: " + myVideo.thumbnail_url)
    
    # Description of the Video
    
    print("Description: " + myVideo.description)
    
    # Total Views of the Video
    
    print("Views: " + str(myVideo.views))
    
    # Age Restricted Content
    
    print("Age Restricted: " + str(myVideo.age_restricted))
    
    # ID of the Video
    
    print("Video ID: " + myVideo.video_id)

    Now that we have created a simple program and are quite familiar with the basic functioning of the pytube library, it's time to take a deep dive and understand how we can actually use this library to download and stream the videos.


    pytube Library: Working with Streams

    Now let's talk about the StreamQuery object. The pytube library provides two types of streams:

    1. DASH Stream
    2. Progressive Stream

    The main difference between the two is that Dynamic Adaptive Streaming over HTTP (DASH) is an adaptive bitrate streaming technique, while a progressive stream does not adapt to the current network speed. YouTube uses DASH, which is why it automatically adjusts the quality of the video streamed to your device based on the network speed. In pytube, a progressive stream contains both audio and video in a single file, whereas DASH streams carry audio and video separately.

    [Figures: adaptive (DASH) vs. progressive streaming, and how YouTube streams a video]

    Whenever we use the streams property on a YouTube object it will return a StreamQuery object, which you can use to find all the streams available for the video you have provided the link for.

    from pytube import YouTube
    
    myVideo = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI")
    
    myVideoStreams = myVideo.streams
    
    print(myVideoStreams)

    Output of the above Code:

    pytube.query.StreamQuery

    With this StreamQuery object we can perform different operations. The most basic one is all(), which returns all the available streams as a list. See the code below:

    print(myVideoStreams.all())

    Output of the above Code:

    [<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">,
    
    <Stream: itag="136" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.4d4016">,
    
    <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">]

    The output is a list with several streams; I am showing only a few of the items. Observe that each stream looks like an HTML tag with different key-value pairs. The keys are itag, mime_type, res, fps, vcodec, acodec and abr.

    Every stream has a different set of attributes, based on which the quality in terms of resolution and codecs differs:

    1. mime_type identifies the media type and container format of the file (mp4, 3gp, WebM, etc.)
    2. res (resolution) is the resolution of the video in the selected stream
    3. fps (frames per second) defines how many frames are rendered per second. In general, most videos have 30 FPS
    4. acodec and vcodec refer to the audio and video codecs. If vcodec is missing from a stream, it is an audio-only stream and has an extra property named abr (average bitrate) of the audio.
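
    The key-value pairs in the printed output can be read programmatically as well. The sketch below parses two of the lines shown above into dictionaries and applies the "no vcodec means audio" rule; parse_stream_repr and is_audio_only are hypothetical helpers that work on the printed text, not on pytube objects.

```python
import re

ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def parse_stream_repr(text):
    """Turn '<Stream: itag="22" ...>' into a dict of its key-value pairs."""
    return dict(ATTRIBUTE.findall(text))


def is_audio_only(stream_info):
    """A stream without a video codec carries audio only."""
    return "vcodec" not in stream_info


lines = [
    '<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">',
    '<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">',
]
for line in lines:
    info = parse_stream_repr(line)
    kind = "audio only" if is_audio_only(info) else "video"
    print(info["itag"], info["mime_type"], kind)
```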


    first() and last() on StreamQuery

    When first() is called on the StreamQuery object (myVideoStreams), it returns the first stream in the list, which in the output above is the highest-quality stream that also contains an audio codec.
    Similarly, last() returns the last stream in the list, which here is the audio stream because it is the lowest quality available.

    Here is the Syntax:

    YouTube('URL').streams.first()
    
    YouTube('URL').streams.last()

    Usage Example:

    myHDStream = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI").streams.first()
    
    myAudioStream = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI").streams.last()
    
    print(myHDStream, myAudioStream)

    This prints the details of the high definition stream and the audio stream.
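
    Because first() and last() depend on the order of the list, it can be safer to pick a stream by its attributes. Here is a minimal sketch of that idea. It runs on stand-in objects and assumes that real stream objects expose resolution and is_progressive attributes; best_progressive and resolution_value are names made up for this example.

```python
from types import SimpleNamespace


def resolution_value(stream):
    """Turn a resolution such as '720p' into 720; streams without one sort as 0."""
    res = getattr(stream, "resolution", None)
    return int(res.rstrip("p")) if res else 0


def best_progressive(streams):
    """Return the highest-resolution stream that has audio and video, or None."""
    candidates = [s for s in streams if s.is_progressive]
    return max(candidates, key=resolution_value, default=None)


# Stand-in objects so the sketch runs offline; real Stream objects are assumed
# to expose the same two attributes.
fake_streams = [
    SimpleNamespace(itag=18, resolution="360p", is_progressive=True),
    SimpleNamespace(itag=22, resolution="720p", is_progressive=True),
    SimpleNamespace(itag=137, resolution="1080p", is_progressive=False),
    SimpleNamespace(itag=140, resolution=None, is_progressive=False),
]

print(best_progressive(fake_streams).itag)  # 22
```

    The 1080p entry is skipped on purpose: in this stand-in data it is not progressive, so it would have no audio.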


    Downloading the YouTube Video

    Now we come to the part where we download the video.

    Whenever you get the details of a single stream you can download that stream to your current working directory using the download() method.

    myStream = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI").streams.first()
    
    myStream.download()

    If you want to download the file to a specific location or path on your system, you have to just specify the path where you want to store the file as the parameter to the download() method.

    Providing Relative Path:

    myStream.download("Videos/")

    Providing Absolute Path:

    myStream.download("D:/Videos/")

    Remember that the download() call blocks: the program waits until the video/audio has been downloaded, and how long that takes depends on the speed of your internet connection.
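
    While the program waits, it can at least report progress. The sketch below assumes that the YouTube constructor accepts an on_progress_callback argument and calls it with (stream, chunk, bytes_remaining); the exact callback signature has differed between pytube releases, so check the version you have installed. percent_complete is a helper made up for this example.

```python
def percent_complete(total_bytes, bytes_remaining):
    """Return download progress as a percentage between 0 and 100."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * (total_bytes - bytes_remaining) / total_bytes


def on_progress(stream, chunk, bytes_remaining):
    """Progress callback: print how much of the stream has arrived so far."""
    done = percent_complete(stream.filesize, bytes_remaining)
    print(f"\rDownloaded {done:5.1f}%", end="", flush=True)


# Assumed usage with pytube (needs network access, so it is left commented out):
# from pytube import YouTube
# video = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI",
#                 on_progress_callback=on_progress)
# video.streams.first().download()

print(percent_complete(1000, 250))  # 75.0
```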


    Using Filters

    As you have seen, each stream has keys like resolution, fps, codec, etc. We can use these keys to filter the StreamQuery object down to the desired stream(s), pick a single stream with methods like first() or last(), and then download the video using that stream.

    myVideoStream = YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI").streams
    
    webmStreams = myVideoStream.filter(file_extension = "webm")
    
    print(webmStreams.first())
    
    audioStream = myVideoStream.filter(type = "audio")
    
    print(audioStream.first())
    
    audioStream.first().download("audios/")
    
    webmStreams.first().download("videos/")

    The above code shows how to use the filter() method. We can pass the file_extension parameter to retrieve streams of only the specified file format. Similarly, we can pass the type parameter to get only audio streams or only video streams. After filter(), you can call any of the methods that retrieve a single stream (first(), last(), etc.) and then call download() to download the file.

    If you are looking for the code to download a video easily, here is the magical one-line code to download the video with the highest quality available to the current working directory.

    One-Line Code To Download a Video:

    YouTube("https://www.youtube.com/watch?v=oS8lASbvlpI").streams.first().download()

    Here's the link to the GitHub repository where you will find the code for the above programs and usage of different properties:

    Sample Programs for pytube


    Applications of pytube:

    1. Build a media player to stream YouTube videos.
    2. Automatically get the number of views of a YouTube video you want to monitor or a dashboard for your Youtube channel.
    3. Download multiple videos or playlists with ease.
    4. Download Subtitles of videos.

    Please drop any innovative ideas/applications possible with pytube in the comment section below to help our readers gain more insights.

    Happy CODING!!

    Conclusion

    Pytube opens up a world of possibilities for downloading YouTube videos programmatically using Python. With its straightforward API and powerful features, Pytube simplifies the process of automating video downloads and extracting audio tracks.

    By following the examples and guidelines provided in this article, you can seamlessly integrate Pytube into your Python projects and enhance your video downloading capabilities. Whether it's for educational purposes, personal archiving, or content creation, Pytube empowers you to harness the wealth of YouTube videos at your fingertips.

    Frequently Asked Questions (FAQs)

    1. What is Pytube?

    Pytube is a Python library that allows you to download YouTube videos programmatically. It provides an intuitive API for interacting with YouTube's video streaming capabilities and offers various functionalities like video downloading, audio extraction, and more.

    2. How do I install Pytube?

    You can install Pytube using pip, the Python package manager. Open your command prompt or terminal and run the command "pip install pytube" to install Pytube on your system.

    3. Can I specify the quality or format of the downloaded video using Pytube?

    Yes, Pytube allows you to specify the desired video quality and format for the downloaded videos. You can choose from various available options and customize the download process based on your preferences.

    4. Is Pytube capable of extracting audio from YouTube videos?

    Yes, Pytube can download the audio-only stream of a YouTube video. Extracting the audio track from an already downloaded video file requires a separate tool.

    5. Are there any limitations or restrictions when using Pytube?

    Pytube relies on YouTube's API and is subject to any limitations or restrictions imposed by YouTube. For example, some videos may be restricted from downloading due to copyright or other reasons. Additionally, YouTube's terms of service should always be respected when using Pytube or any other YouTube-related tool.
