How to Unzip file in Python
In this article, we will learn how one can perform unzipping of a file in Python. We will use some built-in functions, some simple approaches, and some custom codes as well to better understand the topic. Let's first have a quick look over what is a zip file and why we use it.
What is a Zip File?
ZIP is the archive file format that permits the first information to be totally recreated from the compacted information. A zip file is a single file containing one or more compressed files, offering an easy way to make large files smaller and keep related files together. Python ZipFile
is a class of zipfile
module for reading and writing zip files. We need zip files to lessen storage necessities and to improve transfer speed over standard connections.
A zip folder consisted of several files, in order to use the contents of a zip folder, we need to unzip the folder and extract the documents inside it. Let's learn about different ways to unzip a file in Python and saving the files in the same or different directory.
Python Zipfile Module
Python ZipFile
module provides several methods to handle file compress operations. It uses context manager construction. Its extractall()
function is used to extract all the files and folders present in the zip file. We can use zipfile.extractall()
function to unzip the file contents in the same directory as well as in a different directory.
Let us look at the syntax first and then the following examples.
Syntax
extractall(path, members, pwd)
Parameters
path
- It is the location where the zip file is unzipped, if not provided it will unzip the contents in the current directory.
members
- It shows the list of files to be unzipped, if not provided it will unzip all the files.
pwd
- If the zip file is encrypted then the password is given, the default is None.
Example: Extract all files to the current directory
In the given example, we have a zip file in our current directory. To unzip it first create a ZipFile object by opening the zip file in read mode and then call extractall() on that object. It will extract all the files in the current directory. If the file path argument is provided, then it will overwrite the path.
#import zipfile module
from zipfile import ZipFile
with ZipFile('filename.zip', 'r') as f:
#extract in current directory
f.extractall()
Example: Extract all files to a different directory
In the given example, the directory does not exist so we name our new directory as "dir" to place all extracted files from "filename.zip". We pass the destination location as an argument in extractall(). The path can be relative or absolute.
from zipfile import ZipFile
with ZipFile('filename.zip', 'r') as f:
#extract in different directory
f.extractall('dir')
Example: Extract selected files to a different directory
This method will unzip and extract only a particular list of files from all the files in the archive. We can unzip just those files which we need by passing a list of names of the files. In the given example, we used a dataset of 50 students (namely- roll1, roll2, ..., roll50) and we need to extract just the data of those students whose roll no is 7, 8, and 10. We make a list containing the names of the necessary files and pass this list as a parameter to extractall() function.
#import zipfile and os module
import zipfile
import os
#list of necessary files
list_of_files=['roll7.txt','roll8.txt','roll10.txt']
with zipfile.ZipFile("user.zip","r") as f:
f.extractall('students',members = list_of_files)
print("List of extracted files- ")
#loop to print necessary files
p=os.path.join(os.getcwd(),'students')
for item in os.listdir(path=p):
print(item)
List of extracted files- roll7.txt roll8.txt roll10.txt
Python Shutil Module
Zipfile provides specific properties to unzip files but it is a somewhat low-level library module. Instead of using zipfile the alternate is shutil
module. It is a higher-level function as compared to zipfile. It performs high-level operations on files and the collection of files. It uses unpack.archive()
to unpack the file, Let us look at the below example to understand it.
Syntax
shutil.unpack_archive(filename , extract_dir)
Parameters
unpack_archive
- It detects the compression format automatically from the "extension" of the filename (.zip, .tar.gz, etc)
filename
- It can be any path-like object (e.g. pathlib.Path instances). It represents the full path of the file.
extract_dir
(optional) - It can be any path-like object (e.g. pathlib.Path instances) that represents the path of the target directory where the file is unpacked. If not provided the current working directory is used as the target directory.
Example: Extract all files to a different directory
# importing shutil module
import shutil
# Path of the file
filename = "/home/User/Desktop/filename.zip"
# Target directory
extract_dir = "/home/username/Documents"
# Unzip the file
shutil.unpack_archive(filename, extract_dir)
Conclusion
In this article, we learned to unzip files by using several built-in functions such as extractall()
, shutil()
and different examples to store extracted contents in different directories. We learned about zip files and their Python module.