Pandas DataFrame filter() Method
In this tutorial, we will learn the Python pandas DataFrame.filter() method. This method subsets the rows or columns of a DataFrame according to the specified index labels. Note that this routine does not filter a DataFrame on its contents; the filter is applied only to the labels of the chosen axis.
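To make the difference between labels and contents concrete, here is a minimal sketch. The small DataFrame named people and its values are made up for illustration, and boolean indexing is shown only as the usual alternative when the goal is to select rows by value.

```python
import pandas as pd

# Hypothetical data, used only for this illustration
people = pd.DataFrame(
    {"Name": ["Navya", "Vindya"], "Age": [25, 24]},
    index=["Group_1", "Group_2"],
)

# filter() looks only at labels: here, the column label "Age"
by_label = people.filter(items=["Age"])

# Selecting on cell values needs boolean indexing instead
by_value = people[people["Age"] > 24]

print(by_label)
print(by_value)
```

The first result keeps every row and only the Age column, because only the label was tested. The second keeps only the rows whose Age value is greater than 24.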
The syntax of the DataFrame.filter() method is shown below.
Syntax
DataFrame.filter(items=None, like=None, regex=None, axis=None)
Parameters
items: list-like. Keep labels from the axis which are in items.
like: str. Keep labels from the axis for which the expression like in label is True, that is, labels that contain like as a substring.
regex: str (regular expression). Keep labels from the axis for which re.search(regex, label) finds a match.
axis: {0 or ‘index’, 1 or ‘columns’, None}, default None. The axis to filter on, expressed either as an index (int) or axis name (str). By default, this is the info axis: ‘index’ for Series, ‘columns’ for DataFrame.
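The parameter descriptions above can be checked with a short sketch. It assumes a made-up Series and DataFrame (s and frame are illustrative names), and it relies on pandas raising a TypeError when more than one of items, like and regex is passed in the same call, since the three are mutually exclusive.

```python
import pandas as pd

# Hypothetical objects for illustration
s = pd.Series([1, 2, 3], index=["one", "two", "three"])
frame = pd.DataFrame({"one": [1], "two": [2]}, index=["two_row"])

# With axis left as None, a Series is filtered on its index...
series_labels = list(s.filter(like="t").index)

# ...while a DataFrame is filtered on its columns, not its index
frame_labels = list(frame.filter(like="two").columns)

# items, like and regex cannot be combined in one call
try:
    frame.filter(items=["one"], like="t")
    combined_ok = True
except TypeError:
    combined_ok = False

print(series_labels, frame_labels, combined_ok)
```

Note that the row label two_row also contains "two", yet it plays no part in the DataFrame call because the default axis there is the columns.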
Example: Create DataFrame in Pandas
We will use the following DataFrame throughout this tutorial to demonstrate the filter method.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df = pd.DataFrame({
    "Name": ["Navya", "Vindya", "Sinchana", "Amrutha", "Akshatha"],
    "Age": [25, 24, 25, 25, 26],
    "Education": ["M.Tech", "M.Tech", "M.Tech", "Ph.d", "Ph.d"],
    "YOP": [2019, 2020, 2018, None, None],
}, index=["Group_1", "Group_1", "Group_1", "Group_2", "Group_2"])
print("-------DataFrame is----------")
print(df)
Once we run the program we will get the following output.
-------DataFrame is----------
             Name  Age Education     YOP
Group_1     Navya   25    M.Tech  2019.0
Group_1    Vindya   24    M.Tech  2020.0
Group_1  Sinchana   25    M.Tech  2018.0
Group_2   Amrutha   25      Ph.d     NaN
Group_2  Akshatha   26      Ph.d     NaN
Example: Filter by column names using the DataFrame.filter() Method
Using the items parameter of the DataFrame.filter() method, we can keep only the columns whose names are listed. The example below selects the Name and Education columns.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df = pd.DataFrame({
    "Name": ["Navya", "Vindya", "Sinchana", "Amrutha", "Akshatha"],
    "Age": [25, 24, 25, 25, 26],
    "Education": ["M.Tech", "M.Tech", "M.Tech", "Ph.d", "Ph.d"],
    "YOP": [2019, 2020, 2018, None, None],
}, index=["Group_1", "Group_1", "Group_1", "Group_2", "Group_2"])
print("---------Filter by columns name---------")
print(df.filter(items=["Name","Education"]))
Once we run the program we will get the following output.
---------Filter by columns name---------
             Name Education
Group_1     Navya    M.Tech
Group_1    Vindya    M.Tech
Group_1  Sinchana    M.Tech
Group_2   Amrutha      Ph.d
Group_2  Akshatha      Ph.d
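One detail worth knowing about items is that labels in the list which do not exist on the axis are skipped rather than raising an error, unlike plain bracket selection. A minimal sketch, assuming a small made-up DataFrame (sample and the label "Salary" are illustrative and not part of the tutorial's data):

```python
import pandas as pd

# Hypothetical one-row frame for illustration
sample = pd.DataFrame({"Name": ["Navya"], "Age": [25], "YOP": [2019]})

# "Salary" is not a column; filter() silently leaves it out of the result
kept = sample.filter(items=["Name", "Salary"])

# Plain bracket selection with the same list raises KeyError
try:
    sample[["Name", "Salary"]]
    bracket_ok = True
except KeyError:
    bracket_ok = False

print(list(kept.columns), bracket_ok)
```

This makes filter(items=...) convenient when the list of wanted columns may be a superset of what a given DataFrame actually has.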
Example: Filter by row names using the DataFrame.filter() Method
By using the like parameter of the DataFrame.filter() method together with axis=0, we can keep only the rows whose index labels contain a given substring. The example below keeps the rows labelled Group_2.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df = pd.DataFrame({
    "Name": ["Navya", "Vindya", "Sinchana", "Amrutha", "Akshatha"],
    "Age": [25, 24, 25, 25, 26],
    "Education": ["M.Tech", "M.Tech", "M.Tech", "Ph.d", "Ph.d"],
    "YOP": [2019, 2020, 2018, None, None],
}, index=["Group_1", "Group_1", "Group_1", "Group_2", "Group_2"])
print("---------Filter by rows name---------")
print(df.filter(like='Group_2', axis=0))
Once we run the program we will get the following output.
---------Filter by rows name---------
             Name  Age Education  YOP
Group_2   Amrutha   25      Ph.d  NaN
Group_2  Akshatha   26      Ph.d  NaN
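The like parameter is a plain substring test on each label, not a pattern match. The sketch below assumes a reduced copy of the tutorial's DataFrame (small is an illustrative name) and compares filter(like=...) with an explicit substring check over the index.

```python
import pandas as pd

# Reduced, hypothetical version of the tutorial's data
small = pd.DataFrame(
    {"Name": ["Navya", "Vindya", "Amrutha"], "Age": [25, 24, 25]},
    index=["Group_1", "Group_1", "Group_2"],
)

# Keep rows whose index label contains the substring "_2"
via_filter = small.filter(like="_2", axis=0)

# The same selection written as an explicit substring check
mask = ["_2" in label for label in small.index]
via_mask = small.loc[mask]

print(via_filter)
print(via_filter.equals(via_mask))
```

Because only a substring is required, like="_2" selects Group_2 here just as like="Group_2" would.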
Example: Filter by column names with regex using the DataFrame.filter() Method
By using the regex parameter of the DataFrame.filter() method, we can keep only the columns whose names match a regular expression. In the example below, the pattern '[g]' matches any column name that contains a lowercase g, which here is only Age.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df = pd.DataFrame({
    "Name": ["Navya", "Vindya", "Sinchana", "Amrutha", "Akshatha"],
    "Age": [25, 24, 25, 25, 26],
    "Education": ["M.Tech", "M.Tech", "M.Tech", "Ph.d", "Ph.d"],
    "YOP": [2019, 2020, 2018, None, None],
}, index=["Group_1", "Group_1", "Group_1", "Group_2", "Group_2"])
print("---------Filter by columns name---------")
print(df.filter(regex='[g]'))
Once we run the program we will get the following output.
---------Filter by columns name---------
         Age
Group_1   25
Group_1   24
Group_1   25
Group_2   25
Group_2   26
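Because the pattern is applied with re.search, it matches anywhere in the label unless it is anchored. A short sketch, assuming a made-up one-row DataFrame that reuses the tutorial's column names (cols_df is an illustrative name):

```python
import pandas as pd

# Hypothetical one-row frame with the tutorial's column names
cols_df = pd.DataFrame(
    {"Name": ["Navya"], "Age": [25], "Education": ["M.Tech"], "YOP": [2019]}
)

# Unanchored: any column name containing "A" or "a" anywhere
contains_a = list(cols_df.filter(regex="[Aa]").columns)

# Anchored at the start: column names that begin with "A" or "a"
starts_with_a = list(cols_df.filter(regex="^[Aa]").columns)

# Anchored at both ends: only the exact alternatives listed
exact = list(cols_df.filter(regex="^(Age|YOP)$").columns)

print(contains_a, starts_with_a, exact)
```

The unanchored pattern keeps Name, Age and Education, while the ^ anchor narrows the result to Age alone.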
Conclusion
In this tutorial, we learned the Python pandas DataFrame.filter() method. We covered its syntax and parameters, and applied the method to a DataFrame to filter rows and columns by their labels.