Pandas DataFrame rank() Method
In this tutorial, we will learn the Python pandas DataFrame.rank() method. This method assigns ranks to the data. When it is applied to a DataFrame, it gives each value a numerical rank from 1 to n along the specified axis.
The syntax of the DataFrame.rank() method is shown below.
Syntax
DataFrame.rank(axis=0, method='average', numeric_only=None, na_option='keep', ascending=True, pct=False)
Parameters
axis: It represents the index or column axis, 0 for index and 1 for columns, and the default is 0. When axis=0, values are ranked along the index axis, so each column is ranked separately. When axis=1, values are ranked along the column axis, so each row is ranked separately.
method: It controls how equal values (ties) are ranked. It can be 'average', 'min', 'max', 'first', or 'dense', and the default is 'average'.
numeric_only: It represents a bool (True or False), which is optional. When it is True, only the numeric columns of the DataFrame are ranked.
na_option: It controls how NaN values are ranked. It can be 'keep', 'top', or 'bottom', and the default is 'keep'.
ascending: It represents a bool (True or False), and the default is True. It indicates whether the elements in the DataFrame should be ranked in ascending order or not.
pct: It represents a bool (True or False), and the default is False. It indicates whether to display the returned rankings in percentile form or not.
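The axis, numeric_only and pct parameters can be tried out on a small DataFrame. The following is only a minimal sketch: the DataFrame and its column names ('name', 'a', 'b') are made up for illustration and are not part of the examples in this tutorial.

```python
import pandas as pd

# Hypothetical data, made up for this sketch
df = pd.DataFrame({'name': ['x', 'y', 'z'],
                   'a': [10, 20, 30],
                   'b': [30, 10, 20]})

# axis=0 (the default) ranks each column separately;
# numeric_only=True leaves the non-numeric 'name' column out of the result
by_column = df.rank(numeric_only=True)

# axis=1 ranks across the columns, so each row is ranked separately
by_row = df[['a', 'b']].rank(axis=1)

# pct=True returns each rank divided by the number of ranked values
pct_a = df['a'].rank(pct=True)

print(by_column)
print(by_row)
print(pct_a)
```

With axis=0 the value 10 in column 'a' gets rank 1 because it is the smallest value in that column, while with axis=1 it gets rank 1 because it is smaller than the 30 sitting next to it in the same row.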
Example 1: Rank the DataFrame column in Pandas
Let's create a DataFrame and get the rank of one of its columns using the DataFrame.rank() method. Here, we rank the 'Profit' column. See the example below.
By default, the DataFrame.rank() method ranks in ascending order. In this example, the Profit column has four values: the smallest value gets rank 1 and the largest value gets rank 4.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df=pd.DataFrame({'Product_Id':[1001,1002,1003,1004],'Product_Name':['Coffee powder','Black pepper','rosemary','Cardamom'],'customer_Name':['Navya','Vindya','pooja','Sinchana'],'ordered_Date':['16-3-2021','17-3-2021','18-3-2021','18-3-2021'],'ship_Date':['18-3-2021','19-3-2021','20-3-2021','20-3-2021'],'Profit':[750,652.14,753.8,900.12]})
df['ranked_profit']=df['Profit'].rank()
df
Output
The ranked_profit column holds 2.0, 1.0, 3.0 and 4.0 for the four rows.
Example 2: Rank the DataFrame column in descending order
This example is similar to the previous one. Here, we set the ascending parameter to False, so the DataFrame.rank() method ranks in descending order and the largest value gets rank 1. See the example below.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df=pd.DataFrame({'Product_Id':[1001,1002,1003,1004],'Product_Name':['Coffee powder','Black pepper','rosemary','Cardamom'],'customer_Name':['Navya','Vindya','pooja','Sinchana'],'ordered_Date':['16-3-2021','17-3-2021','18-3-2021','18-3-2021'],'ship_Date':['18-3-2021','19-3-2021','20-3-2021','20-3-2021'],'Profit':[750,652.14,753.8,900.12]})
df['ranked_profit']=df['Profit'].rank(ascending=False)
df
Output
The ranked_profit column holds 3.0, 4.0, 2.0 and 1.0 for the four rows.
Example 3: Rank a DataFrame column that contains equal values
If a column contains equal values (ties), the method parameter of the DataFrame.rank() method controls how they are ranked.
If the method is average, each tied value gets the average of the ranks that the group occupies.
If the method is min, each tied value gets the lowest rank in the group.
If the method is max, each tied value gets the highest rank in the group.
If the method is first, tied values are ranked in the order they appear in the array.
If the method is dense, it is similar to 'min', but the rank always increases by 1 between groups.
#importing pandas as pd
import pandas as pd
#creating DataFrame
df=pd.DataFrame({'column_1':[1,3,3,4,7],'column_2':[1,2,3,4,5]})
df['average_rank']=df['column_1'].rank(method='average')
df['min_rank']=df['column_1'].rank(method='min')
df['max_rank']=df['column_1'].rank(method='max')
df['first_rank']=df['column_1'].rank(method='first')
df['dense_rank']=df['column_1'].rank(method='dense')
df
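Output
For column_1 = [1, 3, 3, 4, 7], the new columns hold the following ranks:
average_rank: 1.0, 2.5, 2.5, 4.0, 5.0
min_rank: 1.0, 2.0, 2.0, 4.0, 5.0
max_rank: 1.0, 3.0, 3.0, 4.0, 5.0
first_rank: 1.0, 2.0, 3.0, 4.0, 5.0
dense_rank: 1.0, 2.0, 2.0, 3.0, 4.0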
Example 4: Rank a DataFrame column that contains NaN values
If the DataFrame contains null (NaN) values, the na_option parameter controls how they are ranked. If it is set to keep, NaN values are given a NaN rank. If it is set to top, NaN values are given the smallest ranks. If it is set to bottom, NaN values are given the highest ranks when ranking in ascending order.
#importing pandas as pd
import pandas as pd
#importing numpy as np
import numpy as np
#creating DataFrame
df=pd.DataFrame({'column_1':[1,3,np.nan,4,np.nan],'column_2':[1,2,3,np.nan,np.nan]})
df['keep_rank_Nan']=df['column_2'].rank(na_option='keep')
df['Top_rank_Nan']=df['column_2'].rank(na_option='top')
df['Bottom_rank_Nan']=df['column_1'].rank(na_option='bottom')
df
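To compare the three na_option settings side by side on the same data, a minimal sketch is given below. The Series values are made up for illustration, and only one NaN is used so that the rank it receives is easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical values with a single missing entry
s = pd.Series([10, 30, np.nan, 20])

ranks = pd.DataFrame({
    'value': s,
    'keep': s.rank(na_option='keep'),      # the NaN keeps a NaN rank
    'top': s.rank(na_option='top'),        # the NaN gets rank 1
    'bottom': s.rank(na_option='bottom'),  # the NaN gets rank 4
})
print(ranks)
```

With 'top', the ranks of the other values shift up by one because the NaN takes rank 1. With 'bottom', they are the same as with 'keep' and the NaN takes the last rank.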