Pandas DataFrame join() Method
In this tutorial, we will learn the Python pandas DataFrame.join() method. This method joins the columns of another DataFrame to the calling DataFrame, matching rows either on the index or on a key column. When joining by index, it can also join multiple DataFrame objects at once if they are passed as a list.
The syntax of the DataFrame.join() method is shown below.
Syntax
DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
Parameters
other: DataFrame, Series, or list of DataFrame. The object whose columns are joined to the calling DataFrame; its index is used as the join key.
on: str, list of str, or array-like, optional. The column or index level name(s) in the calling DataFrame to match against the index of other. If omitted, the join is index-on-index.
how: One of 'left', 'right', 'outer', or 'inner'; the default is 'left'. It controls which index values are kept in the result.
lsuffix: str, default ''. Suffix appended to the left frame's overlapping column names.
rsuffix: str, default ''. Suffix appended to the right frame's overlapping column names.
sort: bool, default False. If True, the resulting DataFrame is ordered lexicographically by the join key. If False, the order of the join key depends on the join type (how keyword).
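The effect of the how parameter is easiest to see when the two indexes only partly overlap. The following is a minimal sketch, assuming two small illustrative frames (left and right, with made-up index labels that are not part of the examples in this tutorial). It collects the index labels that each join type keeps.

```python
import pandas as pd

# Illustrative frames (assumed for this sketch): only the label "y" is shared.
left = pd.DataFrame({"B": [3, 4, 5]}, index=["x", "y", "z"])
right = pd.DataFrame({"D": [7, 8]}, index=["y", "w"])

# Collect the index labels kept by each join type.
# The labels are sorted here only to make the four results easy to compare.
index_by_how = {}
for how in ("left", "right", "outer", "inner"):
    joined = left.join(right, how=how)
    index_by_how[how] = sorted(joined.index)
    print(how, index_by_how[how])

# With the default left join, rows without a match in `right` get NaN in "D".
left_joined = left.join(right)
print(left_joined)
```

A left join keeps every label of the calling frame, a right join keeps every label of other, an outer join keeps the union of both, and an inner join keeps only the labels present in both.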
Example: Joining the two DataFrames using the DataFrame.join() Method
In this example, we create two DataFrames with different column names and join them using the DataFrame.join() method. Because no key is given, the rows are matched on the index. See the example below.
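# Import pandas under its conventional alias
import pandas as pd
# Create the first DataFrame
df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
print("-----------The DataFrame is-------")
print(df_1)
# Create the second DataFrame with different column names
df_2 = pd.DataFrame({"C": [0, 1], "D": [3, 4]})
print("----------------------------------")
print(df_2)
print("------------------")
# Join the columns of df_2 to df_1 by matching index values
print(df_1.join(df_2))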
-----------The DataFrame is-------
   A  B
0  0  3
1  1  4
----------------------------------
   C  D
0  0  3
1  1  4
------------------
   A  B  C  D
0  0  3  0  3
1  1  4  1  4
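As noted in the introduction, join() can also take a list of DataFrames and join them all by index in one call. The following is a minimal sketch, assuming a third frame df_3 with a made-up column "E" that is not part of the original example. When a list is passed, the column names of the frames must not overlap.

```python
import pandas as pd

df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
df_2 = pd.DataFrame({"C": [0, 1], "D": [3, 4]})
# df_3 is a hypothetical extra frame, added only for this illustration.
df_3 = pd.DataFrame({"E": [5, 6]})

# Passing a list joins every frame in it to df_1 on the index.
joined = df_1.join([df_2, df_3])
print(joined)
```

The result keeps the index of df_1 and contains the columns of all three frames, in the order in which the frames were given.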
Example: Join the two DataFrames using the lsuffix and rsuffix
Here, both DataFrames have the same column names, so we use the lsuffix and rsuffix parameters to tell the left and right frames' overlapping columns apart. See the example below.
# Import pandas under its conventional alias
import pandas as pd
# Create the first DataFrame
df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
print("-----------The DataFrame is-------")
print(df_1)
# Create the second DataFrame with the same column names
df_2 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
print("----------------------------------")
print(df_2)
print("------------------")
# Suffixes rename the overlapping columns from each side
print(df_1.join(df_2, lsuffix='_first', rsuffix='_second'))
-----------The DataFrame is-------
   A  B
0  0  3
1  1  4
----------------------------------
   A  B
0  0  3
1  1  4
------------------
   A_first  B_first  A_second  B_second
0        0        3         0         3
1        1        4         1         4
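It can help to see what happens when the suffixes are left out. The following is a minimal sketch using the same two frames: with overlapping column names and no suffix, join() raises a ValueError, and supplying a suffix for only one side is enough to resolve the clash. The variable names error_message and joined are chosen here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
df_2 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})

# Without any suffix, the overlapping column names cannot be resolved.
error_message = None
try:
    df_1.join(df_2)
except ValueError as exc:
    error_message = str(exc)
    print("join failed:", error_message)

# A suffix on one side only is sufficient; the left columns keep their names.
joined = df_1.join(df_2, rsuffix="_second")
print(joined)
```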
Example: Joining on a key column using the DataFrame.join() Method
To join on the key column 'A', one option is to set 'A' as the index in both df_1 and df_2 before joining. The joined DataFrame then has 'A' as its index. See the example below.
# Import pandas under its conventional alias
import pandas as pd
# Create the first DataFrame
df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
print("-----------The DataFrame is-------")
print(df_1)
# Create the second DataFrame, which shares the key column "A"
df_2 = pd.DataFrame({"A": [0, 1], "D": [3, 4]})
print("----------------------------------")
print(df_2)
print("------------------")
# Make "A" the index of both frames, then join on that index
print(df_1.set_index('A').join(df_2.set_index('A')))
-----------The DataFrame is-------
   A  B
0  0  3
1  1  4
----------------------------------
   A  D
0  0  3
1  1  4
------------------
   B  D
A
0  3  3
1  4  4
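Setting the index on both frames is not the only way to join on 'A'. As a minimal sketch of an alternative, the on parameter matches a column of the calling DataFrame against the index of the other DataFrame, so only df_2 needs 'A' as its index. The variable name joined is chosen here for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({"A": [0, 1], "B": [3, 4]})
df_2 = pd.DataFrame({"A": [0, 1], "D": [3, 4]})

# Match df_1's column "A" against df_2's index (which is set to "A").
joined = df_1.join(df_2.set_index("A"), on="A")
print(joined)
```

With this approach the result keeps the original index of df_1, and 'A' stays as an ordinary column next to 'B' and 'D'.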
Conclusion
In this tutorial, we learned the Python pandas DataFrame.join() method. We covered its syntax and parameters, and worked through examples that join DataFrames on the index, resolve overlapping column names with suffixes, and join on a key column.