Pandas DataFrame duplicated() Method

In this tutorial, we will learn the Python pandas DataFrame.duplicated() method. It returns a boolean Series with one value per row, where True marks a row that duplicates another row. By default all columns are compared, but we can optionally restrict the comparison to certain columns.

The syntax of the DataFrame.duplicated() method is shown below.

Syntax

DataFrame.duplicated(subset=None, keep='first')

Parameters

subset: column label or sequence of labels, optional

Only consider certain columns for identifying duplicates, by default use all of the columns.

keep: {‘first’, ‘last’, False}, default ‘first’

Determines which duplicates (if any) to mark.

‘first’: Mark duplicates as True except for the first occurrence.
‘last’: Mark duplicates as True except for the last occurrence.
False: Mark all duplicates as True.
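
As a rough illustration of how the three keep values differ, the sketch below applies each of them to the same small DataFrame. The frame `demo` and its column name `letter` are made up for this illustration only.

```python
import pandas as pd

# Hypothetical one-column frame: 'a' appears three times, 'b' once.
demo = pd.DataFrame({'letter': ['a', 'b', 'a', 'a']})

first_mask = demo.duplicated()            # keep='first' is the default
last_mask = demo.duplicated(keep='last')  # spare the last occurrence instead
all_mask = demo.duplicated(keep=False)    # flag every member of a duplicate group

print(first_mask.tolist())  # [False, False, True, True]
print(last_mask.tolist())   # [True, False, True, False]
print(all_mask.tolist())    # [True, False, True, True]
```

The row holding 'b' is never flagged, because it has no duplicate under any setting.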

Example 1: Finding duplicate rows using the `DataFrame.duplicated()` Method

The example below shows the default behaviour: within each set of duplicated rows in the DataFrame, the first occurrence is marked False and all later occurrences are marked True.

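import pandas as pd

# Rows 0 and 2, rows 1 and 3, and rows 4 and 5 are identical pairs.
df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print("-----------DataFrame--------")
print(df)
print("------Finding duplicates rows-------")
# Default keep='first': only the first row of each pair is marked False.
print(df.duplicated())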

Once we run the program we will get the following output.

-----------DataFrame--------
       Name  Skills
0     Navya  Python
1    Vindya    Java
2     Navya  Python
3    Vindya    Java
4  Sinchana    Java
5  Sinchana    Java
------Finding duplicates rows-------
0    False
1    False
2     True
3     True
4    False
5     True
dtype: bool
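
Because the result is an ordinary boolean Series, it can be used as a mask. The following is a minimal sketch, not part of the original example, that counts the flagged rows and keeps only the unflagged ones; the variable names `mask`, `n_duplicates` and `unique_rows` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})

mask = df.duplicated()

# True counts as 1, so summing the mask counts the flagged rows.
n_duplicates = int(mask.sum())

# Inverting the mask with ~ keeps the first occurrence of every row.
unique_rows = df[~mask]

print(n_duplicates)
print(unique_rows)
```

With the default keep='first', filtering on the inverted mask should select the same rows as `df.drop_duplicates()`.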

Example 2: Finding duplicate rows using the `DataFrame.duplicated()` Method

The example below shows that with keep='last', the last occurrence in each set of duplicated rows is marked False and all earlier occurrences are marked True.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print("-----------DataFrame--------")
print(df)
print("------Finding duplicates rows-------")
# keep='last': only the last row of each identical pair is marked False.
print(df.duplicated(keep='last'))

Once we run the program we will get the following output.

-----------DataFrame--------
       Name  Skills
0     Navya  Python
1    Vindya    Java
2     Navya  Python
3    Vindya    Java
4  Sinchana    Java
5  Sinchana    Java
------Finding duplicates rows-------
0     True
1     True
2    False
3    False
4     True
5    False
dtype: bool

Example 3: Finding duplicate rows using the `DataFrame.duplicated()` Method

The example below shows that with keep=False, every row that has a duplicate is marked True, including the first and last occurrences.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print("-----------DataFrame--------")
print(df)
print("------Finding duplicates rows-------")
# keep=False: every row that belongs to a duplicate group is marked True.
print(df.duplicated(keep=False))

Once we run the program, every row is marked True, because each row in this DataFrame has at least one duplicate.
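
One common use of keep=False is to pull out every member of each duplicate group for inspection. The sketch below assumes a smaller frame with one made-up row ('Asha', 'Go') that has no duplicate, so that the filter has something to exclude; that row and the variable name `repeated` are not part of the original example.

```python
import pandas as pd

# Two identical rows (0 and 2) plus two unique rows.
# The ('Asha', 'Go') row is a hypothetical addition for illustration.
df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Asha'],
    'Skills': ['Python', 'Java', 'Python', 'Go'],
})

# keep=False flags both copies, so boolean indexing returns the whole group.
repeated = df[df.duplicated(keep=False)]

print(repeated)
```

Here only rows 0 and 2 are selected; the rows for Vindya and Asha are left out because nothing else in the frame matches them.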

Example 4: Finding duplicate rows by specific columns using the `DataFrame.duplicated()` Method

The example below shows how to find duplicates based on specific column(s) only, by using the subset parameter.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print("-----------DataFrame--------")
print(df)
print("------Finding duplicates rows-------")
# Only the Skills column is compared; the Name column is ignored.
print(df.duplicated(subset=['Skills']))

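Once we run the program, rows 0 and 1 are marked False, as the first occurrences of 'Python' and 'Java' in the Skills column, and rows 2 to 5 are marked True.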
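
Since subset accepts either a single column label or a sequence of labels, and can be combined with keep, a short sketch of both variations may help. The variable names `by_name` and `by_skill_last` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})

# A single label works as well as a list of labels.
by_name = df.duplicated(subset='Name')

# subset and keep can be combined: compare Skills only, spare the last match.
by_skill_last = df.duplicated(subset=['Skills'], keep='last')

print(by_name.tolist())        # [False, False, True, True, False, True]
print(by_skill_last.tolist())  # [True, True, False, True, True, False]
```

In the second result, row 2 is the last 'Python' and row 5 is the last 'Java', so those are the only rows left unflagged.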

Conclusion:

In this tutorial, we learned the Python pandas DataFrame.duplicated() method. We covered its syntax and parameters, and worked through examples that apply the method to a DataFrame with different keep and subset settings.
