Pandas DataFrame drop_duplicates() Method

In this tutorial, we will learn the Python pandas DataFrame.drop_duplicates() method. It returns a DataFrame with duplicate rows removed. By default all columns are compared, but the comparison can optionally be limited to certain columns. Indexes, including time indexes, are ignored when identifying duplicates.

The syntax of the DataFrame.drop_duplicates() method is shown below.
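To illustrate the point about indexes being ignored, here is a minimal sketch. The index labels "a" and "b" are made up for the illustration; the two rows hold the same column values, so the second one counts as a duplicate even though its label differs.

```python
import pandas as pd

# Two rows hold the same values but carry different index labels.
df = pd.DataFrame(
    {"Name": ["Navya", "Navya"], "Skills": ["Python", "Python"]},
    index=["a", "b"],
)

# Only the column values are compared, so the row labelled "b" is dropped.
result = df.drop_duplicates()
print(result)
```

The surviving row keeps its original label, "a".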

Syntax

DataFrame.drop_duplicates(subset=None, keep='first', inplace=False, ignore_index=False)

Parameters

subset: column label or sequence of labels, optional. Only consider certain columns for identifying duplicates; by default, all of the columns are used.

keep: {'first', 'last', False}, default 'first'. Determines which duplicates (if any) to keep.
- 'first': Drop duplicates except for the first occurrence.
- 'last': Drop duplicates except for the last occurrence.
- False: Drop all duplicates.
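The three keep options can be compared side by side. The sketch below uses a small made-up DataFrame in which rows 0 and 2 are identical, and the variable names are chosen only for this illustration.

```python
import pandas as pd

# Rows 0 and 2 are identical; rows 1 and 3 are unique.
df = pd.DataFrame({
    "Name": ["Navya", "Vindya", "Navya", "Sinchana"],
    "Skills": ["Python", "Java", "Python", "Java"],
})

keep_first = df.drop_duplicates(keep="first")  # drops row 2
keep_last = df.drop_duplicates(keep="last")    # drops row 0
keep_none = df.drop_duplicates(keep=False)     # drops rows 0 and 2

print(list(keep_first.index))  # [0, 1, 3]
print(list(keep_last.index))   # [1, 2, 3]
print(list(keep_none.index))   # [1, 3]
```

Note that keep=False removes every row that has a duplicate, so no copy of the repeated row survives.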

inplace: bool, default False. Whether to drop duplicates in place or to return a copy. If True, the DataFrame itself is modified and the method returns None.
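A short sketch of the difference between the two inplace settings follows. The single-column DataFrame is a made-up example.

```python
import pandas as pd

df = pd.DataFrame({"Skills": ["Python", "Java", "Python"]})

# Default (inplace=False): df is left untouched and a new DataFrame is returned.
deduped = df.drop_duplicates()
print(len(df), len(deduped))  # 3 2

# inplace=True: df itself is modified and the call returns None.
returned = df.drop_duplicates(inplace=True)
print(returned, len(df))  # None 2
```

Because the in-place call returns None, assigning its result back to df would discard the data.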

ignore_index: bool, default False. If True, the resulting axis will be labeled 0, 1, …, n - 1.
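The effect of ignore_index is easiest to see on the index of the result. This is a minimal sketch on made-up data; it assumes pandas 1.0 or later, where the parameter was introduced.

```python
import pandas as pd

df = pd.DataFrame({"Skills": ["Python", "Python", "Java"]})

# Default: surviving rows keep their original labels, leaving a gap at 1.
kept_labels = df.drop_duplicates()
print(list(kept_labels.index))  # [0, 2]

# ignore_index=True: the result is relabeled 0, 1, ..., n - 1.
renumbered = df.drop_duplicates(ignore_index=True)
print(list(renumbered.index))  # [0, 1]
```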

Example 1: Removing duplicate rows using DataFrame.drop_duplicates() Method

The DataFrame.drop_duplicates() method removes duplicate rows. By default, it compares all columns and keeps the first occurrence of each repeated row. The example below shows this.

import pandas as pd

# Rows 0 and 2, rows 1 and 3, and rows 4 and 5 are identical pairs.
df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print(df)
print("-------After removing duplicate rows------")
print(df.drop_duplicates())

Once we run the program we will get the following output.

       Name  Skills
0     Navya  Python
1    Vindya    Java
2     Navya  Python
3    Vindya    Java
4  Sinchana    Java
5  Sinchana    Java
-------After removing duplicate rows------
       Name  Skills
0     Navya  Python
1    Vindya    Java
4  Sinchana    Java

Example 2: Removing duplicate rows based on specific columns using the subset parameter

To identify duplicates by specific column(s) only, pass them to the subset parameter. In the example below, only the Skills column is compared, so just the first row for each distinct skill is kept.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print(df)
print("-------After removing duplicate rows------")
# Compare only the Skills column when looking for duplicates.
print(df.drop_duplicates(subset=['Skills']))

Once we run the program we will get the following output.

       Name  Skills
0     Navya  Python
1    Vindya    Java
2     Navya  Python
3    Vindya    Java
4  Sinchana    Java
5  Sinchana    Java
-------After removing duplicate rows------
     Name  Skills
0   Navya  Python
1  Vindya    Java

Example 3: Removing duplicate rows while keeping the last occurrence using the keep parameter

By default, the first occurrence of each duplicate row is kept. Passing keep='last' keeps the last occurrence instead. The example below shows this.

import pandas as pd

df = pd.DataFrame({
    'Name': ['Navya', 'Vindya', 'Navya', 'Vindya', 'Sinchana', 'Sinchana'],
    'Skills': ['Python', 'Java', 'Python', 'Java', 'Java', 'Java'],
})
print(df)
print("-------After removing duplicate rows------")
# Keep the last occurrence of each duplicated (Name, Skills) pair.
print(df.drop_duplicates(subset=['Name', 'Skills'], keep='last'))

Once we run the program we will get the following output.

       Name  Skills
0     Navya  Python
1    Vindya    Java
2     Navya  Python
3    Vindya    Java
4  Sinchana    Java
5  Sinchana    Java
-------After removing duplicate rows------
       Name  Skills
2     Navya  Python
3    Vindya    Java
5  Sinchana    Java

Conclusion

In this tutorial, we learned about the DataFrame.drop_duplicates() method. We covered its syntax and parameters and worked through examples that apply it to a DataFrame.
