While working on spreadsheets, you may come across duplicate values. They can arise for different reasons, such as multiple people working on the same document, merging multiple data sets, or manual data entry errors. In such cases, you don’t need to delete the affected rows or cells individually: MS Excel has a built-in Remove Duplicates tool that does this automatically. This Excel tutorial covers different approaches to removing duplicates in Excel, with examples on different datasets.
- Check for Duplicates
- Remove Duplicates When All Values are Identical
- Eliminate Duplicate Excel Entries with Identical Single Values
- Remove Duplicates and Replace With Blank Cells With Formula
- Using the Advanced Filter
Check for Duplicates
The first step to cleaning your dataset is to identify duplicate entries. You can use Conditional Formatting to highlight such data. Excel allows you to review the duplicates and decide if such duplicate entries should be removed.
To check for duplicates in your data set:
- Select the dataset.
- Go to the Home tab on the Ribbon and locate the Styles group.
- Click Conditional Formatting and find Highlight Cells Rules in the dropdown.
- Under Highlight Cells Rules, select Duplicate Values.
- In the Duplicate Values dialog box, choose the formatting (for example, the fill and font colour) you want for duplicate values and click OK.
- Excel will highlight every cell whose value appears more than once. You can then decide which data to keep.
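For readers who want to see the logic behind the highlight rule, here is a minimal Python sketch of the same idea: count how often each value occurs and flag every value that occurs more than once. It is an illustration only, not how Excel is implemented. The sample names are hypothetical, and the comparison here is exact, which may differ from how Excel compares text.

```python
from collections import Counter


def flag_duplicates(values):
    """Return one boolean per value: True if the value occurs more than once."""
    counts = Counter(values)
    return [counts[v] > 1 for v in values]


# Hypothetical sample column
names = ["Ana", "Ben", "Ana", "Cara", "Ben", "Dev"]
flags = flag_duplicates(names)

for name, is_dup in zip(names, flags):
    marker = "duplicate" if is_dup else ""
    print(f"{name:<5} {marker}")
```

Note that every occurrence is flagged, including the first one, which matches what you see on the sheet: the highlight marks all copies, not only the repeats.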
Remove Duplicates in Excel When All Values are Identical
The most frequent use of the Remove Duplicates function in Excel is the removal of completely identical entries from a table. Excel searches for rows with exactly the same value in every column. When it finds such a match, it keeps the first occurrence and deletes the rest. This is what happens with rows 2, 10, and 15 in the table shown as an example below:
- Identical entries in this example table have been manually marked to make them stand out.
- To remove duplicates, first select any cell in the table.
- Go to the “Data” tab and, in the “Data Tools” group, click on “Remove Duplicates”.
- In the dialog box, leave all columns checked and click OK.
Excel removes the duplicate entries from the sheet, keeps the unique values, and displays a message reporting how many duplicate values were removed and how many unique values remain.
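The behaviour described above (compare every column, keep the first occurrence, drop later copies) can be sketched in Python as follows. This is a minimal illustration under the assumption that each row is a tuple of cell values; the function name and the sample rows are hypothetical and are not the table from this tutorial.

```python
def remove_duplicate_rows(rows):
    """Keep the first occurrence of each fully identical row, in original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # all columns take part in the comparison
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data: name, type of vehicle, number of vehicles
rows = [
    ("Ana", "Car", 2),
    ("Ben", "Truck", 1),
    ("Ana", "Car", 2),   # exact copy of the first row
    ("Cara", "Van", 3),
]

cleaned = remove_duplicate_rows(rows)
removed = len(rows) - len(cleaned)
print(f"{removed} duplicate rows removed; {len(cleaned)} unique rows remain.")
```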
Eliminate Duplicate Excel Entries with Identical Single Values
While the most common use of the Remove Duplicates function is the removal of completely identical rows, there are also cases where rows should count as duplicates even though they match in only some of the columns. This is the case, for example, with tables containing columns with optional fields or outdated information that is no longer needed.
So, in our spreadsheet example, if the values in the Type of Vehicle and Number of Vehicles columns were irrelevant, even after discarding the completely identical entries, we would still have duplicates in rows 6, 7, 9, 10, and 14.
To get rid of such instances, again go to “Data Tools” and click on “Remove Duplicates”.
In the Remove Duplicates dialog box, uncheck the boxes corresponding to the columns that should not be considered in the search for duplicates. In our example, these are the Type of Vehicle and Number of Vehicles columns. Accept the changes and save the sheet.
Excel now removes every row that repeats an earlier row in the columns that remained checked. Because the entire row is deleted, including its values in the unchecked columns, make sure you select the right columns to avoid any data loss.
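To make the effect of unchecking columns concrete, here is a small Python sketch that compares rows on a chosen subset of columns only. The column layout, the `key_columns` parameter, and the sample records are assumptions made for illustration; they are not taken from the example spreadsheet.

```python
def remove_duplicates_by_columns(rows, key_columns):
    """Keep the first row for each combination of values in key_columns."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[i] for i in key_columns)  # only the "checked" columns
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical columns: name, city, type of vehicle, number of vehicles
records = [
    ("Ana", "Lisbon", "Car", 2),
    ("Ana", "Lisbon", "Van", 1),   # same name and city as the first row
    ("Ben", "Porto", "Truck", 1),
    ("Ana", "Faro", "Car", 2),     # same name, different city
]

# Compare on name and city only (columns 0 and 1)
kept = remove_duplicates_by_columns(records, key_columns=(0, 1))
for row in kept:
    print(row)
```

In this sketch the second record disappears entirely, including its "Van" and "1" values, which is why the choice of columns matters.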
Remove Duplicates and Replace With Blank Cells With Formula
If a range contains duplicate data, the following formula lets you replace the duplicates with blanks.
We have the data set below, which contains some repeated numbers and names.
1. Select a blank cell to hold the result.
2. Enter the formula below and press the Enter key.
Now fill in the phone numbers listed against the names in the new table.
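The formula itself is not reproduced in the text above. A formula commonly used for this purpose, offered here as an assumption rather than as the author's formula, is =IF(COUNTIF($A$2:A2,A2)=1,A2,""), which assumes the data starts in cell A2 and is filled down the result column. The Python sketch below shows the same logic: keep a value the first time it appears and output a blank for every later repeat. The function name and sample values are hypothetical.

```python
def blank_repeats(values):
    """Return a copy of values where every repeat after the first is an empty string."""
    seen = set()
    result = []
    for value in values:
        if value in seen:
            result.append("")      # repeat: leave the cell blank
        else:
            seen.add(value)
            result.append(value)   # first occurrence: keep it
    return result


# Hypothetical sample column
names = ["Ana", "Ben", "Ana", "Cara", "Ben"]
blanked = blank_repeats(names)
print(blanked)
```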
Using the Advanced Filter
With the help of the Advanced Filter, you can filter out duplicate values and copy the unique values to another location. Below are the steps to use the Advanced Filter:
1. Click on a cell in your data source or select the data range.
2. On the Data tab, go to the Sort & Filter group and click Advanced.
3. A dialog box will appear with several options.
- Check that the List range covers your data set.
- In this case, we want to copy the values to another place, so we will choose “Copy to another location”.
Here, we have specified a different location in the same sheet in the “Copy to” box: Columns G and H. We have also checked “Unique records only”.
- Click OK. You will now have the cleaned data.
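The key difference from Remove Duplicates is that the source data stays untouched and the unique records land somewhere else. A minimal Python sketch of that idea follows; the function name and the sample name and phone values are hypothetical.

```python
def copy_unique_records(source_rows):
    """Return a new list of unique rows; the source list is not modified."""
    # dict.fromkeys keeps the first occurrence of each key in insertion order
    return list(dict.fromkeys(tuple(row) for row in source_rows))


# Hypothetical source range: name, phone number
source = [
    ("Ana", "555-0101"),
    ("Ben", "555-0102"),
    ("Ana", "555-0101"),   # repeated record
    ("Cara", "555-0103"),
]

destination = copy_unique_records(source)
print(len(source), "rows in the source,", len(destination), "unique rows copied")
```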
We hope this article on removing duplicates from a data set was helpful. You can try different ways, either built-in Excel functionalities or formulas, to find and remove duplicate values.
FAQs - Removing Duplicates in Excel
What does it mean to remove duplicates in Excel?
Removing duplicates in Excel means eliminating repeated entries from a dataset, leaving unique records. This is important for ensuring the accuracy of your data analysis, as duplicates can skew results or lead to incorrect conclusions.
Why is it important to remove duplicates in Excel?
It is important to remove duplicates in Excel for the following reasons:
Accuracy: Ensures your data is clean and avoids skewed results in calculations and analysis.
Efficiency: Reduces data size, improving spreadsheet performance and processing speed.
Clarity: Makes data easier to read and understand.
How do I identify duplicate values in Excel?
Conditional Formatting: Highlight duplicates with specific formatting rules based on cell values.
Advanced Filter: Use the "Unique records only" option to filter out duplicates.
Remove Duplicates tool: The built-in "Remove Duplicates" tool lets you choose which columns to compare and reports how many duplicate values it removed.
What are the different methods for removing duplicates in Excel?
Remove Duplicates tool: Select your data range, access the "Data" tab, and click "Remove Duplicates." Choose the columns to compare; Excel keeps the first occurrence of each record and removes the later ones.
Formula approach: Use functions like COUNTIF, MATCH, and INDEX to create a helper column identifying duplicates, then filter and delete those rows.
Advanced Filter: Similar to the Remove Duplicates tool, but with "Unique records only" and "Copy to another location" it extracts the unique records to a new range and leaves the original data untouched.
Is there a way to remove duplicates without losing data?
To avoid losing data, copy your dataset to another location or sheet and perform the duplicate removal. Alternatively, you can use Conditional Formatting to highlight duplicates and manually review them before deletion, ensuring you do not accidentally remove necessary data.
How do I ensure I don't have duplicates in Excel in the future?
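To prevent duplicates, you can use Data Validation. Select the range where you want to prevent duplicates, go to the Data tab, click on Data Validation, and set the validation criteria to 'Custom' with the formula =COUNTIF($A$1:$A1, A1)=1 (adjust the range according to your needs). Because the range in this formula grows row by row, it rejects an entry that repeats a value in the cells above it. To check each entry against the whole range, use fixed references instead, such as =COUNTIF($A$1:$A$100, A1)=1, where $A$1:$A$100 is an example range. Note that Data Validation checks values typed into cells; values pasted into the range are not checked.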
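The COUNTIF rule boils down to "after this entry, the value must appear exactly once in the range". A minimal Python sketch of that check, with a hypothetical function name and sample IDs, looks like this:

```python
def is_entry_allowed(column, new_value):
    """Allow new_value only if it would appear exactly once after being added."""
    candidate = column + [new_value]            # the range including the new entry
    return candidate.count(new_value) == 1      # mirrors COUNTIF(range, value) = 1


# Hypothetical existing entries
existing_ids = ["A100", "A101"]

print(is_entry_allowed(existing_ids, "A102"))
print(is_entry_allowed(existing_ids, "A100"))
```

The first call accepts a new ID, while the second rejects an ID that is already in the list.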