How To Remove Duplicates In Excel

While working on spreadsheets, you may come across duplicate values. They can arise for different reasons, such as multiple people working on the same document, merging of multiple data sets, and manual data entry errors. In such cases, you don't need to delete the affected rows or cells individually: MS Excel has a built-in Remove Duplicates tool that removes them automatically. This Excel tutorial covers different approaches to removing duplicates in Excel, with examples on different datasets.
Content
- Check for Duplicates
- Remove Duplicates When All Values are Identical
- Eliminate Duplicate Excel Entries with Identical Single Values
- Remove Duplicates and Replace With Blank Cells With Formula
- Using the Advanced Filter
- How to Remove Duplicates from Google Sheets
Check for Duplicates
The first step to cleaning your dataset is to identify duplicate entries. You can use Conditional Formatting to highlight such data. Excel allows you to review the duplicates and decide if such duplicate entries should be removed.
To check for duplicates in your data set:
- Select the dataset.
- Go to the Home tab on the Ribbon and locate Styles.
- Choose Conditional Formatting from the Styles group and find Highlight Cells Rules in the dropdown.
- Under Highlight Cells Rules, you will see the Duplicate Values option in the list.
- Select Duplicate Values.
- Excel will identify duplicates and ask you to choose the preferred colour coding for duplicate values.
- You can choose the colours you want.
- Excel will highlight duplicate values. You can decide which data to keep.
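The idea behind the highlight rule can be sketched outside Excel. The short Python example below is only an illustration, not how Excel works internally: it counts each value in a column and flags those that occur more than once. The sample `cities` list and the `find_duplicates` helper are hypothetical, and the comparison here is case-sensitive, which may differ from Excel's behaviour.

```python
from collections import Counter

def find_duplicates(values):
    """Return the set of values that occur more than once."""
    counts = Counter(values)
    return {value for value, n in counts.items() if n > 1}

# Hypothetical sample column; replace with your own data.
cities = ["Delhi", "Mumbai", "Pune", "Delhi", "Chennai", "Pune"]
duplicated = find_duplicates(cities)

# Flag every cell whose value is repeated, as the highlight rule would.
flags = [(city, city in duplicated) for city in cities]
for city, is_dup in flags:
    print(f"{city:10} {'DUPLICATE' if is_dup else ''}")
```

Note that every occurrence of a repeated value is flagged, including the first one, so you still have to decide which copy to keep.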
Remove Duplicates in Excel When All Values are Identical
The most frequent use of the Remove Duplicates in Excel function is the removal of totally identical entries from the table. Excel searches for rows with the same value in each column. All entries except the original are deleted if there is such a match. This is what happens with rows 2, 10, and 15 in the table shown as an example below:
- Identical entries in this example table have been manually marked to make them stand out.
- To remove duplicates, first, select any cell.
- Select the "Data" menu and, in the "Data Tools" group, click on "Remove Duplicates".
- Click OK.
Excel removes the duplicate entries from the sheet, keeps the unique values, and shows a message reporting how many duplicates were removed and how many unique values remain.
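As a rough illustration of what this operation does, the Python sketch below keeps the first occurrence of each fully identical row and drops the rest. The sample rows and the `remove_identical_rows` helper are hypothetical, and the matching is case-sensitive here, which is an assumption of the sketch rather than a statement about Excel.

```python
def remove_identical_rows(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row)  # every column takes part in the comparison
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows

# Hypothetical sample table: (owner, type of vehicle, number of vehicles).
rows = [
    ("Asha", "Car", 2),
    ("Ravi", "Truck", 1),
    ("Asha", "Car", 2),   # identical to the first row in every column
    ("Asha", "Bike", 2),  # differs in one column, so it is kept
]
cleaned = remove_identical_rows(rows)
removed = len(rows) - len(cleaned)
print(f"{removed} duplicate rows removed; {len(cleaned)} unique rows remain.")
```

A row is dropped only when it matches an earlier row in every column; a difference in a single cell is enough to keep it.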
Eliminate Duplicate Excel Entries with Identical Single Values
While the most common use of the Remove Duplicates function is the removal of totally identical rows, there are also cases where rows count as duplicates because they match in only some of their columns. This happens, for example, in tables containing columns with optional fields or outdated information that is no longer needed.
So, in our spreadsheet example, if the values in the Type of vehicle and Number of vehicles columns were irrelevant, even after discarding the totally identical entries, we would still have duplicates in rows 6, 7, 9, 10, and 14.
To get rid of such instances, again go to "Data Tools" and click on "Remove Duplicates".
In the Remove Duplicates dialogue box, uncheck the boxes corresponding to the columns that should not be considered in the search for duplicates. In our example, these are the Type of Vehicle and Number of Vehicles columns. Accept the changes and save the sheet.
Excel then removes the rows that are duplicates in the columns left checked. Make sure you select the right columns to avoid any data loss.
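The effect of unchecking columns can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Owner` and `City` columns, the sample rows, and the `remove_duplicates_by_columns` helper are all hypothetical, and only the column names Type of vehicle and Number of vehicles come from the example above.

```python
def remove_duplicates_by_columns(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    kept = []
    for row in rows:
        # Only the checked columns take part in the comparison.
        key = tuple(row[col] for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical sample data.
rows = [
    {"Owner": "Asha", "City": "Pune", "Type of vehicle": "Car", "Number of vehicles": 2},
    {"Owner": "Asha", "City": "Pune", "Type of vehicle": "Bike", "Number of vehicles": 1},
    {"Owner": "Ravi", "City": "Delhi", "Type of vehicle": "Truck", "Number of vehicles": 1},
]

# Type of vehicle and Number of vehicles are left out of the comparison.
kept = remove_duplicates_by_columns(rows, ["Owner", "City"])
for row in kept:
    print(row)
```

In this sketch the second row is dropped entirely, including its Bike entry, which is the kind of data loss to watch for when choosing the columns.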
Remove Duplicates and Replace With Blank Cells With Formula
If a range contains duplicate data, the following formula replaces the duplicates with blanks.
We have the following data set that has some repeated numbers and names.
1. Select a blank cell to obtain the result.
2. Enter the formula below and press the Enter key.
=IF(A2="","",IF(COUNTIF($A2:A19,A2)=1,A2,""))
Now fill in the Phone Numbers marked against the names in the new table.
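To see what the formula computes once it is filled down a column, here is a minimal Python sketch of the same logic. It assumes the data fits within the range the formula covers, so that each cell is compared against itself and all the cells below it. The `names` list and the `blank_out_duplicates` helper are hypothetical, and the comparison is case-sensitive in this sketch.

```python
def blank_out_duplicates(column):
    """Mirror IF(cell="","",IF(COUNTIF(cell..end, cell)=1, cell, ""))."""
    result = []
    for i, value in enumerate(column):
        if value == "":
            result.append("")      # blank input stays blank
        elif column[i:].count(value) == 1:
            result.append(value)   # no repeat at or below this row
        else:
            result.append("")      # value appears again further down
    return result

# Hypothetical sample column.
names = ["Asha", "Ravi", "Asha", "", "Meera", "Ravi"]
result = blank_out_duplicates(names)
print(result)
```

Because the count runs from the current row downwards, it is the last occurrence of each value that survives, while earlier occurrences become blank.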
Using the Advanced Filter
With the help of the Advanced Filter, you can filter duplicate values and move the unique values to another location. Below are the steps to use the Advanced Filter:
1. Click on the cell or data range in your data source.
2. On the Data tab, go to the Sort & Filter group and click Advanced.
3. A dialogue box will appear with several options.
- Check the list range of your data set.
- In this case, we want to move the values to another place, so we will choose "Copy to another location".
Here, we have specified a different location in the same sheet. We have selected Columns G and H. We have also checked "Unique records only".
- Click OK. You will now have the cleaned data.
We hope this article on removing duplicates from the dataset was helpful. You can try different ways, either built-in Excel functionalities or formulas, to deal with duplicate values.
How to Remove Duplicates from Google Sheets
Removing duplicates in Google Sheets is just as simple as in Excel, but requires slightly different steps. Google Sheets is available on the web and as a standalone app on both iOS and Android, but note that removing duplicates is only available in Google Sheets for the web. Here is how to do it.
- Open your spreadsheet in Google Sheets.
- Choose the data range from which you want to remove duplicates. Cells with identical values but different cases, formulas, or formats will be considered duplicates.
- Select "Data" from the top menu.
- Click on "Data Cleanup".
- Click "Remove Duplicates".
- Select the data range you want to clean of duplicates.
- Click "Remove Duplicates".
That's it. There should be no duplicate values in the spreadsheet.
You can also use the UNIQUE function in Google Sheets to display unique values in your spreadsheet. Here's how to do it:
- Create a new column or sheet: Choose where the deduplicated list appears.
- In a new cell, enter the formula =UNIQUE(range), replacing range with the actual range of your data. Example: If your data is in column A, you would use "=UNIQUE(A:A)".
- Press Enter: The formula will generate a list of unique values from the specified range.
- Done
The new list contains only the unique values, with all the repetitions left out. The original data stays unchanged.
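The behaviour of a UNIQUE-style formula can be approximated in a few lines of Python. This is a sketch of the idea only, not of Google Sheets itself: it builds a new list with each value kept once, in order of first appearance, and leaves the source list alone. The `values` list and the `unique` helper are hypothetical.

```python
def unique(values):
    """Return each value once, in order of first appearance."""
    # dict keys keep insertion order, so repeats are dropped in place.
    return list(dict.fromkeys(values))

# Hypothetical sample column.
values = ["Pune", "Delhi", "Pune", "Mumbai", "Delhi"]
deduplicated = unique(values)

print(deduplicated)  # the new, deduplicated list
print(values)        # the source list is left as it was
```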
Google Sheets can also help you automatically highlight all duplicate values.
To do this:
- Select the data range where you want duplicates to be highlighted.
- Open "Conditional Formatting".
- Set rules for duplicates. To do this:
- Select "Custom formula is" in "Format cells if" in the side panel.
- Type the formula "=COUNTIF($A$1:$A$100, A1) > 1", where $A$1:$A$100 is the actual data range.
- Choose a formatting style by selecting the color you want to highlight repeating values with.
- Press "Done".
Keep learning!
FAQs - Removing Duplicates in Excel
What does it mean to remove duplicates in Excel?
Removing duplicates in Excel means eliminating repeated entries from a dataset, leaving unique records. This is important for ensuring the accuracy of your data analysis, as duplicates can skew results or lead to incorrect conclusions.
Why is it important to remove duplicates in Excel?
It is important to remove duplicates in Excel for the following reasons:
Accuracy: Ensures your data is clean and avoids skewed results in calculations and analysis.
Efficiency: Reduces data size, improving spreadsheet performance and processing speed.
Clarity: Makes data easier to read and understand.
How do I identify duplicate values in Excel?
Conditional Formatting: Highlight duplicates with specific formatting rules based on cell values.
Advanced Filter: Use the "Unique records only" option to filter out duplicates.
Duplicates tool: The built-in "Remove Duplicates" tool allows for fine-grained removal based on your criteria.
What are the different methods for removing duplicates in Excel?
Remove Duplicates tool: Select your data range, access the "Data" tab, and click "Remove Duplicates." Choose the columns to compare; Excel keeps the first occurrence of each match and deletes the rest.
Formula approach: Use functions like COUNTIF, MATCH, and INDEX to create a helper column identifying duplicates, then filter and delete those rows.
Advanced Filter: Similar to the Remove Duplicates tool, but offers more control over filtering criteria.
Is there a way to remove duplicates without losing data?
To avoid losing data, copy your dataset to another location or sheet and perform the duplicate removal. Alternatively, you can use Conditional Formatting to highlight duplicates and manually review them before deletion, ensuring you do not accidentally remove necessary data.
How do I ensure I don't have duplicates in Excel in the future?
To prevent duplicates, you can use Data Validation. Select the range where you want to prevent duplicates, go to the Data tab, click on Data Validation, and set the validation criteria to 'Custom' with the formula =COUNTIF($A$1:$A1, A1)=1 (adjust the range according to your needs). This will prevent users from entering duplicate values in the specified range.
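The logic of that validation rule can be sketched in Python. This is a hedged illustration rather than a description of Excel's validation engine: each new entry is accepted only if, counted together with the entries above it, it occurs exactly once. The invoice numbers and the `validate_entries` helper are hypothetical, and the comparison is case-sensitive in this sketch.

```python
def validate_entries(entries):
    """Accept an entry only if it does not repeat an accepted entry above it."""
    accepted = []
    rejected = []
    for value in entries:
        # Same idea as COUNTIF(range_so_far, value) = 1:
        # the value must occur exactly once, counting the new cell itself.
        if (accepted + [value]).count(value) == 1:
            accepted.append(value)
        else:
            rejected.append(value)
    return accepted, rejected

# Hypothetical sequence of values typed into the validated column.
entries = ["INV-001", "INV-002", "INV-001", "INV-003"]
accepted, rejected = validate_entries(entries)

print("accepted:", accepted)
print("rejected:", rejected)
```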
