Home > OS >  Excel remove duplicates based on 2 columns case-sensitive
Excel remove duplicates based on 2 columns case-sensitive

Time:06-27

I need to remove duplicates from an Excel worksheet based on the values in 2 columns while taking case into account.

In the example below, Rows 1 and 2 are duplicates (Row 2 should be removed). Row 3, 4, and 5 are unique.

Row Column A Column B
1 Abc Def
2 Abc Def
3 ABC DEF
4 ABC DeF
5 Abc DeF

I've done this with other datasets using Data > Remove duplicates, but since it is case-insensitive, it won't work for this.

I also found this question, which is very similar, but only identifies duplicates based on 1 column.

(How to remove duplicates that are case SENSITIVE in Excel (for 100k records or more)?)

CodePudding user response:

Try this code:

Sub SubRemoveDuplicates()
    
    'Declarations.
    Dim RngData As Range
    Dim RngDataToBeCompared
    Dim RngCell As Range
    Dim RngRow As Range
    
    'Settings.
    Set RngData = Range("A1:C6")
    Set RngDataToBeCompared = Range("B2:C6")
    
    
    'Covering each row of the data to be compared.
    For Each RngRow In RngDataToBeCompared.Rows
        
CP_Rerun_For:
        
        'Covering each cell of the given row.
        For Each RngCell In RngRow.Cells
            
            'Checking if any cell is different from the one under it.
            If RngCell.Value <> RngCell.Offset(1, 0).Value Then
            
                'If said cell has been found, skip to the next row.
                GoTo CP_Next_Row
                
            End If
        Next
        
        'Checking if the range to be targeted is within RngData.
        If Not Intersect(RngRow.Offset(1, 0).EntireRow, RngData) Is Nothing Then
            
            'Deleting the row of duplicates.
            Intersect(RngRow.Offset(1, 0).EntireRow, RngData).Delete (xlShiftUp)
            
            'Rerunning this cycle for the given row in order to catch duplicates that comes in more than 2.
            GoTo CP_Rerun_For
        End If
        
CP_Next_Row:
        
    Next
    
End Sub

Note: if you are going to cover an entire column with presumably many empty cells, the macro will cover (and eventually delete) all those empty cells too. The macro can be modified so it will stop when it encounters and empty row, or to dynamically determinate the appropriate range to be covered. Otherwise it might take more time than necessary.

CodePudding user response:

I don't like using macros until it's last hope. For your situation, I would suggest adding new columns, and with function =lower(Column A), etc. get values of column A in lower case. Then I would add one more new column and do the same for Column B. And after that, I would use Data/Remove Duplicates (converting range to Table format first). And then I would delete unnecessary columns which were added for converting everything to lowercase.

  • Related