Home > Back-end >  Excel not detecting alternating duplicates in three columns
Excel not detecting alternating duplicates in three columns

Time:02-12

I have a spreadsheet with three columns: year, name1, name2. Each entry represents a pair. I've noticed that there are duplicates in the data that Excel does not detect, namely:

So the only pair in year 1986 is ABW-AIA. However, in the data I've got two entries: ABW-AIA and AIA-ABW. I would like to remove either ABW-AIA or AIA-ABW, so I am only left with one entry. I would like to repeat the same process with many other pairs. How can I do this?

I'll greatly appreciate any help!

CodePudding user response:

If you have O365, you could follow this approach. There may very well be a more elegant way (i.e. a single step) but this did work for me.

If your data is in columns A to C (and for my test data, I only had 4 rows worth) ...

Cell E1 = =TRANSPOSE(SORT(TRANSPOSE(A1:C1),,1)) (fill down)

Cell I1 = =UNIQUE(E1:G4)

Worksheet

CodePudding user response:

Remove Alernating Duplicates

Option Explicit

Sub RemoveAlternatingDuplicates()
    
    Const Delim As String = "@@"
    Const cCount As Long = 3
    
    Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
    
    ' Copy the range data to the array.
    
    Dim rg As Range
    Dim rCount As Long
    With ws.Range("A1").CurrentRegion.Resize(, cCount)
        rCount = .Rows.Count - 1
        Set rg = .Resize(rCount).Offset(1)
    End With
    Dim Data As Variant: Data = rg.Value
    
    ' Copy the 'unique' data from the array to a dictionary.
    
    Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare
    
    Dim r As Long
    Dim sString1 As String
    Dim sString2 As String
    
    For r = 1 To rCount
        sString1 = Data(r, 1) & Delim & Data(r, 2) & Delim & Data(r, 3)
        sString2 = Data(r, 1) & Delim & Data(r, 3) & Delim & Data(r, 2)
        If Not dict.Exists(sString1) Then
            If Not dict.Exists(sString2) Then
                dict(sString1) = Empty
            End If
        End If
    Next r
    
    ' Copy from the dictionary to the array.
    
    r = 0
    
    Dim Arr() As String
    Dim Key As Variant
    Dim c As Long
    
    For Each Key In dict.Keys
        Arr = Split(Key, Delim)
        r = r   1
        For c = 1 To cCount
            Data(r, c) = Arr(c - 1)
        Next c
    Next Key
    
    ' Copy from the array to the range.
    
    With rg.Resize(, cCount)
        .Resize(r).Value = Data
        .Resize(ws.Rows.Count - .Row - r   1).Offset(r).Clear
    End With

    MsgBox "Alternating duplicates removed.", vbInformation

End Sub
  • Related