I have a spreadsheet with three columns: year, name1, name2. Each entry represents a pair. I've noticed that there are duplicates in the data that Excel does not detect, namely:
So the only pair in year 1986 is ABW-AIA. However, in the data I've got two entries: ABW-AIA and AIA-ABW. I would like to remove either ABW-AIA or AIA-ABW, so I am only left with one entry. I would like to repeat the same process with many other pairs. How can I do this?
I'll greatly appreciate any help!
CodePudding user response:
If you have O365, you could follow this approach. There may very well be a more elegant way (i.e. a single step) but this did work for me.
If your data is in columns A to C (and for my test data, I only had 4 rows worth) ...
Cell E1 = =TRANSPOSE(SORT(TRANSPOSE(A1:C1),,1))
(fill down)
Cell I1 = =UNIQUE(E1:G4)
CodePudding user response:
Remove Alernating Duplicates
Option Explicit
Sub RemoveAlternatingDuplicates()
Const Delim As String = "@@"
Const cCount As Long = 3
Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
' Copy the range data to the array.
Dim rg As Range
Dim rCount As Long
With ws.Range("A1").CurrentRegion.Resize(, cCount)
rCount = .Rows.Count - 1
Set rg = .Resize(rCount).Offset(1)
End With
Dim Data As Variant: Data = rg.Value
' Copy the 'unique' data from the array to a dictionary.
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
dict.CompareMode = vbTextCompare
Dim r As Long
Dim sString1 As String
Dim sString2 As String
For r = 1 To rCount
sString1 = Data(r, 1) & Delim & Data(r, 2) & Delim & Data(r, 3)
sString2 = Data(r, 1) & Delim & Data(r, 3) & Delim & Data(r, 2)
If Not dict.Exists(sString1) Then
If Not dict.Exists(sString2) Then
dict(sString1) = Empty
End If
End If
Next r
' Copy from the dictionary to the array.
r = 0
Dim Arr() As String
Dim Key As Variant
Dim c As Long
For Each Key In dict.Keys
Arr = Split(Key, Delim)
r = r 1
For c = 1 To cCount
Data(r, c) = Arr(c - 1)
Next c
Next Key
' Copy from the array to the range.
With rg.Resize(, cCount)
.Resize(r).Value = Data
.Resize(ws.Rows.Count - .Row - r 1).Offset(r).Clear
End With
MsgBox "Alternating duplicates removed.", vbInformation
End Sub