Home > other >  Match several columns of 2 tables via key most efficiently (Xlookup (C library usage?) vs binary s
Match several columns of 2 tables via key most efficiently (Xlookup (C library usage?) vs binary s

Time:08-08

I would like to match two tables using a key in the most efficient way possible.

I know that arrays are fastest, so I am trying to use them as much as possible. Having read that using in-built functions is faster (as they use C/C libraries directly), I wanted to incorporate Application.Xlookup, particularly for the binary-search function.

However, I have been unable to get Application.Xlookup to work on arrays, but rather only on ranges?

Moreover, I have been unable to fetch the address of the found XLookup value, which would allow me to fetch associated column items directly from the array. Instead, as an inefficient (temporary) workaround I am running the XLookup 3x for each search key (row).

Please guide me on how to achieve the most efficient approach by either:

  • assisting me in using XLookup in VBA in the most efficient manner and to use the address to fetch associated row items.
  • comment on whether using a VBA-based binary search algorithm will be as efficient as the built-in Application.Xlookup function for thousands of items and whether I should use this instead.

Relevant code below:

Sub matchTables_viaKey()
' Define tables
    Dim T_DataToImport As ListObject
    Set T_DataToImport = ThisWorkbook.Worksheets("DataToImport").ListObjects("T_DataToImport")
    
    Dim T_Match As ListObject
    Set T_Match = ThisWorkbook.Worksheets("Match").ListObjects("T_Match")

' Sort data to be fetched (matched)
    Call table_Sort_ByAscending(T_DataToImport, "Key") ' need to sort for binary search of xlookup to work.

' Define search range and column for xlookup
    Dim searchRange_key As Range:               Set searchRange_key = T_DataToImport.ListColumns("Key").DataBodyRange
    Dim key_ColumnNumber As Integer:               key_ColumnNumber = T_ColNr(T_Match, "Key")

' Define return ranges (1-X) and columns for xlookup
    Dim returnRange1 As Range:                  Set returnRange1 = T_DataToImport.ListColumns("column1").DataBodyRange
    Dim rR1_ColumnNumber As Integer:            rR1_ColumnNumber = T_ColNr(T_Match, "column1")
    
    Dim returnRange2 As Range:                  Set returnRange2 = T_DataToImport.ListColumns("column2").DataBodyRange
    Dim rR2_ColumnNumber As Integer:            rR2_ColumnNumber = T_ColNr(T_Match, "column2")
    
    Dim returnRangeX As Range:                  Set returnRangeX = T_DataToImport.ListColumns("X").DataBodyRange
    Dim rRX_ColumnNumber As Integer:            rRX_ColumnNumber = T_ColNr(T_Match, "X")
    
' Assign tables to arrays
    Dim arrT_Match As Variant:                 arrT_Match = T_Match.DataBodyRange.Value         
    Dim arrT_DataToImport As Variant:   arrT_DataToImport = T_DataToImport.DataBodyRange.Value 

' Loop through each item of array and do an xlookup.
    Dim i As Long, temp As String
    For i = LBound(arrT_Match) To UBound(arrT_Match)
        temp = Application.XLookup(arrT_Match(i, key_ColumnNumber), searchRange_key2, searchRange_key2, "", 0, 2) ' sometimes xlookup seems to be unreliable when set to 2
        
        If temp = "" Then
            arrT_Match(i, rR1_ColumnNumber) = ""
            arrT_Match(i, rR2_ColumnNumber) = ""
            arrT_Match(i, rRX_ColumnNumber) = ""
        Else
            ' Below problematic: as it runs the xlookup 3x (I was unable to fetch the address of found item and then return other values by difference in column numbers)
            arrT_Match(i, rR1_ColumnNumber) = Application.XLookup(arrT_Match(i, key_ColumnNumber), searchRange_key2, returnRange1, "", 0, 2) ' sometimes xlookup seems to be unreliable when set to 2 even though sort is applied?
            arrT_Match(i, rR2_ColumnNumber) = Application.XLookup(arrT_Match(i, key_ColumnNumber), searchRange_key2, returnRange2, "", 0, 2)
            arrT_Match(i, rRX_ColumnNumber) = Application.XLookup(arrT_Match(i, key_ColumnNumber), searchRange_key2, returnRangeX, "", 0, 2)
        End If
    Next

' Assign array back to table
    T_Match.DataBodyRange.Value = arrT_Match
    
    Set T_DataToImport = Nothing
    Set T_Match = Nothing
End Sub

Sub table_Sort_ByAscending(ByRef tbl As ListObject, headerName As String)
    Dim iColumn As Range:     Set iColumn = tbl.ListColumns(headerName).Range
    
    With tbl.Sort
        .SortFields.Clear
        .SortFields.Add Key:=iColumn, SortOn:=xlSortOnValues, Order:=xlAscending
        .header = xlYes
        .Apply
    End With
End Sub

Function T_ColNr(ByRef tbl As ListObject, header As String) As Integer
    Dim OffsetOfTableCol As Integer
    OffsetOfTableCol = tbl.Range(1, 1).Column - 1

    Dim CN As Integer
    CN = tbl.ListColumns(header).Range.Column
    
    T_ColNr = CN - OffsetOfTableCol
End Function

Please feel welcome to add any additional feedback on where and how I could improve (bad coding practices, etc).

Thank you for your time.

CodePudding user response:

This worked for me, using Application.Match(). Also moved the returned value column headers into a loop, so you can more-easily add/remove columns.

Sub matchTables_viaKey()

    Dim wb As Workbook, T_DataToImport As ListObject, T_Match As ListObject
    Dim searchRange_key As Range, colKey As Integer, m, v
    Dim arrT_Match As Variant, i As Long, ub As Long, r As Long
    Dim arrT_DataToImport As Variant, colNames, colPosSrc, colPosDest
    
    Set wb = ThisWorkbook
    
    'this is the lookup table
    Set T_DataToImport = wb.Worksheets("DataToImport").ListObjects("T_DataToImport")
    'set the lookup range for `Match()`
    Set searchRange_key = T_DataToImport.ListColumns("Key").DataBodyRange
    arrT_DataToImport = T_DataToImport.DataBodyRange.Value
    
    'this is the data to be supplemented from the lookup table
    Set T_Match = wb.Worksheets("Match").ListObjects("T_Match")
    colKey = T_Match.ListColumns("Key").Index
    arrT_Match = T_Match.DataBodyRange.Value
    
    'Set up the information for the columns to be looked up
    colNames = Array("column1", "column2", "X") 'column names for data to be transferred
    ub = UBound(colNames)
    ReDim colPosSrc(0 To ub)  'size arrays for source/destination column indexes
    ReDim colPosDest(0 To ub)
    For i = 0 To ub 'loop and look up the relevant column postions in source/destination tables
        colPosSrc(i) = T_DataToImport.ListColumns(colNames(i)).Index
        colPosDest(i) = T_Match.ListColumns(colNames(i)).Index
    Next i
    
    'Loop over the table data and match on key columns
    For r = 1 To UBound(arrT_Match, 1)
        m = Application.Match(arrT_Match(r, colKey), searchRange_key, 0)
        For i = 0 To ub                  'loop columns to transfer
            If Not IsError(m) Then       'got a match?
                v = arrT_DataToImport(m, colPosSrc(i))
            Else
                v = ""   'no match: empty value
            End If
            arrT_Match(r, colPosDest(i)) = v
        Next i
    Next r

    T_Match.DataBodyRange.Value = arrT_Match ' Assign array back to table
    
End Sub
  • Related