Home > Software design >  VBA solution for very slow Countif formula
VBA solution for very slow Countif formula

Time:02-10

I have a solution that works in Excel with a countif formula (with the help of another Stackoverflow user).

Essentially what this countif formula does is count the first instance of an ID that exclusively exists with the classification type "DC". For example, as you can see in my snippet, 2232 is marked with as it is only exists with the classification "DC". Whilst in the case of 2240 it is marked as 0 as there are multiple classifications possible.

The formula in column D is the following:

=IF(IF(B2<>"DC",0,AND(COUNTIF(C$2:C$28,C2)=COUNTIF(A$2:A$28,A2),COUNTIF(A$2:A2,A2)=1)),1,0)

enter image description here

The problem that I am experiencing is that this is an extremely slow formula to process for Excel -- it takes roughly ~10-15 mins to complete. The database that I am running this on contains of roughly 150k~ lines.

I was wondering if it was possible to do this same process in VBA, but a lot faster and more efficient than the current processing time.

So I am using the following piece of VBA code to try to recreate the same results:

Sub MarkUniqueID()
   Dim Ary As Variant, Nary As Variant
   Dim r As Long
   
   With ThisWorkbook.Sheets("sheet1")
      Ary = .Range("A2", .Range("A" & Rows.Count).End(xlUp)).Value2
   End With
   ReDim Nary(1 To UBound(Ary), 1 To 1)
   With CreateObject("scripting.dictionary")
      For r = 1 To UBound(Ary)
         If Not .Exists(Ary(r, 1)) Then
            .Add Ary(r, 1), Nothing
            Nary(r, 1) = 1
         Else
            Nary(r, 1) = 0
         End If
      Next r
   End With
   ThisWorkbook.Sheets("sheet1").Range("E2").Resize(r).Value = Nary
End Sub

Which runs the process much smoother it takes only a few ~seconds of my original time, however, I am not sure how I can add one more criteria into my array (i.e. only exclusively consider "DC"), as now the results are not what I want (see below).

enter image description here

Any pointers would be much appreciated!

CodePudding user response:

You can use another dictionary to track which ID's should be excluded:

Sub MarkUniqueID()
    Dim Ary As Variant, Nary() As Long, cls, id, k
    Dim r As Long, dictIn As Object, dictOut As Object
    Dim ws As Worksheet
    
    Set dictIn = CreateObject("scripting.dictionary")
    Set dictOut = CreateObject("scripting.dictionary")
    
    Set ws = ThisWorkbook.Sheets("sheet1")
    'pick up the classification and ID
    Ary = ws.Range("B2:C" & ws.Cells(ws.Rows.Count, "B").End(xlUp).Row).Value
    
    ReDim Nary(1 To UBound(Ary), 1 To 1)
   
    For r = 1 To UBound(Ary, 1)
        cls = Ary(r, 1)
        id = CStr(Ary(r, 2))
       
        If cls = "DC" Then
            If Not dictIn.exists(id) Then dictIn.Add id, r
        Else
            If Not dictOut.exists(id) Then dictOut.Add id, True
        End If
    Next r
    For Each k In dictIn
        If Not dictOut.exists(k) Then Nary(dictIn(k), 1) = 1
    Next k
    ws.Range("E2").Resize(UBound(Nary, 1)).Value = Nary
End Sub
  •  Tags:  
  • vba
  • Related