I'm pretty new to VBA. So I have a string with a bunch of random words in it like this:
"foo Foo FOO Bar FoO Faz FAZ"
How do I find the number of unique words in the string? In the above example, it would be simply 3, as there's only foo, bar, and faz. I have some code here but it outputs the wrong value, I'm not sure if I should be adding it to an array or some sort of database instead and then iterate through the database to check for duplicates.
Function UniqueWordCount(TextString As String) As Integer
TextString = LCase(TextString)
Dim Result() As String
Dim Count As Integer
Result = Split(TextString, " ")
Count = UBound(Result()) 1
Dim k As Integer
Dim repeat As Integer
repeat = 0
For i = LBound(Result) To UBound(Result)
For k = 0 To i
If Result(i) = Result(k) Then
repeat = repeat 1
End If
Next k
Next i
If repeat > 1 Then
repeat = repeat - 1
End If
repeat = repeat - i
UniqueWordCount = Count - repeat
End Function
CodePudding user response:
Use a Scripting.Dictionary
(add a reference to Microsoft Scripting Runtime):
Sub foo()
Dim d As Scripting.Dictionary
Set d = New Scripting.Dictionary
Dim s As String
s = "foo Foo FOO Bar FoO Faz FAZ"
Dim x As Variant
x = Split(s)
Dim i As Long
For i = LBound(x) To UBound(x)
If Not d.Exists(UCase$(x(i))) Then
d.Add UCase$(x(i)), "whatever"
End If
Next
Debug.Print d.Count ' returns 3
End Sub
CodePudding user response:
Count Unique Words in a Sentence
- Be careful with this simple Regex pattern, it may not work as expected.
- Uncomment the
Debug.Print
lines to get a better feel of what is happening.
Option Explicit
Sub UniqueWordsCountTEST()
Dim Sentence As String
Sentence = "I am using this Regex to get rid off the punctuation,?!:;-. " _
& "I am using a dictionary to get rid off duplicates, and the " _
& "dictionary's CompareMode for 'word' to be the same as 'WORD'."
' 12 Dupes: I,am,using,to,get,rid,off,the,dictionary,to,the,WORD
Debug.Print "Unique Words Count = " & UniqueWordsCount(Sentence)
End Sub
Function UniqueWordsCount( _
ByVal Sentence As String) _
As Long
Dim Matches As Variant
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "\w "
Set Matches = .Execute(Sentence)
'Debug.Print "All Words Count = " & Matches.Count
End With
With CreateObject("Scripting.Dictionary")
.CompareMode = vbTextCompare
Dim Match As Variant
For Each Match In Matches
.Item(Match.Value) = Empty
'Debug.Print "All Words: " & Match.Value
Next Match
UniqueWordsCount = .Count
'Debug.Print "Unique Words:" & vbLf & Join(.Keys, vbLf)
End With
End Function