I have a spreadsheet that lists all permutations of 5 columns of data into a single column of text (Column X aka 24) and my goal is to extract only actual words from that list into its own column (Column Y aka 25). The first part is not performed with VBA and happens almost instantaneously, but the spell check extracting the actual words takes over an hour to complete (I've had to stop it it after 10 minutes and not even 10% of the way through). Is there a better way to do this?
My lists start on row 6 (n = 6) and Range("V3") is just the number of permutations (in this case, 83,521).
Sub Permute_and_Extract()
n = 6
Range("X7:X1000000").ClearContents
Range("Y6:Y1000000").ClearContents
Max = Range("V3") 5
Range("X6").Select
Selection.AutoFill Destination:=Range("X6:X" & Max)
For i = 6 To Max
x = Application.CheckSpelling(Cells(i, 24).Text)
If x = True Then
Cells(n, 25) = Cells(i, 24)
n = n 1
End If
Next i
End Sub
CodePudding user response:
Following from the comments above:
Sub Permute_and_Extract()
Const RNG As String = "F1:F10000"
Dim wlist As Object, t, c As Range, i As Long, arr, res
Dim rngTest As Range
Set rngTest = ActiveSheet.Range(RNG)
t = Timer
Set wlist = WordsList("C:\Temp\words.txt", 5)
Debug.Print "loaded list", Timer - t
Debug.Print wlist.Count, "words"
'using an array approach...
t = Timer
arr = rngTest.Value
For i = 1 To UBound(arr, 1)
res = wlist.exists(arr(i, 1))
Next i
Debug.Print "Array check", Timer - t
'going cell-by-cell...
t = Timer
For Each c In rngTest.Cells
res = wlist.exists(c.Value)
Next c
Debug.Print "Cell by cell", Timer - t
End Sub
'return a dictionary of words of length `wordLen` from file at `fPath`
Function WordsList(fPath As String, wordLen As Long) As Object
Dim dict As Object, s As String
Set dict = CreateObject("scripting.dictionary")
dict.comparemode = vbTextCompare 'case-insensitive !!!
With CreateObject("scripting.filesystemobject").opentextfile(fPath)
Do While Not .AtEndOfStream
s = .readline()
If Len(s) = wordLen Then dict.Add s, True
Loop
.Close
End With
Set WordsList = dict
End Function
Output:
loaded list 0.359375
8938 words
Array check 0.019
Cell by cell 0.030