Find the exact digits in a Word Document

Time:12-30

I am trying to find specific digits in a Microsoft Word document that contains both text and digits, but I have no idea how to do it. I want to use VBA for this.

For example, the text in the document is as follows:

(1) 52.203-19, This is a some text here (2) 52.204-23, Quick brown fox jumped over the lazy dog 52 times. (3) 52.204-25, I tried to search for a solution 52.204 times. (4) 52.2, Could not find any luck though (5) 52.203, this is blowing my mind away with mac 2.36

Now I want to find the exact digits "52.2" as a whole, without matching instances where 52.2 is part of another number such as 52.203 or 52.204.

Also, when I search for 52.203, I want to exclude all instances like 52.203-xx, where xx can be any two-digit number.

In short, I want to find the exact number only as a whole and not inside other numbers, much like Excel's EXACT function.

What should I do? Should I use RegEx, or Word's Advanced Find with wildcards through VBA? Here is what I have tried so far, without any luck:

Selection.Find.ClearFormatting
With Selection.Find
    .Text = "52.2"
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindAsk
    .Format = False
    .MatchCase = False
    .MatchWholeWord = True
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
Selection.Find.Execute

but this finds all instances which I don't want.

CodePudding user response:

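Regular expressions seem like the way to go for this.

First, go to Tools > References in the VBA editor and make sure that there is a check next to the Microsoft VBScript Regular Expressions 5.5 library. (Because the code below creates the object with CreateObject, i.e. late binding, it should also run without this reference.)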

The following code worked for me on your sample text to remove only the '52.2' after the '(4)' without affecting any of the surrounding characters:

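Sub removeNumber()

    ' Late-bound regex object
    Dim regExp As Object
    Set regExp = CreateObject("vbscript.regexp")

    With regExp
        ' "\." matches a literal dot; a bare "." would match any character
        .Pattern = "\b52\.2\b"
        .Global = True
        ' Replace every match in the current selection with nothing
        Selection.Text = .Replace(Selection.Text, "")
    End With

End Sub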

\b means word boundary, so the pattern will not match when a digit comes directly before or after the '52.2'.
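One caveat for the second case in the question: a hyphen is not a word character, so \b on its own still sees a boundary between "52.203" and "-19". A minimal sketch of one way around this, assuming the VBScript regex engine's negative lookahead is acceptable here, is shown below. The names CountExactNumber and DemoCountExactNumber are hypothetical helpers made up for illustration, and the sample string is a shortened version of the text in the question.

```vba
' Hypothetical helper (not from the original answer): counts how often
' `target` appears in `source` as a standalone number.
Function CountExactNumber(ByVal source As String, ByVal target As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no reference needed

    With re
        .Global = True
        ' Replace(...) escapes each dot so it matches a literal "."
        ' \b ... \b   keeps the match from starting or ending inside a longer number
        ' (?![-.]\d)  rejects continuations such as "-19" or ".5"
        .Pattern = "\b" & Replace(target, ".", "\.") & "\b(?![-.]\d)"
        CountExactNumber = .Execute(source).Count
    End With
End Function

Sub DemoCountExactNumber()
    Dim sample As String
    sample = "(1) 52.203-19, text (4) 52.2, text (5) 52.203, text 2.36"

    ' Each call is meant to count only the standalone occurrence
    Debug.Print CountExactNumber(sample, "52.2")
    Debug.Print CountExactNumber(sample, "52.203")
End Sub
```

To run it on the document instead of a literal string, pass Selection.Text or ActiveDocument.Content.Text as the first argument. The lookahead only guards the right-hand side; the VBScript engine has no lookbehind, so a number preceded by another "digits-dot" group (for example "1.52.2") would need an extra check.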

CodePudding user response:

No need for RegEx. You can use Find with wildcards.
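The explanation that accompanied this answer is not available, so the exact wildcard pattern it proposed is unknown. As a rough sketch of a regex-free alternative, the macro below deliberately does not use wildcards: it runs a plain Find for the number and then checks the neighbouring characters in VBA. The names SelectFirstExactNumber and CharsAt are hypothetical, and the rule for what counts as "part of another number" (a digit or dot before, a digit after, or a hyphen or dot followed by a digit) is an assumption based on the examples in the question.

```vba
' Hypothetical helper: returns up to n characters of the active document
' starting at position pos, or "" if pos is outside the document.
Private Function CharsAt(ByVal pos As Long, ByVal n As Long) As String
    Dim lastPos As Long
    lastPos = ActiveDocument.Content.End
    If pos < 0 Or pos >= lastPos Then Exit Function
    If pos + n > lastPos Then n = lastPos - pos
    CharsAt = ActiveDocument.Range(pos, pos + n).Text
End Function

Sub SelectFirstExactNumber()
    Const TARGET As String = "52.2"   ' number to look for (example value)
    Dim rng As Range
    Dim prevCh As String, nextChs As String

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = TARGET
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = False

        Do While .Execute
            prevCh = CharsAt(rng.Start - 1, 1)   ' character before the hit
            nextChs = CharsAt(rng.End, 2)        ' two characters after the hit

            ' Accept the hit only if it is not embedded in a longer number
            If Not (prevCh Like "[0-9.]") _
               And Not (nextChs Like "[0-9]*") _
               And Not (nextChs Like "[-.][0-9]") Then
                rng.Select
                Exit Sub
            End If

            rng.Collapse wdCollapseEnd           ' continue after this hit
        Loop
    End With

    MsgBox "No standalone occurrence of " & TARGET & " was found."
End Sub
```

The check only recognises a plain hyphen; if Word has auto-formatted the separator into a non-breaking hyphen or a dash, the Like patterns would need to be extended accordingly.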
