Home > OS >  How to write a regular expression that includes numbers [0-9] and a defined word?
How to write a regular expression that includes numbers [0-9] and a defined word?

Time:10-12

I'm using VBA and struggling to make a regex.replace function to clean my string cells

Example: "Foo World 4563" What I want: "World"

by replacing the numbers and the word "Foo"

Another example: "Hello World 435 Foo", I want "Hello World"

This is what my code looks like so far:

Public Function Replacement(sInput) As String

Dim regex As New RegExp

With regex
   .Global = True
   .IgnoreCase = True
End With

regex.Pattern = "[0-9,()/-] \bfoo\b"
Replacement = regex.Replace(sInput, "")

End Function

CodePudding user response:

You can use

Function Replacement(sInput) As String

Dim regex As New regExp

With regex
   .Global = True
   .IgnoreCase = True
End With

regex.Pattern = "\s*(?:\bfoo\b|\d )"
Replacement = Trim(regex.Replace(sInput, ""))

End Function

See the enter image description here

Details:

  • \s* - zero or more whitespaces
  • (?:\bfoo\b|\d ) - either a whole word foo or one or more digits.

Note the use of Trim(), it is necessary to remove leading/trailing spaces that may remain after the replacement.

CodePudding user response:

My two cents, capturing preceding whitespace chars when present trying to prevent possible false positives:

(^|\s )(?:foo|\d )(?=\s |$)

See an online demo.


  • (^|\s ) - 1st Capture group to assert position is preceded by whitespace or directly at start of string;
  • (?:foo|\d ) - Non-capture group with the alternation between digits or 'foo';
  • (?=\s |$) - Positive lookahead to assert position is followed by whitespace or end-line anchor.

Sub Test()

Dim arr As Variant: arr = Array("Foo World 4563", "Hello World 435 Foo", "There is a 99% chance of false positives which is foo-bar!")

For Each el In arr
    Debug.Print Replacement(el)
Next

End Sub

Public Function Replacement(sInput) As String

With CreateObject("vbscript.regexp")
   .Global = True
   .IgnoreCase = True
   .Pattern = "(^|\s )(?:foo|\d )(?=\s |$)"
   Replacement = Application.Trim(.Replace(sInput, "$1"))
End With

End Function

Print:

World
Hello World
There is a 99% chance of false positives which is foo-bar!

Here Application.Trim() does take care of multiple whitespace chars left inside your string.

  • Related