I'm using VBA and struggling to make a regex.replace function to clean my string cells
Example: "Foo World 4563" What I want: "World"
by replacing the numbers and the word "Foo"
Another example: "Hello World 435 Foo", I want "Hello World"
This is what my code looks like so far:
Public Function Replacement(sInput) As String
Dim regex As New RegExp
With regex
.Global = True
.IgnoreCase = True
End With
regex.Pattern = "[0-9,()/-] \bfoo\b"
Replacement = regex.Replace(sInput, "")
End Function
CodePudding user response:
You can use
Function Replacement(sInput) As String
Dim regex As New regExp
With regex
.Global = True
.IgnoreCase = True
End With
regex.Pattern = "\s*(?:\bfoo\b|\d )"
Replacement = Trim(regex.Replace(sInput, ""))
End Function
Details:
\s*
- zero or more whitespaces(?:\bfoo\b|\d )
- either a whole wordfoo
or one or more digits.
Note the use of Trim()
, it is necessary to remove leading/trailing spaces that may remain after the replacement.
CodePudding user response:
My two cents, capturing preceding whitespace chars when present trying to prevent possible false positives:
(^|\s )(?:foo|\d )(?=\s |$)
See an online demo.
(^|\s )
- 1st Capture group to assert position is preceded by whitespace or directly at start of string;(?:foo|\d )
- Non-capture group with the alternation between digits or 'foo';(?=\s |$)
- Positive lookahead to assert position is followed by whitespace or end-line anchor.
Sub Test()
Dim arr As Variant: arr = Array("Foo World 4563", "Hello World 435 Foo", "There is a 99% chance of false positives which is foo-bar!")
For Each el In arr
Debug.Print Replacement(el)
Next
End Sub
Public Function Replacement(sInput) As String
With CreateObject("vbscript.regexp")
.Global = True
.IgnoreCase = True
.Pattern = "(^|\s )(?:foo|\d )(?=\s |$)"
Replacement = Application.Trim(.Replace(sInput, "$1"))
End With
End Function
Print:
World
Hello World
There is a 99% chance of false positives which is foo-bar!
Here Application.Trim()
does take care of multiple whitespace chars left inside your string.