I need to match every : in a txt file, except when it is preceded by https, http or \, but VBA does not support lookbehind in regex.
With negative lookbehind it would be (?<!http)(?<!https)(?<!\\)\:
For some engines that don't support lookbehind it can be ([^https*][^\\])\K\:
Neither works in VBA: the first regex gives me an error (5017), and the second one matches none of the : characters, although the code does not throw any errors.
Based on "regEx positive lookbehind in VBA language" I tested this in a small example: myString = "BA", pattern = "[^B](A)" and then myString = rg.Replace(myString,"$1"). The expected result was "A" but the result obtained was "$1BA". What did I miss?
CodePudding user response:
The "trick" is to match what you don't want together with what you do want, capture only the part you do want, and return just the captured group. For example:
Sub regex()
    Dim RE As Object, MC As Object, M As Object
    Const sPat As String = "B(A)"
    Const myString As String = "BA"
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Pattern = sPat
        Set MC = .Execute(myString)
        ' SubMatches(0) is capture group 1, i.e. the A without the preceding B
        Debug.Print MC(0).SubMatches(0)
    End With
End Sub
This prints A in the Immediate Window.
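As for the small test in the question, a negated class such as [^B] is not a lookbehind: it still has to consume one character, and that character must not be B. The sketch below (the Sub name is made up for illustration) checks this with Test, assuming the same late-bound VBScript.RegExp object as above:

```vba
Sub WhyNegatedClassFails()
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "[^B](A)"
    ' In "BA" the only A is preceded by B, so [^B] has nothing to consume
    Debug.Print re.Test("BA")           ' False
    ' In "CA" the C satisfies [^B], and group 1 captures the A
    Debug.Print re.Test("CA")           ' True
    Debug.Print re.Replace("CA", "$1")  ' A
End Sub
```

Note that when the pattern does match, the character consumed by [^B] is part of the match and is removed by the replacement as well, which is why capturing and re-inserting it is usually needed.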
CodePudding user response:
You can use
' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim pattern As RegExp, m As Object
Dim text As String, result As String, repl As String, offset As Long
text = "http://www1 https://www2 \: : text:..."
repl = "_"
offset = 0
Set pattern = New RegExp
With pattern
    .pattern = "(https?:|\\:)|:"
    .Global = True
End With
result = text
For Each m In pattern.Execute(text)
    If Len(m.SubMatches(0)) = 0 Then ' Group 1 did not match, so this is a bare colon: replace it
        ' FirstIndex is 0-based; offset corrects for replacements already made
        result = Left(result, m.FirstIndex + offset) & repl & Mid(result, m.FirstIndex + m.Length + 1 + offset)
        offset = offset + Len(repl) - m.Length
    End If
Next
Output for http://www1 https://www2 \: : text:... is http://www1 https://www2 \: _ text_...
The point is to match and capture https:, http: or \: with the (https?:|\\:) capturing group, and to replace a match only when that group is empty, i.e. when the second alternative matched a bare colon. The offset tracks how the string length changes as replacements are made, which matters when the replacement has a different length from the match.
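The same idea can also be written without offset bookkeeping, by building the result from left to right instead of editing the string in place. This is only a sketch: the function name ReplaceBareColons and its parameters are hypothetical, and it uses late binding so no library reference is assumed.

```vba
' Hypothetical helper: replaces every colon that is not part of
' "http:", "https:" or "\:" with repl.
Function ReplaceBareColons(ByVal text As String, ByVal repl As String) As String
    Dim re As Object, m As Object
    Dim result As String, pos As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "(https?:|\\:)|:"
    re.Global = True

    pos = 1  ' next 1-based position in text that has not been copied yet
    For Each m In re.Execute(text)
        ' Copy the unchanged text between the previous match and this one
        result = result & Mid$(text, pos, m.FirstIndex + 1 - pos)
        If Len(m.SubMatches(0)) = 0 Then
            result = result & repl      ' bare colon: substitute
        Else
            result = result & m.Value   ' protected colon: keep as is
        End If
        pos = m.FirstIndex + m.Length + 1
    Next
    ' Append whatever follows the last match
    ReplaceBareColons = result & Mid$(text, pos)
End Function
```

Calling Debug.Print ReplaceBareColons("http://www1 https://www2 \: : text:...", "_") should give the same output as the loop above. Because the original string is never modified, the match positions stay valid and no offset is needed.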