Home > database >  Replacing submatches
Replacing submatches

Time:10-28

I have a list of values in Excel, looking something like this:

J0-315 Sirkulasjonspumpe So2 
J0-318 Sirkulasjonspumpe So2 
Sirkulasjonspumpe J0-321
Sirkulasjonspumpe So2 J0-324
Sirkulasjonspumpe So2 J0-327
Sirkulasjonspumpe So2 J0-330
S1-179 Spjeld Ut Fra Pumpe 
S1-187 Spjeld Ut Fra Pumpe 
Spjeld Ut Fra Pumpe S1-195
Spjeld Ut Fra Pumpe S1-203
Spjeld Ut Fra Pumpe S1-211
Spjeld Ut Fra Pumpe S1-219
Ventil Inn Y227
S1-181 Spjeld Ut Fra Pumpe 
S1-189 Spjeld Ut Fra Pumpe 
Spjeld Ut Fra Pumpe S1-197
Spjeld Ut Fra Pumpe S1-205
Spjeld Ut Fra Pumpe S1-213
Spjeld Ut Fra Pumpe S1-221
Ventil Mot Tankfarm Y234
Ventil Ut Y225

Now, I want to have each word except the first one of each line changed to lowercase, unless it is a machine number (e.g. J0-315, Y225). Ideally I'd have chemical compounds (SO2) changed to uppercase as well, but there are few enough instances of this that I'll just do that by hand afterwards :-p

Now, I've come up with a macro which prints out all the values I want to have changed, but I have trouble figuring out how to just change the submatches to lowercase, and some cursory googling fails to give me any insights as well.

Can someone here give me a clue on how to solve this problem?

The code I have come up with so far can be found below:

Sub dlgho()
    Dim c As Range
    Dim regex As New RegExp
    Dim matches As MatchCollection
    Dim m As Match
    Dim i As Long
    
    With regex
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "([A-ZÆØÅ])[^0-9\-]"
    End With
    
    For Each c In Sheet8.Range("A1:A896")
        If regex.Test(c) Then
            Set matches = regex.Execute(c)
            If matches.Count > 1 Then
                For i = 1 To matches.Count - 1
                    Set m = matches(i)
                    Debug.Print m
                    Debug.Print m.SubMatches(0)
                Next i
            End If
        End If
    Next c
End Sub

To be clear, I want the characters printed in the second print-statement to be changed to lowercase, but I am uncertain about how to go about this.

The expected output would be

J0-315 sirkulasjonspumpe so2 
J0-318 sirkulasjonspumpe so2 
Sirkulasjonspumpe J0-321
Sirkulasjonspumpe so2 J0-324
Sirkulasjonspumpe so2 J0-327
Sirkulasjonspumpe so2 J0-330
S1-179 spjeld ut fra pumpe 
S1-187 spjeld ut fra pumpe 
Spjeld ut fra pumpe S1-195
Spjeld ut fra pumpe S1-203
Spjeld ut fra pumpe S1-211
Spjeld ut fra pumpe S1-219
Ventil inn Y227
S1-181 spjeld ut fra pumpe 
S1-189 spjeld ut fra pumpe 
Spjeld ut fra pumpe S1-197
Spjeld ut fra pumpe S1-205
Spjeld ut fra pumpe S1-213
Spjeld ut fra pumpe S1-221
Ventil mot tankfarm Y234
Ventil ut Y225

CodePudding user response:

Non-regex method, editing cell 1 by 1 is also slow and inefficient so a faster way is to transfer the values of the range to an array and process the array instead.

This will also make SO2 as uppercase.

Full code below:

Sub dlgho()
    Const whiteList As String = "SO2"
    Const inputRange As String = "A1:A896"
    
    Dim inputArr As Variant
    inputArr = Sheet1.Range(inputRange).Value
    
    Dim wordArr() As String
    
    Dim i As Long
    Dim n As Long
    
    For n = 1 To UBound(inputArr)
                
        wordArr = Split(inputArr(n, 1), " ")
       
        For i = 1 To UBound(wordArr)
            If InStr(wordArr(i), "-") = 0 And _
                Not IsNumeric(Mid$(wordArr(i), 2)) Then
                
                wordArr(i) = LCase(wordArr(i))
            End If
            
            If UCase(wordArr(i)) = whiteList Then wordArr(i) = UCase(wordArr(i))
        Next i
        inputArr(n, 1) = Join(wordArr, " ")
    Next n
    
    Sheet1.Range(inputRange).Value = inputArr
End Sub

CodePudding user response:

If the common identifier for machine-numbers is that they end in three digits, you could use native functionality too:

enter image description here

Formula in B1:

=LET(X,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s"),Y,TEXTJOIN(" ",,IF(ISNUMBER(--RIGHT(X,3)),X,LOWER(X))),UPPER(LEFT(Y))&MID(Y,2,LEN(Y)))

Or:

=LET(X,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s"),Y,SEQUENCE(COUNTA(X)),TEXTJOIN(" ",,IF(Y>1,IF(ISNUMBER(--RIGHT(X,3)),X,LOWER(X)),X)))
  • Related