How can I find lines matched by Regex A and perform replacement of Regex B? (Swift 5.5)

Time:02-20

Suppose that I have this text file (already loaded into a string named "strContent"):

Algebra 0 arujabura
Algebra 0 daishu
Kangaroo 0 daishu
Geometry 0 jiometori
Geometry 0 jihe
Physics 0 fijikusu
Physics 0 wuli

Regex A: ^.*Algebra.*\b(arujabura)\b.*$ (For private reasons, the 2nd column must be matched with a whole-word group \b()\b and nothing else.)

Regex B is a replacement pattern that searches only within the lines matched by Regex A above: find \b(arujabura)\b and replace it with アルジャブラ.

However, the following shortened example fails to match anything:

#!/usr/bin/env swift

import Foundation

extension String {
    /* https://stackoverflow.com/a/66189289/4162914 */
    func match(_ pattern: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpression.Options(rawValue: 0))
            let nsstr = self as NSString
            let all = NSRange(location: 0, length: nsstr.length)
            var matches : [String] = [String]()
            regex.enumerateMatches(in: self, options: [], range: all) {
                (result : NSTextCheckingResult?, _, _) in
                if let r = result {
                    let result = nsstr.substring(with: r.range) as String
                    matches.append(result)
                }
            }
            return matches
        } catch {
            return [String]()
        }
    }
}

func filterTone_Romaji(inputString: String) -> [String] {
    var arrResult = inputString.match("^.*Algebra.*\\b(arujabura)\\b.*$")
    arrResult.append(contentsOf: inputString.match("^.*Geometry.*\\b(jiometori)\\b.*$"))
    arrResult.append(contentsOf: inputString.match("^.*Physics.*\\b(Fijikusu)\\b.*$"))
    return arrResult
}

let str_Test_File = "Algebra 0 arujabura\nAlgebra 0 daishu\nKangaroo 0 daishu\nGeometry 0 jiometori\nGeometry 0 jihe\nPhysics 0 fijikusu\nPhysics 0 wuli"


// Trying to find the matched result
var arrConvResultTest : [[String]] = [[]]

arrConvResultTest.append(["@# phrases-test-pragma-header.txt"])
var arrConv_Test_File = filterTone_Romaji(inputString: str_Test_File)

// Print Out the matched result
var varLineData = ""

for lineData in arrConvResultTest {
    varLineData = lineData.joined()
    print(varLineData)
}

What should I do next to make sure it matches?

P.S.: Regarding the replacement, I assume that I can use the following method:

extension String {
    /* https://stackoverflow.com/a/40993403/4162914 */
    mutating func regReplace(pattern: String, replaceWith: String = "") {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
            // NSRange counts UTF-16 code units, so use utf16.count rather than count.
            let range = NSRange(location: 0, length: utf16.count)
            self = regex.stringByReplacingMatches(in: self, options: [], range: range, withTemplate: replaceWith)
        } catch { return }
    }
}

Update

I tried one of the solutions given in the answers. However, it looks like the range restriction has no effect; otherwise, the Kangaroo line would not be changed:

#!/usr/bin/env swift

import Foundation

extension String {
    mutating func inlineRep(regLineMatch: String, replaceOf: String, replaceWith: String) {
        let rangeWithinLine = self.range(of: regLineMatch, options: .regularExpression)
        let result = self.replacingOccurrences(of: replaceOf,
                                with: replaceWith,
                                options: .regularExpression,
                                range: rangeWithinLine)
        self = result
    }
}

var strClusterTest = "Algebra 0 arujabura\nAlgebra 0 daishu\nKangaroo 0 daishu\nGeometry 0 jiometori\nGeometry 0 jihe\nPhysics 0 fijikusu\nPhysics 0 wuli"


strClusterTest.inlineRep(regLineMatch: #"\n^.*Algebra.*\b(daishu)\b.*$\n"#, replaceOf: #"\b(daishu)\b"#, replaceWith: "代數")
print(strClusterTest)

I added a loop to limit the edit range to the current line, but the regex range still does not work:

#!/usr/bin/env swift

import Foundation

extension String {
    mutating func inlineRep(regLineMatch: String, replaceOf: String, replaceWith: String) {
        let rangeWithinLine = self.range(of: regLineMatch, options: .regularExpression)
        let result = self.replacingOccurrences(of: replaceOf,
                                with: replaceWith,
                                options: .regularExpression,
                                range: rangeWithinLine)
        self = result
    }
}

var strClusterTest = "Algebra 0 arujabura\nAlgebra 0 daishu\nKangaroo 0 daishu\nGeometry 0 jiometori\nGeometry 0 jihe\nPhysics 0 fijikusu\nPhysics 0 wuli"

var arrClusterTest = strClusterTest.components(separatedBy: "\n")
var strClusterOutputTest = ""

for line in arrClusterTest {
    var currentLine = line
    currentLine.inlineRep(regLineMatch: #"^.*Algebra[^\n]\b(daishu)\b.*$"#, replaceOf: #"\b(daishu)\b"#, replaceWith: "代數")
    strClusterOutputTest += currentLine
    strClusterOutputTest += "\n"
}

print(strClusterOutputTest)

Answer:

You can use range(of:options:) to find the range matched by the line regex

let range = str_Test_File.range(of: regex, options: .regularExpression)

and then replacingOccurrences(of:with:options:range:) to perform the replacement within that range

replacingOccurrences(of: #"\barujabura\b"#, with: replacement, options: .regularExpression, range: range)

So the full code would be

let str_Test_File = "Algebra 0 arujabura\nAlgebra 0 daishu\nGeometry 0 jiometori\nGeometry 0 jihe\nPhysics 0 fijikusu\nPhysics 0 wuli"

let regex = #"^.*Algebra.*\b(arujabura)\b.*$"#
let replacement = "アルジャブラ"
let range = str_Test_File.range(of: regex, options: .regularExpression)

let result = str_Test_File.replacingOccurrences(of: #"\barujabura\b"#,
                                                with: replacement,
                                                options: .regularExpression,
                                                range: range)

I am not quite sure why this needs to be a two-step process, so as an alternative you could perhaps do it directly:

let result = str_Test_File.replacingOccurrences(of: #"\barujabura\b"#,
                                                with: replacement,
                                                options: .regularExpression)

At least for the given test string the end result is the same between the two solutions.
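
One likely reason the two results coincide: without a multiline option, ^ and $ anchor only to the start and end of the whole string, so the line regex finds nothing, range is nil, and replacingOccurrences falls back to searching the entire string. Here is a minimal sketch of that behaviour, assuming it is acceptable to put the inline ICU flag (?m) in the pattern. The names text, lineRegex and multilineRegex are made up for illustration and are not from the question.

```swift
import Foundation

let text = "Algebra 0 arujabura\nAlgebra 0 daishu\nKangaroo 0 daishu"

// Without a multiline flag, ^ and $ anchor to the whole string and
// . does not cross "\n", so this pattern finds nothing here.
let lineRegex = #"^.*Algebra.*\b(daishu)\b.*$"#
let plainRange = text.range(of: lineRegex, options: .regularExpression)

// (?m) makes ^ and $ match at every line boundary.
let multilineRegex = "(?m)" + lineRegex
let lineRange = text.range(of: multilineRegex, options: .regularExpression)

// A nil range means "search the whole string", so the Kangaroo line changes too.
let unrestricted = text.replacingOccurrences(of: #"\bdaishu\b"#,
                                             with: "代數",
                                             options: .regularExpression,
                                             range: plainRange)

// With a non-nil range, only the first matched Algebra line is searched.
let restricted = text.replacingOccurrences(of: #"\bdaishu\b"#,
                                           with: "代數",
                                           options: .regularExpression,
                                           range: lineRange)
print(unrestricted)
print(restricted)
```

If you build the regex with NSRegularExpression instead, the .anchorsMatchLines option has the same effect as (?m).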


Update

If you want to loop through and replace all matches individually, I would use a while loop like this:

var searchRange = strClusterTest.startIndex..<strClusterTest.endIndex
while let range = strClusterTest.range(of: regLineMatch,
                                       options: .regularExpression,
                                       range: searchRange) {

    strClusterTest = strClusterTest.replacingOccurrences(of: replaceOf,
                                                         with: replaceWith,
                                                         options: .regularExpression,
                                                         range: range)
    searchRange = range.upperBound..<strClusterTest.endIndex
}

The searchRange variable restricts each search to the part of the string after the last match, and the loop exits when range(of:) returns nil. Also, I believe the search regex can be changed to

let regLineMatch = #"\n?.*Algebra.*\b(daishu)\b.*\n"#
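
As an alternative to tracking a search range, you could split the text into lines and test each line on its own. Here is a rough sketch under two assumptions: the lines are separated by "\n", and the replacement text contains no template characters such as $ or \. The helper replaceInMatchingLines and its parameter names are hypothetical and not part of Foundation.

```swift
import Foundation

/// Hypothetical helper: applies the `target` regex replacement only on
/// lines that match `linePattern`; all other lines are returned unchanged.
func replaceInMatchingLines(of text: String,
                            linePattern: String,
                            target: String,
                            replacement: String) -> String {
    let lines = text.components(separatedBy: "\n")
    let edited = lines.map { line -> String in
        // Each line is tested alone, so ^ and $ need no multiline flag.
        guard line.range(of: linePattern, options: .regularExpression) != nil else {
            return line
        }
        return line.replacingOccurrences(of: target,
                                         with: replacement,
                                         options: .regularExpression)
    }
    return edited.joined(separator: "\n")
}

let sample = "Algebra 0 arujabura\nAlgebra 0 daishu\nKangaroo 0 daishu"
let output = replaceInMatchingLines(of: sample,
                                    linePattern: #"^.*Algebra.*\b(daishu)\b.*$"#,
                                    target: #"\b(daishu)\b"#,
                                    replacement: "代數")
print(output)
```

This way no String.Index is reused after the string has been modified, and the Kangaroo line is left alone because it never matches the line pattern.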