How to split string in Go based on certain prefix and suffix?

Time:10-06

Let's say I have this big string:

13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a

I want it to be split into an array, with 1324 as the prefix and 0d0a as the suffix of each element. The result should be an array of 3 elements:

arr[0] = 13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a

arr[1] = 13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a

arr[2] = 1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a

Here's my code:

package main

import (
    "fmt"
    "regexp"
)

func main() {

    var testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"

    re := regexp.MustCompile("^1324[0-9a-zA-Z]*0d0a")

    matches := re.FindAllString(testData, -1)

    for _, m := range matches {
        fmt.Printf("%s\n", m)
    }
}

It simply prints the same entire string, which very likely means my regex is wrong. What's the proper form?

CodePudding user response:

Your regex has two issues. First, the caret (^) anchors the match to the beginning of the string, so by definition you will get at most one result. Second, * is a greedy quantifier, meaning it matches as many characters from the preceding character class as it can. In effect, the match extends to the last 0d0a in the string rather than the first. What you want is a reluctant (lazy) quantifier, *?, which matches as few characters as possible while still satisfying the regex.

Putting it together, your regex string should be "1324[0-9a-zA-Z]*?0d0a". I tested it in the Go playground and it seems to produce the results you want: https://go.dev/play/p/qolk3vHNxKT
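To make the difference concrete, here is a minimal sketch that runs both patterns against the data from the question. The names testData, greedyRe and lazyRe are only illustrative, and the input is written as three concatenated pieces purely for readability.

```go
package main

import (
    "fmt"
    "regexp"
)

// The input from the question, written as its three expected elements.
const testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a" +
    "13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a" +
    "1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"

// Pattern from the question: anchored with ^ and using a greedy *.
var greedyRe = regexp.MustCompile(`^1324[0-9a-zA-Z]*0d0a`)

// Corrected pattern: no anchor, lazy *? so each match stops at the first 0d0a.
var lazyRe = regexp.MustCompile(`1324[0-9a-zA-Z]*?0d0a`)

func main() {
    // The anchored greedy pattern yields a single match spanning the whole input.
    fmt.Println("greedy matches:", len(greedyRe.FindAllString(testData, -1)))

    // The lazy pattern yields one match per prefix/suffix pair.
    for i, m := range lazyRe.FindAllString(testData, -1) {
        fmt.Printf("arr[%d] = %s\n", i, m)
    }
}
```

One caveat: the lazy pattern stops at the first 0d0a after each 1324, so it relies on 0d0a not appearing inside an element. That holds for this sample, but it is an assumption about the data in general.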

CodePudding user response:

It will be much simpler to use strings.Split on the keyword 1324 and then prepend it to each entry.

The result is a slice of strings, split at each occurrence of the delimiter. Because the input starts with the delimiter, the first element is an empty string. Iterate over the slice once and prepend the delimiter to each non-empty entry to get the desired result.

package main

import (
    "fmt"
    "strings"
)

func main() {
    var output []string
    var testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"
    // The data starts with "1324", so the first element of results is empty.
    results := strings.Split(testData, "1324")
    for _, r := range results {
        if len(r) > 0 {
            // Restore the prefix that Split consumed as the delimiter.
            output = append(output, "1324"+r)
        }
    }
    // Print each reconstructed element.
    for i, s := range output {
        fmt.Printf("arr[%d] = %s\n", i, s)
    }
}
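Splitting on the prefix alone never looks at the 0d0a suffix. It also assumes that 1324 does not occur inside an element. As a rough sketch, the same idea can be wrapped in a function that also checks the suffix; splitByPrefix and its parameters are hypothetical names, not part of any library.

```go
package main

import (
    "fmt"
    "strings"
)

// The input from the question, written as its three expected elements.
const testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a" +
    "13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a" +
    "1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"

// splitByPrefix (hypothetical helper) splits data on prefix, restores the
// prefix on each piece, and separates pieces that end in suffix from those
// that do not.
func splitByPrefix(data, prefix, suffix string) (frames, rejected []string) {
    for _, part := range strings.Split(data, prefix) {
        if part == "" {
            continue // empty piece produced by a leading delimiter
        }
        frame := prefix + part
        if strings.HasSuffix(frame, suffix) {
            frames = append(frames, frame)
        } else {
            rejected = append(rejected, frame)
        }
    }
    return frames, rejected
}

func main() {
    frames, rejected := splitByPrefix(testData, "1324", "0d0a")
    for i, f := range frames {
        fmt.Printf("arr[%d] = %s\n", i, f)
    }
    fmt.Println("pieces without the expected suffix:", len(rejected))
}
```

If 1324 can legitimately appear in the middle of an element, this approach would cut that element in two, and the rejected slice is one way to notice it.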

Note that on my M1 MacBook Pro, the Split() example performed far better than the regex option when run with Go's benchmarks.
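For anyone who wants to repeat the comparison on their own machine, here is one possible way to set it up. It is a sketch, not the benchmark used above: splitRegex and splitPrefix are illustrative names, and it calls testing.Benchmark from a regular program rather than using a _test.go file. No timings are shown because they depend on the hardware and Go version.

```go
package main

import (
    "fmt"
    "regexp"
    "strings"
    "testing"
)

// The input from the question, written as its three expected elements.
const testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a" +
    "13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a" +
    "1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"

// Compiled once, outside the benchmark loop.
var frameRe = regexp.MustCompile(`1324[0-9a-zA-Z]*?0d0a`)

// splitRegex extracts elements with the lazy regex.
func splitRegex(data string) []string {
    return frameRe.FindAllString(data, -1)
}

// splitPrefix extracts elements by splitting on the prefix and restoring it.
func splitPrefix(data string) []string {
    var out []string
    for _, part := range strings.Split(data, "1324") {
        if part != "" {
            out = append(out, "1324"+part)
        }
    }
    return out
}

func main() {
    // Init registers the testing flags when not running under "go test".
    testing.Init()

    regexResult := testing.Benchmark(func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            splitRegex(testData)
        }
    })
    splitResult := testing.Benchmark(func(b *testing.B) {
        for i := 0; i < b.N; i++ {
            splitPrefix(testData)
        }
    })

    fmt.Println("regex:", regexResult)
    fmt.Println("split:", splitResult)
}
```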
