I need to split a chunk of text on the + symbol, but only when it's outside of single quotes. The text will look something like:
Some.data:'some value'+some.more.data:9+yet.more.data:'rock+roll'
which should become a slice of three values:
- Some.data:'some value'
- some.more.data:9
- yet.more.data:'rock+roll'
I've found similar questions that solve this with regex, but those solutions require lookahead, which Go's regex engine doesn't support.
I also took a crack at creating my own regex without lookahead:
'.*?'(\+)|[^']*(\+)
But that seems to fall apart on the third item, where it splits on the + in 'rock+roll'.
I've thought about doing a string split on + and then validating each piece to make sure it's not a partial expression, stitching the pieces back together if it is, but that would be fairly involved and I'd like to avoid it if possible.
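For reference, the split-and-stitch idea can be sketched in a few lines. This is only an illustration under the assumption that quotes are never escaped or nested, so a piece with an odd number of single quotes must be an unfinished quoted section; the name splitAndStitch is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAndStitch splits s on every '+' and then rejoins neighbouring pieces
// for as long as the rebuilt piece holds an odd number of single quotes,
// which is taken to mean a quoted section is still open.
func splitAndStitch(s string) []string {
	var result []string
	pending := "" // piece currently being rebuilt
	open := false // true while pending has an unclosed quote
	for _, part := range strings.Split(s, "+") {
		if open {
			// The '+' that was split away belonged inside quotes.
			pending += "+" + part
		} else {
			pending = part
		}
		open = strings.Count(pending, "'")%2 == 1
		if !open {
			result = append(result, pending)
		}
	}
	if open {
		// Unbalanced quote: keep the remainder instead of dropping it.
		result = append(result, pending)
	}
	return result
}

func main() {
	in := "Some.data:'some value'+some.more.data:9+yet.more.data:'rock+roll'"
	for _, p := range splitAndStitch(in) {
		fmt.Println(p)
	}
}
```

Counting quotes per piece keeps the validation step small, but it is still two passes over the data.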
At the moment I think the best solution would be to identify text that is inside quotes (which I can easily do with regex), either URL encode that text or do something else with the plus sign, split the text, and then URL decode the result to get the + sign inside the quotes back, but I'm wondering if there is a better way.
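A rough sketch of that mask, split, and unmask idea is below. It swaps URL encoding for a simpler placeholder, because url.QueryEscape turns spaces into + and would add new separators. The placeholder byte (NUL) is an assumption that it never appears in real input, and the name splitWithPlaceholder is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// quoted matches one single-quoted run; escaped quotes are not handled.
var quoted = regexp.MustCompile(`'[^']*'`)

// placeholder is assumed never to occur in the input text.
const placeholder = "\x00"

func splitWithPlaceholder(s string) []string {
	// Step 1: hide every '+' that sits inside a quoted run.
	masked := quoted.ReplaceAllStringFunc(s, func(q string) string {
		return strings.ReplaceAll(q, "+", placeholder)
	})
	// Step 2: every remaining '+' is outside quotes, so a plain split is safe.
	parts := strings.Split(masked, "+")
	// Step 3: restore the hidden '+' characters.
	for i, p := range parts {
		parts[i] = strings.ReplaceAll(p, placeholder, "+")
	}
	return parts
}

func main() {
	in := "Some.data:'some value'+some.more.data:9+yet.more.data:'rock+roll'"
	for _, p := range splitWithPlaceholder(in) {
		fmt.Println(p)
	}
}
```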
Does anyone know of a way to split on a + sign that is outside of quotes using regex without lookahead? Can anyone think of a simpler solution than my URL encoding/decoding method?
CodePudding user response:
Plain code can be easier:
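func split(s string) []string {
	var result []string
	inquote := false // true while scanning between single quotes
	i := 0           // start index of the current piece
	for j, c := range s {
		if c == '\'' {
			inquote = !inquote
		} else if c == '+' && !inquote {
			// An unquoted '+' ends the current piece.
			result = append(result, s[i:j])
			i = j + 1
		}
	}
	// Whatever follows the last separator is the final piece.
	return append(result, s[i:])
}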
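As a quick check, here is a small self-contained program that runs this loop against the sample input from the question. The split function is repeated so the file compiles on its own; the main function and the variable name in are only illustrative.

```go
package main

import "fmt"

// split cuts s at every '+' that is not between single quotes.
func split(s string) []string {
	var result []string
	inquote := false
	i := 0
	for j, c := range s {
		if c == '\'' {
			inquote = !inquote
		} else if c == '+' && !inquote {
			result = append(result, s[i:j])
			i = j + 1
		}
	}
	return append(result, s[i:])
}

func main() {
	in := "Some.data:'some value'+some.more.data:9+yet.more.data:'rock+roll'"
	for _, p := range split(in) {
		fmt.Println(p)
	}
	// Output:
	// Some.data:'some value'
	// some.more.data:9
	// yet.more.data:'rock+roll'
}
```

Because range over a string yields byte offsets, slicing with s[i:j] stays correct even if the input contains multi-byte characters. An unbalanced quote simply leaves the rest of the string as one final piece.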