Split by spaces except inside double quotes

Time:09-23

I'm trying to get the following regex right: I need to split a whole shell command, represented as a String, on (possibly multiple) spaces. However, the splitting should happen everywhere except inside double quotes.

The closest I came to a solution is this: \s+(?![^\"]*[\"])

But this isn't matching the first two spaces in an example such as:

cao ciao "sad asd" cai

which should be split as:

  • cao
  • ciao
  • sad asd
  • cai

What am I missing?

PS: I'm writing a Kotlin wrapper for a shell API, and luckily the range of constructs and options is quite limited, so I think a regex is a nice fit.

PPS: I asked for a regex, whereas the suggested duplicate answer uses a StringTokenizer and a Matcher, which are different things.

Found! \s+(?=[^"]*(?:"[^"]*"[^"]*)*$)

Kudos to the regex101 community.

CodePudding user response:

It might be easier to match the tokens instead of the spaces between them. Instead of split, extract all matches of

("[^"]*"?|\S)+

I used ? so that a single " without a closing " causes everything till the end to be read as one token.
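As a rough illustration, here is how the match-the-tokens approach might look in Kotlin. This is only a sketch: `tokenize` is a made-up helper name, the pattern is assumed to carry a trailing `+` so that a whole token is matched at once, and stripping the quote characters afterwards is an extra step added here to get the output listed in the question.

```kotlin
// Sketch only: `tokenize` is a hypothetical helper, not part of any library.
fun tokenize(command: String): List<String> {
    // A token is a run of quoted chunks ("..." with an optional closing quote)
    // and single non-whitespace characters.
    val token = Regex("""("[^"]*"?|\S)+""")
    return token.findAll(command)
        // Assumption: the caller wants the quote characters themselves removed.
        .map { it.value.replace("\"", "") }
        .toList()
}
```

Calling `tokenize("cao ciao \"sad asd\" cai")` gives `[cao, ciao, sad asd, cai]`. An unterminated quote, as in `a "b c`, is read up to the end of the input as a single token.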

Warning: You mentioned shell scripts. If you want to parse a shell script using a regex, you will have a very hard time, to say the least. For example, consider the following constructs:

echo 'a b'
echo "a \" b"
echo $'a b'
echo a\ b
echo "$(echo "a ") b"
echo \"
cat << EOF
a b
EOF

You need an actual parser to safely process shell scripts.

CodePudding user response:

Found:

\s+(?=[^"]*(?:"[^"]*"[^"]*)*$)

You can try it here
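For anyone who wants to use this from Kotlin, a minimal sketch follows. It assumes the pattern is `\s+` followed by the lookahead, which only succeeds when an even number of double quotes remains up to the end of the input. `splitOutsideQuotes` is a hypothetical name, and trimming the input and removing the surrounding quotes are additions made here so the result matches the list in the question.

```kotlin
// Sketch only: `splitOutsideQuotes` is a hypothetical helper name.
fun splitOutsideQuotes(command: String): List<String> {
    // Split on whitespace runs that are followed by an even number of quotes,
    // i.e. whitespace that is not inside a double-quoted section.
    val separator = Regex("""\s+(?=[^"]*(?:"[^"]*"[^"]*)*$)""")
    return command.trim()                       // avoid empty leading/trailing parts
        .split(separator)
        .map { it.removeSurrounding("\"") }     // assumption: quotes are not wanted in the result
}
```

With the example input, `splitOutsideQuotes("cao ciao \"sad asd\" cai")` gives `[cao, ciao, sad asd, cai]`. Note that this relies on the quotes being balanced; with an odd number of quotes the lookahead fails for every space before the last quote.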
