With shell, how to extract (to separate variables) values that are surrounded by "=" and s

Time:11-04

For example, I have a string /something an-arg=some-value another-arg=another-value.

What would be the most straightforward way to extract an-arg's value to a variable and another-arg's value to another variable?

To better exemplify, this is what I need to happen:

STRING="/something an-arg=some-value another-arg=another-value"
AN_ARG=... # <-- do some magic here to extract an-arg's value
ANOTHER_ARG=... # <-- do some magic here to extract another-arg's value
echo $AN_ARG # should print `some-value`
echo $ANOTHER_ARG # should print `another-value`

I was looking for a simple, straightforward way to do this. I tried:

ARG_NAME="an-arg="
AN_ARG=${STRING#*$ARG_NAME}

The problem with this solution is that it keeps everything that comes after an-arg=, including the second argument's name and its value, e.g. some-value another-arg=another-value.

CodePudding user response:

Letting data set arbitrary variables incurs substantial security risks. You should either prefix your generated variables (with a prefix having at least one lower-case character to keep the generated variables in the namespace POSIX reserves for application use), or put them in an associative array; the first example below does the latter.


Generating An Associative Array

As you can see at https://ideone.com/cKcMSM --

#!/usr/bin/env bash
#              ^^^^- specifically, bash 4.0 or newer; NOT /bin/sh

declare -A vars=( )

re='^([^=]* +)?([[:alpha:]_-][[:alnum:]_-]+)=([^[:space:]]+)( +(.*))?$'
string="/something an-arg=some-value another-arg=another-value third-arg=three"
while [[ $string =~ $re ]]; do : "${BASH_REMATCH[@]}"
  string=${BASH_REMATCH[5]}
  vars[${BASH_REMATCH[2]}]=${BASH_REMATCH[3]}
done

declare -p vars # print the variables we extracted

...correctly emits:

declare -A vars=([another-arg]="another-value" [an-arg]="some-value" [third-arg]="three" )

...so you can refer to ${vars[an-arg]}, ${vars[another-arg]} or ${vars[third-arg]}.
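
As a small usage sketch (the loop, the printf format, and the missing-arg key are illustrative and not part of the original answer), the array can be read by key, enumerated with ${!vars[@]}, and tested for a key's presence, so no foreknowledge of the argument names is needed:

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer for associative arrays.
# The array is filled in by hand here to stand in for the parsing loop above.
declare -A vars=( [an-arg]="some-value" [another-arg]="another-value" [third-arg]="three" )

# Direct lookup by the original (hyphenated) argument name.
an_arg=${vars[an-arg]}
echo "an-arg is: $an_arg"

# Enumerate everything that was parsed; the key order is unspecified.
for key in "${!vars[@]}"; do
  printf '%s -> %s\n' "$key" "${vars[$key]}"
done

# Test whether a key exists at all, independent of its value being empty.
# "missing-arg" is a hypothetical name that the string did not contain.
if [[ -n ${vars[missing-arg]+set} ]]; then
  echo "missing-arg was given"
else
  echo "missing-arg was not given"
fi
```

Because the keys are ordinary strings, the hyphens in the argument names need no translation here.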

This avoids faults in the original proposal whereby a string could set variables with meanings to the system -- changing PATH, LD_PRELOAD, or other security-sensitive values.


Generating Prefixed Names

To do it the other way might look like:

while [[ $string =~ $re ]]; do : "${BASH_REMATCH[@]}"
  string=${BASH_REMATCH[5]}
  declare -n _newVar="var_${BASH_REMATCH[2]//-/_}" || continue
  _newVar=${BASH_REMATCH[3]}
  unset -n _newVar
  declare -p "var_${BASH_REMATCH[2]//-/_}"
done

...which works, as you can see at https://ideone.com/zUBpsC, creating three separate variables with a var_ prefix on the name of each:

declare -- var_an_arg="some-value"
declare -- var_another_arg="another-value"
declare -- var_third_arg="three"
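
A sketch of how the prefixed variables might be read back afterwards, assuming nothing else in the shell uses the var_ prefix (the three assignments below only stand in for the loop's results): the "${!var_@}" expansion lists every variable name starting with var_, and ${!name} looks up a variable by name.

```shell
#!/usr/bin/env bash
# Stand-ins for the variables the loop above would have created.
var_an_arg="some-value"
var_another_arg="another-value"
var_third_arg="three"

# "${!var_@}" expands to the names of all variables that start with var_.
for name in "${!var_@}"; do
  # ${!name} is indirect expansion: the value of the variable named by $name.
  printf '%s=%s\n' "${name#var_}" "${!name}"
done
```

The prefix serves as the namespace here, so choosing one that no other part of the script uses matters.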

CodePudding user response:

Assumptions:

  • OP understands all the issues outlined by Charles Duffy but still wants standalone variables
  • all variable names to be uppercased
  • hyphens (-) converted to underscores (_)
  • neither variable names nor the associated values contain embedded white space

One bash idea using namerefs:

unset newarg AN_ARG ANOTHER_ARG 2>/dev/null
STRING="/something an-arg=some-value another-arg=another-value"

read -ra list <<< "${STRING}"                                   # read into an array; each space-delimited item is a new entry in the array

#typeset -p list                                                # uncomment to display contents of the list[] array

regex='[^[:space:]]+=[^[:space:]]+'                             # search pattern: <var>=<value>, no embedded spaces in <var> nor <value>

for item in "${list[@]}"                                        # loop through items in list[] array
do
    if [[ "${item}" =~ $regex ]]                                # if we have a pattern match (<var>=<val>) then ...
    then
        IFS="=" read -r ndx val <<< "${BASH_REMATCH[0]}"        # split on '=' and read into variables ndx and val
        declare -nu newarg="${ndx//-/_}"                        # convert '-' to '_' and assign uppercased ndx to nameref 'newarg'
        newarg="${val}"                                         # assign val to newarg
    fi
done

This generates:

$ typeset -p AN_ARG ANOTHER_ARG
declare -- AN_ARG="some-value"
declare -- ANOTHER_ARG="another-value"

NOTE:

  • once the for loop processing has completed, accessing the new variables will require some foreknowledge of the new variables' names
  • using an associative array to manage the list of new variables makes access after the for loop considerably easier (e.g., the new variable names are simply the indices of the associative array)
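
A minimal sketch of the associative-array variant mentioned in the note, under the same assumptions as above (uppercased names, hyphens converted to underscores, no embedded white space). The array name args is hypothetical, and the split uses parameter expansion rather than the regex from the loop above:

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or newer for associative arrays and ${var^^}.
STRING="/something an-arg=some-value another-arg=another-value"

declare -A args=()                        # hypothetical array name
read -ra list <<< "${STRING}"             # split on white space into list[]

for item in "${list[@]}"; do
    [[ ${item} == ?*=?* ]] || continue    # skip items that are not <var>=<value>
    name=${item%%=*}                      # text before the first '='
    val=${item#*=}                        # text after the first '='
    name=${name//-/_}                     # convert '-' to '_'
    args[${name^^}]=${val}                # store under the uppercased name
done

# The names are now simply the indices of the array; order is unspecified.
for key in "${!args[@]}"; do
    printf '%s=%s\n' "${key}" "${args[${key}]}"
done
```

Nothing outside args[] is created, so the string cannot overwrite variables such as PATH.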