How to convert a variable containing sed-arguments to an array?

Time:09-27

I'm using

sed (GNU sed) 4.4
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)

I have a complicated set of sed arguments in a bash script, about 20 different -e expressions. Here is a simple example as a one-liner. It converts aa bb cc to aaBBcc:

sed -e 's# ##g' -e 's#b#B#g' <<< "aa bb cc"

or

k=('-e' 's# ##g'    '-e' 's#b#B#g'); sed "${k[@]}" <<< "aa bb cc"

However, there are 20-ish -e expressions and most are complicated. The script is only for me, so it doesn't have to follow convention or policy. To make the arguments readable and editable (to me), I assign them to a variable with extra whitespace, columnated and indented. Here is a simplified version of what I mean:

#!/bin/bash
k="-e s#    #    #g \
   -e s# b  # B  #g \
  "

That simplified example doesn't show how useful that approach is to me. Anyway, here is the "working" script:

#!/bin/bash
k="-e s#    #    #g \
   -e s# b  # B  #g \
  "
k=$(sed -e 's# ##g'         <<< "$k")  #1 remove all spaces
k=$(sed -e 's|###|# ##|g'   <<< "$k")  #2 put needed space back in
k=$(sed -e 's#-e#|-e #g'    <<< "$k")  #3 delimit the args with "|"
k=$(sed -e 's#|##'          <<< "$k")  #4 remove the leading "|"
z=$IFS; IFS="|"; k=($k); IFS=$z        #5 convert variable to array
sed "${k[@]}" <<< "aa bb cc"           #6 process the string

Output is:

aaBBcc

It works and it is readable for me. But it is really complicated, and it took me quite a while to figure out how to massage k into a form that sed would take.

It fails if I quote the expressions, as in -e 's#b#B#g'.

Is there a less complicated way, and/or a way to quote the expressions? Must work with k whitespaced as above, sed 4.4, bash 4.4.12(1).

#######################################################

added 2022-09-26 14:58 PST:

Here is a real-world script for converting a URL before bookmarking. The caveat is that I wrote it for my own use. I don't have to figure out what the code is trying to do because I already know the paradigm: I invented it, or reinvented it.

https://www.ebay.com/sch/i.html?_from=R40&_trksid=123456&_nkw=(vintage,vtg) (polartec,fleece) (full,zip,zips,zipper,zippered,zipping) -(hilfiger,"old navy",hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,"1/4","1/2", lined,winnie,toddler,kids,ladies,womens,women)&_sacat=11450&LH_TitleDesc=0&_odkw=(vintage,vtg) fleece (full,zip,zips,zipper,zippered,zipping) -(hilfiger,"old navy",hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,"1/4","1/2", lined,winnie,toddler,kids,ladies,womens,women)&_osacat=11450&_sop=10&LH_PrefLoc=3&_ipg=240&_udhi=99

into

https://www.ebay.com/sch/i.html?&_nkw=(vintage,vtg) (polartec,fleece) (full,zip,zips,zipper,zippered,zipping) -(hilfiger,old navy,hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,1/4,1/2, lined,winnie,toddler,kids,ladies,womens,women)&_sacat=1145011450&_sop=10&LH_PrefLoc=3&_udhi=99&_ipg=240

#!/bin/bash
echo
k="-e s# [&]*_from=R40      #            #   \
   -e s# [&]*_trk[^&]*      #            #   \
   -e s# [&]*_odkw[^&]*     #            #   \
   -e s# [&]*_osacat[^&]    #            #   \
   -e s# [&]*_sacat=0       #            #   \
   -e s# [&]*LH_TitleDesc=0 #            #   \
   -e s#                    #            #g  \
   -e s# %2F                # /          #g  \
   -e s# %28                # (          #g  \
   -e s# %29                # )          #g  \
   -e s# %2C                # ,          #g  \
   -e s# %22                #            #g  \
   -e s# &_ipg=[0-9]*       #            #   \
   -e s# $                  # \&_ipg=240 #   \
   "
k=$(sed -e 's# ##g'         \
        -e 's|###|# ##|g'   \
        -e 's#-e#|-e #g'    \
        -e 's#|##'          \
        <<< "$k"            \
   )
z=$IFS; IFS="|"; k=($k); IFS=$z        
sed "${k[@]}" <<< "$1"

CodePudding user response:

See "I'm trying to put a command in a variable, but the complex cases always fail!" (BashFAQ/050).

You want

k=(
    -e 's/ //g'
    -e 's/b/B/g'
)
sed "${k[@]}" ...

Even if this code is only for you, you'll never be able to maintain what you have after ignoring it for a while. Readability and good practice are good for you too.
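
To illustrate why the array form works where a quoted string does not, here is a minimal sketch (the names as_string, as_array and result are made up for this example). Quotes stored inside a string variable are never re-parsed by the shell, so sed would receive them literally. Array elements are passed through exactly as written, and each expression can carry its own comment:

```bash
#!/bin/bash
# Quotes inside a plain string are data, not syntax: after word splitting,
# the second word still contains the literal quote characters.
as_string="-e 's#b#B#g'"
printf 'string word: <%s>\n' $as_string    # unquoted on purpose to show the split

# An array keeps every element intact and allows one comment per expression.
as_array=(
    -e 's# ##g'     # remove all spaces
    -e 's#b#B#g'    # uppercase every b
)
result=$(sed "${as_array[@]}" <<< "aa bb cc")
echo "$result"
```

The printf line prints the two words the shell produced from the string, which makes the stray quotes visible; the array version gives sed the expressions unchanged.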


Given your updated real-life example, here's another thought:

declare -a regex str flag cmds

regex+=( '[&]*_from=R40'      ); str+=( ''           ); flag+=( ''  )
regex+=( '[&]*_trk[^&]*'      ); str+=( ''           ); flag+=( ''  )
regex+=( '[&]*_odkw[^&]*'     ); str+=( ''           ); flag+=( ''  )
regex+=( '[&]*_osacat[^&]'    ); str+=( ''           ); flag+=( ''  )
regex+=( '[&]*_sacat=0'       ); str+=( ''           ); flag+=( ''  )
regex+=( '[&]*LH_TntleDesc=0' ); str+=( ''           ); flag+=( ''  )
regex+=( '  '                 ); str+=( ' '          ); flag+=( 'g' )
regex+=( '%2F'                ); str+=( '\/'         ); flag+=( 'g' )
# note the backslash in the "str" value:  ^^ -- that's the "s///" delimiter
regex+=( '%28'                ); str+=( '('          ); flag+=( 'g' )
regex+=( '%29'                ); str+=( ')'          ); flag+=( 'g' )
regex+=( '%2C'                ); str+=( ','          ); flag+=( 'g' )
regex+=( '%22'                ); str+=( ''           ); flag+=( 'g' )
regex+=( '&_npg=[0-9]*'       ); str+=( ''           ); flag+=( ''  )
regex+=( '$'                  ); str+=( '\&_npg=240' ); flag+=( ''  )

n=${#regex[@]}
for ((i=0; i < n; i++)); do
    cmds+=( -e "s/${regex[i]}/${str[i]}/${flag[i]}" )
done

url='https://www.ebay.com/sch/i.html?_from=R40&_trksid=123456&_nkw=(vintage,vtg) (polartec,fleece) (full,zip,zips,zipper,zippered,zipping) -(hilfiger,"old navy",hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,"1/4","1/2", lined,winnie,toddler,kids,ladies,womens,women)&_sacat=11450&LH_TitleDesc=0&_odkw=(vintage,vtg) fleece (full,zip,zips,zipper,zippered,zipping) -(hilfiger,"old navy",hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,"1/4","1/2", lined,winnie,toddler,kids,ladies,womens,women)&_osacat=11450&_sop=10&LH_PrefLoc=3&_ipg=240&_udhi=99'

sed "${cmds[@]}" <<< "$url"
https://www.ebay.com/sch/i.html?&_nkw=(vintage,vtg) (polartec,fleece) (full,zip,zips,zipper,zippered,zipping) -(hilfiger,old navy,hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,1/4,1/2, lined,winnie,toddler,kids,ladies,womens,women)&_sacat=11450&LH_TitleDesc=011450&_sop=10&LH_PrefLoc=3&_ipg=240&_udhi=99&_npg=240
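
If the three parallel arrays feel like too much bookkeeping, the same table layout can be kept with a small helper function. This is only a sketch: add_rule, sample and result are hypothetical names, and the input is a shortened, made-up URL rather than the real one.

```bash
#!/bin/bash
# add_rule is a hypothetical helper: it appends one "-e s/regex/str/flag" pair
# to the cmds array. The flag argument is optional.
cmds=()
add_rule() {
    local regex=$1 str=$2 flag=${3-}
    cmds+=( -e "s/${regex}/${str}/${flag}" )
}

#        regex             replacement
add_rule '[&]*_from=R40'   ''
add_rule '[&]*_trk[^&]*'   ''
add_rule '&_ipg=[0-9]*'    ''
add_rule '$'               '\&_ipg=240'   # "\&" is a literal ampersand in sed

sample='i.html?_from=R40&_trksid=123456&_nkw=fleece&_ipg=60&_udhi=99'   # made-up input
result=$(sed "${cmds[@]}" <<< "$sample")
echo "$result"
```

Each rule stays on one line with its columns aligned, and no post-processing of whitespace is needed because every field is an ordinary quoted shell word.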

CodePudding user response:

Combining @glenn_jackman's answer with my idiosyncrasies:

#!/bin/bash
declare -a k
k=('-e s#  [&]*_from=R40       #              #   '
   '-e s#  [&]*_trk[^&]*       #              #   '
   '-e s#  [&]*_odkw[^&]*      #              #   '
   '-e s#  [&]*_osacat[^&]     #              #   '
   '-e s#  [&]*_sacat=0        #              #   '
   '-e s#  [&]*LH_TitleDesc=0  #              #   '
   '-e s#                      #              #g  '
   '-e s#  %2F                 #  /           #g  '
   '-e s#  %28                 #  (           #g  '
   '-e s#  %29                 #  )           #g  '
   '-e s#  %2C                 #  ,           #g  '
   '-e s#  %22                 #              #g  '
   '-e s#  &_ipg=[0-9]*        #              #   '
   '-e s#  $                   #  \&_ipg=240  #   '
)

n=${#k[@]}
for ((i=0; i<n; i++)); do
  k[i]=$(sed -e 's% *# *%#%g' <<< "${k[i]}")
done
sed "${k[@]}" <<< "$1"

or

#!/bin/bash
declare -a k
k+=('-e s#  [&]*_from=R40       #              #   ')
k+=('-e s#  [&]*_trk[^&]*       #              #   ')
k+=('-e s#  [&]*_odkw[^&]*      #              #   ')
k+=('-e s#  [&]*_osacat[^&]     #              #   ')
k+=('-e s#  [&]*_sacat=0        #              #   ')
k+=('-e s#  [&]*LH_TitleDesc=0  #              #   ')
k+=('-e s#                      #              #g  ')
k+=('-e s#  %2F                 #  /           #g  ')
k+=('-e s#  %28                 #  (           #g  ')
k+=('-e s#  %29                 #  )           #g  ')
k+=('-e s#  %2C                 #  ,           #g  ')
k+=('-e s#  %22                 #              #g  ')
k+=('-e s#  &_ipg=[0-9]*        #              #   ')
k+=('-e s#  $                   #  \&_ipg=240  #   ')

n=${#k[@]}
for ((i=0; i<n; i++)); do
  k[i]=$(sed -e 's| *# *|#|g' <<< "${k[i]}")
done
sed "${k[@]}" <<< "$1"
exit
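
The per-element loop above starts one sed process for every rule. As a possible variation (a sketch only, with two made-up rules instead of the real list), the whole array can be normalized in a single pass and read back with mapfile, which is available in bash 4.x:

```bash
#!/bin/bash
# Rules padded with spaces for readability, in the same style as above.
k=(
    '-e s#  b   #  B   #g  '
    '-e s#  cc  #  C   #   '
)

# One sed call strips the padding around every "#"; mapfile -t reads the
# resulting lines back into k, one element per line.
mapfile -t k < <(printf '%s\n' "${k[@]}" | sed 's| *# *|#|g')

result=$(sed "${k[@]}" <<< "aa bb cc")
echo "$result"
```

This assumes no rule contains a newline, since each line of the normalized text becomes one array element.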

CodePudding user response:

Why not just use ; to join multiple sed operations into a single parameter? Something like this:

k="s#    #    #g;
   s# b  # B  #g;
"
k=$(sed 's# ##g; s|###|# ##|g' <<<"$k") # Clean up spaces in $k
sed "$k" <<< "aa bb cc"

Result: "aaBBcc". Your big pattern would look like this:

k="s# [&]*_from=R40      #            #   ;
   s# [&]*_trk[^&]*      #            #   ;
   s# [&]*_odkw[^&]*     #            #   ;
   s# [&]*_osacat[^&]    #            #   ;
   s# [&]*_sacat=0       #            #   ;
   s# [&]*LH_TitleDesc=0 #            #   ;
   s#                    #            #g  ;
   s# %2F                # /          #g  ;
   s# %28                # (          #g  ;
   s# %29                # )          #g  ;
   s# %2C                # ,          #g  ;
   s# %22                #            #g  ;
   s# &_ipg=[0-9]*       #            #   ;
   s# $                  # \&_ipg=240 #   ;
   "

You could also do the whitespace-mangling as part of the sed command with a command substitution:

sed "$(sed 's# ##g; s|###|# ##|g' <<<"$k")" <<< "$1"
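
A related option, sketched here with two made-up rules, is to drop the ";" separators as well and feed the cleaned-up text to sed as a script file with -f, using process substitution. The variable names rules and result are only for illustration:

```bash
#!/bin/bash
# One rule per line; neither ";" nor "-e" is needed when sed reads the
# commands as a script with -f.
rules='
    s#  b   #  B   #g
    s#  cc  #  C   #
'

# The inner sed strips the padding around each "#"; the outer sed reads that
# output as its script through <( ... ) and applies it to the input string.
result=$(sed -f <(sed 's| *# *|#|g' <<< "$rules") <<< "aa bb cc")
echo "$result"
```

Like the other whitespace-stripping variants, this cannot express a rule whose pattern or replacement is itself a space, so such a rule would still need separate handling.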