Input example:
raw_list='
libreoffice-impress.desktop - LibreOffice Impress
joplin.desktop - Joplin Notes
libreoffice-base.desktop - LibreOffice Base
yelp.desktop - Help
org.gnome.gedit.desktop - Text Editor
'
I want to split each entry into the app's .desktop file and its name. Example:
name='Joplin Notes'
path='joplin.desktop'
My regex:
parse_app_list(){
name=''
path=''
for i in "${raw_list[@]}"; do
echo "$i"
[[ $name =~ "$i.desktop[:space:]-[:space:].*" ]]
[[ $path =~ ".*$i.desktop" ]]
echo "$name"
echo "$path"
done
}
It's not even close. What would the correct syntax be?
CodePudding user response:
When using =~ as a test operator, the bash manual states:
<snip> the string to the right of the operator is considered an extended regular expression and matched accordingly (as in regex(3)) <snip> Any part of the pattern may be quoted to force the quoted portion to be matched as a string
This means that
[[ "$var" =~ regex ]] # matches regex
[[ "$var" =~ "string" ]] # matches string
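A minimal sketch of that difference, using a made-up value (`var` and the two result variables are illustrative only, not from the question): the unquoted pattern treats the dot as "any character", while the quoted one requires a literal dot.

```shell
# Requires bash (uses [[ ... =~ ... ]]).
var='axb'

# Unquoted: "." is a regex metacharacter, so a.b matches "axb".
if [[ $var =~ a.b ]]; then unquoted='match'; else unquoted='no match'; fi

# Quoted: "a.b" is matched as a literal string, which "axb" does not contain.
if [[ $var =~ "a.b" ]]; then quoted='match'; else quoted='no match'; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Here the first test succeeds and the second fails, which is why quoting the whole pattern in the question turns it into a plain string comparison.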
In case of the OP, the test should read:
[[ $name =~ "$i.desktop"[[:space:]]-[[:space:]].* ]]
[[ $path =~ .*"$i.desktop" ]]
Here we made the following modifications:
- unquote the regular expression as a whole so that it is interpreted as a regex and not as a string
- quote the string "$i.desktop" to have it interpreted as a string. Otherwise any <dot>-character or other special regex character in $i could be interpreted as a regex.
- [:space:] is a character class and should be located in a bracket expression (i.e. [[:space:]]). On its own, [:space:] is just a bracket expression that matches any one of the characters a, c, e, p, s (and the colon itself).
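To see these points in action, here is a small sketch. The values of `i`, `good` and `bad` are made up for illustration, and `r1` to `r4` are just scratch variables:

```shell
# Requires bash (uses [[ ... =~ ... ]]).
i='yelp'                      # hypothetical app prefix
good='yelp.desktop - Help'    # well-formed entry
bad='yelpXdesktop - Help'     # no literal dot after "yelp"

# Quoted "$i.desktop" plus [[:space:]]: only the well-formed entry matches.
if [[ $good =~ "$i.desktop"[[:space:]]-[[:space:]].* ]]; then r1='match'; else r1='no match'; fi
if [[ $bad =~ "$i.desktop"[[:space:]]-[[:space:]].* ]]; then r2='match'; else r2='no match'; fi

# Unquoted: the dot acts as a regex wildcard, so the malformed entry matches too.
if [[ $bad =~ $i.desktop[[:space:]]-[[:space:]].* ]]; then r3='match'; else r3='no match'; fi

# Bare [:space:] is a bracket expression that does not contain a space character.
if [[ $good =~ "$i.desktop"[:space:]-[:space:].* ]]; then r4='match'; else r4='no match'; fi

printf '%s\n' "r1=$r1" "r2=$r2" "r3=$r3" "r4=$r4"
```

Only the first and third tests succeed: quoting protects the dot, and the double brackets are what make [[:space:]] match an actual space.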
CodePudding user response:
A few issues with the current code:
- raw_list=' libreoffice-impress.desktop ... Text Editor' is a variable containing one multi-line string; raw_list is not an array of path - name pairs, though we do have the special case where "${raw_list}" can be referenced as "${raw_list[0]}"
- "${raw_list[@]}" is an array reference; the loop will be processed once with i="${raw_list}" (or i="${raw_list[0]}"); try running for i in "${raw_list[@]}"; do echo "loop:$i"; done to confirm this
- the variables name and path are never set to anything (other than '') so the tests will always fail
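The single-iteration behaviour is easy to confirm with a counter. This is a sketch with a shortened copy of the input; `count` is an illustrative name:

```shell
# Requires bash (array-style reference on a scalar variable).
raw_list='
joplin.desktop - Joplin Notes
yelp.desktop - Help
'

count=0
for i in "${raw_list[@]}"; do
    count=$((count + 1))      # runs once: the whole string is element 0
done

echo "iterations: $count"
```

The loop body runs once, with the entire multi-line string in `i`.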
Assumptions:
- initial data can be reformatted as array entries
- each array entry contains a single instance of the separator ' - ' (space, hyphen, space)
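If the data really does arrive as one multi-line string, one possible way to turn it into array entries is to read it line by line and skip the empty lines. This is only a sketch; the array name `entries` is an assumption and not part of the question:

```shell
# Requires bash (arrays, here-strings).
raw_list='
libreoffice-impress.desktop - LibreOffice Impress
joplin.desktop - Joplin Notes
'

entries=()
while IFS= read -r line; do
    # The leading and trailing newlines produce empty lines; drop them.
    if [[ -n $line ]]; then
        entries+=("$line")
    fi
done <<< "$raw_list"

echo "${#entries[@]} entries"
printf '%s\n' "${entries[@]}"
```

After this, "${entries[@]}" can be looped over one path - name pair at a time.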
Setup:
raw_list=(
'libreoffice-impress.desktop - LibreOffice Impress'
'joplin.desktop - Joplin Notes'
'libreoffice-base.desktop - LibreOffice Base'
'yelp.desktop - Help'
'org.gnome.gedit.desktop - Text Editor'
)
regex='(.*) - (.*)' # whatever matches the contents inside the parens will be
# our 1st and 2nd entries in the `BASH_REMATCH[]` array
One idea using bash regex matching to parse the pairs for us:
while read -r line
do
    path=''
    name=''
    [[ "${line}" =~ $regex ]] && \
        path="${BASH_REMATCH[1]}" && \
        name="${BASH_REMATCH[2]}"
    echo "############## ${line}"
    echo "path=${path}"
    echo "name=${name}"
    echo ""
done < <(printf "%s\n" "${raw_list[@]}")
Another idea using parameter expansions to parse the pairs for us:
for line in "${raw_list[@]}"
do
    path="${line% - *}"
    name="${line#* - }"
    echo "############## ${line}"
    echo "path=${path}"
    echo "name=${name}"
    echo ""
done
Both of these generate:
############## libreoffice-impress.desktop - LibreOffice Impress
path=libreoffice-impress.desktop
name=LibreOffice Impress
############## joplin.desktop - Joplin Notes
path=joplin.desktop
name=Joplin Notes
############## libreoffice-base.desktop - LibreOffice Base
path=libreoffice-base.desktop
name=LibreOffice Base
############## yelp.desktop - Help
path=yelp.desktop
name=Help
############## org.gnome.gedit.desktop - Text Editor
path=org.gnome.gedit.desktop
name=Text Editor
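A note on the "single instance of the separator" assumption: if an entry contained ' - ' more than once, the two approaches would stop agreeing. The sketch below uses a made-up entry, and the `re_*` / `pe_*` variable names are illustrative only:

```shell
# Requires bash (BASH_REMATCH, parameter expansion).
line='a.desktop - Foo - Bar'      # hypothetical entry with two separators
regex='(.*) - (.*)'

# Regex: the first (.*) is greedy, so the split happens at the LAST ' - '.
if [[ $line =~ $regex ]]; then
    re_path=${BASH_REMATCH[1]}
    re_name=${BASH_REMATCH[2]}
fi

# Parameter expansion: % removes the shortest suffix, # the shortest prefix,
# so path is cut at the last separator but name at the first one.
pe_path=${line% - *}
pe_name=${line#* - }

# Using %% (longest suffix) instead cuts path at the FIRST separator.
first_path=${line%% - *}

echo "regex:     path=$re_path | name=$re_name"
echo "expansion: path=$pe_path | name=$pe_name"
echo "with %%:   path=$first_path"
```

So if names may themselves contain ' - ', pairing "${line%% - *}" with "${line#* - }" would split consistently on the first separator, which is probably what is wanted for a .desktop file name followed by a free-form app name.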