Regular expression in bash case statement throws `unexpected token` error

Time:03-31

EDIT: I've pushed my code to a new branch just for this question

Link to branch

Link to shell script


I'm trying to create a bash file that loops over all directories (and subdirectories) of a predefined structure (i.e. I know for certain the directories are named in a specific manner).

An obvious solution for this was a case...esac statement using regular expression patterns. So off I went to Stack Overflow, where I found this post explaining how to do exactly what I need. I then wrote some test code, and this is what I came up with:

test
|
|__ python
|
|__ python-
|
|__ python-a
|
|__ python-0
|
|__ python0-
|
|__ python2
|
|__ pythona
|
|__ test.sh

and run this:

# test.sh
#!/bin/bash
shopt -s extglob;
for dir in $(ls); do
        case $dir in
                python*(-)*([a-zA-Z0-9]))
                        echo "yes -> $dir"
                        ;;
                *)
                        echo "no -> $dir"
                        ;;
        esac
done
shopt -u extglob;

which gives me the following output:

yes -> python
yes -> python-
yes -> python-a
no -> python0-
yes -> python2
yes -> pythona
no -> test.sh

which seems to work fine.

I carried this method on to my actual code:

# actual_code.sh
cd $PROGRAMS_DIR
for language in $(ls -l --group-directories-first | tail -n $(ls -l | awk '{print $1}' | grep d | wc -l) | awk '{print $9}'); do
    cd $language
    echo "language -> $language"
    for algorithm in $(ls -l --group-directories-first | tail -n $(ls -l | awk '{print $1}' | grep d | wc -l) | awk '{print $9}'); do
        cd $algorithm
        echo "algo -> $algorithm"
        shopt -s extglob;
        case $language in
            rust*(-)*([0-9]))
                rustc "${algorithm}_run.rs" -o "${algorithm}_run"
                COMMAND="./${algorithm}_run"
                if [ $TEST -eq 1 ]; then
                    echo "> Running Rust tests for $algorithm"
                    rustc --test "${algorithm}_test.rs" -o "${algorithm}_test"
                    ./${algorithm}_test
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            go*(-)*([0-9]))
                go build -o "${algorithm}_run" .
                COMMAND="./${algorithm}_run"
                if [ $TEST -eq 1 ]; then
                    echo "> Running Go tests for $algorithm"
                    go test
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            java*(-)*([0-9]))
                javac -cp .:$JUNIT:$HAMCREST *.java
                COMMAND="java -cp .:${JUNIT}:${HAMCREST} ${algorithm}_run"
                if [ $TEST -eq 1 ]; then
                    echo "> Running Java tests for $algorithm"
                    java -cp .:${JUNIT}:${HAMCREST} ${algorithm}_test
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            c*(-)*([0-9]))
                # TODO: Try to implement both the normal executable and the -O2 optimisation
                gcc -Wall -c "${algorithm}.c" "${algorithm}_run.c"
                gcc -o "${algorithm}_run" "${algorithm}.o" "${algorithm}_run.o"
                COMMAND="./${algorithm}_run"
                if [ $TEST -eq 1 ]; then
                    echo "> Running C tests for $algorithm"
                    gcc -Wall -c "${algorithm}.c" "${algorithm}_test.c" $UNITY
                    gcc -o "${algorithm}_test" "${algorithm}.o" "${algorithm}_test.o" "unity.o"
                    ./${algorithm}_test
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            python*(-)*([0-9]))
                COMMAND="python ${algorithm}_run.py"
                if [ $TEST -eq 1 ]; then
                    echo "> Running Python tests for $algorithm"
                    pytest .
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            haxe*(-)*([0-9]))
                COMMAND="haxe --main ${algorithm^}_Run.hx --interp"
                if [ $TEST -eq 1 ]; then
                    echo "> Running Haxe tests for $algorithm"
                    haxe --main "${algorithm^}_Test.hx" --library utest --interp -D UTEST_PRINT_TESTS
                    if [ $(echo $?) -ne 0 ]; then
                        exit 1
                    fi
                fi
                ;;
            *)
                echo "($language) has no compilation steps. Did you forget to update the benchmark script?"
                ;;
        esac
        shopt -u extglob;
        .
        .
        .
        some other random code
        .
        .
        .
    done
done

Executing this code now gives me this

./actual_code.sh: line 408: syntax error near unexpected token `('
./actual_code.sh: line 408: `                rust*(-)*([0-9]))'

I'm obviously missing something, but the two case statements look the same to me. Also, none of the echo statements run: the script goes straight to the error, which seems odd for an interpreted language.

CodePudding user response:

1. Do not parse ls

For example, I couldn't make this work:

for language in $(
    ls -l --group-directories-first |
    head -n $(ls -l | awk '{print $1}' | grep d | wc -l) |
    awk '{print $9}'
)
do
    # ...
done

Parsing the output of ls is fragile: it breaks as soon as a name contains whitespace or glob characters. It's simpler and reliable to use globs (with post-filtering when needed).

  • For files:
#!/bin/bash
shopt -s nullglob

for file in ./*
do
    [[ -f "$file" ]] || continue
    # ...
done
  • For directories (even simpler):
#!/bin/bash
shopt -s nullglob

for dir in ./*/
do
    # ...
done
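
A glob such as ./*/ yields paths with a leading ./ and a trailing /. When only the bare directory name is needed (for example, to feed a case statement), parameter expansion can strip both. A minimal sketch, in which list_subdirs is a hypothetical helper name that does not appear in the original script:

```shell
#!/bin/bash

# list_subdirs (hypothetical helper): print the bare name of every
# subdirectory of the directory given as $1, one name per line.
# The body runs in a subshell "( ... )", so enabling nullglob here
# does not leak into the calling script.
list_subdirs() (
    shopt -s nullglob
    for dir in "$1"/*/; do
        name=${dir%/}        # drop the trailing slash
        name=${name##*/}     # drop everything up to the last slash
        printf '%s\n' "$name"
    done
)
```

Calling list_subdirs "$PROGRAMS_DIR" would then print one language directory per line, without invoking ls, awk or tail.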
2. Use Makefiles

Having compilation details in the script makes it difficult to maintain; you should delegate it to make by creating a Makefile in each algorithm directory.

Here's a basic example of a Makefile for a Go algorithm directory:

# Recipe lines must start with a tab character, not spaces.
.PHONY: build test

build:
	go build -o REPLACE_ME_WITH_THE_ALGORITHM_NAME_run
test:
	go test

OK, let's suppose that you've written the needed Makefiles (with build and test targets); you can now greatly simplify your script:

#!/bin/bash
shopt -s extglob nullglob

for buildpath in "$PROGRAMS_DIR"/@(rust|go|java|c|python|haxe)*(-)*([0-9])/*/
do
    pushd -- "$buildpath" > /dev/null || continue

    make build

    if [[ $TEST == 1 ]]
    then
        make test || exit 1
    fi

    popd > /dev/null || exit 1
done
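
Note that the script above enables extglob on its own line at the top, before the loop. This matters for the original error: Bash parses a whole compound command (a for loop, including any case nested inside it) before executing any part of it. In actual_code.sh, shopt -s extglob sits inside the loop, so the parser reaches rust*(-)*([0-9])) while extglob is still off and reports a syntax error before a single echo runs. In test.sh the shopt line runs before the loop is parsed, which is why that version works. A minimal sketch of the working placement, where classify is a hypothetical helper used only for illustration:

```shell
#!/bin/bash

# Enable extglob at top level, on its own line, BEFORE Bash parses
# any compound command that contains an extended pattern.
shopt -s extglob

# classify (hypothetical helper): map a directory name to a language.
classify() {
    case $1 in
        rust*(-)*([0-9])) echo "rust" ;;
        go*(-)*([0-9]))   echo "go" ;;
        *)                echo "unknown" ;;
    esac
}

classify rust-2    # prints "rust"
classify go-12     # prints "go"
classify golang    # prints "unknown"
```

Moving the shopt -s extglob line of actual_code.sh above the outermost for loop should be enough to make the original case patterns parse.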

As a last remark: the variables in your script should be lower case, unless they are exported or constants.

CodePudding user response:

Acknowledgements


First off, thank you to everyone who suggested a solution, pointed me in the right direction, and contributed to this project.

@EdMorton - Although shellcheck was a good resource and pointed out many (possible) flaws in my script, some of the fixes it suggested weren't fixes but warnings about conventional practices that aren't necessary for the script to function. I've chosen to follow some but not all of them, as some gave unexpected results. As mentioned in the comment thread on the original question, this points to some unknown underlying issue(s), and a new branch has been pushed to my repository where the benchmark script will be rewritten so that it passes shellcheck.

@Fravadona - Thank you for suggesting a Makefile. This was my original plan, but due to my inexperience I can't achieve it in the time I have left (this is still my final-year dissertation).

Solution currently in place


I've decided that, for the time being, conditional regex matching is the way to go, as seen in this commit (reproduced below). It's by far not the most elegant solution, but it works well for now and doesn't add any significant complexity.

# Get rid of the leading './'
language=${language:2:${#language}}
# Capture for '-haxe' postfix of a language or any other pattern
# https://stackoverflow.com/a/18710850/5817020
if [[ $language =~ [a-zA-Z]*-[a-zA-Z0-9]* ]]; then
    readarray -d "-" -t LANGUAGE <<< $language
    lang="${LANGUAGE[0]}"
else
    lang=$language
fi
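
The part before the first '-' can also be taken with parameter expansion alone, without a regex or an array. This is only a sketch of an alternative, not the code from the commit; base_language is a hypothetical helper name:

```shell
#!/bin/bash

# base_language (hypothetical helper): strip a leading "./" and a
# trailing "/", then print everything before the first "-".
# Names without a "-" are printed unchanged.
base_language() {
    local language=${1#./}    # drop a leading "./", if present
    language=${language%/}    # drop a trailing "/", if present
    printf '%s\n' "${language%%-*}"
}

base_language ./python-haxe   # prints "python"
base_language ./rust          # prints "rust"
```

Because ${language%%-*} leaves a name without '-' untouched, no if/else branch is needed.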