Bash - Comparing a string to an array that contains wildcards?

Time:03-08

I have an array of possible file extensions, some of which contain wildcards, e.g.:

FILETYPES=("DBG" "MSG" "OUT" "output*.txt")

I also have a list of files from which I extract the file extension. I then need to compare each extension against the array of file extensions.

I have tried:

if [[ ${EXTENSION} =~ "${FILETYPES[*]}" ]]; then
  echo "file found"
fi

if [[ ${EXTENSION} == "${FILETYPES[*]}" ]]; then
  echo "file found"
fi

But to no avail.

I also tried:

if [[ "${FILETYPES[*]}" =~ ${EXTENSION} ]]; then
  echo "file found"
fi

However, it ended up comparing "txt" to "output*.txt" and concluding it was a match.

Any suggestions would be much appreciated.

CodePudding user response:

You cannot directly compare a string with an array. Instead, loop over the elements and compare the string against each one as a glob pattern, something like:

filetypes=("DBG" "MSG" "OUT" "output*.txt")
extension="MSG"                         # example
match=0
for type in "${filetypes[@]}"; do
    # $type is left unquoted on purpose so that it is matched as a glob pattern
    if [[ $extension = $type ]]; then
        match=1
        break
    fi
done
echo "$match"
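
If the check is needed in more than one place, the loop above can be wrapped in a function. This is only a sketch: the name matches_any is made up for illustration, and it assumes every array entry is meant to be a glob pattern.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the original answer): succeeds if "$1"
# matches any of the remaining arguments, each treated as a glob pattern.
matches_any() {
    local candidate=$1 pattern
    shift
    for pattern in "$@"; do
        # The right-hand side is deliberately unquoted so that [[ ]]
        # performs pattern matching; quoting it would force a literal
        # string comparison.
        if [[ $candidate == $pattern ]]; then
            return 0
        fi
    done
    return 1
}

filetypes=("DBG" "MSG" "OUT" "output*.txt")

for name in "MSG" "output_foo.txt" "txt"; do
    if matches_any "$name" "${filetypes[@]}"; then
        echo "$name: match"
    else
        echo "$name: no match"
    fi
done
```

Because the glob has to cover the whole string, "txt" is not accepted by "output*.txt" here, unlike in the =~ attempt from the question.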

You can avoid the loop with a regex:

pat="^(DBG|MSG|OUT|output.*\.txt)$"
extension="output_foo.txt"              # example
match=0
if [[ $extension =~ $pat ]]; then
    match=1
fi
echo "$match"

Please note that regex syntax differs from the wildcard syntax used for globbing: the glob output*.txt becomes output.*\.txt as a regex.
As a side note, by convention we do not use uppercase names for user variables, to avoid conflicts with system variables.
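
To make the glob/regex difference concrete, here is a small sketch. The variable names are only for illustration, and the last test reproduces what I assume happened in the question's final attempt.

```shell
#!/usr/bin/env bash

name="output_foo.txt"

# As a glob, * matches any run of characters, and the pattern must
# cover the whole string.
[[ $name == output*.txt ]] && echo "glob: match"

# As a regex, t* means "zero or more t" and . means "any one character",
# so the glob text reused as a regex cannot bridge "_foo".
glob_as_re='^output*.txt$'
[[ $name =~ $glob_as_re ]] || echo "glob syntax used as regex: no match"

# The regex equivalent of the glob uses .* and escapes the literal dot.
re='^output.*\.txt$'
[[ $name =~ $re ]] && echo "regex: match"

# An unanchored regex is a substring search, which is why "txt" was
# reported as a match against the joined array.
joined="DBG MSG OUT output*.txt"
[[ $joined =~ txt ]] && echo "unanchored: match"
```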

CodePudding user response:

  • FILETYPES=("DBG" "MSG" "OUT" "output*.txt"): First of all, avoid ALL_CAPS variable names unless they are meant to be global environment variables.
  • "output*.txt": this is fine as a globbing pattern, for example in the Bash test [[ $variable == output*.txt ]]. Regex matching needs a different syntax, such as [[ $variable =~ output.*\.txt ]].
  • "${FILETYPES[*]}": expanding the array into a single string was mostly a good approach, but it needs the IFS variable set so that the array expands into a regex. Something like IFS='|' regex_fragment="(${array[*]})" expands every array entry, separated by a pipe | and enclosed in parentheses, as (entry1|entry2|...).
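
The IFS trick from the last point can also be done without changing IFS for the rest of the script. A minimal sketch, assuming the array entries are already valid regex fragments; the function name join_alternation is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: joins its arguments with "|" and wraps them in
# parentheses. IFS is declared local, so the caller's IFS is untouched.
join_alternation() {
    local IFS='|'
    printf '(%s)\n' "$*"
}

patterns=("DBG" "MSG" "OUT" "output.*\.txt")

# Anchor both ends so the whole string has to match one alternative.
regex="^$(join_alternation "${patterns[@]}")\$"
echo "$regex"

for name in "MSG" "output_foo.txt" "txt"; do
    if [[ $name =~ $regex ]]; then
        echo "$name: match"
    else
        echo "$name: no match"
    fi
done
```

"$*" joins the positional parameters with the first character of IFS, which is the same mechanism as "${array[*]}".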

Here is an implementation you could use:

testscript.sh

#!/usr/bin/env bash

extensions_regexes=("DBG" "MSG" "OUT" "output.*\.txt")

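# Expand the extension regexes into a single regex string.
# With no command on the line, the IFS assignment persists,
# so IFS stays '|' for the rest of the script.
IFS='|' regex=".*\.(${extensions_regexes[*]})"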

# Prints the regex for debug purposes
printf %s\\n "$regex"

# Iterate over all filenames passed as arguments to the script
for filename; do

  # Compare the filename with the regex (not anchored at the end,
  # so a name such as foobar.MSG.bak would match as well)
  if [[ $filename =~ $regex ]]; then
    printf 'file found: %s \n' "$filename"
  fi

done

Sample usage:

$ touch foobar.MSG foobar.output.txt
$ bash testscript.sh *
.*\.(DBG|MSG|OUT|output.*\.txt)
file found: foobar.MSG 
file found: foobar.output.txt 