Check if any substring is contained in an array in Bash

Time:12-27

Suppose I have a string,

a="This is a string"

and an array,

b=("This is my" "sstring")

I want an if condition to succeed when any substring of a occurs in an element of b. Here it should succeed, because "This is" is a substring of the first element of b.

For two strings, I know how to check whether $x is a substring of $y:

if [[ $y == *$x* ]]; then
 #Something
fi

But here one side is an array of strings, and I don't know how to do it without explicitly looping through the array.

CodePudding user response:

You can split $a into an array of words, then loop over both arrays to find matches:

a="this is a string"
b=( "this is my" "string")

# Make an array by splitting $a on spaces
IFS=' ' read -ra aarr <<< "$a"

for i in "${aarr[@]}"
do 
  for j in "${b[@]}"
  do
    if [[ $j == *"$i"* ]]; then
      echo "Match: $i : $j"
      break
    fi
  done
done

# Match: this : this is my
# Match: is : this is my
# Match: string : string
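
To use this check as the condition of an if, the loops can be wrapped in a function that returns success on the first match. A minimal sketch, assuming a hypothetical function name any_word_in_array (not part of the original answer) that takes the string first and the array elements after it. For a plain yes/no answer, single words are enough: any longer phrase that matches contains a word that matches.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed if any space-separated word of the first
# argument occurs inside any of the remaining arguments.
any_word_in_array() {
  local -a words
  local word elem
  read -ra words <<< "$1"   # split the string on whitespace
  shift                     # what remains are the array elements

  for word in "${words[@]}"; do
    for elem in "$@"; do
      [[ $elem == *"$word"* ]] && return 0
    done
  done
  return 1
}

a="this is a string"
b=("this is my" "string")

if any_word_in_array "$a" "${b[@]}"; then
  echo "match"
fi
```

The loop over the array is still there, but it is hidden behind a single call in the if.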

If you need to handle multi-word substrings of $a (e.g. "this is", "is a", etc.), then you will need to loop over the word array, generating every run of consecutive words:

for (( length=1; length <= ${#aarr[@]}; ++length )); do
  for (( start=0; start + length <= ${#aarr[@]}; ++start )); do
    substr="${aarr[@]:start:length}"
    for j in "${b[@]}"; do
      if [[ $j == *"${substr}"* ]]; then
        echo "Match: $substr : $j"
        break
      fi
    done
  done
done

# Match: this : this is my
# Match: is : this is my
# Match: string : string
# Match: this is : this is my
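
If only the longest matching phrase is of interest, the same loops can run from the longest length downwards and stop at the first hit. A sketch under that assumption; the function name longest_phrase_in_array is hypothetical, and like the loops above it works on whole space-separated words, not arbitrary runs of characters.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the longest run of consecutive words from
# the first argument that occurs inside one of the remaining arguments.
# Returns 1 and prints nothing if no word matches at all.
longest_phrase_in_array() {
  local -a words
  local phrase elem start length
  read -ra words <<< "$1"   # split the string on whitespace
  shift                     # what remains are the array elements

  # Try the longest phrases first, so the first hit is the longest.
  for (( length = ${#words[@]}; length >= 1; --length )); do
    for (( start = 0; start + length <= ${#words[@]}; ++start )); do
      phrase="${words[*]:start:length}"   # words joined by single spaces
      for elem in "$@"; do
        if [[ $elem == *"$phrase"* ]]; then
          printf '%s\n' "$phrase"
          return 0
        fi
      done
    done
  done
  return 1
}

a="this is a string"
b=("this is my" "string")
longest_phrase_in_array "$a" "${b[@]}"   # prints: this is
```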

CodePudding user response:

This might be all you need:

$ printf '%s\n' "${b[@]}" | grep -wFf <(tr ' ' $'\n' <<<"$a")
This is my
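
Since the question asks for an if condition, the same pipeline can be used directly as the test by adding grep's -q option, which suppresses output and exits successfully as soon as one line matches. A small sketch using the question's data; the result variable is only there for illustration.

```bash
#!/usr/bin/env bash

a="This is a string"
b=("This is my" "sstring")

# Each word of $a becomes one fixed-string pattern (-F, -f);
# -w requires a whole-word match; -q makes grep silent.
if printf '%s\n' "${b[@]}" | grep -qwFf <(tr ' ' '\n' <<< "$a"); then
  result="match"
else
  result="no match"
fi
echo "$result"
```

Note that -w only accepts whole-word matches, so "string" does not match the element "sstring" here; the match comes from "This" and "is" in the first element.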

Otherwise - a shell is a tool to manipulate files/processes and sequence calls to tools. The guys who invented shell also invented awk for shell to call to manipulate text. What you're trying to do is manipulate text so there's a good chance you should be using awk instead of shell for whatever it is you're doing that this task is a part of.

$ printf '%s\n' "${b[@]}" | 
awk -v a="$a" '
    BEGIN { split(a,words) }
    { for (i in words) if (index($0,words[i])) { print; f=1; exit} }
    END { exit !f }
'
This is my
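
The END { exit !f } rule is what makes the awk command usable as an if condition: awk exits with status 0 only when a match set f. A sketch of that usage with the print removed, so nothing is written; the input values are chosen here for illustration and are not from the question.

```bash
#!/usr/bin/env bash

a="a string"
b=("This is my" "sstring")

# awk's exit status (0 when f was set) drives the if.
if printf '%s\n' "${b[@]}" |
   awk -v a="$a" '
     BEGIN { split(a, words) }
     { for (i in words) if (index($0, words[i])) { f = 1; exit } }
     END { exit !f }
   '; then
  result="match"
else
  result="no match"
fi
echo "$result"
```

With this input the result is "match", because index() does a plain substring search and finds "string" inside "sstring". The grep -w command above would not match the same input, since it only accepts whole words.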

The above assumes a doesn't contain any backslashes, because awk interprets escape sequences in -v assignments. If it can contain them, pass it through the environment instead:

printf '%s\n' "${b[@]}" | a="$a" awk 'BEGIN{split(ENVIRON["a"],words)} ...'

If any element of b can contain newlines, separate the records with NUL characters instead:

printf '%s\0' "${b[@]}" | a="$a" awk -v RS='\0' 'BEGIN{split(ENVIRON["a"],words)} ...'

CodePudding user response:

Here is how to match the maximum number of words from string a to entries of array b:

#!/usr/bin/env bash

a="this is a string"
b=("this is my" "string" )

# tokenize the words of a into an array
read -ra a_words <<<"$a"

match()
{
  # iterate entries of array b
  for e in "${b[@]}"; do

    # tokenize entry words into an array
    read -ra e_words <<<"$e"

    # initialize the counter to the smaller of the two word counts
    i=$(( ${#a_words[@]} < ${#e_words[@]} ? ${#a_words[@]} : ${#e_words[@]} ))

    # try to match a decreasing number of leading words
    while [ 0 -lt "$i" ]; do

      # return true if it matches
      [ "${e_words[*]::$i}" = "${a_words[*]::$i}" ] && return

      # decrease number of words to match
      i=$(( i - 1 ))
    done
  done

  # reaching here means no match found, return false
  return 1
}

if match; then
  printf %s\\n 'It matches!'
fi
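
Note that match compares only the leading words of a with the leading words of each entry, so it detects a shared prefix and not a phrase in the middle of either string. A sketch that makes this visible, assuming a hypothetical variant named match_prefix that takes the string and the entries as arguments instead of reading the global variables:

```bash
#!/usr/bin/env bash

# Hypothetical variant of match: succeed if the first argument and any of
# the remaining arguments start with the same run of words.
match_prefix() {
  local -a a_words e_words
  local e i
  read -ra a_words <<< "$1"
  shift
  for e in "$@"; do
    read -ra e_words <<< "$e"
    # start from the smaller of the two word counts
    i=$(( ${#a_words[@]} < ${#e_words[@]} ? ${#a_words[@]} : ${#e_words[@]} ))
    while (( i > 0 )); do
      # compare the first i words of both, joined by single spaces
      [[ "${e_words[*]:0:i}" == "${a_words[*]:0:i}" ]] && return 0
      i=$(( i - 1 ))
    done
  done
  return 1
}

match_prefix "this is a string" "this is my" "string" && echo "prefix match"
match_prefix "a string" "this is my" "string" || echo "no prefix match"
```

The second call fails even though "string" is a word of the first argument, because neither entry starts with "a".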