Find if all characters in one string occur within another string

I am new to bash. I have a question about determining if all characters of one string occur within another string. For example, if the variables are:

var_1="abcdefg"
var_2="bcg"

Then I want to write an if statement of the form:

if [all characters of var_2 occur within var_1]
then
     echo "All characters of var_2 occur in var_1."
else
     echo "Not all characters of var_2 occur in var_1."
fi

In this example, the output should be All characters of var_2 occur in var_1. What would go in the if statement here?

This is what I tried:

if [[ $var_1 == *$var_2* ]]

... but I think this only determines whether var_2 is a substring of var_1. What I want is to determine whether the characters of var_2 occur within var_1 in no particular order.

CodePudding user response：

The following one-liner should work:

echo -e "$var_2\0$var_1" | sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/'

It will print 0 or 1 to mean false or true respectively.
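To use that result in an if statement, one option is to wrap the pipeline in a function and compare its output with 1. This is only a sketch: the function name all_chars_via_sed is made up for illustration, it assumes GNU sed (for \x0 in the regex), and it uses printf rather than echo -e so that nothing inside the two strings is read as an escape sequence.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the one-liner; assumes GNU sed.
all_chars_via_sed() {
  local chars=$1 str=$2 verdict
  # '%s\0%s\n' writes chars, a NUL separator, then str.
  verdict=$(printf '%s\0%s\n' "$chars" "$str" |
    sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/')
  # Succeed (exit status 0) only when sed printed 1.
  [[ $verdict == 1 ]]
}

var_1="abcdefg"
var_2="bcg"

if all_chars_via_sed "$var_2" "$var_1"; then
  echo "All characters of var_2 occur in var_1."
else
  echo "Not all characters of var_2 occur in var_1."
fi
```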

This is how it works:

echo -e allows using escape sequences, and \0 represents the null character, which I'm using to mark the separation between the two strings bcg and abcdefg.
The Sed script is not that complex:
- -E enables extended regular expressions (historically a non-POSIX option), so capturing groups can be written with ( and ) instead of \( and \) (it also brings other similar simplifications which I'm not using here);
- semicolons (;) separate commands;
- :a is a label, and lets you jump here via ta or ba (I use only the former, keep reading);
- s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/ does the following (which succeeds if there's at least one character in common between var_2 and var_1):
  - matches and captures the first character of var_2 with (.),
  - matches and captures the following part of var_2 together with the null character, (.*\x0) (yes, what you write as \0 in Bash is \x0 in Sed),
  - matches and captures 0 or more characters,
  - matches what was captured by first group, i.e. by (.),
  - matches and captures 0 or more characters up to the end of var_1,
  - substitutes all that was matched with what was captured by the 2nd, 3rd, and 4th capturing groups: in fact, we've got rid of one character in common between var_2 and var_1;
- ta tests whether the previous substitution was successful and, if that's the case, jumps to :a: this way we run a loop as long as there's a character in common between var_2 and var_1;
- when there are no characters in common between var_2 and var_1, the test will fail, and control will fall through ta;
- s/^\x0.*/1/ matches whatever is left, but only if the null character \x0 is leading, which happens if all letters of var_2 were found in var_1, and changes everything to just 1;
- s/.*\x0.*/0/ will match everything, as long as there's still \x0 in the string, which happens only if the previous substitution failed, which means that some letter from var_2 was not found in var_1, and change it to 0.
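The loop described above can also be sketched in pure bash. This is not the code from this answer, only an illustration of the same idea, with a made-up function name (consumes_all). As in the sed loop, each match removes the character from both strings, so, as far as I can tell, a character repeated in var_2 needs as many occurrences in var_1.

```bash
#!/usr/bin/env bash

# Hypothetical helper: consume one matching character from the haystack
# for every character of the needle, like the s///;ta loop does.
consumes_all() {
  local needle=$1 haystack=$2 c
  while [[ -n $needle ]]; do
    c=${needle:0:1}        # first remaining character of the needle
    needle=${needle:1}     # drop it from the needle
    # Nothing left to pair it with: the sed script would print 0 here.
    [[ $haystack == *"$c"* ]] || return 1
    # Remove one occurrence of c (quoted, so it is matched literally).
    haystack=${haystack/"$c"/}
  done
  return 0                 # needle fully consumed: sed would print 1
}

consumes_all "bcg" "abcdefg" && echo 1 || echo 0   # prints 1
consumes_all "bxg" "abcdefg" && echo 1 || echo 0   # prints 0
consumes_all "bb"  "abc"     && echo 1 || echo 0   # prints 0
```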

CodePudding user response：

Not really an if clause/statement, but something like this:

#!/usr/bin/env bash

i=0
var_2="bcg"
var_1="abcdefg"
total_str=${#var_2}

while (( i < total_str )); do
  [[ $var_1 = *"${var_2:i++:1}"* ]] || {
    printf >&2 'Not all characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"
    exit 1
  }
done

printf 'All characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"

Output

All characters of the string "bcg" occur in the string "abcdefg".

Changing the value of var_2 to something like

var_2="bxg"

The output should be:

Not all characters of the string "bxg" occur in the string "abcdefg".
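If an if statement is what you are after, the same loop can be moved into a function whose exit status is then tested. A minimal sketch, with a hypothetical function name (all_chars_in):

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if every character of $1 occurs
# somewhere in $2 (repetitions in $1 are not counted separately).
all_chars_in() {
  local chars=$1 str=$2 i
  for (( i = 0; i < ${#chars}; i++ )); do
    # Quoting the substring makes it match literally, not as a pattern.
    [[ $str == *"${chars:i:1}"* ]] || return 1
  done
  return 0
}

var_1="abcdefg"
var_2="bcg"

if all_chars_in "$var_2" "$var_1"; then
  echo "All characters of var_2 occur in var_1."
else
  echo "Not all characters of var_2 occur in var_1."
fi
```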

CodePudding user response：

A very simple method in pure bash:

#!/bin/bash

var_1="abcdefg"
var_2="bcg"

if [[ ${var_2//[$var_1]} ]]; then
    echo "Not all characters of var_2 occur in var_1."
else
    echo "All characters of var_2 occur in var_1."
fi

The ${var_2//[$var_1]} expands to the value of var_2 with all characters that occur in var_1 deleted. All characters of var_2 occur in var_1 only if that expansion is the null (empty) string.
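To see what the expansion leaves behind, here is a small sketch with a made-up helper (leftover_chars) that prints the result for a few inputs. It assumes var_1 is non-empty and contains no characters that are special inside a bracket expression (such as ], ^, !, - or \), since those would change the meaning of the pattern.

```bash
#!/usr/bin/env bash

# Hypothetical helper: prints the characters of $1 that do NOT occur in $2.
# Assumes $2 is non-empty and has no bracket-special characters.
leftover_chars() {
  local chars=$1 allowed=$2
  printf '%s\n' "${chars//[$allowed]}"
}

leftover_chars "bcg" "abcdefg"   # prints an empty line
leftover_chars "bxg" "abcdefg"   # prints x
leftover_chars "xyz" "abcdefg"   # prints xyz
```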