Find if all characters in one string occur within another string

Time:12-04

I am new to bash. I have a question about determining if all characters of one string occur within another string. For example, if the variables are:

var_1="abcdefg"
var_2="bcg"

Then I want to write an if statement of the form:

if [all characters of var_2 occur within var_1]
then
     echo "All characters of var_2 occur in var_1."
else
     echo "Not all characters of var_2 occur in var_1."
fi

In this example, the output should be All characters of var_2 occur in var_1. What would go in the if statement here?

This is what I tried:

if [[ $var_1 == *$var_2* ]]

... but I think this only determines whether var_2 is a substring of var_1. What I want is to determine whether the characters of var_2 occur within var_1, in no particular order.

CodePudding user response:

The following one-liner should work:

echo -e "$var_2\0$var_1" | sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/'

It will print 0 or 1 to mean false or true respectively.

This is how it works:

  • echo -e allows using escape sequences, and \0 represents the null character, which I'm using to mark the separation between the two strings bcg and abcdefg.
  • The Sed script is not that complex:
    • -E is a non-POSIX option that lets you write capturing groups with ( and ) instead of \( and \) (along with other similar simplifications which I'm not using here);
    • the ; characters separate commands;
    • :a is a label, which allows jumping here via ta or ba (I use only the former, keep reading);
    • s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/ does the following (which succeeds if there's at least one character in common between var_2 and var_1):
      • matches and captures a character of var_2 with (.) (the leftmost one that also occurs in var_1),
      • matches and captures the following part of var_2 together with the null character, (.*\x0) (yes, what you write as \0 in Bash is \x0 in Sed),
      • matches and captures 0 or more characters,
      • matches what was captured by first group, i.e. by (.),
      • matches and captures 0 or more characters up to the end of var_1,
      • substitutes all that was matched with what was captured by the 2nd, 3rd, and 4th capturing groups: in effect, this removes one occurrence of a character common to var_2 and var_1 from both strings;
    • ta tests whether the previous substitution was successful and, if so, jumps to :a: this way we run a loop as long as there's a character in common between var_2 and var_1;
    • when there are no characters in common between var_2 and var_1, the test fails, and control falls through ta;
    • s/^\x0.*/1/ matches whatever is left, but only if the null character \x0 is leading, which happens if all letters of var_2 were found in var_1, and changes everything to just 1;
    • s/.*\x0.*/0/ will match everything, as long as there's still \x0 in the string, which happens only if the previous substitution failed, which means that some letter from var_2 was not found in var_1, and change it to 0.
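
To use this in an if statement, the 0/1 output can be captured and tested. The following is a minimal sketch, assuming GNU sed (for -E and \x0); the function name all_chars_in_sed is hypothetical and not part of the answer. It uses printf instead of echo -e, because with echo -e a var_1 that begins with digits would be absorbed into the \0 octal escape. Note also that, since each matched character is removed from var_1 as well, a character repeated in var_2 has to occur at least as many times in var_1.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper (not from the answer): succeeds when the sed
# script prints 1. Assumes GNU sed, which understands -E and \x0.
all_chars_in_sed() {
  local haystack=$1 needles=$2 result
  # printf '%s\0%s\n' emits: needles, a NUL separator, haystack, newline.
  result=$(printf '%s\0%s\n' "$needles" "$haystack" |
    sed -E ':a;s/(.)(.*\x0)(.*)\1(.*)/\2\3\4/;ta;s/^\x0.*/1/;s/.*\x0.*/0/')
  [[ $result == 1 ]]
}

var_1="abcdefg"
var_2="bcg"

if all_chars_in_sed "$var_1" "$var_2"; then
  echo "All characters of var_2 occur in var_1."
else
  echo "Not all characters of var_2 occur in var_1."
fi
```

With this sketch, all_chars_in_sed "a" "aa" fails, because the single a in the first argument is consumed by the first match.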

CodePudding user response:

Not really an if statement, but something like this:

#!/usr/bin/env bash

i=0
var_2="bcg"
var_1="abcdefg"
total_str=${#var_2}

while (( i < total_str )); do
  [[ $var_1 = *"${var_2:i++:1}"* ]] || {
    printf >&2 'Not all characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"
    exit 1
  }
done

printf 'All characters of the string "%s" occur in the string "%s".\n' "$var_2" "$var_1"

Output

All characters of the string "bcg" occur in the string "abcdefg".

Changing the value of var_2 to something like

var_2="bxg"

The output should be:

Not all characters of the string "bxg" occur in the string "abcdefg".
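
If an if statement is wanted after all, the same character-by-character loop can be wrapped in a function whose exit status drives the condition. This is only a sketch under that assumption; the name has_all_chars is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: returns 0 if every character of $2 occurs in $1.
has_all_chars() {
  local haystack=$1 needles=$2 i
  for (( i = 0; i < ${#needles}; i++ )); do
    # Quoting the expansion makes the character match literally,
    # even if it is a glob metacharacter such as * or ?.
    [[ $haystack == *"${needles:i:1}"* ]] || return 1
  done
  return 0
}

var_1="abcdefg"
var_2="bcg"

if has_all_chars "$var_1" "$var_2"; then
  echo "All characters of var_2 occur in var_1."
else
  echo "Not all characters of var_2 occur in var_1."
fi
```

Unlike a version that calls exit, the function returns a status, so the calling script keeps running whichever branch is taken.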

CodePudding user response:

A very simple method in pure bash:

#!/bin/bash

var_1="abcdefg"
var_2="bcg"

if [[ ${var_2//[$var_1]} ]]; then
    echo "Not all characters of var_2 occur in var_1."
else
    echo "All characters of var_2 occur in var_1."
fi

The expansion ${var_2//[$var_1]} yields the value of var_2 with every character that occurs in var_1 deleted. All characters of var_2 occur in var_1 only if that expansion is the null (empty) string.
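
To see what the expansion leaves behind, here is a small sketch that prints the leftover characters for a few inputs. The helper name leftover_chars is hypothetical, and the sketch assumes var_1 contains no characters that are special inside a bracket expression (such as ], ^, ! or -), since those would change the meaning of the pattern.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the characters of $2 that do NOT occur in $1.
# Assumes $1 has no bracket-expression metacharacters (], ^, !, -).
leftover_chars() {
  local haystack=$1 needles=$2
  printf '%s' "${needles//[$haystack]}"
}

var_1="abcdefg"

for var_2 in "bcg" "bxg" "xyz"; do
  printf '%s -> leftover "%s"\n' "$var_2" "$(leftover_chars "$var_1" "$var_2")"
done
# bcg -> leftover ""
# bxg -> leftover "x"
# xyz -> leftover "xyz"
```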
