Check if variables are matching and if not find the difference

Time:04-02

I have data in 2 variables.

var1='abc;mno;def'
var2='mno;xyz'
** var2 can also be 'def;mno;abc'   

I want to compare var1 and var2

if [[ $var1 == "$var2" ]]
  then  
    echo "MATCHED"
  else 
    echo "Not Matched"
fi

This gives me basic validation, but my requirement is a bit different: I want the data present in var1 but missing in var2, and also the data present in var2 but missing in var1.

I want the output in the following format:

abc;mno;def,mno;xyz,Not Matched,abc;def(## data present in var1 but missing in var2),xyz(data present in var2 but missing in var1)

Second case var1='abc;mno;def' var2='def;mno;abc'

Output

abc;mno;def,def;mno;abc,Matched, ,

Any suggestion would be great.

CodePudding user response:

Assuming you are using bash, would you please try the following:

#!/bin/bash

var1='abc;mno;def'
var2='mno;xyz'
#var2='def;mno;abc'

while IFS=';' read -r c1 c2 c3; do
    [[ -n $c1 ]] && col1+=("$c1")       # present in var1 only
    [[ -n $c2 ]] && col2+=("$c2")       # present in var2 only
    [[ -n $c3 ]] && col3+=("$c3")       # present in both var1 and var2
done < <(comm <(tr ';' '\n' <<< "$var1" | sort) <(tr ';' '\n' <<< "$var2" | sort) | tr '\t' ';')

if (( ${#col1[@]} == 0 && ${#col2[@]} == 0 )); then
    msg="Matched"                       # both col1 and col2 are empty
else
    msg="Not Matched"
fi

printf "%s,%s,%s,%s,%s\n" "$var1" "$var2" "$msg" \
        "$(IFS=';'; echo "${col1[*]}")" "$(IFS=';'; echo "${col2[*]}")"

Output:

abc;mno;def,mno;xyz,Not Matched,abc;def,xyz

Output for the second case:

abc;mno;def,def;mno;abc,Matched,,

Explanation:

  • tr ';' '\n' <<< "$var1" | sort breaks $var1 into lines of sorted data to feed to comm. Same with $var2.
  • comm <(...) <(...) compares the two sorted inputs and prints three columns: lines found only in the first input, lines found only in the second, and lines common to both.
  • The tab characters that comm uses to indent its columns are replaced with ; so that read can handle them. Tabs are IFS whitespace, so read would otherwise strip the leading tabs and lose track of which column a value belongs to.
  • The data in the 1st column (unique to var1) are accumulated in the array col1. Same with col2 and col3.
  • We can check for a match of $var1 and $var2 by examining the length of arrays col1 and col2. If both are empty, the variables match.
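
To make the intermediate step easier to see, here is a minimal sketch that prints what the while loop actually reads. The function name show_columns is hypothetical and exists only for this illustration; it assumes bash and plain ASCII items.

```bash
#!/bin/bash

# Hypothetical helper: print comm's three-column output with each
# tab replaced by ';', exactly as it is fed to the read loop.
show_columns() {
    comm <(tr ';' '\n' <<< "$1" | sort) \
         <(tr ';' '\n' <<< "$2" | sort) | tr '\t' ';'
}

show_columns 'abc;mno;def' 'mno;xyz'
# abc      <- no leading ';'  : only in var1 (1st field)
# def
# ;;mno    <- two leading ';' : in both      (3rd field)
# ;xyz     <- one leading ';' : only in var2 (2nd field)
```

Because ; is not a whitespace character, read keeps the empty leading fields, so each value lands in the variable that matches its column.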

Here is an awk alternative just for reference:

#!/bin/bash

var1='abc;mno;def'
var2='mno;xyz'
#var2='def;mno;abc'

awk -v var1="$var1" -v var2="$var2" '
BEGIN {
    split(var1, ai, ";")        # split var1 on ";"
    split(var2, bi, ";")        # same as above
    for (i in ai) a[ai[i]]      # generate associative array
    for (i in bi) b[bi[i]]      # same as above

    for (i in a) {
        if (! (i in b)) {       # seen in a, not b
            uniq1[i]            # then store it in uniq1
        }
    }
    for (i in b) {
        if (! (i in a)) {       # seen in b, not a
            uniq2[i]            # then store it in uniq2
        }
    }

    fs = ""
    for (i in uniq1) {          # join elements of uniq1 with ";"
        u1 = u1 fs i            # into a string u1
        fs = ";"
    }
    fs = ""
    for (i in uniq2) {          # join elements of uniq2 with ";"
        u2 = u2 fs i            # into a string u2
        fs = ";"
    }

    msg = (length(u1) == 0 && length(u2) == 0)? "Matched" : "Not Matched"
    printf("%s,%s,%s,%s,%s\n", var1, var2, msg, u1, u2)
}'
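
One caveat with the awk version: for (i in array) visits keys in an unspecified order, so the unique items may come out as def;abc rather than abc;def. If the original order matters, the same idea can be sketched in pure bash with associative arrays. This is only an illustration, not part of the answer above: the function name compare_lists is hypothetical, it needs bash 4 or later, and it assumes no item is repeated within a single variable.

```bash
#!/bin/bash

# Hypothetical helper: compare two ';'-separated lists (bash 4+).
compare_lists() {
    local var1=$1 var2=$2 item msg
    local -a list1 list2 only1=() only2=()
    local -A in1=() in2=()

    # Split each variable on ';' into an indexed array.
    IFS=';' read -r -a list1 <<< "$var1"
    IFS=';' read -r -a list2 <<< "$var2"

    # Record membership in associative arrays for fast lookup.
    for item in "${list1[@]}"; do [[ -n $item ]] && in1[$item]=1; done
    for item in "${list2[@]}"; do [[ -n $item ]] && in2[$item]=1; done

    # Collect items missing from the other list, keeping input order.
    for item in "${list1[@]}"; do
        [[ -n $item && -z ${in2[$item]} ]] && only1+=("$item")
    done
    for item in "${list2[@]}"; do
        [[ -n $item && -z ${in1[$item]} ]] && only2+=("$item")
    done

    if (( ${#only1[@]} == 0 && ${#only2[@]} == 0 )); then
        msg="Matched"
    else
        msg="Not Matched"
    fi

    local IFS=';'               # makes "${array[*]}" join with ';'
    printf '%s,%s,%s,%s,%s\n' "$var1" "$var2" "$msg" "${only1[*]}" "${only2[*]}"
}

compare_lists 'abc;mno;def' 'mno;xyz'      # abc;mno;def,mno;xyz,Not Matched,abc;def,xyz
compare_lists 'abc;mno;def' 'def;mno;abc'  # abc;mno;def,def;mno;abc,Matched,,
```

Unlike the comm approach, this does not sort the items and starts no external processes, which may matter when the comparison runs many times in a loop.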