Identifying hash encoding

I am creating a function that will accept an input and determine if the value is a certain type of hash encoding (md5, sha1, sha256, and sha512). I have asked a few classmates and logically it makes sense, but clearly something is wrong.

#!/usr/bin/bash

function identify-hash() {
  encryptinput=$(echo $1 | grep -E -i '^[a-z0-9=]+${32}')
  if [[ -n $encryptinput ]]; then
    echo "The $1 is a valid md5sum string"
    exit
  else
    encryptinput=$(echo $1 | grep -E -i '^[a-z0-9=]+${40}')
    if [[ -n $encryptinput ]]; then
      echo "The $1 is a valid sha1sum string"
      exit
    else
      encryptinput=$(echo $1 | grep -E -i '^[a-z0-9=]+${64}')
      if [[ -n $encryptinput ]]; then
        echo "The $1 is a valid sha256sum string"
        exit
      else
        encryptinput=$(echo $1 | grep -E -i '^[a-z0-9=]+${128}')
        if [[ -n $encryptinput ]]; then
          echo "The $1 is a valid sha512sum string"
          exit
        else
          echo "Unable to determine the hash function used to generate the input"
        fi
      fi
    fi
  fi
}

identify-hash $1

I know that each hash type has a specific number of characters, but I don't know exactly why it's not working. Removing the {32} from line 4 makes it report an md5sum, but then it assumes everything is an md5sum.

Suggestions?

CodePudding user response：

Fixed your script. I suspect you would have spotted most of the issues if you had used ShellCheck:

#!/usr/bin/env bash

identify_hash() {
  # local variables
  local -- encrypt_input
  local -- sumname

  # Regex capture the hexadecimal digits
  if [[ "$1" =~ ([[:xdigit:]]+) ]]; then
    encrypt_input="${BASH_REMATCH[1]}"
  else
    encrypt_input=''
  fi

  # Determine name of sum algorithm based on length of encrypt_input
  case "${#encrypt_input}" in
    32) sumname=md5sum ;;
    40) sumname=sha1sum ;;
    64) sumname=sha256sum ;;
    128) sumname=sha512sum ;;
    *) sumname=;;
  esac

  # If sum algorithm name found (sumname is not empty)
  if [ -n "$sumname" ]; then
    printf 'The %s is a valid %s string\n' "$encrypt_input" "$sumname"
  else
    printf 'Unable to determine the hash function used to generate the input\n' >&2
    exit 1
  fi
}

identify_hash "$1"
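
One caveat with the script above: the regex is not anchored, so it captures the first run of hexadecimal digits wherever it occurs, and a 32-digit hash followed by junk is still reported as an md5sum. Below is a minimal sketch of a stricter variant, assuming the whole argument must consist of hex digits. The name identify_hash_strict is made up for illustration, and it returns a status instead of calling exit so that it can be reused inside a larger script:

```shell
#!/usr/bin/env bash

# Hypothetical stricter variant: the whole argument must be hex digits.
identify_hash_strict() {
  local input=$1
  local hex_only='^[[:xdigit:]]+$'   # anchored at both ends

  # Reject empty input or anything containing a non-hex character
  [[ $input =~ $hex_only ]] || return 1

  # Map the digest length (in hex digits) to the tool that produces it
  case ${#input} in
    32)  echo md5sum ;;
    40)  echo sha1sum ;;
    64)  echo sha256sum ;;
    128) echo sha512sum ;;
    *)   return 1 ;;
  esac
}

# 32 hex digits: recognised as an md5sum
identify_hash_strict d41d8cd98f00b204e9800998ecf8427e

# Same digits with trailing junk: rejected
identify_hash_strict d41d8cd98f00b204e9800998ecf8427e-junk \
  || echo "not a recognised hash"
```

Keeping the regex in a variable avoids quoting surprises on the right-hand side of =~.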

CodePudding user response：

Further explaining @Gordon Davisson's comment, plus some basics for anyone who stops by.

NB: This answer is heavily simplified and applies only to the current question. Here's my preferred guide for more on regex.

Basics of regex

^ - start of a line
$ - end of a line
[...] - list of possible characters
- a hyphen (-) between two characters has special meaning: it denotes a range
  - a-z = all lowercase (English) letters; 0-9 = all digits; etc.
- also accepts character classes - e.g [:xdigit:] for hexadecimal characters
  - the expression is now [[:xdigit:]] - i.e [:class:] inside [...]
{...} - number of times the preceding expression should be matched
- ^[a]{1}$ will match a but not aa
- ^f[o]{2}d$ will match food but not fod, foood, or any longer run of o
- ^[a-z]{4}$ will match
  - ball ✔️ but not buffalo ❌
  - cove ✔️ but not cover ❌
  - basically any line ( because of the ^...$) containing a string of exactly 4 (English) alphabetic characters
- {1,5} - at least 1 and at most 5
- * - shorthand for {0,} meaning 0 or any number of times
- + - shorthand for {1,} meaning at least 1; but no upper limit
- ? - shorthand for {0,1} meaning optional (0 or 1 time)
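
The quantifier examples above can be checked from the shell. Here is a small sketch using grep -E; the helper name matches is made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the string ($2) matches the extended regex ($1)
matches() {
  printf '%s\n' "$2" | grep -E -q -- "$1"
}

matches '^f[o]{2}d$' food     && echo "food matches"
matches '^f[o]{2}d$' foood    || echo "foood does not match"
matches '^[a-z]{4}$' ball     && echo "ball matches"
matches '^[a-z]{4}$' cover    || echo "cover does not match {4}"
matches '^[a-z]{1,5}$' cover  && echo "cover matches {1,5}"
```

The -q flag makes grep report only through its exit status, which is what && and || test.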

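So ${32} attaches the {32} to $, i.e. it looks for 32 "end of line" anchors (\n in jargon) rather than 32 characters. What you need is [a-z0-9=]{32}$ instead: the count goes on the bracket expression, followed by a single $.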

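BUT, as Andrej Podzimek also pointed out in the comments, a hash should contain only hexadecimal characters: [0-9a-f] when matching case-insensitively (as grep -i does), or [0-9A-Fa-f] otherwise. The [:xdigit:] class, written [[:xdigit:]], covers both cases. Either can be used.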
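
To see the difference on real input, here is a sketch comparing the pattern without a length against one with the count on the bracket expression. It assumes the input should contain hex digits only; the helper name is_hex_len is made up for illustration, and the two sample values are the well-known MD5 and SHA-1 digests of the empty string:

```shell
#!/usr/bin/env bash

md5=d41d8cd98f00b204e9800998ecf8427e
sha1=da39a3ee5e6b4b0d3255bfef95601890afd80709

# Hypothetical helper: succeeds if $2 is exactly $1 hex digits
is_hex_len() {
  printf '%s\n' "$2" | grep -E -q "^[[:xdigit:]]{$1}\$"
}

# Without a length, a sha1 string also passes the "md5" test
printf '%s\n' "$sha1" | grep -E -i -q '^[a-z0-9=]+$' \
  && echo "sha1 passes the pattern that has no length"

# With the count on the bracket expression, only the right length passes
is_hex_len 32 "$md5"  && echo "md5 is 32 hex digits"
is_hex_len 32 "$sha1" || echo "sha1 is not 32 hex digits"
is_hex_len 40 "$sha1" && echo "sha1 is 40 hex digits"
```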

More basics

. (fullstop/period) matches ANY character including spaces and special characters
(...) groups a pattern so it is treated as one unit (and the text it matched can be captured)

[a-z ]*(chicken).*
will match anything from chicken coop to chicken soup and please pass that chicken cookbook, Alex?

[.] means a literal period/full stop, not "any character"
Note the space after z in [a-z ]: it makes space (ASCII 32) one of the allowed characters
and . matches letters of either case
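
The chicken pattern and the difference between . and [.] can be tried with bash's own regex operator. This is only a sketch: the variable names are made up, and BASH_REMATCH[1] holds whatever the first (...) group matched:

```shell
#!/usr/bin/env bash

re='[a-z ]*(chicken).*'
line='please pass that chicken cookbook, Alex?'

if [[ $line =~ $re ]]; then
  echo "whole match: ${BASH_REMATCH[0]}"
  echo "group 1:     ${BASH_REMATCH[1]}"   # the text matched by (chicken)
fi

# . matches any character, [.] only a literal period
dot='^a.c$'
literal='^a[.]c$'
[[ abc =~ $dot ]]     && echo "a.c matches abc"
[[ abc =~ $literal ]] || echo "a[.]c does not match abc"
[[ a.c =~ $literal ]] && echo "a[.]c matches a.c"
```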

PPS if it's for homework/assignment/schoolwork, please specify so in your question :)