Get first character of each string with BASH_REMATCH

Time:11-11

I am trying to get the first character of each word in a string using a regex and BASH_REMATCH in a shell script.

My input text file contains:

    config_text = STACK OVER FLOW

The words STACK OVER FLOW are always uppercase, as shown.

My output should be something like this :

    SOF

My code so far is:

        var = config_text
        values=$(grep $var test_file.txt | tr -s ' '  '\n' | cut -c 1)
        if [[ $values =~ [=(.*)]]; then
           echo $values
        fi

As you can see, I am using tr and cut, but I would like to replace them with BASH_REMATCH alone, because these two commands have been reported in many links as not working on macOS.

I tried something like this :

        var = config_text
        values=$(grep $var test_file.txt)
        if [[ $values =~ [=(.*)(\b[a-zA-Z])]]; then
           echo $values
        fi

As explained above, values should then contain:

    S O F

But it seems \b does not work in a shell script. Does anyone have an idea how to get my desired output with BASH_REMATCH only? Thanks in advance for any help.

CodePudding user response:

First, add a valid shebang and paste your script at https://shellcheck.net for validation and recommendations.


Assuming the line starts with config_text and ends with FLOW, e.g.:

config_text = STACK OVER FLOW

Now the script.

#!/usr/bin/env bash

values="config_text = STACK OVER FLOW"
regexp="config_text = ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1}).+$"

while IFS= read -r line; do
  [[ "$line" = "$values" && "$values" =~ $regexp ]] &&
  printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
done < test_file.txt

If the file has only one line, or the target line is the first line of test_file.txt, the while loop is not needed.

#!/usr/bin/env bash

values="config_text = STACK OVER FLOW"
regexp="config_text = ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1})[^ ]+ ([[:upper:]]{1}).+$"

IFS= read -r line < test_file.txt
[[ "$line" = "$values" && "$values" =~ $regexp ]] &&
printf '%s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
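
The scripts above compare each line against a fully hard-coded string. As a minimal sketch of a slightly looser variant, the key can be passed in and the line matched by regex alone. The function name first_letters_of_key and its two parameters are hypothetical, and the sketch still assumes exactly the "key = WORD WORD WORD" layout from the question, with a key that contains no regex metacharacters.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not from the original answer): print the first
# letters of the three uppercase words stored under a key.
#   $1: key, e.g. config_text   $2: file to search
first_letters_of_key() {
  local key=$1 file=$2 line
  # The regex lives in a variable so that [[ =~ ]] treats it as a pattern.
  local regexp="^${key} = ([[:upper:]])[^ ]* ([[:upper:]])[^ ]* ([[:upper:]])"
  while IFS= read -r line; do
    if [[ $line =~ $regexp ]]; then
      printf '%s%s%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
      return 0
    fi
  done < "$file"
  return 1   # no line matched the key
}
```

Called as first_letters_of_key config_text test_file.txt, this should print SOF for the input file shown in the question and return a non-zero status when the key is absent.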

CodePudding user response:

Bash's regexes are kind of cumbersome if you don't know how many words there are in the input string. How's this instead?

config_text="STACK OVER FLOW"
sed 's/\([^[:space:]]\)[^[:space:]]*/\1/g' <<<"$config_text"
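
The command above starts from a variable that already holds the words. A rough sketch of getting there from the file, assuming the "key = value" layout from the question, is to strip the key with a first sed and feed the rest to the same substitution. The function name initials_sed is hypothetical, and the key is interpolated into the regex, so it is assumed to contain no regex metacharacters.

```bash
#!/usr/bin/env bash

# Hypothetical wrapper around the sed substitution shown above.
#   $1: key, e.g. config_text   $2: file to search
initials_sed() {
  # First sed: keep only lines starting with "key = " and drop that prefix.
  # Second sed: reduce each run of non-space characters to its first character.
  sed -n "s/^$1 = //p" "$2" |
    sed 's/\([^[:space:]]\)[^[:space:]]*/\1/g'
}
```

With the question's test_file.txt, initials_sed config_text test_file.txt should print S O F, keeping the original spaces between the letters.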

CodePudding user response:

Another option, instead of a bash regex, is bash parameter expansion: the substring form ${parameter:offset:length} extracts the desired characters. Here read -ra splits the line into an array, so the three words land at indices 2 to 4:

$ read -ra arr <text.file ; printf "%s%s%s\n" "${arr[2]:0:1}" "${arr[3]:0:1}" "${arr[4]:0:1}"
SOF
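
The one-liner above hard-codes three array indices. A minimal sketch of the same substring expansion applied to any number of words could look like this; the function name initials is hypothetical, and the input line is the one from the question.

```bash
#!/usr/bin/env bash

# Hypothetical helper: concatenate the first character of every argument.
initials() {
  local word out=""
  for word in "$@"; do
    out+="${word:0:1}"   # substring expansion: offset 0, length 1
  done
  printf '%s\n' "$out"
}

# Split the line on whitespace, then skip "config_text" and "=" (indices 0 and 1).
read -ra arr <<< "config_text = STACK OVER FLOW"
initials "${arr[@]:2}"   # prints SOF
```

Because the loop runs over "$@", the same function works unchanged if the value has two words or ten.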

CodePudding user response:

A generic BASH_REMATCH solution handling any number of words and any separator.

# Prefix this line with `local` if the code lives inside a function.
input="STACK OVER FLOW" pattern='([[:upper:]]+)([^[:upper:]]*)' result=""
while [[ $input =~ $pattern ]]; do
    result+="${BASH_REMATCH[1]::1}${BASH_REMATCH[2]}"
    input="${input:${#BASH_REMATCH[0]}}"
done
echo "$result"
# Output: "S O F"
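
As a variation on the loop above, here is a sketch that takes the whole config line, skips the lowercase key, and joins the letters without separators (SOF rather than S O F). The function name acronym is hypothetical, and the sketch assumes that only the words of interest contain uppercase letters.

```bash
#!/usr/bin/env bash

# Hypothetical helper: collect the first letter of every run of
# uppercase characters found in the input string.
acronym() {
  local input=$1 pattern='([[:upper:]])[[:upper:]]*' result=""
  while [[ $input =~ $pattern ]]; do
    result+=${BASH_REMATCH[1]}
    # Drop everything up to and including the run that just matched.
    input=${input#*"${BASH_REMATCH[0]}"}
  done
  printf '%s\n' "$result"
}

acronym "config_text = STACK OVER FLOW"   # prints SOF
```

The regex engine behind [[ =~ ]] returns the leftmost match, so the lowercase key and the equals sign are skipped without a separate stripping step.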