In Bash is there a way to extract a word and n characters after it from a line?

Time:02-15

I am trying to extract the JIRA Ticket number from a string.

The Jira ticket might be mentioned anywhere in the line, like:

  1. Merge pull request #1387 from Config-change/REL-12345

  2. REL-12345: Enable XAPI at config level

I just want REL-12345 as the output.

Can someone please help? Thanks!

CodePudding user response:

If the input always follows one of these two formats, cut can split on the delimiter:

Input: Merge pull request #1387 from Config-change/REL-12345

echo "Merge pull request #1387 from Config-change/REL-12345" | cut -d/ -f2

Input: REL-12345: Enable XAPI at config level

echo "REL-12345: Enable XAPI at config level" | cut -d: -f1

CodePudding user response:

You can pass a string to sed and use a regex substitution, like this:

myString="This is REL-12345 a test string "
sed -n 's/.*\(REL-[0-9][0-9]*\).*/\1/p' <<< "$myString"

This should return: REL-12345
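
As a quick check, the sed approach can be run against both lines from the question. This is only a sketch: the function name extract_with_sed is made up for illustration, and the pattern assumes a ticket is REL- followed by one or more digits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the REL-<digits> ticket found in the given string.
extract_with_sed() {
    # -n together with the p flag prints only lines where the substitution
    # succeeded, so a string without a ticket produces no output.
    printf '%s\n' "$1" | sed -n 's/.*\(REL-[0-9][0-9]*\).*/\1/p'
}

extract_with_sed "Merge pull request #1387 from Config-change/REL-12345"
extract_with_sed "REL-12345: Enable XAPI at config level"
```

Because the leading .* is greedy, a line containing several tickets would yield the last one, not the first.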

CodePudding user response:

Sample data:

$ cat jira.dat
Merge pull request #1387 from Config-change/REL-12345
REL-12346: Enable XAPI at config level

One idea using bash regex matching and the resulting BASH_REMATCH[]:

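regex='(REL-[[:digit:]]+)'

while IFS= read -r line
do
    printf '\n########## %s\n' "${line}"
    [[ "${line}" =~ ${regex} ]] && echo "${BASH_REMATCH[1]}"
done < jira.dat

This generates:

########## Merge pull request #1387 from Config-change/REL-12345
REL-12345

########## REL-12346: Enable XAPI at config level
REL-12346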
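
For a single string rather than a file, the BASH_REMATCH idea might be wrapped in a small function. This is a sketch under assumptions: the name extract_ticket and the optional prefix argument (defaulting to REL) are not from the answer above, and the prefix is inserted into the regex as-is, so it should not contain regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first <prefix>-<digits> match in a string.
#   $1 = text to search
#   $2 = project prefix (assumed default: REL)
extract_ticket() {
    local text=$1 prefix=${2:-REL}
    local regex="(${prefix}-[[:digit:]]+)"
    # BASH_REMATCH[1] holds the text captured by the first parenthesized group.
    # The function returns non-zero when nothing matches.
    [[ $text =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[1]}"
}

extract_ticket "Merge pull request #1387 from Config-change/REL-12345"
extract_ticket "REL-12346: Enable XAPI at config level"
```

This needs bash itself, since [[ ... =~ ... ]] and BASH_REMATCH are not available in plain POSIX sh.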

CodePudding user response:

Sample data:

$ cat jira.dat
Merge pull request #1387 from Config-change/REL-12345
REL-12346: Enable XAPI at config level

One idea using grep:

$ grep -Eo 'REL-[[:digit:]]+' jira.dat
REL-12345
REL-12346
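
Note that grep -o prints every match on its own line, so an input line that mentions two tickets produces two output lines. If only the first ticket is wanted, one option is to cut the output off with head. The sketch below assumes that behavior is what you want; first_ticket is a made-up name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the first REL-<digits> match in a string.
first_ticket() {
    # grep -o emits one match per line; head -n 1 keeps the first of them.
    printf '%s\n' "$1" | grep -Eo 'REL-[[:digit:]]+' | head -n 1
}

# Without head, both tickets are printed, one per line.
printf '%s\n' "Merge REL-12345 into REL-12346" | grep -Eo 'REL-[[:digit:]]+'

# With the helper, only the first one is kept.
first_ticket "Merge REL-12345 into REL-12346"
```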

CodePudding user response:

grep -Eow 'REL-[0-9]+'

+ means one or more. To specify exactly N digits (e.g. 5):

grep -Eow 'REL-[0-9]{5}'

  • Ranges: {3,6} is 3 to 6, {5,} is 5 or more, etc.
  • On GNU/Linux: man grep -> /Repetition for more details.
  • -o prints only the matching strings
  • -w matches full words only, i.e. it avoids matching WREL-12345 (for example)
  • grep -Eow 'REL-[[:alnum:]]+' for both letters and numbers (after REL-).
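
To see how -w and a fixed repetition count interact, here is a small sketch. The function name tickets_exact and the three test inputs are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read lines from stdin and print REL- tickets that have
# exactly N digits and stand as whole words.
#   $1 = number of digits
tickets_exact() {
    grep -Eow "REL-[0-9]{$1}"
}

# WREL-12345 is rejected because a word character (W) precedes the match.
# REL-123456 is rejected because a sixth digit follows the five matched ones.
printf '%s\n' 'WREL-12345' 'REL-12345: ok' 'REL-123456' | tickets_exact 5
```

Only the middle line yields a match here, so the output is the single line REL-12345.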