how to get substring from



 42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!         

to be

BEGIN!@#Ghjk,GhjkEND#@!


Note: there is whitespace at the end of each line. I tried to remove it, but could not.

I tried


s=$(awk '/BEGIN!@#/,/END#@!/' switch.log)

while IFS= read -r line
do
  h=$(echo "$line" | awk '{$1=$1;print}')
  for i in {0..100}
  do
    zzz=$(echo "$h" | awk '{print $(NF-$i)}')

    if [ ! -z "$zzz" -a "$zzz" != " " ]; then
      hh=$(echo "$h" | awk '{print $(NF-$i)}')
      echo "$zzz"
      echo -e "$zzz" >> ggg.txt
    fi
  done
done <<< "$s"

I got


CodePudding user response:

Another option is to use sed's substitute command, capturing the two pieces of text you want to keep in groups and referring to them as the first two backreferences. For example:

sed -E 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)[[:space:]]*$/\1\2/' <<< 'your string'

Example Use/Output

(note: updated to handle whitespace at the end)

$ sed -E 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)[[:space:]]*$/\1\2/' <<< '42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!'
BEGIN!@#Ghjk,GhjkEND#@!

(note: single-quoting the string is required due to '!')
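
When the input is a log file rather than a single string, the same substitution can be combined with sed's -n option and the p flag, so that lines lacking either marker are not printed. The following is only a sketch under the assumption that the markers look like those in the question; the function name extract_markers is hypothetical.

```shell
# extract_markers is a hypothetical helper name, not part of sed.
# -n suppresses automatic printing; the trailing p flag prints only
# the lines on which the substitution succeeded.
extract_markers() {
  sed -nE 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)[[:space:]]*$/\1\2/p' "$@"
}

# Demo: one line without markers, one with markers and trailing spaces.
printf '%s\n' \
  'no markers on this line' \
  ' 42 45 BEGIN!@#Ghjk,Ghj 6b 45 kEND#@!   ' |
  extract_markers
```

Called with a file name (for example extract_markers switch.log) it reads that file; with no arguments it reads standard input.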

CodePudding user response:

Using sed

$ sed -E 's/[0-9]+[a-z]? +| +//g' input_file

CodePudding user response:

UPDATED to fix an error: your question does not define precisely what the string to be extracted looks like in general, but based on your example, this would do:

if [[ $line =~ (BEGIN[^ ]+)\ .*([^ ]+END[^ ]+) ]]; then
  echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
else
  echo Pattern not found in line 1>&2
fi
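
To apply the test to every line of some input, it can be wrapped in a function and called from a read loop. This is a rough sketch that assumes bash (it relies on [[ =~ ]] and BASH_REMATCH); the function name extract_line and the demo input are made up for illustration.

```shell
#!/usr/bin/env bash
# extract_line is a hypothetical helper name.
extract_line() {
  local line=$1
  # Keeping the regex in a variable avoids having to escape the space.
  local re='(BEGIN[^ ]+) .*([^ ]+END[^ ]+)'
  if [[ $line =~ $re ]]; then
    # Group 1 is the BEGIN token, group 2 the token containing END.
    printf '%s%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    echo "Pattern not found in line" >&2
    return 1
  fi
}

# Demo input: the second line has no markers and is reported on stderr.
while IFS= read -r line; do
  extract_line "$line" || continue
done < <(printf '%s\n' ' 42 45 BEGIN!@#Ghjk,Ghj 6b 45 kEND#@!   ' 'no markers here')
```

Because the pattern only excludes spaces, the second group may start partway through a word when the text before END is longer than one character; that is fine for the sample, but worth checking against real data.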

CodePudding user response:

I would use GNU AWK for this task in the following way. Let the content of file.txt be

 42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!        


awk 'BEGIN{FPAT="[^[:space:]]*(BEGIN|END)[^[:space:]]*";OFS=""}{$1=$1;print}' file.txt

gives output

BEGIN!@#Ghjk,GhjkEND#@!


Explanation: Using the field pattern (FPAT), I tell GNU AWK that a field is BEGIN or (|) END, prefixed and suffixed by zero or more (*) non-whitespace ([^[:space:]]) characters, and that the output field separator (OFS) is the empty string. Then, for each line, I do $1=$1 to force the line to be rebuilt from its fields, and print it. If you are sure that only space characters are used in the line, you may replace [^[:space:]] with [^ ]

(tested in gawk 4.2.1)
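
FPAT is a GNU AWK extension. Where gawk is not available, the same idea (collect every run of non-blank characters that contains BEGIN or END and join the runs with nothing in between) can be approximated with a match() loop. This is a swapped-in technique rather than the FPAT method itself, and only a sketch: it treats just spaces and tabs as blanks, and the function name extract_tokens is hypothetical.

```shell
# extract_tokens is a hypothetical helper name.
extract_tokens() {
  awk '{
    out = ""
    rest = $0
    # Repeatedly find the next run of non-blank characters containing BEGIN or END.
    while (match(rest, /[^ \t]*(BEGIN|END)[^ \t]*/)) {
      out = out substr(rest, RSTART, RLENGTH)   # append the matched token
      rest = substr(rest, RSTART + RLENGTH)     # continue after the match
    }
    # A choice made for this sketch: print nothing for lines without a match.
    if (out != "") print out
  }' "$@"
}

printf '%s\n' ' 42 45 BEGIN!@#Ghjk,Ghj 6b 45 kEND#@!   ' | extract_tokens
```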

CodePudding user response:

The inherent logic of the transformation is unclear so you have a few options that will work for the sample input:

  • Remove all pairs of space-delimited hexadecimal digits and spaces:
sed -nE -e 's/(^| )[[:xdigit:]]{2}( [[:xdigit:]]{2})*( |$)| +//g' \
        -e '/BEGIN!@#|END#@!/p'
  • Print all the space-delimited substrings that contain BEGIN!@# or END#@!:
awk '
    {
        ok = 0
        for (i = 1; i <= NF; i++)
            if ($i ~ /BEGIN!@#|END#@!/) {
                printf "%s", $i
                ok = 1
            }
        if (ok)
            print ""
    }
'
  • Extract the substring delimited by BEGIN!@# and END#@! and remove the space-delimited content between them:
awk '
    match($0,/BEGIN!@#.*END#@!/) {
        s = substr($0,RSTART,RLENGTH)
        sub(/ .* | /,"",s)
        print s
    }
'