How can I extract a substring from
42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!
to be
BEGIN!@#Ghjk,GhjkEND#@!
Note: there is whitespace at the end of the lines. I tried to remove it, but I could not.
I tried
#!/bin/bash
s=$(awk '/BEGIN!@#/,/END#@!/' switch.log )
while IFS= read -r line
do
h=$(echo "$line" | awk '{$1=$1;print}')
for i in {0..100}
do
zzz=$(echo "$h" | awk '{print $(NF-$i)}')
if [ ! -z "$zzz" -a "$zzz" != " " ]; then
hh=$(echo "$h" | awk '{print $(NF-$i)}')
echo "$zzz"
echo -e "$zzz" >> ggg.txt
break
fi
done
done <<< "$s"
I got
BEGIN!@#Ghjk,Ghj
CodePudding user response:
Another option is to use sed with an ordinary substitution, capturing the two pieces of text you want to keep as the first two backreferences. For example:
sed -E 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)/\1\2/' <<< 'your string'
Example Use/Output
(note: updated to handle whitespace at the end)
$ sed -E 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)/\1\2/' <<< '42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!'
BEGIN!@#Ghjk,GhjkEND#@!
(note: single-quoting the string is required because of the '!')
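Because the question mentions trailing whitespace, the same substitution can be extended so that it also consumes whatever whitespace follows the END token. This is only a minimal sketch: the function name extract_marked is hypothetical, it assumes a sed that accepts -E (GNU and BSD sed do), and it assumes nothing but whitespace follows the kEND token on the line.

```shell
#!/bin/bash
# extract_marked (hypothetical name): read lines on stdin and print the
# BEGIN... and kEND... tokens joined together, one result per matching line.
extract_marked() {
    # -n plus the p flag prints only lines where the substitution succeeded.
    # The trailing [[:space:]]*$ swallows any whitespace after the kEND token.
    sed -nE 's/^.*(BEGIN[^[:space:]]+).*(kEND[^[:space:]]+)[[:space:]]*$/\1\2/p'
}

# Sample line from the question, with three trailing spaces added.
extract_marked <<< '42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!   '
```

Lines that do not contain both tokens produce no output, which makes the function usable directly on a whole log file.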
CodePudding user response:
Using sed
$ sed -E 's/[0-9]+[a-z]? | //g' input_file
BEGIN!@#Ghjk,GhjkEND#@!
CodePudding user response:
UPDATED to fix an error: your question does not define precisely what the string to be extracted looks like in general, but based on your example, this would do:
if [[ $line =~ (BEGIN[^ ]+)\ .*([^ ]+END[^ ]+) ]]
then
substring=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
else
echo Pattern not found in line 1>&2
fi
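To apply the same test to every line of the input, the regex can be kept in a variable and used inside a read loop. The following is a sketch under a few assumptions: the function name extract_tokens is hypothetical, and [[:space:]] is swapped in for the literal space so that tabs and trailing whitespace are treated like spaces.

```shell
#!/bin/bash
# extract_tokens (hypothetical name): for each line on stdin, print the
# BEGIN... token immediately followed by the ...END... token.
extract_tokens() {
    # Storing the regex in a variable avoids quoting issues inside [[ ... ]].
    local re='(BEGIN[^[:space:]]+)[[:space:]].*([^[:space:]]+END[^[:space:]]+)'
    local line
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
        else
            echo "Pattern not found in line" >&2
        fi
    done
}

# Sample line from the question, with trailing spaces added.
extract_tokens <<< '42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!   '
```

The function reads standard input, so it could be fed a file with a redirection such as extract_tokens < switch.log (the file name is taken from the question).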
CodePudding user response:
I would use GNU AWK for this task in the following way. Let the content of file.txt be
42 45 47 49 4e 21 40 23 47 68 6a 6b 2c 47 68 6a BEGIN!@#Ghjk,Ghj 6b 45 4e 44 23 40 21 kEND#@!
then
awk 'BEGIN{FPAT="[^[:space:]]*(BEGIN|END)[^[:space:]]*";OFS=""}{$1=$1;print}' file.txt
gives output
BEGIN!@#Ghjk,GhjkEND#@!
Explanation: I inform GNU AWK, using the field pattern (FPAT), that a field is BEGIN or (|) END, prefixed and suffixed by zero or more (*) non (^) whitespace ([:space:]) characters, and that the output field separator (OFS) is the empty string. Then, for each line, I do $1=$1 to trigger a rebuild of the line and print it. If you are sure that only space characters are used in the line, you might elect to replace [^[:space:]] with [^ ].
(tested in gawk 4.2.1)
CodePudding user response:
The inherent logic of the transformation is unclear so you have a few options that will work for the sample input:
- Remove all pairs of space-delimited hexadecimal digits and spaces
sed -nE -e 's/(^| )[[:xdigit:]]{2}( [[:xdigit:]]{2})*( |$)| //g' \
-e '/BEGIN!@#|END#@!/p'
- Print all the space-delimited substrings that contain BEGIN!@# or END#@!:
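awk '
{
    ok = 0
    # Print every field that contains one of the markers, without separators
    for (i = 1; i <= NF; i++)
        if ($i ~ /BEGIN!@#|END#@!/) {
            printf "%s", $i
            ok = 1
        }
    # Terminate the line only if something was printed
    if (ok)
        print ""
}
'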
- Extract the substrings delimited by BEGIN!@# and END#@! and remove the space-delimited content between them:
awk '
match($0,/BEGIN!@#.*END#@!/) {
    s = substr($0,RSTART,RLENGTH)
    sub(/ .* | /,"",s)
    print s
}
'