Decode binary octet string in a file with perl

Time:07-14

I have a file in which some lines contain a number that is encoded as text -> binary -> octets, and I need to decode it to recover the number. Every line that holds such an encoded string begins with STRVID:

For example I have in one of the lines:

STRVID: SarI3gXp

If I run echo "SarI3gXp" | perl -lpe '$_=unpack"B*"' I get the number in binary:

0101001101100001011100100100100100110011011001110101100001110000

Then, to convert from binary to octets (hexadecimal digits), I assign the output of the previous command to a variable and print that value in hex:

variable=$(echo "SarI3gXp" | perl -lpe '$_=unpack"B*"') ; printf '%x\n' "$((2#$variable))"

The result is the number, but its digits are not in the correct order:

5361724933675870

To put the digits in the correct order I have to swap the two digits of every pair, taking the second digit first and then the first. Something like this:

variable=$(echo "SarI3gXp" | perl -lpe '$_=unpack"B*"') ; printf '%x\n' "$((2#$variable))" | gawk 'BEGIN {FS = ""} {print $2 $1 $4 $3 $6 $5 $8 $7 $10 $9 $12 $11 $14 $13 $16 $15}'

And finally I have the number I'm looking for:

3516279433768507

I have no idea how to do this automatically for every line in my file that begins with STRVID:. What I need in the end is the whole file unchanged, except that every line beginning with STRVID: shows the decoded value instead.

When I find this:

STRVID: SarI3gXp

I want to end up with this in my file:

STRVID: 3516279433768507

Can someone help with this?

CodePudding user response:

You can swap the digits of each pair entirely via regex, and without back-references:

variable=$(echo "SarI3gXp"        | perl -lpe '$_=unpack"B*"') ; 
printf '%x\n' "$((2#$variable))"  | 

mawk -F'^$' 'gsub("..", "_&=&_")   gsub(\
                "(^|[0-9]_)(_[0-9]|$)", _) gsub("=",_)^_'
3516279433768507

The idea is to duplicate each pair on the other side of an equals sign, like this:

_53=53__61=61__72=72__49=49__33=33__67=67__58=58__70=70_

Then scrub out the leftovers, since the digits you want now sit on either side of each equals sign ("=").

CodePudding user response:

Please check whether the following sample code fits your problem.

You do not need double conversion when it can be done in one go.

Note: please read the pack documentation; unpack uses the same TEMPLATE syntax.

use strict;
use warnings;
use feature 'say';

while( <DATA> ) {
    chomp;
    # "h*" unpacks each byte as two hex digits, low nibble first
    /^STRVID: (.+)/
        ? say 'STRVID: ' . unpack("h*",$1)
        : say;
}

__DATA__
It would be nice if you provide proper input data sample
 
STRVID: SarI3gXp

Perhaps the result of this script complies with your requirements.

To work with real input data file replace

while( <DATA> ) {

with 

while( <> ) {

and pass filename as an argument to the script.

Output

It would be nice if you provide proper input data sample

STRVID: 3516279433768507

Perhaps the result of this script complies with your requirements.

To work with real input data file replace

while( <DATA> ) {

with

while( <> ) {

and pass filename as an argument to the script.

./script.pl input_file.dat
