How to extract a specific number of bits from a hexadecimal number for a given text file

This is the input file, input.txt:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0         PS_00[31]      PS_00[00]      0x00000000          0x156A17[00]     0x156A17[31]      0x0003F4a1 
PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x0a56F001
PS_N_YD_C       PS_03[06]      PS_03[06]      0x00000000          0x1678A1[06]     0x1678A1[06]      0x0a56F001
PS_1_FG         PS_03[31]      PS_03[07]      0x000000FF          0x1678A1[07]     0x1678A1[31]      0x0a56F001
PS_F_23_ASD     PS_04[07]      PS_03[00]      0x00000000          0x18C550[00]     0x18C550[07]      0x00000000
PS_A_0_STR      PS_04[15]      PS_04[08]      0x00000FFF          0x18C550[08]     0x18C550[15]      0x00000000
PS_AD_0         PS_04[31]      PS_04[16]      0x00000000          0x18C550[16]     0x18C550[31]      0x00000000

I need to extract the bits in this way:

If the value of new is 0x0a56F001, first convert it to binary: 0000 1010 0101 0110 1111 0000 0000 0001.

Then check the above bit and below bit columns.

For example, with PS_03[05] and PS_03[00], take bits 0 to 5 of the binary value of new, which is 000001, i.e. 0x1. Then write this as a 32-bit value, 0x00000001, and replace the new column of that row with it:

PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x00000001

The same applies to every row, so the final output file should look like this:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0         PS_00[31]      PS_00[00]      0x00000000          0x156A17[00]     0x156A17[31]      0x0003F4a1 
PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x00000001
PS_N_YD_C       PS_03[06]      PS_03[06]      0x00000000          0x1678A1[06]     0x1678A1[06]      0x00000000
PS_1_FG         PS_03[31]      PS_03[07]      0x000000FF          0x1678A1[07]     0x1678A1[31]      0x0014ADE0
PS_F_23_ASD     PS_04[07]      PS_03[00]      0x00000000          0x18C550[00]     0x18C550[07]      0x00000000
PS_A_0_STR      PS_04[15]      PS_04[08]      0x00000FFF          0x18C550[08]     0x18C550[15]      0x00000000
PS_AD_0         PS_04[31]      PS_04[16]      0x00000000          0x18C550[16]     0x18C550[31]      0x00000000

Is this possible in Python? This is my current attempt:

with open("input.txt") as fin:
    with open("output.txt", "w") as fout:
         for line in fin:
             if line.strip():
                 line = line.strip("\n' '")
                 cols = line.split(" ")
                 cols[6] = int(cols[6],16)

I tried selecting a specific column, but it is not working.

CodePudding user response：

You can use split to split the lines, then a regex to extract the above and below values.

To compute the new value, you can right-shift the integer value by (31 - above_bit), and then keep only the (above - below + 1) least significant bits with a bitwise AND against 2**n - 1.

Possible code:

import re

# compile the regex
bit_re = re.compile(r'.*\[(\d{2})\]')

with open("input.txt") as fin, open("output.txt", "w") as fout:
    line = next(fin)          # skip header line
    fout.write(line)
    for line in fin:
        row = line.split()    # extract fields
        # print(row)          # uncomment for traces
        # extract above and below values
        above = int(bit_re.match(row[1]).group(1)) + 1  # add 1 to meet Python end rules
        below = int(bit_re.match(row[2]).group(1))
        val = int(row[6],16) >> (32 - above)
        val = val & (2**(above - below) - 1)
        row[6] = format(val, '#010x')    # format the result as a 32 bits hex number
        print(*row, file=fout)

With the sample data, it gives:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0 PS_00[31] PS_00[00] 0x00000000 0x156A17[00] 0x156A17[31] 0x0003f4a1
PS_RST_D2 PS_03[05] PS_03[00] 0x00000003 0x1678A1[00] 0x1678A1[05] 0x00000002
PS_N_YD_C PS_03[06] PS_03[06] 0x00000000 0x1678A1[06] 0x1678A1[06] 0x00000001
PS_1_FG PS_03[31] PS_03[07] 0x000000FF 0x1678A1[07] 0x1678A1[31] 0x0056f001
PS_F_23_ASD PS_04[07] PS_03[00] 0x00000000 0x18C550[00] 0x18C550[07] 0x00000000
PS_A_0_STR PS_04[15] PS_04[08] 0x00000FFF 0x18C550[08] 0x18C550[15] 0x00000000
PS_AD_0 PS_04[31] PS_04[16] 0x00000000 0x18C550[16] 0x18C550[31] 0x00000000

You could get better formatting by replacing only the end of each line with the new value, instead of re-joining the fields.
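Note that the output above does not match the expected output in the question (for PS_RST_D2 it gives 0x00000002 rather than 0x00000001), because the shift counts bits from the most significant end. Here is a minimal sketch that assumes the question's convention instead, where bit 0 is the least significant bit. It also keeps the original column spacing by replacing only the last field. The names BIT_RE, extract_field, convert_line and convert_file are my own, chosen for illustration.

```python
import re

# Matches the bit index inside fields such as "PS_03[05]".
BIT_RE = re.compile(r"\[(\d+)\]")


def extract_field(value, above, below):
    """Return bits below..above (inclusive), with bit 0 as the least significant bit."""
    width = above - below + 1
    return (value >> below) & ((1 << width) - 1)


def convert_line(line):
    """Rewrite the last column of one data row, keeping the original spacing."""
    row = line.split()
    above = int(BIT_RE.search(row[1]).group(1))
    below = int(BIT_RE.search(row[2]).group(1))
    new_hex = f"0x{extract_field(int(row[6], 16), above, below):08X}"
    # Replace only the last hex field so the column alignment is preserved.
    start = line.rfind(row[6])
    return line[:start] + new_hex + line[start + len(row[6]):]


def convert_file(src="input.txt", dst="output.txt"):
    with open(src) as fin, open(dst, "w") as fout:
        fout.write(next(fin))  # copy the header line unchanged
        for line in fin:
            # Blank lines are copied as they are.
            fout.write(convert_line(line) if line.strip() else line)
```

Calling convert_file() with the default file names reads input.txt and writes output.txt. The hex digits are written in upper case, so a value such as 0x0003F4a1 comes out as 0x0003F4A1.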

CodePudding user response：

The first problem is that the columns are separated by many spaces. When splitting on a single space, you get a lot of empty columns. Collapse each run of spaces into a single one first:

import re
line = re.sub(' +', ' ', line)

Then, 0x0a56F001 is a hexadecimal number. To read it from the text file, use int(cols[6], 16), not int(cols[6], 2), which attempts to read it as binary.

You can then get a 32-digit binary string like this:

number = int(cols[6],16)
binary_string = f"{number:032b}"

Now do the slicing, then convert it back with

sliced_number = int( ..., 2)
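To fill in the slicing step, here is a small sketch. It assumes the question's convention that bit 0 is the least significant bit, which sits at the right-hand end of the string. The helper name slice_bits is my own.

```python
def slice_bits(number, above, below):
    """Return bits below..above (inclusive) of a 32-bit number via string slicing."""
    binary_string = f"{number:032b}"  # index 0 holds bit 31, index 31 holds bit 0
    start = 31 - above                # string position of the highest wanted bit
    stop = 32 - below                 # one past the string position of the lowest wanted bit
    return int(binary_string[start:stop], 2)


value = int("0x0a56F001", 16)
print(f"0x{slice_bits(value, 5, 0):08X}")   # bits 5..0  -> 0x00000001
print(f"0x{slice_bits(value, 31, 7):08X}")  # bits 31..7 -> 0x0014ADE0
```

Both results agree with the expected output given in the question.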

CodePudding user response：

For reading input data like this, I like to use pandas (see the update at the end of this answer).

To get the number of the above and the below bit, you can use indexing of the string like:

sAboveBit ="PS_03[05]"
iAboveBit = int(sAboveBit[-3:-1])

Or much safer:

iAboveBit = int(sAboveBit.split("[")[-1].split("]")[0])

To create the new value, you can use a bitwise AND with an integer mask, which you can calculate from your aboveBit and belowBit.

The first way I can think of is a for loop:

iSumUp = 0
for i in range(iBelowBit, iAboveBit + 1):
    iSumUp += 2**i
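The same mask can probably be built without a loop, using shifts. This is only a sketch to show that the two approaches agree. The function names mask_loop and mask_shift are made up for the comparison.

```python
def mask_loop(below, above):
    """Sum 2**i for every bit position from below to above (inclusive)."""
    total = 0
    for i in range(below, above + 1):
        total += 2**i
    return total


def mask_shift(below, above):
    """Build the same mask: 'width' ones, moved up to start at bit 'below'."""
    width = above - below + 1
    return ((1 << width) - 1) << below


print(hex(mask_loop(7, 31)))   # 0xffffff80
print(hex(mask_shift(7, 31)))  # 0xffffff80
```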

To turn your hex string into an integer, you can use the bitstring package.

import bitstring as bs
sOldNew = "0x0a56F001"
iOldNew = bs.BitArray(sOldNew).uint

Now you can use a bitwise AND

iNewNew = iOldNew & iSumUp

And finally create your new hex-string with a formatted string.

sNewNew = f"0x{iNewNew:08x}"
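One thing to watch: the bitwise AND leaves the selected bits at their original positions. To get the values shown in the question's expected output, the masked result also seems to need a right shift by the below bit. A short sketch of that, using the built-in int(..., 16) instead of bitstring so that it runs without extra packages (the snake_case variable names are mine):

```python
s_old_new = "0x0a56F001"
i_above_bit, i_below_bit = 31, 7  # example row PS_1_FG from the question

i_old_new = int(s_old_new, 16)
i_mask = ((1 << (i_above_bit - i_below_bit + 1)) - 1) << i_below_bit

i_masked = i_old_new & i_mask        # field still sits at bits 7..31
i_new_new = i_masked >> i_below_bit  # move the field down so it starts at bit 0

print(f"0x{i_masked:08x}")   # 0x0a56f000
print(f"0x{i_new_new:08x}")  # 0x0014ade0
```

The second value is the one the question expects for the PS_1_FG row.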

Finally, save your data to your (new) file, for which I also prefer using pandas.

Update:

For reading your data with pandas:

import pandas as pd
df =pd.read_csv(r'input.txt',delimiter="\t")
print(df)