How to split hexadecimal numbers in a text file on a specific condition?

I have a file named output.txt that contains the following data.

number="0x12" bytesize="4" info="0x00000012"
number="0xA5" bytesize="8" info="0x00000000bbae321f"
number="0x356" bytesize="4" info="0x00000234"
number="0x48A" bytesize="8" info="0x00000000a18c3ba3"
number="0x156" bytesize="2" info="0x001A"
number="0x867" bytesize="1" info="0x0b"

Using this file, I need to create another file named new.txt. If the bytesize is 8, the info value needs to be split before it is written to new.txt; if the bytesize is 1, 2, or 4, the data is written to new.txt directly.

For example:

number="0xA5" bytesize="8" info="0x00000000bbae321f"
number="0x48A" bytesize="8" info="0x00000000a18c3ba3"

Here the bytesize is 8, so I need the info to be split into "0x00000000" and "0xbbae321f" and stored under new numbers, like this:

number="0xA5" info="0xbbae321f"
number="0xD7" info="0x00000000"
number="0x48A" info="0xa18c3ba3"
number="0x4BC" info="0x00000000"

where 0xD7 = 0xA5 + 0x32 and 0x4BC = 0x48A + 0x32.
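
The offset arithmetic (each second entry's number is the original number plus 0x32) can be checked quickly in Python. This is only a minimal sketch; the names `OFFSET` and `second_number` are made up for illustration and do not come from the question.

```python
# Hypothetical names, used only for this check.
OFFSET = 0x32

def second_number(number: int) -> str:
    """Return the number assigned to the upper half, as an uppercase hex string."""
    return f"0x{number + OFFSET:X}"

print(second_number(0xA5))   # 0xD7
print(second_number(0x48A))  # 0x4BC
```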

So, finally, new.txt should look like this:

number="0x12" info="0x00000012"
number="0xA5" info="0xbbae321f"
number="0xD7" info="0x00000000"
number="0x356" info="0x00000234"
number="0x48A" info="0xa18c3ba3"
number="0x4BC" info="0x00000000"
number="0x156" info="0x001A"
number="0x867" info="0x0b"

CodePudding user response：

Here is one solution.

The most important step is extracting the bytesize item (and the other items) from each line. I do this with a regular expression.

import re

infile_path = "./output.txt"  # adjust to your case
outfile_path = "./new.txt"

with open(infile_path, "r") as infile, open(outfile_path, "w") as outfile:
    for s in infile:
        r = re.match('number="(.*)" bytesize="(.*)" info="(.*)"', s)
        if r:
            # int(x, 0) infers the base from the prefix, so "0x12" and "4" both parse
            num, bs, info = map(lambda x: int(x, 0), r.groups())
            # number of hex digits in the original info string (without "0x")
            l = len(r.group(3)) - 2
            if bs == 8:
                l = 8  # each half is 4 bytes, i.e. 8 hex digits
                nums = (num, num + 0x32)
                # low 32 bits first, then high 32 bits
                infos = (info % (2**32), info // (2**32))
            else:
                nums = (num, )
                infos = (info, )
            for num, info in zip(nums, infos):
                outfile.write(f'number="{num:#x}" info="{info:#0{l + 2}x}"\n')
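
Two details of the code above may be worth seeing in isolation: `int(x, 0)` picks the base from the string's prefix, and the `#0{width}x` format spec pads with zeros to a total width that includes the "0x" prefix. The sketch below illustrates both; the helper name `to_hex` is an assumption made for this example, not part of the answer.

```python
# int() with base 0 infers the base from the literal's prefix.
assert int("0x12", 0) == 18      # hex, because of the "0x" prefix
assert int("8", 0) == 8          # plain decimal

def to_hex(value: int, digits: int) -> str:
    """Format value as 0x-prefixed hex, zero-padded to `digits` hex digits."""
    # '#' adds the "0x" prefix; the width counts that prefix, hence digits + 2.
    return f"{value:#0{digits + 2}x}"

print(to_hex(0x234, 8))  # 0x00000234
print(to_hex(0xb, 2))    # 0x0b
```

Note that the `x` presentation type always produces lowercase digits, so a value written as "0x001A" in the input comes back as "0x001a".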

CodePudding user response：

There are a number of ways that you could do this. Separating the work into parsing the file, processing the data, and writing the result seems to offer the best flexibility.

The number, bytesize, and info values all appear to be integers, so storing them as integers makes checking the bytesize an easy operation. It also makes it easy to add 0x32 to the number when splitting a bytesize=8 entry into two bytesize=4 entries.

To find the two 4-byte values from the info integer, bitwise operations can be used.

To find the highest four bytes, shift the bits to the right by 32 places; the result is the value of the four highest bytes.

To find the lowest four bytes, mask off the four highest bytes; what remains is the value of the lowest four bytes. This can be done with a bitwise & operation.
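
As a minimal sketch of the shift and mask just described (the function name `split_64` is hypothetical and used only for this illustration):

```python
def split_64(info: int) -> tuple:
    """Split a 64-bit value into (low, high) 32-bit halves."""
    low = info & 0xFFFFFFFF   # keep only the lowest 32 bits
    high = info >> 32         # move the highest 32 bits down
    return low, high

low, high = split_64(0x00000000bbae321f)
print(f"{low:#010x} {high:#010x}")  # 0xbbae321f 0x00000000
```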

The example below has been tested in Python 3.10 but should work with most recent versions of Python 3.

It uses a dataclass to store the internal representation of the data.

from dataclasses import dataclass
from pathlib import Path
import re
from typing import List, Tuple


@dataclass
class Data:
    number: int
    bytesize: int
    info: int


def parse_file(filename: Path) -> List[Data]:
    data = []
    pattern = re.compile(
        r'number="0x([0-9a-fA-F]+)"\s+bytesize="(\d+)"\s+info="0x([0-9a-fA-F]+)"')
    for line in filename.read_text().splitlines():
        match = pattern.search(line)
        if match:
            data.append(Data(number=int(match.group(1), 16),
                             bytesize=int(match.group(2)),
                             info=int(match.group(3), 16))
                        )
    return data


def two_size_four(entry: Data) -> Tuple[Data, Data]:
    data1 = Data(entry.number, 4, entry.info & 0xffffffff)
    data2 = Data(entry.number + 0x32, 4, entry.info >> 4 * 8)
    return data1, data2


def split_bytesize_8(data: List[Data]) -> List[Data]:
    new_data = []
    for entry in data:
        if entry.bytesize != 8:
            new_data.append(entry)
        else:
            split_data = two_size_four(entry)
            new_data.extend(split_data)
    return new_data


def writefile(filename: Path, data: List[Data]) -> None:
    lines = []
    for entry in data:
        lines.append(
            f'number="{entry.number:#x}" '
            f'info="{entry.info:#0{entry.bytesize * 2 + 2}x}"')
    output_txt = "\n".join(lines)
    print(output_txt)
    filename.write_text(output_txt)


def main(filename_in: Path, filename_out: Path) -> None:
    data = parse_file(filename_in)
    data = split_bytesize_8(data)
    writefile(filename_out, data)


if __name__ == '__main__':
    input_file = Path(__file__).parent.joinpath("data", "output.txt")
    output_file = Path(__file__).parent.joinpath("data", "new.txt")
    main(input_file, output_file)

If you are running a version of Python before 3.7 then the Data class can be created without using dataclasses as follows:

class Data:
    def __init__(self, number: int, bytesize: int, info: int):
        self.number = number
        self.bytesize = bytesize
        self.info = info

All other code should stay the same.