Convert structured content of text file to list with dictionaries

I'm reading a text file like this:

ATTACHMENT1=:1.xlsm
ATTACHMENT1RNG1=:Entity
ATTACHMENT1VRNG1=:TOT^^ENT1
ATTACHMENT1RNG2=:country
ATTACHMENT1VRNG2=:A
ATTACHMENT2=:2.xlsm
ATTACHMENT2RNG1=:Entity
ATTACHMENT2VRNG1=:TOT
ATTACHMENT2RNG2=:dept
ATTACHMENT2VRNG2=:F0008

and I want to load it into a list of dictionaries, like this:

[
{'File': ['1.xlsm'], 'Entity': ['TOT', 'ENT1'], 'country': ['A']},
{'File': ['2.xlsm'], 'Entity': ['TOT'], 'dept': ['F0008']}
]

'File' is a fixed key for the ATTACHMENT1 and ATTACHMENT2 lines. For the other lines, I would like the value of each RNGx line to become a dictionary key, and the value of the matching VRNGx line (split on '^^') to become the list stored under that key.

I know I can split lines on '=:', I can also split a string based on a separator, but I cannot figure out how to create this data structure myself. Any guidance would be very much appreciated.

Thanks in advance.

CodePudding user response:

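Assuming you can rely on the ordering (each ATTACHMENTn line comes before its ranges, and each RNGx line comes directly before its VRNGx line), this is pretty easy to do with a small state machine that just checks whether the key contains "RNG" or "VRNG":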

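with open("file.txt") as f:
    data = []
    key = ""
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines, which would break the unpacking below
        k, v = line.split("=:", 1)
        if "RNG" not in k:
            # ATTACHMENTn: start a new dictionary for this file
            data.append({'File': [v]})
        elif "VRNG" not in k:
            # ATTACHMENTnRNGx: remember the value to use as the next key
            key = v
        else:
            # ATTACHMENTnVRNGx: store the split values under the remembered key
            data[-1][key] = v.split("^^")

print(data)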

[{'File': ['1.xlsm'], 'Entity': ['TOT', 'ENT1'], 'country': ['A']}, {'File': ['2.xlsm'], 'Entity': ['TOT'], 'dept': ['F0008']}]
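
If the ordering cannot be relied on, one possible variation is to parse the attachment number and the range index out of each key and only assemble the dictionaries at the end. This is a minimal sketch, not part of the original answer: the function name `parse_attachments` and the regular expression are assumptions inferred from the sample lines (keys of the form ATTACHMENTn, ATTACHMENTnRNGx, ATTACHMENTnVRNGx), and lines that do not fit that pattern are silently skipped.

```python
import re
from collections import defaultdict

# Assumed key layout, inferred from the sample only:
# ATTACHMENT<n>, ATTACHMENT<n>RNG<x>, ATTACHMENT<n>VRNG<x>
KEY_RE = re.compile(r"^ATTACHMENT(\d+)(?:(V?)RNG(\d+))?$")


def parse_attachments(lines):
    """Hypothetical helper: build the list of dictionaries in any line order."""
    files = {}                   # attachment number -> file name
    names = defaultdict(dict)    # attachment number -> {range index: RNG value}
    values = defaultdict(dict)   # attachment number -> {range index: VRNG values}

    for line in lines:
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=:")
        match = KEY_RE.match(key)
        if match is None:
            continue  # assumption: ignore lines that do not fit the pattern
        att, is_value, idx = match.groups()
        att = int(att)
        if idx is None:
            files[att] = value                         # ATTACHMENTn
        elif is_value:
            values[att][int(idx)] = value.split("^^")  # ATTACHMENTnVRNGx
        else:
            names[att][int(idx)] = value               # ATTACHMENTnRNGx

    result = []
    for att in sorted(files):
        record = {"File": [files[att]]}
        for idx in sorted(names[att]):
            # A range name without a matching VRNG line gets an empty list
            record[names[att][idx]] = values[att].get(idx, [])
        result.append(record)
    return result
```

It can be called with an open file, for example `with open("file.txt") as f: data = parse_attachments(f)`. The trade-off is that all lines are held in intermediate dictionaries before the result is built, which is unlikely to matter for small configuration files like this one.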