How to assign data from a file to a dictionary?

I am just starting my coding adventure. I have a file with the following structure:

% program   : RTKPOST 
% pos mode  : ppp-static
% solution  : forward
% elev mask : 10.0 deg
% dynamics  : off
% tidecorr  : off
% tropo opt : saastamoinen
% ephemeris : broadcast
% ====================================  END OF HEADER

and I would like the code to return a dictionary such as {"program": "RTKPOST", "pos_mode": "ppp-static"}

I was trying:

data = []
header = {}

with open("file.txt") as file:
    for line in file:
        if line.startswith("%"):
            key, val = line.split()
            header[key] = val
        else:
            data.append(line.split())

and got:

ValueError: too many values to unpack (expected 2)

CodePudding user response:

The syntax x, y = z assigns to both x and y using the values in z, but it requires z to contain exactly as many values as there are names on the left (in this case 2). For example, this works:

>>> x, y = [1, 2]
>>> x
1
>>> y
2

But this does not:

>>> x, y = [1, 2, 3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 2)

Since every line.split() call on your header lines returns more than 2 values, key, val = line.split() will always produce this error. On top of that, the last header line (END OF HEADER) does not follow the same format as the lines before it. In general, this approach is not robust.
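To see where the extra values come from, it helps to print what split() returns for a single header line. A small sketch, using one line copied from the question's file:

```python
# One header line copied from the question's file.
line = "% program   : RTKPOST \n"

# split() with no arguments splits on any run of whitespace,
# so the '%' and the ':' become tokens of their own.
tokens = line.split()
print(tokens)       # ['%', 'program', ':', 'RTKPOST']
print(len(tokens))  # 4

try:
    key, val = tokens  # four values cannot be unpacked into two names
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # message starts with "too many values to unpack"
```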

I suggest using regular expressions instead to pick out the key-value pairs you want. That way you state the format of the key-value pairs explicitly and can extract them from the whole file. If the format changes in the future, you only need to change the regular expression.

I suggest the regular expression:

header_property = r'^% (.+):(.+)$'

To interpret this regular expression, take a look at the re docs. In short, it matches lines that start with "% ", followed by two strings of one or more characters each, separated by a ":".
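A quick way to check the pattern is to run it against two lines from the question's file. This sketch assumes the quantifier in both groups is + (one or more characters), as the description above states:

```python
import re

# Assumed pattern: '+' after each '.', i.e. one or more characters per group.
header_property = r'^% (.+):(.+)$'

header_line = "% pos mode  : ppp-static\n"
separator_line = "% ====================================  END OF HEADER\n"

# A regular header line matches; strip() removes the padding around each part.
match = re.search(header_property, header_line)
key = match.group(1).strip()
value = match.group(2).strip()
print(key, value)

# The separator line contains no ':', so the pattern does not match it.
no_match = re.search(header_property, separator_line)
print(no_match)  # None
```

Note that the first group is greedy, so a line containing several colons would be divided at the last one.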

A full example using this is as follows:

import re

header_property = r'^% (.+):(.+)$'

header = {}

with open("file.txt") as file:
    for line in file:
        match = re.search(header_property, line)
        if match is not None:
            key = match.group(1).strip()
            value = match.group(2).strip()
            header[key] = value

After which header will be

>>> header
{'program': 'RTKPOST', 'pos mode': 'ppp-static', ..., 'ephemeris': 'broadcast'}

Pretty much anything can be extracted from text files using regular expressions. It is worthwhile learning a bit about how to use them.

CodePudding user response:

Split on ':' since that's what separates your keys from your values.
Make sure not to try to split/parse that END OF HEADER line.

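data = []
header = {}

with open("file.txt") as file:
    for line in file:
        if line.startswith("% =="):
            # skip the END OF HEADER separator; the lines after it are data
            continue
        if not line.startswith("%"):
            data.append(line.split())
            continue
        # split at the first ':' only, so a value containing ':' stays whole
        key, val = map(str.strip, line[1:].split(':', 1))
        header[key] = val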

print(header)

prints:

{'program': 'RTKPOST', 'pos mode': 'ppp-static', 'solution': 'forward', 'elev mask': '10.0 deg', 'dynamics': 'off', 'tidecorr': 'off', 'tropo opt': 'saastamoinen', 'ephemeris': 'broadcast'}
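The same idea can be wrapped in a function that accepts any iterable of lines, which makes it easy to try without a file on disk. This is only a sketch and not part of the answer above: parse_header is a hypothetical name, the sample lines are copied from the question, and str.partition is swapped in for split so that a "%" line without a ":" is skipped instead of raising an error.

```python
def parse_header(lines):
    """Collect '% key : value' lines into a dict, stopping at the '% ==' separator."""
    header = {}
    for line in lines:
        if line.startswith("% =="):
            break  # end-of-header marker: nothing more to parse
        if not line.startswith("%"):
            continue  # not a header line
        # partition splits at the first ':' only and always returns 3 parts
        key, sep, val = line[1:].partition(":")
        if sep:  # ignore '%' lines that have no ':' at all
            header[key.strip()] = val.strip()
    return header


# Sample lines copied from the question, plus one made-up data row.
sample = [
    "% program   : RTKPOST \n",
    "% pos mode  : ppp-static\n",
    "% elev mask : 10.0 deg\n",
    "% ====================================  END OF HEADER\n",
    "1 2 3\n",
]
header = parse_header(sample)
print(header)
```

With a real file, the call would be parse_header(file) inside the with open(...) block, since a file object is itself an iterable of lines.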