Conditionally reading the next rows in a CSV file with Python

I have a CSV file in which the data looks something like this:

users      | some ids
user1      | 1,2,3,4,5
empty cell | 6,7,8
empty cell | 123,1890,345
user2      | 555,444,333
empty cell | 11,22,33

The ids in rows where the users cell is empty belong to the last username above them in that column, so I would like to get a dictionary for each user that looks like this:

{'user1':[1,2,3,4,5,6,7,8,123,1890,345]}

I'm using Python with csv.DictReader:

reader = csv.DictReader(infile, delimiter=',', quotechar='"')
for row in list(reader):
    if row['users'].startswith("user"):
        for id in get_id_list(row["some ids"]):
            update_dict(dict, row['users'], id)

Right now I'm getting only {'user1':[1,2,3,4,5]}. Is there a good way to check whether the first cell in a row is empty and loop with that condition?

CodePudding user response：

This assumes that the first row WILL have a user. It remembers the user from the most recent row that has one and updates it only when a new user appears. Something like this should work:

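reader = csv.DictReader(infile, delimiter=',', quotechar='"')
current_user = None
for row in reader:
    # Update the remembered user only when the cell holds a username.
    if row['users'].startswith("user"):
        current_user = row['users']
    # Rows with an empty users cell fall through to the remembered user.
    for id in get_id_list(row["some ids"]):
        update_dict(dict, current_user, id)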
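
For a version that runs without the get_id_list and update_dict helpers from the question, here is a minimal sketch of the same "remember the last user" idea. The sample data, the collect_ids name, and the assumption that the ids sit in one quoted cell per row are all hypothetical; adjust them to the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data: the real file layout is an assumption.
SAMPLE = '''users,some ids
user1,"1,2,3,4,5"
,"6,7,8"
,"123,1890,345"
user2,"555,444,333"
,"11,22,33"
'''


def collect_ids(infile):
    """Group ids by user, carrying the last seen user over empty cells."""
    result = defaultdict(list)
    current_user = None
    reader = csv.DictReader(infile, delimiter=',', quotechar='"')
    for row in reader:
        user = (row['users'] or '').strip()
        if user:
            # A non-empty cell starts a new user.
            current_user = user
        if current_user is None:
            # Ids that appear before the first user have no owner; skip them.
            continue
        ids = row['some ids'] or ''
        result[current_user].extend(int(x) for x in ids.split(',') if x.strip())
    return dict(result)


ids_by_user = collect_ids(io.StringIO(SAMPLE))
print(ids_by_user)
```

With a real file, passing the object returned by open(path, newline='') in place of the io.StringIO wrapper should behave the same way.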

CodePudding user response：

I am not familiar with the csv library, but you can get the same result by reading the file line by line and appending to a defaultdict, switching the key whenever the user value changes.

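A defaultdict is like a dict, but when a key is missing it creates a default value for it (here an empty list) instead of raising a KeyError. You can read more about it in the Python documentation for the collections module.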
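
As a small illustration of the difference, the sketch below appends to a missing key in a plain dict and in a defaultdict(list). The variable names are made up for the example.

```python
from collections import defaultdict

# A plain dict raises KeyError when the key does not exist yet.
plain = {}
try:
    plain['user1'].append(1)
except KeyError:
    plain['user1'] = [1]

# defaultdict(list) creates an empty list for a missing key on first access.
grouped = defaultdict(list)
grouped['user1'].append(1)
grouped['user1'].extend([2, 3])

print(dict(grouped))
```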

from collections import defaultdict

d = defaultdict(list)
current_key = None
# f_path is the path to the input file
with open(f_path) as f:
    for line in f:
        # Skip the header row and blank lines.
        if line.startswith('users') or not line.strip():
            continue
        key, val = [_.strip() for _ in line.strip().split('|')]
        if line.startswith('user'):
            # A new user starts here; remember it for the rows that follow.
            current_key = key
        if current_key is None:
            # No user seen yet, so these ids have no owner.
            continue
        d[current_key].extend(val.split(','))

Output

defaultdict(list,
            {'user1': ['1',
              '2',
              '3',
              '4',
              '5',
              '6',
              '7',
              '8',
              '123',
              '1890',
              '345'],
             'user2': ['555', '444', '333', '11', '22', '33']})
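
The ids in this output are strings, while the question asks for integers. One possible way to convert them afterwards is sketched below; the d literal simply repeats the output above, and ids_by_user is a name chosen for the example.

```python
from collections import defaultdict

# Same content as the output above (values are still strings).
d = defaultdict(list, {
    'user1': ['1', '2', '3', '4', '5', '6', '7', '8', '123', '1890', '345'],
    'user2': ['555', '444', '333', '11', '22', '33'],
})

# Build a plain dict with every id converted to int.
ids_by_user = {user: [int(i) for i in ids] for user, ids in d.items()}
print(ids_by_user)
```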