Sum the values of a list of dictionaries when condition holds

Time:02-27

I want to sum the 'Output' values that belong to the same 'TimePoint' in the list of dicts below, without using pandas. In this example I have only 2 time series, but I want a solution that also works for more than 2.

List = [
 {'TimePoint': 1, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 50},
 {'TimePoint': 61, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 70},
 {'TimePoint': 121, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 90},
 {'TimePoint': 181, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 80},
 {'TimePoint': 241, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 50},
 {'TimePoint': 301, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 50},
 {'TimePoint': 361, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 40},
    ...                 ...                    ...                  ...                ...
 {'TimePoint': 1321, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 20},
 {'TimePoint': 1381, 'AreaIn': 'Area A', 'AreaOut': 'Area B', 'Line': 'Line_A_B', 'Output': 10},
 {'TimePoint': 1, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 100},
 {'TimePoint': 61, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 90},
 {'TimePoint': 121, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 50},
 {'TimePoint': 181, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 30},
 {'TimePoint': 241, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 30},
 {'TimePoint': 301, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 60},
 {'TimePoint': 361, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 10},
    ...       ...      ...         ...            ...    ...
 {'TimePoint': 1321, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 70},
 {'TimePoint': 1381, 'AreaIn': 'Area C', 'AreaOut': 'Area D', 'Line': 'Line_C_D', 'Output': 20}]

The result I expect should look like this:

New_list = [
 {'TimePoint': 1, 'Output': 150},
 {'TimePoint': 61, 'Output': 160},
 {'TimePoint': 121, 'Output': 140},
 {'TimePoint': 181, 'Output': 110},
 {'TimePoint': 241, 'Output': 80},
 {'TimePoint': 301, 'Output': 110},
 {'TimePoint': 361, 'Output': 50},
    ...                 ...
 {'TimePoint': 1321, 'Output': 90},
 {'TimePoint': 1381, 'Output': 30}]

What I have tried below doesn't seem to work, as the 'Output' values are not summed up.

New_list = []
for key, group in itertools.groupby(List, lambda item: item['TimePoint']):
    new_dict = {}
    new_dict['TimePoint'] = key
    new_dict['Output'] = sum([item['Output'] for item in group])
    new_list.append(new_dict)

CodePudding user response:

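First, List is not a reserved name in Python, but it is easily confused with the built-in list and clashes with typing.List if that is imported, so do not use it as a variable name. Second, does the solution have to be a list of dictionaries? It is not the easiest structure to work with.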

For readability's sake I would use two loops:

timepoint_dict = {}
for item in List:
    timepoint = str(item["TimePoint"])
    if timepoint in timepoint_dict:
        timepoint_dict[timepoint] += item["Output"]
    else:
        timepoint_dict[timepoint] = item["Output"]

result_list = []
for key, value in timepoint_dict.items():
    result_list.append({"TimePoint": int(key), "Output": value})
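The accumulation loop can probably be written more compactly with collections.defaultdict, which removes the if/else branch and the str/int round trip. This is only a minimal sketch on a shortened sample of the question's data; the names records and totals are illustrative and not from the question:

```python
from collections import defaultdict

# Shortened sample of the question's data; `records` is a hypothetical name.
records = [
    {'TimePoint': 1, 'Line': 'Line_A_B', 'Output': 50},
    {'TimePoint': 61, 'Line': 'Line_A_B', 'Output': 70},
    {'TimePoint': 1, 'Line': 'Line_C_D', 'Output': 100},
    {'TimePoint': 61, 'Line': 'Line_C_D', 'Output': 90},
]

# Missing keys start at int() == 0, so no membership check is needed.
totals = defaultdict(int)
for item in records:
    totals[item['TimePoint']] += item['Output']

# Dictionaries keep insertion order (Python 3.7+), so the TimePoints
# come out in the order they were first seen.
result_list = [{'TimePoint': tp, 'Output': out} for tp, out in totals.items()]
print(result_list)
```

Because the keys stay integers, no conversion back with int() is needed when building the result.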

From a performance view I advise against doing it in one loop, because searching for the TimePoint field in, e.g., a list object takes O(n) time per lookup, which makes the whole pass O(n^2). A dictionary lookup takes constant time on average.
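As for the itertools.groupby attempt in the question: groupby only merges consecutive items with equal keys, so the list has to be sorted by 'TimePoint' first, otherwise each TimePoint shows up once per time series. A minimal sketch under that assumption, again on a shortened sample (records is an illustrative name); note that the sort makes this O(n log n):

```python
from itertools import groupby
from operator import itemgetter

# Shortened sample of the question's data; `records` is a hypothetical name.
records = [
    {'TimePoint': 1, 'Line': 'Line_A_B', 'Output': 50},
    {'TimePoint': 61, 'Line': 'Line_A_B', 'Output': 70},
    {'TimePoint': 1, 'Line': 'Line_C_D', 'Output': 100},
    {'TimePoint': 61, 'Line': 'Line_C_D', 'Output': 90},
]

by_timepoint = itemgetter('TimePoint')

# Sort by the grouping key so that equal TimePoints become adjacent,
# then sum the 'Output' values of each group.
new_list = [
    {'TimePoint': key, 'Output': sum(item['Output'] for item in group)}
    for key, group in groupby(sorted(records, key=by_timepoint), key=by_timepoint)
]
print(new_list)
```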
