How to perform a group by on a list in Python?

Time:03-23

I would like to perform a group by on a list and calculate the average.

Here is the list:

[['Profit ratio', [[2016, 5], [2017, 10], [2018, 5], [2016, 5], [2017, 20], [2018, 10]]]]

After grouping and averaging I would like the following:

[['Profit ratio', [[2016, 5], [2017, 15], [2018, 7.5]]]]

I have tried doing this with a loop that gathers the years, appends each year's numbers to a list, and then calculates the average. Is there a better approach?

CodePudding user response:

Yeah, this seems fairly straightforward. Assuming your data is:

data_with_headers = [['Profit ratio',
     [[2016, 5],
      [2017, 10],
      [2018, 5],
      [2016, 5],
      [2017, 20],
      [2018, 10]]]]

And assuming there are more entries here than just "Profit ratio", you could do something like:

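from collections import defaultdict

result = []
for header, values in data_with_headers:
    # Collect every value recorded for each year.
    values_by_year = defaultdict(list)
    for year, value in values:
        values_by_year[year].append(value)
    # Average each year's values; dicts preserve insertion order,
    # so the years come out in the order they were first seen.
    averages = [[year, sum(vals) / len(vals)] for year, vals in values_by_year.items()]
    result.append([header, averages])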

assert result == [['Profit ratio', [[2016, 5.0], [2017, 15.0], [2018, 7.5]]]]
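
If you would rather not manage the dictionary yourself, the same grouping can be sketched with itertools.groupby and statistics.mean from the standard library. This is only one possible alternative, not necessarily better than the loop above. The helper name average_by_year is made up for illustration, and the data is the same sample as in the question:

```python
from itertools import groupby
from operator import itemgetter
from statistics import mean

data_with_headers = [['Profit ratio',
     [[2016, 5], [2017, 10], [2018, 5],
      [2016, 5], [2017, 20], [2018, 10]]]]

def average_by_year(pairs):
    # Hypothetical helper: groupby only merges *adjacent* items with the
    # same key, so the [year, value] pairs must be sorted by year first.
    ordered = sorted(pairs, key=itemgetter(0))
    return [[year, mean(value for _, value in group)]
            for year, group in groupby(ordered, key=itemgetter(0))]

grouped = [[header, average_by_year(values)]
           for header, values in data_with_headers]
```

Note that this version returns the years in sorted order rather than first-seen order, and the sort makes it O(n log n) instead of the single pass the defaultdict loop needs, so for large inputs the loop is probably still the better choice.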