Count number of occurrences in a list of dictionaries with Python

Time:08-17

I'm trying to count occurrences of dictionaries in a Python list, and I want to append the count of occurrences for each combination of country and year, based on some sort of filtering. I saw an example that implements this with Django models, but I don't know how to implement that approach in pure Python.

There is a similar question that uses a sum, but I only need a count: Count the values in list of dictionaries python

dataset =[
    {
        'Country': 'GB',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
    {
        'Country': 'AL',
        'Pollutant': 'O3',
        'Pollution': 10,
        'Year': 2015
    },
    {
        'Country': 'BE',
        'Pollutant': 'SO2',
        'Pollution': 5,
        'Year': 2016
    },
    {
        'Country': 'GB',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
    
    {
        'Country': 'BE',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
]

all_Years = sorted(set([v['Year'] for v in dataset]))
all_countries = sorted(set([v['Country'] for v in dataset]))
print(all_Years)
print(all_countries)

for country in all_countries:

    new_data = {
        'years': all_Years,
        'country': country,
        'data': []
    }

    for year in all_Years:

        # some sort of filter function to apply, like in Django models
        # (pseudocode: Q and .filter() do not exist for a plain list)
        f = Q(year=year, country=country)

        # apply that filtering
        country_count = dataset.filter(f).count()
        new_data['data'].append(country_count if country_count else 0)


#expected output
#new_data = 
[{'years': [2015,2016], 'country': 'AL', 'data': [1, 0]},
{'years': [2015,2016], 'country': 'BE', 'data': [0, 2]},
{'years': [2015,2016], 'country': 'GB', 'data': [0, 2]}]

CodePudding user response:

Create an intermediate dataset with the list of hits by country, then convert that to rows.

dataset =[
    {
        'Country': 'GB',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
    {
        'Country': 'AL',
        'Pollutant': 'O3',
        'Pollution': 10,
        'Year': 2015
    },
    {
        'Country': 'BE',
        'Pollutant': 'SO2',
        'Pollution': 5,
        'Year': 2016
    },
    {
        'Country': 'GB',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
    
    {
        'Country': 'BE',
        'Pollutant': 'SO2',
        'Pollution': 10,
        'Year': 2016
    },
]


all_years = set()
gather = {}  # maps each country to the list of years in which it appears
for row in dataset:
    all_years.add(row['Year'])
    if row['Country'] not in gather:
        gather[row['Country']] = []
    gather[row['Country']].append(row['Year'])

all_years = sorted(all_years)
out = []
for k, v in gather.items():
    # v.count(y) is the number of records for country k in year y
    out.append({
        'years': all_years,
        'country': k,
        'data': [v.count(y) for y in all_years]
    })
print(out)

Output:

[{'years': [2015, 2016], 'country': 'GB', 'data': [0, 2]}, {'years': [2015, 2016], 'country': 'AL', 'data': [1, 0]}, {'years': [2015, 2016], 'country': 'BE', 'data': [0, 2]}]

Or, the obligatory one-liner in place of the final loop:

out = [{
        'years': all_years,
        'country': k,
        'data': [v.count(y) for y in all_years]
    } for k,v in gather.items()]
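
If you want the rows ordered by country, as in the expected output of the question, one possible variation is to tally (country, year) pairs with collections.Counter and then loop over the sorted countries. This is only a sketch: the names pair_counts and result are made up for illustration, and the dataset is a shortened copy of the one above that keeps only the two keys being counted.

```python
from collections import Counter

# Shortened copy of the question's records; only the keys used here are kept.
dataset = [
    {'Country': 'GB', 'Year': 2016},
    {'Country': 'AL', 'Year': 2015},
    {'Country': 'BE', 'Year': 2016},
    {'Country': 'GB', 'Year': 2016},
    {'Country': 'BE', 'Year': 2016},
]

# Tally every (country, year) pair in a single pass over the data.
pair_counts = Counter((row['Country'], row['Year']) for row in dataset)

all_years = sorted({row['Year'] for row in dataset})
all_countries = sorted({row['Country'] for row in dataset})

# A Counter returns 0 for keys it has never seen, so no "if count else 0"
# fallback is needed for country/year combinations with no records.
result = [
    {
        'years': all_years,
        'country': country,
        'data': [pair_counts[(country, year)] for year in all_years],
    }
    for country in all_countries
]
print(result)
```

Because the counts are keyed by the pair, each lookup is a dictionary access instead of a list.count() scan, which may matter if the dataset is large.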