I have a list of dicts with ID numbers. I need to group it by main_id and second_id and count the values in each group. What is the best way to do this in Python?
I tried with pandas, but I don't get a dict with groups and counts:
df = pd.DataFrame(data_list)
df2 = df.groupby('main_id').apply(lambda x: x.set_index('main_id')['second_id']).to_dict()
print(df2)
The list looks like:
[
{
"main_id":34,
"second_id":"2149"
},
{
"main_id":82,
"second_id":"174"
},
{
"main_id":24,
"second_id":"4QCp"
},
{
"main_id":34,
"second_id":"2149"
},
{
"main_id":29,
"second_id":"126905"
},
{
"main_id":34,
"second_id":"2764"
},
{
"main_id":43,
"second_id":"16110"
}
]
I need a result like:
[
{
"main_id":43,
"second_id":"16110",
"count": 1
},
{
"main_id":34,
"second_id":"2149",
"count": 2
}
]
CodePudding user response:
You could use collections (from the standard library) instead of pandas. I assigned the list of dicts to xs:
import collections
# create a list of tuples; each is (main_id, second_id)
ids = [ (x['main_id'], x['second_id']) for x in xs ]
# count occurrences of each tuple
result = collections.Counter(ids)
Finally, result is a Counter (a dict subclass) keyed by (main_id, second_id) tuples, so it can be readily converted to the final form.
Counter({(34, '2149'): 2,
(82, '174'): 1,
(24, '4QCp'): 1,
(29, '126905'): 1,
(34, '2764'): 1,
(43, '16110'): 1})
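For completeness, here is a minimal sketch of that conversion step, assuming the desired output is a list of dicts with main_id, second_id and count keys as in the question. The name final is only illustrative, and the sample data is copied from the question:

```python
import collections

# Sample input copied from the question; xs is the name used in this answer.
xs = [
    {"main_id": 34, "second_id": "2149"},
    {"main_id": 82, "second_id": "174"},
    {"main_id": 24, "second_id": "4QCp"},
    {"main_id": 34, "second_id": "2149"},
    {"main_id": 29, "second_id": "126905"},
    {"main_id": 34, "second_id": "2764"},
    {"main_id": 43, "second_id": "16110"},
]

# Count occurrences of each (main_id, second_id) pair.
result = collections.Counter((x["main_id"], x["second_id"]) for x in xs)

# Unpack each tuple key and attach its count ("final" is an illustrative name).
final = [
    {"main_id": main_id, "second_id": second_id, "count": count}
    for (main_id, second_id), count in result.items()
]
print(final)
```

The entries come out in first-seen order. If they should be sorted by count instead, iterating over result.most_common() rather than result.items() is one option.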
CodePudding user response:
You can use pandas' DataFrame.groupby(...).size() to count the rows in each group, then convert the result back to a list of dictionaries:
out = (
    pd.DataFrame(data_list)
    .groupby(['main_id', 'second_id'])
    .size()
    .reset_index(name='count')
    .to_dict('records')
)
Output:
[{'main_id': 24, 'second_id': '4QCp', 'count': 1},
{'main_id': 29, 'second_id': '126905', 'count': 1},
{'main_id': 34, 'second_id': '2149', 'count': 2},
{'main_id': 34, 'second_id': '2764', 'count': 1},
{'main_id': 43, 'second_id': '16110', 'count': 1},
{'main_id': 82, 'second_id': '174', 'count': 1}]