Group by from multidimensional data using Pandas

Time:10-14

I need to group a DataFrame by 'party_b' and, for each 'party_b', count how many times each 'usage_type' value (such as 'SMSMT' or 'MOC') occurs.

Dataset:


data = [{
    '_score': 1.220763,
    '_source': {'response_id': '8801756091550_1633620760',
     'usage_type': 'SMSMT',
     'party_b': '8801810107222',
     'party_a': '8801756091550',
     'additionalProperties': {},
     'event_time': '20211007093240'}},
   {'_score': 1.220763,
    '_source': {'response_id': '8801756091550_1633625609',
     'usage_type': 'MOC',
     'party_b': '8801736636044',
     'party_a': '8801756091550',
     'partya_original': None,
     'additionalProperties': {},
     'event_time': '20211007105329'}},
   {'_score': 1.220763,
    '_source': {'response_id': '8801756091550_1633625851',
     'usage_type': 'MOC',
     'party_b': '8801777701826',
     'party_a': '8801756091550',
     'partya_original': None,
     'additionalProperties': {},
     'event_time': '20211007105731'}},
   {'_score': 1.220763,
    '_source': {'response_id': '8801756091550_1633626326',
     'usage_type': 'SMSMO',
     'party_b': '8801736636044',
     'party_a': '8801756091550',
     'partya_original': None,
     'additionalProperties': {},
     'event_time': '20211007110526'}}]
Desired output:
'party_b' -> SMSMT (number of occurrences) -> MOC (number of occurrences) -> SMSMO (number of occurrences)

How should I achieve this?

CodePudding user response:

Use:

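import pandas as pd
df = pd.DataFrame(data=data)
# Expand each '_source' dict into its own columns, then count rows per usage_type
count = df['_source'].apply(pd.Series).groupby('usage_type').size()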

Output:

usage_type
MOC      2
SMSMO    1
SMSMT    1
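
Note that the snippet above counts each usage_type across the whole dataset rather than per 'party_b'. To get the per-'party_b' breakdown the question asks for, one possible approach is pd.crosstab. The following is a minimal sketch, not the original answerer's method: it assumes a trimmed copy of the question's records (only the two fields used here) and flattens '_source' with pd.json_normalize.

```python
import pandas as pd

# Trimmed copy of the question's records (assumption: only the
# two fields needed for the count are kept here).
data = [
    {'_source': {'usage_type': 'SMSMT', 'party_b': '8801810107222'}},
    {'_source': {'usage_type': 'MOC', 'party_b': '8801736636044'}},
    {'_source': {'usage_type': 'MOC', 'party_b': '8801777701826'}},
    {'_source': {'usage_type': 'SMSMO', 'party_b': '8801736636044'}},
]

# Flatten the nested '_source' dicts into ordinary columns.
df = pd.json_normalize([row['_source'] for row in data])

# One row per party_b, one column per usage_type;
# each cell holds the number of occurrences (0 if none).
counts = pd.crosstab(df['party_b'], df['usage_type'])
print(counts)
```

With this sample, '8801736636044' gets one MOC and one SMSMO, while the other two numbers each get a single count in their own usage_type column. An equivalent form, if you prefer to stay with groupby, is df.groupby(['party_b', 'usage_type']).size().unstack(fill_value=0).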