I have a problem in my Django project. I have some cases like;
Case Type | Status | Sub Case | Stage |
---|---|---|---|
Volleyball | 2020 | finished | win |
Basketball | 2022 | pending | none |
Basketball | 2021 | finished | loss |
Volleyball | 2020 | finished | win |
I want to create a query for example are there how many cases is volleyball - 2020 - finished - win or are there how many cases is basketball - 2021 - finished - loss There are a lot of cases with different attributes, I have to count them with a dynamical query. Maybe pandas can help me with that but I don't know. Any idea can help me.
df = pd.DataFrame(list(Case.objects.all().values()))
test = df.groupby(['case_type_id', 'status_id', 'current_stage']).value_counts().reset_index()
CodePudding user response:
what is your expected output?
IIUC, is that what you're looking for?
RESULT
df.groupby(['Case Type','Status','Sub Case']).value_counts().reset_index()
OR
df.groupby(['Case Type','Status','Sub Case']).value_counts(df['Stage']).reset_index()
OR
df.assign(cnt=1).groupby(['Case Type','Status','Sub Case','Stage'])['cnt'].sum().reset_index()
Case Type Status Sub Case Stage 0
0 Basketball 2021 finished loss 1
1 Basketball 2022 pending none 1
2 Volleyball 2020 finished win 2
Data
data= {'Case Type': ['Volleyball ', 'Basketball ', 'Basketball ', 'Volleyball '],
'Status': [ 2020, 2022, 2021, 2020],
'Sub Case': [ 'finished ', 'pending ', 'finished ', 'finished '],
'Stage': [ 'win', 'none', 'loss', 'win']}
df=pd.DataFrame(data)
df
Case Type Status Sub Case Stage
0 Volleyball 2020 finished win
1 Basketball 2022 pending none
2 Basketball 2021 finished loss
3 Volleyball 2020 finished win