How to group , count, and unstack a pandas dataframe based on multiple columns values?-CodePudding

I have the following pandas data frame in which I stored my Win/Loss results of multiple models and multiple companies

company	Model_1	Winloss	Model_2	Winloss2
Company1	KNN	W	GPR	L
Company1	KNN	L	PLS	W
Company1	KNN	L	KRR	W
Company1	KNN	L	XGB	W
Company1	GPR	L	SGD	W
Company2	GPR	L	PLS	W
Company2	KRR	L	XGB	W

I want to group by both company and models and count Win-loss for each model within same company so that I can later unstack the result to and have the output to look like this:

('company', '')	('DT', 'L')	('DT', 'W')	('GPR', 'L')	('KNN', 'L')	('KNN', 'W')	('KRR', 'W')	('PLS', 'W')	('SGD', 'W')	('SVR', 'L')	('SVR', 'W')
Company1	3.0	2.0	5.0	3.0	1.0	1.0	1.0	1.0	2.0	1.0
Company2	6.0	2.0	0.0	2.0	1.0	0.0	0.0	0.0	6.0	1.0
Company3	0.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0	0.0
Company4	6.0	1.0	5.0	0.0	1.0	0.0	0.0	0.0	0.0	1.0
Company5	7.0	1.0	5.0	0.0	1.0	0.0	0.0	0.0	0.0	2.0

The table above is a result of my follwoing code but the numbers results of counted values were not accurate :

WLPerCompany = WinLoss.groupby(['company','Model_1','Winloss']) 
['Winloss'].count()
WinLossResults = pd.DataFrame(WLPerCompany)
WinLossResults.columns = [*WinLossResults.columns[:-1], 'counts']
WinLossResults= WinLossResults['counts'].unstack(level=['Model_1', 
'Winloss'])
WinLossResults= WinLossResults.fillna(0)
WinLossResults

CodePudding user response：

Use wide_to_long for reshape first and then crosstab:

df = pd.wide_to_long(df.reset_index().rename(columns={'Winloss':'Winloss1'}), 
                     stubnames=['Model_','Winloss'], 
                     i=['index','company'], 
                     j='tmp').reset_index()

df = pd.crosstab(df['company'], [df['Model_'], df['Winloss']])
print (df)
Model_   GPR KNN    KRR    PLS SGD XGB
Winloss    L   L  W   L  W   W   W   W
company                               
Company1   2   3  1   0  1   1   1   1
Company2   1   0  0   1  0   1   0   1