Groupby in a Pandas DataFrame with MultiIndex Columns - CodePudding

PS: I want to group by the column called name.

I have tried creating a DataFrame with MultiIndex columns:

import pandas as pd

df = [ [ 'las_00', '6', '3', '3', 'a', '1.03', '1.11', '1.11' ],
       [ 'las_01', '6', '3', '3', 'b', '1.03', '1.11', '1.11' ],
       [ 'las_02', '6', '3', '3', 'c', '1.03', '1.11', '1.11' ],
       [ 'las_03', '6', '3', '3', 'a', '1.03', '1.11', '1.11' ],
       [ 'las_03', '6', '3', '3', 'b', '1.03', '1.11', '1.11' ]
    ]


new_df = pd.DataFrame( df , columns = [ 'name, name', 'transactionCount, totalCount', 'transactionCount, passCount', 'transactionCount, failCount', 'status, failPerc', 'status, mean',
                    'status, perc90', 'status, max' ] )


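# Split each 'top, sub' label into a (top, sub) tuple.
a = new_df.columns.str.split( ', ', expand=True ).values

# Rebuild the columns as a MultiIndex; a label with no ', ' gets a blank top level.
new_df.columns = pd.MultiIndex.from_tuples( [ ( ' ', x[ 0 ] ) if pd.isnull( x[ 1 ] ) else x for x in a])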

The resulting DataFrame is:

     name               transactionCount                       status
     name       totalCount passCount failCount failPerc  mean perc90   max
0  las_00                6         3         3        a  1.03   1.11  1.11
1  las_01                6         3         3        b  1.03   1.11  1.11
2  las_02                6         3         3        c  1.03   1.11  1.11
3  las_03                6         3         3        a  1.03   1.11  1.11
4  las_03                6         3         3        b  1.03   1.11  1.11

Now I want to use groupby on the name column. I tried using level, but I can't work out how to refer to the column by its name. Could anyone help with this? Thanks!

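CodePudding user response:

With MultiIndex columns, each column label is a tuple with one element per level, so pass the full tuple as the key: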

new_df.groupby(('name','name'))
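A minimal sketch of what that call gives you once an aggregation is applied. The frame below is a trimmed-down stand-in for the one in the question (the name demo_df and the integer counts are assumptions; the original values are strings and would need converting before a numeric sum):

```python
import pandas as pd

# Hypothetical, trimmed copy of the question's data with integer counts.
rows = [
    ["las_00", 6, 3, 3],
    ["las_01", 6, 3, 3],
    ["las_02", 6, 3, 3],
    ["las_03", 6, 3, 3],
    ["las_03", 6, 3, 3],
]
columns = pd.MultiIndex.from_tuples([
    ("name", "name"),
    ("transactionCount", "totalCount"),
    ("transactionCount", "passCount"),
    ("transactionCount", "failCount"),
])
demo_df = pd.DataFrame(rows, columns=columns)

# The tuple names exactly one column, so it acts as a single grouping key.
grouped = demo_df.groupby(("name", "name"))

# Sum the remaining columns within each name.
totals = grouped.sum()

# Count the rows that fall into each name.
sizes = grouped.size()

print(totals)
print(sizes)
```

The result is indexed by the values of the name column, and the remaining columns keep their two-level labels, so a single cell can be read with totals.loc["las_03", ("transactionCount", "totalCount")].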

CodePudding user response:

You can also pick the column label by position instead of typing it out; new_df.columns[0] is the tuple ('name', 'name'):

new_df.groupby(new_df.columns[0])
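As a quick check that the positional form and the explicit tuple are the same key, here is a small sketch. The frame demo_df and its three rows are made up for illustration and are not the question's full data:

```python
import pandas as pd

# Hypothetical two-column frame with the same kind of two-level column labels.
columns = pd.MultiIndex.from_tuples([
    ("name", "name"),
    ("status", "failPerc"),
])
demo_df = pd.DataFrame(
    [["las_00", "a"], ["las_03", "a"], ["las_03", "b"]],
    columns=columns,
)

# Indexing the columns by position returns the label itself, here a tuple.
first_label = demo_df.columns[0]
print(first_label)

# Grouping by that label is the same as grouping by the tuple written out.
by_position = demo_df.groupby(first_label).size()
by_label = demo_df.groupby(("name", "name")).size()

print(by_position.equals(by_label))
```

The positional form is convenient when the label is long, but it depends on the column order staying the same; the explicit tuple does not.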