Set name to groupby size column in Pandas

I have a DataFrame in which I need to count how many rows fall under each value of a certain column. In the example below, I want the resulting count column to be named "NUM_CIK". What's the best way to assign a name to the groupby size column?

Current code:

    cik_groupby_cusip_occur = cik_groupby_cusip_occur.groupby(
        ['CUSIP'], sort=True)['CIK COMPANY'].size().sort_values(ascending=False)

Sample Output:

CUSIP
594918104    4560
037833100    4457
023135106    4053
02079K305    3545
478160104    3472

Wanted Output:

CUSIP       NUM_CIK
594918104    4560
037833100    4457
023135106    4053
02079K305    3545
478160104    3472

CodePudding user response:

Use Series.reset_index with the name parameter:

cik_groupby_cusip_occur = (cik_groupby_cusip_occur
         .groupby('CUSIP')['CIK COMPANY']
         .size()
         .sort_values(ascending=False)
         .reset_index(name='NUM_CIK'))

Or Series.value_counts:

cik_groupby_cusip_occur = (cik_groupby_cusip_occur['CUSIP']
            .value_counts()
            .rename_axis('CUSIP')
            .reset_index(name='NUM_CIK'))
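
As a quick check, here is a minimal sketch that runs both approaches on a small made-up frame and compares the results. The CUSIP and CIK COMPANY values are invented for illustration (only the column names come from the question), and the names df, by_size and by_counts are hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration; only the column names come from the question.
df = pd.DataFrame({
    "CUSIP": ["594918104", "037833100", "594918104",
              "594918104", "037833100", "023135106"],
    "CIK COMPANY": ["A", "B", "C", "D", "E", "F"],
})

# Approach 1: group sizes, sorted descending, then move the index into a
# column and name the count column in one step.
by_size = (df.groupby("CUSIP")["CIK COMPANY"]
             .size()
             .sort_values(ascending=False)
             .reset_index(name="NUM_CIK"))

# Approach 2: value_counts sorts by count in descending order by default,
# so no explicit sort_values is needed.
by_counts = (df["CUSIP"]
               .value_counts()
               .rename_axis("CUSIP")
               .reset_index(name="NUM_CIK"))

print(by_size)
print(by_counts)
```

Both results are DataFrames with CUSIP and NUM_CIK as regular columns. With tied counts the relative order of the tied rows may differ between the two approaches, so the toy data above uses distinct counts.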

CodePudding user response:

Either use reset_index(name='NUM_CIK') on the result of size(),

or use named aggregation with agg:

cik_groupby_cusip_occur = (cik_groupby_cusip_occur
 .groupby(['CUSIP'], sort=True)['CIK COMPANY']
 .agg(NUM_CIK='size')
 .sort_values(by='NUM_CIK', ascending=False)
)
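
One thing to be aware of: the named-aggregation version keeps CUSIP as the index rather than as a regular column. A minimal sketch on invented toy data (only the column names come from the question; df, counts and as_columns are hypothetical names) shows this, along with a trailing reset_index() for the case where CUSIP should be a column:

```python
import pandas as pd

# Toy data, invented for illustration; only the column names come from the question.
df = pd.DataFrame({
    "CUSIP": ["594918104", "037833100", "594918104",
              "594918104", "037833100", "023135106"],
    "CIK COMPANY": ["A", "B", "C", "D", "E", "F"],
})

# Named aggregation: the keyword becomes the name of the output column.
counts = (df.groupby("CUSIP")["CIK COMPANY"]
            .agg(NUM_CIK="size")
            .sort_values(by="NUM_CIK", ascending=False))

# At this point CUSIP is the index and NUM_CIK is the only column.
print(counts)

# Optional: turn the CUSIP index into a regular column.
as_columns = counts.reset_index()
print(as_columns)
```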