Home > database >  Pandas pd.pivot_table where aggfunc returns a set or list
Pandas pd.pivot_table where aggfunc returns a set or list

Time:09-29

I am looking for an efficient method to pivot or groupyby a data frame where the output is the list of unique items in that subgroup. I am not interested in using loops and would prefer to retain the result as a data frame. Thank you in advance.

Example df:

number letter fruit
101 a apple
101 b melon
101 a peach
201 b grape
101 b orange
101 a pear
201 b apple
201 a peach

My Code:

df=pd.DataFrame({'number':['101','101','101','201','101','101','201','201'], 'letter': ['a','b','a','b','b','a','b','a'], 'fruit':['apple','melon','peach','grape','orange','pear','apple','peach'] })
    
pd.pivot_table(df, index='number', columns='letter', values='fruit', aggfunc='sum')

Output:

letter a b
number
101 appleapplepear melonorange
201 peach grapeapple

Desired Output:

letter a b
number
101 [apple ,pear] [melon ,orange]
201 [peach] [grape ,apple]

CodePudding user response:

Try:

>>> pivoted = pd.pivot_table(df, index='number', columns='letter', values='fruit', aggfunc=lambda x: list(set(x)))
letter                     a                b
number                                       
101     [apple, peach, pear]  [melon, orange]
201                  [peach]   [grape, apple]

CodePudding user response:

You can also use pd.unique for the aggfunc, as follows:

pd.pivot_table(df, index='number', columns='letter', values='fruit', aggfunc=pd.unique)

Note that the output for a single item is not within a list. Some people prefer this but see whether it fits your preference.

Result:

letter                     a                b
number                                       
101     [apple, peach, pear]  [melon, orange]
201                    peach   [grape, apple]

If you like to put single item also into a list, you can modify the call to pd.unique a bit, as follows:

pd.pivot_table(df, index='number', columns='letter', values='fruit', aggfunc=lambda x: pd.unique(x).tolist())

Result:

letter                     a                b
number                                       
101     [apple, peach, pear]  [melon, orange]
201                  [peach]   [grape, apple]

Thus, you have 2 choices of output layouts for single item now.

  • Related