Use pandas groupby to find unique combinations of columns and save as df-CodePudding

I have the following pandas Dataframe:

df = pd.DataFrame({'Colors': ['blue', 'blue', 'orange', 'red',
                              'purple', 'orange', 'purple', 'blue', 'brown'], 
                   'Price': ['500', '500', '200', '250', '300', '765', '1100', '762', 
                              '650', '625'],
                   'Style': ['farm', 'contemporary', 'modern', 'MDM', 'MDM', 
                             'contemporary', 'farm', 'contemporary', 'farm'],
                   'Location': ['far', 'near', 'far', 'far', 'near', 'far', 'far', 'near', 
                                'far']})

I can do df.groupby(['Colors', 'Price', 'Style', 'Location']).size() to get the breakdown by color to see unique combinations of Price, Style and Location wrt to Color returned as a Series.

Question - how can I use this to create a new pandas Dataframe where each column corresponds to a color (Blue, Orange, Red, etc.) and the unique combinations of values (500_contemporary_near, 500_farm_far, etc.)?

An example of the output I'm trying to generate is a dataFrame where there is a column called "Blue" and each value under (each row) is a strings such as 500_contemporary_near.

Blue

500_contemporary_near
500_farm_far

CodePudding user response：

You can do unstack

df.groupby(['Colors', 'Price', 'Style', 'Location']).size().unstack(level=0)

CodePudding user response：

How about this:

pd.DataFrame([{k[0]: '_'.join(k[1:])} for k in \
   df.groupby(['Colors','Price', 'Style', 'Location']).groups]).fillna('')


    blue    brown   orange  purple  red
0   500_contemporary_near               
1   500_farm_far                
2   762_contemporary_near               
3           650_farm_far            
4                   200_modern_far      
5                   765_contemporary_far        
6                           1100_farm_far   
7                           300_MDM_near    
8                                   250_MDM_far