Home > Mobile >  Customise column names in df.pivot Python
Customise column names in df.pivot Python

Time:11-17

I would like to pivot a dataframe df as below, with Id as index, Value as values, and customised columns [x0, x1, x2].

df

            Id  Date                 Value
    9129    10  2021-01-01 00:00:00  10
    9130    10  2021-01-01 00:01:00  15
    9131    10  2021-01-01 00:02:00  20
    1972    13  2021-01-01 00:00:00  125.0
    1973    13  2021-01-01 00:01:00  225.0
    1974    13  2021-01-01 00:02:00  325.0
    1134    15  2021-01-01 00:00:00  100000
    1135    15  2021-01-01 00:01:00  200000
    1136    15  2021-01-01 00:02:00  300000

Expected output:

      x0     x1     x2            
10    10     15     20    
13    125    225    325   
15    100000 200000 300000 

I tried:

df.pivot(index = 'Id', columns = ['x0','x1','x2'],  values = 'Value')

and it caught error:

KeyError                                  Traceback (most recent call last)
<ipython-input-90-02a6ff0e046a> in <module>


----> 3 df.pivot(index = 'Id', columns = ['x0','x1','x2'],  values = 'Value')

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in pivot(self, index, columns, values)
   6665         from pandas.core.reshape.pivot import pivot
   6666 
-> 6667         return pivot(self, index=index, columns=columns, values=values)
   6668 
   6669     _shared_docs[

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\reshape\pivot.py in pivot(data, index, columns, values)
    463             index = [data[idx] for idx in index]
    464 
--> 465         data_columns = [data[col] for col in columns]
    466         index.extend(data_columns)
    467         index = MultiIndex.from_arrays(index)

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\reshape\pivot.py in <listcomp>(.0)
    463             index = [data[idx] for idx in index]
    464 
--> 465         data_columns = [data[col] for col in columns]
    466         index.extend(data_columns)
    467         index = MultiIndex.from_arrays(index)

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2897             if self.columns.nlevels > 1:
   2898                 return self._getitem_multilevel(key)
-> 2899             indexer = self.columns.get_loc(key)
   2900             if is_integer(indexer):
   2901                 indexer = [indexer]

~\AppData\Roaming\Python\Python38\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:
-> 2891                 raise KeyError(key) from err
   2892 
   2893         if tolerance is not None:

KeyError: 'x0'

How do you customise the column names? How to do it for 50 columns, ie., from x0 to x49?

CodePudding user response:

Create a sequential counter using groupby cumcount, then use it as columns in pivot_table

df.pivot_table('Value', 'Id', df.groupby('Id').cumcount()).add_prefix('x')

          x0        x1        x2
Id                              
10      10.0      15.0      20.0
13     125.0     225.0     325.0
15  100000.0  200000.0  300000.0
  • Related