I would like to pivot a dataframe df
as below, with Id
as index, Value
as values, and customised columns [x0, x1, x2]
.
df
Id Date Value
9129 10 2021-01-01 00:00:00 10
9130 10 2021-01-01 00:01:00 15
9131 10 2021-01-01 00:02:00 20
1972 13 2021-01-01 00:00:00 125.0
1973 13 2021-01-01 00:01:00 225.0
1974 13 2021-01-01 00:02:00 325.0
1134 15 2021-01-01 00:00:00 100000
1135 15 2021-01-01 00:01:00 200000
1136 15 2021-01-01 00:02:00 300000
Expected output:
x0 x1 x2
10 10 15 20
13 125 225 325
15 100000 200000 300000
I tried:
df.pivot(index = 'Id', columns = ['x0','x1','x2'], values = 'Value')
and it caught error:
KeyError Traceback (most recent call last)
<ipython-input-90-02a6ff0e046a> in <module>
----> 3 df.pivot(index = 'Id', columns = ['x0','x1','x2'], values = 'Value')
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in pivot(self, index, columns, values)
6665 from pandas.core.reshape.pivot import pivot
6666
-> 6667 return pivot(self, index=index, columns=columns, values=values)
6668
6669 _shared_docs[
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\reshape\pivot.py in pivot(data, index, columns, values)
463 index = [data[idx] for idx in index]
464
--> 465 data_columns = [data[col] for col in columns]
466 index.extend(data_columns)
467 index = MultiIndex.from_arrays(index)
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\reshape\pivot.py in <listcomp>(.0)
463 index = [data[idx] for idx in index]
464
--> 465 data_columns = [data[col] for col in columns]
466 index.extend(data_columns)
467 index = MultiIndex.from_arrays(index)
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\frame.py in __getitem__(self, key)
2897 if self.columns.nlevels > 1:
2898 return self._getitem_multilevel(key)
-> 2899 indexer = self.columns.get_loc(key)
2900 if is_integer(indexer):
2901 indexer = [indexer]
~\AppData\Roaming\Python\Python38\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:
-> 2891 raise KeyError(key) from err
2892
2893 if tolerance is not None:
KeyError: 'x0'
How do you customise the column names? How to do it for 50 columns, ie., from x0
to x49
?
CodePudding user response:
Create a sequential counter using groupby
cumcount
, then use it as columns in pivot_table
df.pivot_table('Value', 'Id', df.groupby('Id').cumcount()).add_prefix('x')
x0 x1 x2
Id
10 10.0 15.0 20.0
13 125.0 225.0 325.0
15 100000.0 200000.0 300000.0