I have the following dataset Data in Python:
Fac y
Date
2017-06-19 A 0.350
2017-06-20 A 1.220
2017-06-21 A 8.730
2017-06-22 A 9.340
2017-06-23 A 4.300
... ... ...
2022-01-25 Z 3.144
2022-01-26 Z 2.278
2022-01-27 Z 1.352
2022-01-28 Z 0.312
2022-01-31 Z 0.312
and I want to pivot them from long to wider that it's column to have the name of it's Fac category and to contain the values of y corresponding to each category.
I tried :
df = pd.pivot(Data, index='Date', columns='Fac', values='y')
but doesn't work.Any help ?
CodePudding user response:
In your code
df = pd.pivot(Data.reset_index(), index='Date', columns='Fat', values='y')
CodePudding user response:
Use pivot_table
:
df = pd.pivot_table(Data, 'y', Data.index, 'Fac')
print(df)
# Output
Fac A Z
Date
2017-06-19 0.35 NaN
2017-06-20 1.22 NaN
2017-06-21 8.73 NaN
2017-06-22 9.34 NaN
2017-06-23 4.30 NaN
2022-01-25 NaN 3.144
2022-01-26 NaN 2.278
2022-01-27 NaN 1.352
2022-01-28 NaN 0.312
2022-01-31 NaN 0.312
You can also use fill_value=0
as parameter of pivot_table
to replace NaN
by 0:
df = pd.pivot_table(Data, 'y', Data.index, 'Fac', fill_value=0)
print(df)
# Output
Fac A Z
Date
2017-06-19 0.35 0.000
2017-06-20 1.22 0.000
2017-06-21 8.73 0.000
2017-06-22 9.34 0.000
2017-06-23 4.30 0.000
2022-01-25 0.00 3.144
2022-01-26 0.00 2.278
2022-01-27 0.00 1.352
2022-01-28 0.00 0.312
2022-01-31 0.00 0.312
Note: if you have multiple values for a same (Date, Fac)
, use aggfunc
parameter to apply an operation to the duplicate values: mean
, first
, last
, max
, min
, sum
or a custom function.
For example, use the first value found:
df = pd.pivot_table(Data, 'y', Data.index, 'Fac', fill_value=0, aggfunc='first')
print(df)
# Output (doesn't change here because you have only unique (Date, Fac)
Fac A Z
Date
2017-06-19 0.35 0.000
2017-06-20 1.22 0.000
2017-06-21 8.73 0.000
2017-06-22 9.34 0.000
2017-06-23 4.30 0.000
2022-01-25 0.00 3.144
2022-01-26 0.00 2.278
2022-01-27 0.00 1.352
2022-01-28 0.00 0.312
2022-01-31 0.00 0.312