In the code below I pivot a dataframe using an index date
. After the pivot, I need to get the month from column date
. This is my attempt:
df = pd.DataFrame({
'date' : [datetime(2021,3,11), datetime(2021,3,11), datetime(2021,3,11),
datetime(2021,3,12), datetime(2021,3,12), datetime(2021,3,12),
datetime(2021,3,13), datetime(2021,3,13), datetime(2021,3,13)],
'field': ['A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
'value': [150, 140, 1, 130, 280, 2, 260, 120, 4]
})
print(df)
date field value
0 2021-03-11 A 150
1 2021-03-11 B 140
2 2021-03-11 C 1
3 2021-03-12 A 130
4 2021-03-12 B 280
5 2021-03-12 C 2
6 2021-03-13 A 260
7 2021-03-13 B 120
8 2021-03-13 C 4
df_pivoted = df.pivot(index='date', columns='field', values='value')
print(df_pivoted)
field A B C
date
2021-03-11 150 140 1
2021-03-12 130 280 2
2021-03-13 260 120 4
df_pivoted['month'] = df_pivoted['date'].apply(lambda x: x.month)
I get an error, as the date
field is not a column, but I need to use it:
KeyError: 'date'
Why can't I use a column that's also an index?
CodePudding user response:
date
is the index, so you have to get the months from the index:
df_pivoted['month'] = df_pivoted.index.month
Output:
field A B C month
date
2021-03-11 150 140 1 3
2021-03-12 130 280 2 3
2021-03-13 260 120 4 3
CodePudding user response:
Another way
extract month first, set index, stack and unstack
df['month']=df.date.dt.month
df.set_index(['date','field','month']).stack().unstack('field', 'month').droplevel(level=2).reset_index()df['month']=df.date.dt.month
or
df['month']=df.date.dt.month
df_pivoted = df.pivot(index=['date','month'], columns=['field'], values='value').reset_index()
Outcome in both cases
field date month A B C
0 2021-03-11 3 150 140 1
1 2021-03-12 3 130 280 2
2 2021-03-13 3 260 120 4