I have a data frame containing several registers from sold products, revenues, prices and dates
DATA CLIENT PRODUCT PRICE
2020-08-28 xxxxxxx RIOT 20.0
I am to group my information by year/month and product. I am running a group by to_period that extract the exactly information :
dfgmv = dfgift[['PRODUCT','PRICE']].groupby([dfgift.DATA.dt.to_period("M"), 'PRODUCT']).agg(['count','sum'])
This is the output :
PRICE
count sum
DATA PRODUCT
2020-08 RIOT 2 40.00
The question is that, as I export to excel, the date column is not interpreted as da date (yyyy-mm). I am trying to convert the yyyy-mm to something like yyyy-mm-dd so Excel understand it.
I´ve read several questions about multi index but my knowledge wasn't enough to use that info to apply here. I tried to change my index to datetime, but, as I run it, I lost the second index column (product).
dfgmv.index = pd.to_datetime(dfgmv.index.get_level_values(0).astype('datetime64[ns]'))
.
VALOR
count sum
DATA
2020-08-01 2 40.00
So, How can I change the information format without losing my index?
CodePudding user response:
Index.set_levels
is designed to allow for the setting of specific index level(s).
dfgmv.index = (
dfgmv.index.set_levels(dfgmv.index.levels[0].astype('datetime64[ns]'), level=0)
)
Result
PRICE
count sum
DATA PRODUCT
2020-08-01 RIOT 1 20.0
CodePudding user response:
You can change your groupby
to include start of month for each date you have:
dfgmv = dfgift[['PRODUCT','PRICE']].groupby([dfgift.DATA.astype('datetime64[M]'), 'PRODUCT']).agg(['count','sum'])
dfgmv
PRICE
count sum
DATA PRODUCT
2020-08-01 RIOT 1 20.0