Home > database >  Date-time column names in pandas
Date-time column names in pandas

Time:11-09

I have a usage data per customer, collected per months during several years, shaped as ~(6000, 60).

Sample dataframe:

df = pd.DataFrame({'id': ['user_1', 'user_2'], 'access_type': ['mobile', 'desktop'], '2018-09-01 00:00:00': [7,5], '2018-10-01 00:00:00':[1,3], '2018-11-01 00:00:00':[0,10]})

    id  access_type 2018-09-01 00:00:00 2018-10-01 00:00:00 2018-11-01 00:00:00
0   user_1  mobile  7   1   0
1   user_2  desktop 5   3   10

How do I change 40 selected date-columns to a datetime index (?) format, or other format that will allow selecting/slicing required periods of time as date?

CodePudding user response:

Use DataFrame.melt with DataFrame.set_index:

df2 = (df.melt(['id','access_type'], var_name='date')
        .assign(date = lambda x: pd.to_datetime(x['date']))
        .set_index('date'))

print (df2)
                id access_type  value
date                                 
2018-09-01  user_1      mobile      7
2018-09-01  user_2     desktop      5
2018-10-01  user_1      mobile      1
2018-10-01  user_2     desktop      3
2018-11-01  user_1      mobile      0
2018-11-01  user_2     desktop     10

If need MultiIndex use set_index with DataFrame.stack:

s = (df.set_index(['id','access_type'])
       .stack()
       .rename(index = lambda x: pd.to_datetime(x), level=2))
print (s)

Or:

s = (df.melt(['id','access_type'], var_name='date')
        .assign(date = lambda x: pd.to_datetime(x['date']))
        .set_index(['id','access_type','date'])['value'])

print (s)
  • Related