I have a pandas dataframe that looks like this:
idx | A | B |
---|---|---|
01/01/01 00:00:01 | 5 | 2 |
01/01/01 00:00:02 | 4 | 5 |
01/01/01 00:00:03 | 5 | 4 |
02/01/01 00:00:01 | 3 | 8 |
02/01/01 00:00:02 | 7 | 4 |
02/01/01 00:00:03 | 1 | 3 |
I would like to group data based on its periodicity such that the final dataframe is:
new_idx | 01/01/01 | 02/01/01 | old_column |
---|---|---|---|
00:00:01 | 5 | 3 | A |
00:00:02 | 4 | 7 | A |
00:00:03 | 5 | 1 | A |
00:00:01 | 2 | 8 | B |
00:00:02 | 5 | 4 | B |
00:00:03 | 4 | 3 | B |
Is there a way to this that holds when the first dataframe gets big (more columns, more periods and more samples)?
CodePudding user response:
One way is to melt
the DataFrame, then split the datetime to dates and times; finally pivot
the resulting DataFrame for the final output:
df = df.melt('idx', var_name='old_column')
df[['date','new_idx']] = df['idx'].str.split(expand=True)
out = df.pivot(['new_idx','old_column'], 'date', 'value').reset_index().rename_axis(columns=[None]).sort_values(by='old_column')
Output
new_idx old_column 01/01/01 02/01/01
0 00:00:01 A 5 3
2 00:00:02 A 4 7
4 00:00:03 A 5 1
1 00:00:01 B 2 8
3 00:00:02 B 5 4
5 00:00:03 B 4 3