I have a single DataFrame in the following format:
Date Hour Information
2020-11-28 11:14:28 10.0
2020-11-28 11:14:30 15.4
2020-11-28 11:14:33 6.9
2020-11-28 11:14:35 27.0
2020-11-28 11:14:37 70.0
2020-11-28 11:14:40 37.1
2020-11-28 11:14:42 2.4
2020-11-28 11:15:20 11.9
2020-11-28 11:15:22 14.0
2020-11-28 11:15:24 122.8
2020-11-28 11:15:27 10.12
2020-11-28 11:15:29 56.86
2020-11-28 11:15:31 00.54
2020-11-28 11:15:34 01.87
2020-11-28 11:15:36 1.0
2020-11-28 11:24:21 45.45
2020-11-28 11:24:23 9.0
2020-11-28 11:24:26 90.5
2020-11-28 11:24:28 0.0
2020-11-28 11:24:30 5.34
. . .
. . .
. . .
2020-11-30 10:34:12 10.0
2020-11-30 10:34:14 15.4
I need to organize the information in the right column into a new DataFrame containing the Hour and Information for each date in the left column, so that I have something like the following:
DataFrame1:
Datetime Hour Information
2020-11-28 11:14:28 10.0
2020-11-28 11:14:30 15.4
. . .
. . .
. . .
2020-11-28 23:59:00 4.42
DataFrame2:
Datetime Hour Information
2020-11-29 00:00:00 18.7
. . .
. . .
. . .
2020-11-29 23:59:00 7.54
And so on with all the other days I have, which do not necessarily start on January 1st, but always have records for consecutive days. I've tried to use .groupby(), but couldn't find a way to use it without computing the mean, sum, etc.
CodePudding user response:
If the multiple DataFrames you want can be stored in a tuple, you could do the following:
def format_df(df, date_time):
    # Attach the day's date to every row and renumber the rows from 0
    df["DateTime"] = date_time
    return df.reset_index(drop=True)

# Iterating over a groupby yields (group key, sub-DataFrame) pairs,
# so no aggregation (mean, sum, etc.) is ever applied
dfs_array = tuple(
    format_df(grouped_df, date_time)
    for date_time, grouped_df in original_df.groupby("Date")
)
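As a self-contained sketch with a small made-up sample (column names "Date", "Hour", "Information" taken from your example; the values here are just placeholders):

```python
import pandas as pd

# Hypothetical sample spanning two days
original_df = pd.DataFrame({
    "Date": ["2020-11-28", "2020-11-28", "2020-11-29", "2020-11-29"],
    "Hour": ["11:14:28", "11:14:30", "00:00:00", "23:59:00"],
    "Information": [10.0, 15.4, 18.7, 7.54],
})

def format_df(df, date_time):
    # Work on a copy so the original DataFrame is left untouched
    df = df.copy()
    df["DateTime"] = date_time
    return df.reset_index(drop=True)

# One DataFrame per distinct date; groupby is used only for splitting,
# never for aggregating
dfs_array = tuple(
    format_df(grouped_df, date_time)
    for date_time, grouped_df in original_df.groupby("Date")
)

print(len(dfs_array))   # one entry per day in the data
print(dfs_array[0])     # all rows for 2020-11-28
```

Indexing into `dfs_array` then gives you the per-day DataFrames you called DataFrame1, DataFrame2, and so on.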