I have a list of dataframe and I want to sort the order they are in the list
Each dataframe has the same structure as shown below
df1 = pd.DataFrame.from_dict({'Ch1': {0: -28, 1: -36, 2: -39, 3: -16}, 'Ch2': {0: 543, 1: 547, 2: 559, 3: 561}, 'Ch3': {0: -126, 1: -131, 2: -147, 3: -149}, 'time': {0: '2022-02-10 16.37.25.502', 1: '2022-02-10 16.37.25.502', 2: '2022-02-10 16.37.25.502', 3: '2022-02-10 16.37.25.502'}})
df2 = pd.DataFrame.from_dict({'Ch1': {0: 81, 1: 70, 2: 70, 3: 75}, 'Ch2': {0: 570, 1: 559, 2: 554, 3: 565}, 'Ch3': {0: -103, 1: -120, 2: -131, 3: -122}, 'time': {0: '2022-02-11 05.29.28.116', 1: '2022-02-11 05.29.28.116', 2: '2022-02-11 05.29.28.116', 3: '2022-02-11 05.29.28.116'}})
df3 = pd.DataFrame.from_dict({'Ch1': {0: -887, 1: -887, 2: -890, 3: -898}, 'Ch2': {0: 1307, 1: 1292, 2: 1301, 3: 1307}, 'Ch3': {0: 59, 1: 61, 2: 57, 3: 55}, 'time': {0: '2022-02-08 01.12.54.578', 1: '2022-02-08 01.12.54.578', 2: '2022-02-08 01.12.54.578', 3: '2022-02-08 01.12.54.578'}})
df_list = [df1,df2,df3]
the values in the "time" column does not change in each row within the same dataframe.
I want the dataframes in the list sorted by time (first to last) so that further processing and can match up with other data.
my attempt thus far.
for i in df_list:
b = pd.to_datetime(i['time'].iloc[0]) #grab the first cell that contains the time stamp
b = b.sort_values(by('time'))
returns the following error ValueError: ('Unknown string format:', '2022-02-05 08.03.09.794')
I would expect the dataframes to appear in the list with df3 being fist, df1, second and df2 last. I the time column is going and needs to be dropped for other operations therefore I would like them sorted in time order already
Any help suggestion alternative approaches greatly appreciated
CodePudding user response:
If you want to sort the rows of each dataframe, you need to provide the exact format of your datetime, and you should sort in place:
for d in df_list:
d['time'] = pd.to_datetime(d['time'], format='%Y-%m-%d %H.%M.%S.%f')
d.sort_values(by='time', inplace=True)
Or, if you want to sort the dataframes in the list, which is completely different, use:
df_list.sort(key=lambda d: d['time'].iloc[0])
You should be able to sort using the string due to your particular format (assuming YYYY-MM-DD).
To ensure sorting on datetime (for example if the format was MM-DD-YYYY):
df_list.sort(key=lambda d: pd.to_datetime(d['time'].iloc[0], format='%Y-%m-%d %H.%M.%S.%f'))