Home > Mobile >  sort the order of dataframes in a list of dataframes based on a value in each dataframe
sort the order of dataframes in a list of dataframes based on a value in each dataframe

Time:04-09

I have a list of dataframe and I want to sort the order they are in the list

Each dataframe has the same structure as shown below

df1 = pd.DataFrame.from_dict({'Ch1': {0: -28, 1: -36, 2: -39, 3: -16}, 'Ch2': {0: 543, 1: 547, 2: 559, 3: 561}, 'Ch3': {0: -126, 1: -131, 2: -147, 3: -149}, 'time': {0: '2022-02-10 16.37.25.502', 1: '2022-02-10 16.37.25.502', 2: '2022-02-10 16.37.25.502', 3: '2022-02-10 16.37.25.502'}})
df2 = pd.DataFrame.from_dict({'Ch1': {0: 81, 1: 70, 2: 70, 3: 75}, 'Ch2': {0: 570, 1: 559, 2: 554, 3: 565}, 'Ch3': {0: -103, 1: -120, 2: -131, 3: -122}, 'time': {0: '2022-02-11 05.29.28.116', 1: '2022-02-11 05.29.28.116', 2: '2022-02-11 05.29.28.116', 3: '2022-02-11 05.29.28.116'}})
df3 = pd.DataFrame.from_dict({'Ch1': {0: -887, 1: -887, 2: -890, 3: -898}, 'Ch2': {0: 1307, 1: 1292, 2: 1301, 3: 1307}, 'Ch3': {0: 59, 1: 61, 2: 57, 3: 55}, 'time': {0: '2022-02-08 01.12.54.578', 1: '2022-02-08 01.12.54.578', 2: '2022-02-08 01.12.54.578', 3: '2022-02-08 01.12.54.578'}})

df_list = [df1,df2,df3]

the values in the "time" column does not change in each row within the same dataframe.

I want the dataframes in the list sorted by time (first to last) so that further processing and can match up with other data.

my attempt thus far.


for i in df_list:
    b = pd.to_datetime(i['time'].iloc[0]) #grab the first cell that contains the time stamp
    b = b.sort_values(by('time'))

returns the following error ValueError: ('Unknown string format:', '2022-02-05 08.03.09.794')

I would expect the dataframes to appear in the list with df3 being fist, df1, second and df2 last. I the time column is going and needs to be dropped for other operations therefore I would like them sorted in time order already

Any help suggestion alternative approaches greatly appreciated

CodePudding user response:

If you want to sort the rows of each dataframe, you need to provide the exact format of your datetime, and you should sort in place:

for d in df_list:
    d['time'] = pd.to_datetime(d['time'], format='%Y-%m-%d %H.%M.%S.%f')
    d.sort_values(by='time', inplace=True)

Or, if you want to sort the dataframes in the list, which is completely different, use:

df_list.sort(key=lambda d: d['time'].iloc[0])

You should be able to sort using the string due to your particular format (assuming YYYY-MM-DD).

To ensure sorting on datetime (for example if the format was MM-DD-YYYY):

df_list.sort(key=lambda d: pd.to_datetime(d['time'].iloc[0], format='%Y-%m-%d %H.%M.%S.%f'))
  • Related