How should I use pandas last function on datetime?

Time:07-01

I am trying to use the last() function on a DataFrame with a timestamp index:

import pandas as pd

df = pd.DataFrame({
        'timestamp' : ['2022-06-14 16:01:00.292000+00:00', '2022-05-18 05:00:37.843000+00:00', '2022-06-06 00:00:56.134000+00:00'],
        'otherColumn' : ['A', 'B', 'C'],
    })
# %f matches the fractional seconds and %z the UTC offset
df["timestamp"] = pd.to_datetime(df["timestamp"], format='%Y-%m-%d %H:%M:%S.%f%z')
df = df.set_index(['timestamp'])
print(df.last('1D'))

Here is what it returns:

                                 otherColumn
timestamp
2022-06-06 00:00:56.134000+00:00           C

I don't understand why it returns 2022-06-06. Shouldn't it return 2022-06-14, since that is the most recent one?

CodePudding user response:

The documentation for last mentions (although not very explicitly) that the index must be sorted:

For a DataFrame with a sorted DatetimeIndex, this function selects the last few rows based on a date offset.

Indeed, if you look at the source of last, the key step uses searchsorted, which assumes the index is already in ascending order:

start = self.index.searchsorted(start_date, side="right")
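
To make that concrete, here is a rough sketch of the logic as described above. The helper name last_like is hypothetical, and it uses pd.Timedelta in place of whatever offset handling pandas does internally, so treat it as an illustration rather than the actual pandas source. It shows that the cutoff is anchored on the final row as stored, which is only the latest timestamp once the index is sorted.

```python
import pandas as pd

# Hypothetical helper mirroring the logic quoted above; not the pandas source itself.
def last_like(frame, offset):
    # The cutoff is anchored on whatever row happens to be stored last.
    start_date = frame.index[-1] - pd.Timedelta(offset)
    # searchsorted assumes ascending order; on an unsorted index
    # the position it returns is not meaningful.
    start = frame.index.searchsorted(start_date, side="right")
    return frame.iloc[start:]

idx = pd.to_datetime(
    [
        "2022-06-14 16:01:00.292000+00:00",
        "2022-05-18 05:00:37.843000+00:00",
        "2022-06-06 00:00:56.134000+00:00",
    ],
    utc=True,
)
df = pd.DataFrame({"otherColumn": ["A", "B", "C"]}, index=idx)

unsorted_anchor = df.index[-1]              # final row as stored (2022-06-06)
sorted_anchor = df.sort_index().index[-1]   # latest timestamp (2022-06-14)

# With a sorted index, the window ends at the most recent timestamp.
result = last_like(df.sort_index(), "1D")
print(result)
```

In the unsorted frame the anchor is the 2022-06-06 row, so the one-day window is measured back from there instead of from 2022-06-14.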

Thus:

df.sort_index().last('1D')

output:

                                 otherColumn
timestamp                                   
2022-06-14 16:01:00.292000+00:00           A
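
If you would rather not depend on the index order at all, a boolean mask against the maximum timestamp should give the same rows. This is only a sketch of an alternative, not what last does internally; the names cutoff and recent are made up for the example. As far as I know, recent pandas releases (2.1 onward) deprecate last() in favour of this kind of explicit mask, so it may also be the more future-proof option.

```python
import pandas as pd

idx = pd.to_datetime(
    [
        "2022-06-14 16:01:00.292000+00:00",
        "2022-05-18 05:00:37.843000+00:00",
        "2022-06-06 00:00:56.134000+00:00",
    ],
    utc=True,
)
df = pd.DataFrame({"otherColumn": ["A", "B", "C"]}, index=idx)
df.index.name = "timestamp"

# Measure the one-day window back from the latest timestamp,
# regardless of where that row sits in the frame.
cutoff = df.index.max() - pd.Timedelta("1D")

# Strictly greater than the cutoff, matching side="right" in searchsorted.
recent = df[df.index > cutoff]
print(recent)
```

The rows come back in their stored order, so add sort_index() afterwards if the order matters.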