I have a pandas dataframe with two columns, both are datetime instances. The first column is made of measurement timings and the second column is the sum of the first column with a constant offset. E.g assuming a constant offset of 1 gives:
index | Measurement_time | offset_time |
---|---|---|
0 | 0.1 | 1.2 |
1 | 0.5 | 1.5 |
2 | 1.2 | 2.2 |
3 | 2.4 | 3.4 |
I would like to find the index of each measurement_time that closest matches the offset_time with the condition that the measurement_time must be smaller than or equal to the offset_time. The solution to the given example would therefore be:
index = [2, 2, 2, 3]
I have tried using get_loc
and making a mask but because my dataframe is large, these solutions are too inefficient.
Any help would be greatly appreciated!
CodePudding user response:
Let us use np.searchsorted
to find the indices of closest matches
s = df['Measurement_time'].sort_values()
np.searchsorted(s, df['offset_time'], side='right') - 1
Result:
array([2, 2, 2, 3], dtype=int64)
Note: You may skip the .sort_values
part if your dataframe is already sorted on the column Measurement_time