I have three different seismic catalogs whose origin times were calculated using different methods. Naturally, the calculated values aren't exactly the same; they differ by around 5 seconds.
Catalog_1
Index Time
0 2022-05-01T08:16:55
1 2022-05-01T09:54:01
2 2022-05-01T10:25:49
3 2022-05-01T12:01:55
4 2022-05-01T18:17:23
Catalog_2
Index Time
0 2022-05-01T08:16:58.444
1 2022-05-01T10:25:46.939
2 2022-05-01T20:37:17.491
3 2022-05-01T23:34:22.539
Catalog_3
Index Time
0 2022-05-01T10:25:48
1 2022-05-01T23:34:20
2 2022-05-02T07:21:51
I want to combine these 3 dataframes into a single dataframe that automatically matches the origin times when they agree within the acceptable error.
Combined_catalog
Index Time_1 Time_2 Time_3
0 2022-05-01T08:16:55 2022-05-01T08:16:58.444 N/A
1 2022-05-01T09:54:01 N/A N/A
2 2022-05-01T10:25:49 2022-05-01T10:25:46.939 2022-05-01T10:25:48
3 2022-05-01T12:01:55 N/A N/A
4 2022-05-01T18:17:23 N/A N/A
5 N/A 2022-05-01T20:37:17.491 N/A
6 N/A 2022-05-01T23:34:22.539 2022-05-01T23:34:20
7 N/A N/A 2022-05-02T07:21:51
Is there a way to get a result similar to this without using loops and if statements?
Sometimes the catalogs cover up to 5 years of data, so it might be better to consider a different approach.
CodePudding user response:
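The pandas Series.dt.round() and DataFrame.compare() functions might be of help here: round the times to a common resolution, then match on the rounded values. Keep in mind that two times only a few seconds apart can still land on opposite sides of a rounding boundary, and that compare() only works on identically labelled dataframes, so the catalogs would need to be aligned first.
If you only need hour-level matching, use: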
pd.to_datetime(Catalog_1['Time']).dt.floor('H')
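If rounding turns out to be too coarse, another option is to stack all the catalogs, sort by time, and start a new event wherever the gap to the previous time exceeds the tolerance. The following is only a sketch of that idea, not a definitive solution. The function name combine_catalogs, the column labels, and the 5 second tolerance are assumptions made for illustration. It also assumes that real events are much further apart than the tolerance, because close times are chained together, so the first and last time in one group can differ by more than 5 seconds.

```python
import pandas as pd

# Sample data copied from the question.
catalog_1 = pd.DataFrame({"Time": [
    "2022-05-01T08:16:55", "2022-05-01T09:54:01", "2022-05-01T10:25:49",
    "2022-05-01T12:01:55", "2022-05-01T18:17:23"]})
catalog_2 = pd.DataFrame({"Time": [
    "2022-05-01T08:16:58.444", "2022-05-01T10:25:46.939",
    "2022-05-01T20:37:17.491", "2022-05-01T23:34:22.539"]})
catalog_3 = pd.DataFrame({"Time": [
    "2022-05-01T10:25:48", "2022-05-01T23:34:20", "2022-05-02T07:21:51"]})


def combine_catalogs(catalogs, tolerance="5s"):
    """Hypothetical helper: align origin times from several catalogs.

    catalogs maps an output column label to a Series of origin times.
    """
    # Stack every catalog into one long table, remembering where each time came from.
    frames = [
        pd.DataFrame({"time": pd.to_datetime(times), "source": label})
        for label, times in catalogs.items()
    ]
    events = pd.concat(frames, ignore_index=True).sort_values("time", kind="stable")

    # A gap larger than the tolerance starts a new event; cumsum numbers the events.
    new_event = events["time"].diff() > pd.Timedelta(tolerance)
    events["event"] = new_event.cumsum()

    # One row per event, one column per catalog. If a catalog has several times
    # in the same event, only the earliest one is kept.
    combined = events.groupby(["event", "source"])["time"].first().unstack("source")
    return combined.reindex(columns=list(catalogs)).reset_index(drop=True)


combined = combine_catalogs({
    "Time_1": catalog_1["Time"],
    "Time_2": catalog_2["Time"],
    "Time_3": catalog_3["Time"],
})
print(combined)
```

Missing matches show up as NaT rather than N/A. The only Python-level loop runs over the catalogs, not over the rows, so this should scale reasonably to several years of data. If strictly pairwise matching is preferred, pd.merge_asof with direction="nearest" and a tolerance is also worth a look.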