I have two dataframe, and would like to keep only row that both matches exactly on index (in this case datetime), and would like to return as two separate dataframe accordingly.
df1:
DATETIME,ACCEL_X,ACCEL_Y,ACCEL_Z
2021-10-05 17:04:36,-0.0078125,-0.008544921875,0.9970703125
2021-10-05 17:04:37,0.009521484375,-0.011474609375,0.97900390625
2021-10-05 17:04:38,-0.013427734375,-0.0111083984375,0.9793701171875
2021-10-05 17:04:39,-0.007080078125,0.0028076171875,0.98583984375
2021-10-05 17:04:40,0.012939453125,-0.005615234375,0.996826171875
2021-10-05 17:04:41,-0.0062255859375,0.008056640625,1.0028076171875
2021-10-05 17:04:42,0.0177001953125,-0.001953125,1.0042724609375
2021-10-05 17:04:43,-0.0057373046875,-0.0030517578125,0.99072265625
2021-10-05 17:04:44,0.006103515625,0.0101318359375,1.0035400390625
2021-10-05 17:04:45,0.002197265625,-0.0029296875,0.9827880859375
2021-10-05 17:04:46,-0.001708984375,-0.0045166015625,0.9974365234375
2021-10-05 17:04:47,0.0057373046875,-0.0048828125,0.9930419921875
df2:
DATETIME,ACCEL_X,ACCEL_Y,ACCEL_Z
2021-10-05 17:04:37,-0.03488372093023256,-0.018604651162790697,0.046511627906976744
2021-10-05 17:04:38,-0.006730769230769231,0.0,0.0
2021-10-05 17:04:39,0.02,0.0,0.0
2021-10-05 17:04:40,0.0,0.0,0.0
2021-10-05 17:04:41,0.13653846153846153,0.020192307692307693,-0.10480769230769231
2021-10-05 17:04:42,0.08571428571428572,0.0,0.10571428571428573
2021-10-05 17:04:43,-0.12115384615384615,0.0,-0.06346153846153847
Desired output:
df1:
DATETIME,ACCEL_X,ACCEL_Y,ACCEL_Z
2021-10-05 17:04:37,0.009521484375,-0.011474609375,0.97900390625
2021-10-05 17:04:38,-0.013427734375,-0.0111083984375,0.9793701171875
2021-10-05 17:04:39,-0.007080078125,0.0028076171875,0.98583984375
2021-10-05 17:04:40,0.012939453125,-0.005615234375,0.996826171875
2021-10-05 17:04:41,-0.0062255859375,0.008056640625,1.0028076171875
2021-10-05 17:04:42,0.0177001953125,-0.001953125,1.0042724609375
df2:
DATETIME,ACCEL_X,ACCEL_Y,ACCEL_Z
2021-10-05 17:04:37,-0.03488372093023256,-0.018604651162790697,0.046511627906976744
2021-10-05 17:04:38,-0.006730769230769231,0.0,0.0
2021-10-05 17:04:39,0.02,0.0,0.0
2021-10-05 17:04:40,0.0,0.0,0.0
2021-10-05 17:04:41,0.13653846153846153,0.020192307692307693,-0.10480769230769231
2021-10-05 17:04:42,0.08571428571428572,0.0,0.10571428571428573
CodePudding user response:
Use align
with inner
join:
new_df1, new_df2 = df1.align(df2, join='inner')
*Note this will align both index and columns (which works for the provided sample), however, we can also specify the axis to only align the index if needed:
new_df1, new_df2 = df1.align(df2, join='inner', axis=0)
new_df1
:
ACCEL_X ACCEL_Y ACCEL_Z
DATETIME
2021-10-05 17:04:37 0.009521 -0.011475 0.979004
2021-10-05 17:04:38 -0.013428 -0.011108 0.979370
2021-10-05 17:04:39 -0.007080 0.002808 0.985840
2021-10-05 17:04:40 0.012939 -0.005615 0.996826
2021-10-05 17:04:41 -0.006226 0.008057 1.002808
2021-10-05 17:04:42 0.017700 -0.001953 1.004272
2021-10-05 17:04:43 -0.005737 -0.003052 0.990723
new_df2
:
ACCEL_X ACCEL_Y ACCEL_Z
DATETIME
2021-10-05 17:04:37 -0.034884 -0.018605 0.046512
2021-10-05 17:04:38 -0.006731 0.000000 0.000000
2021-10-05 17:04:39 0.020000 0.000000 0.000000
2021-10-05 17:04:40 0.000000 0.000000 0.000000
2021-10-05 17:04:41 0.136538 0.020192 -0.104808
2021-10-05 17:04:42 0.085714 0.000000 0.105714
2021-10-05 17:04:43 -0.121154 0.000000 -0.063462