How to do a conditional merge in Pandas

Time:05-17

I have two different dataframes shown below.

This is the tel_times dataframe

And this is the maint_comp1 dataframe.

Now, I have joined these two dataframes using merge.

maint_tel_comp1 = pd.merge(tel_times, maint_comp1, how='inner', left_on=['machineID', 'datetime_tel'], right_on=['machineID', 'datetime_maint'])

The result is

I want to apply conditions on the two datetime columns I have.

Something like this,

maint_tel_comp1 = (telemetry_times.join(maint_comp1,
                                        ((telemetry_times['machineID'] == maint_comp1['machineID'])
                                         & (telemetry_times['datetime_tel'] > maint_comp1['datetime_maint'])
                                         & (maint_comp1['comp1sum'] == '1'))))

This is in PySpark but I want to do it in Pandas.

I tried the following filter on the merged dataframe to apply the same condition.

maint_tel_comp1[maint_tel_comp1['datetime_tel'] > maint_tel_comp1['datetime_maint']]

But it returns an empty dataframe.

CodePudding user response:

I believe what you want is:

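# Join on machineID only, then keep rows where telemetry is later than maintenance.
maint_tel_comp1 = tel_times.merge(maint_comp1, on='machineID', how='inner')
maint_tel_comp1 = maint_tel_comp1[maint_tel_comp1['datetime_tel'].gt(maint_tel_comp1['datetime_maint'])]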

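The problem is that your merge used the datetime columns as join keys, so it only kept rows where datetime_tel == datetime_maint. The filter datetime_tel > datetime_maint could therefore never match, which is why it returned an empty DataFrame.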
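
To make the difference concrete, here is a minimal sketch on made-up data. The original frames were only posted as images, so the rows below are hypothetical. It also assumes comp1sum holds the string '1', as the PySpark snippet in the question suggests. The sketch runs the original merge (which leaves nothing for the filter to find) and then the machineID-only merge followed by both conditions.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question, values are invented.
tel_times = pd.DataFrame({
    "machineID": [1, 1, 2],
    "datetime_tel": pd.to_datetime(["2015-01-02", "2015-01-05", "2015-01-03"]),
})
maint_comp1 = pd.DataFrame({
    "machineID": [1, 2],
    "datetime_maint": pd.to_datetime(["2015-01-02", "2015-01-04"]),
    "comp1sum": ["1", "1"],  # assumed to be strings, as in the PySpark condition
})

# Original approach: the datetimes are join keys, so they are equal in every
# surviving row and a strict ">" filter cannot match anything.
strict = pd.merge(tel_times, maint_comp1, how="inner",
                  left_on=["machineID", "datetime_tel"],
                  right_on=["machineID", "datetime_maint"])
empty = strict[strict["datetime_tel"] > strict["datetime_maint"]]

# Suggested approach: join on machineID only, then apply the conditions.
merged = tel_times.merge(maint_comp1, on="machineID", how="inner")
mask = (merged["datetime_tel"].gt(merged["datetime_maint"])
        & merged["comp1sum"].eq("1"))
result = merged[mask].reset_index(drop=True)

print(empty)
print(result)
```

Note that merging on machineID alone pairs every telemetry row with every maintenance row for that machine before filtering, so the intermediate frame can get large when both tables have many rows per machine.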
