Home > Net >  How to compare two dataframe timeindex and exclude the NaN rows to get the same length?
How to compare two dataframe timeindex and exclude the NaN rows to get the same length?

Time:08-01

I have two time series dataframes. One has some nan values. I can filter out the nans with dropna() and after that I get some missing hours in the time series. What I want is to compare this dataframe timeindex with another dataframe timeindex and exclude the missing hours from the first dataframe. Say like I have 8 rows in the first dataframe and the second dataframe has 6 after dropna, I also want to remove this 2 row with the corresponding timeindex hour in dataframe one. How can I do that?

Dataframe 1:

                              values_1
                                                       
2019-05-07 01:00:00 00:00     50.5
2019-05-07 02:00:00 00:00     45.0
2019-05-07 03:00:00 00:00     50.4
2019-05-07 04:00:00 00:00     61.9
2019-05-07 05:00:00 00:00     34.9
2019-05-07 06:00:00 00:00     23.9
2019-05-07 07:00:00 00:00     68.4
2019-05-07 08:00:00 00:00     85.4

Dataframe 2

                              values_2
                                                       
2019-05-07 01:00:00 00:00     43.5
2019-05-07 02:00:00 00:00     23.0
2019-05-07 03:00:00 00:00     86.4
2019-05-07 04:00:00 00:00     62.9
2019-05-07 05:00:00 00:00     NaN
2019-05-07 06:00:00 00:00     65.9
2019-05-07 07:00:00 00:00     NaN
2019-05-07 08:00:00 00:00     32.4

Now I want to compare this with Dataframe 1 and exclude the 2 rows with the NaN from Dataframe 2 in Dataframe 1.

What I want:

Dataframe 1_filtered:

                              values_1
                                                       
2019-05-07 01:00:00 00:00     50.5
2019-05-07 02:00:00 00:00     45.0
2019-05-07 03:00:00 00:00     50.4
2019-05-07 04:00:00 00:00     61.9
2019-05-07 06:00:00 00:00     23.9
2019-05-07 08:00:00 00:00     85.4

Dataframe 2_filtered

                              values_2
                                                       
2019-05-07 01:00:00 00:00     43.5
2019-05-07 02:00:00 00:00     23.0
2019-05-07 03:00:00 00:00     86.4
2019-05-07 04:00:00 00:00     62.9
2019-05-07 06:00:00 00:00     65.9
2019-05-07 08:00:00 00:00     32.4

CodePudding user response:

You can just pass the index of the filtered df2 to df1

df1.loc[df2.dropna().index]
  • Related