I'm looking for a way to always retrieve the row with index and value out of two dataframes.
These two dataframes have a timeseries index of the same length, moreover they both have 1 single column that contains the value of interest.
So far my approach gives me the min index (could be applied to max as well)
pd.concat([df1.col1, df2.col1], axis=1).idxmin()
which gives me
col1 2022-06-04 22:40:00
col1 2022-06-05 01:50:00
dtype: datetime64[ns]
Then I subset the dataframe which has the minimum:
df1.col1['2022-06-04 22:40:00']
But this process is manual and I'd like to always retrieve the minimum, and what if the second dataframe contains the lowest value out of the two? How could I make sure I'm getting the lowest other than manually checking.
CodePudding user response:
you need to add max() command its available in pandas.
PD.COLUMNS.MAX(column name) same for minimum value.
it will help.
CodePudding user response:
Use DataFrame.agg
with 4 aggregate functions for get indices and values by min
and max
:
df1 = pd.DataFrame({'col1':[1,2,6]}, index=pd.date_range('2010-01-02',periods=3))
df2 = pd.DataFrame({'col1':[10,20,6]}, index=pd.date_range('2010-01-02',periods=3))
out = (pd.concat([df1.col1, df2.col1], axis=1, keys=['df1','df2'])
.agg(['idxmin', 'idxmax', 'min', 'max']))
print (out)
df1 df2
idxmin 2010-01-02 00:00:00 2010-01-04 00:00:00
idxmax 2010-01-04 00:00:00 2010-01-03 00:00:00
min 1 6
max 6 20
print (out.loc['idxmin', 'df2'])
2010-01-04 00:00:00
print (out.loc['min', 'df2'])
6