Home > Back-end >  Find list of column names having their minimum values in a given date range using Python
Find list of column names having their minimum values in a given date range using Python

Time:11-12

Given a dataset as follows and a date range from 2013-05-01 to 2013-05-15:

import pandas as pd
import numpy as np

np.random.seed(2021)
dates = pd.date_range('20130226', periods=90)
df = pd.DataFrame(np.random.uniform(0, 10, size=(90, 6)), index=dates, columns=['A_values', 'B_values', 'C_values', 'D_values', 'E_values', 'target'])

Out:

            A_values  B_values  C_values  D_values  E_values    target
2013-02-26  6.059783  7.333694  1.389472  3.126731  9.972433  1.281624
2013-02-27  1.789931  7.529254  6.621605  7.843101  0.968944  0.585713
2013-02-28  9.623960  6.165574  0.866300  5.612724  6.165247  9.638430
2013-03-01  5.743043  3.711608  4.521452  2.018502  5.693051  1.950960
2013-03-02  5.837040  4.763135  5.178144  8.230986  7.322250  0.690563
             ...       ...       ...       ...       ...       ...
2013-05-22  8.795736  6.316478  0.427136  3.864723  6.803052  5.554824
2013-05-23  7.959282  1.839659  2.225667  5.140268  2.756443  8.440801
2013-05-24  5.412016  9.081583  7.212742  5.890717  0.430184  0.968086
2013-05-25  1.088414  9.016004  5.384490  1.610210  5.461017  0.672555
2013-05-26  4.930571  2.338785  9.823048  6.893207  6.312104  5.273122

First I filter columns by df.filter(regex='_values$'), then I hope to return a list of column names whose minimum value falls in the given date range (2013-05-01, 2013-05-15), ie. If column A_values's minimum value is in any day in this range, then A_values will be contained in the returned result list.

How could I achieve that in Pandas or Numpy? Thanks.

CodePudding user response:

Use DataFrame.idxmin for minimal datetimes per columns and then filter index by Series.between_time

s = df.filter(regex='_values$').idxmin()

out = s[s.between('2013-05-01','2013-05-15')].index.tolist()
print (out)
['D_values']
  • Related