I have a dataframe looking like this.
A
2013-01-05 00:00:00 0
2013-01-05 01:00:00 0
2013-01-05 02:00:00 5
2013-01-05 03:00:00 20
2013-01-05 04:00:00 10
2013-01-05 05:00:00 0
2013-01-05 06:00:00 0
2013-01-05 07:00:00 3
2013-01-05 08:00:00 6
I tried to select the sub-dataframes with positive values and extract their indexes:
List = df[df['A'] > 0].index.tolist()
For each sub-dataframe (each run of consecutive positive values) I want the first and last positive elements, which for this dataframe are [[5, 10], [3, 6]], collected as sub-lists in a list, but returned as their index labels rather than their values.
Desired output: List = [[2013-01-05 02:00:00, 2013-01-05 04:00:00], [2013-01-05 07:00:00, 2013-01-05 08:00:00]]
CodePudding user response:
You could try:
idx_list = (
    df
    .assign(
        group=df["A"].gt(0).diff().fillna(False).cumsum(), idx=df.index
    )[df["A"].gt(0)]
    .groupby("group").agg({"idx": lambda col: [col.iat[0], col.iat[-1]]})
    .idx.to_list()
)
Result with
df =
A
2013-01-05 00:00:00 0
2013-01-05 01:00:00 0
2013-01-05 02:00:00 5
2013-01-05 03:00:00 20
2013-01-05 04:00:00 10
2013-01-05 05:00:00 0
2013-01-05 06:00:00 0
2013-01-05 07:00:00 3
2013-01-05 08:00:00 6
is
[[Timestamp('2013-01-05 02:00:00'), Timestamp('2013-01-05 04:00:00')],
[Timestamp('2013-01-05 07:00:00'), Timestamp('2013-01-05 08:00:00')]]
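In case it helps to see the grouping step spelled out, here is a rough sketch of the same idea in separate steps. It is not the code from the answer above, just a variant: the names `positive`, `run_id`, `stamps` and `bounds` are made up for illustration, and the sample frame is rebuilt by hand on the assumption that the index is an hourly DatetimeIndex.

```python
import pandas as pd

# Sample data rebuilt by hand (assumed hourly DatetimeIndex, as in the question).
df = pd.DataFrame(
    {"A": [0, 0, 5, 20, 10, 0, 0, 3, 6]},
    index=pd.to_datetime([f"2013-01-05 {h:02d}:00:00" for h in range(9)]),
)

# True where the value is positive.
positive = df["A"] > 0

# A new run starts wherever the mask differs from the previous row,
# so the cumulative sum gives every run of equal mask values its own id.
run_id = (positive != positive.shift()).cumsum()

# Keep only the positive rows and take the first/last index label per run.
stamps = df.index.to_series()[positive]
bounds = stamps.groupby(run_id[positive]).agg(["first", "last"])

# Build the nested list row by row so the entries stay Timestamp objects.
idx_list = [[first, last] for first, last in zip(bounds["first"], bounds["last"])]
print(idx_list)
```

With the sample data this should give the same two pairs as above (02:00 to 04:00 and 07:00 to 08:00).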
CodePudding user response:
Select the column instead of the index, then take the first and last elements of the resulting list:
my_list = df[df['A'] > 0]['A'].to_list()
my_list = [my_list[0], my_list[-1]]
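As far as I can tell, the snippet above returns values rather than index labels, and it looks at all positive rows together instead of at each run separately. A small check on the sample data, assuming the same hourly DatetimeIndex as in the question (the names `positive_rows`, `value_bounds` and `label_bounds` are only for illustration):

```python
import pandas as pd

# Sample data rebuilt by hand (assumed hourly DatetimeIndex, as in the question).
df = pd.DataFrame(
    {"A": [0, 0, 5, 20, 10, 0, 0, 3, 6]},
    index=pd.to_datetime([f"2013-01-05 {h:02d}:00:00" for h in range(9)]),
)

positive_rows = df[df["A"] > 0]

# First and last positive VALUE over the whole column.
values = positive_rows["A"].to_list()
value_bounds = [values[0], values[-1]]

# First and last index LABEL over the whole column (still not split per run).
labels = positive_rows.index.to_list()
label_bounds = [labels[0], labels[-1]]

print(value_bounds)
print(label_bounds)
```

So this covers the first and last positive element overall; getting one pair per run still needs some grouping of consecutive positive rows.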