I have reviewed this post. I have a dataframe with some entries. Column A
holds the value recorded at each date.
          date   A
0   2023-01-01  20
1   2023-01-02  30
2   2023-01-03  10
3   2023-01-04   0
4   2023-01-05   0
5   2023-01-06   0
6   2023-01-07  45
7   2023-01-08  20
8   2023-01-09   0
9   2023-01-10   0
10  2023-01-11  15
11  2023-01-12   0
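For reproducibility, here is a minimal sketch that builds the frame above. It assumes the date column is parsed as datetime and that the frame uses the default integer index; neither detail is stated explicitly, so adjust as needed.

```python
import pandas as pd

# Assumption: daily dates starting 2023-01-01 and a default RangeIndex (0..11).
df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=12, freq="D"),
    "A": [20, 30, 10, 0, 0, 0, 45, 20, 0, 0, 15, 0],
})
```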
I want to find all entries in A where the value != 0 and group each run of
consecutive nonzero rows into one block. The output should be a dict that maps
a block number to the list of dataframe index labels in that block. A run that
contains only a single row should still form its own block.
What I tried so far:
# identify in column where the value > 0
m = df['A'].gt(0)
This gives me a boolean Series, so far so good:
0 True
1 True
2 True
3 False
4 False
5 False
6 True
7 True
8 False
9 False
10 True
11 False
Name: A, dtype: bool
To output all values in a dictionary:
# aggregate as list, convert the output to dictionary
pos = df[m].reset_index().groupby(m.ne(m.shift()).cumsum())['index'].agg(list).to_dict()
Output of pos:
{1: [0, 1, 2], 2: [6, 7, 10]}
But the expected output is:
{1: [0, 1, 2], 2: [6, 7], 3: [10]}
CodePudding user response:
The expected output is unclear, but I imagine you want a groupby.agg as list
over each run of consecutive 1 values, followed by a conversion with to_dict:
# identify 1s
m = df['B'].eq(1)
# group by successive values on the slice of 1 values
# aggregate as list, convert the output to dictionary
out = df[m].groupby(m.ne(m.shift()).cumsum())['A'].agg(list).to_dict()
Output:
{1: [20, 30, 10], 3: [45, 20, 2]}
For the updated question:
m = df['A'].gt(0)
out = df.index[m].groupby(m.ne(m.shift())[m].cumsum())
Output:
{1: [0, 1, 2], 2: [6, 7], 3: [10]}
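As a self-contained sketch of the same idea, the snippet below rebuilds column A from the question and converts the result to plain Python lists. The names block and pos are only illustrative. The attempt in the question most likely merges the last two runs because reset_index() renumbers the filtered rows 0..5, while the grouper is still labelled 0..11; pandas aligns the two by label, so the rows are grouped with the wrong run ids. Filtering the grouper with [m] before cumsum() keeps the labels aligned and numbers the blocks 1, 2, 3.

```python
import pandas as pd

# Column A from the question, with the default integer index.
df = pd.DataFrame({"A": [20, 30, 10, 0, 0, 0, 45, 20, 0, 0, 15, 0]})

m = df["A"].gt(0)                     # True where A > 0
# A run starts wherever the mask differs from the previous row. Keep only
# the nonzero rows, then cumsum() numbers the surviving runs 1, 2, 3, ...
block = m.ne(m.shift())[m].cumsum()

# Index.groupby returns a dict of Index objects; convert to plain lists.
pos = {int(k): v.tolist() for k, v in df.index[m].groupby(block).items()}
print(pos)  # {1: [0, 1, 2], 2: [6, 7], 3: [10]}
```

If the values are wanted instead of the index labels, df.loc[m, 'A'].groupby(block).agg(list).to_dict() should work the same way, since block shares the labels of the filtered rows.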