Combine the rows of a pandas dataframe into a dictionary when a value is greater than 0

Time:01-11

I have reviewed this post. I have a dataframe with some entries. Column A holds one value per date.

    date        A
0   2023-01-01  20
1   2023-01-02  30
2   2023-01-03  10
3   2023-01-04  0
4   2023-01-05  0
5   2023-01-06  0
6   2023-01-07  45
7   2023-01-08  20
8   2023-01-09  0
9   2023-01-10  0
10  2023-01-11  15
11  2023-01-12  0

I want to look up all entries in A where the value != 0 and combine them into blocks of consecutive occurrences. The output should be a dict keyed by block number, where each value is the list of dataframe index labels in that block. Every run of successive non-zero values should form its own block, even if it contains only one entry.

What I tried so far:

# flag the rows where the value in column A is > 0
m = df['A'].gt(0)

This gives me a boolean Series, so far so good:

0      True
1      True
2      True
3     False
4     False
5     False
6      True
7      True
8     False
9     False
10     True
11    False
Name: A, dtype: bool
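
For anyone who wants to reproduce this, the frame and mask above can be rebuilt roughly as follows. This is only a sketch: building the date column with pd.date_range is an assumption, since the question shows just the printed frame and not how it was created.

```python
import pandas as pd

# Rebuild the example frame from the question.
# Assumption: 'date' is a datetime column; only its printed form is shown above.
df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=12, freq="D"),
    "A": [20, 30, 10, 0, 0, 0, 45, 20, 0, 0, 15, 0],
})

# Boolean mask: True where A is greater than 0
m = df["A"].gt(0)
print(m)
```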

Then I tried to collect the index labels of each block into a dictionary:

# aggregate as list, convert the output to dictionary
pos = df[m].reset_index().groupby(m.ne(m.shift()).cumsum())['index'].agg(list).to_dict()

Output of pos:

{1: [0, 1, 2], 2: [6, 7, 10]}

But the expected output is:

{1: [0, 1, 2], 2: [6, 7], 3: [10]}

Answer:

The expected output is unclear, but I imagine you might want to perform a groupby.agg as list with the consecutive 1 values, then convert to_dict:

# identify 1s
m = df['B'].eq(1)

# group by successive values on the slice of 1 values
# aggregate as list, convert the output to dictionary
out = df[m].groupby(m.ne(m.shift()).cumsum())['A'].agg(list).to_dict()

Output:

{1: [20, 30, 10], 3: [45, 20, 2]}

For the updated question:

m = df['A'].gt(0)

out = df.index[m].groupby(m.ne(m.shift())[m].cumsum())

Output:

{1: [0, 1, 2], 2: [6, 7], 3: [10]}
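
What seems to make the difference compared with the attempt in the question is the order of operations. There, the frame is filtered and its index reset before grouping, so the grouping key (still labelled 0 to 11) is aligned against the new positions 0 to 5 and rows 6, 7 and 10 end up with the same label. Here the change flags are filtered with [m] first and only then cumulated, so each run of non-zero values gets its own consecutive id. Note also that Index.groupby returns Index objects as the dict values. If plain Python lists are needed, a minimal sketch could look like this (the names starts and run_id are just illustrative):

```python
import pandas as pd

# Only column A is needed for the grouping itself
df = pd.DataFrame({"A": [20, 30, 10, 0, 0, 0, 45, 20, 0, 0, 15, 0]})

m = df["A"].gt(0)

# True on the first row of every run (runs of zeros and of non-zeros alike)
starts = m.ne(m.shift())

# Keep only the rows with A > 0, then number their runs 1, 2, 3, ...
run_id = starts[m].cumsum()

# Index.groupby maps each run id to an Index of row labels;
# convert keys to int and values to plain lists
out = {int(k): v.tolist() for k, v in df.index[m].groupby(run_id).items()}
print(out)
# {1: [0, 1, 2], 2: [6, 7], 3: [10]}
```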