I have a dataframe df
as below:
# Import pandas library
import pandas as pd
# initialize list elements
data = [10,-20,30,40,-50,60,12,-12,11,1,90,-20,-10,-5,-4]
# Create the pandas DataFrame with column name is provided explicitly
df = pd.DataFrame(data, columns=['Numbers'])
# print dataframe.
df
I want the sum of count of max consecutive positive and negative numbers.
I am able to get count of max consucutive positive and negative numbers, but unable to sum using below code.
my code:
streak = df['Numbers'].to_list()
from collections import defaultdict
from itertools import groupby
counter = defaultdict(list)
for key, val in groupby(streak, lambda ele: "plus" if ele >= 0 else "minus"):
counter[key].append(len(list(val)))
lst = []
for key in ('plus', 'minus'):
lst.append(counter[key])
print("Max Pos Count " str(max(lst[0])))
print("Max Neg Count : " str(max(lst[1])))
Current Output:
Max Pos Count 3
Max Neg Count : 4
I am struggling to get sum of max consuctive positive and negative.
Expected Output:
Sum Pos Max Consecutive: 102
Sum Neg Max Consecutive: -39
CodePudding user response:
The logic is unclear, the way I understand it is:
- group by successive negative/positive values
- get the longest stretch per group
- compute the sum
You can use:
m = df['Numbers'].gt(0).map({True: 'positive', False: 'negative'})
df2 = df.groupby([m, m.ne(m.shift()).cumsum()])['Numbers'].agg(['count', 'sum'])
out = df2.loc[df2.groupby(level=0)['count'].idxmax(), 'sum'].droplevel(1)
Output:
Numbers
negative -39
positive 102
Name: sum, dtype: int64
Intermediate df2
:
count sum
Numbers Numbers
negative 2 1 -20
4 1 -50
6 1 -12
8 4 -39 # longest negative stretch
positive 1 1 10
3 2 70
5 2 72
7 3 102 # longest positive stretch