How do you sum a dataframe based off a grouping in Python pandas?

Time:12-21

I have a for loop with the intent of checking for values greater than zero.

The problem is that I only want each iteration to check the sum for one group of IDs.

The grouping would be a match of the first 8 characters of the ID string.

I have that grouping taking place before the loop but the loop still appears to search the entire df instead of each group.

LeftGroup = newDF.groupby('ID_Left_8')
for g in LeftGroup.groups:
    if sum(newDF['Hours_Calc'] > 0):
        print(g)

Is there a way to filter that sum to each grouping of leftmost 8 characters?

I was expecting the .groups attribute to accomplish this, but it still seems to search every single ID.

Thank you.

CodePudding user response:

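def filter_and_sum(hours):
    # Sum only the positive values in one group's Hours_Calc column.
    return hours[hours > 0].sum()

# One group per distinct value of ID_Left_8 (the first 8 characters of the ID).
LeftGroup = newDF.groupby('ID_Left_8')
# Apply the function to each group's Hours_Calc values separately.
results = LeftGroup['Hours_Calc'].apply(filter_and_sum)
print(results)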

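This computes, for each group, the sum of the Hours_Calc values that are greater than zero. The result is a Series indexed by the ID_Left_8 values (the leftmost 8 characters of each ID), with each group's sum as the value. The original loop never looked at the groups: LeftGroup.groups only supplies the group keys, while the condition inside the loop still references the whole newDF.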
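If the goal is to keep the loop from the question, a minimal sketch could look like the following. It assumes the intent is "print each group whose Hours_Calc total is above zero", which the question does not state exactly. The sample IDs and hours are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
newDF = pd.DataFrame({
    "ID": ["ABCDEFGH01", "ABCDEFGH02", "ZYXWVUTS01", "ZYXWVUTS02", "QRSTUVWX01"],
    "Hours_Calc": [2.0, -1.0, 0.0, 0.0, 3.5],
})
# Grouping key: the first 8 characters of each ID.
newDF["ID_Left_8"] = newDF["ID"].str[:8]

# Iterating over the GroupBy object itself yields (key, sub-DataFrame) pairs,
# so the condition is evaluated on one group at a time.
positive_groups = []
for key, group in newDF.groupby("ID_Left_8"):
    if group["Hours_Calc"].sum() > 0:
        positive_groups.append(key)
print(positive_groups)

# Loop-free alternative: sum per group, then keep the totals above zero.
totals = newDF.groupby("ID_Left_8")["Hours_Calc"].sum()
positive_totals = totals[totals > 0]
print(positive_totals)
```

The difference from the loop in the question is that each iteration receives the rows of a single group, rather than only the group key.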
