Pandas dataframe - Sum columns up to a pre given value, return index-CodePudding

I working with pandas dataframe, and cant figure out this problem:

I think I may need some for loops, but I am stuck in this one!

If the sum from bottom and up in column A is 28, i want to return the index where the sum is 28. In this example it will be 10 7 11 = 28, and the index(Date) is 5. So i want to return 5.

Date__A
0_____11
1_____9
2_____10
3_____8
4_____2
5_____11
6_____7
7_____10

CodePudding user response：

Using the following df:

 df = pd.DataFrame({'Date':[0, 1, 2, 3, 4, 5, 6, 7],
                       'A':[11, 9, 10, 8, 2, 11, 7, 10]})
    
 df = df.set_index('Date')

You can find the backwards cumulative sum, by reversing the dataframe. You can then reverse this list, and add it as another column to your original dataframe:

cumsum = df[::-1].cumsum()['A'].to_list()
cumsum.reverse()
df['cumsum'] = cumsum

Then you can get the first index of the subset of the df where the cumsum is <=28 (this will return the closest index where the sum is <28 if it doesn't add exactly to 28).

 index = df.loc[df['cumsum'] <= 28].first_valid_index()

CodePudding user response：

Use:

import pandas as pd

# setup
df = pd.DataFrame.from_dict({'Date': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7},
                             'A': {0: 11, 1: 9, 2: 10, 3: 8, 4: 2, 5: 11, 6: 7, 7: 10}})

res = df.iloc[::-1, 1].cumsum().eq(28).idxmax()
print(res)

Output

CodePudding user response：

Start from computing a temporary Series:

wrk = df.set_index('Date').A

To compute the index of your "wanted" element, counting from the top, run:

res = wrk[wrk.cumsum() == 28]
iFirst = res.index[0] if res.size > 0 else np.nan

To compute the index counting from the bottom, you should compute the cummsum also from the bottom:

res = wrk[wrk[::-1].cumsum() == 28]
iLast = res.index[-1] if res.size > 0 else np.nan