I working with pandas dataframe, and cant figure out this problem:
I think I may need some for loops, but I am stuck in this one!
If the sum from bottom and up in column A is 28, i want to return the index where the sum is 28. In this example it will be 10 7 11 = 28, and the index(Date) is 5. So i want to return 5.
Date__A
0_____11
1_____9
2_____10
3_____8
4_____2
5_____11
6_____7
7_____10
CodePudding user response:
Using the following df:
df = pd.DataFrame({'Date':[0, 1, 2, 3, 4, 5, 6, 7],
'A':[11, 9, 10, 8, 2, 11, 7, 10]})
df = df.set_index('Date')
You can find the backwards cumulative sum, by reversing the dataframe. You can then reverse this list, and add it as another column to your original dataframe:
cumsum = df[::-1].cumsum()['A'].to_list()
cumsum.reverse()
df['cumsum'] = cumsum
Then you can get the first index of the subset of the df where the cumsum is <=28 (this will return the closest index where the sum is <28 if it doesn't add exactly to 28).
index = df.loc[df['cumsum'] <= 28].first_valid_index()
CodePudding user response:
Use:
import pandas as pd
# setup
df = pd.DataFrame.from_dict({'Date': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7},
'A': {0: 11, 1: 9, 2: 10, 3: 8, 4: 2, 5: 11, 6: 7, 7: 10}})
res = df.iloc[::-1, 1].cumsum().eq(28).idxmax()
print(res)
Output
5
CodePudding user response:
Start from computing a temporary Series:
wrk = df.set_index('Date').A
To compute the index of your "wanted" element, counting from the top, run:
res = wrk[wrk.cumsum() == 28]
iFirst = res.index[0] if res.size > 0 else np.nan
To compute the index counting from the bottom, you should compute the cummsum also from the bottom:
res = wrk[wrk[::-1].cumsum() == 28]
iLast = res.index[-1] if res.size > 0 else np.nan