How to identify Winner Week when the previous row is not equal to the current row. Week is classified as a "Winner" when the [Weekly_counts] is greater than the [Winner_Num] and the previous week is not a Winner.
Here a copy of the final data set:
Year ISOweeknum Weekly_Counts NumOfWeeks Yearly_Count WeeklyAverage Winner_Num
0 2017 9 1561 44 12100 275 330
1 2017 10 1001 44 12100 275 330
2 2017 11 451 44 12100 275 330
3 2017 12 513 44 12100 275 330
4 2017 13 431 44 12100 275 330
... ... ... ... ... ... ... ...
232 2021 32 136 36 4212 117 140
233 2021 33 84 36 4212 117 140
234 2021 34 95 36 4212 117 140
235 2021 35 120 36 4212 117 140
236 2021 53 77 36 4212 117 140
I've tried using this code but not getting the results desired:
new_df3['Winner_Results'] = 0
for i in range(len(new_df3)-1):
if (new_df3['Weekly_Votes_Counts'].iloc[i] > new_df3['Winner_Num'].iloc[i]) & (new_df3['Weekly_Votes_Counts'].iloc[i 1] > new_df3['Winner_Num'].iloc[i 1]):
new_df3['Winner_Results'].iloc[i] = 'Not Winner'
else:
new_df3['Winner_Results'].iloc[i] = 'Winner'
.py:6: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy new_df3['Winner_Results'].iloc[i] = 'Winner'
The Expected Result:
[![Excel Example][1]][1]
[1]: https://i.stack.imgur.com/BEcuX.png
CodePudding user response:
Here's a way to get the result in your question:
df['Counts_Gt_Num'] = df.Weekly_Counts > df.Winner_Num
df['Cumsum'] = df['Counts_Gt_Num'].cumsum()
df.loc[(~df['Counts_Gt_Num']) & df['Counts_Gt_Num'].shift(), 'Subtract'] = df['Cumsum']
df['Is_Winner'] = (df['Cumsum'] - df['Subtract'].ffill().fillna(0)).mod(2)
df['Winner_Results'] = np.where(df['Is_Winner'], 'Winner', 'Not Winner')
df = df.drop(columns=['Counts_Gt_Num', 'Cumsum', 'Subtract', 'Is_Winner'])
Output:
Year ISOweeknum Weekly_Counts NumOfWeeks Yearly_Count WeeklyAverage Winner_Num Winner_Results
0 2017 9 1561 44 12100 275 330 Winner
1 2017 10 1001 44 12100 275 330 Not Winner
2 2017 11 451 44 12100 275 330 Winner
3 2017 12 513 44 12100 275 330 Not Winner
4 2017 13 431 44 12100 275 330 Winner
5 2017 14 371 44 12100 275 330 Not Winner
6 2017 15 361 44 12100 275 330 Winner
7 2017 16 336 44 12100 275 330 Not Winner
8 2017 17 332 44 12100 275 330 Winner
9 2017 18 124 44 12100 275 330 Not Winner
10 2017 19 142 44 12100 275 330 Not Winner
11 2017 20 290 44 12100 275 330 Not Winner
12 2017 21 116 44 12100 275 330 Not Winner
13 2017 22 142 44 12100 275 330 Not Winner
14 2017 23 132 44 12100 275 330 Not Winner
15 2017 24 69 44 12100 275 330 Not Winner
16 2017 25 124 44 12100 275 330 Not Winner
17 2017 26 136 44 12100 275 330 Not Winner
18 2017 27 63 44 12100 275 330 Not Winner
Explanation:
- mark rows with counts > num using boolean in new column
Counts_Gt_Num
- put cumsum of the above in new column
Cumsum
- create new column
Subtract
by coping fromCumsum
for rows whereCounts_Gt_Num
is False but was True in the preceding row, and forward fill usingffill()
for rows with nan - create
Is_Winner
column selecting as winners the rows at an even offset (0, 2, 4 ...) within a streak of non-zero values inCumsum - Subtract
- create
Winner_Results
by assigning the desired win/no-win value based onIs_Winner
- drop intermediate columns.