I have a dataframe:
Blocks Response RT Name
1 7 True 4630 ya__79891
2 7 True 4610 ya__79891
3 6 True 4390 ya__79891
4 6 True 5190 ya__79891
5 6 True 4260 ya__79891
6 5 True 3560 ya__79891
7 5 True 3610 ya__79891
I want to keep the first (top-most) rows that share the same Blocks value and have Response as True three times consecutively, i.e. in the example above keep only rows 3, 4 and 5 and remove all the others.
The desired output:
Blocks Response RT Name
3 6 True 4390 ya__79891
4 6 True 5190 ya__79891
5 6 True 4260 ya__79891
Is there a shorter way to do this? Here is the incomplete code I have so far:
from collections import Counter
import pandas as pd

df_1['Points'] = df_1['Response'].astype(int)  # convert True/False to 1/0
df_1 = df_1[df_1.Points != 0]  # remove all the False/0 rows
tmp = list(df_1['Blocks'])  # list of the values in the Blocks column
tmp = dict(Counter(tmp))  # dict mapping each Blocks value to its count
wm = []  # empty list for storing Blocks values
for key, value in tmp.items():
    if value == 3:
        wm.append(key)  # first Blocks value that occurs exactly 3 times
        break
df2 = pd.DataFrame()  # new data frame for saving values
df2['span'] = wm
df2['Name'] = name  # name is defined elsewhere in my code
I am still not able to get everything I need, as described above, with this code.
Can someone help?
User response:
Try this:
df[df.groupby(['Blocks', 'Response'])['RT'].transform('size') > 2]
Output:
Blocks Response RT Name
3 6 True 4390 ya__79891
4 6 True 5190 ya__79891
5 6 True 4260 ya__79891
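Note that the one-liner above counts rows per (Blocks, Response) group across the whole frame, so it does not check that the rows are adjacent or that Response is True. If that matters, a run-based variant could look like the sketch below. It rebuilds the sample data from the question, and the helper name first_true_run and its parameter n are made up for illustration. It assumes Response is a boolean column and that "3 times" means a run of at least 3 rows.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame(
    {
        "Blocks": [7, 7, 6, 6, 6, 5, 5],
        "Response": [True] * 7,
        "RT": [4630, 4610, 4390, 5190, 4260, 3560, 3610],
        "Name": ["ya__79891"] * 7,
    },
    index=range(1, 8),
)


def first_true_run(frame, n=3):
    """Hypothetical helper: first run of >= n adjacent rows with the
    same Blocks value and Response == True."""
    # A new run starts whenever Blocks or Response differs from the previous row.
    changed = (frame["Blocks"] != frame["Blocks"].shift()) | (
        frame["Response"] != frame["Response"].shift()
    )
    run_id = changed.cumsum()
    # Length of the run each row belongs to.
    run_len = frame.groupby(run_id)["Blocks"].transform("size")
    # Keep only rows in long-enough runs where Response is True.
    candidates = frame[frame["Response"] & (run_len >= n)]
    if candidates.empty:
        return candidates
    # Restrict to the top-most qualifying run.
    first_run = run_id.loc[candidates.index[0]]
    return candidates[run_id.loc[candidates.index] == first_run]


print(first_true_run(df))
```

On the sample data this should select rows 3, 4 and 5, the same as the groupby one-liner. The two differ only when equal Blocks values are not adjacent or when some Response values are False.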