I have a dataset like below in pandas dataframe:
Name Shift Data Type
Peter 0 12 A
Peter 0 13 A
Peter 0 14 B
Sam 1 12 A
Sam 1 15 A
Sam 1 16 B
Sam 1 17 B
Mary 2 20 A
Mary 2 21 A
Mary 2 12 A
May anyone suggest how to show end result like the below? (logic is: if shift is 0, pick the 1st item under groupby "Name" and "type" columns; if shift is 1, pick the 2nd value under the groupby "Name" and "type" columns, etc... I have thought of nth(x) but I don't know how to put a variable on x in this case. Other workaround is fine that can generated the same result. Thank you.
Name Shift Data Type
Peter 0 12 A
Peter 0 14 B
Sam 1 15 A
Sam 1 17 B
Mary 2 12 A
CodePudding user response:
You can use groupby.cumcount()
Assuming your data is in a DataFrame called df
, I think this should work for you:
df = df[df.groupby(['Name','Type']).cumcount()==df['Shift']]
It compares the cumulative count of rows with the same Name and Type to the values in the Shift column to determine which rows should be kept