I have a large pandas dataframe as follows.
Input df
member_id attribute1
1 AM
1 A
26 TSE
26 TSA
26 TSE
----
97736 TMA
97736 TTE
97736 TMA
97736 ALM
Is there any way I can efficiently shift every second value in attribute1
column to a new column lets say attribute2
for each unique member_id
index.
Output df
member_id attribute1 attribute2
1 AM A
26 TSE TSA
26 TSA TSE
----
97736 TMA TTE
97736 TTE TMA
97736 TMA ALM
CodePudding user response:
Try:
# shift the rows up
df['attribute2'] = df.groupby('member_id')['attribute1'].shift(-1)
# drop the last row for each `member_id`
df = df[df['member_id'].duplicated(keep='last')]
Output:
member_id attribute1 attribute2
0 1 AM A
2 26 TSE TSA
3 26 TSA TSE
5 97736 TMA TTE
6 97736 TTE TMA
7 97736 TMA ALM