I want to check if a column value changes for each unique value of "Name".
The dataframe I have looks like this
Name | Is finished? |
---|---|
Paul | No |
Paul | No |
Paul | No |
John | No |
John | No |
John | Yes |
John | Yes |
Ringo | No |
Ringo | No |
Ringo | Yes |
Ringo | Yes |
Ringo | Yes |
My desired output is an extra column which either says "True" if a value has changed, and "False" if it didn't change. Desired output would look like this:
| Name | Is finished? | Change|
| -------- | -------------- |---|
| Paul | No| False|
| Paul | No|False|
| Paul | No|False|
| John | No|False|
| John | No|False|
| John | Yes|True|
| John| Yes|False|
| Ringo | No|False|
| Ringo | No|False|
| Ringo | Yes|True|
| Ringo | Yes|False|
| Ringo | Yes|False|
CodePudding user response:
Let us do groupby
and shift
to compare the current and previous row in Is finished
column
c = 'Is finished?'
df['Change'] = df[c].eq('Yes') & df.groupby('Name')[c].shift().eq('No')
Name Is finished? Change
0 Paul No False
1 Paul No False
2 Paul No False
3 John No False
4 John No False
5 John Yes True
6 John Yes False
7 Ringo No False
8 Ringo No False
9 Ringo Yes True
10 Ringo Yes False
11 Ringo Yes False
CodePudding user response:
import pandas as pd
df = pd.DataFrame({'Name': {0: 'Paul',1: 'Paul',2: 'Paul',3: 'John', 4: 'John',5: 'John',6: 'John',7: 'Ringo',8: 'Ringo',9: 'Ringo',10: 'Ringo',11: 'Ringo'},'Is finished?': {0: 'No',1: 'No',2: 'No',3: 'No',4: 'No',5: 'Yes',6: 'Yes',7: 'No',8: 'No',9: 'Yes',10: 'Yes',11: 'Yes'}})
result = df.drop(df.index)
for _, g in df.groupby('Name'):
g['Shift'] = g['Is finished?'].shift(1)
g['Changed?'] = g['Is finished?'] != g['Shift']
result = pd.concat([result, g.drop(columns=['Shift'])], axis=0)
print(result)
Output:
Name Is finished? Changed?
3 John No True
4 John No False
5 John Yes True
6 John Yes False
0 Paul No True
1 Paul No False
2 Paul No False
7 Ringo No True
8 Ringo No False
9 Ringo Yes True
10 Ringo Yes False
11 Ringo Yes False