How can I check whether the values in the 'accuracy' column are within 1 of each other inside each 'vin' group? A value should pass only if at least one other value for the same 'vin' differs from it by at most 1, so a 'vin' needs at least 2 such values; otherwise the result is fail.
Below is my DataFrame:
import pandas as pd
df = pd.DataFrame({'vin':['aaa','aaa','aaa','aaa','bbb','bbb','bbb','bbb','ccc','ccc','ccc','ddd'],
'accuracy':[1,2,3,9,22,23,211,212,34,39,40,55]})
df
My expected output is the 'Result' column:
df = pd.DataFrame({'vin':['aaa','aaa','aaa','aaa','bbb','bbb','bbb','bbb','ccc','ccc','ccc','ddd'],
'accuracy':[1,2,3,9,22,23,211,212,34,39,40,55],'Result':['pass','pass','pass','fail','pass','pass','pass','pass','fail','pass','pass','fail']})
df
output:
    vin  accuracy Result
0   aaa         1   pass
1   aaa         2   pass
2   aaa         3   pass
3   aaa         9   fail
4   bbb        22   pass
5   bbb        23   pass
6   bbb       211   pass
7   bbb       212   pass
8   ccc        34   fail
9   ccc        39   pass
10  ccc        40   pass
11  ddd        55   fail
CodePudding user response:
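Assuming the data is sorted, you can compute a diff per group and check that the diff is ≤ 1. That mask marks rows close to their previous neighbour; its per-group shift marks rows close to their next neighbour. Combine the two and feed the result to numpy.where: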
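import numpy as np
# if not sorted
# df = df.sort_values(by=['vin', 'accuracy'])
# True where a value is within 1 of the previous value in its 'vin' group
mask = df.groupby('vin')['accuracy'].diff().le(1)
# a row passes if it is close to its previous or its next neighbour
df['Result'] = np.where(mask|mask.groupby(df['vin']).shift(-1), 'pass', 'fail')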
output:
    vin  accuracy Result
0   aaa         1   pass
1   aaa         2   pass
2   aaa         3   pass
3   aaa         9   fail
4   bbb        22   pass
5   bbb        23   pass
6   bbb       211   pass
7   bbb       212   pass
8   ccc        34   fail
9   ccc        39   pass
10  ccc        40   pass
11  ddd        55   fail
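If the rows are not already sorted, the same idea can be wrapped in a small helper that sorts internally and then restores the original row order. This is only a sketch: the function name flag_close_values and its parameters (group_col, value_col, tol) are made up for illustration, and it assumes the DataFrame index is unique so that reindex can map the results back. It also passes fill_value=False to the grouped shift so the shifted mask stays boolean instead of picking up NaN at the end of each group.

```python
import numpy as np
import pandas as pd


def flag_close_values(df, group_col="vin", value_col="accuracy", tol=1):
    """Return 'pass'/'fail' per row, aligned to df's original index.

    Hypothetical helper; assumes df has a unique index.
    """
    # Sort within each group so diff() compares nearest neighbours.
    ordered = df.sort_values([group_col, value_col])
    # True where a row is within `tol` of the previous row of its group.
    near_prev = ordered.groupby(group_col)[value_col].diff().le(tol)
    # True where the next row of the same group is within `tol` of this row.
    near_next = near_prev.groupby(ordered[group_col]).shift(-1, fill_value=False)
    result = pd.Series(
        np.where(near_prev | near_next, "pass", "fail"), index=ordered.index
    )
    # Put the labels back in the caller's row order.
    return result.reindex(df.index)


df = pd.DataFrame({
    "vin": ["aaa", "aaa", "aaa", "aaa", "bbb", "bbb", "bbb", "bbb",
            "ccc", "ccc", "ccc", "ddd"],
    "accuracy": [1, 2, 3, 9, 22, 23, 211, 212, 34, 39, 40, 55],
})
df["Result"] = flag_close_values(df)
print(df)
```

Because the helper sorts a copy and reindexes at the end, the input frame keeps whatever order it had, and a different threshold can be tried by changing tol.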