I have a CSV file with several columns. Some of these columns have values and some don't. What code should I write to get the output described below?
   Id    Name  Library  Line Source  Line Destination
0  59    Ayla      2.0           57                34
1  60  Mahmut      2.0           14                22
2  61    Mine      2.0           22                43
3  62    Greg      2.0           14                62
4  63  Mahmut      2.0           14                33
5  64    Fiko      2.0           33                82
6  65  Jasmin                    82                27
7  66  Mahmut      2.0           43                11
8  67  Ashley      2.0           62                53
Expected result
The conditions are as follows: the 'Name' column should contain 'Mahmut', and the number in the 'Line Source' column should not appear anywhere in the 'Line Destination' column.
So the records at index 1 and 4 should be selected.
Finally, the 'Line Source' values of these two records should be deduplicated, so the output should be only the number 14.
Is there a pandas way to do that?
CodePudding user response:
df[(df.Name == 'Mahmut') & (~df['Line Source'].isin(df['Line Destination']))]['Line Source'].unique()
CodePudding user response:
Given:
   Id    Name  Library  Line Source  Line Destination
0  59    Ayla      2.0           57                34
1  60  Mahmut      2.0           14                22
2  61    Mine      2.0           22                43
3  62    Greg      2.0           14                62
4  63  Mahmut      2.0           14                33
5  64    Fiko      2.0           33                82
6  65  Jasmin      NaN           82                27
7  66  Mahmut      2.0           43                11
8  67  Ashley      2.0           62                53
Doing:
>>> df[df.Name.eq('Mahmut') & ~df['Line Source'].isin(df['Line Destination'])]
   Id    Name  Library  Line Source  Line Destination
1  60  Mahmut      2.0           14                22
4  63  Mahmut      2.0           14                33
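The filter above returns both matching rows, while the question asks for the single value 14. Below is a minimal, self-contained sketch that rebuilds the sample table and then takes the unique 'Line Source' values. The DataFrame is typed in by hand from the question rather than read from the CSV, and the variable names (is_mahmut, source_is_destination, matches, unique_sources) are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the empty Library cell becomes NaN.
# In practice this frame would presumably come from pd.read_csv(...).
df = pd.DataFrame({
    "Id": [59, 60, 61, 62, 63, 64, 65, 66, 67],
    "Name": ["Ayla", "Mahmut", "Mine", "Greg", "Mahmut",
             "Fiko", "Jasmin", "Mahmut", "Ashley"],
    "Library": [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, np.nan, 2.0, 2.0],
    "Line Source": [57, 14, 22, 14, 14, 33, 82, 43, 62],
    "Line Destination": [34, 22, 43, 62, 33, 82, 27, 11, 53],
})

# Condition 1: the name is exactly 'Mahmut'.
is_mahmut = df["Name"].eq("Mahmut")

# Condition 2: the row's 'Line Source' appears somewhere in 'Line Destination'.
source_is_destination = df["Line Source"].isin(df["Line Destination"])

# Keep rows that satisfy condition 1 and do NOT satisfy condition 2.
matches = df[is_mahmut & ~source_is_destination]

# Collapse the duplicated source values into the distinct ones.
unique_sources = matches["Line Source"].unique()
print(unique_sources)  # prints [14]
```

Row 7 is also a 'Mahmut' row, but its source 43 appears as a destination in row 2, so it is dropped. If 'contained in the name column' is meant as a substring match rather than an exact match, df["Name"].str.contains("Mahmut", na=False) could replace the eq call.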