Below is a sample dataframe from a much larger set of data. I need to create a new column 'Is a Manager?' that contains boolean results 'True' or 'False'. The condition is: is the 'Employee ID' listed anywhere within the 'Manager ID' column of the dataset?
import pandas as pd
df = pd.DataFrame({'Worker': ['Sam', 'Tom', 'Justin', 'Jake'], 'Employee ID': [12345, 12121, 67891, 99991], 'Manager ID': [97483, 29601, 85863, 19739]})
df
   Worker  Employee ID  Manager ID
0     Sam        12345       97483
1     Tom        12121       29601
2  Justin        67891       85863
3    Jake        99991       19739
and so on....
I have tried to use the .isin function. The column was added successfully, but all values state False, when I know some should be True.
For example, Sam's Employee ID 12345 is listed on line 245 as person X's manager ('Manager ID' = 12345).
Any idea where I've gone wrong? My code is:
df3 = df.loc[:, ['Worker', 'Employee ID', 'Manager ID']]
df3.insert(1, 'Is a Manager?', df3['Employee ID'].isin(['Manager ID']))
df3
  Worker  Is a Manager?  Employee ID  Manager ID
0      A          False       221113     1210236
1      B          False       221359    86082653
2      C          False       295142     1718020
3      D          False       775199     1910236
CodePudding user response:
The problem is in this line:
df3.insert(1, 'Is a Manager?', df3['Employee ID'].isin(['Manager ID']))
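You are checking whether each Employee ID is in a list containing the single string "Manager ID", not in the values of the 'Manager ID' column. No integer ID ever equals that string, so every row comes back False.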
The line should be:
df3.insert(1, 'Is a Manager?', df3['Employee ID'].isin(df3['Manager ID']))
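To see the difference between the two calls side by side, here is a minimal sketch. The data is hypothetical: the 'Manager ID' values have been changed from the sample in the question so that Sam's and Tom's IDs actually appear as managers, and the names wrong and right are only for illustration.

```python
import pandas as pd

# Hypothetical data (not the asker's real dataset): the 'Manager ID'
# values are chosen so that Sam (12345) and Tom (12121) manage someone.
df = pd.DataFrame({
    'Worker': ['Sam', 'Tom', 'Justin', 'Jake'],
    'Employee ID': [12345, 12121, 67891, 99991],
    'Manager ID': [97483, 12345, 12345, 12121],
})

# Original attempt: each ID is compared against the one string 'Manager ID'.
wrong = df['Employee ID'].isin(['Manager ID'])

# Corrected call: each ID is compared against the column's values.
right = df['Employee ID'].isin(df['Manager ID'])

df3 = df.loc[:, ['Worker', 'Employee ID', 'Manager ID']]
df3.insert(1, 'Is a Manager?', right)

print(wrong.tolist())  # every entry is False
print(df3)
```

With this data, 'Is a Manager?' is True for Sam and Tom and False for Justin and Jake, because .isin checks membership across the whole 'Manager ID' column rather than row by row.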