How to remove Dataframe from list if in exclusion list?

Time:02-16

I have a DataFrame of users.

I also have a list of employee numbers to exclude. If a user is in this list, I want to leave them out of my current output.

My current output still contains users from ooc-exceptions.csv, but if a user is in that CSV, I don't want to process them.

I have already tried the following:

condition = lj_df['PrimaryUser'].isin(exclude_df['LastLogonUser'])
lj_df.drop(lj_df[condition].index, inplace = True)

but it doesn't give me the end result I want.

df = (lj_df.LastLogonUser)
Employee Number 
s128331
s150792 
s128535
s128726
s129103
P306823 
s129835 
s109424 
s125025 
P305503
dE = pd.read_csv("ooc-exceptions.csv")
prohibited users:
s110856
s129103
s128331
s131420
s128726
s128350
s128535
s110991
s141490
s145811
s150640

End Result should look like this:

s150792
P306823
s129835
s109424
s125025
P305503

If a user is prohibited, I don't want to display them.

CodePudding user response:

If you use the isin method (here prohibited is the DataFrame read from ooc-exceptions.csv):

df['Employee Number'].isin(prohibited['prohibited users'])

you get a boolean Series that is True for each row whose employee number appears in the prohibited users. Since you want the opposite, apply the negation operator ~:

~df['Employee Number'].isin(prohibited['prohibited users'])

This creates a boolean Series that is True only for rows whose employee number is not in the prohibited users.

In pandas, you can pass a boolean Series (or array) to the indexing operator to select the rows where it is True; this is called boolean indexing.

Putting it together:

out = df[~df['Employee Number'].isin(prohibited['prohibited users'])]

Output:

Employee Number
        s150792
        P306823
        s129835
        s109424
        s125025
        P305503
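
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It builds both DataFrames in memory from the values in the question instead of reading ooc-exceptions.csv, and the column names 'Employee Number' and 'prohibited users' are assumptions taken from the headers shown there, so adjust them to match your real data.

```python
import pandas as pd

# Sample data copied from the question. The column names are assumptions
# based on the headers shown there; the real CSV may differ.
df = pd.DataFrame({"Employee Number": [
    "s128331", "s150792", "s128535", "s128726", "s129103",
    "P306823", "s129835", "s109424", "s125025", "P305503",
]})
prohibited = pd.DataFrame({"prohibited users": [
    "s110856", "s129103", "s128331", "s131420", "s128726", "s128350",
    "s128535", "s110991", "s141490", "s145811", "s150640",
]})

# True for rows whose employee number appears in the exclusion list.
mask = df["Employee Number"].isin(prohibited["prohibited users"])

# Negate the mask to keep only rows that are NOT in the exclusion list.
out = df[~mask]
print(out.to_string(index=False))
```

Note that this returns a new DataFrame and leaves df untouched, so you need to print or assign the result to see it.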

CodePudding user response:

You can use set operations together with isin to filter your DataFrame:

allowed_users = set(df['Employee Number']).difference(dE['prohibited users'])

df['allowed users'] = df['Employee Number'].isin(allowed_users)
print(df)

# Output
  Employee Number  allowed users
0         s128331          False
1         s150792           True
2         s128535          False
3         s128726          False
4         s129103          False
5         P306823           True
6         s129835           True
7         s109424           True
8         s125025           True
9         P305503           True

To show allowed users:

>>> df[df['allowed users']]
  Employee Number  allowed users
1         s150792           True
5         P306823           True
6         s129835           True
7         s109424           True
8         s125025           True
9         P305503           True
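
If you don't need the helper column afterwards, the same set difference can be used to filter directly. This is only a sketch under the same assumptions as above: the data is typed in from the question rather than read from ooc-exceptions.csv, and the column names are guesses based on the headers shown there.

```python
import pandas as pd

# Sample data copied from the question. The column names are assumptions.
df = pd.DataFrame({"Employee Number": [
    "s128331", "s150792", "s128535", "s128726", "s129103",
    "P306823", "s129835", "s109424", "s125025", "P305503",
]})
dE = pd.DataFrame({"prohibited users": [
    "s110856", "s129103", "s128331", "s131420", "s128726", "s128350",
    "s128535", "s110991", "s141490", "s145811", "s150640",
]})

# Employee numbers that do not appear in the exclusion list.
allowed_users = set(df["Employee Number"]).difference(dE["prohibited users"])

# Keep only allowed rows, without adding a helper column, and renumber the index.
result = df[df["Employee Number"].isin(allowed_users)].reset_index(drop=True)
print(result)
```

The reset_index(drop=True) call is optional; it just renumbers the remaining rows from 0 instead of keeping the original positions.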