Pandas dataframe rows won't drop

Time:01-21

Background - I am trying to drop rows from a pandas DataFrame (exceptions_df) if all three conditions are met.

Conditions -

  1. The Ownership Audit Note column value contains the partial string ignore or Ignore.
  2. The Entity ID % column value is == Account # % (this column is formatted as float64).
  3. The % Ownership column is == 100 (this column is formatted as float64).

Extract from dataframe -

  % Ownership     Ownership Audit Note       Entity ID %     Account # %  
0 100.00          [ignore] 100% Ownership    0.0000000       0.0000000  
1 100.00          [ignore] 100% Ownership    0.0000000       0.0000000  
2 100.00          [ignore] 100% Ownership    0.0000000       0.0000000  
3 100.00          [ignore] 100% Ownership    0.0000000       0.0000000    
4 100.00          [ignore] 100% Ownership    0.0000000       0.0000000    
5 100.00          [ignore] 100% Ownership    1.0000000       1.0000000  
8 100.00          [ignore] 100% Ownership    0.0000234       0.0000234  
9 100.00          [ignore] 100% Ownership    0.0000000       0.0000000

My code -

exceptions_df = exceptions_df[~exceptions_df['Ownership Audit Note'].str.contains('ignore'|'Ignore') & 
                             [~exceptions_df['% Ownership'] == 100] & 
                             [~exceptions_df['Account # %'] == 'Entity ID %']]

Issue - I am getting the following TypeError, which references the above line of code. Have I missed something obvious? Strangely, if I include only the first condition (the first line of code), it works fine!

TypeError: unsupported operand type(s) for |: 'str' and 'str'

CodePudding user response:

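The conditions use the wrong kind of brackets. Square brackets around a condition build a Python list rather than a boolean mask, and ~ binds more tightly than ==, so each comparison needs its own parentheses. Let's try: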

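# Combine the three conditions first, then negate, so only rows meeting all of them are dropped
exceptions_df = exceptions_df[~((exceptions_df['Ownership Audit Note'].str.contains('ignore|Ignore')) &
                                (exceptions_df['% Ownership'] == 100) &
                                (exceptions_df['Account # %'] == exceptions_df['Entity ID %']))]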
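
To see why the parentheses matter, here is a small sketch on a made-up integer Series (the name s and its values are assumptions for illustration, not data from the question). Without parentheses, ~s == 100 is evaluated as (~s) == 100, and wrapping a condition in square brackets gives a plain Python list instead of a mask.

```python
import pandas as pd

# Hypothetical integer Series, only for illustration
s = pd.Series([100, 90, 100])

# ~ is applied first: bitwise NOT of 100 is -101, so no element equals 100
unparenthesised = (~s == 100)

# Parentheses force the comparison first, then the negation
parenthesised = ~(s == 100)

# Square brackets wrap the boolean mask in a one-element Python list
wrapped = [s == 100]

print(unparenthesised.tolist())
print(parenthesised.tolist())
print(type(wrapped).__name__)
```

Only the parenthesised form produces the boolean mask that can be combined with & and used for row selection.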

CodePudding user response:

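You need to remove the inner quotes in .str.contains(). 'ignore'|'Ignore' applies | to two Python strings, which raises the TypeError, whereas 'ignore|Ignore' is a single regex pattern that matches either word. I made a dummy DataFrame as an example.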

import pandas as pd

df_dict = {'Ownership Audit Note': ['ignore', 'Ignore', 'bar', 'foo', 'ignore'],
           '% Ownership': [100, 90, 80, 70, 60],
           'Account # %': [1, 2, 3, 7, 6],
           'Entity ID %': [1, 2, 6, 7, 6]}

exceptions_df = pd.DataFrame(df_dict)

# Build one mask that is True only where all three conditions hold, then drop those rows
drop_mask = (exceptions_df['Ownership Audit Note'].str.contains('ignore|Ignore') &
             (exceptions_df['% Ownership'] == 100) &
             (exceptions_df['Account # %'] == exceptions_df['Entity ID %']))
exceptions_df = exceptions_df[~drop_mask]

print(exceptions_df)

    Ownership Audit Note    % Ownership Account # % Entity ID %
1   Ignore                  90          2           2
2   bar                     80          3           6
3   foo                     70          7           7
4   ignore                  60          6           6
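
Since the question says the percentage columns are float64, exact == between two float columns can miss values that differ only by rounding. Below is a rough sketch of a more tolerant variant. It assumes NumPy is available and that the default tolerance of np.isclose suits this data (both are my assumptions, not something the question states). The sample rows are made up, and the 'needs review' note is hypothetical. case=False stands in for the 'ignore|Ignore' alternation, regex=False treats the pattern as a plain substring, and na=False makes missing notes count as no match.

```python
import numpy as np
import pandas as pd

# Made-up rows loosely modelled on the extract in the question
exceptions_df = pd.DataFrame({
    '% Ownership': [100.0, 100.0, 100.0, 50.0],
    'Ownership Audit Note': ['[ignore] 100% Ownership', '[Ignore] 100% Ownership',
                             'needs review', None],
    'Entity ID %': [0.0, 0.1 + 0.2, 1.0, 0.0],
    'Account # %': [0.0, 0.3, 1.0, 0.0],
})

# Case-insensitive substring match; rows with a missing note count as False
note_has_ignore = exceptions_df['Ownership Audit Note'].str.contains(
    'ignore', case=False, regex=False, na=False)

# Tolerant float comparisons instead of exact ==
fully_owned = np.isclose(exceptions_df['% Ownership'], 100)
ids_match = np.isclose(exceptions_df['Entity ID %'], exceptions_df['Account # %'])

# Drop a row only when all three conditions hold
drop_mask = note_has_ignore & fully_owned & ids_match
filtered_df = exceptions_df[~drop_mask]

print(filtered_df.index.tolist())
```

In this sketch the second row has 0.1 + 0.2 in one column and 0.3 in the other. An exact == would treat them as different, while np.isclose treats them as equal, so that row is dropped along with the first.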