Home > Software engineering >  Strange `ErrorType` error referencing columns
Strange `ErrorType` error referencing columns

Time:03-14

Given the following pandas Dataframe df -

Reporting Group Entity/Grouping Entity ID Adjusted Value (Today, No Div, USD) Adjusted TWR (Current Quarter, No Div, USD) Adjusted TWR (YTD, No Div, USD) Annualized Adjusted TWR (Since Inception, No Div, USD) Adjusted Value (No Div, USD)
Barrack Family William and Rupert Trust 9957007 -1.44 -1.44
Barrack Family Cash - -1.44 -1.44
Barrack Family Gratia Holdings No. 2 LLC 8413655 55491732.66 -0.971018847 -0.971018847 11.52490309 55491732.66
Barrack Family Investment Grade Fixed Income - 18469768.6 18469768.6
Barrack Family High Yield Fixed Income - 3668982.44 -0.205356545 -0.205356545 4.441190127 3668982.44

I am trying to only keep rows that where Entity/Grouping column values == Cash AND where any of the following column values are NaN -

  • Adjusted TWR (Current Quarter, No Div, USD)
  • Adjusted TWR (YTD, No Div, USD)
  • Annualized Adjusted TWR (Since Inception, No Div, USD)

Current code: the following function takes df and tries to perform the above said and return, ready for use in another function -

def twr_exceptions_logic():
    df = databases_creation()

    df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
    return df

Issue/Error: currently, I am getting a non-descript TypeError, which references the columns I am referencing:

KeyError: ('Adjusted TWR (Current Quarter, No Div, USD)', 'Adjusted TWR (YTD, No Div, USD)', 'Annualized Adjusted TWR (Since Inception, No Div, USD)')

Help: Am I missing something obvious? Do I have an obvious syntax error? Any hints/tips would be warmly received.

CodePudding user response:

You need a double square bracket when you are selecting a list:

df[['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull()

This solves the KeyError. However, using this would give you a new error, as this would return you a 2D matrix to pass to df.loc as a parameter, which is an invalid parameter type. Instead, you could filter them one by one.

def twr_exceptions_logic():
    df = databases_creation()

    df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)'].isnull())]
    df  = df.loc[(df['Adjusted TWR (YTD, No Div, USD)'].isnull())]
    df  = df.loc[(df['Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
    return df

CodePudding user response:

  1. You need one more square bracket.
  2. You can use any method to make a proper condition.

Code:

import pandas as pd

def databases_creation():
    import numpy as np
    return pd.DataFrame({'Reporting Group': {0: 'Barrack Family', 1: 'Barrack Family', 2: 'Barrack Family', 3: 'Barrack Family', 4: 'Barrack Family'}, 'Entity/Grouping': {0: 'William and Rupert Trust', 1: 'Cash', 2: 'Gratia Holdings No. 2 LLC', 3: 'Investment Grade Fixed Income', 4: 'High Yield Fixed Income'}, 'Entity ID': {0: '9957007', 1: '-', 2: '8413655', 3: '-', 4: '-'}, 'Adjusted Value (Today, No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}, 'Adjusted TWR (Current Quarter, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Adjusted TWR (YTD, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Annualized Adjusted TWR (Since Inception, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: 11.52490309, 3: np.NaN, 4: 4.441190127}, 'Adjusted Value (No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}})

def twr_exceptions_logic():
    df = databases_creation()

    # df  = df.loc[(df['Entity/Grouping']!= 'Cash')]
    # df  = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
    #                           'Adjusted TWR (YTD, No Div, USD)',
    #                           'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]

    mask = (df['Entity/Grouping']!= 'Cash') & (df[['Adjusted TWR (Current Quarter, No Div, USD)',
                              'Adjusted TWR (YTD, No Div, USD)',
                              'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull().any(axis=1))
    df  = df[mask]

    return df

df = twr_exceptions_logic()

print(df)

Output:

Reporting Group Entity/Grouping Entity ID Adjusted Value (Today, No Div, USD) Adjusted TWR (Current Quarter, No Div, USD) Adjusted TWR (YTD, No Div, USD) Annualized Adjusted TWR (Since Inception, No Div, USD) Adjusted Value (No Div, USD)
Barrack Family William and Rupert Trust 9957007 -1.44 nan nan nan -1.44
Barrack Family Investment Grade Fixed Income - 1.84698e 07 nan nan nan 1.84698e 07
  • Related