Given the following pandas Dataframe df
-
Reporting Group | Entity/Grouping | Entity ID | Adjusted Value (Today, No Div, USD) | Adjusted TWR (Current Quarter, No Div, USD) | Adjusted TWR (YTD, No Div, USD) | Annualized Adjusted TWR (Since Inception, No Div, USD) | Adjusted Value (No Div, USD) |
---|---|---|---|---|---|---|---|
Barrack Family | William and Rupert Trust | 9957007 | -1.44 | -1.44 | |||
Barrack Family | Cash | - | -1.44 | -1.44 | |||
Barrack Family | Gratia Holdings No. 2 LLC | 8413655 | 55491732.66 | -0.971018847 | -0.971018847 | 11.52490309 | 55491732.66 |
Barrack Family | Investment Grade Fixed Income | - | 18469768.6 | 18469768.6 | |||
Barrack Family | High Yield Fixed Income | - | 3668982.44 | -0.205356545 | -0.205356545 | 4.441190127 | 3668982.44 |
I am trying to only keep rows that where Entity/Grouping
column values == Cash
AND where any of the following column values are NaN
-
Adjusted TWR (Current Quarter, No Div, USD)
Adjusted TWR (YTD, No Div, USD)
Annualized Adjusted TWR (Since Inception, No Div, USD)
Current code: the following function takes df
and tries to perform the above said and return, ready for use in another function -
def twr_exceptions_logic():
df = databases_creation()
df = df.loc[(df['Entity/Grouping']!= 'Cash')]
df = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
'Adjusted TWR (YTD, No Div, USD)',
'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
return df
Issue/Error: currently, I am getting a non-descript TypeError
, which references the columns I am referencing:KeyError: ('Adjusted TWR (Current Quarter, No Div, USD)', 'Adjusted TWR (YTD, No Div, USD)', 'Annualized Adjusted TWR (Since Inception, No Div, USD)')
Help: Am I missing something obvious? Do I have an obvious syntax error? Any hints/tips would be warmly received.
CodePudding user response:
You need a double square bracket when you are selecting a list:
df[['Adjusted TWR (Current Quarter, No Div, USD)',
'Adjusted TWR (YTD, No Div, USD)',
'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull()
This solves the KeyError. However, using this would give you a new error, as this would return you a 2D matrix to pass to df.loc as a parameter, which is an invalid parameter type. Instead, you could filter them one by one.
def twr_exceptions_logic():
df = databases_creation()
df = df.loc[(df['Entity/Grouping']!= 'Cash')]
df = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)'].isnull())]
df = df.loc[(df['Adjusted TWR (YTD, No Div, USD)'].isnull())]
df = df.loc[(df['Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
return df
CodePudding user response:
- You need one more square bracket.
- You can use
any
method to make a proper condition.
Code:
import pandas as pd
def databases_creation():
import numpy as np
return pd.DataFrame({'Reporting Group': {0: 'Barrack Family', 1: 'Barrack Family', 2: 'Barrack Family', 3: 'Barrack Family', 4: 'Barrack Family'}, 'Entity/Grouping': {0: 'William and Rupert Trust', 1: 'Cash', 2: 'Gratia Holdings No. 2 LLC', 3: 'Investment Grade Fixed Income', 4: 'High Yield Fixed Income'}, 'Entity ID': {0: '9957007', 1: '-', 2: '8413655', 3: '-', 4: '-'}, 'Adjusted Value (Today, No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}, 'Adjusted TWR (Current Quarter, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Adjusted TWR (YTD, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: -0.971018847, 3: np.NaN, 4: -0.205356545}, 'Annualized Adjusted TWR (Since Inception, No Div, USD)': {0: np.NaN, 1: np.NaN, 2: 11.52490309, 3: np.NaN, 4: 4.441190127}, 'Adjusted Value (No Div, USD)': {0: -1.44, 1: -1.44, 2: 55491732.66, 3: 18469768.6, 4: 3668982.44}})
def twr_exceptions_logic():
df = databases_creation()
# df = df.loc[(df['Entity/Grouping']!= 'Cash')]
# df = df.loc[(df['Adjusted TWR (Current Quarter, No Div, USD)',
# 'Adjusted TWR (YTD, No Div, USD)',
# 'Annualized Adjusted TWR (Since Inception, No Div, USD)'].isnull())]
mask = (df['Entity/Grouping']!= 'Cash') & (df[['Adjusted TWR (Current Quarter, No Div, USD)',
'Adjusted TWR (YTD, No Div, USD)',
'Annualized Adjusted TWR (Since Inception, No Div, USD)']].isnull().any(axis=1))
df = df[mask]
return df
df = twr_exceptions_logic()
print(df)
Output:
Reporting Group | Entity/Grouping | Entity ID | Adjusted Value (Today, No Div, USD) | Adjusted TWR (Current Quarter, No Div, USD) | Adjusted TWR (YTD, No Div, USD) | Annualized Adjusted TWR (Since Inception, No Div, USD) | Adjusted Value (No Div, USD) |
---|---|---|---|---|---|---|---|
Barrack Family | William and Rupert Trust | 9957007 | -1.44 | nan | nan | nan | -1.44 |
Barrack Family | Investment Grade Fixed Income | - | 1.84698e 07 | nan | nan | nan | 1.84698e 07 |