I've written a function that performs some calculations on the dataframes passed to it.
However, due to some formatting issues, one of those dataframes needs additional work. I'd like to write an if statement inside the function that checks whether that specific dataframe was passed in and, if so, does some formatting work first. The issue is that the argument is a DataFrame object, so I can't figure out how to use the variable's name in the if statement.
I have tried a few things, including joining a string to the variable name with , or &. I've also tried converting the df to a string with df.to_string(). And I've tried a few variations on these function outlines:
a_df
b_df

def calc_mean_max(df):
    df_string = "a"
    if df_string in df:
        #do formatting
    else:
        #do regular calculations

def calc_mean_max(df):
    if df == "a_df":
        #do formatting
    else:
        #do regular calculations
Please let me know if I can clarify anything about this problem. I feel like it should have a pretty straightforward solution, but maybe I'm wrong. Thanks in advance for all the help!
Answer:
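As I said in my comment, it's not possible to get the name of the dataframe variable from inside your function, but there is an elegant workaround. You can store a name in the attrs dict of the dataframe (note the warning in the pandas documentation: attrs is marked as experimental).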
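import pandas as pd

def calc_mean_max(df):
    # .get() avoids a KeyError when a dataframe has no 'name' entry in attrs
    if df.attrs.get('name') == "a_df":
        #do formatting
    else:
        #do regular calculations

# tag each dataframe with a name right after creating it
a_df = pd.DataFrame(...)
a_df.attrs['name'] = 'a_df'
b_df = pd.DataFrame(...)
b_df.attrs['name'] = 'b_df'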
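
To make the idea concrete, here is a minimal runnable sketch of the attrs approach. The sample data, the column name x, and the "strip a trailing % sign" formatting step are assumptions made up for illustration, since the question doesn't say what the real formatting work is. It also assumes the calculation is a per-column mean and max, going by the function name.

```python
import pandas as pd


def calc_mean_max(df):
    """Return per-column mean and max, cleaning up the frame tagged 'a_df' first."""
    # .get() returns None instead of raising KeyError for untagged frames.
    if df.attrs.get("name") == "a_df":
        # Hypothetical formatting step: strip a trailing '%' and convert to float.
        df = df.apply(lambda col: col.str.rstrip("%").astype(float))
    # Regular calculations, applied to every frame.
    return df.agg(["mean", "max"])


# Made-up sample data: a_df arrives as strings, b_df is already numeric.
a_df = pd.DataFrame({"x": ["10%", "20%", "30%"]})
a_df.attrs["name"] = "a_df"

b_df = pd.DataFrame({"x": [1.0, 2.0, 3.0]})
b_df.attrs["name"] = "b_df"

a_result = calc_mean_max(a_df)
b_result = calc_mean_max(b_df)
```

Because the lookup uses .get(), a dataframe that was never tagged simply takes the regular path. Since attrs is experimental, it is probably safest to set the name right before calling the function rather than rely on it surviving earlier transformations.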