Pandas pipe throws an error saying df must be passed as an argument
Ideally, pipe should pass the DataFrame as the first argument by default, but that is not happening in my case.
class Summary:
    def get_src_base_df(self):
        <do stuff>
        return df

    @staticmethod
    def sum_agg(df):
        cols = 'FREQUENCY_ID|^FLAG_'
        df = (df.filter(regex=cols).fillna(0)
              .groupby('FREQUENCY_ID').agg(lambda x: x.astype(int).sum()))
        return df

    # few other @static methods

    def get_src_df(self):
        df = self.get_src_base_df().pipe(self.sum_agg())  # pipe chain continues
        # --> error: sum_agg() missing 1 required positional argument: 'df'
        # but the below line works
        # df = self.get_src_base_df().pipe((lambda x: self.sum_agg(x)))  # pipe chain continues
Answer:
By writing self.sum_agg(), you're calling the sum_agg function right there (@staticmethods in Python are pretty much indistinguishable from plain functions), and since that call passes no argument, it rightfully fails before pipe ever runs. You need to pass the function object itself, not the value returned by calling it; pipe then calls it with the DataFrame as the first argument.
Do this instead:
    def get_src_df(self):
        df = self.get_src_base_df().pipe(self.sum_agg)  # note: no parentheses
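
For anyone who wants to try this end to end, here is a minimal runnable sketch of the same class. The sample data in get_src_base_df (including the FLAG_A, FLAG_B and OTHER columns) is made up purely for illustration, since the original "<do stuff>" step isn't shown; the rest follows the question's code.

```python
import pandas as pd


class Summary:
    def get_src_base_df(self):
        # Hypothetical stand-in for the elided "<do stuff>" step.
        return pd.DataFrame({
            "FREQUENCY_ID": [1, 1, 2],
            "FLAG_A": [1.0, None, 1.0],
            "FLAG_B": [0.0, 1.0, 1.0],
            "OTHER": ["x", "y", "z"],
        })

    @staticmethod
    def sum_agg(df):
        # Keep FREQUENCY_ID and the FLAG_ columns, then sum the flags per group.
        cols = "FREQUENCY_ID|^FLAG_"
        return (df.filter(regex=cols).fillna(0)
                  .groupby("FREQUENCY_ID").agg(lambda x: x.astype(int).sum()))

    def get_src_df(self):
        # Pass the function object; pipe calls it as sum_agg(df).
        return self.get_src_base_df().pipe(self.sum_agg)


result = Summary().get_src_df()

# The original error comes from the call itself, independent of pipe:
try:
    Summary.sum_agg()  # no argument supplied
except TypeError as exc:
    error_message = str(exc)
```

If sum_agg needed extra parameters, pipe would forward them as well: df.pipe(func, a, b=1) calls func(df, a, b=1).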