I am transforming some data and found that I needed to repeat the same process across several dataframes, so I thought wrapping it in a function would be a good idea.
I started doing this:
import numpy as np
import pandas as pd

count_words = lambda x: len(x)
index = 'word_count'
values = ['Search term', 'Clicks', 'Impr.']
def table_transformation(dataframe, index, values):
    dataframe_to_pivot = pd.pivot_table(data=dataframe,
                                        index=index,
                                        values=values,
                                        aggfunc={values[0]: count_words,
                                                 values[1]: np.sum,
                                                 values[2]: np.sum}
                                        )
    dataframe_to_pivot.sort_values(by=[index], ascending=True)
    sum_counts = dataframe_to_pivot.iloc[9:].sum()
    dataframe_to_pivot.drop(dataframe_to_pivot.index[9:].tolist())
    dataframe_to_pivot.loc[' 10'] = sum_counts
    return dataframe_to_pivot
fy21_word_counts = fy21.apply(table_transformation, args=(index, values))
fy21_word_counts
I got a KeyError: 'Search term' error.
What I tried:
- I tried hard-coding the actual column names inside the function, but got the same error.
- The logic inside the function works when I run it outside the function's def() structure.
What did I overlook or misunderstand?
Thank you for your time.
CodePudding user response:
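Use DataFrame.pipe, because the function needs to receive the whole DataFrame. With .apply, the function is instead called once for each column, so it is given a single Series that has no 'Search term' column to look up: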
fy21_word_counts = fy21.pipe(table_transformation, index, values)
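To see the difference for yourself, here is a minimal sketch. The small `df` below is a made-up stand-in for `fy21` (the real data is not shown in the question), and `kind_of_input` and `column_total` are hypothetical helpers used only for illustration.

```python
import pandas as pd

# Made-up stand-in for fy21; the column names follow the question.
df = pd.DataFrame({
    "word_count": [1, 2, 2],
    "Clicks": [10, 5, 7],
    "Impr.": [100, 50, 70],
})


def kind_of_input(obj):
    """Report the type of object pandas passes to the function."""
    return type(obj).__name__


def column_total(frame, column):
    """Sum one column; needs the whole DataFrame to look the column up."""
    return frame[column].sum()


# .apply calls the function once per column, each time with a Series.
via_apply = df.apply(kind_of_input)

# .pipe calls the function once, with the whole DataFrame.
via_pipe = df.pipe(kind_of_input)

# Extra positional arguments are forwarded after the DataFrame,
# the same way index and values are forwarded in the line above.
clicks_total = df.pipe(column_total, "Clicks")

print(via_apply.tolist())
print(via_pipe)
print(clicks_total)
```

Since `df.pipe(func, a, b)` is equivalent to `func(df, a, b)`, calling `table_transformation(fy21, index, values)` directly should behave the same way; `.pipe` is mostly useful when chaining several steps.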