I have a Pandas DataFrame with the following fields:
week
zipcode
cost
I would like to create a new DataFrame from the above DataFrame ("week", "zipcode", and "min_cost") by performing the following SQL:
SELECT week, zipcode, MIN(cost) min_cost
FROM the_first_dataframe
GROUP BY 1, 2
My constraint is that I can't use pandasql and hope to achieve just by pure pandas operations.
Any insight will be appreciated.
CodePudding user response:
Try groupby
named aggregation:
out = the_first_dataframe.groupby(['week','zipcode'], as_index=False).agg(min_cost=('cost','min'))
CodePudding user response:
Many ways to do it
option1 :Use groupby agg and add to frame
the_first_dataframe.groupby(['week','zipcode'])['cost'].agg('min').to_frame('min_cost')
option 2 :groupy, cost will be carried over. Rename it
the_first_dataframe.groupby(['week','zipcode']).min().rename(columns={'cost':'min_cost'})