I have a pandas DataFrame,
account_id, campaign_objective, campaign_spend, conversions
__________, __________________, ______________, ___________
1, sales, 100, 25
1, brand, 50, 25
2, sales, 80, 12
2, brand, 60, 12
What I would like to do is create one column for each unique campaign_objective and assign it the corresponding spend value.
account_id, sales, brand, conversions
__________, _____, _____, ___________
1, 100, 50, 25
2, 80, 60, 12
My approach has been using for loops and dictionaries. This is suboptimal as I have 20 million rows in my dataframe and there are 100 campaign_objectives; in other words, my for loop will need to iterate through 2 billion values.
new_df = {'account_id':[], 'conversions':[]}
for obj in obj_goal_list:
new_df.update({obj:[]})
for acct in df['account_id'].unique():
acct_df = df[df['account_id']==acct]
new_df['account_id'].append(acct)
new_df['conversions'].append(acct_df['conversions'])
for obj in obj_goal_list:
if obj in acct_df['objective_and_goal']:
spend = acct_df[acct_df['objective_and_goal']==obj]['spend']
new_df[obj].append(spend)
else:
new_df[obj].append(0.0)
new_df = pd.DataFrame(new_df)
I'm curious if there's a more "Pandas" way to accomplish this by pivots or other means?
CodePudding user response:
here is one way to do it
df.pivot(index=['account_id','conversions'], columns='campaign_objective', values='campaign_spend')
campaign_objective brand sales
account_id conversions
1 25 50 100
2 12 60 80
with reset_index
df.pivot(index=['account_id','conversions'], columns='campaign_objective', values='campaign_spend').reset_index()
campaign_objective account_id conversions brand sales
0 1 25 50 100
1 2 12 60 80
CodePudding user response:
Pivot method may help you.
df.pivot(index = ['account_id', 'conversions'], columns = 'campaign_objective', values = 'campaign_spend').reset_index()