I have two pandas DataFrames: one with customers and their sales, and one with customers and their sales targets. For each customer, I want to sum the sales values, but only for as many sales as the "Sales_Target" in the sales targets DataFrame allows.
Sales Table
Index | Cust_ID | Date | Value |
---|---|---|---|
0 | A11 | 02.01.2021 | 100 |
1 | A11 | 03.01.2021 | 100 |
2 | A11 | 04.01.2021 | 100 |
3 | A11 | 05.01.2021 | 100 |
4 | B22 | 05.01.2021 | 100 |
5 | B22 | 06.01.2021 | 100 |
6 | B22 | 07.01.2021 | 100 |
7 | C33 | 08.01.2021 | 100 |
8 | C33 | 09.01.2021 | 100 |
Sales Targets
Index | Cust_ID | Sales_Target |
---|---|---|
0 | A11 | 4 |
1 | B22 | 2 |
2 | C33 | 4 |
Customer A11 has a "Sales_Target" of 4 and bought 4 times, therefore a value of 400.
Customer B22 has a "Sales_Target" of 2 and bought 3 times, therefore only a value of 200.
Customer C33 has a "Sales_Target" of 4 and bought 2 times, therefore only a value of 200.
Expected result
Index | Cust_ID | Sales_Target | Sales | Sales_Value |
---|---|---|---|---|
0 | A11 | 4 | 4 | 400 |
1 | B22 | 2 | 3 | 200 |
2 | C33 | 4 | 2 | 200 |
Sorry, I have no idea how to solve this problem.
Thank you for your help.
Cheers Marcus
CodePudding user response:
What you actually want isn't a direct merge. First, transform your first table into sums grouped by customer ID:
# Named aggregation takes (column, function) tuples, separated by a comma.
sales_agg = (
    sales.groupby('Cust_ID')
    .agg(sales_value=('Value', 'sum'), sales_count=('Value', 'count'))
    .reset_index()
)
Then merge your targets against this new table to introduce the new columns:
table3 = sales_targets.merge(sales_agg, on='Cust_ID', how='left', validate='1:1')
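One thing to watch: a plain sum over all sales gives B22 a value of 300, not the 200 from the question, because nothing limits the sum to the target. Below is a minimal sketch of one way to apply that cap. It assumes the first Sales_Target sales in row order are the ones that count (the question does not say which), and the names target_per_row, position, capped_value and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
sales = pd.DataFrame({
    "Cust_ID": ["A11"] * 4 + ["B22"] * 3 + ["C33"] * 2,
    "Date": ["02.01.2021", "03.01.2021", "04.01.2021", "05.01.2021",
             "05.01.2021", "06.01.2021", "07.01.2021",
             "08.01.2021", "09.01.2021"],
    "Value": [100] * 9,
})
sales_targets = pd.DataFrame({
    "Cust_ID": ["A11", "B22", "C33"],
    "Sales_Target": [4, 2, 4],
})

# Look up the target that applies to each sale row.
target_per_row = sales["Cust_ID"].map(
    sales_targets.set_index("Cust_ID")["Sales_Target"]
)
# 0-based position of each sale within its customer, in row order.
position = sales.groupby("Cust_ID").cumcount()

# Number of sales per customer, not capped.
sales_count = sales.groupby("Cust_ID", as_index=False).agg(Sales=("Value", "count"))

# Sum only the sales that fall within the target (assumption: first N rows).
capped_value = (
    sales[position < target_per_row]
    .groupby("Cust_ID", as_index=False)
    .agg(Sales_Value=("Value", "sum"))
)

result = (
    sales_targets
    .merge(sales_count, on="Cust_ID", how="left")
    .merge(capped_value, on="Cust_ID", how="left")
)
print(result)
```

If the sales should be counted in date order instead, sorting sales by a parsed Date column before calling cumcount would be the place to do it.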
CodePudding user response:
You need to merge the tables and then group by the relevant columns.
With the current data you can just group by customer:
import pandas as pd

# sales_df and target_df stand for the sales and sales-target DataFrames from the question.
merge_df = pd.merge(sales_df, target_df, how='left', on=['Cust_ID'])
# Aggregate per customer; named aggregation takes (column, function) tuples.
merge_df = merge_df[['Cust_ID', 'Sales_Target', 'Value']].groupby('Cust_ID').agg(sales_value=('Value', 'sum'), sales_count=('Value', 'count'))
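The merge-then-group approach can also respect the target if the sales beyond the target are zeroed out before summing. This is a rough sketch on the question's sample data, again assuming that row order decides which sales count; the helper columns Rank and Capped_Value are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
sales_df = pd.DataFrame({
    "Cust_ID": ["A11"] * 4 + ["B22"] * 3 + ["C33"] * 2,
    "Value": [100] * 9,
})
target_df = pd.DataFrame({
    "Cust_ID": ["A11", "B22", "C33"],
    "Sales_Target": [4, 2, 4],
})

# Attach each customer's target to every one of their sales.
merge_df = pd.merge(sales_df, target_df, how="left", on="Cust_ID")

# 0-based position of each sale within its customer, in row order.
merge_df["Rank"] = merge_df.groupby("Cust_ID").cumcount()

# Keep the value for sales within the target, use 0 for the rest.
merge_df["Capped_Value"] = merge_df["Value"].where(
    merge_df["Rank"] < merge_df["Sales_Target"], 0
)

# One row per customer: target, total number of sales, capped value.
result = merge_df.groupby("Cust_ID", as_index=False).agg(
    Sales_Target=("Sales_Target", "first"),
    Sales=("Value", "count"),
    Sales_Value=("Capped_Value", "sum"),
)
print(result)
```

Because the merge starts from the sales table, a customer who has a target but no sales at all would not appear in this result.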