How do you put the rows with the same order_id such that all their corresponding rows add up to form the resulting Dataframe? (in this case quantity & item price should be added with the corresponding order_id before it, and the choice_description & item_name should be added in their "str" format as well)
Reproducible input:
d = {'order_id': [1, 1, 1, 1, 2], 'quantity': [1, 1, 1, 1, 2], 'item_name': ['Chips and Fresh Tomato Salsa', 'Izze', 'Nantucket Nectar', 'Chips and Tomatillo-Green Chili Salsa', 'Chicken Bowl'], 'choice_description': [nan, '[Clementine]', '[Apple]', nan, '[Tomatillo-Red Chili Salsa (Hot), [Black Beans, Rice, Cheese, Sour Cream]]'], 'item_price': ['$2.39 ', '$3.39 ', '$3.39 ', '$2.39 ', '$16.98 ']}
df = pd.DataFrame(d)
CodePudding user response:
You can use:
out = (df
.assign(price=pd.to_numeric(df['item_price'].str.strip('$'), errors='coerce')
.mul(df['quantity']),
choice_description=df['choice_description'].astype(str),
)
.groupby('order_id')
.agg({'item_name': ','.join,
'choice_description': ','.join,
'price': 'sum',
})
.assign(price=lambda d: '$' d['price'].round(2).astype(str))
)
Output:
item_name choice_description price
order_id
1 Chips and Fresh Tomato Salsa,Izze,Nantucket Nectar,Chips and Tomatillo-Green Chili Salsa nan,[Clementine],[Apple],nan $11.56
2 Chicken Bowl [Tomatillo-Red Chili Salsa (Hot), [Black Beans, Rice, Cheese, Sour Cream]] $33.96