I'm new to Pandas. I have a data frame that looks something like this.
Name | Storage Location | Total Quantity |
---|---|---|
a | S1 | 100 |
a | S2 | 200 |
a | S3 | 300 |
a | S4 | 110 |
a | S5 | 200 |
b | S1 | 200 |
b | S2 | 300 |
b | S4 | 400 |
b | S5 | 150 |
c | S1 | 400 |
c | S5 | 500 |
I wanna sum the "Total Quantity" group by the Name and also specific storage location which are only "S1,S2,S3".
Name | Total Quantity |
---|---|
a | 600 |
b | 500 |
c | 400 |
My desired output would be something like the above. Kindly appreciate for you guys help. Thank you in advance!
CodePudding user response:
You could use where
to replace the unwanted Locations with NaN and use groupby
sum
(since sum
skips NaN by default):
out = df.where(df['Storage Location'].isin(['S1','S2','S3'])).groupby('Name', as_index=False)['Total Quantity'].sum()
Output:
Name Total Quantity
0 a 600.0
1 b 500.0
2 c 400.0
CodePudding user response:
Use:
In [2378]: out = df[df['Storage Location'].isin(['S1', 'S2', 'S3'])].groupby('Name')['Total Quantity'].sum().reset_index()
In [2379]: out
Out[2379]:
Name Total Quantity
0 a 600
1 b 500
2 c 400