I have a dataframe with the following sample data
Product quantity sold
a 30
at 20
am 10
b 5
bn 7
bt 90
c 76
c1 67
ct 54
m 12
t 87
n 12
I want to group the products that start with a into a new product name Art
, those that start with b
under name Brt
and those that start with c
into Crt
and leave products m
, t
and n
in the same dataframe into something below:
Product quantity sold
Art 60
Brt 102
Crt 197
m 12
t 87
n 12
CodePudding user response:
Since you have complex conditions, might be easy enough to just rename the ones you want.
import pandas as pd
df = pd.DataFrame({'Product': ['a', 'at', 'am', 'b', 'bn', 'bt', 'c', 'c1', 'ct', 'm', 't', 'n'],
'quantity sold ': [30, 20, 10, 5, 7, 90, 76, 67, 54, 12, 87, 12]})
df.loc[df['Product'].str.startswith('a'), 'Product'] = 'Art'
df.loc[df['Product'].str.startswith('b'), 'Product'] = 'Brt'
df.loc[df['Product'].str.startswith('c'), 'Product'] = 'Crt'
df.groupby('Product', as_index=False).sum()
Output
Product quantity sold
0 Art 60
1 Brt 102
2 Crt 197
3 m 12
4 n 12
5 t 87
CodePudding user response:
You can do it using str.map and dictionary:
grp = df['Product'].str[0].map({'a':'Art', 'b':'BRT', 'c':'CRT'}).fillna(df['Product'])
df.groupby(grp)['quantity sold'].sum()
Output:
Product
Art 60
BRT 102
CRT 197
m 12
n 12
t 87
Name: quantity sold, dtype: int64
Here, we are using a shortcut for .str.get, str[0] indexing the first character from the string, then using map to create desired groups, and those values not in map are filled with the original values from df['Product']. Lastly, we groupby the newly created group.