Product and Sub-Product Category
Product - Fruits Sub-Product - Apple Orange
Product - Vegetables Sub-Product - Tomato Onion
I am finding a challenge to rearrange the rows of the python dataframe based on product and sub-product by Code. If a Code contains both products Fruits and Vegetable, it should be rearranged to Product and Sub-Product level for example code : 3E45212, Please help!
Df1:
Code Product Limit Value
3A68185 Fruits 0.6 0
3A68185 Apple 0.6 0
3B22979 Apple 3.5 0.430145588
3B22979 Fruits 3.5 0.430145588
3B22979 Orange 0 0
3C67260 Apple 3 1.123774052
3C67260 Fruits 3 1.123774052
3C71601 Vegetables 15 0
3C71601 Tomato 15 0
3E45212 Apple 5 0
3E45212 Fruits 5 0
3E45212 Tomato 35.5 0
3E45212 Onion 0.5 0
3E45212 Vegetables 36 0
3C78910 Fruits 2 1.187282182
3C78910 Apple 2 1.187282182
3C82861 Fruits 64 0.560863589
3C82861 Apple 15 0
3C82861 Orange 49 0.560863589
3D11357 Tomato 25.5 0
3D11357 Onion 0.5 0
3D11357 Vegetables 26 0
3D51126 Onion 0.5 0
3D51126 Vegetables 15 0
3D51126 Tomato 14.5 0
3E20062 Onion 1 0
3E20062 Vegetables 1 0
Expected Output:
Code Product Limit Value
3A68185 Fruits 0.6 0
3A68185 Apple 0.6 0
3B22979 Fruits 3.5 0.430145588
3B22979 Apple 3.5 0.430145588
3B22979 Orange 0 0
3C67260 Fruits 3 1.123774052
3C67260 Apple 3 1.123774052
3C71601 Vegetables 15 0
3C71601 Tomato 15 0
3E45212 Fruits 5 0
3E45212 Apple 5 0
3E45212 Vegetables 36 0
3E45212 Tomato 35.5 0
3E45212 Onion 0.5 0
3C78910 Fruits 2 1.187282182
3C78910 Apple 2 1.187282182
3C82861 Fruits 64 0.560863589
3C82861 Apple 15 0
3C82861 Orange 49 0.560863589
3D11357 Vegetables 26 0
3D11357 Tomato 25.5 0
3D11357 Onion 0.5 0
3D51126 Vegetables 15 0
3D51126 Onion 0.5 0
3D51126 Tomato 14.5 0
3E20062 Vegetables 1 0
3E20062 Onion 1 0
Code :
df1.sort_values(by=['Code'],inplace=True)
CodePudding user response:
You may turn Product
to an ordered categorical dtype, where you set categories in the needed order:
df1['Product'] = pd.Categorical(df1['Product'], ['Fruits', 'Apple', 'Orange', 'Vegetables', 'Tomato', 'Onion'])
And then group by Code
first and then by Product
, use sum
as aggregate:
df1_grouped = df1.groupby(['Code', 'Product']).agg('sum')
Then you would need to drop rows with empty categories:
df1_grouped[df1_grouped.Limit != 0.0]
Output:
Limit Value
Code Product
3A68185 Fruits 0.6 0.000000
Apple 0.6 0.000000
3B22979 Fruits 3.5 0.430146
Apple 3.5 0.430146
3C67260 Fruits 3.0 1.123774
Apple 3.0 1.123774
3C71601 Vegetables 15.0 0.000000
Tomato 15.0 0.000000
3C78910 Fruits 2.0 1.187282
Apple 2.0 1.187282
3C82861 Fruits 64.0 0.560864
Apple 15.0 0.000000
Orange 49.0 0.560864
3D11357 Vegetables 26.0 0.000000
Tomato 25.5 0.000000
Onion 0.5 0.000000
3D51126 Vegetables 15.0 0.000000
Tomato 14.5 0.000000
Onion 0.5 0.000000
3E20062 Vegetables 1.0 0.000000
Onion 1.0 0.000000
3E45212 Fruits 5.0 0.000000
Apple 5.0 0.000000
Vegetables 36.0 0.000000
Tomato 35.5 0.000000
Onion 0.5 0.000000
CodePudding user response:
create a dictionary for sort order and then sort the dataframe
# create a sort sequence, Fruits are grouped first and then vegetables,
# fruits within fruit category are sequenced, and similar for vegetable
d={'Fruits':0, 'Apple':1, 'Orange': 2,
'Vegetables':10, 'Tomato':11, 'Onion': 12}
# create a temp sort order column and sort according to dictionary
# finally drop the temp sortid column
(df.assign(sortid=df['Product'].map(d))
.sort_values(['Code','sortid'])
.drop(columns='sortid'))
Code Product Limit Value
0 3A68185 Fruits 0.6 0.000000
1 3A68185 Apple 0.6 0.000000
3 3B22979 Fruits 3.5 0.430146
2 3B22979 Apple 3.5 0.430146
4 3B22979 Orange 0.0 0.000000
6 3C67260 Fruits 3.0 1.123774
5 3C67260 Apple 3.0 1.123774
7 3C71601 Vegetables 15.0 0.000000
8 3C71601 Tomato 15.0 0.000000
14 3C78910 Fruits 2.0 1.187282
15 3C78910 Apple 2.0 1.187282
16 3C82861 Fruits 64.0 0.560864
17 3C82861 Apple 15.0 0.000000
18 3C82861 Orange 49.0 0.560864
21 3D11357 Vegetables 26.0 0.000000
19 3D11357 Tomato 25.5 0.000000
20 3D11357 Onion 0.5 0.000000
23 3D51126 Vegetables 15.0 0.000000
24 3D51126 Tomato 14.5 0.000000
22 3D51126 Onion 0.5 0.000000
26 3E20062 Vegetables 1.0 0.000000
25 3E20062 Onion 1.0 0.000000
10 3E45212 Fruits 5.0 0.000000
9 3E45212 Apple 5.0 0.000000
13 3E45212 Vegetables 36.0 0.000000
11 3E45212 Tomato 35.5 0.000000
12 3E45212 Onion 0.5 0.000000