I've two multi-indexed dataframes with data as shown below:
df1:
qty
plot name sub plot name crop date
BL1 Cabbage-041221 Cabbage 2022-03-01 36.000000
2022-03-02 29.900000
2022-03-03 51.799999
2022-03-16 48.000000
Cluster Beans-110222 Cluster Beans 2022-04-15 5.500000
2022-04-19 2.500000
Musk Melon-050222 Musk Melon 2022-04-05 21.100000
Tomato-050222 Tomato 2022-04-14 39.000000
BL2 Green Tomato-180122 Green Tomato 2022-03-15 7.500000
2022-03-16 5.100000
2022-03-17 3.700000
Ladies Finger-191121 Ladies Finger 2022-03-02 12.600000
2022-03-03 9.200000
2022-03-04 7.200000
2022-03-05 7.600000
Totals 35199.440001
and df2:
qty
plot name sub plot name crop
BL1 Cabbage-041221 Cabbage 606.900003
Cluster Beans-110222 Cluster Beans 8.000000
Musk Melon-050222 Musk Melon 21.100000
Tomato-050222 Tomato 39.000000
BL2 Green Tomato-180122 Green Tomato 27.200000
Ladies Finger-191121 Ladies Finger 93.100000
Ladies Finger-290122 Ladies Finger 139.650002
What I need is to combine the two dataframes so I can add a row to df1 for each crop category with the value in qty in df2 as follows:
qty
plot name sub plot name crop date
BL1 Cabbage-041221 Cabbage 2022-03-01 36.000000
2022-03-02 29.900000
2022-03-03 51.799999
2022-03-16 48.000000
Subtotal 606.900003
Cluster Beans-110222 Cluster Beans 2022-04-15 5.500000
2022-04-19 2.500000
Subtotal 8.000000
Musk Melon-050222 Musk Melon 2022-04-05 21.100000
Subtotal 21.100000
Tomato-050222 Tomato 2022-04-14 39.000000
Subtotal 39.000000
BL2 Green Tomato-180122 Green Tomato 2022-03-15 7.500000
2022-03-16 5.100000
2022-03-17 3.700000
Subtotal 27.200000
Ladies Finger-191121 Ladies Finger 2022-03-02 12.600000
2022-03-03 9.200000
2022-03-04 7.200000
2022-03-05 7.600000
Subtotal 93.100000
Totals 35199.440001
How do I go about doing this without losing the multi-index? df2 was obtained by applying a groupby on df1 as this:
df2 = (df1.drop('Totals', level=0)
.groupby(level=list(range(df1.index.nlevels-1))).sum())
CodePudding user response:
Create a new column date
in df2
with the value Subtotal
then concatenate both dataframes and sort index:
out = (pd.concat([df1, df2.assign(date='Subtotal').set_index('date', append=True)])
.sort_index())
Output:
>>> out
qty
plot name sub plot name crop date
BL1 Cabbage-041221 Cabbage 2022-03-01 36.000000
2022-03-02 29.900000
2022-03-03 51.799999
2022-03-16 48.000000
Subtotal 606.900003
Cluster Beans-110222 Cluster Beans 2022-04-15 5.500000
2022-04-19 2.500000
Subtotal 8.000000
Musk Melon-050222 Musk Melon 2022-04-05 21.100000
Subtotal 21.100000
Tomato-050222 Tomato 2022-04-14 39.000000
Subtotal 39.000000
BL2 Green Tomato-180122 Green Tomato 2022-03-15 7.500000
2022-03-16 5.100000
2022-03-17 3.700000
Subtotal 27.200000
Ladies Finger-191121 Ladies Finger 2022-03-02 12.600000
2022-03-03 9.200000
2022-03-04 7.200000
2022-03-05 7.600000
Subtotal 93.100000
Ladies Finger-290122 Ladies Finger Subtotal 139.650002
Totals NaN NaN NaN 35199.440001