Home > Mobile >  Python: Reorder dataframe rows w.r.t. column conditioning
Python: Reorder dataframe rows w.r.t. column conditioning

Time:01-14

Have got an input dataframe df like below which holds data with multiple category-main_group-sub_group. Here as a sample input have provided for single category and single main_group. So the thing is to reorder rows among sub_group value for each Category-main_group combination[For eg., Here we got two subgroups KIWI FRUIT and MANDARIN for Fruit-CITRUS combination]. On considering Type column, each sub_group and type combination need to be placed alternatively as shown in expected output.

Input Dataframe df:

Category    main_group      sub_group       Item            Type        item_order      row
Fruit       CITRUS          KIWI FRUIT      GreenKiwi       Loose           1           row_1
Fruit       CITRUS          KIWI FRUIT      GoldKiwi        Loose           2           row_1
Fruit       CITRUS          KIWI FRUIT      OtherKiwi       Loose           3           row_1
Fruit       CITRUS          KIWI FRUIT      PP GreenKiwi    PP              4           row_1
Fruit       CITRUS          KIWI FRUIT      PP GoldKiwi     PP              5           row_1
Fruit       CITRUS          KIWI FRUIT      PP OtherKiwi    PP              6           row_1
Fruit       CITRUS          MANDARIN        MandarinAfourer Loose           7           row_1
Fruit       CITRUS          MANDARIN        MandarinTangold Loose           8           row_1
Fruit       CITRUS          MANDARIN        PP Mandarin     PP              9           row_1
Fruit       MANGOES         NECTARINES      NectaYellow     Loose           10          row_1
Fruit       MANGOES         NECTARINES      NectaWhite      Loose           11          row_1
Fruit       MANGOES         PEACHES         PeachYellow     Loose           12          row_1
Fruit       MANGOES         PEACHES         PeachWhite      Loose           13          row_1
Vegg        TOM/CAP         TOMATO          Tomato Truss    Loose           14          row_2
Vegg        TOM/CAP         TOMATO          Tomato Roma     Loose           15          row_2
Vegg        TOM/CAP         CUCUMBER        Capsicum Mini   Loose           16          row_2
Vegg        TOM/CAP         CUCUMBER        Capsicum Red    Loose           17          row_2
Vegg        ONION           ONION           Onion Red       Loose           18          row_2
Vegg        ONION           ONION           PP Onion        PP              19          row_2

Expected Output:

Category    main_group      sub_group       Item            Type        item_order      old_item_order      row
Fruit       CITRUS          KIWI FRUIT      GreenKiwi       Loose           1               1               row_1
Fruit       CITRUS          MANDARIN        MandarinAfourer Loose           2               7               row_1
Fruit       CITRUS          KIWI FRUIT      GoldKiwi        Loose           3               2               row_1
Fruit       CITRUS          MANDARIN        MandarinTangold Loose           4               8               row_1
Fruit       CITRUS          KIWI FRUIT      OtherKiwi       Loose           5               3               row_1
Fruit       CITRUS          KIWI FRUIT      PP GreenKiwi    PP              6               4               row_1
Fruit       CITRUS          MANDARIN        PP Mandarin     PP              7               9               row_1
Fruit       CITRUS          KIWI FRUIT      PP GoldKiwi     PP              8               5               row_1
Fruit       CITRUS          KIWI FRUIT      PP OtherKiwi    PP              9               6               row_1
Fruit       MANGOES         NECTARINES      NectaYellow     Loose           10              10              row_1
Fruit       MANGOES         PEACHES         PeachYellow     Loose           11              12              row_1
Fruit       MANGOES         NECTARINES      NectaWhite      Loose           12              11              row_1
Fruit       MANGOES         PEACHES         PeachWhite      Loose           13              13              row_1
Vegg        TOM/CAP         TOMATO          Tomato Truss    Loose           14              14              row_2
Vegg        TOM/CAP         CUCUMBER        Capsicum Mini   Loose           15              16              row_2
Vegg        TOM/CAP         TOMATO          Tomato Roma     Loose           16              15              row_2
Vegg        TOM/CAP         CUCUMBER        Capsicum Red    Loose           17              17              row_2
Vegg        ONION           ONION           Onion Red       Loose           18              18              row_2
Vegg        ONION           ONION           PP Onion        PP              19              19              row_2

Additional Condition added on edit: this sorting reorder process need to be done only if df['main_group'] == ['CITRUS','MANGOES','TOM/CAP'], other groups should not be reordered and keep the order as it is(like here 'ONION' need to be excluded).

Kindly help me through this. Thanks in Advance!

CodePudding user response:

You can use groupy.cumcount to compute a sub-order of the sub_group. Then sort by type and this order.

Example using the efficient np.lexsort:

out = df.iloc[np.lexsort([df.groupby(['Type', 'sub_group']).cumcount(), df['Type']])]

With pandas' sort_values:

(df.assign(sub_order=df.groupby(['Type', 'sub_group']).cumcount())
   .sort_values(by=['Type', 'sub_order'])
)

Output:

  Category main_group   sub_group             Item   Type  item_order    row
0    Fruit     CITRUS  KIWI FRUIT        GreenKiwi  Loose           1  row_1
6    Fruit     CITRUS    MANDARIN  MandarinAfourer  Loose           7  row_1
1    Fruit     CITRUS  KIWI FRUIT         GoldKiwi  Loose           2  row_1
7    Fruit     CITRUS    MANDARIN  MandarinTangold  Loose           8  row_1
2    Fruit     CITRUS  KIWI FRUIT        OtherKiwi  Loose           3  row_1
3    Fruit     CITRUS  KIWI FRUIT     PP GreenKiwi     PP           4  row_1
8    Fruit     CITRUS    MANDARIN      PP Mandarin     PP           9  row_1
4    Fruit     CITRUS  KIWI FRUIT      PP GoldKiwi     PP           5  row_1
5    Fruit     CITRUS  KIWI FRUIT     PP OtherKiwi     PP           6  row_1

update with more columns:

out = df.iloc[np.lexsort([df.groupby(['Type', 'sub_group']).cumcount(),
                          df['Type'], df['main_group'], df['Category']])]

Or:

(df.assign(sub_order=df.groupby(['Type', 'sub_group']).cumcount())
   .sort_values(by=['Category', 'main_group', 'Type', 'sub_order'])
)
  • Related