Reordering a DF by category in a preset order

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(15, 3)), columns=list('NMO'))
df['Category1'] = ['I','I','I','I','I','G','G','G','G','G','P','P','I','I','P']
df['Category2'] = ['W','W','C','C','C','W','W','W','W','W','O','O','O','O','O']

Suppose this df is much larger, with many more categories. How can I sort the rows, keeping each row intact, by a predetermined order of category labels? For example, sorting the df only by 'Category1' so that all the P's come first, then the I's, then the G's.

CodePudding user response：

# Reverse alphabetical order; this matches P, I, G only because those labels happen to sort that way
df.sort_values('Category1', ascending=False)

CodePudding user response：

You can use an ordered categorical type:

cat_type = pd.CategoricalDtype(categories=["P", "I", "G"], ordered=True)
df['Category1'] = df['Category1'].astype(cat_type)

print(df.sort_values(by='Category1'))

Prints:

     N   M   O Category1 Category2
10  49  37  44         P         O
11  72  64  66         P         O
14  39  98  32         P         O
0   93  12  89         I         W
1   20  74  21         I         W
2   25  22  24         I         C
3   47  11  33         I         C
4   60  16  34         I         C
12   0  90   6         I         O
13  13  35  80         I         O
5   84  64  67         G         W
6   70  47  83         G         W
7   61  57  76         G         W
8   19   8   3         G         W
9    7   8   5         G         W
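
If you would rather not change the column's dtype, another option is to pass a key function to sort_values (available in pandas 1.1 and later) that maps each label to its position in the preset order. The following is only a sketch under assumptions: the names preset_order and rank are made up for illustration, the random values are seeded and will not match the table above, and any label missing from the mapping becomes NaN and sorts last.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (seeded so runs are repeatable).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(15, 3)), columns=list("NMO"))
df["Category1"] = list("IIIIIGGGGGPPIIP")
df["Category2"] = list("WWCCCWWWWWOOOOO")

# Hypothetical preset order; list every label that can occur in the column.
preset_order = ["P", "I", "G"]
rank = {label: position for position, label in enumerate(preset_order)}

# Map each label to its rank only while sorting; the column itself is unchanged.
# kind="stable" keeps rows with the same label in their original relative order.
sorted_df = df.sort_values(
    by="Category1",
    key=lambda col: col.map(rank),
    kind="stable",
)

print(sorted_df["Category1"].tolist())
```

The categorical approach is probably the better fit when the same order is reused in several places (groupby, plotting, comparisons), since the order then travels with the column. The key-based version is handy for a one-off sort.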