I'm using Python to organize an imported CSV file. The dataset I have looks like this:
Name Style ID
0 heels High end 1
1 sneaker Middle 0
2 top High end 3
3 skirt Low end 6
4 dress High end 4
5 sweater Low end 9
6 hat N/A. 2
..
I am trying to sort the dataset so that the High end, Middle and Low end rows come first, in that order, and all other styles follow:
Name Style ID
0 heels High end 1
1 top High end 3
2 dress High end 4
3 sneaker Middle 0
4 skirt Low end 6
5 sweater Low end 9
6 hat N/A. 2
...
I tried this code:
sort_order = {'High End':0,
              'Middle':1, 'Low end':2,}
Clothing_Df['Style'].apply(lambda x: sort_order[x])
I get an error:
---> 3 Clothing_Df['Style'].apply(lambda x: sort_order[x])
TypeError: list indices must be integers or slices, not str
I've also tried:
sortlist = ['High End':0,
            'Middle':1, 'Low end':2,]
sorted(Clothing_Df['Style'], key= sortlist)
This returns the same TypeError.
I am not sure how best to tackle this problem. It is a very large dataset, and I simply need to figure out how to custom sort my data. Any help is appreciated, thank you.
CodePudding user response:
Use pd.Categorical to specify the order.
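import pandas as pd
# Put the three named styles first; every other style follows in its original order.
style_list = df['Style'].unique()
sort_order = sorted(style_list, key=lambda x: (x == 'High end', x == 'Middle', x == 'Low end'), reverse=True)
# An ordered categorical makes sort_values follow sort_order instead of alphabetical order.
df['Style'] = pd.Categorical(df['Style'], categories=sort_order, ordered=True)
df.sort_values('Style', inplace=True)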
Output:
> df
Name Style ID
0 heels High end 1
2 top High end 3
4 dress High end 4
1 sneaker Middle 0
3 skirt Low end 6
5 sweater Low end 9
6 hat N/A. 2
7 jacket Other 10
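If you would rather keep the dictionary idea from the question and leave the Style column as plain strings, something like the following may also work. It is only a sketch under a few assumptions: pandas 1.1 or later (for the key argument of sort_values), a small hand-built frame standing in for the imported CSV, and dictionary keys spelled exactly like the values in the data ('High end', not 'High End'). The name result is just illustrative.

```python
import pandas as pd

# Sample data mirroring the question; the real frame would come from pd.read_csv.
df = pd.DataFrame({
    "Name": ["heels", "sneaker", "top", "skirt", "dress", "sweater", "hat"],
    "Style": ["High end", "Middle", "High end", "Low end", "High end", "Low end", "N/A."],
    "ID": [1, 0, 3, 6, 4, 9, 2],
})

# Keys must match the values in the column exactly, including case.
sort_order = {"High end": 0, "Middle": 1, "Low end": 2}

# Styles missing from sort_order map to NaN; fillna ranks them after the named ones.
result = df.sort_values(
    by="Style",
    key=lambda s: s.map(sort_order).fillna(len(sort_order)),
    kind="mergesort",  # mergesort is stable, so rows keep their order within a style
)
print(result)
```

Using map instead of sort_order[x] avoids a KeyError for styles such as 'N/A.' that are not in the dictionary, and the column itself is left unchanged.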