I'm trying to sort a small df by values in two columns, but I need different sorting orders (ascending/descending) for column A (price) depending on the values in column B (action).
So the df looks like this:
dfindex | price | action |
---|---|---|
4 | 0.9 | 'sell' |
7 | 0.7 | 'buy' |
2 | 0.9 | 'buy' |
3 | 0.4 | 'sell' |
6 | 0.6 | 'sell' |
5 | 0.8 | 'buy' |
1 | 0.7 | 'buy' |
8 | 0.9 | 'buy' |
My current sorting is via pandas:
tx_hist = tx_hist.sort_values(by=['dfindex', 'price'], ascending=[True, False], ignore_index=True)
I've also tried a separate sorting function, which takes the 'buy' prices separately, sorts them and inserts them back in the df, but I can't make it work as intended either.
def sorttxhist(tx_hist):
for i in range(len(tx_hist)):
w = i
buytxs = []
selltx = []
if tx_hist['action'].iloc[w] == 'buy':
while tx_hist['action'].iloc[w] == 'buy':
print(buytxs)
buytxs.append(tx_hist['price'].iloc[w])
w = w 1
buytxs = buytxs.sort(reverse=True)
tx_hist['price'].iloc[i:w] = buytxs
#buytxs.clear()
elif tx_hist['action'].iloc[w] == 'sell':
while tx_hist['action'].iloc[w] == 'sell':
print(selltx)
selltx.append(tx_hist['price'].iloc[w])
w = w 1
selltx = selltx.sort(reverse=False)
tx_hist['price'].iloc[i:w] = selltx
print(type(selltx))
print(tx_hist)
return tx_hist
But it doesn't sort it how I need it, which is a first sort by 'dfindex', and a secondary sort in descending order (price-wise) for 'buy' (in the 'action' col) and in ascending order (price-wise) for 'sell' (in the 'action' col).
So the outcome should look like this:
dfindex | price | action | Sorting order (not in the df) |
---|---|---|---|
2 | 0.9 | 'buy' | desc |
1 | 0.7 | 'buy' | desc |
3 | 0.4 | 'sell' | asc |
4 | 0.9 | 'sell' | asc |
5 | 0.8 | 'buy' | - / desc |
6 | 0.6 | 'sell' | - / asc |
8 | 0.9 | 'buy' | desc |
7 | 0.7 | 'buy' | desc |
Any help is much appreciated! Cheers
CodePudding user response:
make 2 columns named sort1
and sort2
for sorting
sort1
column divide between group buy and group sell
buy group have negative price in sort2
column
then sort by sort1
and sort2
(df
.sort_values('dfindex')
.assign(sort1=df1['action'].ne(df1['action'].shift(1)).cumsum())
.assign(sort2=df1['price'].mask(df1['action']=="buy", -df1['price']))
.sort_values(['sort1', 'sort2'])
)
output before drop sort1
and sort2
dfindex price action sort1 sort2
2 2 0.90 buy 1 -0.90
6 1 0.70 buy 1 -0.70
3 3 0.40 sell 2 0.40
0 4 0.90 sell 2 0.90
5 5 0.80 buy 3 -0.80
4 6 0.60 sell 4 0.60
7 8 0.90 buy 5 -0.90
1 7 0.70 buy 5 -0.70
finally drop sort1
and sort2