I'm new to pandas and I'm trying to sort a dataframe that I used melt on.
Here's what melt looks like:
id date_time sold
0 sample_id1 Monday 0 0
1 sample_id2 Monday 0 0
2 sample_id3 Monday 0 0
3 sample_id4 Monday 0 0
4 sample_id5 Monday 0 0
... ... ... ...
7387 sample_id1 Sunday 23 0
7388 sample_id2 Sunday 23 0
7389 sample_id3 Sunday 23 1
7390 sample_id4 Sunday 23 0
7391 sample_id5 Sunday 23 0
What my desired output would look like:
id date_time sold
0 sample_id1 Monday 0 0
1 sample_id1 Monday 1 0
2 sample_id1 Monday 2 0
3 sample_id1 Monday 3 0
4 sample_id1 Monday 4 0
5 sample_id1 Monday 5 0
6 sample_id1 Monday 6 0
7 sample_id1 Monday 7 0
8 sample_id1 Monday 8 0
9 sample_id1 Monday 9 0
... ... ... ...
7387 sample_id5 Sunday 19 0
7388 sample_id5 Sunday 20 0
7389 sample_id5 Sunday 21 1
7390 sample_id5 Sunday 22 0
7391 sample_id5 Sunday 23 0
I tried using sort_values(['id']), but the output mixes the date_time column - looking like this:
id date_time sold
0 sample_id1 Sunday 23 0
1 sample_id1 Saturday 18 0
2 sample_id1 Thursday 1 0
3 sample_id1 Monday 16 36
4 sample_id1 Saturday 19 6
What function from pandas do I need to use to achieve what I want?
CodePudding user response:
Use parameter key
in DataFrame.sort_values
for convert values by dictionary:
d = {'Monday': '0', 'Tuesday': '1', 'Wednesday': '2',
'Thursday': '3', 'Friday': '4', 'Saturday': '5', 'Sunday': '6'}
def f(x):
try:
return x.replace(d, regex=True).str.split().apply(lambda x: tuple(map(int, x)))
except:
return x
df = df.sort_values(['id', 'date_time'], key=f)
CodePudding user response:
Not sure about your date_time datatype, but you can always do this to sort based on two columns
sort_values(['id', 'date_time'])
CodePudding user response:
You could try creating two new columns based on your datetime column. The first would represent the day of the week with Monday being 1 and Sunday being 7, and the second would just be the number after the space. Then do a sort based on id and these two new columns.