Can't order data frame by index. Pandas df .value_counts()

Time:12-30

I have a data frame with a month column, and I'm trying to count how often each month occurs using the following method:

df[['Month']].value_counts()

This method returns:

Month
5.0      1402
9.0      1375
8.0      1273
10.0     1188
7.0      1136
6.0       801
11.0      801
4.0       651
dtype: int64

I want to call .tolist() on this result, but first I need it ordered by month (1.0, 2.0, 3.0, 4.0, and so on), and I can't figure out how to sort it that way.

Any help would be much appreciated!

CodePudding user response:

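According to the pandas docs, you can use the sort argument of value_counts to control the order of the output. The default is True, which sorts by count. With sort=False the result would most likely be ordered by the index instead, although I can't find that stated in the docs, so it may depend on the pandas version.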

import numpy as np
import pandas as pd

# sort=False turns off the default ordering by count
pd.DataFrame(np.random.randint(0, 12, 30), columns=['mycol']).value_counts(sort=False)

You can always call sort_index on the result as well, and that's probably the safer approach because it sorts by the index explicitly:

import numpy as np
import pandas as pd

# count the values, then order the result by index rather than by count
pd.DataFrame(np.random.randint(0, 12, 30), columns=['mycol']).value_counts().sort_index()
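Applied to the question's Month column, the sort_index approach might look like the sketch below. The sample data is made up for illustration, and it assumes the column is selected as a Series (df["Month"]) rather than with double brackets, so the index holds plain month values.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's 'Month' column.
df = pd.DataFrame({"Month": [5.0, 9.0, 5.0, 4.0, 9.0, 5.0, 11.0]})

# Count each month, then order by the month value (the index), not by count.
counts = df["Month"].value_counts().sort_index()

# The index holds the months; the values hold the counts, in matching order.
months_in_order = counts.index.tolist()
counts_in_order = counts.tolist()

print(months_in_order)
print(counts_in_order)
```

Keeping the month labels in a separate list alongside the counts is one way to avoid losing track of which count belongs to which month once .tolist() drops the index.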

CodePudding user response:

Adding to Alex Newman's answer.

You can also do this in two steps:

series=pd.DataFrame(np.random.randint(0, 12, 30), columns=['mycol']).value_counts()

This returns a Series:

mycol
0        6
6        5
9        4
2        3
8        2
7        2
5        2
4        2
1        2
11       1
10       1
dtype: int64

Then you can simply sort it using Series.sort_index():

series.sort_index()

mycol
0        6
1        2
2        3
4        2
5        2
6        5
7        2
8        2
9        4
10       1
11       1

This code is self-explanatory; for a more compact form, you can use Alex Newman's code.
